# Random Benefit Units for Households II: Generating the Number of Subrows

In my previous post, I assumed my household data would give me the number of children each household has. But suppose I had to generate those numbers too? This is just a note to say that one can do this using the base-R function `sample.int`.

If I understand its documentation correctly, then the call

```
sample.int( 4, size=100, replace=TRUE, prob=c(0.1,0.4,0.2,0.1) )
```
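will give me a vector of 100 elements. Each element is an integer between 1 and 4: that’s what the first argument determines. The relative frequencies of the four values are set by the `prob` argument. Note that my weights sum to 0.8 rather than 1; the documentation says they need not sum to one, so the effective probabilities are 0.125, 0.5, 0.25 and 0.125.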
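To tie this back to the household data, here is a minimal sketch of how the call might fill in a children-per-household column and then expand each household into that many subrows. The `households` data frame and its `id` and `num_kids` columns are made up for illustration, and the seed is fixed only so the sketch is repeatable.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is repeatable

# Hypothetical household table; the real one would come from the data.
households <- data.frame(id = 1:5)

# One draw per household: an integer from 1 to 4 with the given weights.
households$num_kids <- sample.int(
  4, size = nrow(households), replace = TRUE,
  prob = c(0.1, 0.4, 0.2, 0.1)
)

# Repeat each household row num_kids times to get one subrow per child.
kids <- households[rep(seq_len(nrow(households)),
                       times = households$num_kids), , drop = FALSE]

print(households)
print(nrow(kids))  # equals sum(households$num_kids)
```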

This seems to work. Let me generate such a vector (but much bigger to reduce sampling error) and tabulate the frequencies of its elements using `table`:

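```
x <- sample.int( 4, size=1000000, replace=TRUE, prob=c(0.1,0.4,0.2,0.1) )
# Count how often each of 1..4 occurs (named counts, not t, to avoid masking base R's t()).
counts <- table(x)
# Ratios relative to the count of 1s; the weights imply roughly 1, 4, 2, 1.
counts/counts[1]
```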

Then my first runs give me:

```
        1         2         3         4
1.0000000 3.9331806 1.9725140 0.9925756
1.0000000 3.9757329 1.9855526 0.9899258
1.0000000 3.9984735 2.0017902 0.9916804
1.0000000 3.9942205 1.9904766 0.9979963
1.0000000 3.9952263 2.0040621 0.9968735
```

Are these close enough? In every run shown, the second and fourth ratios come out slightly below the 4 and 1 I’d expect. So I may be missing some subtlety. On the other hand, it’s good enough for the testing I’m doing, as this mainly has to check that my joins and other data-handling operations are correct.
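One way to put a number on "close enough" would be a chi-squared goodness-of-fit test against the normalised weights, using base R's `chisq.test`. This is only a sketch: the names `w`, `p`, `obs` and `fit` are mine, the seed is fixed just for repeatability, and I haven't checked whether this is the most appropriate test for the purpose.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is repeatable

w <- c(0.1, 0.4, 0.2, 0.1)  # the weights passed to sample.int above
p <- w / sum(w)             # normalised so they sum to 1, as chisq.test requires

x <- sample.int(4, size = 1000000, replace = TRUE, prob = w)

# tabulate() returns a count for each of 1..4, even if a value never occurs.
obs <- tabulate(x, nbins = 4)

# Observed proportions next to the expected ones.
print(rbind(observed = obs / sum(obs), expected = p))

# Goodness-of-fit test of the observed counts against p.
fit <- chisq.test(obs, p = p)
print(fit$p.value)
```

A very small p-value would suggest the draws really do depart from the intended probabilities; otherwise the differences are consistent with sampling error.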