Geometric Random Variable:
It can be shown that a Geometric random variable can be
simulated using the following argument (int(ln(u)/ln(1-p)) + 1)
where u is a uniform(0,1) random variable and p is the probability
of observing a success (Simulation by Ross, 2003). In this example we are going to generate
a Geometric
random variable with 1000 observations with probability of success p = 0.25.
clear set obs 1000 local p = .25 gen u = uniform( ) gen g = int(ln(u)/ln(1-`p')) + 1
Negative Binomial Random Variable as a sum of independent Geometric Random
Variables:
To generate a Negative Binomial random variable we make use
of the fact that a Negative Binomial random variable is sum of r
independent Geometric random variables, where r is the of trials required
to observe the rth success and p is the probability of a success. In this
example, we are going to simulate p = 0.25 and let r = 4. Note
that unlike in the geometric case, the uniform(0,1) is defined within the
foreach argument so that the r trials are independent.
clear set obs 1000 local r = 4 local p = .25 foreach var of newlist nb1-nb`r' { gen `var' = int(ln(uniform( ))/ln(1-`p')) + 1 } egen nb = rsum(nb1-nb`r') drop nb1-nb`r'
Poisson and Zero-Inflated Poisson Random Variable:
We can simulate a Poisson and Zero-Inflated Poisson random variable
by recoding a uniform(0,1) random variable in terms of the cumulative
distribution. What allows us to simulate both types of variables in the same
code fragment is when we specify p0–the percent of zeros not explained
by a Poisson distribution–to be 0, the Zero-Inflated Poisson reduces to the
traditional Poisson distribution. The while loop continues until the
cumulative probability exceeds 0.99999 (a problem may exist if a value from the
uniform random variable is greater than 0.99999). What is required for this code
fragment is the specification of the Poisson parameter lambda and p0.
We need to note that the program was stable for values of
lambda less than 100. We recommend checking that x was recoded for all
values before proceeding.
clear set obs 1000 gen x = uniform( ) gen u = x local lambda = 10 local p0 = .40 local cp=0 local n =0 while `cp'<=.99999 { local p = `p0'*(`n'==0) + (1-`p0')*exp(-1*`lambda')*(`lambda'^`n')/exp(lnfact(`n')) local cp= `cp'+`p' local pcp = `cp'-`p' quietly recode x (`pcp'/`cp' = `n') local n = `n' + 1 }
Negative Binomial and Zero-Inflated Negative Binomial Random Variable that allows for over dispersion.
The hallmark of the Poisson distribution is that the mean is
equal to the variance. Many times that assumption is not satisfied and the
variance is greater than the mean. To account for the over-dispersion, we can
use a Negative
Binomial as a Gamma mixture of Poisson random variable that accounts for
over-dispersion by adding a parameter alpha. Like the prior code fragment,
we can simulate both types of variables within the same code fragment (the prior
code was more stable when we specified extreme values of lambda and p0,
this one is not. We recommend using conservative values of the parameter and
checking that x was recoded for all values). When alpha is equal to zero, we end up with a
Poisson random variable (this
program blows up if you set alpha = 0, instead we recommend setting alpha =0.01).
The code fragment requires the specification of the mean, p0 and alpha. The density
formula of the negative binomial distribution was taken from Regression Analysis
of Count Data by Cameron and Trivedi, formula (4.14).
clear set obs 1000 gen x = uniform( ) gen u = x local mean = 20 local alpha = .1 local p0 = .20 local a = 1/`alpha' local cp=0 local n =0 while `cp'<.99999 { local p = `p0'*(`n'==0) + (1-`p0')*(exp(lngamma(`n'+`a')) / (exp(lngamma(`a'))*(exp(lnfact(`n')))))*((`a'/(`a'+`mean'))^`a') /// *((`mean'/(`mean'+`a'))^`n') local cp= `cp'+`p' local pcp = `cp'-`p' quietly recode x (`pcp'/`cp' = `n') local n = `n' + 1 }