This example is taken from Levy and Lemeshow’s Sampling of Populations.
Page 53: simple random sampling

This example uses the momsag data set. The _one_ on the nest statement tells SUDAAN that the entire sample is “at the same level”; in other words, the entire sample is in a single stratum. _one_ is a SUDAAN keyword, not a variable in the data set. You can use the samcnt statement to check whether SUDAAN counts the same number of observations as you believe are in the data set (see page 105 in the SUDAAN manual). The totcnt statement names the variable that holds the population count, which SUDAAN uses to compute the fpc (finite population correction); this is why the same variable is used as the fpc in Stata.
proc descript data = momsag filetype = sas design = wor total;
  weight weight1;
  nest _one_;
  totcnt birth;
  var momsag;
run;

Number of observations read    :     25    Weighted count :      773
Denominator degrees of freedom :     24

Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|          |                 |                      |
| Variable |                 | One                  |
|          |                 | 1                    |
-----------------------------------------------------
|          |                 |                      |
| MOMSAG   | Sample Size     |                   25 |
|          | Weighted Size   |               773.00 |
|          | Total           |               711.16 |
|          | SE Total        |                42.11 |
|          | Mean            |                 0.92 |
|          | SE Mean         |                 0.05 |
-----------------------------------------------------
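As a rough cross-check of how the fpc enters these numbers, the standard simple-random-sampling formulas can be worked by hand. The sketch below is in Python rather than SUDAAN and rests on two assumptions that are not stated in the example: that momsag is a 0/1 indicator, and that 23 of the 25 sampled records equal 1 (inferred from the reported mean of 0.92). The variable names are illustrative only.

```python
import math

N = 773     # population size: the "Weighted count" in the output above
n = 25      # sample size: "Number of observations read"
ones = 23   # ASSUMPTION: sampled records with momsag == 1 (inferred from mean 0.92)

weight = N / n                      # equal sampling weight per record under SRS
fpc = 1 - n / N                     # finite population correction
p = ones / n                        # sample mean of a 0/1 variable
s2 = n / (n - 1) * p * (1 - p)      # sample variance of a 0/1 variable
se_mean = math.sqrt(fpc * s2 / n)   # SE of the mean, without replacement
total = N * p                       # estimated population total
se_total = N * se_mean              # SE of the total

print(round(total, 2), round(se_total, 2), round(p, 2), round(se_mean, 2))
```

Under those assumptions the rounded values agree with the Total, SE Total, Mean and SE Mean rows of the SUDAAN output.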
This example is taken from Lehtonen and Pahkinen’s Practical Methods for Design and Analysis of Complex Surveys.
Page 29, Table 2.4: Estimates from a simple random sample drawn without replacement (n = 8); the Province’91 population.
data page29;
  input id cluster ue91 lab91;
  fpc = 32;
  wt = 4;
  strata = 1;
datalines;
1 1 4123 33786
2 4 760 5919
3 5 721 4930
4 15 142 675
5 18 187 1448
6 26 331 2543
7 30 127 1084
8 31 219 1330
;
run;
The code below gets the total and the standard error of the total for the variable ue91 as shown in the first line of the table. You cannot get both the total and the median in the same proc descript. The two setenv statements are optional; they only control the appearance of the output. The print statement tells SUDAAN what to include in the output. The print statement will override any of the statistics listed on the proc descript statement. In other words, if you request the mean on the proc descript statement and do not include mean on the print statement, the mean will not be displayed in the output.
proc descript data = page29 filetype = sas design = wor;
  weight wt;
  nest strata;
  totcnt fpc;
  var ue91;
  print nsum total setotal mean semean deffmean;
  setenv colwidth = 15;
  setenv decwidth = 3;
run;

Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7

Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|          |                 |                      |
| Variable |                 | One                  |
|          |                 | 1                    |
-----------------------------------------------------
|          |                 |                      |
| UE91     | Sample Size     |                    8 |
|          | Total           |             26440.00 |
|          | SE Total        |             13282.26 |
|          | Mean            |               826.25 |
|          | SE Mean         |               415.07 |
|          | DEFF Mean #4    |                 0.75 |
-----------------------------------------------------
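The total, mean and their standard errors above can be checked against the textbook formulas for a simple random sample drawn without replacement. The Python sketch below uses the eight ue91 values and the population size of 32 from the data step; the function name srs_wor is hypothetical, and this is a hand calculation under those formulas, not a description of SUDAAN's internals.

```python
import math

def srs_wor(y, N):
    """Total, mean and standard errors for an SRS drawn without replacement."""
    n = len(y)
    mean = sum(y) / n
    s2 = sum((v - mean) ** 2 for v in y) / (n - 1)  # sample variance
    fpc = 1 - n / N                                 # finite population correction
    se_mean = math.sqrt(fpc * s2 / n)
    return {
        "total": N * mean,
        "se_total": N * se_mean,
        "mean": mean,
        "se_mean": se_mean,
        "fpc": fpc,
    }

# ue91 values from the page29 data step; N = 32 is the fpc variable there.
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
est = srs_wor(ue91, N=32)

for name, value in est.items():
    print(f"{name:9s} {value:12.2f}")
```

The results round to the Total, SE Total, Mean and SE Mean shown in the output. The fpc of 0.75 also coincides with the value printed as DEFF Mean #4, although that match is only an observation for this example.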
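The code below produces the correct ratio estimate, but it is reported as a proportion rather than a percentage: move the decimal point two places to the right (multiply by 100) to compare it with the value in the text. Note that with two decimal places displayed, the standard error of the ratio appears as 0.00.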
proc ratio data = page29 filetype = sas design = wor;
  weight wt;
  nest strata;
  totcnt fpc;
  numer ue91;
  denom lab91;
run;

Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7

Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

---------------------------------------------------
|            |                 |                  |
| Variable   |                 | One              |
|            |                 | 1                |
---------------------------------------------------
|            |                 |                  |
| UE91/LAB91 | Sample Size     |                8 |
|            | Weighted Size   |            32.00 |
|            | Weighted X-Sum  |        206860.00 |
|            | Weighted Y-Sum  |         26440.00 |
|            | Ratio Est.      |             0.13 |
|            | SE Ratio        |             0.00 |
---------------------------------------------------
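Because the output shows only two decimal places, it can help to work the ratio by hand at higher precision. The Python sketch below uses the usual Taylor-series (linearization) formula for the standard error of a ratio under simple random sampling without replacement. The function name ratio_srs_wor is hypothetical, and the formula is assumed to be comparable to, not necessarily identical with, what SUDAAN computes.

```python
import math

def ratio_srs_wor(y, x, N):
    """Ratio estimate sum(y)/sum(x) and its linearized SE under SRS WOR."""
    n = len(y)
    r = sum(y) / sum(x)
    x_mean = sum(x) / n
    resid = [yi - r * xi for yi, xi in zip(y, x)]  # linearized residuals
    s2_e = sum(e ** 2 for e in resid) / (n - 1)    # residuals sum to zero
    fpc = 1 - n / N                                # finite population correction
    se = math.sqrt(fpc * s2_e / n) / x_mean
    return r, se

# Values from the page29 data step; N = 32 is the fpc variable there.
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
lab91 = [33786, 5919, 4930, 675, 1448, 2543, 1084, 1330]

r, se = ratio_srs_wor(ue91, lab91, N=32)
print(f"ratio = {r:.4f} ({100 * r:.2f}%), SE = {se:.4f}")
```

The ratio rounds to 0.13 and the standard error rounds to 0.00 at two decimal places, consistent with the output above.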
The code below produces the median shown on the third line of the table. This estimate, like most of the other median estimates, differs slightly from the value shown in the text. We suspect that the difference results from the slightly different algorithms used by the different packages (PC Carp was used to generate the estimates given in the text).
proc descript data = page29 filetype = sas design = wor;
  weight wt;
  nest strata;
  totcnt fpc;
  var ue91;
  percentile / median;
  setenv colwidth = 10;
  setenv decwidth = 3;
run;

Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7

Variance Estimation Method: Taylor Series (WOR)
by: Variable, One, Percentiles.
for: Variable = UE91.

-----------------------------------------------------------------------------------
One                  Sample    Weighted                 Lower 95%    Upper 95%
      Percentiles    Size      Size        Quantile     Limit        Limit
-----------------------------------------------------------------------------------
1     50.00          8         32.00       219.00       135.74       737.27
-----------------------------------------------------------------------------------

---------------------------------
One                  SE
      Percentiles    Quantile
---------------------------------
1     50.00          127.19
---------------------------------
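To illustrate how the choice of algorithm can move a median in a sample this small, the Python sketch below applies two common definitions to the eight ue91 values. Neither is claimed to be the algorithm used by SUDAAN or PC Carp; the function name lower_weighted_quantile is hypothetical.

```python
import statistics

def lower_weighted_quantile(values, weights, p):
    """Smallest value whose cumulative weight reaches p times the total weight."""
    target = p * sum(weights)
    cumulative = 0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= target:
            return value
    raise ValueError("p must be between 0 and 1")

# ue91 values and the constant weight of 4 from the page29 data step.
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
wt = [4] * len(ue91)

lower = lower_weighted_quantile(ue91, wt, 0.5)  # no interpolation
midpoint = statistics.median(ue91)              # average of the two middle values

print(lower, midpoint)
```

With an even number of equally weighted observations, the first definition returns the lower of the two middle values (219 here) while the second returns their midpoint (275), so the two approaches give noticeably different answers on the same data.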