Here is a tiny example showing how to use the survey commands in Stata. Consider the data file we call **svysmall** shown below.

    use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear
    list

       house   eth    wt    y   x1   x2   x3
           1     1    .4    3    4    5    3
           1     1    .9    9    4    5    6
           2     1   1.2    9    8    7    3
           2     1     1    8    7    4    2
           2     1   1.1    8    7    6    3
           3     2    .8    8    7    3    2
           4     2    .4    8    2    0    3
           4     2    .7    8    2    5    3
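If the URL above cannot be reached, the same eight observations can be typed in directly. The sketch below uses Stata's `input` command with the values from the listing above; the variable names are those of the original file, and all variables are assumed to be numeric.

```stata
* Enter the svysmall data by hand (values copied from the listing above).
clear
input house eth wt y x1 x2 x3
1 1  .4 3 4 5 3
1 1  .9 9 4 5 6
2 1 1.2 9 8 7 3
2 1 1   8 7 4 2
2 1 1.1 8 7 6 3
3 2  .8 8 7 3 2
4 2  .4 8 2 0 3
4 2  .7 8 2 5 3
end

* Confirm the data were read as expected.
list
```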

In this tiny example, **house** is the household, **eth** is the ethnicity, and **wt** is the sampling weight for the person. You can use the **svyset** command to tell Stata about these things, and it remembers them. If you save the data file, Stata stores the settings with it, so you don’t need to enter them again the next time you **use** the data file. Below, we tell Stata that the **psu** (primary sampling unit) is the household (**house**). Further, the sampling scheme included stratified sampling (**strata**) based on ethnicity (**eth**). Finally, the weighting variable (**pweight**) is called **wt**.

The syntax of the **svyset** command differs among Stata versions 7, 8, and 9. The syntax below requires Stata 9 or later; if you are using an earlier version, please see this page for examples. Notice that the PSU variable is given first, followed by the pweight in square brackets, and the strata variable is given as an option.

svyset house [pweight = wt], strata(eth)
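To check that the settings were recorded, a short sketch such as the following may help. It assumes Stata 9 or later and that the **svyset** command above has already been run; the file name in the `save` line is hypothetical.

```stata
* Display the survey settings currently attached to the data.
svyset

* Summarize the strata and the PSUs within them.
svydescribe

* Save a copy so the settings travel with the file (file name is hypothetical).
save svysmall_svyset, replace
```

Typing `svyset` with no arguments only reports the current settings; it does not change them.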

Once Stata knows about the survey via the **svyset** command, you can use the **svy:** prefix with syntax that is quite similar to the non-survey versions of the commands. For example, the **svy: regress** command below looks just like a regular **regress** command, but it uses the information you have provided about the survey design and takes it into account in the computations.

svy: regress y x1 x2 x3

The output is below. Its header reports the number of strata, PSUs, and observations, along with the population size (the sum of the weights), so you can confirm that the survey design was picked up as intended.

    Survey: Linear regression

    Number of strata   =         2                  Number of obs      =         8
    Number of PSUs     =         4                  Population size    = 6.5000001
                                                    Design df          =         2
                                                    F(   2,      1)    =         .
                                                    Prob > F           =         .
                                                    R-squared          =    0.2216

    ------------------------------------------------------------------------------
                 |             Linearized
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .3321757    .294268     1.13   0.376    -.9339573    1.598309
              x2 |   -.138397   .2335074    -0.59   0.613    -1.143098    .8663043
              x3 |   .5504173   .3170068     1.74   0.225    -.8135527    1.914387
           _cons |   5.050307   2.040247     2.48   0.132    -3.728167    13.82878
    ------------------------------------------------------------------------------
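One way to see what the design information contributes is to fit the same model with the weights alone and compare it with the **svy: regress** results above. The sketch below assumes the data are in memory and have already been **svyset** as shown earlier. The coefficients should agree between the two fits, while the standard errors should differ because only the **svy:** version accounts for the strata and PSUs.

```stata
* Weighted regression that ignores the strata and PSU information.
regress y x1 x2 x3 [pweight = wt]

* Same model using the full design declared by svyset.
svy: regress y x1 x2 x3

* Other estimation commands accept the same prefix, for example a design-based mean.
svy: mean y
```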