WesVar FAQ How do I import data and conduct analyses of survey data in WesVar?

IMPORTING DATA

ANALYZING DATA

Making tables

How do I indicate an FPC value to be used for the entire data set?
Why are so many of the cases in my data set not used when I make a table with two (or more) variables?
What are the RS2 and RS3 check boxes in the table set up dialogue box?
What are the "PV" and "ONE" functions that are available when computing statistics for a table?

Regression analyses

GENERAL

ANSWERS

INPUTTING DATA

How do I decide what method of creating the replicate weights I should use?
To decide which method of creating the replicate weights to use, you need to know several things about your data set. The first question to answer is "Does the data set have stratification?" (If the data set has poststratification but not stratification, the answer to this question is no.) If the data set does not have stratification, use the jackknife-1 (jk1) method of creating the replicate weights. Examples of sampling designs that do not include stratification are simple random sampling and possibly cluster sampling. If your data are stratified, the next question is "How many PSUs (primary sampling units) are in each stratum?" If every stratum has exactly two PSUs, you can use balanced repeated replication (BRR), Fay’s method or the jackknife-2 (jk2) method of creating the replicate weights. (Note that Fay’s method is a variant of BRR and is recommended only under certain conditions. Please consult the WesVar manual or another source to determine if Fay’s method is appropriate for your data.) If the sampling fraction is large and you need to use an FPC (finite population correction), use either the jackknife-2 (jk2) or the jackknife-n (jkn) method. If one or more strata have more than two PSUs, use the jackknife-n method of creating the replicates. If you have certainty (self-representing) PSUs, use the jackknife-n method and indicate which strata contain the certainty PSUs. Appendix D of the WesVar manual also provides an excellent overview, with examples, of how to select the type of replication most appropriate for common survey designs.
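The decision rules above can be summarized as a short Python sketch. This is only an illustration of the reasoning in this answer, not part of WesVar: the function name, its parameters and the returned labels are hypothetical, and the handling of strata with fewer than two PSUs (an error) is an assumption, since that case is not covered here.

```python
def suggest_replication_methods(stratified, psus_per_stratum=None,
                                needs_fpc=False, has_certainty_psus=False):
    """Return the replication methods suggested by the FAQ's decision rules.

    psus_per_stratum is a list with the number of PSUs in each stratum
    (ignored when the design is not stratified).
    """
    if not stratified:
        # No stratification (poststratification alone does not count).
        return ["jk1"]
    if has_certainty_psus:
        # Certainty (self-representing) PSUs call for jackknife-n.
        return ["jkn"]
    if any(count < 2 for count in psus_per_stratum):
        # Assumption: this case is not covered by the rules above.
        raise ValueError("strata with fewer than two PSUs are not covered")
    if all(count == 2 for count in psus_per_stratum):
        if needs_fpc:
            # Large sampling fraction: an FPC is needed.
            return ["jk2", "jkn"]
        return ["brr", "fay", "jk2"]
    # One or more strata have more than two PSUs.
    return ["jkn"]


# Example: a stratified design with exactly two PSUs in every stratum.
print(suggest_replication_methods(True, [2, 2, 2, 2]))
```

Whether Fay’s method is actually appropriate still has to be checked against the WesVar manual, as noted above.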

How do I indicate a poststratification variable and the poststratification weights?
If your survey design includes poststratification, you need to indicate to WesVar the variable that defines the poststrata and the poststratification population control totals (not the weights themselves). After you have created the replicate weights, click on data -> poststratification. This will open up a dialogue box. At the top, you can give the poststratification weights a prefix (this is useful when looking at the data in a text file or in another statistical package). Next, you need to indicate which variable contains the poststrata coding. Finally, you need to name a file that contains the poststratification totals. This file should be a text file, although the text file can have either a .txt or a .dat file extension. The file must contain two columns of numbers. The first column gives the poststratum code and the second gives the population total for that poststratum. WesVar will give an error message if the poststrata in the text file do not match up correctly with the poststrata in the variable given in the "cell definition" box. There is a short "movie" that shows how to do this that you can view by clicking here.

How do I indicate non-response adjustments?
In order to use the non-response feature of WesVar, you need to have a dataset that contains rows (cases) for both respondents and non-respondents. You will need to indicate which variable contains the cell definition data. Also, you will need to have a variable in your dataset that indicates the response status of each subject, and this variable must be coded 1, 2 and 3 (1 = respondent, 2 = nonrespondent, 3 = ineligible). There cannot be missing values in this variable. The non-respondents will be weighted as zero, while both the full sample and replicate weights will be adjusted for respondents. The weights for subjects coded as ineligible will not be altered. If the variable in your dataset that contains the non-response information is not coded 1, 2, 3, you can use the recode function in WesVar to create a new variable that is coded in this way.
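As a rough illustration of what such an adjustment does to a single set of weights, here is a minimal Python sketch. It assumes the common weighting-class approach, in which respondents in each cell absorb the weight of the non-respondents in that cell; the exact computation WesVar performs is not described here, and the function and variable names are hypothetical.

```python
RESPONDENT, NONRESPONDENT, INELIGIBLE = 1, 2, 3


def adjust_for_nonresponse(weights, cells, statuses):
    """Return adjusted weights, one per case, using the 1/2/3 status coding."""
    eligible_total = {}    # respondents + non-respondents, per cell
    respondent_total = {}  # respondents only, per cell
    for weight, cell, status in zip(weights, cells, statuses):
        if status in (RESPONDENT, NONRESPONDENT):
            eligible_total[cell] = eligible_total.get(cell, 0.0) + weight
        if status == RESPONDENT:
            respondent_total[cell] = respondent_total.get(cell, 0.0) + weight

    adjusted = []
    for weight, cell, status in zip(weights, cells, statuses):
        if status == RESPONDENT:
            # Assumed adjustment factor: eligible total / respondent total.
            adjusted.append(weight * eligible_total[cell] / respondent_total[cell])
        elif status == NONRESPONDENT:
            adjusted.append(0.0)     # non-respondents are weighted as zero
        elif status == INELIGIBLE:
            adjusted.append(weight)  # ineligible cases are left unaltered
        else:
            raise ValueError("response status must be coded 1, 2 or 3")
    return adjusted


# Example: one cell with two respondents, one non-respondent, one ineligible.
print(adjust_for_nonresponse([10.0, 10.0, 10.0, 5.0],
                             ["a", "a", "a", "a"],
                             [1, 1, 2, 3]))
```

In WesVar the same kind of adjustment is applied to the full sample weight and to every replicate weight.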

How do I indicate different FPC values for different strata?
To indicate different FPC values that should be used for different strata, use the "Attach Factors" feature by clicking on Data -> Attach Factors. You can type in the FPC value for each stratum (the case numbers are listed on the left-hand side of the window), and you can use the "Fill Down" button to have the current FPC value placed in all of the boxes below. When you come to the next stratum, just type in its FPC value and click on "Fill Down" again, and continue until the FPC values have been entered for every stratum. There is a short "movie" that shows how to do this that you can view by clicking here.

How do I indicate certainty (self-representing) PSUs?
You indicate to WesVar which strata contain certainty or self-representing PSUs while you are creating the replicate weights. After you have indicated the method of replication, the VarUnit and the VarStrat (if applicable), click on the button to the right called "SR Units". This will open up a dialogue box in which you can indicate which strata contain the certainty PSUs.

How can I create a subset of my data?
After you have created the replicate weights for your data (using the entire data file), you can create a subset of the data by clicking on Data -> Subset Population. Select the variable and indicate which values of the variable should be included. For example, if you want only men included in your analyses, select the variable that codes for gender and set it equal to the value for men. You can also use a continuous variable to create your subset. For example, if you would like the subset of your data to include people who are older than 60 years, you would type (or point and click) years > 60. Note that you may want to save this subsetted data with a new name, so that you do not overwrite your original data set.

How do I indicate raking values in the data?
Raking is done in WesVar through a process of iterative poststratification, during which the full sample and replicate weights are adjusted. For each raking dimension, you will need to indicate a control variable and a corresponding ASCII (text) file. The text file must contain two columns: the first with the value of the control variable and the second with the control total for that cell. You may rake in as few as two and as many as eight dimensions, and you will need a separate text file for each dimension that is raked. You also need to indicate the stopping rules by selecting the appropriate tab in the upper right-hand corner of the dialogue box. Please consult the WesVar manual (especially pages 4-25 and 4-26) for details regarding the stopping rules. Finally, note that your full sample weight may or may not already be raked; either way is fine.
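To make "iterative poststratification" concrete, here is a minimal Python sketch of raking a single set of weights. It is an illustration only: the function name, the data layout and the stopping rule (a maximum number of iterations plus a tolerance on the adjustment factors) are assumptions and are not taken from WesVar, whose own stopping rules are described in the manual.

```python
def rake(weights, categories, control_totals, max_iterations=100, tolerance=1e-8):
    """Adjust weights so weighted totals match the control totals in every dimension.

    weights        -- one starting weight per case
    categories     -- one tuple per case, giving its cell in each dimension
    control_totals -- one dict per dimension, mapping cell value -> control total
    """
    weights = list(weights)
    for _ in range(max_iterations):
        largest_change = 0.0
        for dim, totals in enumerate(control_totals):
            # Current weighted total in each cell of this dimension.
            current = {}
            for weight, cats in zip(weights, categories):
                current[cats[dim]] = current.get(cats[dim], 0.0) + weight
            # Poststratify on this dimension alone.
            for i, cats in enumerate(categories):
                factor = totals[cats[dim]] / current[cats[dim]]
                weights[i] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        # Assumed stopping rule: no factor differs noticeably from 1.
        if largest_change < tolerance:
            break
    return weights


# Example: four cases raked on two dimensions (say, gender and region).
cases = [(1, 1), (1, 2), (2, 1), (2, 2)]
totals = [{1: 60.0, 2: 40.0}, {1: 50.0, 2: 50.0}]
print(rake([1.0, 1.0, 1.0, 1.0], cases, totals))
```

Each dict in `control_totals` plays the role of one two-column control file: the keys are the values of the control variable and the values are the control totals.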

How can I find out how many values of a variable I have in my data set?
To find out how many values, including missing values, there are of a particular variable in your data set, go to the data window (the window you use to import data into WesVar). Click on Format -> Label. You will see all of the values of the variable listed, and off to the left you will see the minimum and maximum values of the variable, as well as the number of missing values.

How can I recode values of a variable in my data set?
To recode values of a variable, go to the data window (the window you use to import data into WesVar) and click on Format -> Recode. You have three options for recoding variables: recoding a continuous variable into a continuous variable, which you would do if, say, you wanted to create a variable that is the square of another variable; recoding a continuous variable into a discrete (or categorical) variable, which you would do if you wanted to change a continuous variable like height into a categorical variable (short, medium and tall); and recoding a discrete variable into a discrete variable, which you would do if you wanted to make a new discrete (or categorical) variable with fewer categories than the original variable or if you wanted to change the reference category of a categorical variable. Recoding from discrete to discrete will also allow you to recode, say, multiple stratification variables into one stratification variable, as required by WesVar (you can only select one variable as the VarStrat). To do this, select each of the variables that contain strata information and then give each combination a unique value for the new variable. For example, if your data set was stratified on gender (two categories, 1 and 2) and race (three categories, 1, 2 and 3), then your new variable, perhaps called newstrat, might be coded as 1 for category 1 of gender and category 1 of race, 2 for category 1 of gender and category 2 of race, and so on, as shown in the table below.

Gender	Race	Newstrat
1	1	1
1	2	2
1	3	3
2	1	4
2	2	5
2	3	6
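If you prefer to work out the combined codes before typing them into the recode dialogue box, the table above can be generated with a few lines of Python. This is only a sketch for planning the recode outside of WesVar; the function name is hypothetical.

```python
from itertools import product


def combined_strata_codes(*category_lists):
    """Map every combination of stratification categories to a unique code (1, 2, ...)."""
    combinations = product(*category_lists)
    return {combo: code for code, combo in enumerate(combinations, start=1)}


# Gender has categories 1 and 2; race has categories 1, 2 and 3.
codes = combined_strata_codes([1, 2], [1, 2, 3])
for (gender, race), newstrat in codes.items():
    print(gender, race, newstrat)
```

The same function works for more than two stratification variables; each additional list of categories multiplies the number of combined strata.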

MAKING TABLES

How do I indicate an FPC value to be used for the entire data set?
To indicate an FPC (finite population correction) value to be used for the entire data set, click on "Options" under the table request node of the workbook tree. You can change the default FPC value of one to whatever is required for your data set. You need to calculate the FPC value yourself outside of WesVar. To do this use the formula: 1 – (n/N), where n = the sample size and N = the population size. Remember that you need to use an FPC only if your sampling fraction is large.
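Because the FPC has to be calculated outside of WesVar, a small Python sketch of the formula above may be useful. The function name and the sample and population sizes are made up for illustration.

```python
def finite_population_correction(sample_size, population_size):
    """Return the FPC value 1 - (n/N) for sample size n and population size N."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be positive and no larger than the population")
    return 1 - sample_size / population_size


# Hypothetical example: a sample of 200 drawn from a population of 1,000
# has a sampling fraction of 0.2, so the FPC value is about 0.8.
print(finite_population_correction(200, 1000))
```

When the sampling fraction is small, the result is close to the default value of one, which is why the correction matters only for large sampling fractions.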

Why are so many of the cases in my data set not used when I make a table with two (or more) variables?
By default, WesVar does a case-wise deletion, meaning that if a row (or case) has any missing data for one or more of the variables used in making the table, that case will be dropped (i.e., not used in making the table). To prevent WesVar from doing this, click on the "Options" node under the Table Request node in the workbook tree and uncheck the box at the bottom indicating that all cases with missing values should be excluded.

What are the RS2 and RS3 check boxes in the table set up dialogue box?
These are the Rao-Scott approximations. They are adjustments to the Pearson chi-squared test. The RS2 uses a design effect adjustment and the RS3 is based on the Satterthwaite adjustment. In its simplest form, a Rao-Scott approximation is the chi-squared statistic divided by either the mean of the eigenvalues or the mean of the design effect for that cell.

What are the "PV" and "ONE" functions that are available when computing statistics for a table?
The name "PV" is short for "plausible value" and it operates on another function of several variables to return the average of the functions. For example, you might use the "PV" function to get the average of several averages. The "ONE" function is used to produce the sum of the weights.

REGRESSION ANALYSES

How do I change the reference category used for my categorical variables?
By default, WesVar uses the last category as the reference category for categorical variables in regression analyses. If you do not want the last category (i.e., the category coded with the highest numerical value) as the reference category, then you will need to use the recode function to recode your variable, giving the category that you want as the reference category the highest numerical code. To recode a variable, from the data window, click on data -> recode. Select the "New Discrete" button to recode your discrete (categorical) variable into a new discrete (categorical) variable. In the dialogue box, you will need to type in the name of the new variable (the original variable will not be replaced with the new variable) and then select the variable to be recoded. Next, type in the new code for each level of the variable. For example, suppose your original variable has three levels coded 1, 2 and 3. You do not want level 3 used as the reference category; instead, you want level 1 used. You recode the variable so that level 1 of the original variable is coded as 3, level 2 is coded as 2 and level 3 is coded as 1. Click on "Update Selected" to change the value for the currently highlighted value. Click on "Update All" to change all of the original values to the current value. When you are done, click on OK.
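The recoding in this example amounts to swapping the code of the desired reference category with the highest code. A minimal Python sketch of that mapping, useful for planning the new codes before entering them in WesVar, is shown below; the function name is hypothetical.

```python
def move_reference_to_last(levels, reference):
    """Return an old-code -> new-code mapping that gives `reference` the highest code."""
    if reference not in levels:
        raise ValueError("reference must be one of the existing levels")
    highest = max(levels)
    mapping = {level: level for level in levels}
    # Swap the codes of the desired reference category and the highest category.
    mapping[reference], mapping[highest] = highest, reference
    return mapping


# Levels 1, 2 and 3, with level 1 wanted as the reference category.
print(move_reference_to_last([1, 2, 3], reference=1))
```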

What is the difference between class and source variables?
You will notice that most, if not all, of the variables listed as source variables are also listed as class variables. To make the class variables, WesVar has taken all of the source variables, which are all of the variables in your data set that you put in the variables column when you imported your data, and made dummy variables out of them. The only exception to this is for variables that have 256 or more distinct values; those variables are found only in the source variables box and were not dummy coded. When selecting the dependent variable for any type of regression analysis, you will want to select the variable from the source variables box. For the independent variables, you can select the variable from either the source or the class variables box, depending on how you want WesVar to treat the variable in your model.

How can I specify tests of linear combinations?
After specifying your model, open up the model node of the workbook tree, and then open up the node that displays your model. The second entry will be "Tests". Click on "Tests" and you can input a label for your test and the formula that you would like used. You can also request a score test by clicking in that box.

How can I define which category I want modeled in a logistic regression?
After you define your logistic regression model, open up the model node of the workbook tree, and then open up the node that displays your model. The last node under that is called "Success". Click on the "Success" node to define which category you want modeled as "success".

GENERAL

How can I change the number of decimal places that my results are displayed in?
To change the default number of decimal places in which your results are displayed, click on file -> preferences. There are two boxes at the bottom that control the numerical part of the output display, one to switch between fixed and scientific notation and one to indicate the number of decimal places for the estimates and the standard errors. Also, if you are conducting a regression analysis, you can modify the number of decimal places displayed by clicking on the output control node of the workbook tree after you have specified your model.

How can I stop WesVar from doing a case-wise deletion on my data?
You can keep WesVar from performing a case-wise deletion by clicking on file -> preferences -> tables (2). On the left is a box called "missing" in which you can indicate whether WesVar should perform a case-wise deletion. Also, when making tables, you can click on the "Options" node and uncheck the box at the bottom indicating that all cases with missing values should be excluded.

How can I get all of the warning messages to stop displaying?
There are two things that you can do to get the warning messages to stop displaying. The first is to check the box on many of the warnings that says not to show the warning again. To stop the display of other warnings, click on file -> preferences -> options (the right-most tab at the top), and you can check which warnings you do not want to have displayed.

Why can’t I use Stat/Transfer or DBMS/Copy to convert my data into WesVar format?
Neither Stat/Transfer nor DBMS/Copy converts data into WesVar format, because WesVar needs certain information about the data set while importing it, and neither program can provide that information. Please note that WesVar version 4.2 supports many import file formats, so importing data using WesVar is fairly simple.