Mplus Class Notes: Entering Data

Mplus version 8 was used for these examples. All the files for this portion of this seminar can be downloaded here.

1.0 Entering free format data

Data files for Mplus are just plain ASCII text files. “Free format”, where the values for each of the variables are separated by a delimiter such as a blank or a comma, is one common layout for data in a text file. The example below contains the first 20 lines from a file called hsb.dat. Here you can see that the variables are separated by commas and that the variable names are not on the first line; an Mplus data file should contain only numeric data, with no header row. The variables in the file are id, female, race, ses, schtyp, prog, read, write, math, science and socst.

 70,0,4,1,1,1,57,52,41,47,57
121,1,4,2,1,3,68,59,53,63,61
 86,0,4,3,1,1,44,33,54,58,31
141,0,4,3,1,3,63,44,47,53,56
172,0,4,2,1,2,47,52,57,53,61
113,0,4,2,1,2,44,52,51,63,61
 50,0,3,2,1,1,50,59,42,53,61
 11,0,1,2,1,2,34,46,45,39,36
 84,0,4,2,1,1,63,57,54,58,51
 48,0,3,2,1,2,57,55,52,50,51
 75,0,4,2,1,3,60,46,51,53,61
 60,0,4,2,1,2,57,65,51,63,61
 95,0,4,3,1,2,73,60,71,61,71
104,0,4,3,1,2,54,63,57,55,46
 38,0,3,1,1,2,45,57,50,31,56
115,0,4,1,1,1,42,49,43,50,56
 76,0,4,3,1,2,47,52,51,50,56
195,0,4,2,2,1,57,57,60,58,56
114,0,4,3,1,2,68,65,62,55,61
 85,0,4,2,1,1,55,39,57,53,46

The Mplus commands to read the data are shown below. You can enter these commands into a blank Mplus text file and save it as an input file (.inp). The data and variable command blocks are required; the title command is optional, but it is a useful label for the output. The analysis command block is included so that we can check the data. We will go into this command block in more detail in the next unit.

title:  Entering data example free format using hsb.dat

data:
file is hsb.dat;

variable:
names are
id female race ses schtyp prog read write math science socst;

analysis:
type = basic;

After saving and running the .inp file, you can look in the output file for “INPUT READING TERMINATED NORMALLY” appearing below the entered code. This is a good first check that your data were read in successfully. We will discuss further checks in the next section.
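
Because type = basic reports descriptive statistics for the analysis variables, it can help to limit the check to the variables you care about. The sketch below is a variant of the input above and is not one of the seminar files; it assumes you only want to inspect the five test scores, and it uses the usevariables option to leave id and the demographic variables out of the analysis.

```
title:  Entering data example, checking the test scores only

data:
file is hsb.dat;

variable:
! names must still list every column in the file, in order
names are
id female race ses schtyp prog read write math science socst;
! only these variables are used in the analysis
usevariables are read write math science socst;

analysis:
type = basic;
```

The names statement describes the file, while usevariables describes the analysis, so the first must always match the columns of hsb.dat even when the second is shorter.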

2.0 Entering fixed format data

The file fixed.dat contains ten observations with the data in fixed-width columns. The codebook for the data is given below.

 195  094951
 26386161941
 38780081841
 479700  870
 56878163690
 66487182960
 786  069  0
 88194193921
 98979090781
107868180801

codebook
variable name	column number
id	1-2
a1	3-4
t1	5-6
gender	7
a2	8-9
t2	10-11
tgender	12

Fixed format data are read using a Fortran-style format statement in the data command block. The Mplus commands are shown below. In the format statement, F2.0 describes a numeric field two columns wide with no implied decimal places, and a leading number repeats the field, so 3F2.0 indicates that each record begins with three variables of width two. These are followed by one variable of width one (F1.0), then two of width two and one of width one (2F2.0, F1.0), for 12 columns in total. This matches what we see in the codebook.

title:  Entering data example fixed format using fixed.dat

data:
file is fixed.dat;
format is (3F2.0, F1.0, 2F2.0, F1.0);

variable:
names are
id a1 t1 gender a2 t2 tgender;
missing are blank; 

analysis:
type = basic;

Again, after saving and running this input, you can check the output to see if “INPUT READING TERMINATED NORMALLY” appears.
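
A format statement can also be used to read only some of the columns. The sketch below is an illustration rather than one of the seminar files. It assumes that Mplus accepts the Fortran skip descriptor (1X skips one column) inside the format statement and that columns left unread at the end of a record are simply ignored. Under those assumptions it reads id, a1, t1, a2 and t2 from fixed.dat and leaves out gender (column 7) and tgender (column 12).

```
title:  Fixed format sketch that skips the gender columns

data:
file is fixed.dat;
! 3F2.0 reads columns 1-6, 1X skips column 7, 2F2.0 reads columns 8-11
format is (3F2.0, 1X, 2F2.0);

variable:
names are
id a1 t1 a2 t2;
missing are blank;

analysis:
type = basic;
```

The names statement lists only the five variables that the format statement actually reads, so the two stay in step with each other.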

3.0 Entering data using Stata

If you are a Stata user, a user-written command, stata2mplus, will convert a Stata dataset to an Mplus ASCII data file plus the necessary commands (in an Mplus input file) to read in the data. You can get the stata2mplus ado file by typing search stata2mplus in the Stata command window and following the directions that are given.

The Stata commands below load the Stata dataset hsb2.dta and convert it for Mplus. Two files are created: hsb2.dat, which contains the data, and hsb2.inp, the input file needed to read the data into Mplus. Both are stored in Stata’s current working directory (use the command pwd to see its path).

use https://stats.idre.ucla.edu/stat/stata/notes/hsb2.dta, clear
stata2mplus using hsb2

Looks like this was a success.
To convert the file to mplus, start mplus and run
the file hsb2.inp

The code from the generated input file appears below. The Mplus .inp file is saved in the current working directory, which is listed in the lower left-hand corner of the Stata window. To change it, use Stata’s cd command.

The .inp file contains more detail about the data file than our earlier examples; however, all of the same command blocks are present. Again, the analysis type = basic statement is included so that you can run descriptive statistics to ensure that the data were read in correctly.

Title:
  Stata2Mplus conversion for hsb2.dta
  List of variables converted shown below
  id :
  female :
    0: male
    1: female
  race :
    1: hispanic
    2: asian
    3: african-amer
    4: white
  ses :
    1: low
    2: middle
    3: high
  schtyp : type of school
    1: public
    2: private
  prog : type of program
    1: general
    2: academic
    3: vocation
  read : reading score
  write : writing score
  math : math score
  science : science score
  socst : social studies score
Data:
  File is hsb2.dat;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Missing are all (-9999);
  Usevariables are
     id female race ses schtyp prog read write math science socst;
Analysis:
  Type = basic;

4.0 Entering data with missing values using Stata

Our next example of entering data shows how to enter a version of the hsb dataset that has missing data. Below is the Stata code for reading the missing data file and converting it to an Mplus data file along with an Mplus input file (hsbmis.inp).

use https://stats.idre.ucla.edu/stat/data/hsbmis.dta, clear
stata2mplus using hsbmis, missing(-9999)

Looks like this was a success.
To convert the file to mplus, start mplus and run
the file hsbmis.inp

The input file for this example is identical to the previous example except for the file name.

5.0 Entering data from SPSS

If you are an SPSS user, you can prepare your data to be read into Mplus with a few steps detailed in the SPSS FAQ “How can I move my data from SPSS to Mplus?” Starting from the hsb2.sav dataset, once you have created a .csv file, hsb2.csv, without variable names, the code below can read in your data.

title:  Entering data from SPSS

data:
file is hsb2.csv;

variable:
names are
id female race ses schtyp prog read write math science socst;

analysis:
type = basic;

6.0 Entering data with missing values from SPSS

If your SPSS data file contains missing data, complete the same steps you would for SPSS data without missing values, but note the values used for missing values. For example, if -999 is the value used in coding missing values, then the previous example’s code would be amended with a Missing statement in the Variable: block indicating this. Below, we use hsbmis.csv.

title:  Missing data from SPSS

data:
file is hsbmis.csv;

variable:
names are 
id female race ses schtyp prog read write math science socst;
missing are all (-999); 

analysis: 
type = basic;
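
The missing statement does not have to apply one code to every variable. As a sketch only, suppose (hypothetically, this is not how hsbmis.csv is described above) that -999 marks a missing value for read, write and math alone, while in other columns it would be treated as an ordinary value. Each variable can then be listed with its own missing value code in parentheses:

```
title:  Missing data sketch with variable-specific codes

data:
file is hsbmis.csv;

variable:
names are
id female race ses schtyp prog read write math science socst;
! hypothetical coding: -999 is a missing code for these three scores only
missing are read (-999) write (-999) math (-999);

analysis:
type = basic;
```

When one code really does apply to the whole file, the all (-999) form shown earlier is shorter and harder to get wrong.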

7.0 Entering missing data from a raw data file with dots (.)

Our last example of entering data illustrates entering data with a raw data file that has dots (.) to represent missing values. Suppose that you had a data file that was either comma-separated, tab-separated or space-separated where . was used to indicate missing values. You can indicate such missing values with a Missing are .; statement in the variable block as shown in the example below, which reads the hsbmisdot.dat data file.

title: Missing with dots;

data:
file is hsbmisdot.dat;

variable:
names are id female race ses schtyp prog read write math science socst;
missing are .;

analysis:
type = basic;

The results of this are identical to the above example.

Other things to know about Mplus

There are three files that are associated with any analysis in Mplus: the data file, the input file (which contains the Mplus program that you wrote), and the output file. All of these files are text files. Each analysis must be in its own input file. Mplus creates an output file for each input file that is run. This opens by default after the analysis has been run, and it has the same name as the input file (but has an .out extension).
All statements must end with a semicolon. The title command is the only command that does not have to end in a semicolon.
The maximum length of any line in an Mplus input file is 90 characters (80 characters in older versions of Mplus). If a statement needs more than 90 characters, break the statement up into multiple lines, ending the statement (not each line) in a semicolon. Note that this means that very long file path specifications can be problematic; you may need to save your files to a location that has a shorter file path.
Mplus cannot handle string variables; such variables should be removed from the data file or converted to numeric before converting the data set to Mplus.
By default, Mplus will use all of the variables in the data set in the analysis or model. To avoid this, include the usevariables statement in the variable command block, listing only the variables you want to use. This can be shortened to usevars.
- Note that for certain models if you specify variables under usevariables and don’t include them in the model, you will get a warning that the “Variable is uncorrelated with all other variables”.
Mplus is not case sensitive. However, in many examples of Mplus code, the Mplus commands and options are in capital letters to identify them as being part of the Mplus code.
Variable names can be no longer than 8 characters; if your variable names are longer than 8 characters, they will be truncated to 8 characters. Variable names must start with a letter. After the first character, variable names can contain numbers and/or the underscore character (_).
Dummy variables must be created for any categorical predictor variables. You can either do this in your preferred general-use statistical software package (e.g., SAS, Stata, SPSS, R, etc.) or in Mplus in a define command block.
If your data file is in the same folder as the input file, you do not have to specify a path for the data file in your input file.
The keywords is, are and = can be used interchangeably on all commands except define, model constraint and model test. Items in a list can be separated by either blanks or commas.
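Comments can be added to Mplus syntax with an exclamation point (!): everything from the exclamation point to the end of the line is ignored, so a comment can occupy its own line or follow a statement. A comment does not need to end with a semicolon. Each line of a multi-line comment must contain its own exclamation point.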
The Mplus User’s Guide can be found on the Mplus website.
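
Several of the points above can be seen together in one input file. The sketch below is illustrative and is not one of the seminar files. It reuses hsb.dat from section 1.0, and the dummy variable names general and academic are made up for this example (both fit within the 8-character limit). It shows comments, a statement continued over two lines, = used in place of is, and dummy variables for prog created in a define command block with vocation as the omitted category.

```
title:  Conventions sketch using hsb.dat

data:
  ! the data file is in the same folder as this input file, so no path
  file = hsb.dat;

variable:
  ! one statement over two lines, with a semicolon only at the end
  names are id female race ses schtyp prog
            read write math science socst;
  ! variables created in define go at the end of the usevariables list
  usevariables are read write math general academic;

define:
  ! prog is coded 1 = general, 2 = academic, 3 = vocation
  if (prog eq 1) then general = 1;
  if (prog ne 1) then general = 0;
  if (prog eq 2) then academic = 1;
  if (prog ne 2) then academic = 0;

analysis:
  type = basic;
```

Because prog is only used to build the dummy variables, it appears in names but not in usevariables.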