Data

Viewing Data

By default, data files are written to a Data subfolder located in the same folder as your Experiment File. For example, if your experiment were c:\experiments\test\test.exp, then your data would be saved to c:\experiments\test\data\*. If this Data subfolder does not exist when you run a session, MediaLab will create it. The main data file will be given the same name as the Experiment File, but with different file extensions to indicate the different formats (e.g., .txt for the SPSS text-formatted data, .sav for the native SPSS data file, and .csv for the comma-delimited text file). Data from essay items and on-line rating items will also be located in the Data subfolder.
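As a rough illustration of the naming rule above, the following Python sketch derives the Data folder and the main data file names from an Experiment File path. The function name expected_data_files is hypothetical and is not part of MediaLab; the sketch only models the "same name, different extension" rule and does not model any subfolders inside the Data folder.

```python
from pathlib import PureWindowsPath

def expected_data_files(experiment_file, extensions=(".txt", ".csv", ".sav")):
    """Return (data_folder, file_names) for an Experiment File path.

    Hypothetical helper for illustration only: it applies the rule that
    data files share the Experiment File's base name and live in a
    'data' subfolder next to it.
    """
    exp = PureWindowsPath(experiment_file)
    data_folder = exp.parent / "data"
    file_names = [exp.stem + ext for ext in extensions]
    return str(data_folder), file_names

folder, names = expected_data_files(r"c:\experiments\test\test.exp")
print(folder)  # c:\experiments\test\data
print(names)   # ['test.txt', 'test.csv', 'test.sav']
```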

You can view these files by browsing through "My Computer" on your desktop to the appropriate data folder. You can also select View Data from the Data menu in MediaLab, which lets you open any data file (.txt, .csv, .sav) you select. Note that MediaLab opens each file in the default program your computer uses for that file type.

Alternative Data Structures

How does MediaLab write the data file given that different participants can get different dependent measures? Good question! This was a real challenge, because we wanted to produce a single data file with a constant structure for all participants, no matter what condition they were in. As you may know, data are collected during the administration of Questionnaire Files (in which anything can be embedded). Participants can receive different Questionnaire Files in different conditions, or the same Questionnaire Files in different orders. Before starting an experimental session, MediaLab goes through the whole Experiment File and looks for all the Questionnaire Files you have in all your conditions. When MediaLab writes the data file, it writes data for ALL the Questionnaire Files in the experiment, whether or not the subject received them. If a subject did not receive certain questions or questionnaires, missing values are written.

Be aware that MediaLab only writes data for the items in a Questionnaire File if it can successfully proceed from the first item to the last item in that Questionnaire File. If MediaLab encounters problems during this process, no data may be written for any of the items in that Questionnaire File. Thus, it is very important to test your Questionnaire Files and make sure the data are recorded as expected.

When the subject has finished, MediaLab writes TWO sets of data files that contain the same information but are structured differently depending on your needs. These two sets are called ByQuestionnaire and ByVariablename and can be found in the Data subfolder of your experiment. Here is the difference between the folders:

ByQuestionnaire

The ByQuestionnaire data folder organizes your data by Questionnaire File (hence the name). Before writing the data, MediaLab alphabetically orders all of the Questionnaire Files in the Experiment File. MediaLab then writes all of the data for each item in the first Questionnaire File, the second Questionnaire File, and so forth. If a subject did not receive a particular Questionnaire File in their condition, then missing values are written. Thus, the final data file lists items sorted first by the alphabetical order of the names of the Questionnaire Files used and then by the order of the items as they were programmed into each Questionnaire File. This results in a constant data format no matter which Questionnaire Files a subject received and the order in which they were administered.
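The ordering rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MediaLab's actual code: the questionnaire names, item names, and the empty-string missing value are all assumptions made for the example, and Python's sorted() is case-sensitive, which may differ from how MediaLab alphabetizes.

```python
MISSING = ""  # assumption: stand-in for whatever missing-value code is written

def by_questionnaire_columns(questionnaires):
    """Order columns by questionnaire name, then by programmed item order.

    questionnaires maps a questionnaire name to its list of item names
    in the order they were programmed.
    """
    columns = []
    for qname in sorted(questionnaires):       # alphabetical by questionnaire
        for item in questionnaires[qname]:     # item order left unchanged
            columns.append((qname, item))
    return columns

def build_row(columns, responses):
    """Fill one subject's row, writing MISSING for anything not received."""
    return [responses.get(col, MISSING) for col in columns]

# Hypothetical experiment with two questionnaires.
questionnaires = {"mood": ["happy", "sad"], "attitudes": ["q1", "q2"]}
columns = by_questionnaire_columns(questionnaires)

# This subject only received the "mood" questionnaire.
row = build_row(columns, {("mood", "happy"): 5, ("mood", "sad"): 2})
print(columns)  # [('attitudes', 'q1'), ('attitudes', 'q2'), ('mood', 'happy'), ('mood', 'sad')]
print(row)      # ['', '', 5, 2]
```

Because the column list is built from every questionnaire in the experiment, each subject's row has the same length and layout regardless of condition.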

ByVariablename

In contrast, the ByVariablename data folder organizes your data by variable names. Before writing the data, MediaLab goes through all of the Questionnaire Files in your Experiment File and alphabetically orders all of the item names. It then proceeds to write the data for all of the variables in this order, regardless of the questionnaire in which they occurred. Remember that the items will be in alphabetical order, so q10 would sort immediately after q1 and before q2, rather than following q9. If you are using a naming system like that, you might want to use q01 through q09 for the single-digit items instead. Because the ByVariablename data files ignore which Questionnaire File contains the item, the data for items with identical item names will all be written to the same column even if they are asked in different Questionnaire Files. Thus, the final data file lists all variables in alphabetical order of their item names, regardless of their Questionnaire File or order of presentation during a MediaLab session. This makes the data from most experiments very easy to analyze. It is also especially useful when you want to include the same item in different Questionnaire Files for different conditions within an Experiment File.
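The effect of alphabetical ordering on numbered item names is easy to check with a short Python sketch. Plain string sorting is used here as a stand-in for MediaLab's alphabetical ordering, and the zero_pad helper is hypothetical, shown only as one way to plan item names before programming a questionnaire.

```python
import re

def zero_pad(name, width=2):
    """Pad the trailing number of an item name, e.g. 'q9' -> 'q09'.

    Names without a trailing number are returned unchanged.
    """
    match = re.fullmatch(r"(.*?)(\d+)", name)
    if match is None:
        return name
    prefix, digits = match.groups()
    return prefix + digits.zfill(width)

names = ["q1", "q2", "q9", "q10"]

# Plain alphabetical order puts q10 right after q1.
unpadded_order = sorted(names)
print(unpadded_order)  # ['q1', 'q10', 'q2', 'q9']

# With zero-padded names, alphabetical order matches numeric order.
padded_order = sorted(zero_pad(n) for n in names)
print(padded_order)    # ['q01', 'q02', 'q09', 'q10']
```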

CSV vs TXT vs SAV Files

You will notice that MediaLab writes a .txt file and a .csv file to each folder. The .txt file is intended to be read into SPSS, which has no practical limit on cases or variables. The accompanying .sps syntax file that is generated will read this .txt file. The .csv file is intended to be read by pretty much any spreadsheet application, such as Excel. Many people prefer the Excel method because it's simpler, but note that Excel is limited to reading in the first 255 variables from your .csv file unless you are using Excel 2007 or later.

Notice that there is also a .sav file located in the ByVariablename folder. This is a native SPSS data file that can be opened directly in the Data Editor window of SPSS. You need to be careful with the .sav files because they are very sensitive to changes in the structure of your experiment. If you make changes to your study, it's usually best to delete (or move) the old .sav file and allow MediaLab to generate a fresh one for the revised experiment. Note that versions of SPSS prior to version 12 only allow variable names of up to 8 characters; more recent versions allow names of up to 64 bytes in length. This typically means a maximum of 64 characters in single-byte languages, such as English, French, and German, and a maximum of 32 characters in double-byte languages, such as Japanese, Chinese, and Korean. If you have any names longer than the limit allowed by your version of SPSS (e.g., with suffixes added by MediaLab), they will be renamed VAR0001, VAR0002, etc. MediaLab saves the original variable name as the variable label, so to find out which variables had to be renamed, you can double-click on them in SPSS.

Advanced Hint

Some people strongly prefer using the .csv data format but get stuck when they have more than 255 variables because they do not have Excel 2007 or later. If you try opening your .csv file in Excel and get the "file not loaded completely" message, you can deal with this by importing the file directly into SPSS. To do so, first rename the file from .csv to .txt. Then, in SPSS, make the following selections in order:

File > New > Data

File > Open

Files of type > Tab-delimited (*.dat,*.txt)

File name  <select your file>

Predefined format? No

Delimited? Yes

Variable names at top? Yes

First case = Line 2

Each line represents a case? Yes

All of the cases? Yes

Delimiter? Comma (uncheck all others)

Finish

If you have any trouble reading in the data, you can look at the data directly in any text editor. If the file is especially large and difficult to read in its raw form, try a powerful free text editor like PSPad from www.pspad.com.
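If you have Python available, a few lines can also report how many variables and cases a .csv data file contains before you try to open it elsewhere. This is a minimal sketch, not part of MediaLab: the function name summarize_csv and the sample data are invented for illustration, and it assumes the first line of the file holds the variable names.

```python
import csv
import io

def summarize_csv(handle):
    """Count variables and cases in a comma-delimited data file.

    handle is any open text file (or file-like object). The first line
    is assumed to contain the variable names; blank lines are skipped.
    """
    reader = csv.reader(handle)
    header = next(reader, [])
    n_cases = sum(1 for row in reader if row)
    return {"variables": len(header), "cases": n_cases, "names": header}

# Invented sample standing in for a real data file.
sample = io.StringIO("subject,q01,q02\n1,5,3\n2,,4\n")
summary = summarize_csv(sample)
print(summary["variables"], summary["cases"])  # 3 2

# For a real file, open it with newline="" as the csv module recommends:
# with open("your_data_file.csv", newline="") as f:
#     print(summarize_csv(f))
```

A variable count above 255 would indicate that older versions of Excel will not load the file completely.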

See also

Data FAQ