Define File Associations

The Define File Associations dialog box is presented when a CSBatch application is executed. This page explains why each file is requested. To find out more about the allowable Source Type options, see Data Sources.
CSBatch requires a minimum of two files to run (an Input Data File and a Listing File), although the Define File Associations dialog always asks for at least three, because you can also specify an optional Output Data File. The basic run dialog is shown below:
[Figure: the basic Define File Associations dialog]
CSBatch applications require additional file definitions when any of the following situations occurs:
  • Lookup files are used.
  • write functions or Freq statements are used.
  • the impute function is used with the stat command, or set impute(stat, on) is included.
  • the Array declaration is present with the save option.
If all of these features were used, the expanded CSBatch run dialog would appear as shown below. Note the file associations listed in the Source Type column: these are the default file types expected for each file, though most files allow more than one option. Also note that four of the files are given default filenames, as explained further below.
[Figure: the expanded Define File Associations dialog]
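The rules on this page can be summarized in a short Python sketch. This is only an illustrative model of the dialog's behavior as described here, not CSBatch's actual implementation: the ProgramFeatures class, its field names, and both functions are hypothetical. The Paradata Log entry is left out because this page does not state what triggers it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProgramFeatures:
    """Hypothetical summary of the features a batch program uses."""
    lookup_dictionaries: Tuple[str, ...] = ()  # internal lookup dictionary names
    uses_write: bool = False        # program calls the write function
    uses_freq: bool = False         # program contains Freq statements
    uses_impute: bool = False       # program calls the impute function
    uses_impute_stat: bool = False  # impute with stat, or set impute(stat, on)
    uses_saved_array: bool = False  # an Array is declared with save


def requested_files(features: ProgramFeatures) -> List[str]:
    """Return the dialog entries, roughly in the order this page lists them."""
    entries = ["Input Data", "Output Data"]
    entries += [f"External Data ({name})" for name in features.lookup_dictionaries]
    if features.uses_write:
        entries.append("<Write File>")
    entries.append("<Listing File>")  # always requested
    if features.uses_freq:
        entries.append("<Freq File>")
    if features.uses_impute:
        entries.append("<Impute Freq File>")
    if features.uses_impute_stat:
        entries.append("<Impute Stat Data>")
    if features.uses_saved_array:
        entries.append("<Save Array File>")
    return entries


def default_file_names(app_name: str) -> Dict[str, str]:
    """Default names this page gives for four of the files."""
    return {
        "<Listing File>": f"{app_name}.lst",
        "<Impute Freq File>": f"{app_name}_impute_freq.lst",
        "<Impute Stat Data>": f"{app_name}_impute_stat.csdb",
        "<Save Array File>": f"{app_name}.sva",
    }
```

With no optional features, requested_files(ProgramFeatures()) returns only Input Data, Output Data, and <Listing File>, matching the three entries of the basic dialog.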
The sections below describe each file that may be requested, roughly in the order in which it appears in the dialog box (which depends on the options chosen), and explain why each is needed.
Input Data
This is the file that your program will run on. For example, if your program contains a series of edits to be performed on your data files, this is where you name those files. You can select a single input file or multiple input files. Input data files are never altered; they are only opened, read, and closed. You also have the option of not using any data file by selecting None as your input data file. This could be useful, for example, when writing a utility application that merges selected fields from two or more external (lookup) files, based on very specific universe criteria.
Output Data
The Output Data file is where you can choose to write out the input data file. The contents of the resulting file(s) depend on the edits your program made to the input data (if any) and on the format you choose for the output.
As with the input file, you can specify None as your output file. This can be useful while you are developing and debugging your program, when you might prefer to follow what is going on through errmsg function calls rather than by reviewing edited data.
If you have not made any edits to your input file in the batch application, then any output files specified will be, essentially, a copy of the input file. This can be useful if your intent is to write out the input data file into other format(s), such as Excel, SAS, or SPSS; or if you wish to output selected records from your input data file, rather than the entire file.
You can also choose to specify more than one file, should you wish to export the edited data to more than one type of file format. For example, you may wish to continue working with CSPro DB files, but subject-matter staff may prefer reviewing the edited data using R.
External Data (dictionary name)
For each lookup file included in your application, a separate line appears in which you supply the data file to which the lookup dictionary refers. The name displayed within the parentheses is the unique (internal) dictionary name. If the file does not exist, it will be created.
You also have the option to specify None for your lookup file. This can be useful if you know which lookup files you plan to use in your application and have defined their dictionaries, but do not yet have the data files themselves ready to use.
<Write File>
If your program makes one or more write function calls, CSBatch will ask you for the file to write them all to. This file will be a text file, regardless of the file extension used. If you fail to name a file, all write function text will be written to the listing file.
<Listing File>
This file contains a summary of the results of your run and must be provided. It will tell you the input data file used, start and stop times, the number of records read, and how many had a "bad structure" (i.e., required records were missing). If there are any errmsg functions in your program, they will be written to this file after the summary information. As seen in both screenshots above, the default file name with extension will be <AppName>.lst, but both can be changed by the user.
<Freq File>
If your program includes one or more Freq statements, CSBatch will ask you for the file to save these frequencies to. If you do not provide a file name, CSBatch will execute, but it will not generate an error message about the expected file being missing. As seen in the expanded dialog above, the default file extension will be .txt, though this can be changed by the user. Creating a CSPro table by using the .tbw source type is a good second choice.
<Impute Freq File>
If your program includes one or more impute function calls, CSBatch will ask you for the file to save these imputations to. If you do not provide a file name, CSBatch will run on your data, but you will receive an error message about the expected file being missing. As seen in the expanded dialog above, the default file name with extension will be <AppName>_impute_freq.lst, but both can be changed by the user.
<Impute Stat Data>
If your program includes one or more impute functions that use the stat command, or if you include set impute(stat, on); in your program, CSBatch will ask you for the name of a data file to save these imputation statistics to. As seen in the expanded dialog above, the default file name with extension will be <AppName>_impute_stat.csdb, but both can be changed by the user.
<Save Array File>
If your program uses Array objects, you can choose to save the array values between program runs with the optional keyword save. When the save option is used, CSBatch will prompt you for a filename. By default the file has the same name as the application, with an .sva file extension. The leading portion of the file name can be changed, but the extension cannot.
<Paradata Log>
Paradata log files contain information about paradata events stored during an application's run. Files created using this type have the extension .cslog and can be viewed using Paradata Viewer.