Why Use IsChecked Instead of Pos


CSPro 7.4 introduces a new function, ischecked, which returns whether a code is part of a check box field's selections. Prior to CSPro 7.4, we would use the pos function. So why use ischecked rather than pos?

Issue with Pos

Let's look at how the LANGUAGE_SPOKEN variable would be set up. Since a person could speak multiple languages we will use a check box and the language question might look like:

[Image: language check box]

Here English, French, Russian, and Spanish are checked. In previous versions of CSPro we would use the pos function to see if a specific language was checked. For example, if we wanted to know if French was checked we would use:

if pos("21", LANGUAGE_SPOKEN) then
    errmsg("French is checked");
else
    errmsg("French is not checked");
endif;

The error message "French is checked" would be issued. Continuing with our example we could ask if Russian is checked:

if pos("23", LANGUAGE_SPOKEN) then
    errmsg("Russian is checked");
else
    errmsg("Russian is not checked");
endif;

In this case pos would return 5, the position at which "23" starts in the stored string "11212324". Any nonzero value is treated as true, so the error message "Russian is checked" would be issued.

If we asked if Hindi is checked:

if pos("33", LANGUAGE_SPOKEN) then
    errmsg("Hindi is checked");
else
    errmsg("Hindi is not checked");
endif;

The pos function would return a 0 (false) and the message "Hindi is not checked" would be issued.

Now let's test if Bengali is checked:

if pos("32", LANGUAGE_SPOKEN) then
    errmsg("Bengali is checked");
else
    errmsg("Bengali is not checked");
endif;

The pos function would return 6 (true) and the message "Bengali is checked" would be issued.

But Bengali is not checked. What happened? The call pos("32", LANGUAGE_SPOKEN) searched the string "11212324" and found "32" in positions 6-7, returning 6.
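To see why the substring search misfires, the four lookups above can be reproduced outside CSPro. The following is a minimal Python sketch, not CSPro logic: the helper `pos` is a hypothetical stand-in that assumes a 1-based substring search returning 0 when nothing is found, and the field is assumed to hold "11212324".

```python
def pos(substring: str, text: str) -> int:
    """Hypothetical stand-in for a 1-based substring search (0 = not found)."""
    return text.find(substring) + 1


# Assumed stored value: English (11), French (21), Russian (23), Spanish (24).
language_spoken = "11212324"

for code in ("21", "23", "33", "32"):
    # A nonzero position is treated as "checked" by the naive test.
    print(code, pos(code, language_spoken))
```

The code "32" is reported at position 6 even though no 2-digit slot contains it: the match straddles the end of "23" and the start of "24".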

Explanation

The check box codes are placed at uniformly spaced offsets based on the size of the code. For example, if the check box field has a length of 20 and each code has a length of 2, then each selected code is placed in its own 2-digit slot: positions 1-2 for the first code, positions 3-4 for the next code, and so on.

[Image: language value set by form]

The data are stored in the file as shown here:

[Image: language data]

The pos function does not look by offset, but instead looks for a substring match. Unfortunately, the substring "32" does exist in the data, spanning the end of the code "23" and the start of the code "24", so a false match is found. In previous versions of CSPro we would need to loop through the string two characters at a time, comparing each 2-digit slot to the language code:

numeric languageFound = 0;

do varying numeric idx = 1 while idx <= length(LANGUAGE_SPOKEN) by 2
    if LANGUAGE_SPOKEN[idx:2] = "32" then
        languageFound = 1;
        break;
    endif;
enddo;

if languageFound then
    errmsg("Bengali is checked");
else
    errmsg("Bengali is not checked");
endif;

Moving Forward with IsChecked

CSPro 7.4 greatly simplifies this check with the ischecked function. The ischecked function checks for the codes at the appropriate offsets; in this case, the function checks positions 1-2, 3-4, 5-6, 7-8, ..., 19-20 for the code "32". To check for Bengali we simply use:

if ischecked("32", LANGUAGE_SPOKEN) then
    errmsg("Bengali is checked");
else
    errmsg("Bengali is not checked");
endif;

The ischecked function returns a 0 (false) since "32" is not found within one of the uniformly spaced offsets. To do this CSPro requires the codes to be a uniform length. Notice all codes in this example were length 2.
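The offset-based comparison can be sketched in Python for illustration. This is not how CSPro implements ischecked; `is_checked` is a hypothetical helper that assumes the field is a fixed-width string of length 20 and that every slot is as wide as the code being tested.

```python
def is_checked(code: str, field: str) -> bool:
    """Hypothetical offset-based check: compare the code to each fixed-width slot."""
    width = len(code)
    # Cut the field into consecutive slots: positions 1-2, 3-4, 5-6, and so on.
    slots = [field[i:i + width] for i in range(0, len(field), width)]
    return code in slots


# Assumed stored value: four selected 2-digit codes, blank-padded to length 20.
language_spoken = "11212324".ljust(20)

print(is_checked("23", language_spoken))  # Russian occupies its own slot
print(is_checked("32", language_spoken))  # "32" only appears across two slots
```

Because the comparison is made slot by slot, a code that only appears across a slot boundary is never matched, which is why uniform code lengths matter.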

Hierarchical Value Sets Using a Selcase Trick


The selcase ("select case") function is used to display a list of cases in an external dictionary, letting an interviewer select a case to load. One feature not mentioned in the selcase help documentation is the ability for the user to select multiple cases. By using the multiple keyword, the interviewer can select more than one case and then iterate over each of those cases using a for loop.

An undocumented feature allows all qualifying cases to be marked automatically: with the automark keyword, the selcase dialog is not shown to the interviewer. These two code samples are equivalent:

// create a dynamic value set showing the EAs in Region 1 / District 1
ValueSet ea_vs;

// code demo 1 --- using a forcase loop
forcase EA_DICT where REGION = 1 and DISTRICT = 1 do
    ea_vs.add(EA_NAME, EA);
endfor;

// code demo 2 --- automatically marking cases and then using a for loop
selcase(EA_DICT, "") where REGION = 1 and DISTRICT = 1 multiple(automark);

for EA_DICT do
    ea_vs.add(EA_NAME, EA);
endfor;

Because selcase allows you to pass a key to match ("" in the example above, which means all case keys), you can use this as a trick to efficiently create value sets if you know what part of the key is. For example, if you have a data file with 50,000 cases, forcase will always loop through all 50,000 cases, whereas providing a key match may limit your loop to substantially fewer cases.

To show a possible use for this trick, we will look at two ways of creating hierarchical value sets for geocodes. Supposing we have three levels of geography—Region, District, and EA—one way to structure a geocode lookup file is as follows:

Region  District  EA  Geocode Name
1                     Region 1
1       2             District 2 in Region 1
1       2         5   EA 5 in Region 1 / District 2

That is, when defining regions, the district and EA codes are left blank, and when defining districts, the EA code is left blank.

Using a forcase loop to populate the districts based on a selected region, we would loop over the entire data file, filtering on cases where the geocode region matches the selected region, where the geocode district is defined, but where the geocode EA is blank:

PROC DISTRICT

preproc

    ValueSet geography_vs;

    forcase GEOCODES_DICT where GEOCODE_REGION = REGION and
                                GEOCODE_DISTRICT <> notappl and
                                GEOCODE_EA = notappl do

        geography_vs.add(GEOCODE_NAME, GEOCODE_DISTRICT);

    endfor;

    setvalueset(DISTRICT, geography_vs);

Prior to this, to generate the region value set, we would look for cases where the geocode district is blank, and then following this, to generate the EA value set, we would look for cases where the region and district match the selected codes and where the EA is defined. To generate the hierarchical value sets for the three levels of geography would require fairly different loops.

With the selcase automark trick, we can create a single function that can be used to generate the value set for each level of geography:

PROC GLOBAL

ValueSet geography_vs;

function CreateGeographyVS(string already_selected_key, numeric geocode_length)

    // clear the dynamic value set
    geography_vs.clear();

    // automatically select all cases that match the key passed in as a parameter
    selcase(GEOCODES_DICT, already_selected_key) multiple(automark);

    // this geocode starts at the position after the already selected key
    numeric new_key_offset = length(already_selected_key) + 1;

    // loop over the cases that match the already selected key
    for GEOCODES_DICT do

        // extract the remaining geocodes
        string new_key_portion = strip(key(GEOCODES_DICT)[new_key_offset]);

        // when the remaining geocodes match the geocode length we are expecting,
        // this is a match so add it to the value set
        if length(new_key_portion) = geocode_length then
            geography_vs.add(GEOCODE_NAME, tonumber(new_key_portion));
        endif;

    endfor;

    // make sure that there was at least one geocode for this hierarchical level
    if geography_vs.length() = 0 then
        errmsg("Geocode lookup error ... the geocode database is not complete.");
        stop(1);
    endif;

end;

We call this function from each procedure, specifying the currently selected geocode and the geocode length at each level. In this example, we assume that the region is length 1 and that the other two geocodes are length 2:

PROC REGION

preproc

    CreateGeographyVS("", 1);
    setvalueset(REGION, geography_vs);

PROC DISTRICT

preproc

    CreateGeographyVS(maketext("%v", REGION), 2);
    setvalueset(DISTRICT, geography_vs);

PROC EA

preproc

    CreateGeographyVS(maketext("%v%v", REGION, DISTRICT), 2);
    setvalueset(EA, geography_vs);

Now we have a generalizable function that we can use in our censuses or surveys, a function that will work with any number of levels of geography.
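The key-matching logic inside CreateGeographyVS can be traced with a small Python sketch. It is only an illustration, not CSPro code. It assumes that case keys are fixed-width strings (region length 1, district and EA length 2), that codes are zero-filled, and that undefined levels are blank. The sample keys and the name `create_geography_vs` are hypothetical.

```python
def create_geography_vs(cases, already_selected_key, geocode_length):
    """Return (label, code) pairs for the next geography level under a key prefix."""
    value_set = []
    offset = len(already_selected_key)
    for key, name in cases:
        # Keep only the cases whose key starts with the already selected geocodes.
        if not key.startswith(already_selected_key):
            continue
        # What remains after the prefix, with trailing blanks removed.
        remaining = key[offset:].rstrip()
        # Exactly one more level is defined when the remainder has the expected length.
        if len(remaining) == geocode_length:
            value_set.append((name, int(remaining)))
    if not value_set:
        raise LookupError("Geocode lookup error: the geocode database is not complete.")
    return value_set


# Hypothetical lookup data: (key, geocode name).
cases = [
    ("1    ", "Region 1"),
    ("102  ", "District 2 in Region 1"),
    ("10205", "EA 5 in Region 1 / District 2"),
    ("2    ", "Region 2"),
]

print(create_geography_vs(cases, "", 1))     # regions
print(create_geography_vs(cases, "1", 2))    # districts in region 1
print(create_geography_vs(cases, "102", 2))  # EAs in region 1 / district 2
```

The same length test serves every level: a remainder longer than the expected length belongs to a deeper level, and an empty remainder belongs to the level already selected.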

Dynamic Value Sets With the New ValueSet Object


CSPro 7.3 introduces new ways to work with dynamic value sets. Dynamic value sets define the acceptable options for a field and they vary based on responses previously given. Typical value sets, defined in the data dictionary, define a fixed set of responses for a field, but with a dynamic value set, you can customize these responses based on specific conditions.

Prior to CSPro 7.3, you could create dynamic value sets using arrays, but working with these was cumbersome and not intuitive. Now there is a ValueSet object that allows for simpler, and more sophisticated, value set creation. Four scenarios are presented below that show how to use the new ValueSet object.

Easily Create a Dynamic Value Set in a Loop

A typical task is to create a value set based on some attributes entered previously. For example, you might want to present a list of people in a household who are aged 15+ as eligible heads of household. Using the ValueSet object with a for loop with a where condition makes this task trivial:

PROC HOUSEHOLD_HEAD

preproc

    ValueSet household_head_vs;

    for numeric line_number in PERSON_ROSTER where AGE >= 15 do
        household_head_vs.add(NAME, line_number);
    endfor;

    setvalueset(HOUSEHOLD_HEAD, household_head_vs);

Combining Value Sets

Suppose you have a question that asks how someone died. In the dictionary there is one set of responses that applies to all people and an additional set of responses that applies to females aged 12+. Now you can easily create a dynamic value set, conditionally adding the female aged 12+ responses:

PROC MORTALITY_REASON

onfocus

    ValueSet mortality_reason_vs = MORTALITY_REASON_ALL_PEOPLE_VS;

    if SEX = 2 and AGE >= 12 then
        mortality_reason_vs.add(MORTALITY_REASON_FERTILE_WOMEN_VS);
    endif;

    setvalueset(MORTALITY_REASON, mortality_reason_vs);

Removing a Value Based on a Previous Selection

Sometimes a questionnaire has a series of questions that ask about preferences, such as "What is your favorite color?" followed by "What is your second favorite color?" The list of options for the second question can exclude the answer selected for the first question. The ValueSet object makes this task very easy:

PROC SECOND_FAVORITE_COLOR

preproc

    ValueSet second_favorite_color_vs = FAVORITE_COLOR_VS;

    second_favorite_color_vs.remove(FAVORITE_COLOR);

    setvalueset(SECOND_FAVORITE_COLOR, second_favorite_color_vs);

Iterate Through Value Set Codes and Labels

Finally, there are two lists that are part of a value set, accessed using the codes and labels attributes. Just as ValueSet is a new object in CSPro 7.3, lists, though around in some form for years, are now fully usable objects. This simplifies iterating through the codes and labels of a value set. For example, if the first two digits of the county code are equal to the state code, a dynamic value set for counties could be created as follows:

PROC COUNTY

preproc

    ValueSet filtered_county_vs;

    numeric first_county_code = STATE * 100;
    numeric last_county_code = first_county_code + 99;

    do numeric counter = 1 while counter <= COUNTY_VS.codes.length()

        if COUNTY_VS.codes(counter) in first_county_code:last_county_code then
            filtered_county_vs.add(COUNTY_VS.labels(counter), COUNTY_VS.codes(counter));
        endif;

    enddo;

    setvalueset(COUNTY, filtered_county_vs);

Validating Text Fields with Regular Expressions


There are many ways of formatting the text data you collect in a CSPro application. For example, in the United States it is common to write a telephone number as xxx-xxx-xxxx or (xxx) xxx-xxxx. If only a text field is used, the interviewer could enter either format. However, not knowing the format creates extra work post-data collection, so as the application developer you will want to accept a single format.

This is done using the regexmatch function, which was introduced in CSPro 7.2. The function takes two strings, the target and a regular expression, and returns whether there is a match. In this example, the target string is the telephone number and the regular expression string describes the valid variations of the telephone number.

Regular expressions have their own syntax separate from CSPro logic. To help write your regular expression you can use any regular expression editor that supports the ECMAScript (JavaScript) engine (or flavor).

Writing a Regular Expression

Let us write a regular expression that describes a telephone number in the following format: xxx-xxx-xxxx. We will use the online regular expression editor regex101; make sure to select ECMAScript as the flavor. Start by typing the phone number 123-456-7890 into the test string field. As you write the regular expression, you will notice that the portion of the test string it describes is highlighted.

Step 1

Regular expression so far: ^

Begin your regular expression with a caret, which asserts the position at the start of a line. This will keep your phone number from matching something like otherData123-456-7890.

Step 2

Regular expression so far: ^[0-9]

The first character is any number from 0 to 9.

Step 3

Regular expression so far: ^[0-9]{3}

The following two characters are also any numbers from 0 to 9. Signal that the pattern will repeat three times.

Step 4

Regular expression so far: ^[0-9]{3}-

The next character is a hyphen, and will match nothing else, so enter the literal hyphen character.

Step 5

Regular expression so far: ^([0-9]{3}-){2}

Notice that the pattern of the next four characters is the same as that of the previous four. Wrap everything but the caret in parentheses to create a capture group, and signal that the pattern will repeat two times.

Step 6

Regular expression so far: ^([0-9]{3}-){2}[0-9]{4}

The last four characters are any numbers from 0 to 9. Signal that the pattern will repeat four times.

Step 7

Regular expression so far: ^([0-9]{3}-){2}[0-9]{4}$

Finally, end your regular expression with a dollar sign, which asserts the position at the end of a line. This will keep your phone number from matching something like 123-456-7890otherData.
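The finished pattern can also be checked quickly in a script. The sketch below uses Python's `re` module rather than CSPro or an ECMAScript engine. The pattern relies only on anchors, a character class, a group, and counted repetition, so it is expected to behave the same way here, but treat this as an illustration. The function name `is_valid_phone` is hypothetical.

```python
import re

# The same pattern as built in the steps above.
PHONE_PATTERN = re.compile(r"^([0-9]{3}-){2}[0-9]{4}$")


def is_valid_phone(text: str) -> bool:
    """Return True only when the whole string has the form xxx-xxx-xxxx."""
    # fullmatch requires the entire string to match, so a trailing newline is rejected too.
    return PHONE_PATTERN.fullmatch(text) is not None


for candidate in ("123-456-7890", "otherData123-456-7890",
                  "123-456-7890otherData", "(123) 456-7890"):
    print(candidate, is_valid_phone(candidate))
```

Only the first candidate is accepted; the two anchors reject the padded strings, and the parenthesized format fails on its first character.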

Validating a Text Field

With your regular expression in hand, you are ready to validate the telephone number in CSPro. Call regexmatch passing in the telephone number and the regular expression. If 0 is returned then display an error message and re-enter. This allows the interviewer to correct the telephone number. Otherwise, if 1 is returned, do nothing and let the interview continue.

PROC TELEPHONE_NUMBER

postproc

    if regexmatch(TELEPHONE_NUMBER, "^([0-9]{3}-){2}[0-9]{4}$") = 0 then
        errmsg("Invalid format! Use the following format: xxx-xxx-xxxx.");
        reenter;
    endif;

To see a working example, download the regexmatch application.

Disabling Automatic Updates on Android


Before sending interviewers into the field, consider what would happen if CSEntry automatically updated during your survey. If you developed a CSPro application for a previous version or the current version of CSEntry, you may not want the next update. Fortunately, it is simple to disable automatic updates for CSEntry or all apps on Android.

Disable automatic updates for CSEntry

  1. Open the Google Play Store.
  2. Tap Menu > My apps & games.
  3. Select CSEntry.
  4. Tap More.
  5. Uncheck the box next to "Auto-update."

Disable automatic updates for all apps

  1. Open the Google Play Store.
  2. Tap Menu > Settings.
  3. Tap Auto-update apps.
  4. Select Do not auto-update apps.

Update apps manually

  1. Open the Google Play Store.
  2. Tap Menu > My apps & games.
  3. Apps with an update available are labeled "Update."
  4. Tap Update All to update all apps. For individual apps, find the specific app you want to update and tap Update.

Tip: In some cases, you may need to restart your device to update an app.