At the end of this session participants will be able to:
- Use lookup files in data entry
- Create user-defined functions in CSPro
- Extend the interface of CSEntry with userbar buttons
- Use the commands advance and move to navigate through the questionnaire
- Use the function visualvalue to get the value of variables that are "off path"
- Use the command showarray to display tables to the interviewer
It is currently possible for the interviewer to enter a district code that is not valid for the province that was selected. In order to verify that the district code is valid for the province we need to use the list of province and district codes in the annex of the questionnaire. We can do this by creating a lookup file from the Excel spreadsheet. We can then use this lookup file to check for valid districts.
First, we need to create a dictionary for our lookup file. For this task, we want to be able to query if the combination of the province and district codes is a valid district. In annex 2 of the questionnaire we have a table with the province code, district code and district name. We can use this as a lookup file where the keys are the province and district code and the value is the district name. We can create a dictionary for this using the Excel to CSPro tool from the tools menu. Select the "Create CSPro dictionary from Excel File" tab, click "Select Excel File" and browse to the file QuestionnaireAnnexes.xlsx. Choose "ANN2 – District Codes" and click "Analyze Worksheet". This detects the dictionary variables to create from the columns of the spreadsheet. The tool will show four variables, one for each column and will set the default names to column headers of the spreadsheet. Our lookup file dictionary will have the province code and district code as the ID items and province and district names as regular variables. Next to the province and district codes check the "ID" box to indicate that these should be ID items. In order to avoid name conflicts with the main dictionary we can use DI_PROVINCE_CODE, DI_DISTRICT_CODE, DI_PROVINCE_NAME, and DI_DISTRICT_NAME as the variable names. Modify them in the Excel to CSPro tool.
We must make sure that the lengths and zero-fill settings of the ID items exactly match those in the main dictionary otherwise we won't be able to use the variables from the main dictionary as keys for the lookup. Verify that the length of the province and district codes match those in the main dictionary and check the box "Zero fill numerics" to make sure that the ID items are zero filled. Once all settings are correct, click create dictionary to generate the CSPro dictionary file. Save it as "Districts.dcf" in the household folder.
Once we have the dictionary we need to convert the Excel spreadsheet into a CSPro data file. We can use the first tab of the Excel to CSPro tool to do this. Select the QuestionnaireAnnexes.xlsx file again along with the Districts dictionary that we just created. Specify a new output data file to write the lookup file to. Under "Record to Worksheet Mapping" choose the second worksheet "ANN 2-District Codes". Under the "Item to Column Mapping" link the dictionary variables to the appropriate columns in the Excel spreadsheet: DI_PROVINCE_CODE to "Column B: Province Code", DI_DISTRICT_CODE to "Column D: District Code", etc… Finally, click "Create CSPro Data File" to generate the file. Save it in the Data folder. Verify the file in Data Manager to make it sure it was converted correctly.
Finally, in the district proc use the
loadcase() command to lookup the province and district codes in the file. Loadcase takes the name of the dictionary (
DISTRICT_DICT) and the values to use as keys (ID items) for the lookup. For example, to lookup province 3, district 6 we would do:
If loadcase finds a record in the lookup file with province code 3 and district code 6 it will return 1 and set the variables in DISTRICT_DICT to the values from the case it found. In this case that means setting the two ID items DI_PROVINCE_CODE, DI_DISTRICT_CODE and the variables DI_DISTRICT_NAME and DI_PROVINCE_NAME.
We can use this to test if the province and district codes are valid in the DISTRICT proc:
PROC DISTRICT
// Verify that the district code is valid for the province selected.
if loadcase(DISTRICT_DICT, PROVINCE, DISTRICT) = 0 then
errmsg("District code %d is not valid for province %l",
DISTRICT, PROVINCE);
reenter;
else
errmsg("You have selected district: %s", DI_DISTRICT_NAME);
endif;
Note that we are using the
PROVINCE and
DISTRICT from the main dictionary as arguments to
loadcase, not the ID items from the districts dictionary. Before calling loadcase the ID items for the external dictionary are all blank. They are only set if
loadcase is successful.
Note that when you run this application, in addition to copying the pen and pff files to the Android device, you must now also copy the lookup file (the .csdb file).
We can now add alpha variables to the main dictionary, assign the province and district names to them and display them on the form as protected fields:
PROC DISTRICT
// Verify that the district code is valid for the province
// selected.
if loadcase(DISTRICT_DICT, PROVINCE, DISTRICT) = 0 then
errmsg("District code %d is not valid for province %l",
DISTRICT, PROVINCE);
reenter;
else
// Assign province and district names from lookup to main
// dictionary variables so we can display them on form.
PROVINCE_NAME = DI_PROVINCE_NAME;
DISTRICT_NAME = DI_DISTRICT_NAME;
endif;
If the data for your lookup file changes once your survey is in the field, rather than completely regenerating the lookup file you can update the existing lookup data file based on a new Excel spreadsheet. Choose the existing csdb file as the output data file and under "Case Management" choose "Modify, add cases" or "Modify, add, delete cases". Instead of erasing and regenerating the data file this will update only the cases that are new or have been modified in the Excel file. Choosing "Modify, add, delete cases" will also delete cases that are in the data file but are not in the spreadsheet. Using this approach is important if you are planning to use CSPro data synchronization to update the lookup file during data collection. If you are using data synchronization to update the file and you generate a completely new data file you will get duplicate cases when the new lookup file is downloaded to the devices. If instead, you update the existing file then data synchronization will simply update the data files on devices that download it.
An alternative to using the lookup file to check the district and province codes is to combine the lookup file with
setvalueset to create a dynamic value set for the district that includes only districts in the selected province. To do this we need to extract all the districts in the selected province from the lookup file. We can do this using the
forcase loop which iterates over all cases in an external data file. By itself,
forcase will loop through each case in the data file. You can also add an optional "where" clause to only go through the districts in the selected province.
PROC DISTRICT
onfocus
// Create dynamic value set of districts for selected province using lookup file
ValueSet districtsVSet;
forcase DISTRICT_DICT where DI_PROVINCE_CODE = PROVINCE do
districtsVSet.add(DI_DISTRICT_NAME, DI_DISTRICT_CODE);
endfor;
setvalueset(DISTRICT, districtsVSet);
Often you find you have identical blocks of logic in multiple procs in your application. This can cause problems if you later change the code in one place to fix a bug and forget to change it in the other. In such situations it is better to put the logic in a user-defined function which you can then call from all the procs where it is used. User-defined functions are defined in the PROC GLOBAL. Anything declared in the PROC global is available in all the PROCs in your program. If you declare a logic variable inside the PROC of a dictionary variable or group, it is only available inside that PROC but if you declare it in the PROC GLOBAL you can use it anywhere. To view the proc GLOBAL, in the logic view, click on the first item in the form tree. This shows all of the program logic at once: the proc GLOBAL plus all the other procs. Clicking on any other item in the form tree shows just the procs for that item.
User-defined functions can take arguments and return values just like built in functions. You call them the same way you call built in CSPro functions. Let's define a function that we can use in all the places where we check if a household member is a woman of childbearing age. This function will be passed the index (row number) of the household member in the household roster and it will return one if the person is a woman over 12 years old and zero otherwise. This way if we later decide that we should be using 13 or 14 as a minimum age we only have to make the change in one place.
// Determine if member of household from household roster is a woman of childbearing
// age. Pass in the index (occurrence number) of the household member.
function isChildbearingWoman(index)
if SEX(index) = 2 and AGE(index) >= 12 then
isChildbearingWoman = 1;
else
isChildbearingWoman = 0;
endif;
end;
Now we can use this function when building the value sets for line number of mother of child (B10) and the line number of mother of deceased (E08).
PROC MOTHER_LINE_NUMBER
onfocus
// Create the value set for child mother from all eligible women
// in household roster.
ValueSet motherVSet;
do numeric indexRoster = 1 while indexRoster <= totocc(HOUSEHOLD_MEMBERS_ROSTER)
if isChildbearingWoman(indexRoster) = 1 and indexRoster <> curocc() then
motherVSet.add(NAME(indexRoster), indexRoster);
endif;
enddo;
motherVSet.add("Non-resident",87);
motherVSet.add("Deceased",88);
setvalueset(MOTHER_LINE_NUMBER, motherVSet);
PROC MOTHER_OF_DECEASED_LINE_NUMBER
onfocus
// Create the value set for deceased mother from all eligible women
// in household roster
ValueSet motherVSet;
do numeric indexRoster = 1 while indexRoster <= totocc(HOUSEHOLD_MEMBERS_ROSTER)
if isChildbearingWoman(indexRoster) = 1 then
motherVSet.add(NAME(indexRoster), indexRoster);
endif;
enddo;
motherVSet.add("not in household",99);
setvalueset(DECEASED_MOTHERS_LINE_NUMBER, motherVSet);
Looking at these two procs we could probably move the shared code that creates the value set into a function as well.
// Create a value set of all household members that are eligible to be
// mothers by filling in the ValueSet motherVSet that is passed as
// the first argument.
// Labels are names of household members and codes are the corresponding
// line numbers.
// The household member in row excludeIndex will not be included in the value
// set. This can be used to exclude someone from being their own mother.
function createMothersValueSet(ValueSet motherVSet, excludeIndex)
do numeric indexRoster = 1 while indexRoster <= totocc(HOUSEHOLD_MEMBERS_ROSTER)
if isEligibleMother(indexRoster) = 1 and indexRoster <> excludeIndex then
motherVSet.add(NAME(indexRoster), indexRoster);
endif;
enddo;
end;
PROC MOTHER_LINE_NUMBER
onfocus
// Create the value set for child mother from all eligible
// women in household roster. Exclude current occurrence so
// that person cannot be their own mother.
ValueSet motherVSet;
createMothersValueSet(motherVSet, curocc());
// Add additional entries for non-resident and deceased
motherVSet.add("Non-resident",87);
motherVSet.add("Deceased",88);
// Update the value set with values in ValueSet object
setvalueset(MOTHER_LINE_NUMBER, motherVSet);
PROC MOTHER_OF_DECEASED_LINE_NUMBER
onfocus
// Create the value set for child mother from all eligible
// women in household roster
ValueSet motherVSet;
createMothersValueSet(motherVSet, 0);
// Add additional entry for not in household
motherVSet.add("not in household",99);
// Update the value set with values in codes and labels
setvalueset($, motherVSet);
Another use for user-defined functions is for fills in question that require some calculation. For example, for question D05 we want to double check that the total of the children living with the women, living elsewhere and deceased equals the number of total births. We do this by asking the question:
Just to make sure that I have this right, (name) has had in total (total number) births during her life. Is this correct?
We can use the question text for this but we don't have a dictionary variable for total births. We only have the yes/no variable IS_TOTAL_BIRTHS_CORRECT. We could create an additional dictionary variable but instead we can simply create a function to compute the total births and use that as the fill value.
// Compute totalBirths to display in question text in fertility section
function totalBirths()
totalBirths = CHILDREN_IN_HOUSEHOLD + CHILDREN_ELSEWHERE + CHILDREN_DECEASED;
end;
In the question text we can simply add the function call surrounded by ~~:
Just to make sure that I have this right, ~~NAME~~ has had in total ~~totalBirths()~~ births during her life. Is this correct?
What happens when one of the fields in this calculation is skipped? The value becomes
notappl which messes up the entire calculation. We need to be a little smarter in calculating our total to exclude the skipped values.
// Compute totalBirths to display in question text in fertility section
function totalBirths()
numeric total = 0;
if CHILDREN_IN_HOUSEHOLD <> notappl then
total = total + CHILDREN_IN_HOUSEHOLD;
endif;
if CHILDREN_ELSEWHERE <> notappl then
total = total + CHILDREN_ELSEWHERE;
endif;
if CHILDREN_DECEASED <> notappl then
total = total + CHILDREN_DECEASED;
endif;
totalBirths = total;
end;
From logic we can add buttons to the CSEntry user interface that call user-defined functions. Here is how to add a userbar button that will create an errmsg dialog that says "hello". First we define the function hello in the PROC GLOBAL:
Then in the preproc of the application we add it to the userbar:
When adding a button in the preproc of the application it is important to call clear first, otherwise we can end up with two or three copies of the same button if we start the application multiple times. We added the button in the preproc of the application but you can add or remove buttons in any proc so you can, for example, only a show button within a certain field or a certain roster.
Let's try a more interesting example. Let's add a "Go To…" button that will let the user navigate directly to a particular section of the questionnaire.
The above will let the interviewer jump from one section to another but using
skip means that if we use our
Go To… button to jump over a section, that section will end up skipped and won't be saved in the data file. Instead of using
skip we can use
advance which moves forward in the questionnaire without marking fields as skipped. Using
advance also runs all the preprocs and postprocs of the fields that are passed through to ensure that no consistency or out of range checks are missed.
This stops the function from skipping over data when navigating, however it still has a limitation. We can only navigate forward in the questionnaire. To go backwards we need to use
reenter, but in our goto() function we don't know if the user wants to move forward or backward. Fortunately, CSPro provides the command
move which will use either
skip or
reenter as appropriate. By default, when going forward, move does a skip but you can add
advance after the field name to make it do an
advance instead of a
skip.
Let's extend our goto function to let the interviewer navigate directly to an individual in the household roster. If they choose "household members" then instead of going to the first person in the roster, we will show them a list of all household members in the roster and the let them choose which one to goto. For this we can use the function
showarray(). This function takes an array of values and displays them in a grid in a dialog box and returns the row number that the user picks.
An array logic variable is similar to a dictionary item with occurrences. A numeric array of length seven stores seven numbers, each of which is accessed through subscripts.
showarray() takes a two-dimensional array. You can think of a two-dimensional array as a grid of variables or as a matrix. You declare a two-dimensional array the same way you declare a one-dimensional array except that you specify the size in both dimensions: number of rows and number of columns.
In our case we want up to 30 rows, one for each person, and we will use 3 columns so that we can display the name, sex and relationship for each person.
When assigning a value to a two-dimensional array you specify both the row and column you want to put the value in:
// Set value in row 4, column 2
householdMembersArray(4, 2) = "This is the 4th row, 2nd column";
In our examples we will loop through the members in the household roster and add the name, sex and relationship for each one to our array. Since this will be more than a few lines of code let's put this into a function by itself and then call it from our goto() function.
// Show list of entries in household roster in a dialog and let interviewer pick
// one. Returns the row number of the person that was picked or zero if the
// dialog was canceled.
function pickFromHouseholdRoster()
numeric i;
do i = 1 while i <= totocc(HOUSEHOLD_MEMBERS_ROSTER)
householdMembersArray(i, 1) = strip(NAME(i));
householdMembersArray(i, 2) = getlabel(SEX, SEX(i));
householdMembersArray(i, 3) = getlabel(RELATIONSHIP, RELATIONSHIP(i));
enddo;
householdMembersArray(i, 1) = ""; // Mark end
numeric picked = showarray(householdMembersArray, title("Name", "Sex",
"Relationship"));
pickFromHouseholdRoster = picked;
end;
// Userbar function for navigating
// directly to different parts of questionnaire.
function goto()
numeric section = accept("Go to?",
"Identification",
"Household members",
"Demographics");
if section = 1 then
move to IDENTIFICATION_FORM advance;
elseif section = 2 then
numeric index = pickFromHouseholdRoster();
if index > 0 then
move to NAME(index) advance;
endif;
elseif section = 3 then
move to DEMOGRAPHICS_FORM advance;
endif;
end;
Note that for sex and relationship we want the value set labels so we use the function getlabel(). This function returns the label from value set as an alpha value. It takes the name of the variable (or value set) and the value to use. For example
getlabel(SEX, 1) will return "Male". Inside our loop,
getlabel(SEX, SEX(i)) will return the label for the sex of the ith household member.
Our goto() function works when we are in section B or C but when we are in section A the sex and relationship are blank. What is going on? In system controlled mode, CSEntry keeps track of which variables are on and off the "path". Variables that you have entered are considered "on path" but those that have been skipped, even if there was a value in them before they were skipped, are considered "off path". As we have seen before, variables that are "off path" are considered blank (
notappl) in logic. It turns out that variables that are ahead of the current field are also considered "off path" until you pass through them. The idea is that these fields have not yet been validated by running their preproc and postproc with the current values of all preceding fields and therefore cannot be considered final. The effect of this is that the values of all fields ahead of the current field in the questionnaire are
notappl in logic.
Interestingly, as we can see from our current code, this only applies to numeric items. Our code works just fine for the NAME field.
You can see which fields are "on path" by looking at the background color of the field:
- Green: on path
- Dark Gray: skipped
- White: not yet been filled in
- Light Gray: protected
Note that the field coloring scheme is different in operator controlled mode.
Fortunately, we can get the value of fields that are "off path" using the function
visualvalue(). It returns whatever value is currently visible in the field whether or not it has been skipped or is ahead of the current field. Using this for relationship and sex in our function we get:
function pickFromHouseholdRoster()
numeric i;
do i = 1 while i <= totocc(HOUSEHOLD_MEMBERS_ROSTER);
householdMembersArray(i, 1) = strip(NAME(i));
householdMembersArray(i, 2) = getlabel(SEX, visualvalue(SEX(i)));
householdMembersArray(i, 3) = getlabel(RELATIONSHIP,
visualvalue(RELATIONSHIP(i)));
enddo;
householdMembersArray(i, 1) = ""; // Mark end
numeric picked = showarray(householdMembersArray,
title("Name", "Sex", "Relationship"));
pickFromHouseholdRoster = picked;
end;
All that is left now is to use the appropriate relationship for the sex of the household member:
Let's prefill the interview start time. We can make the field protected and set the value in the preproc just like we did with the
PERSON_NUMBER field. We can use the function
systime() which returns the current time as a number formatted according to the format specification passed in.
PROC INTERVIEW_START_HOURS
preproc
// Prefill interview start time with current time.
INTERVIEW_START_TIME = systime("HHMM");
This works the first time we visit the field but what happens when we come back to the question a minute or two later? We only want to record this value the first time the interviewer enters the field and once it is set we don't want it to change. We can do this by comparing the value of
INTERVIEW_START_TIME to blank (
notappl) in the preproc and only setting the value to systime if it is blank.
But this doesn't seem to work. Why? Is
INTERVIEW_START_HOURS on path when we are in the preproc of
INTERVIEW_START_HOURS? It is not. We have to get to the postproc for the variable to be on path. However, we can use
visualvalue() to get the value of the field in the preproc:
We can also fill in the interview end time automatically using
systime(). Unlike with the start time, we need to do this at the end of the interview, i.e. in the postproc of the questionnaire.
We should make this field protected. However, if we do that we get an error when we don't fill it in while in section A of the questionnaire. While we don't normally want to allow a field to be left blank, in this case we need to make an exception. We can do that by changing the validation method in the field properties to "Allow out of range without confirmation". With this setting, you are allowed to leave a field blank or to enter a value outside the value set with no error.
Having captured the interview start and end time as hours and minutes we can calculate the total interview time by subtracting the start time from the end time. However, this calculation is tricky because we have to handle the case where the end minutes are less than the start minutes. It would be easier if instead of using
systime() we used the function
timestamp() which gives the time in total seconds since January 1, 1970. If we subtract the start timestamp from the end timestamp we get the total interview time in seconds. Let's add new protected variables for start timestamp, end timestamp and interview duration and fill them in. Since these fields will be on the section A form but will not be filled until the end of the questionnaire, set the validation method to "Allow out of range without confirmation" as we did with INTERVIEW_END_TIME.
PROC INTERVIEW_START_TIMESTAMP
preproc
// Prefill interview start time with current time.
if visualvalue(INTERVIEW_START_TIMESTAMP) = notappl then
INTERVIEW_START_TIMESTAMP = timestamp();
endif;
PROC POPSTAN2020_QUEST
postproc
// Set interview end time and total time first time end of questionnaire is reached
if INTERVIEW_END_TIMESTAMP = notappl then
INTERVIEW_END_TIMESTAMP = timestamp();
INTERVIEW_DURATION_MINUTES = (INTERVIEW_END_TIMESTAMP - INTERVIEW_START_TIMESTAMP)/60;
endif;
If we don't need to display the start and end times on the form we can avoid using visualvalue. We can simply remove the variables from the form and keep them in the dictionary. When we set the values of these variables in logic and the case is saved the values are saved to the data file just like variables that are on the form. Unlike variables on the form, however, variables that are not on the form are always considered "on path" so you do not need to worry about using visualvalue.
- Modify the question text for E01 to say "Has any member of this household passed away in the past years, that is since (year)?" where year is replaced by the year of the interview minus 5. For example, if the year of the interview is 2018 it should read "Has any member of this household passed away in the past years, that is since 2013?". Use the the interview date from question A6 to compute the value to insert.
- Extend the goto() function to include the remaining sections of the questionnaire (D through G).
- Add a button to the userbar called "household summary" that when clicked uses the errmsg function to display a message that shows the name of the head of household, and the number of household members by sex. For example: "Head of Household: John Brown, Total Members: 5, Women: 2, Men: 3". Bonus if you can get this to work when you click the button from section A. Hint: count and seek will not work with visualvalue so you will need to use a loop to find the number of males and females.
- Implement a check on the minimum and maximum per unit possession values in question G01. Use the spreadsheet in annex 4 to create a lookup file containing the asset code, minimum value and maximum value. Use the loadcase command with this lookup file to find the minimum and maximum values for the selected asset and show an error message if the per unit asset value entered in the roster is below the minimum value or above the maximum value. This should be a soft check.
- Fill in the interview start and end date automatically using the sysdate() command and make the interview start and end date fields protected.
- Add question A10, interview status to the dictionary and form. We want to fill in this question at the start of the interview if the response is code 2 (non-contact), 3 (vacant) or 4 (refused) and then immediately end the questionnaire without going into any of the subsequent questions. However, if there is a respondent willing to give the interview, then the interview status should be set 5 (partially complete) until the entire questionnaire is complete at which point it should be set to 1 (complete). To implement this, keep the question in its current position in the questionnaire but use a dynamic value set to limit the options so that the interview cannot be set as completed until the entire questionnaire has been completed. The first time the field is entered (before any value has been entered) the value set should be: 2 Non-contact, 3 Vacant, 4 Refused, 5 Continue interview. If the interviewer chooses 2, 3 or 4, end the interview, otherwise, if they choose 5, continue to the next field. At the end of the interview (postproc of the level), if the value is currently 5, set it to 1 (complete). If the interviewer returns to A9 after the field has been set to 1 (complete) then display the value set as it is on the questionnaire.