The selcase (“select case”) function displays a list of cases in an external dictionary, letting an interviewer select a case to load. One feature not mentioned in the function’s help documentation is the ability to select multiple cases. With the multiple keyword, the interviewer can select more than one case, and the program can then iterate over the selected cases using a for loop.
A second undocumented feature automatically marks all qualifying cases: with the automark keyword, the selcase dialog is not shown to the interviewer. The following two code samples are equivalent:
// create a dynamic value set showing the EAs in Region 1 / District 1
valueset ea_vs;

// code demo 1 --- using a forcase loop
forcase EA_DICT where REGION = 1 and DISTRICT = 1 do
    ea_vs.add(EA_NAME, EA);
endfor;

// code demo 2 --- automatically marking cases and then using a for loop
selcase(EA_DICT, "") where REGION = 1 and DISTRICT = 1 multiple(automark);

for EA_DICT do
    ea_vs.add(EA_NAME, EA);
endfor;
Because selcase allows you to pass a key to match (“” in the example above, which matches all case keys), you can use this as a trick to create value sets efficiently when you know the leading part of the key. For example, if you have a data file with 50,000 cases, forcase will always loop through all 50,000 cases, whereas providing a key match may limit your loop to substantially fewer cases.
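As a minimal sketch of a partial key match, assume (hypothetically, since the key layout of EA_DICT is not given here) that its case key begins with a one-digit REGION followed by a two-digit, zero-filled DISTRICT. Under that assumption, the key match "101" should restrict selcase to the cases in Region 1 / District 1 without needing a where clause:

```
// assumption: the EA_DICT case key starts with REGION (length 1)
// followed by DISTRICT (length 2, zero-filled), so "101" means
// Region 1 / District 1
valueset ea_vs;

// mark every case whose key starts with "101" without showing the dialog
selcase(EA_DICT, "101") multiple(automark);

// loop over only the marked cases
for EA_DICT do
    ea_vs.add(EA_NAME, EA);
endfor;
```

If the key layout differs, the key match string has to be built to match the leading ID items as they are stored in the data file.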
To show a possible use for this trick, we will look at two ways of creating hierarchical value sets for geocodes. Supposing we have three levels of geography—Region, District, and EA—one way to structure a geocode lookup file is as follows:
| Region | District | EA | Geocode Name |
|--------|----------|----|--------------|
| 1 | | | Region 1 |
| 1 | 2 | | District 2 in Region 1 |
| 1 | 2 | 5 | EA 5 in Region 1 / District 2 |
That is, when defining regions, the district and EA codes are left blank, and when defining districts, the EA code is left blank.
Using a forcase loop to populate the districts based on a selected region, we would loop over the entire data file, filtering on cases where the geocode region matches the selected region, where the geocode district is defined, but where the geocode EA is blank:
PROC DISTRICT

preproc

    valueset geography_vs;

    forcase GEOCODES_DICT where GEOCODE_REGION = REGION and
                                GEOCODE_DISTRICT <> notappl and
                                GEOCODE_EA = notappl do
        geography_vs.add(GEOCODE_NAME, GEOCODE_DISTRICT);
    endfor;

    setvalueset(DISTRICT, geography_vs);
Before this, to generate the region value set, we would look for cases where the geocode district is blank; afterwards, to generate the EA value set, we would look for cases where the region and district match the selected codes and where the EA is defined. Generating the hierarchical value sets for the three levels of geography therefore requires three fairly different loops.
With the selcase automark trick, we can create a single function that can be used to generate the value set for each level of geography:
PROC GLOBAL

valueset geography_vs;

function CreateGeographyVS(string already_selected_key, numeric geocode_length)

    // clear the dynamic value set
    geography_vs.clear();

    // automatically select all cases that match the key passed in as a parameter
    selcase(GEOCODES_DICT, already_selected_key) multiple(automark);

    // this geocode starts at the position after the already selected key
    numeric new_key_offset = length(already_selected_key) + 1;

    // loop over the cases that match the already selected key
    for GEOCODES_DICT do

        // extract the remaining geocodes
        string new_key_portion = strip(key(GEOCODES_DICT)[new_key_offset]);

        // when the remaining geocodes match the geocode length we are expecting,
        // this is a match so add it to the value set
        if length(new_key_portion) = geocode_length then
            geography_vs.add(GEOCODE_NAME, tonumber(new_key_portion));
        endif;

    endfor;

    // make sure that there was at least one geocode for this hierarchical level
    if geography_vs.length() = 0 then
        errmsg("Geocode lookup error ... the geocode database is not complete.");
        stop(1);
    endif;

end;
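To make the length check concrete, here is a hand-worked trace of one call, written as CSPro comments. It assumes the key layout used later in this example (region of length 1, district and EA of length 2) and, as an additional assumption, that the geocodes are zero-filled in the lookup file, so the three sample rows from the table have the keys shown:

```
// call: CreateGeographyVS("1", 2)   -->   new_key_offset = 2
//
// key       key from offset 2   after strip   length   added to value set?
// "1    "   "    "              ""            0        no  (the region itself)
// "102  "   "02  "              "02"          2        yes (District 2 in Region 1)
// "10205"   "0205"              "0205"        4        no  (an EA, one level too deep)
```

Only the cases exactly one level below the selected key leave a remainder of the expected length, which is why a single function can serve every level.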
We call this function from each procedure, specifying the currently selected geocode and the geocode length at each level. In this example, we assume that the region is length 1 and that the other two geocodes are length 2:
PROC REGION

preproc

    CreateGeographyVS("", 1);
    setvalueset(REGION, geography_vs);

PROC DISTRICT

preproc

    CreateGeographyVS(maketext("%v", REGION), 2);
    setvalueset(DISTRICT, geography_vs);

PROC EA

preproc

    CreateGeographyVS(maketext("%v%v", REGION, DISTRICT), 2);
    setvalueset(EA, geography_vs);
Now we have a generalizable function that we can use in our censuses or surveys, a function that will work with any number of levels of geography.
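As an illustration of adding another level, suppose a fourth geocode were added below the EA. The item name VILLAGE and its length of 3 are hypothetical and not part of the example above, and the geocode lookup file would need a matching fourth ID item. Under those assumptions, the additional procedure might look like this:

```
// hypothetical fourth level: VILLAGE, assumed to be length 3
PROC VILLAGE

preproc

    // the already selected key is now region + district + EA
    CreateGeographyVS(maketext("%v%v%v", REGION, DISTRICT, EA), 3);
    setvalueset(VILLAGE, geography_vs);
```

CreateGeographyVS itself would not need to change; only the key passed in and the expected geocode length differ from level to level.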