Index File

Discussions about CSEntry
Post Reply
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Index File

Post by khurshid.arshad »

I am using an external CSDB file containing 40,000 cases, with each case consisting of a single line of information.

I am using a forcase statement to select cases based on a where clause and then writing the selected data to another CSDB file. The process works but is time-consuming. However, when I implement an index, the file is not written. It creates blank csdb file.

Could you advise on why the use of an index might prevent the file from being written?

setfile (External_Dictionary, "..\file.csdb");Contains 40000 cases
setfile (External_Dictionary, "..\file.csidx",create);

setfile (Another_Ext_Dictionary, "..\writefile.csdb");for selected sample

forcase External_Dictionary, where Supervisor_Code= Suplogin_code do

HH_ID=SW_SEL_HH_ID;
PRO_CODE=SW_PRO_CODE;
HH=SW_SELECTED_HH;
HHH_NAME=SW_HHH_NAME;

writecase(Another_Ext_Dictionary);
endfor;
close(ASSIGN_HOUSEHOLD);
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Re: Index File

Post by khurshid.arshad »

Dear Team;

Any update?

a.
justinlakier
Posts: 227
Joined: November 21st, 2022, 4:41 pm

Re: Index File

Post by justinlakier »

Hello,

The example you gave was of the working code with the forcase, so it's not clear why the version implemented with an index isn't working. However, looking at the use of "setfile" in the snippet you provided raises some questions, as you may be looping over the wrong External_Dictionary if you are attempting to set multiple data files to that dictionary. You can test by putting an output statement in the loops to confirm when you are properly looping through the data you want in both the forcase and index versions of the loop. If this does not help, please provide here or at cspro@lists.census.gov your zipped code including the forcase and index versions of the loop. Let us know if you have any additional questions.

Hope this helps,
Justin
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Re: Index File

Post by khurshid.arshad »

Thank you. Please check the email.
a.
justinlakier
Posts: 227
Joined: November 21st, 2022, 4:41 pm

Re: Index File

Post by justinlakier »

Hello,

Having tested it, I can now see why the attempt to use a .csidx index file is causing a problem. After I commented out the line

Code: Select all

setfile (External_Dictionary, "..\file.csidx",create);
it worked fine. In order to loop over the file.csdb with all of your cases, you need to use

Code: Select all

setfile (External_Dictionary, "..\file.csdb");
. Calling another setfile to External_DIctionary immediately afterwards is overriding the .csdb you need to loop over with the .csidx that has no cases and cannot be looped over.

See more on index files here for why this .csidx file has no cases and cannot be looped over. You can see that a .csidx file is only made for text/JSON data sources, not for csdb data sources. You can check this by going to your folder and seeing that there are .csidx files for your text files, but none are created for the .csdb files. Let us know if you have any additional questions.

Hope this helps,
Justin
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Re: Index File

Post by khurshid.arshad »

Dear Justinlakier

Can you explain how I can create a JSON file for external data?

Thank you.
a.
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Re: Index File

Post by khurshid.arshad »

Hello,

Now, I am able to convert the CSDB file into JSON format BY following these steps:
Open the CSDB file in CSPro Data Viewer window->File->Export->Select JSON as the export format.

The converts the CSDB data into JSON format. However, the time taken to view the sample remains the same.

setfile(SAMPLED_HH_DICT, "..\106_Ext_Data\MyFile.json");//Household[s] Sample file
setfile(ASSIGN_HOUSEHOLD, concat("..\106_Ext_Data\PRN_", edit("9999", PSU_ID), ".csdb"));//Selected Household sample file





forcase SAMPLED_HH_DICT where SW_PRO_CODE= PSU_ID and SW_S_CODE= tonumber(loadsetting("SupervisorID")) do

endfor;
Gregory Martin
Posts: 1851
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Re: Index File

Post by Gregory Martin »

All CSPro data files use an index. Some indices, like for CSPro DB, are internal to the file itself, and others (like for text), are external, using that .csidex format. However, the index is only on the case key, and because you are querying data that is not part of the key, the index won't help much.

Because data is broken out into tables in CSPro DB files in its SQLite format, you can query data by creating an appropriate query. For example, your code uses this to get a list of SW_SEL_HH_ID values to work with:
forcase SAMPLED_HH_DICT where SW_PRO_CODE = PSU_ID AND SW_S_CODE = tonumber(loadsetting("SupervisorID")) do
I can write this using a SQLite query:
string query = maketext("SELECT `level-1`.`sw_sel_hh_id` FROM `cases` "
                        "JOIN `level-1` ON `level-1`.`case-id` = `cases`.`id` "
                        "JOIN `sampled_hh_rec` ON `sampled_hh_rec`.`level-1-id` = `level-1`.`level-1-id` "                         
                        "WHERE `cases`.`deleted` = 0 AND `sampled_hh_rec`.`sw_pro_code` = %d AND `sampled_hh_rec`.`sw_s_code` = %d;"
,
                        PSU_ID, tonumber(loadsetting("SupervisorID")));

List string selected_hh_ids;
sqlquery(SAMPLED_HH_DICT, selected_hh_ids, query);
Now you can process the appropriate values:
do numeric ctr = 1 while ctr <= selected_hh_ids.length()
    SW_SEL_HH_ID = tonumber(selected_hh_ids(ctr));
    errmsg("Matched: %v", SW_SEL_HH_ID);
enddo;
On my machine, the forcase loop takes 0.955 seconds, whereas sqlquery takes 0.022 seconds, so a massive time improvement.
khurshid.arshad
Posts: 608
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Re: Index File

Post by khurshid.arshad »

Dear Gregory;

I hope you and team are doing well.

It is working perfectly. Thank you.
a.
Post Reply