Export large DAT files

Post by k.t.nguyen@cgiar.org » July 24th, 2018, 4:56 am

Hi all,

I currently have many large DAT files that I want to export to more commonly used formats like CSV or tab-delimited. My issue is that the exported files become very large (2-3 GB) and are hard to process within the capacity of Excel or R. I would eventually like to link each record to a region and make some kind of map. What is your experience with this issue? Are there other ways that you would recommend using these DAT files?

Thanks so much and I look forward to your advice!
Kien

Re: Export large DAT files

Post by josh » July 24th, 2018, 6:36 am

When dealing with very large data sets we often break them up into smaller files and process one piece at a time. For example, we might create one file per region or province of the country. When using CSPro for tabulation this is a common workflow, as CSPro can process each of the smaller files and then aggregate the tabulated results together at the end. Depending on how you are processing the data in R or Excel, you might be able to do something similar. To split your data you can use a batch application and, in your logic, call the setoutput function (http://www.csprousers.org/help/CSPro/se ... ction.html) to set a different output file for each region.
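
If the data has already been exported, the same per-region split can also be done on the CSV itself. Below is a rough sketch in Python (not a CSPro feature, just one possible approach) that streams the export row by row so the whole 2-3 GB file never has to fit in memory. The function name, the REGION column name and the output file naming are assumptions for illustration; use whatever your dictionary calls the region item.

```python
import csv
import os


def split_by_region(src_path, out_dir, region_column="REGION"):
    """Stream a large CSV export and write one smaller CSV per region.

    region_column is a hypothetical column name; replace it with the
    name of the region item in your own export.
    Returns a dict mapping each region value to its row count.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}   # region value -> open output file
    writers = {}   # region value -> csv.DictWriter for that file
    counts = {}    # region value -> number of rows written
    try:
        with open(src_path, newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            for row in reader:
                region = row[region_column]
                if region not in writers:
                    # First row seen for this region: open its file and
                    # write the same header as the source export.
                    # Assumes region codes are safe to use in file names.
                    out_path = os.path.join(out_dir, f"region_{region}.csv")
                    fh = open(out_path, "w", newline="", encoding="utf-8")
                    handles[region] = fh
                    writers[region] = csv.DictWriter(fh, fieldnames=reader.fieldnames)
                    writers[region].writeheader()
                    counts[region] = 0
                writers[region].writerow(row)
                counts[region] += 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts
```

Calling split_by_region("export.csv", "by_region") would leave one file per region in the by_region folder, each small enough to open on its own. For a tab-delimited export, pass delimiter="\t" to both csv.DictReader and csv.DictWriter.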

Another way to reduce the file size is to drop the variables that you don't need for your maps. You can do this by making a copy of your dictionary, removing the variables you don't want, and then running the Reformat Data tool with the old and new dictionaries to create a new data file without those variables.
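
If you would rather trim an export that already exists instead of reformatting the DAT file, dropping columns can be done in a streaming fashion as well. This is only a sketch in Python, assuming a CSV export with a header row; the function name and the column names in the usage note are made up for illustration.

```python
import csv


def keep_columns(src_path, dst_path, keep):
    """Copy a CSV export, keeping only the columns listed in keep.

    Rows are streamed one at a time, so memory use stays small even
    for multi-gigabyte files. Returns the number of data rows copied.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        available = reader.fieldnames or []
        missing = [name for name in keep if name not in available]
        if missing:
            raise KeyError(f"columns not found in export: {missing}")
        # extrasaction="ignore" silently drops every column not in keep.
        writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
        writer.writeheader()
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

For example, keep_columns("export.csv", "export_small.csv", ["REGION", "HHID"]) would write a copy containing only those two (hypothetical) columns.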

