Reformat Tool hangs at 31% for 13GB dataset

slwendo
Posts: 6
Joined: March 11th, 2017, 4:13 pm

Reformat Tool hangs at 31% for 13GB dataset

Post by slwendo »

While reformatting a 13GB dataset (reducing it to fewer variables so that I can run tables faster), the Reformat tool hangs at 31%. I left the machine running overnight, and by the next morning the tool had stopped, but the output file was orders of magnitude larger than the input file. In fact, the tool stopped only because the machine had run out of disk space to hold the output. Has anyone experienced something similar, or has anyone successfully reformatted input files larger than 10GB?

I am using CSPro 6.3 on a 64-bit machine running Windows 10.
josh
Posts: 2399
Joined: May 5th, 2014, 12:49 pm
Location: Washington DC

Re: Reformat Tool hangs at 31% for 13GB dataset

Post by josh »

Try breaking your 13GB file into smaller files (<2 GB each) and reformatting them separately. See this post for info on how to break up a data file into smaller ones: http://www.csprousers.org/forum/viewtopic.php?f=8&t=542. Then run Reformat on each of the smaller files. If that works, you can concatenate the reformatted files to get the data file for tabulation (you may also want to look at "run in parts" in the tabulation chapter of the CSPro help to see how to generate tables from multiple input files). If that doesn't work, please let us know and send us a copy of your dictionary.
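For readers who would rather script the split-and-rejoin step outside CSPro, here is a minimal Python sketch. It is not the method from the linked post. It assumes the data file is a plain text file with one record per line, that all records of a case are stored contiguously, and that the case ID sits in the same fixed columns on every line. The column positions (`ID_START`, `ID_LEN`) and all function names are hypothetical; take the real positions from your dictionary.

```python
import shutil
from typing import List, Optional

# Hypothetical layout (an assumption, not taken from any real dictionary):
# the case ID occupies ID_LEN bytes starting at 0-based column ID_START.
ID_START = 1
ID_LEN = 9


def case_id(line: bytes, start: int = ID_START, length: int = ID_LEN) -> bytes:
    """Return the case ID bytes of one record line."""
    return line[start:start + length]


def split_by_case(src: str, out_prefix: str, max_bytes: int,
                  start: int = ID_START, length: int = ID_LEN) -> List[str]:
    """Split src into numbered part files, breaking only between cases.

    A part is closed at the first case boundary after it has reached
    max_bytes, so a part can exceed max_bytes by at most one case.
    """
    parts: List[str] = []
    out = None
    written = 0
    prev_id: Optional[bytes] = None
    with open(src, "rb") as fin:
        for line in fin:  # streams line by line; never loads the whole file
            cur_id = case_id(line, start, length)
            new_case = cur_id != prev_id
            if out is None or (new_case and written >= max_bytes):
                if out is not None:
                    out.close()
                path = f"{out_prefix}{len(parts) + 1:03d}.dat"
                parts.append(path)
                out = open(path, "wb")
                written = 0
            out.write(line)
            written += len(line)
            prev_id = cur_id
    if out is not None:
        out.close()
    return parts


def concatenate(paths: List[str], dest: str) -> None:
    """Join files in order; assumes every file ends with a newline."""
    with open(dest, "wb") as fout:
        for p in paths:
            with open(p, "rb") as fin:
                shutil.copyfileobj(fin, fout)
```

A call such as `split_by_case("census.dat", "census_part", 2 * 1024**3 - 64 * 1024**2)` (file names hypothetical) would produce parts that stay a little under 2 GB as long as no single case is larger than the 64 MB margin. After running Reformat on each part, `concatenate` joins the reformatted outputs in the same order. Splitting on byte counts alone, without checking the case ID, could cut a case in half, which is why the sketch only breaks where the ID changes.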
Gregory Martin
Posts: 1777
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Re: Reformat Tool hangs at 31% for 13GB dataset

Post by Gregory Martin »

If you're just trying to remove some variables from the data file, you can also use the Export Data tool. Select CSPro format as your output format, and then select only the variables that you're interested in.

I can't guarantee that it will work on a 13GB file, but it has a better chance of working than the Reformat Data tool.