Dear Master
Is there any tool to change duplicate case ID numbers, so that afterwards I can open the concatenated file without the duplicate ID/index error message?
This is my situation:
I have merged many files with the Concatenate tool, using the option "concatenate regardless file structure." I did not use the dictionary to check for duplicates because that would take too much time, as I realized that the files I merged have many duplicate IDs.
Files to merge: file1.dat, file2.dat, file3.dat, file4.dat, file5.dat, file6.dat, file7.dat, file8.dat, file9.dat, file10.dat, .... file20.dat
Output: All_Concatenated.dat
Right now I use the Index File tool to check for duplicate IDs in All_Concatenated.dat, but I cannot change the duplicate IDs directly there.
I have to choose which case to keep, write down the other one(s) or leave them, and then change the ID in file1.dat or file2.dat ... which takes a long time.
So usually I take another way that is easier for me: I convert All_Concatenated.dat to SPSS and change the duplicate IDs in SPSS.
But after this step I cannot go back to CSPro data entry mode, because the file is now in SPSS format.
So is there perhaps a tool that can change duplicate ID numbers directly? Then my All_Concatenated.dat would be clean and would open without the duplicate ID/index error message.
Please advise, masters Josh, George.
Many Thanks
Yanina
Tools to Direct Change Duplicate ID Number
-
- Posts: 1805
- Joined: December 5th, 2011, 11:27 pm
- Location: Washington, DC
Re: Tools to Direct Change Duplicate ID Number
You can run CSPro batch applications on files that have duplicate IDs, so with logic you could convert the IDs, but you have to decide what rules you want to use to generate the new IDs.
See attached for an example of how to do this. I assume that an ID like 999999 is unused and then assign that to duplicate cases. To run this you'll have to use the 7.0 beta.
PROC GLOBAL

numeric newKey = 999999;
numeric caseCounter;

PROC MODIFYDUPLICATEKEY_FF

preproc

    // clear out anything in the temporary key storage file
    close(KEYSTORAGE_DICT);
    open(KEYSTORAGE_DICT,create);

PROC MODIFYDUPLICATEKEY_QUEST

    inc(caseCounter);

    CASE_KEY = ID;

    // has this key already been used?
    if loadcase(KEYSTORAGE_DICT,CASE_KEY) then
        errmsg("The key %d was already found at case %d; changing to %d",ID,CASE_NUMBER,newKey);
        ID = newKey;
        inc(newKey,-1);
    endif;

    // save the information about this key
    CASE_KEY = ID;
    CASE_NUMBER = caseCounter;
    writecase(KEYSTORAGE_DICT);
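For readers who want to see the renumbering rule outside CSPro, here is a minimal Python sketch of the same idea. It is not the attached batch application: the function name renumber_duplicates and its parameters are made up for illustration, and an in-memory set stands in for the key storage file. One small difference, which is an assumption on my part: the sketch skips replacement values that have already been seen, whereas the CSPro logic simply assumes that 999999 and below are unused.

```python
def renumber_duplicates(keys, first_new_key=999999):
    """Return a copy of keys in which every repeated key is replaced.

    The first occurrence of a key is kept. Each later occurrence gets the
    next free value counting down from first_new_key, similar to newKey
    in the CSPro logic.
    """
    seen = set()              # stands in for the temporary key storage file
    new_key = first_new_key
    result = []
    for key in keys:
        if key in seen:
            # Assumption not in the CSPro example: skip replacement
            # values that were already used as keys earlier in the file.
            while new_key in seen:
                new_key -= 1
            key = new_key
            new_key -= 1
        seen.add(key)
        result.append(key)
    return result


if __name__ == "__main__":
    print(renumber_duplicates([1, 2164, 2165, 1, 2164, 2165]))
```

The first case with a given key keeps it, and only the later duplicates are renumbered, so which case keeps its original ID depends on the order of the cases in the concatenated file.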
- Attachments
- modifyDuplicateKey.zip (3.11 KiB)
Re: Tools to Direct Change Duplicate ID Number
Dear Gregory Martin
Thanks. I am very happy that you can give us a solution like this.
I have tried your logic, but I get an error with my concatenated file.
When I open the output .dat file (produced by your batch logic) with the .pen file, I get the error: too many occurrences for record ...
The CSIndex tool found duplicate cases in file1_file2.dat (the concatenated file) on case ID 1.
Would you please look at the attached myproj.zip?
Please advise. Thank you
Yanina
Process Messages
*** Case [21 ] has 1 messages (0 E / 0 W / 1U)
U -25 The key DEFAULT was already found at case 1; changing to 9
*** Case [31 ] has 1 messages (0 E / 0 W / 1U)
U -25 The key DEFAULT was already found at case 1; changing to 8
*** Case [11 ] has 1 messages (0 E / 0 W / 1U)
U -25 The key DEFAULT was already found at case 1; changing to 7
User unnumbered messages:
Line Freq Pct. Messa
---- ---- ---- -----
25 3 - The k
CSPRO Executor Normal End
- Attachments
- myproj.zip (9.68 KiB)
Re: Tools to Direct Change Duplicate ID Number
In addition to what you posted, can you post the batch application that you used to change the IDs? I would like to see the data file with the changed IDs.
The dictionary in what you posted, myproj.dcf, has an ID of length 1, but the listing messages that you posted have an ID of length 3, so something is not consistent.
Re: Tools to Direct Change Duplicate ID Number
Dear Gregory Martin
Thank you so much for your kind help.
The batch ran well and very smoothly on most of my applications (including myproj.zip above). Yes, the ID length has to be consistent with the batch.
Cases with a duplicate Case_ID were separated and renumbered correctly.
But I have attached data and an application on which your batch fails to run.
Would you please look at my attached application and the DATA*.DATA files inside it (mypro ...)? I don't know why it fails to run.
I am using your batch from modifyDuplicateKey.zip, which is included in the attachment.
I get the error INVALID RECORD TYPE on almost all cases, on any multiple-occurrence field.
///
*** Case [ ..] has 2 messages (1 E / 1 W / 0U)
W 10007 Invalid Record Type: ' '
and so on ..
============
DATA1.DAT : 3296 cases
DATA2.DAT : 3 cases, but all of them duplicate cases in DATA1.DAT
DATA1_DATA2_CONCATENATE.DAT : content of the concatenated file
=============
So I can't get the correct data output.
Your help is very much appreciated.
Yanin
- Attachments
- MYPROJ2.zip (183.6 KiB)
Re: Tools to Direct Change Duplicate ID Number
The problem is that your dictionary, PROJ_01.dcf, doesn't have a record type. The ID, RESP_ID, goes from position 1-6 in the file.
In what I sent you, the record had a record type and so the ID went from position 2-7 in the file. If you change that, you can run your code and you get this output:
*** Case [ 1] has 1 messages (0 E / 0 W / 1U)
U -25 The key 1 was already found at case 3083; changing to 999999
*** Case [ 2164] has 1 messages (0 E / 0 W / 1U)
U -25 The key 2164 was already found at case 1; changing to 999998
*** Case [ 2165] has 1 messages (0 E / 0 W / 1U)
U -25 The key 2165 was already found at case 2; changing to 999997
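To illustrate why the missing record type matters, here is a small Python sketch of reading a fixed-width ID field at the two positions mentioned above. The sample lines and the helper name extract_field are hypothetical and are not taken from the attached data; positions are 1-based, as in the dictionary.

```python
def extract_field(line, start, length):
    """Return a fixed-width field; start is 1-based like a dictionary position."""
    begin = start - 1
    return line[begin:begin + length]


# Hypothetical sample lines, not from the attached files.
line_with_record_type = "1     1ABC"   # record type in column 1, ID in columns 2-7
line_without_record_type = "     1ABC"  # no record type, ID in columns 1-6

if __name__ == "__main__":
    # Each layout read with its own positions gives the same ID.
    print(repr(extract_field(line_with_record_type, 2, 6)))
    print(repr(extract_field(line_without_record_type, 1, 6)))
    # Reading the second layout with the first layout's positions
    # shifts everything by one column.
    print(repr(extract_field(line_without_record_type, 1, 1)))
    print(repr(extract_field(line_without_record_type, 2, 6)))
```

In this made-up example, reading the file without a record type using the 2-7 layout yields a blank in the record type column and an ID that is shifted by one character, which would be consistent with the "Invalid Record Type: ' '" messages reported earlier in the thread.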
Re: Tools to Direct Change Duplicate ID Number
Dear Gregory
Excellent. Yes, it worked. Thank you so much.
For members who have the same issue, here are the steps that solved my problem:
1. Change the order of the field positions in the dictionary PROJ_01.dcf,
2. Save it as a new dictionary,
3. Reformat the concatenated data to the new dictionary.
4. Then run Gregory Martin's code above.
5. Bingo
See screenshot.
Yanin
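As a rough cross-check after steps like these, duplicate IDs in a fixed-width text file can also be counted outside CSPro. The Python sketch below is only an illustration under a strong assumption: each case occupies exactly one line, which is not true for files whose cases have several records. The function name find_duplicate_ids and the sample lines are hypothetical.

```python
from collections import Counter


def find_duplicate_ids(lines, start, length):
    """Return {id_text: count} for IDs that occur on more than one line.

    Assumes one line per case; start is the 1-based position of the ID.
    """
    begin = start - 1
    counts = Counter(
        line[begin:begin + length]
        for line in lines
        if line.strip()          # ignore empty lines
    )
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    sample = ["     1A", "  2164B", "     1C"]  # made-up data, ID in columns 1-6
    print(find_duplicate_ids(sample, start=1, length=6))
```

An empty result means no ID was repeated under the one-line-per-case assumption; the CSPro index tool remains the reference check for real data files.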
- Attachments
- dict_changes.PNG (124.84 KiB)