Comma Delimited (CSV) Data Source

Overview

The Comma Delimited (CSV) data source writes data to a comma-separated values file; it cannot read data from one. When writing data, each value is separated by a comma. This data source only supports writing the values from one record, which by default is the first record defined in the dictionary. To write data from another record, specify the "record" property in the connection string.

Because CSV files can be read by Microsoft Excel and many other programs, this data source lets you share data with a large number of users in an open, text-based format.

This data source is similar to the other delimited text data sources, Semicolon Delimited and Tab Delimited. Many applications that read delimited text can also read Excel files, so the Excel data source may be an option in many scenarios.

The Comma Delimited data source is used when a file has the extension .csv.

Functionality

The Comma Delimited data source supports the following features:

Feature	Supported
Reading cases	✘
Writing cases	✔
Notes, case labels, and case statuses	✘
Storage of more than one kind of record	✘
Binary data items	✘
Deleting cases	✘
Undeleting cases	✘
Syncing data	✘
Cases with duplicate keys	✘
Case identification via UUID	✘
Contains an embedded dictionary	✘
Allows record sorts	✘

Special Character Handling

CSPro writes CSV files following the RFC 4180 standard. Because commas are used to separate values, if a value contains a comma, the value must be escaped before writing. The following characters are escaped:

Comma: Value wrapped in double quotes.
Newline: Value wrapped in double quotes.
Double quote: If the value is wrapped in double quotes because it contains a comma or newline, each double quote character in the value is written twice.

For example, the value She said, "Hello" contains both a comma and double quotes, so it would be written as:

"She said, ""Hello"""

Customizable Behavior

The following behavior can be customized by specifying properties in the connection string. The default behavior is marked with ⁺⁺⁺.

Property Name and Values	Description

"decimalMark"	Determines how the decimal mark is written for numeric items with decimals.
"comma"	Values are written with a comma (1,23).
"period" ⁺⁺⁺	Values are written with a period (1.23).

"encoding"	Determines the text encoding of the file.
"ANSI"	The contents are encoded as part of the Windows code page. On Android this value is ignored and "UTF-8-BOM" is used instead.
"UTF-8"	The contents are encoded as UTF-8 and written without a byte order mark (BOM).
"UTF-8-BOM" ⁺⁺⁺	The contents are encoded as UTF-8 and written with a three-byte BOM.

"header"	Determines if a header row is written and the value of the column heading.
"default" ⁺⁺⁺	The item's label is written unless writing both codes and labels, in which case the item's name is written for code columns and the label is written for label columns.
"suppress"	No header row is written.
"names"	The item's name is written.
"labels"	The item's label is written.

"mappedSpecialValues"	Determines how the special values missing and refused are written.
"codes" ⁺⁺⁺	The value of the mapped code is written. For example, if missing is mapped to -99, then -99 is written.
"suppress"	No value is written.

"record"	If the name of a record is provided, only items from that record are written.

"writeCodes"	Determines if the item's code is written.
true ⁺⁺⁺	The code is written.
false	The code is not written.

"writeLabels"	Determines if the item's label is written.
true	The label is written.
false ⁺⁺⁺	The label is not written.
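
For the "encoding" property in the table above, the difference between "UTF-8" and "UTF-8-BOM" is the three bytes at the start of the file. The following Python sketch encodes the same sample text both ways; Python's utf-8-sig codec stands in for "UTF-8-BOM" here, and the sample text is made up for illustration. It does not show how CSPro itself writes the file.

```python
import codecs

# Hypothetical file contents used only for this illustration.
text = "ID,NAME\n1,José\n"

without_bom = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")  # prepends the UTF-8 byte order mark

# The BOM is the three bytes EF BB BF at the start of the data.
assert with_bom[:3] == codecs.BOM_UTF8 == b"\xef\xbb\xbf"
assert with_bom[3:] == without_bom

# Decoding with "utf-8-sig" accepts data with or without the BOM.
assert with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig") == text
```

A program that reads the CSV file with a plain UTF-8 decoder may see the BOM as an extra character at the start of the first column heading, which is why a BOM-aware decoder is used in the last step.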

For example, the following connection string, specified in a batch PFF, would result in a CSV file containing the codes and labels of the HOUSING_REC record:

OutputData=housing.csv|writeLabels=true&record=HOUSING_REC
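
The connection string in this example consists of a file name, a vertical bar, and property=value pairs joined by ampersands. A minimal Python sketch of composing such a string follows, assuming that this layout holds for other properties as well. The helper build_connection_string is hypothetical and not part of CSPro, and it does not handle file names or values that themselves contain |, &, or =.

```python
def build_connection_string(filename: str, properties: dict[str, str]) -> str:
    """Join a file name and properties as filename|name=value&name=value."""
    if not properties:
        return filename
    pairs = "&".join(f"{name}={value}" for name, value in properties.items())
    return f"{filename}|{pairs}"


connection = build_connection_string(
    "housing.csv", {"writeLabels": "true", "record": "HOUSING_REC"}
)
print(f"OutputData={connection}")
# OutputData=housing.csv|writeLabels=true&record=HOUSING_REC
```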

See also: Data Sources