Comma Separated Values (CSV) Standard File Format

The CSV ("Comma Separated Values") file format is often used to exchange data between dissimilar applications. The CSV file format is usable by the KSpread, OpenOffice Calc and Microsoft Excel spreadsheet applications. Many other applications support CSV in some fashion, to import or export data. For some uses CSV has been displaced by XML-based data exchange (e.g. ODF, SOAP), JSON and gRPC.

The CSV Format

Each record is one line
The line separator may be LF (0x0A) or CRLF (0x0D 0x0A). A line separator may also be embedded in the data of a quoted field, which makes the record span more than one line but is still acceptable.
Fields are separated with commas.
Duh. However, it's not uncommon to see the comma (, [0x2C]) replaced with a tab ([0x09]), semicolon (; [0x3B]) or pipe (| [0x7C]).
Quote Wrapping
When a field contains exotic characters, such as a comma, quote or line break (or anything really), it must be wrapped with double-quotes (" [0x22]).
Equal Prefix
In some cases a quote-wrapped field may be prefixed with an equal symbol (= [0x3D]) to provide an even stronger hint to the receiving software to interpret the field value literally.
Leading and trailing whitespace is ignored
Unless the field is wrapped with double-quotes (" [0x22]), in which case the whitespace is preserved.
Embedded commas
The field must be wrapped with double-quotes.
Embedded double-quotes
Embedded double-quote characters must be doubled, and the field must be delimited with double-quotes.
Embedded line-breaks
Fields must be wrapped by double-quotes.
Always Wrapped
Fields may always be wrapped with double-quotes. The reading application should parse the wrapping quotes and discard them.
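
The quoting rules above can be sketched as a round trip through Python's standard csv module. This is only an illustration under stated assumptions: the sample row is made up, and the csv module is one reader/writer among many, with defaults that do not match every rule listed here (for example, it does not trim whitespace around unquoted fields unless asked to).

```python
import csv
import io

# Hypothetical sample row: an embedded comma, embedded double-quotes,
# an embedded line break, and one plain field.
row = ["Edoceo, Inc.", 'say "hi"', "line1\nline2", "plain"]

# Write the row. The default dialect wraps only the fields that need it
# and doubles any embedded double-quote characters.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(row)
text = buf.getvalue()

# Read it back. The quoted line break stays inside the third field,
# so the two physical lines are returned as a single record.
rows = list(csv.reader(io.StringIO(text, newline="")))

print(repr(text))
print(rows)
```

The writer terminates the record with CRLF by default, and the reader accepts either LF or CRLF. A different separator can be selected by passing delimiter=";" (or a tab or pipe) to both csv.writer and csv.reader.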

CSV Files and Leading Zeros on Numeric Fields

Sometimes leading zeros are required in a data set, and while the leading zeros are present in the data they are not displayed by the reading application. In some software it's possible to force strict interpretation of the CSV field value with a leading = (equal) symbol.

Some software may chop the leading zeros from data like this, even when the field is quoted.

0306703,0035866,NO_ACTION,06/19/2006
0086003,"0005866",UPDATED,06/19/2006

This incantation may convince that software to keep the leading zero.

="0306703",="0035866",NO_ACTION,06/19/2006
="0086003",="0005866",UPDATED,06/19/2006
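
A minimal sketch of producing and undoing the equal prefix in Python. The helper names to_literal and from_literal are hypothetical, invented for this illustration, and the line is joined by hand so that the output has exactly the ="..." shape shown above. It assumes the other fields need no quoting.

```python
def to_literal(value: str) -> str:
    """Wrap a value as ="value", doubling any embedded double-quotes."""
    return '="' + value.replace('"', '""') + '"'


def from_literal(field: str) -> str:
    """Undo to_literal; fields without the =" prefix are returned unchanged."""
    if len(field) >= 3 and field.startswith('="') and field.endswith('"'):
        return field[2:-1].replace('""', '"')
    return field


record = ["0306703", "0035866", "NO_ACTION", "06/19/2006"]

# Only the two numeric-looking fields get the prefix.
line = ",".join([to_literal(record[0]), to_literal(record[1]), record[2], record[3]])
print(line)

# A reader that does not understand the prefix can strip it afterwards.
print([from_literal(f) for f in line.split(",")])
```

The plain split(",") used for reading only works here because none of the values contain a comma. Whether the prefix is honoured at all depends on the receiving software.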

Acceptable CSV MIME Types

RFC 4180 registers text/csv as the MIME type for CSV, but sadly it is not applied consistently in practice. Here is a collection of types we've seen in use.

CSV Examples

Here are some examples that demonstrate the rules above. Each sample describes the data and how the reading application should interpret it.

Standard Line

This shows three fields, each with simple data.

Edoceo, Seattle, WA

Whitespace

The first field should be interpreted by reading applications as [space]Edoceo[comma][space]Inc.[space]. The preserved whitespace could also include line breaks.

" Edoceo, Inc. ",Seattle,WA

Embedded Commas

The first field should be interpreted by reading applications as Edoceo[comma][space]Inc.

"Edoceo, Inc.",Seattle,WA
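
As a rough check, the three sample lines above can be fed to Python's standard csv module. This is a sketch, not a reference parser: the skipinitialspace=True option is used here so that the space after each comma in the first sample is dropped, which is the closest the module comes to the whitespace rule described earlier.

```python
import csv

samples = [
    'Edoceo, Seattle, WA',           # Standard Line
    '" Edoceo, Inc. ",Seattle,WA',   # Whitespace
    '"Edoceo, Inc.",Seattle,WA',     # Embedded Commas
]

# csv.reader accepts any iterable of lines; each sample is one record.
parsed = [next(csv.reader([line], skipinitialspace=True)) for line in samples]

for fields in parsed:
    print(fields)
```

Each sample yields three fields, and the quoted field in the second sample keeps its leading and trailing space. Note that the csv module does not trim trailing whitespace from unquoted fields, so a reader that wants that behaviour has to handle it separately.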
