This package is a parser that converts CSV text input into arrays or objects. It
implements the Node.js stream.Transform API and also
provides a simple callback-based API for convenience. It is both easy
to use and powerful. It was first released in 2010 and is used against big data
sets by a large community.
Source code for this project is available on GitHub.
Features
- Follow the Node.js streaming API
- Simplicity with the optional callback API
- Support delimiters, quotes, escape characters and comments
- Line break discovery
- Support big datasets
- Complete test coverage and samples for inspiration
- No external dependencies
- To be used conjointly with csv-generate, stream-transform and csv-stringify
Usage
Run npm install csv to install the full CSV package or run
npm install csv-parse if you are only interested in the CSV parser.
Use the callback-style API for simplicity or the stream-based API for scalability. You may also mix the two styles. For example, the fs_read.js example pipes a file stream reader into the parser and gets the results inside a callback.
There is also a synchronous API if you need it.
For additional usage and examples, you may refer to the examples page, the "samples" folder and the "test" folder.
Callback API
signature: parse(data, [options], callback)
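As a minimal sketch of the callback style, the helper below passes a CSV string and an options object to parse and receives all records at once in the callback. It assumes the version documented here, where require('csv-parse') returns the parse function. The helper name parseWithHeader and its optional third parameter (a way to substitute another parser implementation) are illustrative and not part of the package.

```javascript
// Hypothetical helper: parse CSV text whose first line holds the column names.
// `parseImpl` defaults to the real parser; the default expression is only
// evaluated when the argument is omitted, so csv-parse must be installed then.
function parseWithHeader(text, callback, parseImpl = require('csv-parse')) {
  // Signature from this document: parse(data, [options], callback)
  parseImpl(text, { columns: true, skip_empty_lines: true }, (err, records) => {
    if (err) return callback(err); // report the parsing error to the caller
    callback(null, records);       // all records are delivered in one call
  });
}
```

With the package installed, calling parseWithHeader('id,name\n1,Ada\n', (err, records) => console.log(records)) should produce one object per data line, keyed by the names found on the first line, because the columns option is set to true.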
Node.js Stream API
signature: parse([options], [callback])
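The stream style is usually consumed with the "readable" event together with the read function. The sketch below shows that pattern using only the Node.js standard library. The parser in it is a stand-in Transform stream that splits lines on ":", not csv-parse itself, and the names collectRecords and makeStandInParser are hypothetical.

```javascript
const { Transform } = require('stream');

// Drain any object-mode parser stream with the "readable" event and read().
function collectRecords(parser, callback) {
  const records = [];
  parser.on('readable', () => {
    let record;
    // read() returns null once the internal buffer is empty.
    while ((record = parser.read()) !== null) {
      records.push(record);
    }
  });
  parser.on('error', callback);
  parser.on('end', () => callback(null, records));
}

// Stand-in parser (NOT csv-parse): emits one array of fields per line so the
// pattern can run without the package installed.
function makeStandInParser() {
  let pending = '';
  return new Transform({
    readableObjectMode: true,
    transform(chunk, encoding, done) {
      const lines = (pending + chunk).split('\n');
      pending = lines.pop(); // keep the unfinished last line for the next chunk
      for (const line of lines) this.push(line.split(':'));
      done();
    },
    flush(done) {
      if (pending !== '') this.push(pending.split(':'));
      done();
    }
  });
}

const parser = makeStandInParser();
collectRecords(parser, (err, records) => {
  if (err) throw err;
  console.log(records); // two records, each an array of three fields
});
parser.write('root:x:0\n');
parser.write('someone:x:1022\n');
parser.end();
```

With the package installed, the stand-in would be replaced by the stream that parse returns when called with options only, for example parse({delimiter: ':'}). The collectRecords helper stays the same.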
Synchronous API
Using this API involves requiring the 'csv-parse/lib/sync' module.
signature: records = parse(text, [options])
Parser options
- auto_parse (boolean)
  If true, the parser will attempt to convert input strings to native types.
- auto_parse_date (boolean)
  If true, the parser will attempt to convert input strings to dates. It requires the "auto_parse" option. Be careful, it relies on Date.parse.
- columns (array|boolean|function)
  List of fields as an array, a user-defined callback accepting the first line and returning the column names, or true if autodiscovered in the first CSV line. Defaults to null. Affects the result data set in the sense that records will be objects instead of arrays. A "false" value skips the whole column.
- comment (char)
  Treat all the characters after this one as a comment. Defaults to '' (disabled).
- delimiter (char)
  Set the field delimiter. One character only. Defaults to "," (comma).
- escape (char)
  Set the escape character. One character only. Defaults to double quote.
- from (number)
  Start returning records from a particular line.
- ltrim (boolean)
  If true, ignore whitespace immediately following the delimiter (i.e. left-trim all fields). Defaults to false. Does not remove whitespace in a quoted field.
- max_limit_on_data_read (int)
  Maximum number of characters to be contained in the field and line buffers before an exception is raised. Used to guard against a wrong delimiter or rowDelimiter. Defaults to 128,000 characters.
- objname (string)
  Name of header-record title to name objects by.
- quote (char)
  Optional character surrounding a field. One character only. Defaults to double quote.
- relax (boolean)
  Preserve quotes inside unquoted fields.
- relax_column_count (boolean)
  Discard lines with an inconsistent column count. Defaults to false.
- rowDelimiter (chars|constant)
  String used to delimit record rows or a special constant; special constants are 'auto', 'unix', 'mac', 'windows', 'unicode'; defaults to 'auto' (discovered in source or 'unix' if no source is specified).
- rtrim (boolean)
  If true, ignore whitespace immediately preceding the delimiter (i.e. right-trim all fields). Defaults to false. Does not remove whitespace in a quoted field.
- skip_empty_lines (boolean)
  Don't generate records for empty lines (lines matching /\s*/). Defaults to false.
- skip_lines_with_empty_values (boolean)
  Don't generate records for lines containing empty column values (columns matching /\s*/). Defaults to false.
- to (number)
  Stop returning records after a particular line.
- trim (boolean)
  If true, ignore whitespace immediately around the delimiter. Defaults to false. Does not remove whitespace in a quoted field.
All options are optional.
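To make the effect of the columns option concrete, the sketch below mimics how rows that are already split into fields would be reshaped for each accepted form of the option. It is an illustration written for this document, not code from the package. The function name shapeRecords is hypothetical, and it assumes that an explicit array of names leaves the first line in the data, while true and a function both consume it.

```javascript
// Illustration only (not csv-parse code): reshape rows the way the
// `columns` option is described to.
function shapeRecords(rows, columns) {
  if (!columns) return rows; // default (null): records stay arrays

  let names = columns;
  let data = rows;
  if (columns === true || typeof columns === 'function') {
    // The first line supplies the names, directly or through the callback.
    const [first, ...rest] = rows;
    names = columns === true ? first : columns(first);
    data = rest;
  }

  // Each remaining row becomes an object keyed by column name.
  return data.map(row => {
    const record = {};
    names.forEach((name, i) => { record[name] = row[i]; });
    return record;
  });
}

const rows = [['id', 'name'], ['1', 'Ada'], ['2', 'Linus']];
console.log(shapeRecords(rows, null)); // arrays, header line included
console.log(shapeRecords(rows, true)); // objects keyed by "id" and "name"
console.log(shapeRecords(rows, header => header.map(h => h.toUpperCase())));
```

The three calls show the difference in record shape: arrays by default, and objects once column names are known, whether they come from the first line or from a callback applied to it.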
Internal properties
These properties are for internal usage but may be useful to the
end user in some situations. They are accessible from the instance returned by
the parse function.
- count (number)
  Internal counter of records being processed.
- empty_line_count (number)
  Internal counter of empty lines.
- skipped_line_count (number)
  Number of non-uniform lines skipped when relax_column_count is true.
- lines (number)
  The number of lines encountered in the source dataset.
- is_int (regexp, function)
  The regular expression or function used to determine if a value should be cast to an integer.
- is_float (regexp, function)
  The regular expression or function used to determine if a value should be cast to a float.
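Because is_int and is_float accept either a regular expression or a function, a value can be tested the same way in both cases. The sketch below shows one way such a test could drive casting. It is not the package's implementation: the helper names matches and cast, the sample pattern and the sample predicate are all assumptions chosen for illustration.

```javascript
// Illustration only: the helper names and the rules below are assumptions.
function matches(test, value) {
  // Accept either a RegExp or a predicate function.
  return test instanceof RegExp ? test.test(value) : Boolean(test(value));
}

function cast(value, rules) {
  if (matches(rules.is_int, value)) return parseInt(value, 10);
  if (matches(rules.is_float, value)) return parseFloat(value);
  return value; // anything else stays a string
}

const rules = {
  is_int: /^[-+]?\d+$/,                                      // sample regexp
  is_float: v => v.trim() !== '' && !Number.isNaN(Number(v)) // sample function
};

console.log(['42', '-7', '3.14', 'abc', ''].map(v => cast(v, rules)));
```

Integers are tested first so that a value such as '42' is not swallowed by the broader float rule.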
Migration
Most of the parser is imported from its parent project, CSV, in an effort to split that project into the generator, the parser, the transformer and the stringifier.
As the "record" event has disappeared, you are encouraged to use the "readable" event
together with the "read" function as documented above and in the
Stream API.