Tools.Parse.Csv

Type	Read/write	Author	Availability
Direct provider	Read	Finbourne	Provided with LUSID

The Tools.Parse.Csv provider enables you to write a Luminesce query that reads CSV, text and more from cells within a table.

Note: The LUSID user running the query must have sufficient access control permissions to use this provider. This should automatically be the case if you are the domain owner.

See also: Tools.Parse.Xml

Basic usage

@data = 
select 
  '<filename>' as Filename,
  '<column-name>[, <column-name>...]
<some-data>[, <some-data>...]' as Content;

@parsed = 
use Tools.Parse.Csv with @input
--<optional-arguments>  
enduse;

select * from @parsed;

Input tables

Tools.Parse.Csv takes in one input table and outputs a table of data, see example 1.

Options

Tools.Parse.Csv has options that enable you to refine a query.

An option takes the form --<option>=<value>, for example --fileFilter=myFile.csv. Note no spaces are allowed either side of the = operator. If an option takes a boolean value, then specifying that option (for example --addFileName) sets it to True; omitting the option specifies False.

The table below provides information on the most commonly used options for Tools.Parse.Csv:

Option	Value	Status	Information
`--fileFilter`	Regex string, for example `MyFile.Csv`, `MyFile.*` or `File[1-3]`	Optional	Based on the `Filename` column, selects the rows which should be processed. If the `Filename` column is omitted, names files sequentially as `File1`, `File2` and so on.
`--addFileName`	Boolean	Optional	Adds a column to the result set containing the file the row came from.
`--delimiter`	String	Optional	Specifies the delimiter that separates values. Defaults to `,`.

To see a help screen of all available options, their data types, default values, and an explanation for each, run the following query using a suitable tool:

@x = use Tools.Parse.Csv 
--help
enduse;
select * from @x;

Examples

Note: For more example Luminesce queries, visit our GitHub repo.

Example 1: Reading a single CSV

@data = 
select 
  'MyFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text, more text, 1, 2' as Content
;

@parsed = 
use Tools.Parse.Csv with @data 
enduse;

select * from @parsed;

The table of data returned looks like this:

Example 2: Reading multiple CSVs at the same time

In this example, the --addFileName option is specified to ensure the first column in the table of results contains the filename corresponding to each row of data.

@data = 
select 
  'MyFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text, more text, 1, 2' as Content
union all
select
  'MyOtherFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
text from my other file, more text from my other file, 10, 20' as Content
union all
select
  'MyThirdFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text from my third file, more text from my third file, 100, 200' as Content
;

@parsed = 
use Tools.Parse.Csv with @data
--addFileName 
enduse;

select * from @parsed;

The table of data returned looks like this:

Example 3: Returning only rows with a particular filename

In this example, the --fileFilter option is specified to only select rows with a Filename value that contains ReportFile.

@data = 
select 
  'ReportFile_1.Csv' as Filename,
  'EmployeeName, Id
Jane Bloggs, 26' as Content
union all
select
  'ReportFile_2.Csv',
  'EmployeeName, Id
Joe Bloggs, 58'
union all
select
  'UnrelatedFile.Csv' as Filename,
'UnrelatedColumn1, UnrelatedColumn2
some unrelated text, some more text'
;

@parsed = 
use Tools.Parse.Csv with @data
--fileFilter=ReportFile.* 
enduse;

select * from @parsed;

The table of data returned looks like this, with UnrelatedFile.Csv being ignored for not matching the specified --fileFilter value:

Example 4: Returning only particular rows without any filenames

In this example, the --fileFilter option is specified to only select particular rows to process. As no Filename column is specified in @data, Tools.Parse.Csv gives each row a sequential name which can then be referred to in the --fileFilter option.

@data = 
select 
  'EmployeeName, Id
Jane Bloggs, 26' as Content
union all
select
  'EmployeeName, Id
Joe Bloggs, 58'
union all
select 
  'EmployeeName, Id 
Jane Doe, 73'
union all 
select 
  'EmployeeName, Id
John Doe, 107'
;

@parsed = 
use Tools.Parse.Csv with @data
--fileFilter=File[2,4] 
enduse;

select * from @parsed;

The table of data returned looks like this, containing only the second and fourth rows from @data: