Tools.Parse.Csv

Type: Direct provider

Read/write: Read

Author: Finbourne

Availability: Provided with LUSID

The Tools.Parse.Csv provider enables you to write a Luminesce query that reads CSV, text and more from cells within a table. 

Note: The LUSID user running the query must have sufficient access control permissions to use this provider. This should automatically be the case if you are the domain owner.

See also: Tools.Parse.Xml

Basic usage

@data = 
select 
  '<filename>' as Filename,
  '<column-name>[, <column-name>...]
<some-data>[, <some-data>...]' as Content;

@parsed = 
use Tools.Parse.Csv with @data
--<optional-arguments>  
enduse;

select * from @parsed;

Input tables

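Tools.Parse.Csv takes one input table and outputs a table of data; see example 1. In the examples on this page, the input table holds the text to parse in a Content column and, optionally, a name for each row in a Filename column.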

Options

Tools.Parse.Csv has options that enable you to refine a query.

An option takes the form --<option>=<value>, for example --fileFilter=myFile.csv. Note that no spaces are allowed on either side of the = operator. If an option takes a boolean value, specifying the option on its own (for example --addFileName) sets it to True; omitting it leaves it as False.
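The option syntax described above can be sketched as follows. This is a minimal illustration rather than a definitive pattern: the file names and data are hypothetical, and it assumes that several options can be listed on separate lines inside the same use block.

```sql
-- Two hypothetical CSV "files" held as cell values.
@data =
select
  'myFile.csv' as Filename,
  'EmployeeName, Id
Jane Bloggs, 26' as Content
union all
select
  'otherFile.csv' as Filename,
  'EmployeeName, Id
Joe Bloggs, 58' as Content
;

-- One option with a value (no spaces around =) and one boolean option
-- that is switched on simply by being present.
@parsed =
use Tools.Parse.Csv with @data
--fileFilter=myFile.csv
--addFileName
enduse;

select * from @parsed;
```

Under these assumptions, only the row named myFile.csv should be processed, and the result should include a column identifying the file each row came from.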

The most commonly used options for Tools.Parse.Csv are:

Option: --fileFilter
Value: Regex string, for example MyFile.Csv, MyFile.* or File[1-3]
Status: Optional
Information: Based on the Filename column, selects the rows which should be processed. If the Filename column is omitted, names files sequentially as File1, File2 and so on.

Option: --addFileName
Value: Boolean
Status: Optional
Information: Adds a column to the result set containing the file the row came from.

Option: --delimiter
Value: String
Status: Optional
Information: Specifies the delimiter that separates values. Defaults to a comma (,).
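As an illustration of the --delimiter option, the sketch below parses pipe-separated content. The file name and data are hypothetical, and it is an assumption that the provider accepts the | character written directly as the option value; check the --help output for the exact form expected in your environment.

```sql
-- Hypothetical pipe-delimited content held in a single cell.
@data =
select
  'PipeFile.Csv' as Filename,
  'EmployeeName|Id
Jane Bloggs|26' as Content
;

-- Assumes the delimiter character can be given as-is after the = operator.
@parsed =
use Tools.Parse.Csv with @data
--delimiter=|
enduse;

select * from @parsed;
```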

To see a help screen of all available options, their data types, default values, and an explanation for each, run the following query using a suitable tool:

@x = use Tools.Parse.Csv 
--help
enduse;
select * from @x;

Examples

Note: For more example Luminesce queries, visit our GitHub repo.

Example 1: Reading a single CSV

@data = 
select 
  'MyFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text, more text, 1, 2' as Content
;

@parsed = 
use Tools.Parse.Csv with @data 
enduse;

select * from @parsed;

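The table of data returned contains a single row, with the columns TextColumn1, TextColumn2, NumberColumn1 and NumberColumn2 taken from the header line of the CSV content.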

Example 2: Reading multiple CSVs at the same time

In this example, the --addFileName option is specified to ensure the first column in the table of results contains the filename corresponding to each row of data.

@data = 
select 
  'MyFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text, more text, 1, 2' as Content
union all
select
  'MyOtherFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
text from my other file, more text from my other file, 10, 20' as Content
union all
select
  'MyThirdFile.Csv' as Filename,
  'TextColumn1, TextColumn2, NumberColumn1, NumberColumn2
some text from my third file, more text from my third file, 100, 200' as Content
;

@parsed = 
use Tools.Parse.Csv with @data
--addFileName 
enduse;

select * from @parsed;

The table of data returned contains three rows, one from each CSV, with the first column showing the filename each row came from.

Example 3: Returning only rows with a particular filename

In this example, the --fileFilter option is specified to only select rows with a Filename value that contains ReportFile.

@data = 
select 
  'ReportFile_1.Csv' as Filename,
  'EmployeeName, Id
Jane Bloggs, 26' as Content
union all
select
  'ReportFile_2.Csv' as Filename,
  'EmployeeName, Id
Joe Bloggs, 58' as Content
union all
select
  'UnrelatedFile.Csv' as Filename,
  'UnrelatedColumn1, UnrelatedColumn2
some unrelated text, some more text' as Content
;

@parsed = 
use Tools.Parse.Csv with @data
--fileFilter=ReportFile.* 
enduse;

select * from @parsed;

The table of data returned contains only the rows from ReportFile_1.Csv and ReportFile_2.Csv; UnrelatedFile.Csv is ignored because it does not match the specified --fileFilter value.

Example 4: Returning only particular rows without any filenames

In this example, the --fileFilter option is specified to only select particular rows to process. As no Filename column is specified in @data, Tools.Parse.Csv gives each row a sequential name (File1, File2 and so on), which can then be referred to in the --fileFilter option.

@data = 
select 
  'EmployeeName, Id
Jane Bloggs, 26' as Content
union all
select
  'EmployeeName, Id
Joe Bloggs, 58'
union all
select
  'EmployeeName, Id
Jane Doe, 73'
union all
select 
  'EmployeeName, Id
John Doe, 107'
;

@parsed = 
use Tools.Parse.Csv with @data
--fileFilter=File[2,4] 
enduse;

select * from @parsed;

The table of data returned contains only the second and fourth rows from @data (Joe Bloggs and John Doe).