Read/write files to Google Cloud Storage (Gcs.x)

Prev Next

You can use the Gcs.* providers supplied by FINBOURNE to interact with files stored in Google Cloud Storage (GCS) buckets.

Providing you have a suitable license and access control permissions, you can:

  • Read data from CSV, Excel, SQLite, XML, Parquet and raw text files stored in one or more buckets, perhaps in preparation for loading investment data into LUSID.

  • Write data to files of the same types, perhaps having retrieved investment data from LUSID.

  • Search for files (both flat and hierarchical buckets are supported).

Loading a key for a GCS service account into LUSID

This section assumes you have a GCS project with at least one bucket. To see how to set up a service account for the project and create a JSON key with a suitable role, read the appendix.

You must load the JSON key for the service account into the LUSID Configuration Store. The following instructions are for the LUSID web app but you can use the Configuration Store REST API if you wish:

  1. Sign in to the LUSID web app as a user with administrative privileges.

  2. Navigate to System Settings > Configuration Store.

  3. Choose the Shared keys tab to create a key that everyone can use, or Personal keys to restrict the key just to you. Note if you have multiple keys, the latter are preferred in alphabetical order of Code (see below).

  4. Click the Add configuration set button.

  5. In the New set dialog, specify a Scope of luminesce-provider and a Code of the form <your-memorable-string>-gcs-config, for example test-gcs-config. Note these values are case-sensitive:

  6. In the Add item area, enter a Key of GCS-CREDENTIAL-JSON (case-sensitive) and a Value of the JSON key for the service account, and click the Add item button:

  7. Click the Save button to add the set to the list of keys:

Understanding access control permissions for the Gcs.* providers

You, and any other LUSID user wishing to run queries against a Gcs.* provider, must have suitable LUSID access control permissions. The general principle of creating policies for Luminesce providers, assigning them to roles and roles to LUSID users, is explained in this article. Note you should automatically have sufficient privileges if you are the LUSID domain owner with the built-in lusid-administrator role.

Getting help for the Gcs.* providers

You can access the latest built-in help for Gcs.* providers by running a query like this:

@x = use Gcs.Csv
        --help
        enduse;

Using the Gcs.* providers

The Gcs.* providers are very similar to the Drive.* providers for interacting with files in Drive.

For example,  use the Gcs.File provider to list all the files in a bucket:

select * from Gcs.File where RootPrefix = 'luminesce-provider-test-hierarchical-bucket'

Use the Gcs.Csv provider to download the contents of a CSV file in a bucket to a table variable:

@x = use Gcs.Csv
        --file=luminesce-provider-test-hierarchical-bucket/my_file.csv
        enduse;
        select * from @x

Use the Gcs.SaveAs provider to write a table of data to a CSV file and upload it to a bucket:

@data = select distinct TableName from Sys.Field order by 1;
        @x = use Gcs.SaveAs with @data
        --type=csv
        --path=luminesce-provider-test-hierarchical-bucket
        --fileNames=my_file
        enduse;
        select * from @x

For examples of other operations, see the documentation for the similar Drive.* providers below:

To interact with...

Read from using the ... provider
(links are to docs for similar Drive.* providers)

Write to using the ... provider
(links are to docs for similar Drive.* providers)

CSV and similarly-structured files

Gcs.Csv

Gcs.SaveAs

Excel files

Gcs.Excel

SQlite files

Gcs.Sqlite

XML files

Gcs.Xml

Text files

Gcs.RawText

Apache Parquet files

Gcs.Parquet

Find files

Gcs.File

N/A

Appendix: Setting up a service account for a GCS project

  1. Navigate to console.cloud.google.com, open your GCS project and select IAM & Admin > Service Accounts from the left-hand menu.

  2. Click +Create service account and specify a name and description.

  3. In the Permissions area, select a role of Storage Object User:

  4. Click the Done button to create the service account.

  5. In the list of service accounts, select Manage keys from the Actions column dropdown:

  6. Select Create new key from the Add key dropdown.

  7. Choose JSON and click Create to download a JSON key for loading into the LUSID Configuration Store, for example:

    {
            "type": "service_account",
            "project_id": "luminesce-provider-test",
            "private_key_id": "redacted-private-key-id",
            "private_key": "-----BEGIN PRIVATE KEY-----\redacted-private-key\n-----END PRIVATE KEY-----\n",
            "client_email": "luminesce-provider-test-servic@luminesce-provider-test.iam.gserviceaccount.com",
            "client_id": "102254980429290973242",
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
            "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
            "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/luminesce-provider-test-servic%40luminesce-provider-test.iam.gserviceaccount.com",
            "universe_domain": "googleapis.com"
    }

    Note if you load the key using the Configuration Store REST API you must escape all the " quote characters in the key.