Data layer access
You can leverage the Depot CLI to get data layer access to your Depot environment. Under the hood, the Depot CLI invokes your environment's Depot Gateway Lambda function. This function transforms your data layer access requests into requests for Depot REST API that sits above your environment.
Most data layer access commands require a target dataset (id) to be specified.
Use the environment-level depot list-datasets command to retrieve a list of datasets
in your environment, along with their ids. Filter the result of this command by piping
it into a tool such as jq. For example depot list-datasets | jq '.items[].id'
CREATE
There are 2 x mechanisms you can use to create data in your environment. One is direct creation using the API, and the other is creation via upsert in a Transaction.
Basic Create
To create a record in a target Dataset, you can use the CLI create command. Provide the target schema, dataset id, and input data.
depot create my.contacts.Person --datasetId a5ae93a56dcf --data "{\"id\": \"1\", \"firstName\": \"Hans\", \"lastName\": \"Zimmer\", \"address\": {\"street\": \"Regent Street\", \"houseNumber\": \"111\", \"postcode\": \"ZC11111\"}}"
Create with Transaction
Transactions offer a lot of flexibility with regard to the operations they perform. To simply create a record (or multiple records) you can issue a new upsert transaction with the create-transaction command. At a minimum, you'll need to provide the dataset id, input data location (as a URI) and format (JSON or PARQUET).
This action type upserts (i.e. creates or updates) all object schemas in the dataset with data made available in S3. JSON and Parquet formats are supported.
Assuming you have the following sample multi-line JSON data file in a S3 URI location s3://ws-example-snowflake-bucket196/test1:
{"id": 2, "firstName": "Alice", "lastName": "Smith", "address": {"street": "Def Rd", "houseNumber": "7", "postcode": "AC12345"}}
{"id": 3, "firstName": "Kim", "lastName": "Dotnet", "address": {"street": "Foo Lane", "houseNumber": "99", "postcode": "ZV5432F"}}
{"id": 4, "firstName": "Jesse", "lastName": "James", "address": {"street": "Test Lane", "houseNumber": "167", "postcode": "NM5432F"}}
and your Dataset uses a schema such as:
my.contacts.Person:
type: object
properties:
firstName:
type: string
description: The first name of the person
required: true
lastName:
type: string
description: The last name of the person
required: true
address:
type: my.contacts.Address
description: Address details for the person
required: true
my.contacts.Address:
type: struct
properties:
street:
type: string
houseNumber:
type: integer
postcode:
type: string
Then you can upsert these records with the create-transaction command like so:
depot create-transaction --dataset a5ae93a56dcf --data "s3://ws-example-snowflake-bucket196/test1" --format JSON
Monitor the status of Transactions with the list-transactions or get-transaction commands.
GET
To query data use the get command.
depot get my.contacts.Person --datasetId a5ae93a56dcf --id "1"
The result will be output as JSON to stdout. For example:
{
schema: 'my.contacts.Person',
firstName: 'Hans',
lastName: 'Zimmer',
address: { postcode: 'ZC11111', street: 'Regent Street' },
created: '2022-03-21T12:18:36.169Z',
id: '1',
updated: '2022-03-21T12:18:36.169Z',
version: 1,
hash: 'VPOv36qdVdh/2Orv4Jl/JA=='
}
The --expand argument can be used to expand certain fields in the response. For example --expand address,telephone.
LIST
To return a list of items in a dataset, you can use the list data command.
depot list my.contacts.Person --datasetId a5ae93a56dcf
You can further refine results by providing --filter, --limit, --sort, or --nextToken arguments. For example:
depot list my.contacts.Person --datasetId a5ae93a56dcf --limit 3 --filter 'lastName="Smith"'
or to sort by the lastName key:
depot list my.contacts.Person --datasetId a5ae93a56dcf --sort lastName
When using the list operation with multiple pages of results, the --nextToken argument can be used to retrieve the next page of results. For example:
depot list my.contacts.Person --datasetId a5ae93a56dcf --limit 3 --nextToken "nextTokenValueFromPreviousResponse"
Note: list ordering is not guaranteed when Snowflake is the backing storage engine for the Dataset's Location.
DELETE
depot delete my.contacts.Person --datasetId a5ae93a56dcf --id 1
PATCH
The patch command can be used to update a record. Provide a target dataset, item id, and the patch data. For example to change the lastName value for a record:
depot patch my.contacts.Person --datasetId a5ae93a56dcf --id 1 --data "{\"lastName\": \"Strimmer\"}"
The version field increments by 1 to show that the item has been updated.
{
schema: 'my.contacts.Person',
firstName: 'Hans',
lastName: 'Strimmer',
address: { postcode: 'ZC11111', street: 'Regent Street' },
created: '2022-03-21T12:37:16.134Z',
id: '1',
updated: '2022-03-21T12:39:06.134Z',
version: 2,
hash: 'rlpE17oIonDuWRxp4X7EBA=='
}
Retrieve a page of links for an object in a dataset with the link data command.
depot link contacts.Person --id 1 --datasetId a5ae93a56dcf --link address --sort created --limit 10
Hints and tips
GET requets for data in your environment will target your environment's primary storage location. LIST operations will query your environment's secondary storage location. For example, if you have a DynamoDB storage location and have added a second storage location using Snowflake. Then GET requests will retrieve information directly from the underlying DynamoDB table(s). LIST operations will query Snowflake.