Dataset

Overview

A Dataset is where data is stored in pre-defined schemas. It can, for example, contain the data for an application, replicated data from another system, or a set of materialised views that transform data from other datasets as part of a data pipeline. A dataset is defined by a package (its structure) and a store (its versioned storage location).

new depot.Dataset(this, 'SampleSourceDataset', {
  environment: depotEnvironment,
  name: 'sample.source',
  location: myLocation,
  package: myPackage
});

A Dataset is typically configured with a storage location. This can be either a direct target location or a composite location made up of tiered storage (e.g. primary and secondary).

Dataset Executors

You can assign specific executors to a Dataset so that each is used for a particular purpose. Use the optional executors property on a Dataset to map purposes to executor targets.

For example:

import { DatasetExecutorPurpose } from '@stage-tech/depot-cdk/dist/stage-depot-dataset';

// Placeholder executors; replace these with real executor targets.
const executor1 = {}; // executor for deployment actions
const executor2 = {}; // executor for rollup purposes
const executor3 = {}; // executor for API calls

new depot.Dataset(this, "Dataset", {
  environment: depotEnvironment,
  location: myLocation,
  name: "sample.source",
  package: myPackage,
  // Each entry maps one or more purposes to the executor that should handle them.
  executors: [
    { purposes: [DatasetExecutorPurpose.DEPLOYMENT], executor: executor1 },
    { purposes: [DatasetExecutorPurpose.ROLLUP], executor: executor2 },
    { purposes: [DatasetExecutorPurpose.API], executor: executor3 }
  ]
});

Executor Selection Logic

When no executor is assigned to a given purpose on a Dataset, Depot falls back to the default Executor on the Dataset's Location. That default Executor is then used for whichever operation is missing a designated purpose. If no Executor can be found through this search chain, an error is thrown stating that an Executor could not be located.
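
The selection order described above can be illustrated with a small standalone sketch. This is not Depot's actual implementation: the types DatasetLike, LocationLike and ExecutorMapping, the defaultExecutor field, and the resolveExecutor function are all hypothetical names introduced here only to show the lookup order (purpose mapping first, then the Location default, then an error).

```typescript
// Hypothetical types for illustration only; these are NOT the Depot CDK's definitions.
type Purpose = "DEPLOYMENT" | "ROLLUP" | "API";

interface Executor {
  name: string;
}

interface ExecutorMapping {
  purposes: Purpose[];
  executor: Executor;
}

interface LocationLike {
  // Assumed field name for the Location's default executor.
  defaultExecutor?: Executor;
}

interface DatasetLike {
  location: LocationLike;
  executors?: ExecutorMapping[];
}

// Hypothetical helper that mirrors the documented search chain.
function resolveExecutor(dataset: DatasetLike, purpose: Purpose): Executor {
  // 1. Look for an executor explicitly mapped to this purpose on the Dataset.
  const mapping = (dataset.executors ?? []).find((m) => m.purposes.includes(purpose));
  if (mapping) {
    return mapping.executor;
  }
  // 2. Fall back to the default executor on the Dataset's Location.
  const fallback = dataset.location.defaultExecutor;
  if (fallback) {
    return fallback;
  }
  // 3. Nothing found anywhere in the chain.
  throw new Error(`Executor could not be located for purpose ${purpose}`);
}

const dataset: DatasetLike = {
  location: { defaultExecutor: { name: "location-default" } },
  executors: [{ purposes: ["ROLLUP"], executor: { name: "rollup-executor" } }],
};

console.log(resolveExecutor(dataset, "ROLLUP").name); // prints "rollup-executor"
console.log(resolveExecutor(dataset, "API").name); // prints "location-default"
```

In this sketch, only ROLLUP has a dedicated executor, so an API operation resolves to the Location's default. A Dataset with neither a matching mapping nor a Location default would raise the error.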