Multi-dataset tests

By default, a DepotTest operates on a single namespace — one set of schemas deployed to one location. This covers the vast majority of cases.

When your package declares schemas that span multiple datasets (e.g. a view in dataset petstore that references object schemas in dataset source), you need to exercise those cross-dataset references end-to-end. The multi-dataset entry point lets you do this within a single test.

info

Multi-dataset is an opt-in capability. The single-namespace form (DepotTest.in(namespace)) is unchanged and remains the recommended default for the common case.

The multi-dataset entry point

Instead of passing a single MergedSchemas namespace, pass an object whose keys are arbitrary dataset identifiers and whose values are dataset specs:

import { DepotTest, TestLocationType } from "@stage-tech/depot-test";

await DepotTest.in({
  source: {
    namespace: sourceNamespace,
    locationType: TestLocationType.SNOWFLAKE,
  },
  petstore: {
    namespace: petstoreNamespace,
    locationType: TestLocationType.SNOWFLAKE,
    using: ["source"],
  },
})
  .setContent({ data: { "source.dogs": dogs, "source.cats": cats }, dataset: "source" })
  .check({
    data: [
      {
        scope: { schema: "petstore.AnimalCount", arguments: { minAgeArg: 3 }, sortFields: [{ field: "animal" }] },
        values: [
          { animal: "dog", count: 2 },
          { animal: "cat", count: 1 },
        ],
      },
    ],
    dataset: "petstore",
  })
  .run();

Dataset spec fields

Field	Type	Description
`namespace`	`MergedSchemas`	The schemas that belong to this dataset.
`locationType`	`TestLocationType`	Where to deploy this dataset. Defaults to the value of `options({ locationType })` if set.
`using`	`string[]`	Keys of other datasets this dataset depends on. Mirrors the `using` field in your `PackageDataset` declaration.

`using` and cross-dataset references

The using array tells the backend which dependency datasets to make accessible when building this dataset's views. If petstore.animal contains SQL referencing source.dogs and source.cats, declaring using: ["source"] on the petstore spec ensures those schemas are in scope when the view DDL is resolved.
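As a rough illustration of the dependency relationship, the sketch below checks that every key listed in using refers to a dataset declared in the same map. The DatasetSpecLike type and the findUnknownUsing helper are hypothetical names invented for this example; they are not part of @stage-tech/depot-test, and the library's own validation may behave differently.

```typescript
// Hypothetical stand-in for a dataset spec; only the `using` field matters here.
interface DatasetSpecLike {
  using?: string[];
}

// Returns one "dataset -> missing dependency" entry for every `using` key
// that does not match a key in the dataset map.
function findUnknownUsing(datasets: Record<string, DatasetSpecLike>): string[] {
  const known = new Set(Object.keys(datasets));
  const problems: string[] = [];
  for (const name of Object.keys(datasets)) {
    for (const dep of datasets[name].using ?? []) {
      if (!known.has(dep)) {
        problems.push(`${name} -> ${dep}`);
      }
    }
  }
  return problems;
}
```

Running a check like this before submitting a test can catch a misspelled dependency key early, for example using: ["sources"] when the map only declares source.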

Targeting operations at a dataset

With multiple datasets, each operation needs to know which dataset it targets. Specify dataset when the operation reads from and writes to the same dataset, or sourceDataset and targetDataset for cross-dataset operations:

DepotTest.in({ source: { namespace: ... }, petstore: { namespace: ..., using: ["source"] } })
  // Seed into source
  .setContent({ data: { "source.dogs": dogs }, dataset: "source" })
  // Check in petstore
  .check({ data: [...], dataset: "petstore" })
  // Cross-dataset transaction: read from source, write to petstore
  .transaction({ source: "my.SourceSchema", target: "petstore.TargetSchema", sourceDataset: "source", targetDataset: "petstore" })
  .run();

The dataset shorthand is equivalent to setting both sourceDataset and targetDataset to the same value. A clear runtime error is thrown if an operation cannot resolve its dataset and no default has been set.
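One way to picture the resolution rule is the sketch below. It is only a mental model under stated assumptions: the DatasetTargeting and ResolvedDatasets types, the resolveDatasets helper, and its defaultDataset parameter are hypothetical, and the precedence shown (explicit sourceDataset / targetDataset first, then dataset, then the default) is an assumption rather than documented behaviour.

```typescript
// Hypothetical shape of the dataset-related fields on an operation.
interface DatasetTargeting {
  dataset?: string;
  sourceDataset?: string;
  targetDataset?: string;
}

interface ResolvedDatasets {
  sourceDataset: string;
  targetDataset: string;
}

// Expands the `dataset` shorthand into both sides and throws if either
// side is still unresolved once the (assumed) default has been applied.
function resolveDatasets(op: DatasetTargeting, defaultDataset?: string): ResolvedDatasets {
  const sourceDataset = op.sourceDataset ?? op.dataset ?? defaultDataset;
  const targetDataset = op.targetDataset ?? op.dataset ?? defaultDataset;
  if (sourceDataset === undefined || targetDataset === undefined) {
    throw new Error("Operation does not specify a dataset and no default dataset is set");
  }
  return { sourceDataset, targetDataset };
}
```

Under this model, { dataset: "petstore" } and { sourceDataset: "petstore", targetDataset: "petstore" } resolve to the same pair.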

Backend constraints

All datasets in a multi-dataset test must use the same backend family. Mixing backends (e.g. SNOWFLAKE for one dataset and AURORA for another) is rejected with a descriptive error at job submission time, before any infrastructure is provisioned.

caution

SNOWFLAKE + AURORA in the same DepotTest.in(map) call is not supported and will be rejected. Cross-location multi-dataset tests are a separate, deferred capability.

SNOWFLAKE and SNOWFLAKE_MOCK_ICEBERG datasets may be combined freely — both are Snowflake-backed and differ only in naming strategy.
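The rule can be sketched as a grouping of location types into backend families. The LocationTypeName type, the BACKEND_FAMILY table, and assertSingleBackendFamily are hypothetical and cover only the three location types named in this section; the real TestLocationType enum and the library's actual check and error message may differ.

```typescript
// Local stand-in for the location types mentioned in this section.
type LocationTypeName = "SNOWFLAKE" | "SNOWFLAKE_MOCK_ICEBERG" | "AURORA";

// Assumed grouping, derived only from the rules stated above.
const BACKEND_FAMILY: Record<LocationTypeName, string> = {
  SNOWFLAKE: "snowflake",
  SNOWFLAKE_MOCK_ICEBERG: "snowflake",
  AURORA: "aurora",
};

// Throws when the datasets span more than one backend family.
function assertSingleBackendFamily(datasets: Record<string, LocationTypeName>): void {
  const families = new Set(Object.keys(datasets).map((key) => BACKEND_FAMILY[datasets[key]]));
  if (families.size > 1) {
    const listed = Array.from(families).sort().join(", ");
    throw new Error(`Mixed backend families: ${listed}`);
  }
}
```

With this grouping, a map of SNOWFLAKE and SNOWFLAKE_MOCK_ICEBERG datasets passes, while SNOWFLAKE together with AURORA fails.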

Complete example

The example below splits the catsAndDogs namespace (where petstore.animal is a view that unions source.dogs and source.cats) into two logical datasets to exercise the cross-dataset reference end-to-end.

multi-dataset.test.ts
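import { DepotTest, TestLocationType } from "@stage-tech/depot-test";
// MergedSchemas (type) and catsAndDogs (the combined namespace) must also be imported;
// their module paths depend on your project and are not shown here.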

const sourceNamespace: MergedSchemas = {
  "source.dogs": catsAndDogs["source.dogs"],
  "source.cats": catsAndDogs["source.cats"],
};

const petstoreNamespace: MergedSchemas = {
  "petstore.animal": catsAndDogs["petstore.animal"],
  "petstore.AnimalCount": catsAndDogs["petstore.AnimalCount"],
};

describe("multi-dataset cross-dataset view", () => {
  it("petstore.AnimalCount resolves across datasets", async () => {
    await DepotTest.in({
      source: {
        namespace: sourceNamespace,
        locationType: TestLocationType.SNOWFLAKE,
      },
      petstore: {
        namespace: petstoreNamespace,
        locationType: TestLocationType.SNOWFLAKE,
        using: ["source"],
      },
    })
      .setContent({
        data: {
          "source.dogs": [
            { id: "1", name: "Charlie", age: 1, breed: "Akita" },
            { id: "2", name: "Jax",     age: 4, breed: "Coonhound" },
            { id: "3", name: "Ginger",  age: 5, breed: "Bulldog" },
          ],
          "source.cats": [
            { id: "1", name: "Poppy", age: 1, breed: "British Shorthair" },
            { id: "2", name: "Luna",  age: 2, breed: "Siamese" },
            { id: "3", name: "Daisy", age: 5, breed: "Ragdoll" },
          ],
        },
        dataset: "source",
      })
      .check({
        data: [
          {
            scope: {
              schema: "petstore.AnimalCount",
              arguments: { minAgeArg: 3 },
              sortFields: [{ field: "animal" }],
            },
            values: [
              { animal: "dog", count: 2 },
              { animal: "cat", count: 1 },
            ],
          },
        ],
        dataset: "petstore",
      })
      .run();
  });
});

tip

When using programmatic data input with multiple datasets, the data map in setContent can contain schemas from any dataset — the backend routes each schema to its owning dataset based on the namespace declarations. Only the dataset prop on the operation itself determines which dataset's infrastructure handles the step.
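The routing described in this tip happens in the backend. As a conceptual model only, the sketch below groups the schema names of a data map by the dataset whose namespace declares them. The routeSchemasToDatasets helper and its input shapes are hypothetical and say nothing about how the backend actually implements routing.

```typescript
// Hypothetical helper: groups schema names by the dataset that declares them.
// `namespaces` maps each dataset key to the schema names in its namespace.
function routeSchemasToDatasets(
  schemaNames: string[],
  namespaces: Record<string, string[]>,
): Record<string, string[]> {
  const routed: Record<string, string[]> = {};
  for (const schema of schemaNames) {
    // The owner is the first dataset whose namespace lists this schema.
    const owner = Object.keys(namespaces).find((ds) => namespaces[ds].indexOf(schema) !== -1);
    if (owner === undefined) {
      throw new Error(`No dataset declares schema ${schema}`);
    }
    if (!routed[owner]) {
      routed[owner] = [];
    }
    routed[owner].push(schema);
  }
  return routed;
}
```

For the complete example above, source.dogs and source.cats would both be grouped under source, because only the source namespace declares them.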
