Migrations: Technical notes

Migration Plan file format

The Migration Plan file is a JSON file that conforms to the MigrationPlan interface.

During the planning phase, Depot populates the metadata field to help migration plan transformers (or post-migration custom transform engines) map the Depot schema and field names to the objects and columns that realize them in the target SQL database, taking into account the applicable NamingStrategy and any overrides.

During asynchronous execution, Depot ignores the metadata field (a plan transformer may omit it entirely from its output) and falls back to the applicable SqlNamingStrategy for any name that still needs to be translated.

Migration Planned events

The following is a sample event as sent on the event bus:

{
  "version": "0",
  "id": "29ebe1db-cb39-15de-6844-c9aad9072b22",
  "detail-type": "migration-plan",
  "source": "tech.stage.depot",
  "account": "058264526050",
  "time": "2025-04-28T08:31:35Z",
  "region": "eu-west-1",
  "resources": [],
  "detail": {
    "environmentId": "e2e404529091ed2",
    "datasetId": "6d9c7969f0b1",
    "datasetAlias": "WsDevDogSrcDeferredMigration",
    "environmentUri": "s3://sdp-bootstrap-058264526050-eu-west-1/sdp-e2e404529091ed2/datasets/6d9c7969f0b1/04065ddb4e45814edf8fd9fd82e7f72cb3a1d584dd9b5300d4561d2d48c70487.json",
    "oldEnvironmentUri": "s3://sdp-bootstrap-058264526050-eu-west-1/sdp-e2e404529091ed2/datasets/6d9c7969f0b1/39cf367fe2a91b676d0065712e670809dc8a2abf514971b443254c22a79fbc65.json",
    "datasetVersionHash": "42b14733ff9d8f8de9f22524ef466a13f3b6511f7cb9ebb74de3ab59c24c5b37",
    "oldDatasetVersionHash": "06aeef48489f428db3f91b3878cad61b828142f8f5b5557a2460ec9a3565e40a",
    "migrationPlanBucket": "sdp-bootstrap-058264526050-eu-west-1",
    "migrationPlanKey": "sdp-e2e404529091ed2/migrations/6d9c7969f0b1_WsDevDogSrcDeferredMigration/2025-04-28/2025-04-28T08:31:33.816Z_MIGRATION_PLANNED_6d9c7969f0b1_WsDevDogSrcDeferredMigration_01967b85-dd5c-7306-b067-0a06ab34d6ae.json",
    "migrationId": "01967b85-dd5c-7306-b067-0a06ab34d6ae",
    "summary": "migration-planned",
    "executionResponsibility": "to-be-executed-by-receiver"
  }
}

It is recommended to debounce using the .id field (the EventBridge event id).

Asynchronous migrations should be executed when .detail.executionResponsibility equals to-be-executed-by-receiver. The other possible value, executed-by-sender, means that the migration has already been executed, as in the synchronous choreography.
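As a rough illustration of the two checks above, a receiver could be sketched as follows. The field names come from the sample event; the type names, the function name, and the in-memory set are assumptions made for this example and are not part of Depot. A real receiver would need a store shared across invocations.

```typescript
// Field names are taken from the sample event above; the type and function
// names are hypothetical and only serve this illustration.
interface MigrationPlannedDetail {
  datasetId: string;
  datasetAlias: string;
  migrationId: string;
  migrationPlanBucket: string;
  migrationPlanKey: string;
  executionResponsibility: "to-be-executed-by-receiver" | "executed-by-sender";
}

interface MigrationPlannedEnvelope {
  id: string; // EventBridge event id
  "detail-type": string;
  detail: MigrationPlannedDetail;
}

// In-memory record of event ids already handled (assumption: a single
// long-lived process; otherwise use a shared store).
const seenEventIds = new Set<string>();

// Returns true only for the first delivery of an event that the receiver
// is responsible for executing.
function shouldExecuteMigration(event: MigrationPlannedEnvelope): boolean {
  if (seenEventIds.has(event.id)) {
    return false; // duplicate delivery of the same event
  }
  seenEventIds.add(event.id);
  return event.detail.executionResponsibility === "to-be-executed-by-receiver";
}
```

Recording the id before checking the responsibility means that a repeated executed-by-sender event is also recognised as a duplicate.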

tip

The rule should be created before the dataset, so your Dataset construct should depend on the rule in the CDK. As a consequence, the datasetId is not yet known when the rule is created. You should therefore filter on the datasetAlias to keep the rule specific to that dataset.
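A minimal sketch of such a filter, written as a plain EventBridge event pattern object. The source and detail-type values are copied from the sample event and may differ in your environment. The function name is hypothetical, and the extra filter on executionResponsibility is an optional assumption rather than a requirement.

```typescript
// Builds an EventBridge event pattern that matches "migration planned"
// events for a single dataset, identified by its alias.
// Values are copied from the sample event; adjust them to your environment.
function migrationPlannedPattern(datasetAlias: string) {
  return {
    source: ["tech.stage.depot"],
    "detail-type": ["migration-plan"],
    detail: {
      // The datasetId is not known yet, so the alias is used instead.
      datasetAlias: [datasetAlias],
      // Optional: only match events that the receiver must execute.
      executionResponsibility: ["to-be-executed-by-receiver"],
    },
  };
}
```

The returned object can be serialised with JSON.stringify or passed to whichever rule construct you use.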

The detail part conforms to the MigrationPlannedEvent interface.

Asynchronous event chronology

When the user did not request customEventBinding, Depot sets up the EventBridge rule that listens for the migration event and triggers the asynchronous execution.

The rule and execution infrastructure are set up in the main Platform CDK stack, ahead of creating or updating the dataset's own stack.

The mechanism is as follows (simplified):

note

For legibility, the EventBridge "lifecycle" messages are not shown in the diagram. They are sent immediately after each PATCH operation to the dataset.

In reality, the "Migration Planned" message does not directly trigger the Batch migration executor. Instead, a debounce mechanism is used to ensure that only one migration process is executed at a time per dataset:

The debounce mechanism is necessary because, when the main EventBridge rule (which picks up Migration Events after the migration planning) is updated, CloudFormation may provisionally create a new rule alongside the old one before deleting the old one. As a result, more than one instance of the EventBridge rule can be active when the command to perform the migration is sent.

The "debounce" mechanism ensures that only one migration will be triggered even in that circumstance. It is implemented using a FIFO SQS queue, with the hash of the event content as the message deduplication id over a one-minute window. Messages are also grouped by datasetId, so that event ordering is not enforced beyond a single dataset.

The debounce window is kept short (about one minute), so that if a user (operator) needs to manually restart the deferred-dataset-migration-runner step function, the manually initiated message is not absorbed.
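The derivation of the deduplication and grouping keys can be sketched as below. This is an illustration under assumptions, not Depot's implementation: it assumes that the hashed "event content" is the JSON-serialised detail object and that SHA-256 is the hash, and the function and type names are hypothetical. The property names match the SQS SendMessage parameters; how the one-minute window is enforced is not shown.

```typescript
import { createHash } from "node:crypto";

// Subset of the SQS SendMessage parameters relevant to FIFO queues.
interface FifoMessage {
  MessageBody: string;
  MessageGroupId: string;
  MessageDeduplicationId: string;
}

// Assumption: the event content being hashed is the serialised detail object.
function toFifoMessage(detail: { datasetId: string } & Record<string, unknown>): FifoMessage {
  const body = JSON.stringify(detail);
  return {
    MessageBody: body,
    // Grouping by dataset keeps ordering per dataset only.
    MessageGroupId: detail.datasetId,
    // Identical content yields the same id, so a second copy is dropped.
    MessageDeduplicationId: createHash("sha256").update(body).digest("hex"),
  };
}
```

Two rules delivering the same event produce the same deduplication id, while a later, distinct migration for the same dataset produces a different one and is queued in the same message group.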

Concurrency control

Synchronous migrations

Each Dataset is mapped to a CloudFormation resource. CloudFormation will ensure that only one update per dataset can run at any point in time.

Asynchronous migrations

The migration executor is implemented as an AWS Batch job. The job definition is configured to require a "consumable" resource to allow the execution to begin.

Depot defines:

one Replenishable consumable resource per asynchronous-migration dataset, with a capacity of 1
one Job Definition per asynchronous-migration dataset, requiring 1 unit of the corresponding consumable resource
a step function using the "dispatcher" pattern to submit a Batch job with the job definition corresponding to the dataset being migrated

This mechanism ensures that only one migration can be executed at a time per dataset. AWS Batch ensures that consecutive migration requests are queued and executed in FIFO order for that dataset. Lastly, migrations belonging to distinct datasets can execute in parallel.
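The resulting behaviour can be modelled with a small in-memory sketch. This is a toy model of the semantics described above (one token per dataset, FIFO waiting list, independent datasets). It does not call AWS Batch, and the class and method names are hypothetical.

```typescript
// Toy model: each dataset owns a single token. A migration runs only while
// it holds the token; later requests wait in FIFO order.
class DatasetMigrationScheduler {
  private running = new Set<string>();
  private waiting = new Map<string, string[]>();

  // Returns true if the migration starts immediately, false if it is queued.
  submit(datasetId: string, migrationId: string): boolean {
    if (!this.running.has(datasetId)) {
      this.running.add(datasetId); // take the dataset's only token
      return true;
    }
    const queue = this.waiting.get(datasetId) ?? [];
    queue.push(migrationId);
    this.waiting.set(datasetId, queue);
    return false;
  }

  // Called when the running migration finishes. Hands the token to the next
  // queued migration and returns its id, or releases the token if none waits.
  complete(datasetId: string): string | undefined {
    const next = this.waiting.get(datasetId)?.shift();
    if (next === undefined) {
      this.running.delete(datasetId);
    }
    return next;
  }
}
```

In this model, a second request for the same dataset waits until complete is called, whereas a request for another dataset starts at once.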

tip

It is possible to inspect the consumable resources in the AWS Batch console, under the "Consumable Resources" tab.

Discussion

Job Definitions may consume more than one consumable resource. It is possible to imagine updating the per-dataset job definitions to also require things like:

one unit of the per-dataset consumable that corresponds to dependency datasets, or perhaps dependents, as a way to organise cascading dataset updates if we choose to support this
one unit of a "global" consumable (with as many tokens as there are asynchronous-migration datasets) in order to simplify the computation of a metric we could chart in the CloudWatch Dashboard

to be discussed
