CloudFormation Stack Structure
Depot environments are deployed as a hierarchy of CloudFormation stacks. Understanding this hierarchy is important when reading deploy logs, estimating template size headroom, and diagnosing cross-stack dependency issues.
Stack hierarchy
Each Depot environment maps to one root CloudFormation stack (DeploymentEnvironment) plus several nested stacks. Each nested stack template is uploaded as a separate S3 object and has its own 1 MB template size budget.
| Nested stack | Logical ID | Conditional | Purpose |
|---|---|---|---|
| MigrationSupport | Migrations | Yes, only if deferred migrations are configured | Batch compute environment, job queue, job definition, debouncer, and executor Step Function for asynchronous dataset migrations |
| S3TablesCommon | S3Tables | Yes, only if one or more S3 Tables locations exist | Table bucket, Lake Formation role, and guard provider for S3 Tables-backed datasets |
| Datasets | Datasets | No, always present | Container for all individual dataset nested stacks |
| SeedingDispatcher | SeedingDispatcher | Yes, only if any datasets are seedable | Step Function and Lambda for coordinating initial data seeding |
| Dataset-<id> | Dataset<datasetId> | No, one per declared dataset | All resources for a single dataset: state table entry, S3 prefix policy, executor-specific custom resources (Snowflake, Aurora, Iceberg, etc.) |
SnowflakeCommon, IcebergCommon, AuroraCommon, and OpsAlarms are regular CDK Constructs (not nested stacks) and contribute their resources directly to the root template.
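The conditions in the table can be sketched as a small function that lists which nested stacks to expect for a given environment. This is only an illustration: `EnvConfig` and its fields are hypothetical names, not Depot's real configuration schema, and the sketch assumes every nested stack other than Dataset-<id> sits directly under the root.

```python
from dataclasses import dataclass


@dataclass
class EnvConfig:
    """Hypothetical environment settings; not Depot's actual config schema."""
    dataset_ids: list[str]
    deferred_migrations: bool = False
    s3_tables_locations: int = 0
    seedable_dataset_ids: frozenset[str] = frozenset()


def expected_nested_stacks(cfg: EnvConfig) -> dict[str, list[str]]:
    """Map each parent stack to the nested stacks expected beneath it."""
    root_children = []
    if cfg.deferred_migrations:
        root_children.append("MigrationSupport")
    if cfg.s3_tables_locations > 0:
        root_children.append("S3TablesCommon")
    root_children.append("Datasets")  # always present
    if cfg.seedable_dataset_ids:
        root_children.append("SeedingDispatcher")
    return {
        "DeploymentEnvironment": root_children,
        # One Dataset-<id> nested stack per declared dataset.
        "Datasets": [f"Dataset-{d}" for d in cfg.dataset_ids],
    }


if __name__ == "__main__":
    cfg = EnvConfig(dataset_ids=["orders", "users"], s3_tables_locations=1)
    for parent, children in expected_nested_stacks(cfg).items():
        print(parent, "->", children)
```

Comparing this expected list against the stacks named in a deploy log is one way to spot a conditional stack that is unexpectedly missing or present.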
Template size budget
CloudFormation enforces a hard 1 MB limit per template. Nested stacks each have their own independent 1 MB budget, so the strategy is to push resources down into nested stacks rather than accumulate them in the root.
The root template is the most constrained because it references every nested stack and holds all top-level resources (VPC, KMS keys, IAM roles, Snowflake/Aurora/Iceberg common resources). Adding datasets does not increase the root template — only the Datasets nested stack grows, which has its own budget.
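Headroom against the limit can be estimated by measuring the encoded size of a template body. The following is a minimal sketch, not part of Depot's tooling: the function name is hypothetical, and it assumes "1 MB" means 1,000,000 bytes, which is the more conservative reading.

```python
import json

# Assumption: "1 MB" is read as 1,000,000 bytes (the conservative interpretation).
TEMPLATE_LIMIT_BYTES = 1_000_000


def template_headroom(body: str, limit: int = TEMPLATE_LIMIT_BYTES) -> dict:
    """Report the UTF-8 encoded size of one template body against the limit."""
    size = len(body.encode("utf-8"))
    return {
        "size_bytes": size,
        "remaining_bytes": limit - size,
        "percent_used": round(100 * size / limit, 1),
    }


if __name__ == "__main__":
    # Tiny stand-in template; real input would be the synthesized JSON text.
    body = json.dumps({"Resources": {}})
    print(template_headroom(body))
```

The size depends on how the template is serialized, so the measurement is only meaningful when taken on the exact text that gets uploaded.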
Typical sizes by environment class
Sizes are measured from deployed stacks and vary as the codebase evolves. All sizes are JSON-encoded template bytes rounded to the nearest KB.
| Environment | Dataset count | Root | Datasets template | Migrations | SeedingDispatcher | S3Tables | Dataset-X (min / median / max) |
|---|---|---|---|---|---|---|---|
| e2e | 18 | ~414 KB | ~23 KB | ~43 KB | ~8 KB | ~1 KB | ~1.5 KB / ~14 KB / ~36 KB |
| sit | 107 | ~512 KB | ~127 KB | ~67 KB | ~26 KB | — | — |
| master | 159 | ~555 KB | ~202 KB | ~186 KB | ~49 KB | — | ~1.5 KB / ~14 KB / ~28 KB |
S3Tables is present only in environments with at least one S3 Tables location configured. The Migrations and SeedingDispatcher nested stacks scale with dataset count because they hold per-dataset event rules and seeding state machines respectively.
Per-dataset size variance
Datasets template
Each additional dataset adds one AWS::CloudFormation::Stack resource entry plus any cross-stack output wiring. Comparing sit (107 datasets, ~127 KB) and master (159 datasets, ~202 KB):
(202 KB − 127 KB) / (159 − 107) ≈ 1.4 KB per dataset
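At ~1.4 KB per dataset and a 1 MB limit, the Datasets template can accommodate roughly 700 datasets before approaching its ceiling. Treat this as a rough estimate: the straight line through the sit and master points has an intercept of about −27 KB rather than a positive fixed overhead, and the average cost in the table is lower (~1.2–1.3 KB per dataset), so the marginal rate is the conservative figure to plan with.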
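The estimate above can be reproduced from the measured sizes. This sketch uses only the figures in the table; the 1,000 KB limit is an assumption about how "1 MB" is counted, and the linear extrapolation is a simplification.

```python
# Assumption: the 1 MB limit is treated as 1,000 KB.
LIMIT_KB = 1000

# (dataset count, Datasets template size in KB), taken from the table above.
MEASURED = {
    "e2e": (18, 23.0),
    "sit": (107, 127.0),
    "master": (159, 202.0),
}


def marginal_kb(a: tuple[int, float], b: tuple[int, float]) -> float:
    """Size growth per additional dataset between two measured environments."""
    (n_a, kb_a), (n_b, kb_b) = a, b
    return (kb_b - kb_a) / (n_b - n_a)


slope = marginal_kb(MEASURED["sit"], MEASURED["master"])
capacity = int(LIMIT_KB / slope)

if __name__ == "__main__":
    print(f"marginal cost: {slope:.2f} KB per dataset")
    print(f"approximate capacity: {capacity} datasets")
    for env, (count, kb) in MEASURED.items():
        print(f"{env}: average {kb / count:.2f} KB per dataset")
```

With these inputs the capacity comes out just under 700 datasets, in line with the figure quoted above.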
Per Dataset-X template
Dataset templates vary significantly with the number of executors configured (Snowflake, Aurora, Iceberg) and their custom resources. The min/median/max range in the master environment (~1.5 KB / ~14 KB / ~28 KB) spans simple S3-only datasets through fully configured multi-executor datasets. The largest Dataset-X template in the table is ~36 KB, measured in e2e.
Root template
The root template is nearly invariant to dataset count. Adding datasets has negligible impact on the root because all dataset resources live in the Datasets subtree.
| Adding one more… | Root impact | Datasets template | Dataset-X template | Notes |
|---|---|---|---|---|
| Any dataset | ~0 bytes | ~1.4 KB | ~1.5–28 KB | fixed overhead per Dataset-X stack reference in Datasets |
Why nested stacks
CDK's NestedStack construct uploads each template as a separate S3 object and references it from the parent via AWS::CloudFormation::Stack. This allows environments with 100+ datasets to stay under the 1 MB root limit: the root template holds only a single Datasets reference, and the Datasets template holds one reference per dataset.
The two-level nesting (root → Datasets → Dataset-X) isolates the growth: deploying a new dataset expands only the Datasets template by ~1.4 KB and adds one new Dataset-X template (~1.5–28 KB), leaving the root template unchanged.
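The isolation effect can be illustrated with plain dictionaries shaped like CloudFormation templates. This is a toy model rather than CDK output: the helper names and the `example.invalid` URLs are placeholders, and real templates carry parameters, outputs, and metadata that are omitted here.

```python
import json


def stack_ref(template_url: str) -> dict:
    """A nested stack resource that points at a separately uploaded template."""
    return {
        "Type": "AWS::CloudFormation::Stack",
        "Properties": {"TemplateURL": template_url},
    }


def build_templates(dataset_ids: list[str]) -> tuple[dict, dict]:
    """Return (root, datasets) templates for the given dataset IDs."""
    # The root holds a single reference to the Datasets nested stack.
    root = {
        "Resources": {
            "Datasets": stack_ref("https://example.invalid/datasets.json"),
        }
    }
    # The Datasets template holds one reference per dataset.
    datasets = {
        "Resources": {
            f"Dataset{d}": stack_ref(f"https://example.invalid/dataset-{d}.json")
            for d in dataset_ids
        }
    }
    return root, datasets


def size_bytes(template: dict) -> int:
    """Encoded size of a template serialized as JSON."""
    return len(json.dumps(template).encode("utf-8"))


if __name__ == "__main__":
    for count in (10, 100):
        ids = [f"{i:03d}" for i in range(count)]
        root, datasets = build_templates(ids)
        print(count, "datasets: root", size_bytes(root), "bytes, Datasets", size_bytes(datasets), "bytes")
```

In this model the root size is identical for 10 and 100 datasets, while the Datasets template grows by a fixed amount per entry.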