Data Modelling Conventions

We need to agree on a set of modelling conventions to use when creating Depot schemas for each component. Naming should feel consistent at the project level and read as if it were designed by one hand, even when teams work independently.

Schemas

General

Use singular names for objects

Description

Include a meaningful description with the schema

example.report.Item:
  type: object
  description: Report the details, known references and royalty value for an item over a time window

Properties

General

Keep the property names as short as possible without losing the meaning

ID

When the primary key for an object schema is also the unique identifier for the entity, use the special id field that is automatically added to every object schema. There is no need to declare it, although you can for clarity.

example.object.Work:
  type: object
  properties:
    title:
      type: string
    status:
      type: string

When the primary key is a composite key, name each key property in the form entityId (for example batchId and objectId) and explicitly declare the id.

public.object.ReportItem:
  type: object
  id:
    expression: this.batchId + '_' + this.objectId
  properties:
    batchId:
      type: string
    objectId:
      type: string
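
As a rough illustration of what the id expression above produces, here is a minimal Python sketch. The function name composite_id and the sample values are hypothetical; Depot evaluates the expression itself, so this only mirrors the string concatenation.

```python
def composite_id(batch_id: str, object_id: str) -> str:
    """Mirror the expression this.batchId + '_' + this.objectId."""
    return batch_id + "_" + object_id


# Hypothetical sample values, for illustration only.
print(composite_id("batch-42", "obj-7"))  # batch-42_obj-7
```

If either part can itself contain an underscore, two different pairs can produce the same id (for example "a_b" + "c" and "a" + "b_c"), so it may be worth checking that the separator cannot appear in the parts.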

You can also generate an ID from a UUID with a prefix:

example.object.WellIdentifiedThing:
  type: object
  id:
    expression: "'example:thing:' + sys.uuidv4()"
  properties:
    title:
      type: string
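
For readers who want to see the shape of the resulting value, the following Python sketch builds a comparable ID. It assumes that sys.uuidv4() yields a standard hyphenated version-4 UUID string; the helper name prefixed_id is hypothetical.

```python
import uuid


def prefixed_id(prefix: str) -> str:
    """Return the prefix followed by a random version-4 UUID."""
    return prefix + str(uuid.uuid4())


# The UUID part differs on every call.
print(prefixed_id("example:thing:"))
```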

Optional / Required

Properties are optional unless explicitly declared as mandatory, either by adding ! to the type or by setting required: true. Rather than declaring a field as optional, consider whether making it mandatory with a default value would make life easier for consumers of the data.

duration:
  type: string?
  # explicitly optional
duration1:
  type: string
  # implicitly optional
duration3:
  type: string!
  # required
duration4:
  type: string
  required: true
  # required, declared with the required flag
value:
  type: number
  default: 0
  # default used when no value is supplied
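
To show why a default can be easier on consumers than an optional field, here is a small Python sketch. Both function names are hypothetical, and the lists stand in for values read from a schema.

```python
from typing import List, Optional


def total_optional(values: List[Optional[float]]) -> float:
    # Optional field: every consumer has to decide what a missing value means.
    return sum(v for v in values if v is not None)


def total_with_default(values: List[float]) -> float:
    # Field with a default of 0: missing values never reach the consumer.
    return sum(values)


print(total_optional([1.5, None, 2.0]))   # 3.5
print(total_with_default([1.5, 0, 2.0]))  # 3.5
```

The second function needs no special handling, which is the point of the recommendation. It only holds when 0 is a sensible stand-in for "not supplied".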

Dates and times

Use the appropriate date and datetime types, which conform to the ISO 8601 standard.

startTime:
  description: The start of the time window
  type: datetime
endTime:
  description: The end of the time window
  type: datetime
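
As a sketch of what ISO 8601 values look like on the consumer side, the following Python round-trips a timezone-aware datetime through its ISO 8601 text form. The helper names to_iso and from_iso and the sample date are hypothetical.

```python
from datetime import datetime, timezone


def to_iso(moment: datetime) -> str:
    """Format a datetime as an ISO 8601 string."""
    return moment.isoformat()


def from_iso(text: str) -> datetime:
    """Parse an ISO 8601 string produced by to_iso."""
    return datetime.fromisoformat(text)


start_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(to_iso(start_time))  # 2024-01-01T00:00:00+00:00
assert from_iso(to_iso(start_time)) == start_time
```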

Namespace

We frequently deal with identifiers for the same entities from different providers, each with its own ID scheme. We need a consistent approach to storing identifiers so that we can link across components.

There is a common list of project-wide namespaces
Namespaces are lowercase with hyphens
All identifiers are prefixed with the namespace and a colon (namespace:)
Where a provider’s formatting isn’t always consistent, normalize the ID (e.g. ISWC)

info

When we include the namespace as part of the identifier, we can join on a single field rather than always having to join on both namespace and identifier.

info

See the Platform > Data Standard page on Confluence.

Examples

Type	Provider value	Namespaced ID
ExampleObject	EO-12345	eo:12345
Work Object	4db32dz5G2dXf2	wo:4db32dz5G2dXf2
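
The namespace rules can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the iswc namespace is an assumed entry (the real list lives on the Confluence page), and the check that namespaces may contain digits is also an assumption. The ISWC normalization simply strips the separators from the printed form.

```python
import re


def namespaced_id(namespace: str, local_id: str) -> str:
    """Prefix an already normalized local identifier with its namespace and a colon."""
    # Assumption: lowercase letters and digits, with single hyphens between words.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", namespace):
        raise ValueError(f"namespace must be lowercase with hyphens: {namespace!r}")
    return f"{namespace}:{local_id}"


def normalize_iswc(raw: str) -> str:
    """Strip separators so 'T-034.524.680-1' and 'T0345246801' compare equal."""
    compact = re.sub(r"[^0-9A-Za-z]", "", raw).upper()
    if not re.fullmatch(r"T\d{10}", compact):
        raise ValueError(f"not an ISWC: {raw!r}")
    return compact


print(namespaced_id("iswc", normalize_iswc("T-034.524.680-1")))  # iswc:T0345246801
```

Because both spellings of the same ISWC end up as one string, two components can join on that single value.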

Constraints

Use constraints everywhere to ensure that data is formatted correctly and falls within the expected bounds

value:
  type: number
  constraints:
    - min: 0
duration:
  type: string
  constraints:
    - pattern: PT\d+M\d+(\.\d+)?S
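
To make the effect of these constraints concrete, here is a minimal Python sketch of the same two checks. It is not how Depot enforces constraints; the function name violations is hypothetical, and the sketch assumes that the pattern must match the whole string and that minutes and seconds may span several digits (hence \d+).

```python
import re
from typing import List

# Assumption: multi-digit minutes and seconds, matched against the whole string.
DURATION_PATTERN = re.compile(r"PT\d+M\d+(\.\d+)?S")


def violations(value: float, duration: str) -> List[str]:
    """Return a message for each constraint that the inputs break."""
    problems = []
    if value < 0:
        problems.append("value must be >= 0")
    if DURATION_PATTERN.fullmatch(duration) is None:
        problems.append("duration must look like PT3M25.5S")
    return problems


print(violations(12.5, "PT3M25.5S"))  # []
print(violations(-1, "3 minutes"))    # both messages
```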

Description

Include a meaningful description with every property.

startTime:
  description: The start of the time window
  type: datetime
endTime:
  description: The end of the time window
  type: datetime
value:
  description: The (approximate) royalty value in euros over the time window
  type: number

Packages

Public

Where a schema is part of the interface to your component, use public as the first part of your package name.

public.object.ReportItem:

warning

Based on our Snowflake naming conventions this will create the Snowflake schema and table PUBLIC_MATCH.REPORT_TRACK

This doesn’t exactly match the current conventions, but it does keep the Depot schemas relatively tidy and, more importantly, unique within an environment.

Enums

Values

Uppercase
Full words separated with underscores

example.report.ExampleRole:
  type: enum
  values:
    - CREATOR
    - EDITOR
    - ADMIN
    - READER
    - GUEST
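
The value convention can be checked mechanically, for example in a review script. The following Python sketch is one way to do it; the helper name is_conventional is hypothetical, and it assumes values contain letters only. It cannot tell whether a value uses full words, so that part still needs a human eye.

```python
import re

# Uppercase words separated by single underscores (letters only, by assumption).
ENUM_VALUE = re.compile(r"[A-Z]+(_[A-Z]+)*")


def is_conventional(value: str) -> bool:
    """Return True when the value is uppercase words joined by underscores."""
    return ENUM_VALUE.fullmatch(value) is not None


for candidate in ["CREATOR", "READ_ONLY", "readOnly", "READ__ONLY"]:
    print(candidate, is_conventional(candidate))
```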
