Skip to content

License: Apache License 2.0
Steward: Hackolade (IntegrIT SA/NV) -- pending governance evolution
Repository: github.com/xdbml/xdbml-spec
Canonical home: xdbml.org

Abstract

xDBML (eXtended Database Markup Language) is an open markup language for describing the shape of structured data across heterogeneous storage technologies -- relational databases, document stores, columnar warehouses, search engines, graph databases, serialization formats, and API contracts. xDBML is a strict superset of DBML 3.13.6 (the Database Markup Language maintained by Holistics under Apache 2.0) and extends it with constructs for nested hierarchical structures, polymorphism, polyglot vocabulary, first-class JSON data, named reusable types (object-shaped and scalar), AI-readiness metadata, explicit namespace-level declarations, precise relationship cardinality, property-bearing edges for graph models, views, and a module system for splitting schemas across multiple files.

xDBML is designed from the ground up for AI-assisted data modeling and AI-mediated schema interchange. The language matches the way modern LLMs already describe schemas: nested structures are first-class, polymorphism uses the same oneOf/anyOf/allOf vocabulary as JSON Schema, paths into nested fields use unambiguous dotted notation, and every construct accepts synonyms:, business_term:, tags:, and granularity: settings that let natural-language queries resolve to canonical schema elements without guesswork. An LLM asked to "find the monthly recurring revenue field" should not have to infer from column naming conventions; xDBML lets the schema declare the synonyms explicitly.

xDBML is designed as the schema layer of a modern data architecture: upstream of physical engine DDLs (SQL, Cassandra CQL, MongoDB validators, Cypher), downstream of contract layers (ODCS) and semantic layers (OSI, dbt MetricFlow). It serves human authors, AI-assisted data modeling workflows, and bidirectional translation with the major serialization formats (Avro, Parquet, JSON Schema, OpenAPI).

xDBML extends DBML into a unified metadata and semantic data modeling language with richer support for validation, semantics, cardinality, annotations, and AI-friendly metadata, while deliberately staying readable and Git-friendly. It is designed for AI-assisted data modeling and AI-mediated schema interchange.

Strict-superset claim: every valid DBML document parses correctly under xDBML rules, and every DBML construct (used in a way valid in DBML) means the same thing in xDBML as in DBML. When a file declares xdbml: 0.2 (or later), it has opted into xDBML's extended semantics; the semantic meaning of constructs in that context may legitimately differ from DBML's, because the file is no longer claiming to be DBML.

What's new in v0.2: the module system (§26), allowing schemas to be split across multiple files via use and reuse directives, with optional inline clone blocks for file autonomy; and scalar Named Types (§14.7), extending Named Types to describe single-shape declarations rather than only object-shaped structures.

Table of contents

Abstract

  1. Introduction
  2. Document structure
  3. Lexical conventions
  4. Version declaration
  5. Project
  6. Container
  7. Entity
  8. Field
  9. Index
  10. Checks -- entity-level constraints
  11. Relationship and cardinality
  12. Edge -- relationships with properties
  13. View -- derived shapes with source queries
  14. Named type
  15. Enum
  16. TablePartial and TableGroup
  17. Diagram View
  18. Note
  19. Path syntax
  20. Polymorphism
  21. JSON as a first-class data type
  22. BSON scalar types
  23. AI-readiness and semantic-interop settings
  24. Validation constraints
  25. Records -- sample data
  26. Module system
  27. AST representation
  28. Round-trip semantics
  29. Relationship to other standards
  30. Conformance

Appendix A. Reserved keywords
Appendix B. Lexical grammar reference
Appendix C. Worked end-to-end examples
Appendix D. Compatibility with DBML 3.13.6
Appendix E. Sources and acknowledgments

1. Introduction

xDBML is born to solve the same frustrations DBML was designed for (difficulty building a mental "big picture" of database structure, opaque field meanings, ER diagrams and DDL that are hard to read and usually outdated), plus additional pains DBML cannot solve: AI-readiness metadata, LLM-portable schemas, nested structures and polymorphism, schema drift across polyglot stacks, and property-bearing graph edges. For the full positioning relative to DBML, see the FAQ.

1.1 Scope

xDBML describes the structural and semantic layer of data: entities, fields, types, relationships, classifications, validation rules, and AI-readiness metadata -- the shape and meaning of data.

xDBML is the format for humans and AI to exchange schemas with an xDBML tool. It is not the round-trip format between a data modeling tool and a target technology. That tool-to-target round-trip happens in native DDL or schema -- the tool understands the target's complete capability surface (partitioning, sharding, tablespaces, replication, PL/SQL, triggers, identity columns, advanced constraints, refresh schedules) and preserves it in the tool's own canonical model. xDBML carries the parts of that model with meaning across boundaries.

xDBML scope diagram

In scope:

  • Entity, table, collection, record, and similar structural constructs (§7)
  • Fields with their declarative type, default, nullability, and validation constraints (§8)
  • Named reusable types and composition (§14)
  • Relationships and cardinality, including cross-engine references (§11)
  • Nested structures: objects, arrays, maps, sets, tuples, and polymorphism (§20, §21)
  • Property-bearing graph edges (§12)
  • Views as declarative shape + opaque source query (§13)
  • Indexes in declarative form (§9)
  • AI-readiness metadata: synonyms, business terms, classification tags, granularity hints (§23)
  • Custom properties for organization-specific extensions
  • Target name declarations and per-Container target overrides (§5)
  • Notes and prose annotations at every level (§18)

Out of scope, by design:

  • Engine operational features. Partitioning strategies, sharding configuration, tablespaces, storage models, replication topology, materialized view refresh schedules, clustering keys, time-travel configuration. These stay native to each target.
  • Procedural code. PL/SQL, T-SQL, stored procedures, triggers, server-side functions, computed columns with engine-specific functions. xDBML expresses declarative shape and metadata, not behavior.
  • Identity and sequencing details. xDBML can declare a field as a primary key with auto-generation; the exact sequence configuration is engine-specific.
  • Wire-protocol and evolution rules. Avro schema evolution rules, Protobuf reserved fields, GraphQL federation directives, OpenAPI endpoint operations.
  • Query languages. SQL, Cypher, MQL, GraphQL queries. xDBML generates schemas; consumers write queries.
  • Computational meaning. Measures, metrics, aggregations belong in OSI, dbt MetricFlow, LookML.
  • Contractual meaning. Data quality rules, SLAs, ownership, pricing belong in ODCS.
  • Inferential reasoning. OWL and knowledge graph standards operate at a different layer.
  • Provisioning and operations. Terraform, operators, security policies handle the infrastructure layer.

Round-trip is lossy by design with respect to target-specific features. An xDBML schema lifted from Oracle DDL preserves the logical content (tables, columns, types, declarative constraints, relationships, indexes in declarative form). It does not preserve operational specifics (partitions, storage configuration, triggers, PL/SQL) -- those are deliberately outside xDBML's scope and remain in the data modeling tool's native representation.

The language is also:

  • Polyglot by default -- no construct privileges relational databases over document, columnar, graph, or exchange-format targets.
  • Closed-world -- a field not declared in an entity cannot exist in valid data.
  • Human-authorable, machine-readable, and AI-generable in equal measure.

xDBML is a format, not a data modeling tool. The playground at xdbml.org demonstrates the language and works well for learning, prototyping, and small schemas, but sustained enterprise data modeling -- live-database reverse-engineering, target-native DDL generation across many engines, schema diffing and impact analysis, lineage and governance integration, and multi-user collaboration -- requires a purpose-built data modeling tool like ER/Studio, Erwin Data Modeler, or Hackolade. xDBML is designed as the textual exchange format and AI interaction surface that those tools can read, write, and round-trip with: complementing them, not replacing them.

1.2 Design principles

The language follows seven principles that govern every construct:

  1. Strict superset of DBML. Every valid DBML 3.13.6 document is a valid xDBML document with identical semantics.
  2. Polyglot by default. No construct privileges relational databases over document, columnar, graph, or other paradigms.
  3. Vocabulary-permissive at every level. Container, Entity, and type-level keywords accept target-native synonyms.
  4. Type-name pass-through. Scalar type names are accepted as-is and lowered by target-specific generators. BSON-specific types (objectId, Decimal128, BinData, etc.) are recognized scalar names; see §22.
  5. Explicit is better than implicit, where it matters. Path syntax requires explicit array iteration in relationships; polymorphic paths require explicit alternative selectors.
  6. Declarative metadata is welcome; computational and contractual meaning are not. xDBML carries the declarative meaning of stored data -- names, descriptions, synonyms, glossary references, classification tags, validation constraints -- and leaves computational meaning (measures, metrics, aggregations) to semantic-layer formats and contractual concerns (SLAs, ownership, pricing) to data-contract formats like ODCS.
  7. Versioned evolution. Every xDBML document declares the spec version it conforms to.

1.3 A first look

A complete xDBML document spanning Oracle (RDBMS), MongoDB (document), Avro (events), and an LPG graph model:

xdbml
xdbml: 0.1

Project ecommerce {
  targets: [Oracle, MongoDB, Neo4j]
  Note: 'Reference example for the xDBML v0.1 specification'
}

Type Address {
  street  varchar [not null]
  city    varchar [not null]
  zip     varchar
  country varchar [default: 'US']
}

Type MonetaryAmount {
  amount   Decimal128 [not null]
  currency string     [not null, pattern: '^[A-Z]{3}$', minLength: 3, maxLength: 3,
                       note: 'ISO 4217 currency code']
}

Container core [type: schema, target: Oracle] {
  Table customers {
    id              int     [pk]
    email           varchar [unique, not null, pattern: '^[^@]+@[^@]+$']
    display_name    varchar [not null]
    primary_address Address
    created_at      timestamp [granularity: second, default: `SYSTIMESTAMP`]
  }
}

Container orders_store [type: database, target: MongoDB] {
  Collection orders {
    _id          objectId  [pk]
    customer_id  int32     [not null]
    placed_at    Date      [granularity: second]
    total        MonetaryAmount
    line_items   array [
      line_item object {
        sku         string         [not null]
        quantity    int32          [not null, minimum: 1]
        unit_price  MonetaryAmount
      }
    ]
    payment_method oneOf {
      card   object { last4 string [maxLength: 4], brand string }
      bank   object { iban string }
      wallet object { provider string, account string }
    } [discriminator: method_kind]
  }
}

Container catalog [type: schema, target: Oracle] {
  Table products {
    sku       varchar [pk]
    name      varchar [not null, synonyms: ['product name', 'item name']]
    price     MonetaryAmount
    metadata  json {
      weight_grams int
      dimensions   object { length int, width int, height int }
      tags         array [varchar]
    }
  }

  View top_sellers [materialized: true, refresh_schedule: 'daily'] {
    source_query: '''
      SELECT p.sku, p.name, COUNT(*) AS order_count
      FROM products p
      JOIN orders_store.orders o ON o.line_items.sku = p.sku
      GROUP BY p.sku, p.name
      ORDER BY order_count DESC
      FETCH FIRST 100 ROWS ONLY
    '''
    sku         varchar [pk]
    name        varchar
    order_count int
  }
}

Container social [type: database, target: Neo4j] {
  Edge FOLLOWS [source: core.customers, target: core.customers,
                source_cardinality: '0..*', target_cardinality: '0..*'] {
    since      date    [not null]
    is_close   boolean [default: false]
  }
}

Ref: orders_store.orders.customer_id > core.customers.id [source: '1..*', target: '1..1']
Ref: orders_store.orders.line_items.[*].sku > catalog.products.sku

This single document expresses an Oracle relational schema, a MongoDB document collection with nested arrays and polymorphism, a catalog with both an entity and a materialized view, an LPG social-graph keyspace with a property-bearing FOLLOWS edge, two reusable named types, and cross-container relationships including one that traverses an array. It would round-trip to Oracle DDL, MongoDB collection validators, JSON Schema, Avro, Neo4j Cypher schema, and the ODCS schema section.

2. Document structure

An xDBML document consists of, in order:

  1. A version declaration (§4), optionally followed by an experimental: opt-in clause.
  2. Zero or one Project declaration (§5).
  3. Any number of top-level declarations in any order: Container (§6), Entity at project level (§7), Edge (§12), View (§13), Type (§14), Enum (§15), Ref (§11), TablePartial, TableGroup (§16), DiagramView (§17), Note (§18).

Comments may appear anywhere whitespace is permitted.

3. Lexical conventions

3.1 Whitespace and comments

Whitespace is insignificant except as a token separator.

xdbml
// Single-line comment, runs to end of line
/* Block comment;
   may span lines */

3.2 Identifiers

A bare identifier matches [A-Za-z_][A-Za-z0-9_]*. Quoted identifiers are written in double quotes and may contain any character: "my-table", "first name".

3.3 String literals

Single-quoted ('a string') with \' and \\ escapes. Triple-quoted multi-line ('''line one\nline two''') normalizes indentation.

3.4 Number literals

Conventional syntax: 42, 3.14, -100, 1.5e10.

3.5 Expression literals

Backtick-quoted strings pass through as engine-native expressions: `SYSTIMESTAMP`, `uuid_generate_v4()`.

3.6 Block delimiters

Curly braces { ... } group block bodies. Square brackets [ ... ] carry settings and array type expressions.

3.7 Setting separator (the colon)

xDBML uses a colon : to separate a setting name from its value. The colon is required wherever a setting has both a name and a value:

xdbml
Project ecommerce {
  targets: Oracle           // setting name "targets", value Oracle
  Note: 'reference example' // setting name "Note", value "reference example"
}

Container core [type: schema] {   // setting name "type", value schema
  Table customers {
    Note: 'customer master'  // setting name "Note", value "customer master"
    id    int     [pk]
    email varchar [unique, not null, default: 'unknown']   // setting name "default", value "unknown"
  }
}

The colon is not used for:

  • Construct keywords: Project ecommerce {, Container core [...] {, Table customers {, Type Address {. The construct keyword and the declared name are juxtaposed without a colon.
  • Field type declarations: id int, name varchar. The field name and type are juxtaposed.
  • Boolean-style flags inside bracket lists: [pk], [not null], [unique], [increment]. These are markers, not key-value pairs.

One exception: the Ref: statement keyword includes the colon as part of its syntax (Ref: source > target). This is a DBML inheritance preserved for compatibility, even though Ref is a top-level declaration rather than a setting.

3.8 Case sensitivity

xDBML follows a two-tier case-sensitivity policy that balances tolerance (accept variant casings users naturally type) with precision (preserve identifier casing that target databases require).

Keywords are case-insensitive. All language keywords -- construct names (Project, Container, Table, Entity, Type, Edge, View, Note), settings (targets, target, default, pattern, not null, pk), reserved type-related words (int, varchar, objectId, Decimal128), and structural words (oneOf, anyOf, array, object) -- match any casing. The parser accepts note, Note, NOTE, and nOtE as the same keyword.

Identifiers are case-sensitive. Names you give to entities, fields, containers, named types, edges, views, and enums preserve user-supplied case. customer_email and Customer_Email are different identifiers. This is essential because target databases have different casing conventions:

  • MongoDB: typically camelCase (orderId, customerEmail)
  • Cassandra: lower_snake_case strictly enforced
  • Oracle: UPPER_SNAKE_CASE by default
  • PostgreSQL: lower_snake_case by convention

A roundtrip from one of these targets to another may require explicit identifier renaming; xDBML does not silently re-case identifiers.

Target name values are case-insensitive and alias-normalized. Values declared in targets: (Project) and target: (Container) settings are matched case-insensitively and resolved through an alias table (§5.1). So targets: postgres, targets: Postgres, targets: PostgreSQL, and targets: POSTGRESQL all normalize to the canonical PostgreSQL.

Documentation convention. This specification and all official examples use a consistent stylistic convention even though the parser is tolerant:

  • Construct keywords are capital-initial: Project, Container, Table, Entity, Type, Edge, View
  • Block-body settings starting with capital are major annotations: Note: at Project/Container/Table/Entity level
  • Bracket-list settings are lowercase: [pk], [not null], [default: 'x'], [note: 'inline note']
  • Type and value keywords are lowercase: int, varchar, string, array, object, oneOf

This pattern is inherited from DBML and recommended for new schemas. The parser accepts variant casings but emitting tools, formatters, and the spec itself follow this convention.

4. Version declaration

xdbml
xdbml: 0.1

SemVer-style MAJOR.MINOR[.PATCH] numbering. Appears at the top of the file, before any other construct.

4.1 Parser behavior

Document declaresParser supportsBehavior
(no declaration)anyTreated as DBML 3.13.6 -- no xDBML extensions recognized.
xdbml: 0.10.1Parsed with v0.1 feature set.
xdbml: 0.10.2+Parsed successfully (MINOR-version backward compatibility).
xdbml: 0.20.1Warning + refuse to parse.

4.2 Experimental features

xdbml
xdbml: 0.1
experimental: [graph_path_expressions, temporal_validity]

Without the opt-in, experimental constructs are not recognized. No backward-compatibility guarantee for experimental features.

5. Project

xdbml
Project ecommerce {
  targets: Oracle
  Note: 'Reference example'
}

For polyglot schemas spanning multiple targets, targets: accepts an array:

xdbml
Project polyglot_ecommerce {
  targets: [Oracle, MongoDB, Avro]
  Note: 'Customer master in Oracle, orders in MongoDB, events in Avro'
}

At most one Project may appear. Recognized settings: targets, Note. The legacy database_type: setting is also accepted for DBML compatibility (see §5.3).

5.1 Target name vocabulary

Target names are bare identifiers when they have no spaces or special characters (Oracle, PostgreSQL, MongoDB, Snowflake, Neo4j, Avro, Cassandra, BigQuery, Databricks). They must be quoted when they contain spaces or punctuation ('SQL Server', 'Cosmos DB', 'JSON Schema').

Target name values are matched case-insensitively and resolved through the alias table below to a canonical form. The canonical form is what appears in the normalized AST and what generators key off of.

Canonical nameRecognized aliases
Oracle--
PostgreSQLPostgres, pg
'SQL Server'MSSQL, 'Microsoft SQL Server', 'T-SQL'
MySQL--
MariaDB--
SQLite--
Snowflake--
Databricks--
BigQuery'Google BigQuery'
Redshift'Amazon Redshift'
'Synapse Analytics''Azure Synapse'
TimescaleDB--
ClickHouse--
MongoDBMongo
DocumentDB'AWS DocumentDB'
'Cosmos DB'Cosmos, 'Azure Cosmos DB'
Couchbase--
RavenDB--
DynamoDB--
Cassandra'Apache Cassandra'
ScyllaDB--
Neo4j--
Memgraph--
Neptune'Amazon Neptune'
JanusGraph--
Avro'Apache Avro'
Parquet'Apache Parquet'
Protobuf'Protocol Buffers', Proto
'JSON Schema'--
OpenAPISwagger
GraphQL--
Elasticsearch--
OpenSearch--

The list is non-exhaustive. Target name values outside the table are accepted by the parser but may produce generic output from target-aware generators. The official list grows over time as the ecosystem adds first-class support for new targets.

5.2 Container target resolution

Each Container has an effective target -- the storage technology, exchange format, or API platform it lowers to. The effective target is resolved as follows:

  1. If the Container declares target: explicitly, that value is the effective target.
  2. Otherwise, if the Project declares exactly one entry in targets:, the Container inherits that single target.
  3. Otherwise (Project declares multiple targets but Container declares none), the Container's effective target is undefined and the parser MUST emit an error.
  4. If the Container declares a target: value that does not appear in the Project's targets: list, the parser MUST emit an error.

5.3 DBML compatibility: the database_type: alias

xDBML is a strict superset of DBML 3.13.6, which uses database_type: to declare a single target engine at the Project level. xDBML preserves this keyword as a legacy alias for single-target schemas:

xdbml
// DBML-compatible (legacy)
Project simple {
  database_type: 'PostgreSQL'
}

// xDBML preferred equivalent
Project simple {
  targets: PostgreSQL
}

Compatibility rules:

  1. database_type: accepts only a single string or bare identifier value. Array values are not permitted (DBML never supported them).
  2. A Project that declares database_type: X is semantically equivalent to a Project that declares targets: X. Parsers MUST normalize the AST to use targets: with a single-element list.
  3. A Project MUST NOT declare both database_type: and targets:. Doing so produces a parser error (ambiguous intent).
  4. New schemas SHOULD prefer targets: over database_type:. The legacy form remains supported for round-trip fidelity with DBML tooling and for documents authored before xDBML v0.1.
  5. The database_type: keyword has no Container-level equivalent. Container-level target declarations always use target:.
xdbml
// Single-target project; Container target is implicit
Project simple {
  targets: PostgreSQL
}
Container app [type: schema] {                   // implicit target: PostgreSQL
  Table users { ... }
}

// Polyglot project; each Container must declare its target
Project polyglot {
  targets: [Oracle, MongoDB]
}
Container core [type: schema, target: Oracle] {  // explicit
  Table customers { ... }
}
Container orders_store [type: database, target: MongoDB] {  // explicit
  Collection orders { ... }
}

6. Container

The namespace-level construct between Project and Entity.

6.1 Declaration

xdbml
Container core [type: schema] {
  Note: 'Core domain entities'

  Entity customers { ... }
  Entity accounts { ... }
}

6.2 Keyword synonyms

KeywordNative to
ContainerxDBML canonical
SchemaOracle, PostgreSQL, SQL Server, Snowflake
DatabaseMongoDB, Cosmos DB
KeyspaceCassandra, ScyllaDB
NamespaceAvro, ORC, GraphQL
DatasetBigQuery
BucketCouchbase

6.3 The type: setting

Names the target-native flavor (schema, database, keyspace, namespace, dataset, bucket, index).

6.4 Container settings

xdbml
Container core [type: schema, target: Oracle] { ... }

Container kv_store [type: keyspace,
                    target: Cassandra,
                    replication: '{class: NetworkTopologyStrategy, dc1: 3, dc2: 2}',
                    durable_writes: true] { ... }

The target: setting declares the specific target technology for the Container. It is optional in single-target projects (where it is inherited from Project.targets) and required in polyglot projects with more than one declared target. See §5.2 for the resolution rule.

Other Container settings are open-vocabulary key-value pairs. Generators consume what they understand.

6.5 Optional containers

Entities are not required to belong to a container. An entity belongs to at most one container.

6.6 Cross-container references

Relationships, edges, and inline references use dotted notation with explicit container prefix:

xdbml
Container core [type: schema] {
  Table customers { id int [pk] }
}

Container sales [type: schema] {
  Table orders {
    id          int [pk]
    customer_id int [ref: > core.customers.id]
  }
}

Ref: sales.orders.customer_id > core.customers.id

The dotted path is container_name.entity_name.field_path.

7. Entity

A named field-bearing construct.

7.1 Declaration

xdbml
Table customers {
  id    int     [pk]
  email varchar [unique, not null]
  name  varchar
}

7.2 Keyword synonyms

KeywordNative to
EntityxDBML canonical
TableOracle, RDBMS, Cassandra, Hive, BigQuery, Databricks, Snowflake,...
CollectionMongoDB, Cosmos DB, Couchbase
RecordAvro, Parquet, GraphQL

7.3 Schema-qualified declaration

xdbml
Table core.users { ... }

Produces an Entity named users inside an implicit container core. Explicit Container { Entity ... } blocks are equivalent and preferred for new xDBML.

7.4 Aliases

xdbml
Entity very_long_user_table as U { ... }
Ref: U.id < posts.user_id

7.5 Entity settings

SettingMeaning
noteDocumentation
headercolorVisualization color
synonymsAlternative names (§23.1)
business_termGlossary reference (§23.2)

8. Field

xdbml
field_name field_type [field_settings]

8.1 Field types

  • Scalar type name: int, varchar, decimal(19,4), timestamp, etc.
  • BSON scalar type (§22): string, int32, int64/long, double, bool, null, objectId, Decimal128, BinData, Date, Timestamp, Regex, JavaScript, MinKey, MaxKey, Symbol, DBPointer.
  • Complex type expression (§8.2): object, array, map, set.
  • Polymorphism construct (§20): union, oneOf, anyOf, allOf.
  • Named type reference (§14): bare name of a previously declared Type.
  • JSON-with-schema (§21): json, jsonb, variant.

8.2 Complex type keywords

ConceptPrimaryAliases
Structured valueobjectstruct, record
Ordered listarraylist
Heterogeneous tuplearray (with positional indices)list
Unique collectionset--
Key-value collectionmapdict, dictionary

8.3 Object type

xdbml
address object {
  street    string [not null]
  city      string [not null]
  zip       string
  country   varchar [default: 'US']
} [not null, note: 'Primary mailing address']

8.4 Array of scalars

xdbml
tags    array [varchar]
scores  array [int [not null]]

8.5 Array of named objects

xdbml
line_items array [
  line_item object {
    sku        varchar       [not null]
    quantity   int           [not null, default: 1]
    unit_price decimal(10,2) [not null]
  }
] [not null]

8.6 Heterogeneous tuple

xdbml
addresses array [
  [0] billing  object { street string, city string, zip string }
  [1] shipping object { street string, city string, zip string }
]

Positions are zero-indexed and must form a contiguous range starting at 0.

8.7 Maps and sets

xdbml
flags       map [string, boolean]
permissions set [varchar]

8.8 Common field settings

SettingBehavior
pk / primary keyPrimary key
null / not null / requiredCardinality
uniqueSingle-field unique constraint
default: valueDefault value
incrementAuto-increment
note: 'string'Documentation
ref: <op> targetInline foreign key (§11)
check: exprField-level check constraint
Validation constraints§24
AI-readiness§23

required is accepted as a synonym for not null and is provided for authors familiar with JSON Schema, OpenAPI, and Avro vocabularies where required is the conventional keyword. Parsers MUST normalize required to not null in the AST. Tools that emit xDBML SHOULD use not null as the canonical form. Both forms are semantically identical: the field must be present and must have a non-null value.

8.9 Cardinality vs. type-level nullability

null / not null settings express cardinality -- whether a field is required to be present and whether null is acceptable. The token null inside a union [ ... ] list (§20.1) expresses type membership -- that the null literal is one of the permitted runtime types. These are distinct concepts.

9. Index

xdbml
Table bookings {
  id            int
  country       varchar
  booking_date  date
  created_at    timestamp

  indexes {
    (id, country) [pk]
    created_at [name: 'created_at_index']
    booking_date
    (country, booking_date) [unique]
    booking_date [type: hash]
    (`id*2`)
  }
}

9.1 Index forms

FormExample
Single fieldcreated_at [name: 'idx_created']
Composite(col1, col2)
Expression(`expr`)

9.2 Index settings

SettingValues
typebtree, hash, engine-specific
namestring
uniqueflag
pkflag
notestring

9.3 Indexes on nested fields

Index paths use the path syntax of §19:

xdbml
indexes {
  shipping.address.country         // object navigation
  line_items.sku                   // implicit array iteration
  line_items.[*].sku               // explicit array iteration
  addresses.[0].city               // tuple positional
  flags.["dark_mode"]              // map literal key
  method.card.brand                // polymorphic alternative
}

For homogeneous arrays, the implicit iteration form is equivalent to the explicit form. For heterogeneous tuples and Ref source paths, explicit syntax is required.

10. Checks -- entity-level constraints

A checks { } block declares multi-column constraint expressions on an entity. It is the entity-level counterpart to the field-level check: setting (§8.8): where the field-level form expresses a constraint on a single field, the entity-level block expresses constraints that span two or more fields.

The block is a peer of the indexes { } block (§9) -- both are entity-level declarations that constrain or annotate the entity's data shape.

10.1 Declaration

xdbml
Entity users {
  id     int     [pk]
  wealth decimal(15,2)
  debt   decimal(15,2)
  checks {
    `debt + wealth >= 0` [name: 'chk_positive_net_worth']
    `wealth >= 0`        [name: 'chk_non_negative_wealth']
    `debt   >= 0`
  }
}

Each line inside checks { } is a check expression -- a backtick-wrapped boolean expression in the target engine's expression language. xDBML treats expressions as opaque strings; the parser does not interpret or validate them. Generators emit them verbatim as the target engine's CHECK constraint or equivalent.

10.2 Check settings

Each check expression accepts optional bracket settings:

SettingValuePurpose
namestringName of the constraint as it appears in the generated DDL (CONSTRAINT chk_name CHECK (...)). If omitted, the generator chooses a name.
notestringFree-text annotation. Carries through to generators that support constraint comments.

When name: is omitted, generators MAY produce a deterministic synthetic name (e.g., <entity>_chk_<index>) for reproducible DDL.

10.3 Check expressions and the target engine

Check expressions are not portable across targets in general. An expression valid in PostgreSQL may not parse in MongoDB, and a MongoDB document-level validator uses an entirely different expression syntax.

xDBML preserves the expression as written. Round-trip behavior:

  • Single-target schemas: the expression is emitted verbatim to the target.
  • Polyglot schemas: the expression is emitted to targets where it is syntactically valid; for other targets, generators MAY emit a comment placeholder or omit the constraint with a warning.

For maximum portability, authors are encouraged to use expressions that work across the target families they care about, or to use field-level check: settings (§8.8) which are more often portable.

10.4 Relationship to field-level check:

The field-level check: setting (§8.8) is preferred for single-column constraints because it sits next to the field it constrains:

xdbml
Entity products {
  price decimal(10,2) [check: `price > 0`]
}

The entity-level checks { } block is for constraints that cannot be expressed at field level -- typically because they reference two or more fields:

xdbml
Entity reservations {
  start_date date
  end_date   date
  checks {
    `start_date <= end_date`           [name: 'chk_valid_date_range']
    `end_date - start_date <= 30`      [name: 'chk_max_30_days']
  }
}

Both forms generate CHECK constraints in SQL targets. Field-level constraints are scoped to a single column in the generated DDL; entity-level constraints reference multiple columns by name.

10.5 Relationship to validation constraints (§24)

xDBML provides two complementary constraint mechanisms:

  • Validation constraints (§24) -- declarative shape constraints (pattern, minLength, maxLength, range, enum membership). Portable across targets. Lower naturally to JSON Schema, OpenAPI, Avro validators, and SQL CHECK for the constraints SQL can express.

  • Check expressions (this section, plus §8.8 for field-level) -- arbitrary boolean expressions in the target engine's language. Maximum expressiveness, but coupled to the target.

Use validation constraints when the shape constraint can be expressed declaratively (email matches pattern X, age between 0 and 150). Use check expressions when you need cross-column logic or engine-specific predicates (CASE WHEN status = 'shipped' THEN tracking_number IS NOT NULL END).

11. Relationship and cardinality

11.1 Cardinality operators

OperatorMaximum-cardinality
<one-to-many
>many-to-one
-one-to-one
<>many-to-many

11.2 Declaration forms

xdbml
// Long form
Ref name_optional {
  schema1.entity1.field1 < schema2.entity2.field2
}

// Short form
Ref name_optional: schema1.entity1.field1 < schema2.entity2.field2

// Inline form
Entity schema2.entity2 {
  id     integer
  field2 integer [ref: > schema1.entity1.field1]
}

Relationships are declared at project level. They may span containers.

11.3 Cross-container references

xdbml
Ref: sales.orders.customer_id > core.customers.id

Dotted notation: container_name.entity_name.field_path. For relationships into nested fields, the path continues into the nested structure.

11.4 Composite foreign keys

xdbml
Ref: merchant_periods.(merchant_id, country_code) > merchants.(id, country_code)

11.5 Relationships into nested fields

xdbml
Ref: orders.shipping.address.country > countries.iso_code
Ref: customers.addresses.[0].zip > postal_codes.code

11.6 Explicit .[*] required for array crossings

When a Ref source path crosses a homogeneous array, the explicit .[*] form is required:

xdbml
Ref: orders.line_items.[*].sku > products.sku    // valid
Ref: orders.line_items.sku > products.sku        // rejected

11.7 Explicit cardinality

The operators express maximum cardinality with optionality inferred from FK nullability. For precise cardinality on both ends, the source: and target: settings carry UML-style 'min..max' strings:

xdbml
// Compact form: operators imply defaults
Ref: orders.customer_id > customers.id

// Explicit form: precise cardinality on both ends
Ref: orders.customer_id > customers.id [source: '1..*', target: '1..1']

// Optional parent (each Pet may have no Owner)
Ref: pets.owner_id > people.id [source: '0..*', target: '0..1']

Recognized cardinality strings:

StringMeaning
'1..1'Exactly one
'0..1'Zero or one
'1..*'One or more
'0..*'Zero or more
'N..M'N to M (non-negative integers; * allowed for M)

Alternative four-key form:

xdbml
Ref: pets.owner_id > people.id [min_source: 0, max_source: '*',
                                 min_target: 0, max_target: 1]

Both forms produce the same AST. The compact-string form is canonical.

11.8 Default cardinality inference

OperatorFK nullable?SourceTarget
>no'1..*''1..1'
>yes'0..*''1..1'
<no'1..1''1..*'
<yes'1..1''0..*'
-no'1..1''1..1'
-yes'1..1''0..1'
<>(n/a)'0..*''0..*'

11.9 Relationship settings

xdbml
Ref: products.merchant_id > merchants.id [delete: cascade, update: no action]
Ref: legacy.user_id > users.id [inactive, color: '#999999', note: 'historical FK; superseded by audit_log.user_ref']
SettingValues
delete / updatecascade, restrict, set null, set default, no action
color#rgb or #rrggbb
source / targetCardinality strings (§11.7)
min_source / max_source / min_target / max_targetCardinality components
inactiveFlag (no value). Marks the relationship as inactive; visualization tools render it as a dotted line and may exclude it from cardinality computation. The relationship still exists structurally; this is a visualization and documentation hint, not a removal.
noteFree-text annotation.

The inactive flag is useful for documenting historical or deprecated foreign keys that remain in the schema but are no longer enforced or relied upon by application code. It has no effect on DDL generation in targets that don't distinguish active from inactive relationships.

Settings are not supported on inline ref: declarations.

12. Edge -- relationships with properties

Labeled Property Graph (LPG) databases -- Neo4j, Memgraph, JanusGraph, Amazon Neptune (LPG mode) -- and RDF-star treat relationships as first-class entities that carry properties.

12.1 Declaration

xdbml
Edge KNOWS [source: Person, target: Person] {
  since      date     [not null]
  intimacy   int      [minimum: 0, maximum: 10]
  context    varchar
}

Edge RATED [source: User, target: Movie] {
  rating     int       [not null, minimum: 1, maximum: 5]
  rated_at   timestamp [granularity: second, not null]
  review     varchar
}

An Edge declaration carries a name, source and target settings pointing to connected entities, and a body of fields.

12.2 Cardinality on edges

Edges use source_cardinality and target_cardinality settings (distinct from source: and target: which name the connected entities):

xdbml
Edge OWNS [source: Person, target: Pet,
           source_cardinality: '0..*', target_cardinality: '0..1'] {
  acquired_date  date
  purchase_price decimal(10,2)
}

12.3 Directionality

xdbml
Edge FRIENDS_WITH [source: Person, target: Person, undirected: true] {
  since date
}

When undirected: true, source and target are interchangeable.

12.4 Multiple edges between the same entities

xdbml
Edge LIKES   [source: User, target: Post] { liked_at timestamp }
Edge SHARED  [source: User, target: Post] { shared_at timestamp, audience varchar }
Edge BLOCKED [source: User, target: Post] { blocked_at timestamp, reason varchar }

12.5 Edges with named types and partials

xdbml
Type RelationshipMetadata {
  created_at    timestamp [granularity: second]
  created_by    varchar
  source_system varchar
}

Edge KNOWS [source: Person, target: Person] {
  ~RelationshipMetadata
  since    date
  intimacy int
}

12.6 Indexes on edges

xdbml
Edge RATED [source: User, target: Movie] {
  rating   int       [not null]
  rated_at timestamp

  indexes {
    rated_at
    (rated_at, rating)
  }
}

12.7 Round-trip across targets

TargetEdge representation
Neo4j / LPGNative relationship type with properties
RDF-starQuoted triple with annotation properties
RDBMSJunction table with FKs plus property columns
MongoDBArray of subdocuments on source-side entity
AvroEdge events with source_id, target_id, property fields
RDF (plain)Reified statements (lossy)

13. View -- derived shapes with source queries

13.1 Declaration

xdbml
View active_customers [materialized: false] {
  source_query: '''
    SELECT id, email, name, created_at
    FROM customers
    WHERE deleted_at IS NULL
  '''

  id         int       [pk]
  email      varchar
  name       varchar
  created_at timestamp [granularity: second]
}

View monthly_revenue [materialized: true,
                      refresh_schedule: 'daily',
                      source_database: 'Oracle'] {
  source_query: '''
    SELECT TRUNC(placed_at, 'MM') AS month, SUM(total) AS revenue
    FROM orders
    GROUP BY TRUNC(placed_at, 'MM')
  '''

  month   date           [pk]
  revenue decimal(15,2)  [granularity: month]
}

13.2 Materialized vs. virtual

  • materialized: false (default) -- virtual view; query executes on access.
  • materialized: true -- materialized view; query precomputed and stored.
SettingMeaning
refresh_scheduleRefresh schedule (hourly, daily, weekly, engine-specific)
refresh_onSource entities triggering refresh
storage_optionsEngine-specific storage hints

13.3 Source query as opaque metadata

xDBML does not parse the source_query string -- it's target-specific. The query is captured verbatim and passed to generators. xDBML can express views for non-SQL query languages (Cypher, MongoDB aggregation pipelines, KSQL) by storing the engine-native query and tagging with source_database.

13.4 Views in containers

xdbml
Container analytics [type: schema] {
  Entity orders { ... }
  View revenue_by_month [materialized: true] {
    source_query: '...'
    month   date
    revenue decimal(15,2)
  }
}

Cross-container references treat views and entities identically.

13.5 View settings

SettingMeaning
source_queryDefining query (opaque string)
materializedtrue or false (default false)
refresh_scheduleRefresh schedule
refresh_onSource entities
source_databaseDatabase flavor of source query
noteDocumentation
synonyms, business_termAI-readiness (§23)

13.6 Round-trip across targets

TargetView representation
Oracle / PostgreSQL / SQL Server / SnowflakeCREATE VIEW or CREATE MATERIALIZED VIEW
MongoDBdb.createView() for virtual; $merge for materialized
BigQueryCREATE VIEW / CREATE MATERIALIZED VIEW
DatabricksCREATE OR REPLACE VIEW / CREATE MATERIALIZED VIEW
ElasticsearchContinuous transforms (materialized)

14. Named type

A Type declaration defines a named, reusable type expression.

14.1 Declaration

xdbml
Type Address {
  street  varchar [not null]
  city    varchar [not null]
  zip     varchar
  country varchar [default: 'US']
}

Type MonetaryAmount {
  amount   Decimal128 [not null]
  currency string     [not null, pattern: '^[A-Z]{3}$', minLength: 3, maxLength: 3]
}

14.2 Reference

xdbml
Table orders {
  shipping_address Address
  billing_address  Address
  total            MonetaryAmount [not null]
}

Entity customers {
  addresses array [address Address]
}

Built-in type keywords always take precedence; named types cannot shadow them.

14.3 Composition and recursion

xdbml
Type ContactInfo {
  email   varchar
  phone   varchar
  address Address
}

Type TreeNode {
  value    int
  children array [child TreeNode]    // self-reference
}

14.4 Type settings

xdbml
Type Address [note: 'Used across customer and order entities'] {
  ...
}

Recognized: note, synonyms, business_term.

14.5 Scope

Project-scoped. Name collisions across Type, Entity, View, Edge, Enum, TablePartial are rejected at parse time.

14.6 Relationship to TablePartial

Type when...TablePartial when...
Reusing a value shapeReusing a set of entity fields
Reference from nested structuresSplice fields at entity top level
Cross-references into type's fieldsFields appear as direct columns
One canonical definition for manyEach entity gets own copy of fields

14.7 Scalar named types (new in v0.2)

The Type declarations in §14.1 through §14.6 describe object-shaped reusable structures -- a Type body contains one or more named inner fields. v0.2 extends Named Types to also support scalar shapes -- a Type whose entire content is a single field's worth of declaration.

Syntax:

xdbml
Type Email varchar [pattern: '^[^@]+@[^@]+$', tags: ['pii'], note: 'Standard enterprise email']
Type CountryCode varchar [pattern: '^[A-Z]{2}$', minLength: 2, maxLength: 2]
Type Percentage decimal(5,2) [minimum: 0, maximum: 100]

The Type's base is a built-in type (or another Named Type), and the bracketed settings carry the full field-level surface (validation constraints, notes, AI-readiness metadata, custom properties).

Usage: scalar named types are used identically to object-shaped Types -- as the type of a field declaration:

xdbml
Type Email varchar [pattern: '^[^@]+@[^@]+$', tags: ['pii']]

Entity users    { id int [pk], email Email }
Entity admins   { id int [pk], email Email }
Entity contacts { id int [pk], email Email }

The three entities each get an email field of type varchar with the pattern validation and tags attached.

Importability: scalar Named Types are importable via the type element type in the module system (§26):

xdbml
reuse { type Email } from './enterprise-types'

Entity employees {
  work_email Email
}

Relationship to field-level imports: scalar Named Types and field-level imports (§26.9) both follow the declaration-vs-placement pattern: the construct declares a reusable shape; placement happens by using the shape's name as a field's type. The difference is in where the reusable shape originates: scalar Named Types are explicitly declared as Type; field-level imports clone a shape from an existing entity field. Both mechanisms are valid; authors choose based on context.

15. Enum

xdbml
enum job_status {
  created [note: 'Waiting to be processed']
  running
  done
  failure
}

enum grade {
  "A+"
  "A"
  "A-"
}

Values with non-identifier characters must be quoted. Enums may be container-qualified.

16. TablePartial and TableGroup

16.1 TablePartial

xdbml
TablePartial base_template {
  id         int       [pk, not null]
  created_at timestamp [default: `SYSTIMESTAMP`]
  updated_at timestamp [default: `SYSTIMESTAMP`]
}

Entity users {
  ~base_template
  name varchar
}

Conflict resolution: local entity definitions override partials; among partials, last-injected wins. TablePartials may also be injected into Edge declarations.

16.2 TableGroup

xdbml
TableGroup e_commerce [color: '#3498DB', note: 'Commerce entities'] {
  orders
  products
  inventory
}

TableGroup settings:

SettingValues
color#rgb or #rrggbb. Visualization color for the group's frame in diagram renderers.
noteFree-text annotation.

The color setting controls how renderers visually frame the group's member entities. Implementations that don't support group colors ignore the setting.

16.3 Naming

TablePartial and TableGroup retain their DBML names without Entity* synonyms.

17. Diagram View

xdbml
DiagramView full_view {
  Tables { * }
  Notes { * }
  TableGroups { * }
  Containers { * }
  Views { * }
  Edges { * }
}

DiagramView sales_view {
  Tables { users; orders; products }
}

Categories: Tables, Notes, TableGroups, Containers, Views, Edges.

18. Note

Inline notes are settings or blocks on Project, Container, Entity, Field, Index, Edge, View, Type, TableGroup. Standalone notes act as canvas annotations:

xdbml
Note design_decision {
  '''
  We chose to denormalize line items into the orders entity
  rather than maintaining a separate line_items entity because
  read patterns dominate writes by 100:1.
  '''
}

19. Path syntax

19.1 Segment forms

SegmentMeaning
.field_nameNavigate into named field
.[N]Specific positional array element (zero-indexed)
.[*]All array elements (iteration)
.["literal_key"]Map literal key
.[*] (map context)All map values
."quoted name"Non-identifier name
.AlternativeNamePolymorphic alternative selector

19.2 Path examples

xdbml
customers.addresses.[0].city          // tuple positional
orders.line_items.[*].sku             // every sku across line items
events.[42].payload.user_id           // 43rd event's payload
flags.["dark_mode"]                   // specific map key
data."user.id"                        // quoted field name
payments.method.card.brand            // polymorphic alternative
sales.orders.line_items.[*].sku       // cross-container with array crossing

19.3 Implicit array iteration in indexes

Allowed only in index paths. Ref paths and tuple positions require explicit syntax.

19.4 JSONPath alias acceptance

JSONPath-style brackets (no leading dot) are accepted at parse time and normalized to the dot-prefixed canonical form:

xdbml
addresses[0].city     // parsed; normalized to addresses.[0].city

19.5 Formatter behavior

Explicit .[*] whenever path depth exceeds two segments; implicit auto-iteration otherwise. Tuple positions and quoted segments always emitted explicitly.

20. Polymorphism

20.1 Scalar type union -- union

xdbml
Entity records {
  score      union [int, decimal, null]
  legacy_id  union [string, int]                 [not null]
  tags       array [union [string, int]]
}

Element ordering is significant. The runtime type acts as the discriminator.

20.2 Structural polymorphism -- oneOf, anyOf, allOf

xdbml
Entity payments {
  amount  decimal(10,2)  [not null]

  method  oneOf {
    card object { last4 varchar(4), brand varchar }
    bank object { iban varchar }
    wallet object { provider varchar, account varchar }
  } [discriminator: method_kind]
}

20.3 Keywords

KeywordSemantics
oneOfExactly one of N alternatives
anyOfOne or more alternatives may apply
allOfValue must satisfy all listed schemas

20.4 Paths through polymorphism -- strict alternative selector

xdbml
indexes {
  method.card.brand      // valid
  method.brand           // rejected -- selector required even when unambiguous
}

Paths through union (scalar) are rejected.

20.5 Polymorphism inside arrays

xdbml
events array [
  event oneOf {
    user_event object { type varchar, user_id objectId, action varchar }
    item_event object { type varchar, item_id objectId, qty int }
  } [discriminator: type]
]

21. JSON as a first-class data type

21.1 Opaque JSON

xdbml
Entity api_logs {
  request_body  json   [note: 'shape varies by endpoint']
  response_body jsonb  [note: 'large; may exceed 1 MB']
  raw_data      variant
}

21.2 Schema-known JSON

xdbml
Entity orders {
  id      int [pk]
  payload json {
    shipping_address object {
      street  varchar [not null]
      city    varchar [not null]
    }
    items array [
      item object {
        sku      varchar [not null]
        quantity int     [not null]
      }
    ]
  }
}

21.3 json vs. object

object describes an abstract structural type -- encoding chosen by target. json, jsonb, variant describe a physical storage choice (JSON-typed column).

21.4 Keyword semantics

KeywordTarget intent
jsonGeneric JSON storage (Oracle 21c+, MySQL, MS SQL Server 2025+, BigQuery)
jsonbPostgreSQL-specific binary JSON, indexable
variantSnowflake-specific semi-structured with auto-typing

22. BSON scalar types

MongoDB stores documents in BSON, a JSON superset with additional scalar types. xDBML recognizes BSON type names as ordinary scalar type identifiers -- no special grammar treatment but preserved through the AST and emitted in MongoDB collection validators.

BSON typeTypical use
stringUTF-8 string (BSON element 0x02)
int3232-bit signed integer (BSON element 0x10)
int6464-bit signed integer (BSON element 0x12); long is a recognized alias
double64-bit IEEE 754 floating point (BSON element 0x01)
boolBoolean (BSON element 0x08)
nullNull type (BSON element 0x0A)
objectId12-byte MongoDB primary key
Decimal128High-precision decimal (IEEE 754-2008)
BinDataBinary data with subtype
TimestampMongoDB-internal timestamp
DateUTC datetime
RegexCompiled regular expression
JavaScriptJavaScript code
SymbolDeprecated; preserved for legacy reads
MinKey / MaxKeySentinel comparison values
DBPointerDeprecated; preserved for legacy reads

int64 is canonical; long is accepted at parse time and normalized to int64 in the AST. MongoDB shell output uses long, while the BSON spec uses int64 -- both forms are equivalent.

When a field lives inside a Container [type: database] targeting MongoDB, use BSON-native type names (string, int32, int64, Decimal128, Date, objectId, etc.) rather than SQL conventions (varchar, int, decimal(p,s), timestamp). The rationale:

  • BSON types are exactly what MongoDB's $jsonSchema validators emit -- no translation layer interpreting precision/scale metadata that has no BSON meaning.
  • Decimal128 carries its IEEE 754-2008 precision intrinsically and does not accept (p,s) parameters. Writing decimal(19,4) inside a MongoDB Collection misleadingly suggests precision metadata that BSON cannot represent.
  • int is ambiguous between BSON int32 and int64. Explicit int32 or int64 removes the ambiguity that cross-engine Ref validation depends on.

Generators accept SQL-named scalars inside MongoDB Containers (§1.2 design principle 4 -- type-name pass-through) and convert them on lowering. The recommendation is for clarity and round-trip fidelity, not parser correctness.

22.2 Cross-engine references

A Ref may cross between Containers whose engines use different type-name vocabularies. The reference is valid if the source and target field types map to a compatible value space per the round-trip table below.

Relational typeBSON equivalent
int, integerint32 (when target stores small IDs) or int64 (when target stores large IDs)
bigintint64
varchar, varchar(n), textstring (with optional maxLength)
decimal(p,s), numeric(p,s)Decimal128
boolean, boolbool
timestamp, timestamptz, datetimeDate

A Ref from int32 (BSON) to int (relational) is valid by default. A Ref from int64 to a relational int issues a warning if the relational type is not bigint or wide enough to hold the BSON values.

Reusable Type declarations (§14) used in both relational and MongoDB Containers should use BSON-native types in the Type body; relational generators lower them per the round-trip table:

xdbml
Type MonetaryAmount {
  amount   Decimal128 [not null]
  currency string     [not null, pattern: '^[A-Z]{3}$', minLength: 3, maxLength: 3]
}

This Type used in an Oracle Entity lowers amount to NUMBER(38,4) and currency to VARCHAR2(3) CHECK (REGEXP_LIKE(currency, '^[A-Z]{3}$')). The same Type used in a MongoDB Collection emits Decimal128 and string with maxLength: 3 and a pattern in $jsonSchema.

22.3 Round-trip to non-MongoDB targets

xdbml
Container app_data [type: database] {
  Collection users {
    _id          objectId    [pk]
    email        string      [unique, not null, maxLength: 255]
    balance      Decimal128
    last_login   Date
    avatar       BinData
  }
}

Round-trip to non-MongoDB targets:

BSON typeOraclePostgreSQLAvroJSON Schema
stringvarchar2(n) or clobvarchar(n) or textstringstring
int32number(10)integerintinteger
int64number(19)bigintlonginteger
doublebinary_doubledouble precisiondoublenumber
boolnumber(1) check (col in (0,1))booleanbooleanboolean
nullnullnullnullnull
objectIdvarchar2(24) w/ checkvarchar(24) or byteastring w/ logical typestring w/ pattern
Decimal128number(38,4)numeric(34,0)decimal high precisionstring
BinDatablobbyteabytesstring w/ format: byte
Timestamptimestamp with time zonetimestamptzlong w/ logicalType: timestamp-millisstring w/ format: date-time
Datetimestamp with time zonetimestamptzas Timestampstring w/ format: date-time

23. AI-readiness and semantic-interop settings

Four optional settings expose declarative metadata for AI consumers, semantic-layer tools (OSI, dbt MetricFlow, Cube), and governance platforms (Atlan, Collibra, DataHub, Alation). A fifth mechanism (the x_ prefix) supports organization-specific custom properties without grammar changes.

23.1 synonyms: -- alternative names

xdbml
Entity customers {
  display_name varchar [synonyms: ['name', 'full name', 'customer name']]
  mrr_amount   decimal(10,2) [synonyms: ['monthly revenue', 'recurring revenue']]
}

23.2 business_term: -- controlled-vocabulary identifier

xdbml
Entity orders {
  total_amount decimal(10,2) [business_term: 'Gross Order Value']
  net_amount   decimal(10,2) [business_term: 'https://glossary.acme.com/terms/net_order_value']
}

23.3 granularity: -- temporal granularity

xdbml
Entity events {
  occurred_at timestamp [granularity: second]
  recorded_at timestamp [granularity: microsecond]
}

Values: year, quarter, month, week, day, hour, minute, second, millisecond, microsecond, nanosecond.

23.4 tags: -- free-form classification labels

A list of string labels for classification, sensitivity, lifecycle, domain, audience, governance markers, or any other discoverable property. Tags are the dominant pattern in modern data catalogs (Atlan, Collibra, DataHub, Alation, Hackolade Studio).

xdbml
Entity customers {
  email        varchar [tags: ['pii', 'contact', 'gdpr-subject']]
  ssn          varchar [tags: ['pii', 'sensitive', 'gdpr-special', 'restricted']]
  internal_id  int     [tags: ['internal-only']]
}

Entity products [tags: ['catalog', 'public-api']] { ... }

Type MonetaryAmount [tags: ['finance', 'reusable']] { ... }

Tags compose freely with synonyms, business_term, and other settings. They differ from synonyms (which name the same thing in multiple ways) and from business terms (which reference an external glossary): tags are classification labels, freeform and multi-valued.

23.5 Custom properties -- the x_ prefix

Settings beyond the recognized vocabulary are accepted as open-vocabulary key-value pairs and passed through to the AST. To distinguish intentional custom properties from possible typos in reserved keywords, xDBML reserves the x_ prefix for explicit custom extensions.

xdbml
Entity customers {
  email varchar [
    pattern: '^[^@]+@[^@]+$',
    x_sensitivity: 'pii-low',
    x_acme_data_steward: 'jane@acme.com',
    x_retention_days: 2555
  ]
}

Container core [type: schema, x_provisioning_template: 'standard-tier-3'] { ... }

Conventions:

  • The prefix is x_ (underscore-prefix). Dash-prefix is not used because xDBML identifiers do not allow dashes.
  • The prefix is optional, not required. The parser also accepts non-prefixed unknown properties.
  • The parser does not validate names or values of x_* properties -- they pass through to the AST.
  • Tooling can distinguish x_* properties as deliberately custom (versus possibly-misspelled reserved keywords); linters may warn on non-x_-prefixed unknown properties.

A registry of de-facto-standardized custom properties is maintained at xdbml.org/registry. Custom properties that reach widespread adoption may be promoted to first-class settings in a future MINOR version.

23.6 Scope

The four first-class settings and the x_ mechanism carry declarative metadata about stored data -- names, descriptions, classifications, glossary references, AI-readiness labels. xDBML does not carry computational meaning (measures, metrics, aggregations, derived fields, hierarchies, default-aggregation hints) -- that belongs in adjacent semantic-layer formats -- nor contractual obligations (SLAs, ownership, pricing, support hours) -- that belongs in ODCS.

24. Validation constraints

xDBML accepts the full JSON Schema validation vocabulary as field settings:

SettingApplies toMeaning
patternstringRegular expression
formatstringemail, uri, uuid, date-time, etc.
minLength / maxLengthstringLength bounds
minimum / maximumnumberValue bounds
exclusiveMinimum / exclusiveMaximumnumberExclusive bounds
multipleOfnumberMultiple of N
enumanyFixed list of values
minItems / maxItemsarrayElement count bounds
uniqueItemsarrayAll elements distinct
minProperties / maxPropertiesobjectField count bounds
checkanyFree-form expression

Example:

xdbml
Entity users {
  email    varchar [pattern: '^[^@]+@[^@]+$', maxLength: 255]
  age      int     [minimum: 0, maximum: 150]
  username varchar [pattern: '^[a-z0-9_]{3,32}$']
  tags     array [varchar] [minItems: 1, maxItems: 10, uniqueItems: true]
}

24.1 Generator behavior

TargetConstraint output
Oracle / PostgreSQL / SQL Server / SnowflakeCHECK for minimum, maximum, pattern; others as comments
MongoDB validatorNative $jsonSchema with all validations
JSON SchemaDirect round-trip
Avropattern → name validation; bounds → logical-type metadata
CassandraValidations without equivalent dropped with warning

25. Records -- sample data

A records { } block declares sample data inline in the schema. Records serve documentation, testing, and example-data purposes -- they describe what real rows look like alongside the entity that defines their shape.

Records can be declared in two forms: inside an entity body (implicit column list) or at the top level referencing an entity by name (explicit column list).

25.1 Records inside an entity

When declared inside an entity body, the records { } block uses an implicit column list -- values are assigned to columns in the order the columns are declared on the entity:

xdbml
Entity users {
  id    int [pk]
  name  varchar
  email varchar

  records {
    1, 'Alice', 'alice@example.com'
    2, 'Bob',   'bob@example.com'
    3, 'Carol', 'carol@example.com'
  }
}

Each row in the block is a comma-separated list of values matching the column order. When an entity uses ~partial_name injection (§16.1), the implicit column order follows the same precedence: locally declared fields first, then partial-injected fields in injection order.

25.2 Records outside an entity (top-level form)

When declared outside an entity body, the records declaration names the entity and explicitly lists the columns being populated:

xdbml
records users (id, name, email) {
  1, 'Alice', 'alice@example.com'
  2, 'Bob',   'bob@example.com'
}

Columns not listed default to null (or the column's declared default). The order of values in each row matches the explicit column list, not the entity's field declaration order.

The top-level form is useful when:

  • Sample data is maintained separately from the entity definition (e.g., one file declares schemas, another declares fixtures)
  • Only a subset of an entity's columns has meaningful sample values
  • The same entity has multiple disjoint record sets that benefit from separate declarations

25.3 Cross-container references

To declare records for an entity inside a Container, use the dotted path:

xdbml
records core.users (id, name, email) {
  1, 'Alice', 'alice@example.com'
}

25.4 Value forms

Each value in a record row is one of:

FormExampleNotes
String'Alice', 'O\'Brien'Single-quoted. Escape inner single quotes with \'.
Multi-line string'''multi\nline'''Triple single-quoted. Useful for description text.
Number42, 3.14, -1.5e-3Integer or decimal; standard scientific notation.
Booleantrue, falseLowercase.
NullnullThe SQL/JSON null value.
Date / Time'2026-06-10', '10:30:00', '2026-06-10T14:30:00Z'ISO 8601 inside single quotes. Interpreted per the target column's declared type (date, time, timestamp with or without timezone).
Enum valueStatus.activeWhen the column's type is a declared Enum, values may be written as <EnumName>.<value> for clarity.
Backtick expression`gen_random_uuid()`Backtick-wrapped. Treated as opaque target-engine expression, disabling static type checking; lowered verbatim by generators. Useful for database-side defaults or computed values.

Each value is type-checked against the target column's declared type at generation time. Implementations SHOULD report a clear error when a value cannot be coerced (e.g., 'not_a_number' for an int column).

25.5 Generator behavior

Records are sample data, not authoritative schema content. Generators MAY emit records as:

  • SQL INSERT statements (PostgreSQL, Oracle, MySQL, etc.)
  • MongoDB insertMany calls or BSON document fixtures
  • Avro IndexedRecord instances for test corpora
  • JSON document fixtures
  • Markdown tables in generated documentation

Generators MAY also choose to skip records entirely when producing DDL-only output. The records' presence in the xDBML document is non-prescriptive about whether they appear in any generated artifact.

26. Module system

A single xDBML file can describe a complete schema, and in many cases that is the appropriate structure. For larger architectures, however, splitting the schema across multiple files becomes valuable -- separating polyglot layers by engine, decomposing by domain, or sharing canonical declarations across data products. The module system lets one xDBML document import declarations from another.

The design is intentionally close to DBML's module system documented at https://dbml.dbdiagram.io/syntax/module-system, with strict-superset semantics: every valid DBML document parses correctly under xDBML and DBML constructs mean the same thing in xDBML as in DBML. xDBML extends DBML's design in four ways, all of which are available only when the file declares xdbml: 0.2 (or later):

  • Extended element-type vocabulary covering xDBML-specific constructs (Container, Entity, Collection, Record, Type, Edge, View, DiagramView) in addition to DBML's table, enum, tablepartial, note, schema, tablegroup.
  • Field-level imports, allowing an author to clone a single field declaration from another file as a reusable named shape.
  • Container-scoped imports, allowing imports to live inside the Container body where they belong.
  • Clone blocks, optionally embedding the imported content directly in the importing file so that the file remains parseable even if the referenced file is unavailable.

This phase of v0.2 supports relative-path imports. URL imports are deferred to a later phase.

26.1 Files and projects

Every xDBML file is a complete document and may declare a Project { ... } block describing its standalone view of a schema.

When file B imports from file A, what flows from A to B is the content of A's declarations -- its Containers, Entities, Types, Edges, Views, TablePartials, Enums, Notes, TableGroups, DiagramViews. What does NOT flow is A's Project { ... } declaration. Project declarations are intrinsic to each file; they are never importable.

The same file can be opened standalone (its own Project is the project) or imported by another file (only its non-Project content contributes to the importer's project). A file MAY omit the Project declaration if its sole purpose is to be imported.

26.2 Import in full vs selective import

Two import modes are supported.

Import in full:

xdbml
reuse * from './path-to-file'

Brings every top-level declaration from the source file into the importing file's scope (except for the source's Project block). Cannot be combined with aliasing -- if you need to rename, use the selective form.

Selective import:

xdbml
reuse {
  type name
  type name as alias
  ...
} from './path-to-file'

Brings only the listed declarations. Each item may carry an optional as alias to rename the import in the current file's namespace.

Multi-line and single-line comma-separated forms are both supported:

xdbml
reuse { entity products, entity categories, type Email } from './library'

A single directive may import multiple declarations from the same source file. When combined with a clone block (§26.6), the clone block contains all the imported declarations matched by name and element type.

26.3 Importable element types

The element type names in the selective form are case-insensitive (Table, TABLE, table all work).

KeywordWhat is imported
tableEntity declared with the DBML Table keyword
entityEntity declared with Entity (synonym for table)
collectionEntity declared with Collection (synonym for table)
recordEntity declared with Record (synonym for table)
enumEnum
tablepartialTablePartial
noteSticky Note
schemaContainer declared with type: schema, including all entities inside it
containerContainer of any type, including all entities inside it
tablegroupTableGroup, including its membership
typeNamed Type (object-shaped or scalar -- §14)
edgeEdge
viewView
diagramviewDiagramView
fieldSingle field declaration cloned from an entity as a reusable named shape (see §26.8)

The project element type is NOT importable.

Source paths mirror the source file's actual hierarchy:

Source structureImport path
Entity at top level (no container)EntityName
Entity inside a ContainerContainerName.EntityName
Field on a top-level entityEntityName.FieldName
Field on an entity inside a containerContainerName.EntityName.FieldName
Nested field (object inside an entity)ContainerName.EntityName.OuterField.InnerField
Top-level Type, Enum, TablePartial, etc.Name

26.4 The use and reuse directives

Both use and reuse bring declarations from another file into the current file. They differ in one respect: transitivity to further importers.

reuse is the simpler default and is what most authors want. Declarations brought in via reuse are visible to files that further import the current file:

xdbml
// schemas/index.xdbml -- barrel file
reuse * from './oracle-core'
reuse * from './mongo-orders'

// main.xdbml -- imports the barrel
reuse * from './schemas/index'
// All declarations from oracle-core and mongo-orders are available.

use is appropriate only when the author wants the imports to remain private to the importing file. Declarations brought in via use are NOT visible to files that further import the current file.

Recommendation: use reuse by default unless you have a specific reason to prevent transitivity. Most authors will never need the non-transitive form.

Both forms support both * (import all) and { ... } (selective) syntaxes.

26.5 Where the directive goes determines where the import lives

The location of a use/reuse directive in the importing file determines where the imported element lives in the merged AST.

The rule is simple: the directive's location is the placement location. What you see textually is what the merged AST contains. There is no separate "visibility scope" or "placement scope" -- they are the same thing.

For non-field imports, the directive may appear in any location where the imported element type is normally valid:

  • Container imports (container, schema): at file scope (Containers are top-level constructs)
  • Top-level constructs (type, enum, tablepartial, edge, view, tablegroup, diagramview, note): at file scope
  • Entity imports (table, entity, collection, record): at file scope (top-level uncontained entity) OR inside a Container body (the entity becomes a member of that Container)

For field imports, the directive MUST appear at file scope (see §26.8 for the declaration-vs-placement pattern).

use/reuse directives MAY NOT appear inside Entity, Edge, View, or Type bodies. Imports are declarations that establish structural elements; they don't belong inside the bodies of other structural elements.

Naming:

The imported element is named in the importing file's namespace based on where it is placed, not where it came from. For an entity import:

  • Directive at file scope → entity is named by its bare name at file top level (e.g., products)
  • Directive inside Container ordering → entity is named within that container (e.g., ordering.products)

The source's path appears in the directive's from clause (recording where to find the element in the source); the local name reflects the directive's placement.

26.6 Clone blocks

use and reuse directives MAY be followed by an optional clone block containing the imported content embedded directly in the importing file. When the clone is present, the parser uses it as the authoritative content for that import. When the clone is absent, the parser falls back to opening the referenced file at parse time (DBML-compatible behavior).

Syntax:

xdbml
// Reference-only (DBML behavior, no clone)
reuse { entity core.dim_customer } from './conformed-dimensions'

// With clone (xDBML autonomous behavior)
reuse { entity core.dim_customer } from './conformed-dimensions' {
  Entity dim_customer {
    id      int   [pk]
    email   Email
    country CountryCode
  }
}

// With clone + tool-managed metadata
reuse { entity core.dim_customer } from './conformed-dimensions' [cloned_at: '2026-06-06T14:30:00Z'] {
  Entity dim_customer { ... }
}

What the clone block contains:

The clone block contains exactly the imported declaration(s), without any surrounding wrappers from the source file. The shape of the clone follows the granularity of the import:

Import directiveClone block content
reuse { field X.Y.Z } from ...The field declaration alone (no entity wrapper, no container wrapper)
reuse { entity X.Y } from ...The entity declaration alone (no container wrapper)
reuse { type T } from ...The Type declaration
reuse { enum E } from ...The Enum declaration
reuse { tablepartial P } from ...The TablePartial declaration
reuse { container C } from ...The Container declaration including its intrinsic entities
reuse { schema S } from ...Same as container -- the schema's content travels with it
reuse { tablegroup G } from ...The TableGroup declaration including its membership
reuse { view V } from ...The View declaration
reuse { edge E } from ...The Edge declaration
reuse { diagramview D } from ...The DiagramView declaration
reuse { note N } from ...The sticky Note declaration
reuse * from ...All top-level declarations from the source's non-Project content, preserving their natural shape

The principle: container wrappers are dropped when extracting an inner element, preserved when the container itself is the imported element.

Multiple imports in one directive with one clone block are supported. The parser matches each clone declaration to its corresponding import item by name and element type; order within the clone block is not significant.

Names in the clone use the importing file's vocabulary. When a directive aliases an import (as), the clone uses the alias. Names referenced inside a clone also use the importing file's vocabulary -- no mediation between source names and importing-file names is required.

Semantics:

  • When a clone block is present, the parser MUST use its content as the authoritative declarations for the directive's scope. The referenced path is not opened during parsing.
  • When a clone block is absent, the parser MUST open the referenced path and use the source file's content as authoritative. If the path cannot be resolved, the parse fails.
  • Clone content is parsed under the importing file's xdbml: version directive, not the source file's.

Tool-mediated clone refresh: before submitting the file to a parser, tools MAY refresh clones by re-reading their referenced sources and updating clone content (and the cloned_at timestamp). Such refresh workflows are tool-defined and out of scope for this specification.

Autonomy property: a file containing clone blocks for all of its imports is fully self-contained and parses correctly without needing access to any referenced files. Authors choose per-directive whether to include a clone based on workflow needs.

Clone metadata: the directive MAY carry settings before the clone block. The only setting defined in this version is:

  • cloned_at: <timestamp> -- ISO 8601 timestamp recording when the clone was captured

This setting is informational. The parser does not act on it. Tooling MAY use it to drive drift-detection workflows. Other metadata fields (hashes, source versions, refresh policies) are reserved for future versions.

Drift detection: drift between the clone content and the current state of the referenced file is out of scope for the specification. Tools that author or refresh clones define their own policies.

Nested references within clones: when a clone is captured, it includes the directly-imported declarations only. If those declarations reference further declarations from other files, the clone does NOT recursively capture those further references. For full self-containment, the author must also import (with clone blocks) the transitively referenced declarations.

26.7 Worked example: use vs reuse

Private internal helper (use private to importing file):

xdbml
// internal-helpers.xdbml
Type InternalAuditFields {
  created_at timestamp
  created_by varchar
}
xdbml
// orders.xdbml -- keeps the helper private
xdbml: 0.2

use { type InternalAuditFields } from './internal-helpers'

Container ordering [type: schema] {
  Entity orders {
    id          int [pk]
    audit       InternalAuditFields
    customer_id int
  }
}
xdbml
// main.xdbml
xdbml: 0.2

Project ecommerce { targets: [PostgreSQL] }

reuse * from './orders'
// 'ordering.orders' is available. 'InternalAuditFields' is NOT (orders.xdbml used it privately).

26.8 Field-level imports

Field-level imports clone a single field declaration from another file. They are useful when the importing file wants to reuse an existing field's complete definition without importing the whole entity or asking the library author to extract a Named Type.

Declaration vs placement:

Field imports are different from all other imports in one important respect: they declare a reusable shape without placing a field anywhere.

A field cannot exist on its own outside an entity. So the field import directive declares an available named field shape at file scope, and the author then uses that name as the type of a field in some entity. This makes field imports semantically similar to scalar Named Types (§14.7): the directive declares a reusable shape; the author places fields of that shape elsewhere in the file.

Field imports must appear at file scope, never inside a Container body.

Syntax:

xdbml
reuse { field core.dim_customer.email } from './conformed-dimensions'

The path follows xDBML's standard container.entity.field form. Field paths into nested structures use dot notation: field core.dim_customer.addresses.address.city.

Usage: the imported field's name (or alias) is used as the type of a field in some entity:

xdbml
reuse { field core.dim_customer.email } from './conformed-dimensions'

Container facts [type: schema] {
  Entity fact_sales {
    customer_email  email   // 'email' here is used as the type, not the field name
    customer_id     int     [ref: > core.dim_customer.id]
    amount          decimal(10,2)
  }
}

To rename the imported field shape, use as:

xdbml
reuse { field core.dim_customer.email as canonical_email } from './conformed-dimensions'

Entity fact_sales {
  customer_email  canonical_email
}

What is copied (cloning by value at clone-capture time or parse time):

  • Type: copied verbatim, including nested shapes
  • Settings (not null, unique, default, increment, pk): copied verbatim
  • Validation constraints (pattern, minLength, maxLength, range, enum membership, check): copied verbatim
  • Notes (both inline and block): copied verbatim
  • AI-readiness metadata (synonyms, business_term, tags, granularity): copied verbatim
  • Custom properties (x_ prefixed): copied verbatim
  • Inline ref: settings: copied only if the ref's target is resolvable in the importing file's scope; otherwise dropped with a warning

Per-instance overrides:

When using the imported shape as a field type, the field's declaration may override settings:

xdbml
reuse { field core.dim_customer.email } from './conformed-dimensions'

Entity employees {
  email  email  [unique]   // adds 'unique' to the imported email's settings
}

Field imports support clone blocks:

xdbml
reuse { field core.dim_customer.email } from './conformed-dimensions' [cloned_at: '2026-06-06T14:30:00Z'] {
  email varchar [unique, not null, pattern: '^[^@]+@[^@]+$', tags: ['pii']]
}

26.9 Aliases

When two source files define elements with the same name, use as to rename imports and avoid conflicts:

xdbml
Container app [type: schema] {
  reuse { entity users as auth_users } from './auth'
  reuse { entity users as billing_users } from './billing'
}

Once aliased, the alias is the only accessible name in the importing file under that import. The clone (if present) uses the alias as the declaration's name.

26.10 Name resolution

Name resolution operates in two passes:

Pass 1 (declaration collection): the parser walks the import graph starting from the parse target. For each use/reuse directive, the parser uses the clone block if present, otherwise opens the referenced file. Declarations are collected with their fully-qualified names in the importing file's namespace (placement applied, aliases applied).

Pass 2 (reference resolution): references are resolved against the declarations collected in pass 1. Forward references within and across files are valid.

Conflict policy: two imports conflict only if their fully-qualified names in the importing file's namespace collide. Two Entity users imported into different Containers (ordering.users and audit.users) do not conflict; two imports of users into the same Container without aliases DO conflict. On conflict, the parser MUST emit an error and require explicit aliasing.

Local declarations always win over imported declarations of the same name.

26.11 Version directive across files

Each file is parsed using its own xdbml: version directive (or as DBML if no directive is present). Different files in an import graph MAY use different version directives:

  • A xdbml: 0.2 file MAY import a xdbml: 0.1 file
  • A xdbml: 0.2 file MAY import a DBML file (no directive)
  • A xdbml: 0.1 file MAY import a DBML file
  • A xdbml: 0.1 file MAY NOT use v0.2-only constructs such as clone blocks, scalar Named Types, or field imports

When a clone is captured from a source file, the clone content is parsed under the importing file's version directive, not the source's.

26.12 Multi-file Project rule

Every xDBML file MAY declare a Project, but only the entry file's Project survives in the merged AST. When file A imports from file B:

  • A's Project is part of the merged project AST
  • B's Project is ignored (B's declarations contribute, B's Project metadata does not)

The same file plays different roles depending on whether it's the entry point or being imported.

26.13 Path resolution

This phase of v0.2 supports relative paths only. URL imports are deferred to a later phase.

A relative path:

  • Begins with ./ or ../
  • Uses forward slashes as path separators on all platforms
  • Is resolved relative to the importing file's directory
  • Has an optional file extension (./auth, ./auth.xdbml, ./auth.dbml all work)

When the extension is omitted, the parser searches <path>.xdbml then <path>.dbml. If neither exists AND no clone block is present, the parser reports an error.

When a clone block IS present, path resolution failure is NOT an error -- the clone is authoritative. The parser MAY emit a warning.

Paths without a leading ./ or ../ are reserved for future use and rejected in this phase.

Cross-platform compatibility: forward-slash relative paths are accepted by all modern file APIs on Linux, macOS, and Windows. Browser-based renderers using the File System Access API support the same path syntax.

Browser-based renderer considerations: when an xDBML file is parsed in a browser-based renderer, reading referenced files may require explicit user permission via the File System Access API or equivalent. Clone blocks eliminate this concern entirely: a file with clones for all its imports requires no file system access to referenced files. Authors writing files for browser-based consumption are encouraged to include clones.

26.14 Circular imports

Circular imports are allowed, matching DBML's behavior. The two-pass name resolution handles cycles naturally.

Best practice (not a language rule): circular imports are often a signal that file boundaries could be improved.

26.15 Worked examples

Entity import inside a Container:

xdbml
// catalog.xdbml
xdbml: 0.2
Project catalog { targets: [PostgreSQL] }

Container core [type: schema] {
  Entity products {
    id    int [pk]
    sku   varchar [unique, not null]
    name  varchar [not null]
    price decimal(10,2)
  }
}
xdbml
// order-system.xdbml
xdbml: 0.2
Project order_system { targets: [PostgreSQL] }

Container ordering [type: schema] {
  Entity orders {
    id          int [pk]
    customer_id int
    placed_at   timestamp
  }

  reuse { entity core.products } from './catalog' [cloned_at: '2026-06-06T14:30:00Z'] {
    Entity products {
      id    int [pk]
      sku   varchar [unique, not null]
      name  varchar [not null]
      price decimal(10,2)
    }
  }

  Entity order_lines {
    order_id    int [ref: > ordering.orders.id]
    product_id  int [ref: > ordering.products.id]
    quantity    int
  }
}

The imported products entity becomes ordering.products in the importing file. The source's core container appears in the directive's path but does not survive into the importing file's AST.

Field-level import:

xdbml
// legacy-customer.xdbml
Container ops [type: schema, target: Oracle] {
  Entity customer_legacy {
    id            number(10)
    legacy_email  varchar2(255) [pattern: '^[^@]+@[^@]+$', tags: ['pii']]
  }
}
xdbml
// new-system.xdbml
xdbml: 0.2
Project new_system { targets: [PostgreSQL] }

reuse { field ops.customer_legacy.legacy_email as contact_email }
    from './legacy-customer'
    [cloned_at: '2026-06-06T14:30:00Z'] {
  contact_email varchar2(255) [pattern: '^[^@]+@[^@]+$', tags: ['pii']]
}

Container app [type: schema] {
  Entity new_customer {
    id            int           [pk]
    contact_email contact_email
  }
}

The field import declares contact_email as a reusable shape at file scope. The Entity uses contact_email as the type of one of its fields.

26.16 Parser implementation requirements

A v0.2-conformant parser supporting the module system MUST:

  1. Accept any xDBML file as the parse target. The target file's Project is the merged project's Project.
  2. For each use/reuse directive: if a clone block is present, use it as authoritative content. If absent, open the referenced path.
  3. Walk the import graph by reading directives at file scope and inside Container bodies, resolving paths relative to the importing file's directory.
  4. Visit each unique file at most once; circular imports must terminate.
  5. Parse each referenced file with its own xdbml: version directive. Parse clone content with the importing file's version directive.
  6. Place each imported element at the directive's location in the importing file's merged AST.
  7. Reject field imports inside Container bodies; field imports MUST be at file scope.
  8. Reject use/reuse directives inside Entity, Edge, View, or Type bodies.
  9. Implement use vs reuse semantics correctly.
  10. For field imports, copy the field declaration as a reusable named shape available for use as a field type, with ref: settings dropped (with warning) when the ref target is not in the importing file's scope.
  11. When extracting a clone for an entity import, strip any surrounding Container wrapper. When extracting a clone for a container import, preserve the Container with its entities. Apply directive aliases in the clone's declaration names.
  12. When a directive carries multiple imports and a clone block, match each clone declaration to its corresponding import item by name and element type.
  13. Resolve names in two passes; an imported element's name follows its placement location, not its source path.
  14. Detect and reject unresolved name conflicts.
  15. Reject paths not starting with ./ or ../.

The parser SHOULD:

  • Cache parsed file contents during a single parse invocation
  • Emit warnings for shadowed-but-unaliased imports
  • Emit warnings for circular import chains
  • Emit warnings when a clone block is present but the referenced path could not be accessed
  • Emit warnings when a captured clone has unresolved external references

Drift detection between clones and their sources is out of scope for the specification.

27. AST representation

Abstract Schema Tree

Project
├── settings (targets, ...)
├── Note (project-level)
├── Container (0..n)
│   ├── settings (type, target, replication, location, ...)
│   ├── Note (container-level)
│   ├── Entity (n per container)
│   │   ├── Field (n; recursively nested)
│   │   ├── Index (n)
│   │   ├── Check (n)
│   │   ├── Records (0..1)
│   │   └── Note
│   ├── View (n per container)
│   │   ├── settings (source_query, materialized, ...)
│   │   ├── Field (n)
│   │   └── Note
│   └── Edge (n per container)
│       ├── settings (source, target, source_cardinality, target_cardinality, undirected, ...)
│       ├── Field (n)
│       ├── Index (n)
│       └── Note
├── Entity (0..n; project-level)
├── View (0..n; project-level)
├── Edge (0..n; project-level)
├── Type (0..n)
├── Enum (n)
├── Ref (n; project-level)
│   └── Cardinality (source/target min and max)
├── TablePartial (n)
├── TableGroup (n)
├── DiagramView (n)
└── Note (n; sticky notes)

27.1 Field recursion

A Field node carries a type expression. Complex types contain nested Field nodes; tree depth is unbounded.

27.2 Path segments

Field paths are ordered lists of PathSegment nodes, each tagged by kind: field, array_index, array_iter, map_key, map_iter, alternative. Implicit forms are normalized to explicit forms during parsing.

27.3 AST flavors

  • Raw -- references stored as strings, for partial-document analysis.
  • Normalized -- references resolved into direct pointers, for traversal and code generation.

28. Round-trip semantics

28.1 Round-trip targets

FormatDirectionNotes
DBML 3.13.6xDBML ↔ DBMLStrict superset. xDBML-only constructs emit as comments when downgrading.
Oracle DDLxDBML → SQLCREATE TABLE, CREATE TYPE, CHECK, FKs. JSON via native JSON type (21c+). Views as CREATE VIEW / CREATE MATERIALIZED VIEW.
PostgreSQL DDLxDBML → SQLJSON via jsonb.
MongoDB validatorsxDBML ↔ $jsonSchemaObject/array/oneOf/union direct. BSON types preserved.
Avro schemasxDBML ↔ AvroNamed types → Avro named records. Union element ordering preserved.
Parquet schemasxDBML → ParquetNested types → Parquet group types.
JSON SchemaxDBML ↔ JSON SchemaDirect correspondence for most constructs. Named types → $defs.
OpenAPIxDBML → OpenAPIEntities and types → components.schemas.
Neo4j / Cypher schemaxDBML → CypherEntities → node labels; Edges → relationship types with properties.
RDF-starxDBML ↔ RDF-starEdges → quoted triples with annotations.
ODCS (schema section)xDBML → ODCSxDBML supplies schema portion; other ODCS sections out of scope.
Hackolade PDMxDBML ↔ PDMLossless round-trip. PDM Description fields receive xDBML notes (markdown).

28.2 Lossiness boundary

Lossless: Container declarations and settings; Entity, View, Edge, Field, Index, Ref structure; nested types; polymorphism; named types (where target supports $ref); validation constraints; cardinality; notes.

Lossy when target lacks the construct:

  • Polymorphism to relational targets (lowered to discriminator + checks)
  • Named types to formats without $ref (inlined)
  • Edges to non-LPG relational targets (lowered to junction tables)
  • Views with non-portable source queries (passed verbatim; may fail on target)

Generators emit warnings on lossy lowering.

29. Relationship to other standards

┌──────────────────────────────────────────────────────────────────────┐
│  Data product layer                   ODPS                           │
├──────────────────────────────────────────────────────────────────────┤
│  Inferential / reasoning layer        OWL, knowledge graphs          │
├──────────────────────────────────────────────────────────────────────┤
│  Semantic / business layer            OSI, dbt MetricFlow, Cube      │
├──────────────────────────────────────────────────────────────────────┤
│  Contract layer                       ODCS                           │
├──────────────────────────────────────────────────────────────────────┤
│  Validation layer                     JSON Schema, SHACL             │
├──────────────────────────────────────────────────────────────────────┤
│  Schema / shape layer                 xDBML ← here                   │
├──────────────────────────────────────────────────────────────────────┤
│  Engine DDL layer                     SQL DDL, Cypher, CQL, ...      │
└──────────────────────────────────────────────────────────────────────┘
StandardLayerxDBML relationship
SQL DDLEngineForward-engineering target
Cypher schemaEngineForward-engineering target for LPG
JSON SchemaSchema (JSON-only)Bidirectional translation
OpenAPISchema (HTTP APIs)Same shapes
Avro / ParquetSchema (serialization)Forward-engineering target
SHACLValidationBidirectional translation
ODCSContractSupplies schema section
ODPSData productSits above ODCS
OSISemanticComplementary via §23 shared vocabulary
RDFS / OWLInferentialShapes lift into knowledge graphs
RDF-starInferentialBidirectional for edges with properties

xDBML does not attempt to be SQL, OSI, ODCS, OWL, or JSON Schema. Each addresses part of the data-description problem at one layer.

30. Conformance

A conforming xDBML v0.2 implementation:

  1. Parses every valid xDBML v0.2 document to the AST described in §27.
  2. Parses every valid xDBML v0.1 document with v0.1 semantics (the version directive selects).
  3. Rejects malformed documents with informative error messages.
  4. Honors the version declaration per §4.1.
  5. Honors the experimental: opt-in per §4.2.
  6. Normalizes implicit forms to canonical AST representations for paths.
  7. Preserves declared keyword choices in the raw AST flavor.
  8. Computes default cardinality per §11.8 when not declared.
  9. Implements the module system (§26) including relative-path imports, clone blocks, field-level imports, and scoped imports per §26.6.

Generators and importers are not required for spec conformance.

Appendix A. Reserved keywords

Document-level: xdbml, experimental, Project

Container-level: Container, Schema, Database, Keyspace, Namespace, Dataset, Bucket

Entity-level: Entity, Table, Collection, Record

Other top-level constructs: Edge, View, Type, Enum, enum, Ref, TablePartial, TableGroup, DiagramView, Note

Field types: object, struct, record, array, list, map, dict, dictionary, set, union, oneOf, anyOf, allOf, json, jsonb, variant

Block keywords: records, indexes, checks

Module system keywords (new in v0.2): use, reuse, from, as

Module system element-type slot values (new in v0.2): table, entity, collection, record, enum, tablepartial, note, schema, container, tablegroup, type, edge, view, diagramview, field

Module system directive setting (new in v0.2): cloned_at

Relationship operators: <, >, -, <>

Field settings: pk, primary key, null, not null, unique, default, increment, note, ref, check

Cardinality settings: source, target, min_source, max_source, min_target, max_target, source_cardinality, target_cardinality, undirected

Relationship settings (new for v0.2): inactive (flag for visualization-only deactivation), color (#rgb / #rrggbb for the relationship line)

TableGroup settings (new for v0.2): color (#rgb / #rrggbb for the group's frame in diagrams)

Check settings: name (constraint name in generated DDL), note (free-text annotation)

View settings: source_query, materialized, refresh_schedule, refresh_on, source_database, storage_options

AI-readiness settings: synonyms, business_term, granularity, tags

Custom-property prefix: x_ (reserved as a convention for organization-specific extensions; see §23.5)

Quoted identifiers ("keyword") may be used for entities or fields that share a name with a reserved keyword.

Appendix B. Lexical grammar reference

Token classPattern
Identifier[A-Za-z_][A-Za-z0-9_]*
Quoted identifier" any chars with \", \\ escapes "
String literal' any chars with \', \\ escapes '
Multi-line string''' any chars with indentation normalization '''
Numberinteger or decimal, optional sign and exponent
Expression` any chars `
Cardinality string'N..M' where N, M are non-negative integers or *
Comment (single-line)// to end of line
Comment (block)/* ... */
Container/entity-qualified nameidentifier.identifier
Composite referenceidentifier.(identifier, identifier, ...)
Field pathidentifier followed by zero or more .identifier, .[N], .[*], .["literal"], or ."quoted" segments

A formal ANTLR4 grammar is published at grammar/xDBML.g4 in this repository.

Appendix C. Worked end-to-end examples

C.1 Polyglot model: Oracle + MongoDB + events + graph

xdbml
xdbml: 0.1

Project polyglot_example {
  targets: [Oracle, MongoDB, Avro, Neo4j]
}

Type Address {
  street  varchar [not null]
  city    varchar [not null]
  country varchar
}

Type MonetaryAmount {
  amount   Decimal128 [not null]
  currency string     [not null, pattern: '^[A-Z]{3}$', minLength: 3, maxLength: 3]
}

Container core [type: schema, target: Oracle] {
  Table customers {
    id              int     [pk]
    email           varchar [unique, not null, pattern: '^[^@]+@[^@]+$']
    display_name    varchar [not null]
    primary_address Address
  }
}

Container orders_store [type: database, target: MongoDB] {
  Collection orders {
    _id          objectId  [pk]
    customer_id  int32     [not null]
    placed_at    Date      [granularity: second]
    total        MonetaryAmount
    line_items   array [
      line_item object {
        sku         string         [not null]
        quantity    int32          [not null, minimum: 1]
        unit_price  MonetaryAmount
      }
    ]
    payment_method oneOf {
      card   object { last4 string [maxLength: 4], brand string }
      bank   object { iban string }
      wallet object { provider string, account string }
    } [discriminator: method_kind]
  }
}

Container events [type: namespace, target: Avro] {
  Record OrderPlaced {
    event_id    string     [pk]
    occurred    long       [granularity: millisecond, not null,
                            note: 'Unix epoch milliseconds; Avro logical-type timestamp-millis']
    order_id    string     [not null, note: 'String form of order _id from orders_store']
    customer_id int        [not null]
    total       MonetaryAmount
  }
}

Container social [type: database, target: Neo4j] {
  Edge FOLLOWS [source: core.customers, target: core.customers,
                source_cardinality: '0..*', target_cardinality: '0..*'] {
    since      date    [not null]
    is_close   boolean [default: false]
  }
}

Container catalog [type: schema, target: Oracle] {
  Entity products {
    sku       varchar [pk]
    name      varchar [not null]
    price     MonetaryAmount
    metadata  json {
      weight_grams int
      dimensions   object { length int, width int, height int }
      tags         array [varchar]
    }
  }

  View top_sellers [materialized: true, refresh_schedule: 'daily'] {
    source_query: '''
      SELECT p.sku, p.name, COUNT(*) AS order_count
      FROM catalog.products p
      JOIN orders_store.orders o ON o.line_items.sku = p.sku
      GROUP BY p.sku, p.name
      ORDER BY order_count DESC
      FETCH FIRST 100 ROWS ONLY
    '''
    sku         varchar [pk]
    name        varchar
    order_count int
  }
}

Ref: orders_store.orders.customer_id > core.customers.id [source: '1..*', target: '1..1']
Ref: orders_store.orders.line_items.[*].sku > catalog.products.sku
Ref: events.OrderPlaced.order_id > orders_store.orders._id
Ref: events.OrderPlaced.customer_id > core.customers.id

C.2 Heterogeneous tuple

xdbml
xdbml: 0.1

Entity customers {
  id        int  [pk]
  name      varchar
  addresses array [
    [0] billing  object { street varchar, city varchar, zip varchar }
    [1] shipping object { street varchar, city varchar, zip varchar }
  ]

  indexes {
    addresses.[0].city
    addresses.[1].zip
  }
}

C.3 Recursive named type

xdbml
xdbml: 0.1

Type TreeNode {
  id       int
  value    varchar
  children array [child TreeNode]
}

Entity organization_chart {
  root_id int [pk]
  tree    TreeNode
}

C.4 Graph model with property edges

xdbml
xdbml: 0.1

Project social_network {
  targets: Neo4j
}

Container social [type: database, target: Neo4j] {
  Entity Person {
    id    int     [pk]
    name  varchar [not null]
    email varchar [unique]
  }

  Entity Movie {
    id    int     [pk]
    title varchar [not null]
    year  int
  }

  Edge KNOWS [source: Person, target: Person,
              source_cardinality: '0..*', target_cardinality: '0..*'] {
    since    date  [not null]
    intimacy int   [minimum: 0, maximum: 10]
  }

  Edge ACTED_IN [source: Person, target: Movie] {
    role      varchar [not null]
    billing   int     [note: 'order in credits']
  }

  Edge RATED [source: Person, target: Movie] {
    rating    int       [not null, minimum: 1, maximum: 5]
    rated_at  timestamp [granularity: second]
    review    varchar
  }
}

Appendix D. Compatibility with DBML 3.13.6

xDBML is a strict superset of DBML 3.13.6:

  1. Every valid DBML 3.13.6 document is a valid xDBML document at the lowest compatibility level (no version declaration).
  2. A document declaring xdbml: 0.1 opts into the v0.1 feature set; all DBML 3.13.6 constructs remain valid.
  3. A document declaring xdbml: 0.2 opts into the v0.2 feature set (which includes the v0.1 feature set plus the module system, scalar Named Types, and field-level imports); all DBML 3.13.6 constructs remain valid.
  4. xDBML extension constructs are not recognized in documents without a version declaration.
  5. DBML's implicit-schema syntax (Table core.users { ... }) remains valid; produces an implicit Container.
  6. DBML's Table keyword remains valid as a synonym for Entity.
  7. DBML's use and reuse module-system directives (without clone blocks, without xDBML-specific element types) behave identically in xDBML. v0.2 adds clone blocks, field-level imports, the extended element-type vocabulary, and Container-scoped imports -- all available only when the file declares xdbml: 0.2.

A DBML 3.13.6 document is upgraded to xDBML by adding xdbml: 0.2 at the top (or xdbml: 0.1 for the previous feature set). No structural changes required.

Appendix E. Sources and acknowledgments

xDBML builds on the work of Holistics and contributors to DBML (Apache License 2.0). The DBML language reference at dbml.dbdiagram.io served as the foundation for the constructs documented in §5, §7, §9, §10, §11, §15, §16, §17, §18, §25, §26.

xDBML acknowledges intellectual debt to:

  • JSON Schema -- validation vocabulary, oneOf/anyOf/allOf
  • Avro -- named records, namespaces, union semantics
  • OpenAPI -- discriminator conventions, components.schemas reuse
  • ODCS (bitol.io) -- Open Data Contract Standard: schema/contract layering
  • OSI -- Open Semantic Interchange: schema/semantic-layer boundary vocabulary
  • openCypher / Neo4j -- labeled property graph semantics; edge properties
  • RDF-star -- annotation properties on triples
  • Mermaid ER code -- cardinality notation principles
  • Hackolade Studio -- polyglot data modeling principles, AI-readiness settings, path notation

End of xDBML Specification v0.2.

Spec under Apache License 2.0 · Examples under CC0 1.0