Schema Validation

Declarative per-collection validation rules (required fields and per-field types), enforced on insert and replace and stored in the op-log so the database stays self-describing.


OMGDB supports lightweight, declarative schema validation per collection. A collection can carry a spec that lists which fields are required and the expected type of named fields. Rules are enforced when documents are written, and they live in the op-log as a define operation — so a packed or replayed store carries its own validation rules and remains self-describing.

Validation is deliberately minimal: it covers required-field presence and exact per-field type checks, nothing more. There is no JSON-Schema-style constraint language (no patterns, ranges, enums, nested sub-schemas, or uniqueness). See data model for the value types these rules reference.

Defining a collection spec

A spec is a JSON object describing one or more collections:

{
  "collections": {
    "users": {
      "required": ["name"],
      "fields": { "age": "long" }
    }
  }
}

  • required: an array of field names that must be present on every document. Each entry must be a string.
  • fields: an object mapping a field name to its expected type-name string (see the type table). Each value must be a string.

Both keys are optional. A collection with neither required nor fields imposes no constraints.

You attach a spec to a store at creation time with --from:

omgdb create app.omgdb --from spec.json

The CLI reads spec.json, iterates the top-level collections object, and defines each entry on the new store. On success it prints how many collections were defined:

created store at app.omgdb (1 collection(s) defined)

Note: The spec file is parsed as JSON, not YAML. The top level must be a JSON object containing a collections object; otherwise omgdb create --from fails with spec must contain a 'collections' object.
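
The shape rules for a spec file can be sketched in a few lines of Python. This is an illustrative model, not OMGDB's implementation: load_spec is a hypothetical helper, and only the 'collections' error message comes from the CLI behavior described above. The other messages are invented for the sketch.

```python
import json

def load_spec(text):
    """Parse a spec and return {collection: (required, fields)}."""
    spec = json.loads(text)  # JSON only; YAML input fails to parse here
    if not isinstance(spec, dict) or not isinstance(spec.get("collections"), dict):
        raise ValueError("spec must contain a 'collections' object")
    rules = {}
    for name, body in spec["collections"].items():
        if not isinstance(body, dict):
            raise ValueError(f"{name}: collection entry must be an object")  # hypothetical message
        required = body.get("required", [])  # both keys are optional
        fields = body.get("fields", {})
        if not (isinstance(required, list) and all(isinstance(r, str) for r in required)):
            raise ValueError(f"{name}: 'required' must be an array of strings")  # hypothetical message
        if not (isinstance(fields, dict) and all(isinstance(t, str) for t in fields.values())):
            raise ValueError(f"{name}: 'fields' must map names to type-name strings")  # hypothetical message
        rules[name] = (required, fields)
    return rules

rules = load_spec('{"collections": {"users": {"required": ["name"], "fields": {"age": "long"}}}}')
print(f"{len(rules)} collection(s) defined")
```

A collection entry with neither key, such as {"collections": {"logs": {}}}, loads with an empty required list and an empty fields map, which matches the "imposes no constraints" case.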

Defines are replay-safe: each define is recorded as a define op in the log, and replaying the log rebuilds the same validation catalog. Re-defining a collection replaces its rules. Because the rules are persisted in the log, they survive reopening, packing, or compacting the store, so the database describes its own constraints.
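
To make the replay behavior concrete, here is a minimal sketch that rebuilds a catalog from a list of ops. The op layout (dicts with "op", "collection", "required", and "fields" keys) is an assumption made for illustration; the actual log format is not described on this page.

```python
def replay(log):
    """Rebuild the validation catalog from a list of ops (hypothetical op shape)."""
    catalog = {}
    for op in log:
        if op["op"] == "define":
            # A later define for the same collection replaces the earlier rules.
            catalog[op["collection"]] = {
                "required": op.get("required", []),
                "fields": op.get("fields", {}),
            }
    return catalog

log = [
    {"op": "define", "collection": "users", "required": ["name"]},
    {"op": "insert", "collection": "users", "doc": {"name": "ada"}},
    {"op": "define", "collection": "users", "required": ["name"], "fields": {"age": "long"}},
]
catalog = replay(log)
print(catalog["users"])
```

Replaying the same log twice produces the same catalog, and the second define wins over the first, which is the behavior the paragraph above describes.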

Supported field type names

A field’s expected type is given as one of the canonical type-name strings below. These are the exact tokens produced by the engine’s Value::type_name(); the spec string is compared to that token verbatim, so spelling and case must match exactly.

Type name   Value kind                   Example value
null        JSON null                    null
bool        Boolean                      true
long        64-bit signed integer        42
double      64-bit float                 3.14
string      UTF-8 string                 "ada"
binData     Binary data                  {"$binary":"..."}
array       Array                        [1, 2, 3]
object      Embedded document            {"k": 1}
objectId    12-byte ObjectId             {"$oid":"..."}
date        Timestamp (ms since epoch)   {"$date": 0}

Note: A bare JSON integer such as 36 is a long, while 36.0 is a double — they are distinct types. If you declare a field as long, a value written with a decimal point will fail validation.
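
The long versus double distinction is easy to see with a small Python model of the type table. This is a sketch, not the engine's Value::type_name(): the function below is hypothetical, it assumes that a single-key {"$oid"}, {"$date"}, or {"$binary"} wrapper marks the corresponding extended type, and it does not range-check integers against 64 bits.

```python
import json

def type_name(value):
    """Return the spec type name for a value parsed by json.loads."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: Python's bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        # Assumption: a single-key wrapper marks an extended type.
        wrappers = {"$oid": "objectId", "$date": "date", "$binary": "binData"}
        if len(value) == 1:
            key = next(iter(value))
            if key in wrappers:
                return wrappers[key]
        return "object"
    raise TypeError(f"unsupported value: {value!r}")

print(type_name(json.loads("36")))    # long
print(type_name(json.loads("36.0")))  # double
```

Because the comparison against the declared type is verbatim, "Long" or "int" in a spec would never match any of these tokens.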

How rules are enforced

Validation runs on write, at the moment a document enters a collection that has a spec:

  • Store::insert_one validates the document before it is appended to the log. A violation returns a validation error and writes nothing.
  • Inserts inside a transaction (Txn::insert_one) are validated the same way; a violation makes the closure return an error, which aborts the whole transaction.
  • Store::replace_one (and Txn::replace_one) validate the replacement document, so an update cannot move a document into an invalid state.
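
The write-path behavior above (validate first, append only on success, abort the whole transaction on any violation) can be modeled with a toy in-memory store. Everything in this sketch is hypothetical: the Python Store class, its transaction method, and the staging list are stand-ins chosen to show the all-or-nothing effect, and only a required-field check is modeled.

```python
class ValidationError(Exception):
    pass

class Store:
    def __init__(self, required):
        self.required = required  # required field names for a single collection
        self.log = []             # committed documents, in append order

    def check(self, doc):
        for name in self.required:
            if name not in doc:
                raise ValidationError(f"missing required field `{name}`")

    def insert_one(self, doc):
        self.check(doc)           # validate first; on error nothing is appended
        self.log.append(doc)

    def transaction(self, body):
        staged = []

        def insert_one(doc):
            self.check(doc)
            staged.append(doc)

        body(insert_one)          # an exception here discards every staged write
        self.log.extend(staged)   # commit only after the closure returns normally

store = Store(required=["name"])
store.insert_one({"name": "ada"})

def two_inserts(insert_one):
    insert_one({"name": "bob"})   # valid, but only staged
    insert_one({"age": 5})        # violates the rule and aborts the transaction

try:
    store.transaction(two_inserts)
except ValidationError as err:
    print("aborted:", err)

print(len(store.log))  # 1
```

The valid first insert inside the transaction is discarded along with the invalid one, so the store still holds only the document written before the transaction started.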

The check is two-part:

  1. Required fields: every name in required must be present on the document. A missing one fails with missing required field `name`.
  2. Field types: for each field present on the document that also appears in fields, the value's actual type must equal the declared type. A mismatch fails with field `age` should be `long` but is `string`.

Limitation: Type rules apply only to fields that are present. A field listed in fields but not in required is unconstrained when absent — declaring a type does not make a field mandatory. Use required for presence and fields for typing. There is no constraint on fields that appear in a document but not in the spec; extra fields are always allowed.
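
The two-part check and its limitation can be sketched as a single function. This is a model of the rules as described here, not the engine's code: first_violation and the reduced type_name mapping are hypothetical, extended types are omitted, and checking required fields before types is an assumption about ordering.

```python
def type_name(value):
    """Type name for a plain JSON value (extended types omitted for brevity)."""
    if value is None:
        return "null"
    # bool is listed before int because Python's bool is a subclass of int.
    for py_type, name in [(bool, "bool"), (int, "long"), (float, "double"),
                          (str, "string"), (list, "array"), (dict, "object")]:
        if isinstance(value, py_type):
            return name
    raise TypeError(f"unsupported value: {value!r}")

def first_violation(doc, required=(), fields=None):
    """Return the first violation message, or None if the document passes."""
    for name in required:
        if name not in doc:
            return f"missing required field `{name}`"
    for name, expected in (fields or {}).items():
        if name in doc:  # absent fields are not type-checked
            actual = type_name(doc[name])
            if actual != expected:
                return f"field `{name}` should be `{expected}` but is `{actual}`"
    return None

rules = {"required": ["name"], "fields": {"age": "long"}}
print(first_violation({"name": "ada", "age": 36}, **rules))   # None
print(first_violation({"age": 5}, **rules))                   # missing required field `name`
print(first_violation({"name": "x", "age": "old"}, **rules))  # field `age` should be `long` but is `string`
print(first_violation({"name": "x", "nick": 1}, **rules))     # None
```

The last call passes even though age is absent and nick is undeclared, which is the limitation in action: typing a field does not require it, and extra fields are ignored.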

A passing and a failing insert

Given the users spec above (required: ["name"], fields: {"age": "long"}):

# Passes: name is present, age is a long.
omgdb insert app.omgdb users '{"name":"ada","age":36}'

# Fails: missing the required field `name`.
omgdb insert app.omgdb users '{"age":5}'

# Fails: `age` is declared `long` but the value is a string.
omgdb insert app.omgdb users '{"name":"x","age":"old"}'

A passing insert prints the assigned _id (an ObjectId, e.g. {"$oid":"..."}). A failing insert is rejected with a validation error and leaves the store unchanged.

Auditing existing documents

Specs are enforced going forward, but a collection may already hold documents — for example a spec defined after data was inserted, or documents introduced by an external producer that appended to the log. The validate command scans a collection and reports every document that does not satisfy the current rules:

omgdb validate app.omgdb users

When every document is valid (or the collection has no rules) it prints a single success line:

valid: no violations in `users`

When there are offenders it prints one line per violating document — the document’s _id as canonical JSON, followed by the reason — and exits non-zero:

{"$oid":"6630c0..."}: missing required field `name`
{"$oid":"6630c1..."}: field `age` should be `long` but is `string`
2 document(s) violate validation rules

Each document is reported with the first rule it violates. validate is read-only: it never modifies documents or the log, so it is safe to run as an audit step in CI or before compaction.
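
The report format can be modeled as a read-only scan that collects one line per offending document. In this sketch the audit function, the needs_name checker, and the "a1"/"a2" ids are hypothetical, and exit code 1 is an assumption (the text above only says non-zero).

```python
import json

def audit(collection, docs, first_violation):
    """Return (report lines, exit code) for a read-only scan of docs."""
    lines = []
    for doc in docs:
        reason = first_violation(doc)  # only the first violated rule is reported
        if reason is not None:
            doc_id = json.dumps(doc["_id"], separators=(",", ":"))
            lines.append(f"{doc_id}: {reason}")
    if not lines:
        return [f"valid: no violations in `{collection}`"], 0
    lines.append(f"{len(lines)} document(s) violate validation rules")
    return lines, 1  # assumption: any non-zero code signals violations

def needs_name(doc):
    """Hypothetical stand-in for the collection's rules: `name` is required."""
    return None if "name" in doc else "missing required field `name`"

docs = [
    {"_id": {"$oid": "a1"}, "name": "ada"},
    {"_id": {"$oid": "a2"}, "age": 5},
]
lines, code = audit("users", docs, needs_name)
print("\n".join(lines))
print("exit code:", code)
```

Because the scan only reads docs and returns strings, running it never changes the data, which mirrors why validate is safe to run in CI.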

End-to-end example

# spec.json: {"collections":{"users":{"required":["name"],"fields":{"age":"long"}}}}
omgdb create app.omgdb --from spec.json
omgdb insert app.omgdb users '{"name":"ana","age":30}'
omgdb validate app.omgdb users   # -> valid: no violations in `users`

The rules survive reopening the store: because the define op is in the log, a fresh omgdb process replays it and enforces the same constraints. To see the inferred shape of a collection (independent of declared rules), use introspection via omgdb describe.

Limitations

Limitation: Validation is intentionally narrow and covers only what is listed here.

  • Only required (presence) and fields (exact type) are supported. No ranges, regex/patterns, enums, length limits, uniqueness, or conditional rules.
  • Type checks are exact on the top-level field’s value type. There is no coercion (a double is not accepted where a long is declared) and no validation of array element types or nested sub-document fields.
  • Rules apply only to insert and replace. They are not re-checked retroactively when a spec changes — use omgdb validate to find pre-existing offenders.
  • Uniqueness is not part of validation; the only uniqueness guarantee is the implicit unique _id (a duplicate _id insert is rejected separately as a duplicate-key error, not a validation error).

For more on the value types referenced by fields, see data model. For a hands-on walkthrough of creating a store and inserting documents, see the quickstart.
