OMGDB treats Markdown as a first-class input. A .md file — a spec, a note, a piece of documentation — can be imported as a single document where the YAML-ish frontmatter becomes typed, queryable top-level fields and the heading structure is captured under _sections. The raw Markdown body is stored too, so nothing is lost. This is the knowledge layer an agent reads from: notes and specs land as ordinary documents you can query by their frontmatter.
Import is exposed through the CLI command omgdb import-md. Internally it parses the file, builds a document, inserts it, and prints the new _id.
Importing a file
omgdb create app.omgdb
omgdb import-md app.omgdb docs spec.md
# -> prints the new document's _id as canonical JSON
The command takes three positional arguments: the store path, the target collection, and the path to the Markdown file. The whole file becomes exactly one document in the named collection.
| Argument | Description |
|---|---|
| <path> | Store path, e.g. app.omgdb. |
| <collection> | Collection the document is inserted into. |
| <file> | Path to the .md file to parse and import. |
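For scripted imports, the command can be wrapped from another program. The following is a minimal sketch, assuming `omgdb` is on the PATH and that the only thing it writes to stdout is the new `_id` as a JSON value. The helper names `import_md_command` and `import_md` are hypothetical and not part of OMGDB.

```python
import json
import subprocess


def import_md_command(store: str, collection: str, md_file: str) -> list[str]:
    """Build the argv for `omgdb import-md <path> <collection> <file>`."""
    return ["omgdb", "import-md", store, collection, md_file]


def import_md(store: str, collection: str, md_file: str):
    """Run the import and return the printed _id, decoded from JSON.

    Assumption: stdout contains nothing but the new document's _id.
    """
    result = subprocess.run(
        import_md_command(store, collection, md_file),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(result.stdout)


# Example (requires an existing store created with `omgdb create app.omgdb`):
# new_id = import_md("app.omgdb", "docs", "spec.md")
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.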
The document shape
A parsed Markdown file is converted into a document with three parts:
- Frontmatter fields: each `key: value` line from the frontmatter block is lifted to a top-level field, typed where possible.
- `body`: the raw Markdown after the frontmatter, stored verbatim as a string.
- `_sections`: an array of `{ level, heading, text }` objects, one per ATX heading, in document order.
Frontmatter parsing
Frontmatter is recognized only when the file begins with a literal --- fence line and a matching closing --- fence follows. Inside the block, each non-empty line is split on its first : into a key and a value. The value is parsed as JSON when possible and otherwise kept as a string:
- `views: 10` becomes an integer (I64).
- `tags: ["rag", "db"]` becomes an array.
- `title: Hello World` stays a string (unquoted prose is not valid JSON, so it falls back to a string).
Limitation: Frontmatter is a minimal line-based subset, not a full YAML parser. It splits on the first `:` per line and parses each value as JSON-or-string. Block scalars, nested maps, and multi-line values are not supported. A file with no `---` fences yields no frontmatter fields, and the entire text becomes `body`.
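The rules above can be modeled in a few lines. This is an illustrative sketch of the described behavior, not OMGDB's actual parser: `split_frontmatter` is a hypothetical name, and details the text does not specify (how lines without a `:` are treated, trailing whitespace on fence lines) are assumptions marked in the comments.

```python
import json


def split_frontmatter(text: str):
    """Return (fields, body) using the line-based rules described above."""
    lines = text.splitlines(keepends=True)
    # Frontmatter is recognized only if the first line is a literal '---'.
    if not lines or lines[0].rstrip("\r\n") != "---":
        return {}, text
    # Look for the matching closing fence.
    for i in range(1, len(lines)):
        if lines[i].rstrip("\r\n") == "---":
            break
    else:
        return {}, text  # no closing fence: everything is body

    fields = {}
    for line in lines[1:i]:
        # Assumption: blank lines and lines without ':' are skipped.
        if not line.strip() or ":" not in line:
            continue
        key, _, raw = line.partition(":")  # split on the FIRST ':' only
        raw = raw.strip()
        try:
            fields[key.strip()] = json.loads(raw)  # typed when valid JSON
        except json.JSONDecodeError:
            fields[key.strip()] = raw  # otherwise kept as a string
    return fields, "".join(lines[i + 1:])
```

Because the split happens on the first `:` only, a value such as `url: https://example.com` keeps its later colons and falls back to a string.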
Sections
Every ATX heading (# through ######) starts a new section. A line is treated as a heading only when, after trimming leading whitespace, it begins with 1–6 # characters immediately followed by a single space. Each section records its heading level, the heading text (with the leading #s stripped), and the text body up to the next heading.
Note: Despite the “section tree” phrasing, `_sections` is a flat, ordered array. Heading levels are recorded in the `level` field, but no parent/child nesting is built. Any content before the first heading is not captured in a section (it still lives in `body`).
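As a rough model of the heading rule, the sketch below splits a body into a flat list of sections. It is written under assumptions: `split_sections` is a hypothetical helper, the rule is applied purely line by line (nothing special is done for fenced code blocks), and the exact whitespace kept in `text` may differ from what the importer stores.

```python
import re

# 1-6 '#' characters followed by a space, matched after leading whitespace
# has been trimmed from the line.
HEADING = re.compile(r"(#{1,6}) (.*)")


def split_sections(body: str):
    """Return a flat, ordered list of {level, heading, text} dicts."""
    sections = []
    current = None
    for line in body.splitlines(keepends=True):
        m = HEADING.match(line.lstrip())
        if m:
            current = {
                "level": len(m.group(1)),
                "heading": m.group(2).strip(),
                "text": "",
            }
            sections.append(current)
        elif current is not None:
            current["text"] += line
        # Lines before the first heading belong to no section.
    return sections
```

Lines such as `#NoSpace` or a run of seven `#` characters do not match the pattern, so they are treated as ordinary text.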
Example
Given this spec.md:
---
title: Hello World
views: 10
tags: ["rag", "db"]
---
# Intro
the intro text
## Details
more text
Importing it produces a document of this shape:
{
"title": "Hello World",
"views": 10,
"tags": ["rag", "db"],
"body": "# Intro\n\nthe intro text\n\n## Details\n\nmore text\n",
"_sections": [
{ "level": 1, "heading": "Intro", "text": "the intro text\n" },
{ "level": 2, "heading": "Details", "text": "more text" }
]
}
Note that title came through as a string, views as a number, and tags as an array — each frontmatter line was typed according to whether its value parsed as JSON.
Querying frontmatter
Because frontmatter fields are stored as ordinary top-level fields, they are queryable like any other field — no special syntax. After import you can find on a frontmatter value:
omgdb find app.omgdb docs '{"title":"Hello World"}'
# result contains "Intro" and "_sections"
The same applies to the typed fields. A numeric frontmatter field works with the comparison operators, and an array field works with array operators:
omgdb find app.omgdb docs '{"views":{"$gte":10}}'
omgdb find app.omgdb docs '{"tags":"rag"}'
The _sections array travels with the document, so the heading structure is available to read back, and the raw body is always present for full-text or downstream processing.
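Since `_sections` is flat, any nesting has to be derived by the reader from the `level` values. One way to do that client-side is sketched below, assuming a document has already been loaded as a Python dict; `nest_sections` is a hypothetical helper, not an OMGDB API.

```python
def nest_sections(sections):
    """Turn a flat, ordered list of sections into a tree by heading level.

    Each returned node is a copy of the section with an added "children" list.
    """
    root = {"level": 0, "children": []}  # sentinel above every real heading
    stack = [root]
    for section in sections:
        node = {**section, "children": []}
        # Pop until the top of the stack is a strictly shallower heading.
        while stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


doc = {
    "_sections": [
        {"level": 1, "heading": "Intro", "text": "the intro text\n"},
        {"level": 2, "heading": "Details", "text": "more text"},
    ]
}
tree = nest_sections(doc["_sections"])
# tree[0] is "Intro", and tree[0]["children"][0] is "Details"
```

The stored document is left untouched; the tree is a derived view that can be rebuilt whenever it is needed.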
Limitations and planned work
The importer captures structure and types, but a few capabilities are explicitly follow-ups rather than current behavior:
Limitation: A byte-offset source map for sections is planned (not yet implemented). Each `Section` stores only `level`, `heading`, and `text`; there are no source byte offsets, so a section cannot be mapped back to its exact position in the original file.
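Until a source map exists, approximate positions can be recomputed client-side from the stored `body`. The sketch below is one possible workaround, assuming the heading rule described under Sections; `heading_offsets` is a hypothetical helper, and the offsets it returns are relative to `body` (the text after the frontmatter), not to the original file.

```python
import re

HEADING = re.compile(r"(#{1,6}) (.*)")


def heading_offsets(body: str):
    """Return (heading, byte_offset) pairs for each heading line in body."""
    offsets = []
    pos = 0  # byte offset of the current line in the UTF-8 encoded body
    for line in body.splitlines(keepends=True):
        m = HEADING.match(line.lstrip())
        if m:
            offsets.append((m.group(2).strip(), pos))
        pos += len(line.encode("utf-8"))
    return offsets
```

Offsets are counted in UTF-8 bytes rather than characters, so non-ASCII text before a heading shifts its offset accordingly.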
Limitation: Addressable section edits (`setSection` / `patchFrontmatter`) are planned (not yet implemented). `_sections` is a plain array snapshot with no stable per-section identity, so there is no API to surgically rewrite a single heading-delimited section in place.
Note: Heading-aware chunking into embeddings is not part of import. Embedding-based retrieval over sections is provided by the vector layer — see vector search.
See also
- Data model — how documents, types, and field paths work, including the conventions imported Markdown follows.
- Vector search — embedding and retrieval over imported content, including section-level chunking.
- Query operators — filtering on the frontmatter fields produced by import.