Three complete tasks, start to finish, using only commands the omgdb binary actually implements. Each recipe stands alone: run it in an empty directory and every step works as shown. If you haven’t installed OMGDB yet, start with install & build.
Note: As everywhere in the CLI, documents, filters, and updates are passed as JSON strings — on a POSIX shell, single-quote them so the shell does not eat the inner double quotes.
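The quoting rule is easy to see without the database at all. This sketch uses only `printf` to show what a program receives when the same JSON filter is passed with and without single quotes; the filter `{"group":7}` is simply the one used later on this page.

```shell
# Single-quoted: the JSON reaches the program byte-for-byte.
printf '%s\n' '{"group":7}'
# prints {"group":7}

# Unquoted: the shell removes the double quotes before the program runs,
# so what arrives is no longer valid JSON.
printf '%s\n' {"group":7}
# prints {group:7}
```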
Load 50,000 documents in about two seconds
import-jsonl streams a JSON Lines file into a collection in atomic batches — each batch is one transaction and one fsync. That amortization is the whole trick: durable writes are fsync-bound, so sharing one fsync across thousands of documents is how a bulk load goes fast.
First, generate a 50,000-line JSONL file (any tool works; here seq + awk):
seq 1 50000 | awk '{ printf "{\"n\":%d,\"name\":\"user-%d\",\"group\":%d}\n", $1, $1, $1 % 10 }' > users.jsonl
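Before importing, it can be worth confirming the file has the shape you expect. The sketch below repeats the generation command so it runs on its own, then checks the line count and the first and last lines with standard tools only.

```shell
# Same generator as above, repeated so this block is self-contained.
seq 1 50000 | awk '{ printf "{\"n\":%d,\"name\":\"user-%d\",\"group\":%d}\n", $1, $1, $1 % 10 }' > users.jsonl

# One JSON object per line, 50,000 lines in total.
awk 'END { print NR }' users.jsonl
# prints 50000

head -n 1 users.jsonl
# prints {"n":1,"name":"user-1","group":1}

tail -n 1 users.jsonl
# prints {"n":50000,"name":"user-50000","group":0}
```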
Create a store and import the file. --batch-size sets how many documents share each atomic commit (default 1000):
omgdb create app.omgdb
# created store at app.omgdb
omgdb import-jsonl app.omgdb users users.jsonl --batch-size 5000
# {"inserted":50000,"batches":10}
On a development machine this takes about 2 seconds. Every line must be a JSON object; a malformed line fails with its line number, and only complete batches are committed — a failed import never leaves a half-applied batch.
Now prove the store is consistent. verify re-reads the op-log and checks that replaying it reproduces the live state, including every derived cache — the whole proof runs in about a second at this size:
omgdb verify app.omgdb
# OK: 50020 record(s), 50000 document(s) in 1 collection(s); log reproduces state
The record count is honest bookkeeping: 50,000 inserts plus a begin/commit marker pair for each of the 10 batches. From here, add an index and query as usual:
omgdb create-index app.omgdb users group
# created index on `users.group`
omgdb find app.omgdb users '{"group":7}' --limit 2
# {"_id":{"$oid":"..."},"n":7,"name":"user-7","group":7}
# {"_id":{"$oid":"..."},"n":17,"name":"user-17","group":7}
See indexes for what the planner can and cannot accelerate.
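The numbers in this recipe can be cross-checked with shell arithmetic alone. This is a sketch, not part of omgdb: the variable names are illustrative, and rounding the batch count up assumes a partial final batch would be committed as its own batch (here the division is exact, so the assumption does not matter).

```shell
docs=50000
batch_size=5000

# Round up; assumption: a partial final batch still counts as one batch.
batches=$(( (docs + batch_size - 1) / batch_size ))

# Each document is one record, plus a begin/commit marker pair per batch.
records=$(( docs + 2 * batches ))
echo "batches=$batches records=$records"
# prints batches=10 records=50020

# Documents in group 7 are exactly those with n % 10 == 7.
seq 1 "$docs" | awk '$1 % 10 == 7 { c++ } END { print c }'
# prints 5000
```

If verify reports a different record count after your own import, the batch size and document count are the first two things to recheck.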
Give your agent a searchable knowledge base
A folder of Markdown notes becomes a queryable, semantically searchable collection your coding agent can retrieve from — locally, with no embedding service. Frontmatter becomes queryable fields, headings become a section tree under _sections, and the full text lands in the body field.
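If you have no notes folder to hand, a single hypothetical note is enough to try the recipe. The file name, title, and body text below are made up for illustration, and the `---` delimited YAML-style frontmatter is an assumption about what import-md accepts; check it against your own notes.

```shell
mkdir -p notes

# Hypothetical sample note: frontmatter, a title heading, and two sections.
cat > notes/deploy-runbook.md <<'EOF'
---
title: Deploy runbook
tags: ["ops"]
---
# Deploy runbook

## Before you deploy

Check that the release checklist is complete.

## Deploying to production

Run the deploy job and watch the dashboards.
EOF
```

Under the mapping described above, title and tags would become queryable fields, the headings would appear under _sections, and the text would land in body.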
Import the folder with a shell loop (import-md takes one file at a time and prints each new document’s _id):
omgdb create kb.omgdb
for f in notes/*.md; do omgdb import-md kb.omgdb notes "$f"; done
# {"$oid":"018f...a1"}
# {"$oid":"018f...a2"}
# ...
Persist embeddings for the body field. vsync is incremental: it embeds only documents whose stored vector is missing or stale, so re-running it after edits is cheap, and searches reuse the persisted vectors instead of re-embedding:
omgdb vsync kb.omgdb notes body
# synced 42 embedding(s) into `notes.__vectors` (fresh ones skipped)
Search semantically, and combine with a structured pre-filter for hybrid search. If your notes carry frontmatter like tags: ["ops"], array-contains semantics make the filter natural:
omgdb vsearch kb.omgdb notes body "how do we deploy to production" --k 3 --filter '{"tags":"ops"}'
# 0.6412 {"_id":{"$oid":"..."},"title":"Deploy runbook","tags":["ops"],"body":"...","_sections":[...]}
# 0.4108 {"_id":{"$oid":"..."},"title":"Release checklist","tags":["ops"],...}
# ...
For handing context to the agent, build a token-budgeted pack instead: the most relevant chunks, best-first, cut off at the budget, each carrying its source _id as a citation:
omgdb context kb.omgdb notes body "draft the deploy runbook" --budget 800 --filter '{"tags":"ops"}'
# {"query":"draft the deploy runbook","budgetTokens":800,"usedTokens":763,"truncated":true,
# "chunks":[{"id":{"$oid":"..."},"score":0.6412,"tokens":512,"text":"..."},...],
# "citations":[{"$oid":"..."},...]}
To let the agent drive this itself, point an MCP server at the store — vsearch and context_pack are available even at read-only scope:
omgdb mcp --scope read
Note: The bundled embedder is a deterministic offline baseline, not a neural model, so relevance is approximate. See vector search and context packs for the honest details.
Make a bulk change you can undo
Bulk updates are where a bad filter does the most damage, so OMGDB splits them into plan → apply → rollback: a dry run you can inspect, an atomic commit by token, and an undo that restores the recorded before-state. Full background in agent-safe mutations.
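Starting from an empty directory again (this recipe reuses the store name app.omgdb from the first one), seed a collection where each document holds an array of tasks: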
omgdb create app.omgdb
omgdb insert app.omgdb projects '{"project":"atlas","tasks":[{"name":"write spec","status":"todo"},{"name":"review","status":"done"}]}'
omgdb insert app.omgdb projects '{"project":"atlas","tasks":[{"name":"ship","status":"todo"}]}'
Plan the change: mark every todo task as done across the whole project. The $[t] placeholder in the update path is bound by --array-filters, so only matching array elements are touched. Nothing is written — this is a dry run:
omgdb plan-update app.omgdb projects '{"project":"atlas"}' \
'{"$set":{"tasks.$[t].status":"done"}}' \
--array-filters '[{"t.status":"todo"}]'
# {"token":"<token>","matched":2,"sampleBeforeAfter":[
# {"before":{...,"tasks":[{"name":"write spec","status":"todo"},...]},
# "after":{...,"tasks":[{"name":"write spec","status":"done"},...]}},
# ...]}
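When scripting the workflow, the token and the match count can be pulled out of the plan output with sed. This is a minimal sketch that assumes the output starts exactly as shown above, with "token" and then "matched" as the first two fields; the sample string and its token value are hypothetical stand-ins, and a JSON-aware tool would be more robust if the field order ever differs.

```shell
# Hypothetical stand-in for plan-update output; a real token will differ.
# In practice: plan_output=$(omgdb plan-update ...)
plan_output='{"token":"tok-example","matched":2,"sampleBeforeAfter":[]}'

# Anchor at the start of the line so a "token" field inside a document
# cannot be picked up by mistake.
token=$(printf '%s\n' "$plan_output" | sed -n 's/^{"token":"\([^"]*\)".*/\1/p')
matched=$(printf '%s\n' "$plan_output" | sed -n 's/^{"token":"[^"]*","matched":\([0-9]*\).*/\1/p')

echo "token=$token matched=$matched"
# prints token=tok-example matched=2
```

Holding the token in a variable also avoids retyping it in the later apply and rollback steps.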
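Before applying, read the plan itself. It lives under pending/ as plain NDJSON, one line per matched document, holding both the original and the proposed version. Substitute the token printed by plan-update for <token>; typed literally, the angle brackets would be parsed by the shell as redirections: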
cat app.omgdb/pending/<token>.ndjson
# {"ns":"projects","id":{"$oid":"..."},"before":{...},"after":{...}}
# {"ns":"projects","id":{"$oid":"..."},"before":{...},"after":{...}}
Note the review task stays done-as-it-was: the array filter only rewrote elements whose status was todo. Apply the plan by its token — one transaction, all-or-nothing:
omgdb apply app.omgdb <token>
# {"changeId":"<token>","applied":2}
On success the plan file moves from pending/ to changes/, where it becomes the change journal: a readable NDJSON record of exactly what changed, before and after, for every touched document. That journal is what makes the undo trustworthy — you can cat precisely what a rollback will restore:
cat app.omgdb/changes/<token>.ndjson
# the same {ns, id, before, after} lines, now an audit-and-undo record
omgdb rollback app.omgdb <token>
# rolled back 2 document(s)
Every task is back to its recorded before-state. The same three steps are exposed to agents as MCP tools (plan_update, apply, rollback) at read-write scope, which is what makes the pattern practical: an agent plans, a human (or a stricter agent) reviews the sample and the count, and only then does anyone apply. See agent-safe mutations and update operators.
Where to next
- Quickstart — the guided tour of every core command.
- CLI reference — every command and flag in one place.
- MCP server — hand the store to an agent with enforced capability scopes.
- Storage & op-log — why all of the above is rebuildable from one text file.