OKF is a tiny, open standard for writing down the knowledge that normally lives in people's heads, scattered wikis, and proprietary catalogs — in a format that both humans and AI agents can read. No SDK. No database. Just folders of Markdown files.
Strip away the jargon and OKF is three plain ideas stacked together.
Each "thing you know" (a table, a dataset, a metric, a playbook) is one Markdown file. Group the files in folders, and the resulting directory tree is a bundle.
Each file starts with a small YAML frontmatter header (type, title, description…). Machines read the header; humans read the rest.
Files reference each other with ordinary Markdown links, quietly forming a knowledge graph an agent can walk.
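To make the "knowledge graph" idea concrete, here is a minimal Python sketch of how a consumer might turn ordinary Markdown links into graph edges. It is an illustration, not part of OKF: the names `LINK_RE`, `linked_concepts`, and `build_graph` are hypothetical, and it assumes links are written as bundle-root paths ending in `.md` (relative links are not resolved).

```python
import re

# Matches inline Markdown links such as [customers](/tables/customers.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_concepts(markdown_text: str) -> set[str]:
    """Return the concept paths (without .md) that a Markdown text links to."""
    targets = set()
    for href in LINK_RE.findall(markdown_text):
        # Assumption: only bundle-local .md links count as graph edges.
        if "://" in href or not href.endswith(".md"):
            continue
        targets.add(href.lstrip("/")[: -len(".md")])
    return targets


def build_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each concept to the set of concepts its file links to."""
    return {
        path[: -len(".md")]: linked_concepts(text)
        for path, text in files.items()
    }


# A tiny in-memory bundle: file path -> file contents.
bundle = {
    "tables/orders.md": "FK to [customers](/tables/customers.md).",
    "tables/customers.md": "See the [docs](https://example.com/docs).",
}
graph = build_graph(bundle)
```

An agent could then walk `graph` from any starting concept and pull in only the neighbors it needs.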
💡 Why this is clever: the same file is readable by a person in any text editor and parseable by a program — so there's no translation layer, no vendor lock-in, and the knowledge travels with your code in version control.
When an AI agent tries to answer a real data question, the answer is smeared across a dozen incompatible places. Every team rebuilds the same plumbing from scratch.
"Every agent builder is solving the same context-assembly problem from scratch, every catalog vendor is reinventing the same data models, and the knowledge itself is locked behind whichever surface created it."
Knowledge is written once in a neutral format. Humans and agents read the same files. It ships as a tarball, lives in a git repo, and survives moving between tools, teams, and companies.
OKF formalizes a pattern Andrej Karpathy described: instead of an AI re-reading raw documents every time you ask something, it maintains a living, cross-linked wiki, a compounding artifact that's compiled once and kept current.
New sources get read, summarized, and folded into existing pages — updating entities, revising summaries, flagging contradictions.
Questions are answered by searching the relevant wiki pages. Good answers can be filed back as brand-new pages.
Health checks hunt for contradictions, stale claims, orphans, and missing cross-references — keeping the knowledge honest.
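The mechanical half of those health checks is easy to sketch. The Python below is a rough illustration under stated assumptions: the graph is a plain dict of concept to linked concepts, an "orphan" is taken to mean a concept with no incoming links, and `lint_graph` is a hypothetical helper name. Spotting contradictions or stale claims needs an LLM or a human and is not attempted here.

```python
def lint_graph(graph: dict[str, set[str]]) -> dict[str, list]:
    """Report broken links and orphan concepts in a concept link graph."""
    known = set(graph)
    # A broken link points at a concept that has no file in the bundle.
    broken = sorted(
        (source, target)
        for source, targets in graph.items()
        for target in targets
        if target not in known
    )
    # Assumption: an orphan is a concept that nothing else links to.
    linked_to = {target for targets in graph.values() for target in targets}
    orphans = sorted(known - linked_to)
    return {"broken_links": broken, "orphans": orphans}


graph = {
    "tables/orders": {"tables/customers", "metrics/revenue"},
    "tables/customers": set(),
    "metrics/weekly_active_users": set(),
}
report = lint_graph(graph)
```

Here `metrics/revenue` is reported as a broken link because no such concept exists in the graph.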
"LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The bookkeeping that causes humans to abandon personal wikis is exactly what LLMs are good at."
Andrej Karpathy, in the LLM-Wiki gist that inspired OKF

The punchline: the boring part of a knowledge base isn't reading or thinking; it's bookkeeping. That's the chore humans quit and machines never tire of. OKF is the file format that makes that machine-maintained wiki portable.
A bundle is just a directory tree. Each concept is one file; folders group related concepts; an optional index.md describes what's inside.
    sales/
    ├── index.md                  # optional table of contents
    ├── log.md                    # optional update history
    ├── datasets/
    │   ├── index.md
    │   └── orders_db.md
    ├── tables/
    │   ├── index.md
    │   ├── orders.md             # one concept = one file
    │   └── customers.md
    └── metrics/
        ├── index.md
        └── weekly_active_users.md
A file's identity is just its path minus .md — so tables/orders.md is the concept tables/orders. Other files link to it by that path.
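The path-to-identity rule fits in a few lines. This Python sketch assumes paths are given relative to the bundle root with forward slashes; `concept_id` and `concept_file` are hypothetical helper names, not something OKF defines.

```python
from pathlib import PurePosixPath


def concept_id(file_path: str) -> str:
    """Turn a bundle-relative file path into a concept identity (path minus .md)."""
    path = PurePosixPath(file_path)
    if path.suffix != ".md":
        raise ValueError(f"not a Markdown file: {file_path}")
    # Drop a leading slash so link-style paths map to the same identity.
    return str(path.with_suffix("")).lstrip("/")


def concept_file(cid: str) -> str:
    """Inverse mapping: the file that holds a given concept."""
    return f"{cid}.md"
```

For example, `concept_id("tables/orders.md")` gives `tables/orders`, and `concept_file("tables/orders")` maps back to the file.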
index.md = a friendly listing for progressive disclosure. log.md = a dated history of changes. Everything else is a concept.
Pack it as a tarball, push it to GitHub (it even renders!), or drop it next to your code. No server, no runtime, no required SDK.
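Packing a bundle needs nothing beyond a standard library. As a small sketch, the Python below builds a throwaway `sales` bundle in a temporary directory and writes it to a gzipped tarball with the built-in `tarfile` module. The helper name `pack_bundle` and the file contents are made up for the example.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_bundle(bundle_dir: Path, archive_path: Path) -> list[str]:
    """Write bundle_dir into a gzipped tarball and return the member names."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps member paths relative to the bundle's own folder name.
        archive.add(bundle_dir, arcname=bundle_dir.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "sales"
    (bundle / "tables").mkdir(parents=True)
    (bundle / "tables" / "orders.md").write_text("---\ntype: BigQuery Table\n---\n")
    names = pack_bundle(bundle, Path(tmp) / "sales.tar.gz")
```

Unpacking on the other side is just as plain: extract the archive and read the files.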
Every concept file has two parts: the YAML frontmatter (the structured header machines query) and the Markdown body (the rich docs humans read).
    ---
    type: BigQuery Table
    title: Orders
    description: One row per completed customer order.
    resource: https://console.cloud.google.com/bigquery?...
    tags: [sales, revenue]
    timestamp: 2026-05-28T14:30:00Z
    ---

    # Schema

    | Column      | Type   | Description                              |
    |-------------|--------|------------------------------------------|
    | order_id    | STRING | Globally unique order id.                |
    | customer_id | STRING | FK to [customers](/tables/customers.md). |
The lines between the --- markers are frontmatter: structured fields a program can query. Everything after is normal Markdown that a human reads.
Tip: only one field is actually required — type.
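To show how little machinery the header needs, here is a hedged Python sketch that splits a concept file into its frontmatter fields and its Markdown body, then checks for `type`. It is deliberately not a real YAML parser: it only understands flat `key: value` lines and keeps every value as a raw string (so `tags` stays the text `[sales, revenue]`). A real consumer would more likely hand the header to a proper YAML library. `split_frontmatter` is a hypothetical name.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a concept file into (frontmatter fields, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with a frontmatter block")
    # Find the closing --- marker, skipping the opening one on line 0.
    end = None
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = index
            break
    if end is None:
        raise ValueError("frontmatter block is never closed")
    fields = {}
    for line in lines[1:end]:
        # Simplification: flat key: value lines only, values kept as strings.
        if ":" not in line or line.lstrip().startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


doc = """---
type: BigQuery Table
title: Orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
"""
fields, body = split_frontmatter(doc)
has_required_type = bool(fields.get("type"))
```

Everything other than `type` is optional, so a consumer should read the remaining fields with `fields.get(...)` rather than assume they exist.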
OKF is deliberately tiny. These principles are why it can be a shared standard instead of yet another platform.
Only type is required. OKF won't force a taxonomy on you — producers define their own content models and even add custom fields.
The format is the contract; tooling is swappable. A human can write a bundle an agent consumes, or an agent can write one a human reads. Neither needs the other's tools.
Never requires a proprietary account, runtime, or SDK. Vendor-neutral by design — the goal is a lingua franca knowledge can be exchanged in, like JSON or Markdown.
🧩 What OKF deliberately does NOT do: it doesn't replace domain schemas like Avro, Protobuf, or OpenAPI; it doesn't dictate where you store files; and consumers are expected to tolerate broken links and unknown types gracefully rather than crash. Forgiving by design.
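What "forgiving" could look like on the consumer side, as a rough Python sketch: unknown types fall back to a generic handler, and a broken link resolves to `None` instead of raising. The handler registry and every function name here are hypothetical; OKF does not prescribe how a consumer is structured.

```python
from typing import Callable, Optional


def render_table(fields: dict[str, str]) -> str:
    """Special handling for a type this consumer happens to know."""
    return f"Table: {fields.get('title', '(untitled)')}"


def render_generic(fields: dict[str, str]) -> str:
    """Fallback used for any type the consumer has never seen."""
    return f"{fields.get('type', 'Unknown')}: {fields.get('title', '(untitled)')}"


# Hypothetical registry: each consumer decides which types get special care.
HANDLERS: dict[str, Callable[[dict[str, str]], str]] = {
    "BigQuery Table": render_table,
}


def render_concept(fields: dict[str, str]) -> str:
    """Dispatch on type, tolerating unknown types instead of crashing."""
    handler = HANDLERS.get(fields.get("type", ""), render_generic)
    return handler(fields)


def resolve_link(target: str, known: set[str]) -> Optional[str]:
    """Return the target concept, or None for a broken link (no exception)."""
    return target if target in known else None
```

A consumer built this way keeps working when a producer invents a new type such as "Playbook" or leaves a dangling link behind.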
OKF sits between dumping raw documents at an AI (classic RAG) and burying knowledge in a proprietary catalog.
| | Classic RAG | Proprietary catalog | OKF |
|---|---|---|---|
| Human-readable | Sometimes | Behind a UI | Always (plain Markdown) |
| Machine-readable | Yes | Via vendor API | Yes (YAML frontmatter) |
| Portable / vendor-neutral | Depends | Locked in | Fully portable |
| Lives in version control | Rarely | No | Yes — diff & review like code |
| Persistent & compounding | Re-derived each query | Yes | Yes — a living wiki |
| Setup cost | Vector DB + pipeline | Onboard a platform | Make a folder |
OKF isn't just a document — it launched with open-source tooling and example bundles to prove the idea.
Walks your BigQuery datasets, drafts OKF documents automatically, and enriches them with schemas and documentation — bootstrapping a bundle for you.
Turns any OKF bundle into an interactive knowledge graph you can browse — and it needs no backend at all.
GA4 e-commerce, Stack Overflow, and Bitcoin datasets — ready-made proof-of-concept knowledge bundles.
Published as open source on GitHub and wired into Google Cloud's Knowledge Catalog, with an explicit invite for community implementations.
The handful of words you need to sound fluent in OKF.
A directory of Markdown files that together describe a body of knowledge — e.g. everything about your "sales" data. The unit you ship around.
A single thing worth knowing — a table, dataset, metric, API endpoint, or runbook — captured as one Markdown file inside a bundle.
The small YAML block at the top of a file (between --- lines) holding structured, queryable fields like type, title, and tags. The machine-readable part.
A concept's address: its file path without the .md. tables/orders.md → tables/orders. Other files link to it using this.
A free-text label categorizing the concept — "BigQuery Table", "Playbook", "API Endpoint". The one field every concept must have.
index.md is an optional human-friendly listing of a folder's contents; log.md is an optional dated history of updates. They're not concepts themselves.
A bundle that follows the rules: every non-reserved .md file has parseable frontmatter with a non-empty type, and reserved files follow their format. That's basically it.
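That rule is small enough to check in a few lines. The Python sketch below works on an in-memory mapping of path to file contents and flags every non-reserved `.md` file that lacks a closed frontmatter block with a non-empty `type`. It is an approximation under stated assumptions: the frontmatter check is hand-rolled rather than a real YAML parse, the format of the reserved files is not validated, and `frontmatter_type` and `nonconformant_files` are hypothetical names.

```python
from pathlib import PurePosixPath
from typing import Optional

RESERVED_NAMES = {"index.md", "log.md"}


def frontmatter_type(text: str) -> Optional[str]:
    """Return the frontmatter's type value, or None if missing, empty, or unclosed."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    found = None
    for line in lines[1:]:
        if line.strip() == "---":
            return found  # block is closed: report what was seen
        key, sep, value = line.partition(":")
        if sep and key.strip() == "type" and found is None:
            found = value.strip() or None
    return None  # block never closed, so treat it as unparseable


def nonconformant_files(files: dict[str, str]) -> list[str]:
    """List non-reserved .md paths that lack a non-empty frontmatter type."""
    problems = []
    for path, text in sorted(files.items()):
        name = PurePosixPath(path).name
        if not path.endswith(".md") or name in RESERVED_NAMES:
            continue
        if frontmatter_type(text) is None:
            problems.append(path)
    return problems


bundle = {
    "index.md": "# Sales bundle\n",
    "tables/orders.md": "---\ntype: BigQuery Table\n---\n# Schema\n",
    "tables/customers.md": "---\ntitle: Customers\n---\n",
    "metrics/wau.md": "No frontmatter here.\n",
}
problems = nonconformant_files(bundle)
```

An empty result means every concept file passed this simplified check.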
Because AI agents are only as good as the context you give them — and right now that context is trapped in silos. OKF makes that context writable once, readable by everyone (human or machine), and portable forever. It's not a database or a product. It's a humble, shared format — and that humility is exactly what lets it become a standard.