---
title: "The Karpathy-Style Wiki: A Knowledge Base Your AI Maintains"
description: "A Karpathy-style wiki is a plain-Markdown knowledge base an AI agent writes and maintains — not a vector store. Here's how it works, and how to run one with Claude and MDflow."
author: "MDflow"
date: 2026-06-30
reading_time: "16 min"
canonical_url: https://mdflow.cz/blog/karpathy-style-wiki
md_url: https://mdflow.cz/blog/karpathy-style-wiki.md
---

# The Karpathy-Style Wiki: A Knowledge Base Your AI Maintains

*Published June 30, 2026 · 16 min read*


Every personal wiki dies the same way. You start it with enthusiasm, write a dozen beautiful interlinked notes, and then real life arrives. A page goes stale. A cross-reference breaks. You stop trusting it, so you stop opening it, so it rots. The bottleneck of a knowledge base was never *writing* it — it was the relentless, boring upkeep that humans quietly abandon.

In early April 2026, Andrej Karpathy described a way around that, and it went viral almost immediately. The fix is not a better note-taking app. It is a change of author: **let the LLM keep the wiki.** You supply the sources and the questions; the model does the writing, the linking, and the maintenance. The community quickly gave the pattern a name — the **Karpathy-style wiki** — and started building tooling around it.

> **TL;DR** — A Karpathy-style wiki is a personal knowledge base stored as plain Markdown that an AI agent writes and maintains for you. You curate immutable sources and ask good questions; the model distills them into a cross-linked wiki that *compounds* over time, instead of re-deriving answers from scratch the way RAG does. [MDflow](/) is a natural home for one: Markdown-native storage, folder descriptions as the schema layer, [`mdflow_get_context`](/docs/mcp) for retrieval, and automatic version history as a safety net for every AI edit.

## What is a Karpathy-style wiki?

**A Karpathy-style wiki is a personal knowledge base, stored as plain Markdown, that an LLM writes and maintains on your behalf — not a vector database you query with retrieval-augmented generation.** The human curates source material and asks questions; the model reads those sources and produces an interlinked set of Markdown articles that it keeps current over time.

The idea is Karpathy's own, and recent. In early April 2026 he posted that a growing share of his own LLM usage had shifted away from code: *"a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating [knowledge]."* When that post went viral, he followed up with a "possibly slightly improved version" in the form of an **"idea file"** — a GitHub gist named `llm-wiki.md` — and explained the logic of sharing it that way: *"in this era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes & builds it for your specific needs."* The instruction at the bottom is the whole pitch in one line: *"give this to your agent and it can build you your own LLM wiki and guide you on how to use it."*

The design in that gist has **three layers**:

```text
my-wiki/
  CLAUDE.md            # 3. schema — structure, conventions, how to ingest & link
  raw/                 # 1. immutable sources: the LLM reads, never edits
    2026-paper-xyz.pdf
    interview-notes.md
    dataset.csv
  wiki/                # 2. the wiki: the LLM writes & maintains all of this
    index.md
    topics/
      retrieval.md     #    interlinked articles that compound over time
      evaluation.md
```

1. **Raw sources** — your curated collection of documents, papers, images, and data files. As Karpathy puts it, *"these are immutable — the LLM reads from them but never modifies them."* This is your ground truth.
2. **The wiki** — the interlinked Markdown pages the LLM generates and keeps current. The defining inversion of the whole method lives here: *"You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions."*
3. **A schema document** — a config file (Karpathy's example is a `CLAUDE.md`) that tells the agent how the wiki is structured: the directory layout, page formats, linking conventions, how to ingest a new source, and how to handle contradictions. Karpathy is explicit that there is no one true layout — *"the exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice."*
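
As a rough illustration of the three layers, here is a minimal Python sketch that scaffolds the layout shown above on a local disk. The function name `scaffold_wiki` and the starter schema text are assumptions made for this example; they do not come from Karpathy's gist, which leaves the exact structure up to you.

```python
from pathlib import Path

# Hypothetical starter schema; a real one would be tailored to your domain.
SCHEMA = """# Wiki schema
- raw/ holds immutable sources: read them, never edit them.
- wiki/ holds the pages you write and maintain.
- Link related pages and keep wiki/index.md current.
"""


def scaffold_wiki(root: Path) -> list[str]:
    """Create the three layers under root and return every path that now exists."""
    (root / "raw").mkdir(parents=True, exist_ok=True)              # 1. sources
    (root / "wiki" / "topics").mkdir(parents=True, exist_ok=True)  # 2. the wiki
    starters = {
        root / "CLAUDE.md": SCHEMA,                                # 3. schema
        root / "wiki" / "index.md": "# Index\n",
    }
    for path, body in starters.items():
        if not path.exists():  # never overwrite an existing wiki
            path.write_text(body, encoding="utf-8")
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))
```

Calling `scaffold_wiki(Path("my-wiki"))` on an empty directory produces `CLAUDE.md`, `raw/`, `wiki/index.md`, and `wiki/topics/`. Running it again leaves existing files alone.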

The sharpest way to understand it is by contrast with RAG. Retrieval-augmented generation re-synthesizes an answer from raw chunks *on every single query* and remembers nothing afterward. A Karpathy-style wiki does the synthesis **once**, writes it down, and then accumulates: *"the wiki is a persistent, compounding artifact."* Ask a related question next week and the agent extends an existing page instead of starting over.
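
To make "extends an existing page" concrete, here is a small sketch of the kind of update an agent performs: adding one claim under an existing section of a page rather than regenerating the page. The helper name `upsert_claim` and the `## heading` plus bullet-list page format are assumptions for illustration, not a convention from the gist.

```python
def upsert_claim(page: str, heading: str, claim: str) -> str:
    """Add a claim under `## heading`, creating the section if it is missing."""
    lines = page.rstrip("\n").split("\n") if page.strip() else []
    bullet = f"- {claim}"
    if bullet in lines:
        return page  # already recorded: ingesting the same claim twice changes nothing
    marker = f"## {heading}"
    if marker in lines:
        start = lines.index(marker) + 1
        # The section ends at the next level-2 heading, or at the end of the page.
        end = next(
            (i for i in range(start, len(lines)) if lines[i].startswith("## ")),
            len(lines),
        )
        while end > start and lines[end - 1] == "":
            end -= 1  # insert before the blank lines that close the section
        lines.insert(end, bullet)
    else:
        if lines:
            lines.append("")
        lines += [marker, "", bullet]
    return "\n".join(lines) + "\n"
```

The existing content of the page is preserved and only the new bullet is added, which is the accumulation behaviour that a query-time RAG pipeline does not have.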

A note on provenance, because the web is already full of misattributions. The "Karpathy-style wiki" label is a **community coinage** for a pattern Karpathy described and popularized in 2026 — not a product he shipped, and *not* a repackaging of his 2017 [*Software 2.0*](https://karpathy.medium.com/software-2-0-a64152b37c35) essay, which is about neural-network weights and never mentions Markdown, wikis, or agents. The wiki sits much later in his timeline, downstream of the mental models he is better known for: LLMs as *"the kernel process of a new Operating System"* (the "LLM OS," September 2023), *"the Decade of Agents"* and *"BUILD FOR AGENTS"* from his June 2025 *Software Is Changing (Again)* talk, and his *"+1 for 'context engineering' over 'prompt engineering'"* the same month. The LLM wiki is what context engineering looks like aimed at your own knowledge instead of a coding task.

## Why a Karpathy-style wiki works

It works because it moves the one job humans are worst at — diligent, never-ending maintenance — to a worker that does not get bored doing it. An LLM re-links a page and fixes cross-references every time you add a source, so the wiki actually stays alive. The benefits split cleanly between the people who keep it and the agents that read it.

### For developers and knowledge workers

- **It is plain text you own.** The wiki is just Markdown files. It renders on GitHub, opens in any editor, lives in git, diffs in a pull request, and is grep-able by every tool you already use. There is no proprietary format to be locked into and no "export" button that loses half your structure.
- **Knowledge becomes a compounding asset.** Each source you add makes every future answer better, because the agent folds it into pages that persist. A traditional chat history evaporates; a wiki accrues.
- **Your job gets more leveraged, not eliminated.** You stop transcribing and start doing the parts only you can do — choosing what is worth ingesting, asking sharper questions, and spotting what is missing. The model handles the typing.
- **It is model-agnostic.** Curated Markdown reads the same whether the agent behind it is Claude, GPT, or whatever ships next quarter — you are not betting your second brain on one vendor's roadmap.

### For AI agents

- **It is memory that survives a context reset.** This is the same technique Anthropic documents in [*Effective context engineering for AI agents*](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents): an agent that *"regularly writes notes persisted to memory outside of the context window"* — "like Claude Code creating a to-do list, or your custom agent maintaining a `NOTES.md` file" — can, after a reset, *"read its own notes and continue,"* building knowledge bases over time and maintaining state across sessions. A Karpathy-style wiki is that idea, made the centerpiece instead of a side effect.
- **Curated beats re-derived.** When an agent reads a distilled, cross-linked page, it gets meaning someone already got right. When it stitches together loose retrieved fragments, it has to guess how they fit, and those guesses are where hallucinations creep in.
- **Provenance is traceable.** When the schema requires citations, every claim in the wiki can link back to a specific file in `raw/`, so an answer is auditable down to a human-readable source rather than an opaque embedding. And navigation is just links: `index.md` files and Markdown cross-links let an agent traverse the graph the way a person clicks through a wiki.
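
Traceability is only useful if the links actually resolve. The sketch below audits a local wiki for two problems: pages that cite nothing in `raw/`, and links into `raw/` that point at a missing file. It assumes citations are written as relative Markdown links (for example `[notes](../../raw/interview-notes.md)`); that convention and the name `audit_provenance` are assumptions for this example.

```python
import re
from pathlib import Path

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")  # matches [text](target)


def audit_provenance(root: Path) -> dict[str, list[str]]:
    """Report wiki pages with no raw/ citation and raw/ links that do not resolve."""
    raw = (root / "raw").resolve()
    report: dict[str, list[str]] = {"uncited": [], "broken": []}
    for page in sorted((root / "wiki").rglob("*.md")):
        name = page.relative_to(root).as_posix()
        cited = False
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            if "://" in target or target.startswith("#"):
                continue  # external URLs and in-page anchors are out of scope
            resolved = (page.parent / target.split("#")[0]).resolve()
            if raw in resolved.parents:  # the link points somewhere under raw/
                cited = True
                if not resolved.exists():
                    report["broken"].append(f"{name} -> {target}")
        if not cited:
            report["uncited"].append(name)
    return report
```

Navigation pages such as `index.md` will show up as uncited, which is expected. The list is a prompt for review, not a verdict.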

None of this makes the wiki infallible. Garbage sources produce a garbage wiki, and an LLM left fully unsupervised can entrench an early mistake across many pages — which is exactly why version history and the occasional human review (covered below) matter so much for this pattern. The honest framing is the one Karpathy gives: the human is still responsible for sourcing and for asking the right questions. The model is responsible for the upkeep.

## What belongs in a Karpathy-style wiki

The pattern shines anywhere knowledge is worth keeping but tedious to maintain by hand. A few categories benefit immediately:

1. **Personal research and "second brain" notes.** The original use case: papers, articles, interviews, and half-formed ideas distilled into topic pages that grow as your reading does — without you copy-pasting a thing.
2. **Engineering team knowledge.** Runbooks, architecture decision records, and "how does this system actually work" pages — knowledge that already wants to be Markdown in a repo, and that an on-call agent can both read and help keep current.
3. **Project memory for coding agents.** The `AGENTS.md` / `CLAUDE.md` file is the schema layer; alongside it, a wiki of accumulated decisions, gotchas, and conventions gives the agent durable memory across sessions.
4. **Literature and domain digests.** Point the agent at a folder of PDFs and let it produce a cross-linked map of a field — definitions, open questions, who-said-what — that compounds with each new paper.
5. **Customer and product knowledge.** Support transcripts, policies, and troubleshooting steps turned into a governed, citable wiki a copilot reads from, instead of a brittle prompt full of pasted help-center articles.

The common thread: anywhere you would *want* a wiki but never manage to keep one, an agent-maintained version finally makes it stick.

## How to run a Karpathy-style wiki with Claude and MDflow

Running Karpathy's method in practice takes four things: a place where the wiki lives as Markdown, an agent that can both **read and write** it, a way to retrieve the right pages on demand, and a safety net for the edits an AI makes on its own. [MDflow](/) provides all four and connects to Claude over the [Model Context Protocol](/docs/mcp), so "give this to your agent" is a single integration, not a pile of scripts.

### Set it up with Claude: the exact steps

Once Claude is connected, you can build and run the whole wiki from the chat box. Here is the end-to-end path.

**Step 1 — Create the wiki's home and a token.** [Start free](/login), then create a **Personal Access Token** in [settings](/settings) — it starts with `mdf_`. That token is what lets Claude act as the wiki's maintainer, so keep it secret: it grants read and write access to your workspace.

**Step 2 — Connect Claude to MDflow over MCP.** Point Claude at MDflow's hosted [MCP server](/docs/mcp) — no local process, no repo to clone. In **Claude Code**, that is one command:

```bash
claude mcp add --transport http mdflow https://mdflow.cz/api/mcp \
  --header "Authorization: Bearer mdf_your_token_here"
```

On **Claude Desktop**, add the same hosted server through the `npx mcp-remote` bridge (the full config is on the [MCP docs page](/docs/mcp)). Either way, Claude now has the `mdflow_*` tools and can read and write your workspace.
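
If you would rather not paste the token into a terminal by hand, a small wrapper can assemble the same Claude Code command from a value held elsewhere. This is a sketch only: the helper name `mcp_add_command` is made up for this example, and the check on the `mdf_` prefix simply mirrors the token format described in Step 1.

```python
MCP_URL = "https://mdflow.cz/api/mcp"  # hosted endpoint from the command above


def mcp_add_command(token: str) -> list[str]:
    """Build the argument list for the `claude mcp add` command shown above."""
    if not token.startswith("mdf_"):
        raise ValueError("expected an MDflow Personal Access Token starting with 'mdf_'")
    return [
        "claude", "mcp", "add",
        "--transport", "http",
        "mdflow", MCP_URL,
        "--header", f"Authorization: Bearer {token}",
    ]
```

One way to use it is `subprocess.run(mcp_add_command(os.environ["MDFLOW_TOKEN"]), check=True)`, where `MDFLOW_TOKEN` is a hypothetical environment variable name. Passing the arguments as a list avoids shell quoting problems, but the token is still visible to the process itself, so treat it as a secret either way.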

**Step 3 — Lay out the three layers.** This is Karpathy's "give this to your agent" moment — let Claude build the skeleton. A literal prompt:

```text
Using mdflow, create a workspace "Research". Inside it create a folder
"raw" (described as immutable source documents the LLM reads but never
edits) and a folder "wiki" (described as the interlinked pages the LLM
writes and maintains). Add a wiki page "index" that links to topic pages.
```

Claude calls `mdflow_create_workspace`, then `mdflow_create_folder` for each layer, then `mdflow_create_document` for the index. The **description you give each folder is the schema layer**. In MDflow it doubles as the *primary ranking signal* retrieval uses later, so write it with care:

```yaml
# Folder: wiki/retrieval
description: >
  Working knowledge on retrieval systems for our search team:
  embedding models we have benchmarked, chunking strategies and
  their trade-offs, eval results, and open questions.
```

A separate [workspace](/faq) per subject ("Research" apart from "Ops") keeps each agent's context clean — which matches Karpathy's point that the right structure is domain-dependent.

**Step 4 — Feed it sources and let Claude update the wiki.** Add raw material to the `raw` folder: paste it, capture it with the [Web Clipper](/clipper), or hand Claude a link and let it create the raw document. Then ask it to ingest:

```text
Read the new documents in Research/raw on mdflow, then create or update
the pages in Research/wiki: pull out the key claims, cross-link related
pages, and refresh the folder description and index. Never edit raw.
```

Claude reads with `mdflow_get_document`, writes pages with `mdflow_create_document` and `mdflow_update_document_body`, and keeps the schema honest with `mdflow_update_folder_description`. Re-run it whenever you add material — the wiki **compounds** instead of starting over.
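
"Never edit raw" is an instruction, and instructions can be missed. If you also keep a local copy of the sources, as in the `my-wiki/` layout earlier, one cheap check is to hash every file before an ingest run and compare afterwards. The sketch below assumes sources on a local disk; the function names are illustrative and are not part of MDflow.

```python
import hashlib
from pathlib import Path


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir to the SHA-256 digest of its contents."""
    return {
        p.relative_to(raw_dir).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_sources(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List sources that were edited or deleted; newly added sources are allowed."""
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Take `before = snapshot(raw_dir)` ahead of the ingest prompt and call `changed_sources(before, snapshot(raw_dir))` after it. An empty list means the immutable layer stayed immutable.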

**Step 5 — Ask questions, and review what the AI wrote.** Query the wiki in plain language:

```text
Use mdflow to get context on how our chunking strategies compare,
and answer from the wiki.
```

Claude calls [`mdflow_get_context`](/docs/mcp), which ranks **folder descriptions first**, then titles, and returns only the two or three best-matching pages — not the whole workspace. And because an autonomous writer occasionally gets something wrong, **every AI edit lands in [version history](/blog/version-control-for-documents)** automatically — editor, API, and agent alike — with line diffs and one-click restore. Open a page's history, see exactly what Claude changed, and roll back if needed. That review-and-revert safety net is what makes trusting the model with the upkeep reasonable.
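
To show why the folder description matters so much, here is a toy version of "descriptions first, then titles". It is not MDflow's actual ranking code, which this article does not describe in detail; the word-overlap scoring, the `Page` type, and `rank_pages` are all assumptions chosen to keep the example small.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    folder_description: str


def words(text: str) -> set[str]:
    """Lowercase alphanumeric tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_pages(query: str, pages: list[Page], limit: int = 3) -> list[Page]:
    """Return up to `limit` pages: description matches first, title matches second."""
    q = words(query)
    scored = []
    for page in pages:
        # Tuples compare left to right, so description overlap dominates
        # and title overlap only breaks ties.
        score = (len(q & words(page.folder_description)), len(q & words(page.title)))
        if score != (0, 0):  # drop pages that match nothing
            scored.append((score, page))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [page for _, page in scored[:limit]]
```

In this sketch a page in a well-described folder outranks a page whose title happens to contain a query word but whose folder description is unrelated, which is the behaviour the step above relies on.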

Underneath, everything stays portable and governed. Every page is plain Markdown, and any page you share gets a raw `.md` twin (YAML frontmatter, open CORS) that an agent can fetch and cite in one request. The same operations are available over the [HTTP API](/docs/api) if you would rather script the loop. [Collections](/faq), private and public sharing, anchored comments, and optional client-side encryption keep the wiki controlled as it grows.
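
Once an agent or script has fetched a raw `.md` twin, separating the YAML frontmatter from the body takes only a few lines. The sketch below handles just flat `key: value` frontmatter, which is an assumption about the shape of the file; anything nested should go through a real YAML parser instead. The name `split_frontmatter` is illustrative.

```python
def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (flat frontmatter fields, body)."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown  # no frontmatter block at the top
    try:
        end = lines.index("---", 1)  # the closing delimiter
    except ValueError:
        return {}, markdown  # unterminated block: treat it all as body
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URLs and titles containing
        # colons stay intact in the value.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```

The returned fields give an agent something concrete to cite, such as a title or canonical URL, alongside the page body.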

### Where we are headed

Typing in the style of Google's [Open Knowledge Format (OKF)](/blog/google-open-knowledge-format-okf) and a few retrieval upgrades would make the pattern even sharper. The following is **direction, not a dated commitment**, but it is the shape of our thinking:

- **Typed documents and tags** — adopting fields like `type` and `tags` so an agent can filter the wiki ("only the runbooks in this folder"), not just rank it.
- **Agent-assisted curation** — letting Claude propose folder descriptions, cross-links, and a starting schema for knowledge you already have, so the setup step is itself agent-driven.
- **A collections API and richer remote MCP** — serving a whole wiki to an agent as one cross-linked bundle in a single call, instead of page by page.
- **Capture-to-knowledge** — the [Web Clipper](/clipper) already turns a web page into clean Markdown; the next step is dropping a clipped source straight into the `raw/` layer of a wiki for the agent to ingest.

## The bottom line

The Karpathy-style wiki is a small idea with a large consequence: stop being the author of your knowledge base and become its editor instead. Curate the sources, ask the questions, and let the model do the writing and the endless upkeep that killed every wiki you ever started. What you get back is the thing personal knowledge management always promised and rarely delivered — a living, cross-linked, *compounding* record that gets more valuable every time you feed it, not more stale.

MDflow is where that wiki can live for people and agents at once: write or clip Markdown, give your folders meaning, connect Claude over MCP to read and maintain the pages, and lean on automatic version history to keep an autonomous writer honest. Karpathy's "give this to your agent" step, with somewhere real for the wiki to live.

[Start free](/login) · [Connect an AI agent](/docs/mcp) · [Read the API docs](/docs/api)

## Frequently asked questions

### What is a Karpathy-style wiki?

A Karpathy-style wiki is a personal knowledge base stored as plain Markdown files that an AI agent writes and maintains, instead of a vector database you query with RAG. Andrej Karpathy described the pattern in a viral post in early April 2026 and a follow-up "idea file": you curate immutable source documents and ask questions, and the LLM builds and keeps an interlinked wiki that compounds over time. The human sources and explores; the model does the writing and the bookkeeping.

### Is a Karpathy-style wiki the same as RAG?

No. Retrieval-augmented generation re-synthesizes an answer from raw chunks on every query and keeps nothing. A Karpathy-style wiki is a persistent, compounding artifact: the LLM distills sources into curated, cross-linked Markdown pages once, then reads and extends them over time. They are complementary — you can still retrieve over the wiki — but the wiki itself is durable, human-readable, and traceable to specific files rather than opaque embeddings.

### Did Andrej Karpathy invent the LLM wiki?

Karpathy popularized it. In early April 2026 he posted that a large share of his LLM usage had shifted from manipulating code to manipulating knowledge, then published an "idea file" (a GitHub gist, `llm-wiki.md`) describing a three-layer design: immutable raw sources, an LLM-maintained Markdown wiki, and a schema document. The community named it the "Karpathy-style wiki" or "LLM wiki pattern" and built tooling around it. Karpathy stresses the exact structure is domain-dependent.

### Do I write a Karpathy-style wiki myself?

Rarely. In Karpathy's framing you "never (or rarely) write the wiki yourself — the LLM writes and maintains all of it," while you stay in charge of sourcing material, exploring, and asking the right questions. Your job shifts from note-taking to curation and good questions; the model handles ingestion, cross-linking, and upkeep — the bookkeeping that makes humans abandon their wikis.

### How do I run a Karpathy-style wiki with Claude and MDflow?

Connect Claude to MDflow over its MCP server with a Personal Access Token. Store the wiki as Markdown documents in folders whose descriptions act as the schema and ranking layer, let Claude read sources with `mdflow_get_document` and write pages with `mdflow_create_document` and `mdflow_update_document_body`, keep the schema current with `mdflow_update_folder_description`, and retrieve answers with `mdflow_get_context`. Automatic version history captures every AI edit with a line diff and one-click restore, so an agent maintaining the wiki stays safe to trust.

## Further reading

- Andrej Karpathy — ["idea file" gist, `llm-wiki.md`](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) (the canonical primary source for the pattern)
- Andrej Karpathy on X — [the original LLM-wiki post](https://x.com/karpathy/status/2039805659525644595) and the [idea-file follow-up](https://x.com/karpathy/status/2040470801506541998)
- Andrej Karpathy on X — ["+1 for context engineering over prompt engineering"](https://x.com/karpathy/status/1937902205765607626)
- Andrej Karpathy — [Software Is Changing (Again), YC AI Startup School](https://www.ycombinator.com/library/MW-andrej-karpathy-software-is-changing-again) (June 2025)
- Anthropic — [Effective context engineering for AI agents](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)
- MDflow — [Context Engineering for AI Agents](/blog/context-engineering-for-ai-agents) · [Google's Open Knowledge Format (OKF)](/blog/google-open-knowledge-format-okf) · [Version Control for Documents](/blog/version-control-for-documents)
- MDflow — [Markdown for AI agents](/markdown-ai) · [MCP documentation](/docs/mcp) · [API documentation](/docs/api) · [FAQ](/faq)

