Setting up your first knowledge base
This document outlines how to set up your first knowledge base.
Goal
Set up a knowledge base session for your repository and interact with it via chat.
Prerequisites
An LLM API key must be exposed via environment variables.
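As a quick pre-flight check, the sketch below reports which of a few candidate API-key variables are set in the current environment. The variable names are assumptions for illustration only. This guide does not state which variable the kb tool reads, so substitute the one your LLM provider requires.

```python
import os

# Hypothetical candidates: the real variable name depends on your LLM provider.
CANDIDATE_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def find_api_key_vars(env=None, candidates=CANDIDATE_KEY_VARS):
    """Return the candidate variable names that hold a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in candidates if env.get(name, "").strip()]


if __name__ == "__main__":
    found = find_api_key_vars()
    if found:
        print("API key variable(s) set:", ", ".join(found))
    else:
        print("No candidate API key variable is set; export one before running kb.")
```

Passing a plain dictionary as env makes the check easy to try without touching your real environment.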
Steps
- Initialize your knowledge base: Run kb init in your repository’s root directory. This command creates the foundational knowledge base structure by setting up the .kb-index.sqlite database in that directory, which will store your knowledge.
- Add content to your knowledge base: After initialization, add documents or other relevant content to your knowledge base using kb add commands.
- Start a chat session: Once content is indexed, you can begin interacting with your knowledge base via chat by running kb chat.
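The first two steps above can be chained in a small script. The sketch below is an illustration under assumptions: it assumes the kb executable is on your PATH and that kb add accepts a file path as its argument, which this guide does not specify. It stops before kb chat because chat is interactive.

```python
import subprocess


def build_setup_commands(doc_paths):
    """Return the kb commands for initialization and content loading, in order."""
    commands = [["kb", "init"]]
    # Assumption: `kb add <path>` is a hypothetical form; check kb add usage first.
    commands += [["kb", "add", str(path)] for path in doc_paths]
    return commands


def run_setup(doc_paths, repo_root=".", runner=subprocess.run):
    """Run each command from the repository root, stopping at the first failure."""
    for command in build_setup_commands(doc_paths):
        # check=True raises CalledProcessError on a non-zero exit status.
        runner(command, cwd=repo_root, check=True)
```

The runner parameter exists only so the sequence can be exercised with a stand-in function before running it against a real repository.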
Verification
To confirm kb init was successful, verify that the .kb-index.sqlite file has been created in your repository’s root directory.
After adding content, you can verify the overall setup by querying the knowledge base via chat or by using commands like kb query, kb graph, or kb docs. Note that kb query is designed for programmatic, one-shot retrieval by agents, whereas chat mode offers an iterative, user-facing conversational experience.
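The file check described above can also be scripted. The following sketch uses Python's standard sqlite3 module to confirm that .kb-index.sqlite exists and to list the tables it contains. It opens the database read-only so the check cannot alter stored knowledge. Which tables are present right after kb init is not stated in this guide, so treat the output as informational. The function name is hypothetical.

```python
import sqlite3
from pathlib import Path

INDEX_NAME = ".kb-index.sqlite"  # file name taken from this guide


def list_index_tables(repo_root="."):
    """Return the table names in the index, or None if the file is missing."""
    index_path = Path(repo_root).resolve() / INDEX_NAME
    if not index_path.is_file():
        return None
    # mode=ro opens the database read-only, so this check never writes to it.
    conn = sqlite3.connect(index_path.as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    tables = list_index_tables(".")
    if tables is None:
        print(f"{INDEX_NAME} not found; run kb init in the repository root first.")
    else:
        print("Tables:", ", ".join(tables) if tables else "(none yet)")
```

A return value of None means the file is absent. A file that exists but is not a valid SQLite database raises sqlite3.DatabaseError.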
References
- installSkillIntoProject() targets user profile-level MDs (~/.claude/CLAUDE.md, ~/.codex/AGENTS.md) rather than local repo CLAUDE.md/AGENTS.md. If neither profile MD exists, it creates ~/.claude/CLAUDE.md. The cwd parameter was removed from the function signature. [fact://0c5223b17f53aab1]
- kb query is the agent-facing, one-shot retrieval command: it runs the full intent-rewrite pipeline, runQueryTruthRetrieval, and enrichReadDocumentsAnswerWithLLM in a single pass. It is designed for programmatic use: agents (Claude Code, Cursor, Codex) call it to get a complete answer they can act on immediately. Chat mode is the user-facing, iterative experience: it uses executeChatQueryTruthRetrieval with auto-deepening, live in-memory message history, and conversational turn resolution across multiple back-and-forth exchanges. The two paths are intentionally different and should not be collapsed: query is for machines needing a deterministic answer, chat is for humans building understanding over time. [fact://c2bd509a11e0ca2f]
- CLI wording should make this explicit: these are knowledge-base cleanup operations, not codebase refactors. [fact://5c1acfe883057f35]
- Source code, tests, and unrelated repo assets are out of scope unless a separate code-editing command explicitly owns that responsibility. [fact://d30ad245d3660a52]
- Scope must be limited to the active KB store, not arbitrary repo files. [fact://6f6e71a313457782]
- For KB maintenance tools that mutate stored knowledge (for example invalidate_fact / kb invalidate): [fact://b9b297900918b63b]
- Examples exist for common use cases [fact://3f3dcbb43d235ab2]
- This means a query for “KbGraphWriter” can still surface facts that mention “SQLite” or “property graph” if those edges exist in the graph, even when the literal query string did not include those words. [fact://147979da16325171]
- The expanded term set is capped and concatenated to the original query for fact retrieval. [fact://c6d9430a60abd141]
- For every live edge touching those entities, expansion adds semantic triplets as natural-language phrases (Subject <predicate phrase> Object, plus the stored predicate slug and a spaced variant, e.g. [fact://ce5bc9af5d2198f3]
- weight: 1.0 for live edges, 0 for soft-deleted edges (set by kb invalidate) [fact://13573a3c8a36465a]
- type on relationships: canonical extractor labels (depends_on, contradicts, related_to, replaces, implements, uses) or any snake_case label you set via kb graph edge add --verb (free text is normalized to snake_case for storage) [fact://597b6f9d78afadf0]
- Graph tables live in <base-dir>/.kb-index.sqlite (kb_graph_entities, kb_graph_relationships), alongside documents, chunks, and facts. [fact://23c53c80ec0ef2a6]
- Session override: Pass --base <name> on kb graph (same as other KB commands) to target a specific session without switching your active base. [fact://e1581d19940acab2]
- Export: The full graph can be dumped as Graphviz DOT (for visualisation tools like Gephi or Mermaid) or JSON (for your own analysis). [fact://4c1f106d8b475c1a]
- As you build up your knowledge base, the graph gives you a structural view of how ideas connect, something the flat SQLite full-text index cannot express. [fact://ab58de76fc696b75]
- Your single responsibility is to execute the given instruction and report results clearly. [fact://c6c17d89fd705933]
- Your tool access is intentionally limited. [fact://a1c152ed55b7852c]
- The parent agent will integrate your output into a larger result. [fact://a1e1d4415244387e]
- You expand short or vague knowledge-base queries into targeted sub-queries. [fact://3447bc2f075af706]