
Memory System

Zubo has a persistent semantic memory system that combines vector embeddings with full-text search. It automatically remembers conversations, ingested documents, and facts you teach it — and retrieves relevant context for every message. This means your agent gets smarter over time, building up a knowledge base that is always available.

Copy-Paste Task Cards

Secure API Access

Enable API auth and create a key before exposing ports beyond localhost.

zubo config set auth.enabled true
zubo auth create-key my-app

Set Local Model Fallback

Keep responses available during provider outages or API quota issues.

zubo config set failover '["openai","ollama"]'
zubo config set providers.ollama.model llama3.3

Common Errors

401 Unauthorized / Missing Bearer token

If auth.enabled is true, all /api/* calls require Authorization: Bearer <key>. Create a new key if needed.

curl -H "Authorization: Bearer YOUR_KEY" http://localhost:3000/api/dashboard/status

Provider Timeout / Upstream unavailable

Use failover and switch temporarily to a responsive provider or smaller model. Check logs for repeated timeout patterns.

zubo model openai/gpt-4o-mini
zubo logs --follow

Missing local model (Ollama/LM Studio)

If local providers fail, ensure the runtime is running and a model is installed.

ollama serve
ollama pull llama3.3

Memory Quick Checks

How Memory Works

Every piece of content that enters the memory system follows the same pipeline:

  1. Content arrives — This can be a conversation message, an uploaded document, or an explicit memory write via the memory_write tool.
  2. Text is chunked — The content is split into segments of approximately 400 tokens (~1600 characters) with an overlap of approximately 80 tokens (~320 characters) between consecutive chunks.
  3. Chunks are embedded — Each chunk is converted into a 384-dimensional vector using the all-MiniLM-L6-v2 ONNX model. This captures the semantic meaning of the text.
  4. Storage — Chunks, their embeddings, and metadata are stored in SQLite. A full-text search index is updated via triggers.
  5. Retrieval — On every incoming message, Zubo automatically searches memory for relevant context using hybrid search.
  6. Context injection — The top matching results are injected into the LLM context alongside the user's message, giving the agent access to relevant knowledge.

Here is a simplified view of the data flow:

Content --> Chunker --> Embedder --> SQLite (chunks + embeddings)
                                          |
Query --> Hybrid Search <----------------+
           (60% Vector + 40% FTS)
                |
           Top results --> LLM Context
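The 60% vector + 40% FTS blend shown above can be sketched in a few lines. This is an illustrative model only, not Zubo's implementation: vector and fts stand in for Zubo's internal similarity and full-text relevance values, both assumed normalized to the range [0, 1].

```python
def hybrid_score(vector_score: float, fts_score: float) -> float:
    """Blend semantic and keyword relevance: 60% vector, 40% FTS."""
    return 0.6 * vector_score + 0.4 * fts_score

# Rank candidate chunks by the blended score (scores are made-up examples).
candidates = [
    {"chunk": "favorite language is Rust", "vector": 0.91, "fts": 0.40},
    {"chunk": "timezone is America/New_York", "vector": 0.35, "fts": 0.85},
]
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["vector"], c["fts"]),
    reverse=True,
)
```

With these example scores, the semantically closer chunk wins even though the other has a stronger keyword match, which is the point of weighting the vector signal more heavily.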

Memory Storage

Zubo stores memory in two complementary layers:

1. File-Based Storage

Memory files live at ~/.zubo/workspace/memory/ and come in two forms:

2. Database Storage

The memory_chunks table in SQLite stores all chunked content with their vector embeddings, source file references, timestamps, and full-text search index entries. This is the primary storage layer that powers memory search. It is fully managed by Zubo — you do not need to interact with it directly.
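The trigger-maintained full-text index can be illustrated with SQLite's FTS5 module. The table and column names below are purely illustrative, not Zubo's actual schema; the pattern shown (an external-content FTS table kept in sync by a trigger) is the standard SQLite idiom the doc describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memory_chunks (id INTEGER PRIMARY KEY, content TEXT);

-- External-content FTS index over the chunks table.
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    content, content='memory_chunks', content_rowid='id');

-- Keep the FTS index in sync automatically on insert.
CREATE TRIGGER chunks_ai AFTER INSERT ON memory_chunks BEGIN
  INSERT INTO chunks_fts(rowid, content) VALUES (new.id, new.content);
END;
""")

conn.execute(
    "INSERT INTO memory_chunks(content) VALUES ('My favorite language is Rust')")
rows = conn.execute(
    "SELECT content FROM chunks_fts WHERE chunks_fts MATCH 'rust'").fetchall()
```

Because the trigger fires on every insert, application code only ever writes to the main table and the search index can never drift out of date.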

Search

Zubo supports three search modes, each suited to different scenarios:

Full-Text Search (FTS)

Exact keyword and phrase matching against the SQLite full-text index. Best when you know the precise terms that appear in the stored content.

Vector Search

Semantic similarity over the 384-dimensional chunk embeddings. Finds conceptually related content even when the wording differs.

Hybrid Search

A weighted blend of both signals (60% vector, 40% FTS). This is the mode Zubo uses for automatic context retrieval on every message.

You can tune retrieval behavior in ~/.zubo/config.json via memoryRetrieval.contextTopK and memoryRetrieval.minConfidence, or from the dashboard under Settings → General → Memory Retrieval.
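The two knobs behave roughly like a confidence floor followed by a top-K cut. The sketch below is an illustrative model of that behavior, not Zubo's code, and the default values shown are assumptions.

```python
def select_context(results, top_k=5, min_confidence=0.3):
    """Drop results below the confidence floor, then keep the best top_k."""
    kept = [r for r in results if r["score"] >= min_confidence]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:top_k]

# Example: one result is filtered out by the floor, the rest are ranked.
results = [
    {"text": "favorite language is Rust", "score": 0.9},
    {"text": "unrelated chunk", "score": 0.2},
    {"text": "prefers metric units", "score": 0.5},
]
context = select_context(results, top_k=2, min_confidence=0.3)
```

Raising minConfidence trades recall for precision; raising contextTopK injects more chunks at the cost of a larger prompt.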

Document Ingestion

You can upload documents to populate Zubo's memory with external knowledge. The following file formats are supported:

| Format | Extension | Notes |
| --- | --- | --- |
| Plain text | .txt | Direct indexing, no preprocessing needed. |
| Markdown | .md | Direct indexing, preserves structure. |
| CSV | .csv | Parsed as text with rows preserved. |
| PDF | .pdf | Requires pdf-parse (auto-installed on first PDF upload). |
| Word | .docx | Requires mammoth (auto-installed on first DOCX upload). |
| Excel | .xlsx | Requires xlsx (SheetJS). Each sheet is converted to CSV. |
| PowerPoint | .pptx | Requires jszip. Text extracted from each slide. |
| JSON | .json | Pretty-printed before indexing. |
| XML | .xml | Tags stripped, text content extracted. |
| YAML | .yaml, .yml | Direct indexing. |
| Code | .ts, .js, .py, .sh | Direct indexing with syntax preserved. |

There are three ways to upload documents:

Chunking Strategy

The chunker is responsible for splitting content into segments that are small enough to embed meaningfully but large enough to preserve context. Content is split into segments of approximately 400 tokens (~1600 characters), with an overlap of approximately 80 tokens (~320 characters) carried over between consecutive chunks.

This strategy ensures that each chunk is a coherent unit of information that can be meaningfully compared via vector similarity, while the overlap prevents important context from falling between the cracks.
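Assuming the ~4 characters per token implied by the numbers above, a character-level sketch of this strategy might look like the following. It is a simplification: the real chunker presumably works on tokens and respects content boundaries.

```python
def chunk(text: str, size_chars: int = 1600, overlap_chars: int = 320):
    """Split text into ~400-token chunks with ~80-token overlap (4 chars/token)."""
    step = size_chars - overlap_chars  # advance 1280 chars per chunk
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + size_chars]
        if piece:
            chunks.append(piece)
        if start + size_chars >= len(text):
            break
    return chunks
```

Each chunk begins 320 characters before the previous one ended, so a sentence straddling a boundary appears intact in at least one chunk.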

Memory Pruning

To keep the database fast and the storage footprint reasonable, Zubo automatically prunes old memory chunks when the total count exceeds a configurable limit of 10,000 chunks.

In practice, 10,000 chunks represents a substantial amount of knowledge — roughly equivalent to several hundred pages of text. For most personal assistant use cases, you will never hit this limit.
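The pruning rule can be sketched as evicting the oldest chunks once the count exceeds the limit. Oldest-first eviction is an assumption here (the doc says only that old chunks are pruned past the limit), and the small limit is just for the demo.

```python
def prune(chunks, max_chunks=10_000):
    """Drop the oldest chunks (by timestamp) until at most max_chunks remain."""
    if len(chunks) <= max_chunks:
        return chunks
    chunks = sorted(chunks, key=lambda c: c["ts"])  # oldest first
    return chunks[len(chunks) - max_chunks:]

# Seven chunks, limit of five: the two oldest are evicted.
chunks = [{"id": i, "ts": i} for i in range(7)]
survivors = prune(chunks, max_chunks=5)
```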

Using Memory

Memory works automatically in the background, but you can also interact with it directly.

Teaching Your Agent

Tell Zubo facts and it will remember them for future conversations:

You: "Remember that my favorite programming language is Rust"
Zubo: "Got it — I'll remember that your favorite language is Rust."

The agent uses the memory_write tool to save this fact. It will be retrievable in future sessions via semantic search.

Searching Memory

You can ask Zubo to recall information it has stored:

You: "What do you remember about my preferences?"
Zubo: "Based on my memory, I know that your favorite programming
       language is Rust. You prefer metric units and Markdown
       formatting. Your timezone is America/New_York."

Memory search also happens automatically on every message. You do not need to explicitly ask the agent to check its memory — it does so as part of normal message processing.

Via the Dashboard

Memory Tools

Zubo provides three built-in tools for memory operations. These are available to the main agent and to any sub-agent that lists them in its ## Tools section:

| Tool | Description |
| --- | --- |
| memory_write | Save a fact, note, or piece of content to persistent memory. The content is chunked, embedded, and indexed automatically. |
| memory_search | Search memory for relevant information using hybrid search. Returns the top matching chunks with their source and relevance score. |
| memory_prune | Manage memory hygiene. Delete memories by ID, keyword, or age. Remove duplicates. View memory stats (total chunks, date range, storage size). Requires confirmation before deleting. |

Knowledge Graph

In addition to vector and full-text search, Zubo maintains a knowledge graph that stores structured entities and the relationships between them. While the chunk-based memory system excels at free-form text retrieval, the knowledge graph captures discrete facts in a queryable entity → relation → entity format. Entities have a name, a type (such as person, project, org, or concept), and optional key-value properties. Relationships link two entities with a labeled edge (for example, works_at, manages, or uses). When you mention a known entity in a message, Zubo automatically looks up its graph context and injects relevant relationships into the LLM prompt — giving the agent structured awareness alongside its semantic memory.

Two built-in tools give the agent full read/write access to the graph:

| Tool | Description |
| --- | --- |
| kg_query | Search for entities by name, retrieve full entity details and relations, or export a subgraph. Supports actions: search, get, relations, graph. |
| kg_update | Add or remove entities and relationships. Supports actions: add_entity, add_relation, remove_entity, remove_relation. Entities are upserted by name+type. |

Example interaction:

You: "I just started working at Acme Corp on the Atlas project."
Zubo: "Noted! I've added that to your knowledge graph."

# Under the hood, Zubo calls kg_update twice:
#   add_relation: You --works_at--> Acme Corp (org)
#   add_relation: You --works_on--> Atlas (project)

You: "What do you know about my work?"
# Zubo calls kg_query { action: "relations", name: "You" }
Zubo: "You work at Acme Corp and are on the Atlas project."

How Zubo Learns Over Time

Zubo's memory is built on three complementary layers, each serving a different purpose. Together they give the agent short-term recall, long-term semantic understanding, and structured knowledge about the people, projects, and concepts in your world.

1. Session Memory

Every conversation is persisted as JSONL in ~/.zubo/sessions/. The last 50 messages are loaded as context for each conversation turn. This gives Zubo short-term memory within a conversation — it remembers what you said earlier in the chat without needing to search the database.
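Loading that rolling context window from a JSONL session file can be sketched as follows; the file path and record shape are illustrative, not Zubo's actual session format.

```python
import json
import os
import tempfile

def load_recent_messages(path, limit=50):
    """Read a JSONL session file and return the last `limit` messages."""
    with open(path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    return [json.loads(line) for line in lines[-limit:]]

# Demo: write 60 messages, then load back only the most recent 50.
fd, path = tempfile.mkstemp(suffix=".jsonl")
with os.fdopen(fd, "w") as f:
    for i in range(60):
        f.write(json.dumps({"role": "user", "content": f"msg {i}"}) + "\n")
recent = load_recent_messages(path)
os.remove(path)
```

Appending one JSON object per line keeps session writes cheap and crash-safe: a partial write corrupts at most the final line.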

2. Semantic Memory

When the agent calls memory_write, facts are chunked, embedded, and indexed in SQLite just like any other ingested content.

The agent is instructed to save facts proactively — when you mention your name, job, preferences, or projects, the agent stores it immediately. Over time this builds a rich knowledge base that makes every future conversation more informed.

3. Knowledge Graph

Structured entity-relationship data provides a third dimension of recall: discrete facts about the people, projects, and concepts in your world, stored as typed entities and labeled relationships that can be queried directly.

Cross-Channel Memory

All three layers — session memory, semantic memory, and the knowledge graph — are shared across all channels. A fact learned on Telegram is instantly available on Discord, Slack, WhatsApp, Signal, Email, and WebChat. You never have to repeat yourself when switching between channels.

Best Practices