An open, portable and governed architecture to create a context base any agent can consult without becoming its owner.
v2 sold Hermes as an active core with four roles (router, memory manager, scope controller, policy engine). The field test and Hermes's own official documentation showed that design did not exist: Hermes was never an orchestrator, it is a client. v3 repositions. Memory is a passive file, the client agent is an interchangeable piece, Git is the spine that unifies versioning and synchronization. Governance in two declared modes: cooperative (contract, the recommended floor) and adversarial (enforcement through operating system isolation, below the agent). Capture via the agent's own hooks (PostToolUse running scripts/capture-to-inbox.mjs), depositing into /90-inbox/. Risk-proportional auditing: verified + low self-promotes by TTL; hypotheses and high risk require human decision. MCP and Graphiti become optional layers under volume pain, not floor components.
Every AI tool looks brilliant in the first few minutes. It understands the request, writes code, suggests campaigns, summarizes meetings and even seems to have followed the project from the start. The problem shows up in the second, third or tenth session: the AI forgets decisions, mixes context, repeats questions and needs to be re-educated constantly.
The natural reaction is to build a "super brain": dump everything in one place and let the agent read it. At first this seems to solve it. Then the system degrades. A marketing task pulls in coding conventions. One project contaminates another. Old memories enter as if they were still true. The agent knows too much and, precisely because of that, starts making better mistakes.
This guide proposes a different path: federated memory. Separate domains, readable files, neutral contract, context packages and controlled routing. It is not about letting an agent run the house. It is about creating a base any agent can consult without messing up the environment.
The Markdown vault stores the reviewed memory, with Git as the spine for versioning and sync. Context Packs deliver the minimum sufficient context. The contract guides reading. Agents execute as interchangeable clients, without needing to talk to each other. MCP and Graphiti are optional evolution, when the pain appears.
Reference implementation using a Markdown vault versioned in Git. The architecture, however, is not locked to any tool. The goal is a portable memory layer for any current or future agent.
AGENT.md as a neutral contract, with no dependency on Claude Code, Codex, Cursor or any executor.
Per-tool adapters (CLAUDE.md, codex.md, Cursor rules).
It is not a closed product. It is not a framework that needs to control all projects. It is not a proposal for automatic memory without review. Permanent memory must pass through human curation.
Before the principles, separate three layers, because confusing them is the most common mistake: the model (the LLM that reasons), the code agent (Claude Code, Cursor, Codex, Hermes, which runs the loop and touches the files) and memory (the persistent, sovereign base that belongs to neither). The federated architecture exists to keep memory outside the agent.
Sovereign memory. The base belongs to the user. Agents come and go; context stays portable. Git is the spine: the vault is a repository, which delivers versioning, authorship, diff and cross-machine synchronization in a single piece.
Isolated domains. Marketing, programming, clients, products and research should not live in the same semantic soup. Isolation is structural, by folders, not an instruction the agent can ignore.
Neutral contract. AGENT.md describes how any agent should consume the memory. Adapters only translate for specific tools. No agent is the core; all are interchangeable clients.
Minimum sufficient context. The agent should receive what is needed for the task, not the entire history of your digital life.
Human as risk-proportional auditor. Capture is automatic; approval is not mandatory for everything. Verified low-risk entries promote on their own with TTL; hypotheses and high risk require human decision. The human is a quality filter, not a capture bottleneck.
Governance in two modes. In cooperative mode, the default, the rule lives in the contract and in the structure, and that is enough. Against an agent that may be hostile, the contract is not enough: enforcement comes from the operating system, below the agent (a read-only container except for the inbox, or a separate user). Never from a core on the agent's side.
A common confusion in multi-agent architectures is assuming agents need to talk to each other. They do not. In a well-designed federated memory, coordination is asynchronous via shared state. This pattern has a name: blackboard architecture.
Think of a classroom blackboard. Different people write on it at different times. Whoever enters later reads what is there. Nobody needs to be present when another wrote. Coordination happens through the board.
Practical implications of this choice:
When you use Claude Code in the morning and Codex at night, only the executor changes. The curated memory is the same. The night agent reads what was approved in the morning. No real-time orchestration. No message queue. No broker.
| Component | Function | Replaceable by |
|---|---|---|
| Obsidian | Human-reviewed base in Markdown | Logseq, Foam, bare Git repository |
| Vaults / folders | Isolate domains and reduce contamination | Subfolders in the same vault |
| AGENT.md | Neutral consumption contract | No. It is the portability core. |
| Context Packs | Minimum context per task | Curated RAG, modular prompts |
| Adapters | Contract translation for the tool | Specific to each agent |
| Git | Versioning, authorship and sync of the vault | Essential. No substitute delivers all of it together. |
| MCP server | Standardized multi-agent access | Direct reading, CLI, REST API |
| Graphiti | Temporal memory and relations | Add only when it hurts |
Obsidian is the base because it works well with Markdown, links, backlinks, tags and human navigation. The v2 template uses eleven numbered folders, optimized for direct reading by client agents and for growing without becoming a mess.
In this guide, we use "vaults" in the sense of memory domains. The default implementation uses a single Obsidian vault with folders separated by domain. When there is a need for strong isolation, security or selective sharing, these domains can be promoted to separate physical vaults.
/federated-memory
├── 00-global/ # AGENT.md contract, general rules
│
├── 10-projects/ # active projects (start, middle, end)
│   └── project-a/
│       ├── PROJECT.md
│       ├── notes/
│       └── deliverables/
│
├── 20-domains/ # stable domains (engineering, writing, research...)
│   ├── engineering/
│   ├── writing/
│   └── research/
│
├── 30-clients/ # clients, candidate for separate physical vault
│   └── <client>/
│
├── 40-workflows/ # repeatable flows (release, incident, review...)
│
├── 50-skills/ # reusable capabilities
│
├── 60-context-packs/ # context packages per task
│   ├── linkedin-writing.md
│   ├── code-review.md
│   ├── research.md
│   └── planning.md
│
├── 70-decisions/ # formal decisions with approved/superseded status
│
├── 80-agent-adapters/ # per-agent adapters
│   ├── claude/{CLAUDE.md, AGENTS.md}
│   ├── cursor/.cursorrules
│   ├── codex/AGENTS.md
│   └── windsurf/.windsurfrules
│
├── 90-inbox/ # suggestions pending human review
│   └── suggested-memory.md
│
└── 99-archive/ # logs, obsolete packs, archived
    ├── review-log.md
    ├── pack-usage.log
    └── pack-status.md
Objective criteria: create a domain when there is vocabulary, decisions or output style that should NOT appear in another domain's response. If "writing" uses "voice, hook, lead" and "engineering" uses "stack, latency, deploy", and crossing them confuses the agent, they are two domains. If two areas use the same vocabulary and make compatible decisions, it is a folder within a single domain, not a new domain.
Promote a domain to a separate physical vault when there is a contractual isolation requirement (NDA, sensitive client data), a need for independent synchronization (a different cloud account), or a need for selective read sharing without exposing the rest. The typical case is 30-clients/.
Keep everything in a single vault when you are just starting, when domains exchange context frequently, or when you are a single person using a single machine. Start with folders in the same vault; promote later if the pain justifies it.
The vault is Markdown by default, but multimodal support does not require changing the architecture. It only requires a convention for where assets live and how they are referenced.
Multimodal support is a capability of the LLM configured in the agent, not of the agent itself. The agent passes the file, the LLM processes it. Check the provider's current documentation for the formats it accepts.
All major LLMs today support at least images and PDFs. Audio and video vary by provider.
Asset structure in the vault:
/federated-memory
/20-domains/
/engineering/
/assets/
arquitetura-v2.png
decisao-banco.pdf
/writing/
/assets/
exemplos-visuais/
/10-projects/
/project-a/
/assets/
wireframes/
screenshots/
bugs/
bug-001/
screenshot-erro.png
tentativa-01-failed.md
tentativa-02-success.md
Markdown reference rule: every asset is referenced in the .md file of the same domain or project. The Context Pack points to the .md. The agent loads the asset when the .md references it.
Example in a DECISION.md:
Reference diagram: ./assets/arquitetura-v2.png
The agent should load this image when analyzing stack decisions.
Videos, datasets and files above 10MB should not live in the vault. Reference by URL or external path. The vault must remain lightweight and versionable.
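The size rule above can be checked mechanically before a commit. The following is a minimal sketch in Python, not part of the guide's template: the function name `oversized_assets` and the constant `MAX_ASSET_BYTES` are hypothetical, and the 10 MB threshold is read here as 10 × 1024 × 1024 bytes, which is an assumption.

```python
from pathlib import Path

# Assumed reading of the guide's "10MB" rule; adjust if you count in decimal MB.
MAX_ASSET_BYTES = 10 * 1024 * 1024


def oversized_assets(vault_root, limit=MAX_ASSET_BYTES):
    """Return vault-relative paths of regular files larger than `limit` bytes."""
    root = Path(vault_root)
    found = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip Git internals and anything that is not a regular file.
        if ".git" in relative.parts or not path.is_file():
            continue
        if path.stat().st_size > limit:
            found.append(relative.as_posix())
    return found
```

Running it from the vault root and getting an empty list means nothing needs to move to an external location.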
The AGENT.md is not a giant prompt. It is the contract that teaches any agent how to consume the memory without making a mess. It lives in 00-global/AGENT.md.
# AGENT.md
Purpose:
This repository contains the federated memory used by AI agents.
The memory is owned by the human user. Agents are consumers,
not owners.
Rules:
1. Do not load the entire memory base.
2. Start from the relevant Context Pack in /60-context-packs/.
3. If no Context Pack exists, ask which domain is relevant.
4. Permanent writes are forbidden outside /90-inbox/ in any
execution mode (interactive, headless, scheduled).
5. Memory conflicts: the most recent entry with status: approved
wins. Entries with status: superseded stay in history but are
ignored at runtime. Never infer winner by file timestamp alone.
6. When unsure, create a suggested memory entry in
/90-inbox/suggested-memory.md instead of guessing.
Folders:
- 00-global, 10-projects, 20-domains, 30-clients, 40-workflows,
50-skills, 60-context-packs, 70-decisions, 80-agent-adapters,
90-inbox, 99-archive
Decision frontmatter (in /70-decisions/):
- id, date, status (approved | superseded | pending), supersedes,
domain, owner
Hermes Agent loads context files like AGENTS.md and .hermes.md when assembling the system prompt. If you place your AGENT.md in the path Hermes reads, or reference it from its AGENTS.md, your rules apply automatically.
AGENT.md defines how the agent behaves. RULES.md defines the business and stack rules that apply to every project. They are two distinct files with distinct responsibilities: agent behavior versus developer and company standards.
Create 00-global/RULES.md with two blocks:
# RULES.md
# Global rules, apply to all projects.
# To override a rule for a specific project,
# register an override in that project's 70-decisions/.
## Company Rules
- stack: TypeScript, never MongoDB
- tests: TDD mandatory
- security: never commit secrets, use environment variables
## Dev Rules
- commits: English, imperative, max 72 characters
- PR: never larger than 400 lines
- review: self-review before opening PR
The agent loads RULES.md every session, right after AGENT.md. This is already instructed in AGENT.md via the ## Global Rules section.
When a project needs to deviate from a global rule, the deviation is recorded in the project's 70-decisions/ with required fields. The RULES.md is never changed.
---
id: DEC-OVERRIDE-001
date: 2026-06-01
approved-by: André
status: approved
rule-override: company/stack/mongodb
---
# Override: MongoDB usage in project X
## Reason
Client requires MongoDB by contract. Migration planned for Q3 2027.
## Scope
Valid for this project only. Does not alter RULES.md.
## Review
review_date: 2026-12-01
The agent applies rules in this order: Company Rules (broadest, most stable) > Dev Rules > Project rules. An approved override in 70-decisions/ takes precedence over RULES.md. The agent executes without questioning, because the deviation has already been decided and documented.
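To make the precedence concrete, here is a small Python sketch of how a client could decide which rule applies in a given project. It is an illustration under assumptions: `effective_rule` is a hypothetical helper, rules are assumed to be already parsed into dictionaries, and the `value` field on an override is invented for the example (the guide's override frontmatter only defines id, date, approved-by, status and rule-override).

```python
def effective_rule(rule_key, global_rules, overrides):
    """Return (value, source) for one rule key inside one project.

    global_rules: dict of rule key -> value, as parsed from RULES.md.
    overrides: list of dicts with the override frontmatter fields
        (id, status, rule-override) plus a hypothetical "value" field.
    """
    for decision in overrides:
        matches = decision.get("rule-override") == rule_key
        # Only an approved override beats RULES.md; pending ones are ignored.
        if matches and decision.get("status") == "approved":
            return decision["value"], decision["id"]
    return global_rules.get(rule_key), "RULES.md"
```

The returned source lets the agent cite which file justified the behavior, which keeps the deviation traceable.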
Each agent has its own context file convention. The adapter is short. It does not replace AGENT.md. It just points to it.
Example adapter for Claude Code:
# CLAUDE.md
Read the shared memory protocol at:
../00-global/AGENT.md
For this project, start with:
../10-projects/project-a/PROJECT_CONTEXT.md
Use Context Packs before reading raw notes:
../60-context-packs/architecture-review.md
Do not modify permanent memory directly.
If a new memory seems useful, write a suggestion to:
../90-inbox/suggested-memory.md
Without a Context Pack, the agent needs to dig through the vault. With a Context Pack, it receives a lean, task-oriented package. This reduces noise, improves consistency and prevents cross-domain contamination.
linkedin-writing.md
# Context Pack: linkedin-writing
Goal:
Help an AI agent draft LinkedIn posts in André's voice for
the technology and AI community.
Use:
- /20-domains/writing/STYLE_GUIDE.md
- /20-domains/writing/voice-examples/*.md
- /20-domains/writing/hooks-that-worked.md
- Last 5 entries from /20-domains/writing/recent-posts.md
Avoid:
- /20-domains/engineering/* (different vocabulary)
- /10-projects/* (unless explicitly mentioned)
- Generic LinkedIn templates from external sources
- Em dashes (—). Never. Forbidden.
Sources of truth:
- Voice: informal, irreverent, Brazilian Portuguese
- Structure: strong opening hook, conclusion-first
- Length: 800-1500 characters
- Tone: critical, myth-busting, no flattery
Output expected:
- Single post, ready to publish
- No subtitle or headers inside the post
- No hashtag list at the end unless requested
- No em dashes under any circumstance
Confidence / validity:
- Style guide reviewed monthly. Last update: see file metadata.
- Voice examples should not be older than 90 days.
Source notes:
- /20-domains/writing/STYLE_GUIDE.md
- /20-domains/writing/hooks-that-worked.md
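Because the pack is plain text with labeled sections, a client or hook can read it without a Markdown library. The sketch below shows one possible way to split a pack like the one above into its sections; `parse_pack` is a hypothetical helper and the parsing convention (a line ending in a colon opens a section, lines starting with "- " are its items) is an assumption inferred from the example, not a format the guide specifies.

```python
def parse_pack(text):
    """Split a Context Pack into {section name: [list items]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":") and not line.startswith("-"):
            # A label such as "Use:" or "Avoid:" opens a new section.
            current = line[:-1]
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections
```

With this, `parse_pack(text)["Use"]` gives the files to load and `parse_pack(text)["Avoid"]` the paths to keep out of context. Free-text sections such as Goal come back as empty lists.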
Each v2 Context Pack includes the Validation: field. The client agent records each use in /99-archive/pack-usage.log via a post-execution hook with a classified result (useful / partial / bad); three consecutive bad marks flag the pack for review. Additionally, when assembling the pack, a hook inspects the mtime of the files listed under Use:. If any is older than 90 days (default, configurable per pack), the output gets a temporal validity warning. Bad packs have a finite half-life; old content deserves checking before becoming the basis for new decisions.
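Both checks described above are simple enough to sketch. The Python below is an illustration, not the guide's hook: `stale_files` and `needs_review` are hypothetical names, the usage log is assumed to be already parsed into a list of result strings, and "three consecutive bad marks" is interpreted as the three most recent results.

```python
import os
import time

STALE_AFTER_DAYS = 90  # default from the guide, configurable per pack


def stale_files(paths, max_age_days=STALE_AFTER_DAYS, now=None):
    """Return the paths whose mtime is older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return [p for p in paths if os.path.getmtime(p) < cutoff]


def needs_review(results, threshold=3):
    """True when the last `threshold` logged results are all 'bad'."""
    tail = results[-threshold:]
    return len(tail) == threshold and all(r == "bad" for r in tail)
```

A non-empty result from `stale_files` would trigger the temporal validity warning; `needs_review` returning True would flag the pack.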
v1 and v2 of this guide called Hermes a core: first a "lightweight orchestrator", then an "active core with four roles". Field testing and the official Hermes documentation showed that this core does not exist as described. Hermes is a full code agent with its own memory, not a gatekeeper that mediates other agents. This section corrects that.
Separate three layers, because confusing them is the most common mistake: the model (the LLM that reasons), the code agent (Claude Code, Cursor, Codex, Hermes, which runs the loop and touches the files) and federated memory (the Markdown vault, passive and sovereign).
Memory is passive. It does not route, does not enforce policy, does not mediate anything. Code agents are interchangeable clients that read and write in the vault, guided by AGENT.md and by the folder structure. Hermes is one of them. None is the core. The four functions v2 assigned to a core still happen, but elsewhere.
The agent pulls what it needs, guided by the contract and by static versioned files. There is no central router deciding for it.
The conflict rule is not in a component, it is in the files. The most recent entry with status: approved wins; superseded stays in history and is ignored at runtime. Any agent reading the vault applies the same rule, because it lives in the data.
# Example decision in /70-decisions/
---
id: db-engine-postgres
date: 2026-03-15
status: approved
supersedes: [db-engine-mysql]
domain: engineering
owner: andre
---
# Decision: PostgreSQL as main database
## Context
MySQL does not support the JSONB and recursive CTE features we
need. We evaluated the migration 2 months ago.
## Decision
Postgres 16 as main database starting 2026-Q2.
The old decision with id: db-engine-mysql gets status: superseded and stays in history. Any agent assembling an engineering Context Pack ignores it, with no core needed for that.
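Since the rule lives in the data, any client can apply it with a few lines. The following Python sketch assumes the frontmatter of each decision on the same topic has already been parsed into a dictionary; `active_decision` is a hypothetical name, and the dates in the usage note other than 2026-03-15 are invented for illustration.

```python
from datetime import date


def active_decision(entries):
    """Pick the winning decision among entries about the same topic.

    Each entry is a dict with the frontmatter fields id, date (ISO
    string) and status. Only entries with status 'approved' can win;
    among those, the most recent date wins. File timestamps are never
    consulted.
    """
    approved = [e for e in entries if e.get("status") == "approved"]
    if not approved:
        return None
    return max(approved, key=lambda e: date.fromisoformat(e["date"]))
```

Given a superseded db-engine-mysql entry dated 2025-01-10 and the approved db-engine-postgres entry dated 2026-03-15, the function returns the Postgres decision. A pending entry never wins, however recent it is.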
Pack usage logging and memory capture come from the client agent's hooks, tool-agnostic (section 09c), not from a central manager. Capture via hooks is the main path, not an alternative to Hermes.
In cooperative mode, the write policy is the contract: reading open, permanent writing outside /90-inbox/ turned into a suggestion. There is no hermes.policy.yml with semantic triggers; that was an invention of the old positioning. Real enforcement, against a hostile agent, comes from the operating system, below the agent: a container with a read-only mount except for the inbox, or a separate user with no write access to protected folders.
Passive memory is not a weakness, it is the condition of sovereignty. A vault that needs an active core is tied to that core. A vault that is just files versioned in Git belongs to the user and works with any agent. When enforcement is needed, it comes from the operating system, never from the agent's side.
The biggest curation bottleneck is remembering to ask the agent to save something important. Automatic capture solves this with the code agent's own hooks, independent of any central core. Since Claude Code is one of the architecture's clients, it serves as a concrete example, but the same pattern applies to any agent that exposes lifecycle hooks.
The PostToolUse hook fires after a tool call and passes the event context, via stdin, to a script. The script applies simple heuristics to identify relevant patterns (decision made, bug resolved, preference identified) and appends a classified suggestion to /90-inbox/suggested-memory.md. The human does not need to ask: the inbox feeds itself, and human approval remains a quality filter proportional to risk.
Claude Code hooks are declared in .claude/settings.json, under the hooks key, the same file that holds permissions. There is no separate .claude/hooks.json. In the vault template, add the hooks block to template/.claude/settings.json:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": ".*",
        "hooks": [
          {
            "type": "command",
            "command": "node scripts/capture-to-inbox.mjs"
          }
        ]
      }
    ]
  }
}
The matcher uses a regex over the tool name; .* fires after any call.
The scripts/capture-to-inbox.mjs script uses simple heuristics (regex on common strings) to identify three patterns:
Each match is written to the inbox with one of three classifications:
- confidence: verified, risk: medium
- confidence: preference, risk: low
- confidence: verified, risk: low
If no pattern is found, the script does nothing. There are no external dependencies beyond Node.js. The review ritual (scripts/review-inbox.{sh,ps1}) processes what the hook recorded, with TTL for verified+low and risk filtering.
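The classification step can be pictured with a short sketch. The guide's actual script is Node; the Python below only illustrates the idea of regex heuristics. Everything specific in it is an assumption: the phrases matched, the pairing of each pattern with a confidence and risk label, and the entry layout produced by `format_suggestion`.

```python
import re

# Hypothetical heuristics: (pattern name, regex, confidence, risk).
# Phrases and label pairings are illustrative assumptions.
HEURISTICS = [
    ("decision", re.compile(r"\bdecided to\b|\bdecision made\b", re.I), "verified", "medium"),
    ("preference", re.compile(r"\bi prefer\b|\balways use\b|\bnever use\b", re.I), "preference", "low"),
    ("bug_resolved", re.compile(r"\b(fixed|resolved) (the )?(bug|issue)\b", re.I), "verified", "low"),
]


def classify(text):
    """Return the labels for the first matching pattern, or None."""
    for name, regex, confidence, risk in HEURISTICS:
        if regex.search(text):
            return {"type": name, "confidence": confidence, "risk": risk}
    return None


def format_suggestion(text, label):
    """Render one entry in an assumed layout for suggested-memory.md."""
    return (
        f"- type: {label['type']}\n"
        f"  confidence: {label['confidence']}\n"
        f"  risk: {label['risk']}\n"
        f"  note: {text.strip()}\n"
    )
```

When `classify` returns None the hook writes nothing, which matches the behavior described above. Anything it does return is only a suggestion in the inbox, never approved memory.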
Automatic capture does not eliminate human curation. It eliminates the bottleneck of remembering to ask. The inbox grows on its own. The review ritual remains the quality control mechanism, proportional to risk, not mandatory for everything.
The installation that matters is the vault: a Markdown folder turned into a Git repository, with AGENT.md and the folder structure. That is covered in the step-by-step of section 14, and depends on no agent. The code agents (Claude Code, Cursor, Codex, Hermes) are interchangeable clients that you point at this vault. This section shows how to connect a client, using Hermes as the example because it reads the contract natively. The same principle applies to the others, via adapters (section 08).
Hermes Agent is the NousResearch framework. It has a conversation loop, tool calling, its own memory and deployment in CLI, Telegram, Discord, WhatsApp and editors via Agent Client Protocol. What makes it a convenient client for this architecture is that it already reads files like AGENTS.md, SOUL.md and MEMORY.md when assembling the system prompt, so the federated contract loads with no glue code. Note that Hermes's own memory (MEMORY.md) is its own, not the vault's: it competes with sovereign memory if you let it. The goal is to point Hermes at the vault, not to let the vault become Hermes's internal memory.
# clone and setup
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
python -m venv .venv
source .venv/bin/activate
pip install -e .
# run from the vault, so it reads the contract
cd /path/to/your/federated-memory
hermes
Hermes reads context files from the working directory. Run Hermes from the vault root and keep an AGENTS.md there that points to your contract in 00-global/AGENT.md:
# AGENTS.md (vault root, read automatically by Hermes)
This is a federated memory base. Read the consumption protocol
before doing anything:
00-global/AGENT.md
Before responding to any task:
1. Identify the domain (writing, engineering, automation, etc).
2. Load the corresponding Context Pack from /60-context-packs/.
3. Never write to /00-global/ or /20-domains/ without explicit
human approval.
4. New memory suggestions go to /90-inbox/suggested-memory.md.
To feed the inbox with no manual action, use the client agent's hook-based capture, described in section 09c. In Claude Code that is a PostToolUse hook in .claude/settings.json. In any client the principle is the same: a script that writes suggestions to /90-inbox/suggested-memory.md, never directly to approved memory. You need nothing beyond that to start.
The client can choose and load context. The client does not decide on its own what becomes permanent truth. Permanent writing outside /90-inbox/ is a suggestion, not a write. If the client can be hostile, this is not guaranteed by contract: it is guaranteed by operating system isolation (section 09b).
The MCP server is optional. The client agent already reads the Markdown files directly through the filesystem, and the recommended floor (whitepaper section 05) does not require MCP. It enters when multiple external agents (Claude Desktop, Cursor, other MCP clients) need to query the same base without each implementing its own reader, exposing the vault through a standardized layer. There are two community-ready approaches.
Install the "Local REST API" plugin in Obsidian. The plugin starts a local HTTP endpoint. An MCP server connects to this endpoint and exposes standardized tools.
Typical exposed tools:
- list_vault_files, lists vault files
- search_vault, search by content
- create_vault_file, creates file (use with restriction in /90-inbox/)
- read_vault_file, reads a specific file
Advantage. More reliable than pointing the agent directly at a directory, because tools are pre-defined. Disadvantage. Requires Obsidian to be open.
Implementations like obsidian-mcp-server access the vault directly on the filesystem, without needing Obsidian running. Works with the app closed, on a server, or via SSH.
Advantage. App-independent. Disadvantage. No access to Obsidian features (Dataview, plugins, resolved links), only raw Markdown.
{
  "mcpServers": {
    "obsidian-memory": {
      "command": "node",
      "args": [
        "/path/to/obsidian-mcp-server/build/index.js",
        "/path/to/federated-memory"
      ]
    }
  }
}
Exposing memory as a tool also exposes risk. Configure write permissions only for /90-inbox/. The folders /00-global/, /20-domains/ and /10-projects/ should be read-only for any agent. Interoperability without governance becomes an attack surface.
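Whatever access layer is used, the write restriction comes down to a path check that cannot be fooled by `..` segments or absolute paths. The sketch below is one way to express it in Python, under assumptions: `is_write_allowed` is a hypothetical helper, not a function of any MCP server mentioned here. It is a cooperative-mode safeguard only; against a hostile agent the guarantee still has to come from the operating system.

```python
from pathlib import Path

WRITABLE_DIR = "90-inbox"


def is_write_allowed(vault_root, target):
    """True only when `target` resolves to a path inside <vault>/90-inbox/."""
    root = Path(vault_root).resolve()
    inbox = root / WRITABLE_DIR
    # Resolving first collapses ".." segments and follows existing symlinks,
    # so traversal tricks end up outside the inbox and are rejected.
    resolved = (root / target).resolve()
    return inbox in resolved.parents
```

A tool such as create_vault_file could call this before touching the disk and turn any rejected write into an inbox suggestion instead.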
The agent never writes permanent memory. When it identifies something that deserves to become truth, it writes a suggestion to the inbox. The human reviews, approves, edits or rejects. Only after that does the memory enter the stable base.
So far everything runs locally. This works well for one machine. When you need two or more agents, on different machines, to access the same vault, the architecture needs a central point. That point is a VPS.
The good news: the separation of concerns in federated memory already solves half the problem. The vault is just files. The MCP server is just a process. Moving to a remote server does not change the contracts, only where the processes run.
| Component | Where it runs | Why |
|---|---|---|
| Vault (markdown files) | VPS | Single source of truth. No version conflict. |
| MCP Server | VPS (local port) | Never directly exposed. Access via tunnel. |
| Hermes Agent | VPS or local | If running on VPS, no transfer latency. |
| Claude Code | Local machine | Human interface. Connects via SSH tunnel. |
| Git remote | GitHub | Synchronization and history. Not the primary vault. |
The vault is a regular Git repository. Each machine that needs a local copy does clone and pull. The VPS is the origin.
# On the VPS, initial setup
git init --bare /vault-remote.git
# On the local vault, point to the VPS as origin
git remote add origin ssh://user@your-vps/vault-remote.git
git push -u origin main
# On another machine, clone the vault
git clone ssh://user@your-vps/vault-remote.git ~/vault
# Sync routine (before working)
git pull origin main
# After inbox approvals
git add 00-global/ 10-projects/ 20-domains/ 60-context-packs/
git commit -m "feat: approved memory, [domain]"
git push origin main
Run git log --oneline -5 on the VPS and on another machine. The hashes must be identical. If they are not, there is divergence: resolve it before continuing.
The MCP server should not be exposed directly to the internet. Two options:
MCP runs on localhost on the VPS. You create a tunnel that maps the local port to your machine.
# Create tunnel: port 3000 on the VPS appears as localhost:3000 on your machine
ssh -L 3000:localhost:3000 user@your-vps -N
# To keep it in the background
ssh -L 3000:localhost:3000 user@your-vps -fN
# To close
kill $(lsof -t -i:3000)
With the tunnel active, Claude Code sees the remote MCP as if it were local. No additional configuration.
If more than one machine needs to access MCP simultaneously, put a reverse proxy (nginx or Caddy) in front with TLS and basic authentication.
# Caddy, minimal configuration (~/.caddy/Caddyfile)
# Generate the password hash with: caddy hash-password
mcp.yourdomain.com {
    basicauth /* {
        agent $2a$14$HASH_GENERATED_WITH_caddy_hash
    }
    reverse_proxy localhost:3000
}
Do not expose the MCP server directly on a public port without authentication. MCP has read access (and inbox write access) to the entire vault. An open endpoint is an open vault.
With the SSH tunnel active, Claude Code does not know it is talking to a remote server. Configure normally pointing to localhost.
# .claude/settings.json (on the local machine)
{
  "mcpServers": {
    "obsidian-vault": {
      "command": "npx",
      "args": ["-y", "mcp-obsidian"],
      "env": {
        "OBSIDIAN_API_URL": "http://localhost:3000",
        "OBSIDIAN_API_KEY": "your-token-here"
      }
    }
  }
}
Completion criterion: run /mcp in Claude Code and see the server listed as connected. Ask the agent to read a file from the vault. If it returns, the tunnel is working.
| What to protect | How |
|---|---|
| SSH access to VPS | SSH key mandatory. Disable password login (PasswordAuthentication no in sshd_config). |
| MCP server token | Environment variable, never hardcoded. Use .env outside the vault. |
| Writing to vault | MCP server with write permission restricted to /90-inbox/. Main vault read-only for the agent. |
| Vault in Git | Private repository. Check .gitignore to avoid committing .env or tokens. |
| MCP port | Never open on VPS firewall. Only accessible via tunnel or authenticated proxy. |
# Verify that port 3000 is NOT publicly exposed
# On the VPS:
ss -tlnp | grep 3000
# Should show: 127.0.0.1:3000. If it shows 0.0.0.0:3000, block it in the firewall
# UFW (Ubuntu)
ufw deny 3000
ufw allow 22 # SSH stays open
Harness Engineering is the discipline of connecting the agent to the environment in a controlled way: which tools it can use, how output is captured, how failures are handled. In federated memory, the harness is not a separate piece. It is distributed across the architecture.
| Component | Harness function |
|---|---|
| MCP server (optional) | When present, exposes tools to the agent in a standardized way. Without it, the agent reads the files directly and the contract plus the hooks do the job |
| AGENT.md | Defines behavior rules (declarative harness), the agent reads and applies before any action |
| Adapters | Translate the neutral AGENT.md contract for each specific agent (Claude, Cursor, Windsurf...) |
| capture-to-inbox.mjs | Captures agent output via PostToolUse hook, fills the inbox automatically without manual intervention |
| SESSION.lock | Controls concurrent access per project, identifies agent, machine and user in each session |
| review-inbox | Processes what the agent produced and decides the destination, complete audit trail in /99-archive/review-log.md |
- Write restriction: agents write only to /90-inbox/, with OS hardening in adversarial mode (section 09b)
- Concurrency: concurrent access controlled per project (SESSION.lock)
- Audit trail: session-log.md (who did what, when) and review-log.md (what was promoted or rejected)
- Tool scope: an agent in /30-clients/ only accesses client tools; an agent in /20-domains/engineering/ only accesses code tools. The access layer (direct filesystem or MCP, when in use) exposes different sets depending on the input context.
Harness engineering is about making the environment predictable enough for the agent to be reliable. Clear rules produce consistent behavior. An agent in an ambiguous environment is not more capable than one in a well-defined environment, just less predictable.
As you add specialized agents to the ecosystem, a new problem arises: each agent learns useful things, creates skills, writes playbooks, makes technical decisions. If this knowledge stays isolated in each agent's private memory, the others will reinvent the wheel indefinitely.
The hive mind solves this with a federated layer for reading and discovering knowledge between agents. Each agent can consult what others have published, without having access to anyone's private memory.
| Level | Where | Who accesses | What it contains |
|---|---|---|---|
| Private memory | ~/.hermes/ or internal equivalent | Only the agent itself | Session learnings, operational preferences, internal history |
| Published knowledge | /50-skills/published/ | Any agent via MCP (read) | Skills, playbooks, templates, patterns approved or promoted by TTL |
| Approved knowledge | /70-decisions/ and /20-domains/ | Any agent via MCP (read) | Formal decisions, domain principles, global rules |
1. Before creating a skill, consult /50-skills/INDEX.md, search by tag or domain to avoid duplication
2. Propose the skill in /90-inbox/ with type: skill and complete metadata (author_agent, domain, confidence, risk, tags)
3. Promotion depends on confidence + risk:
- verified + low → automatically promoted to /published/ in 7 days
- verified + medium → lazy human approval
- hypothesis → stays in /proposed/ awaiting human decision
- high risk → explicit approval required
Rules:
- Agents write only to /90-inbox/
- A new version replaces an old one via supersedes
- INDEX.md is consulted before any new creation
50-skills/
INDEX.md # navigable index by domain and agent
README.md # protocol, mandatory metadata, rules
published/ # approved skills, read-only for agents
proposed/ # skills pending review
deprecated/ # old skills, never delete, keep traceability
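The promotion rules for proposed skills form a small decision table, sketched here in Python. This is an illustration under assumptions: `promotion_action` and the returned action names are hypothetical, high risk is assumed to take precedence over every confidence level, and combinations the guide does not list fall back to human review.

```python
def promotion_action(confidence, risk, age_days, ttl_days=7):
    """Map a proposed skill's metadata to the next step in its review."""
    if risk == "high":
        return "explicit_approval"
    if confidence == "hypothesis":
        return "hold_in_proposed"
    if confidence == "verified" and risk == "low":
        # Self-promotes once the TTL (7 days in the guide) has elapsed.
        return "auto_promote" if age_days >= ttl_days else "wait_for_ttl"
    if confidence == "verified" and risk == "medium":
        return "lazy_approval"
    # Combination not covered by the guide: assumed to need a human.
    return "human_review"
```

Keeping the table in one function makes the risk-proportional policy easy to audit and to change in a single place.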
It is accumulated knowledge that any agent can discover and reuse. The federated vault is the shared board. The blackboard pattern at scale, not as real-time memory, but as a repository of persistent learning.
Every tool has undocumented behaviors, known bugs and workarounds you discover in practice. Without structured memory, the agent repeats the same diagnosis every session, and you pay the cost in time and frustration. Worse: the workaround you discovered in one project does not reach the next.
The solution: a per-tool pattern ledger in /50-skills/tool-patterns/, separate from project and client context. What the agent learns about Brandcraft in one project applies to all others.
| Tier | Occurrence | Status | What happens |
|---|---|---|---|
| 1 | 1st time | observed | Agent records symptom in inbox with type: tool_pattern. Problem documented, optional workaround. |
| 2 | 2nd time | auto_fix | Agent finds the pattern in tool-patterns/ and applies the documented fix without asking. |
| 3 | 3rd+ time | root_cause_pending | Trigger for root cause analysis. The problem is too recurrent to be just a workaround. |
Escalation is automatic: node scripts/escalate-patterns.mjs reads the inbox, increments the counter and updates the status. On the 3rd occurrence, the script explicitly alerts that human analysis is needed.
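The tier logic can be sketched as a counter mapped to a status. This is a hypothetical illustration, not the contents of scripts/escalate-patterns.mjs: the in-memory ledger object, its "tool:symptom" key format and the function names are assumptions.

```javascript
// Hypothetical sketch of the three-tier escalation; the real script may differ.
function statusFor(occurrences) {
  if (occurrences <= 1) return "observed";
  if (occurrences === 2) return "auto_fix";
  return "root_cause_pending";
}

// Records one more occurrence of a pattern and returns the updated entry.
// `ledger` is a plain object keyed by "tool:symptom" (an assumed key format).
function recordOccurrence(ledger, tool, symptom) {
  const key = `${tool}:${symptom}`;
  const occurrences = (ledger[key]?.occurrences ?? 0) + 1;
  const entry = { tool, symptom, occurrences, status: statusFor(occurrences) };
  ledger[key] = entry;
  if (entry.status === "root_cause_pending") {
    // Third occurrence onward: alert that human analysis is needed.
    console.warn(`[${tool}] "${symptom}" seen ${occurrences} times: human root cause analysis needed`);
  }
  return entry;
}

const ledger = {};
recordOccurrence(ledger, "brandcraft", "non-standard rate limit header");
console.log(ledger);
```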
This is the distinction that makes the mechanism useful. Project memory stays in /10-projects/. Client memory in /30-clients/. Tool patterns stay in /50-skills/tool-patterns/, available to any project, without contaminating any specific context.
50-skills/
├── published/ ← approved skills (procedures)
├── proposed/ ← skills awaiting approval
├── deprecated/ ← history
├── tool-patterns/ ← per-tool ledger ← NEW
│ ├── README.md
│ ├── brandcraft.md ← 1 file per tool
│ └── puppeteer.md
└── INDEX.md
---
type: tool_pattern
tool: brandcraft
symptom: API returns rate limit in non-standard header X-BrandCraft-Limit-Remaining
fix: check X-BrandCraft-Limit-Remaining before X-RateLimit-Remaining
confidence: verified
risk: low
---
1. First occurrence: node scripts/escalate-patterns.mjs creates tool-patterns/brandcraft.md with status: observed.
2. Second occurrence: the agent reads tool-patterns/brandcraft.md, finds status: auto_fix and applies the documented fix without re-diagnosing.
3. Third occurrence: the script sets root_cause_pending and warns that the problem needs to be solved at the source, not just worked around.

Do not confuse tool patterns with user preferences (preference), project facts (fact) or formal decisions (decision). Tool patterns are specific to external tool behaviors, not user choices or business context.
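The fix recorded in the brandcraft example (check X-BrandCraft-Limit-Remaining before X-RateLimit-Remaining) could be applied by an agent roughly as follows. This is only a sketch: the function name is hypothetical, and it assumes the caller passes response headers as a plain object with lower-cased names.

```javascript
// Illustrative only: `headers` is assumed to be a plain object
// whose keys are lower-cased header names.
function remainingQuota(headers) {
  // Order matters: the tool-specific header is checked before the standard one.
  const candidates = ["x-brandcraft-limit-remaining", "x-ratelimit-remaining"];
  for (const name of candidates) {
    const raw = headers[name];
    if (raw !== undefined && raw !== "") {
      const value = Number.parseInt(raw, 10);
      if (!Number.isNaN(value)) return value;
    }
  }
  return null; // neither header is present or parseable
}

console.log(
  remainingQuota({ "x-brandcraft-limit-remaining": "3", "x-ratelimit-remaining": "100" })
);
```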
Because the cost of intervention should scale with frequency. The 1st time, the problem is still rare: documenting is enough. The 2nd time, it is already recurring: automation is justified. The 3rd time, it is a systemic pattern: it requires root cause investigation, not another workaround.
Graphiti does not replace Obsidian. It enters when memory stops being just notes and starts requiring history, relations, events and changes. Some memories are stable facts. Others change over time. This difference is critical.
Note that Graphiti does not duplicate the memory: it indexes the vault's content to answer temporal and relational questions that the files alone do not answer efficiently. The source of truth remains the Markdown; Graphiti is a derived index, not a substitute.
Cases where Graphiti justifies the effort:
If your memory is predominantly stable facts (writing style, principles, patterns), Graphiti is overhead. Add it when temporal questions start appearing frequently and you feel Obsidian alone does not answer them.
Some features you would put in Graphiti already exist in tools focused only on decisions. DecisionNode (github.com/decisionnode/DecisionNode) is one example: it stores decisions as structured JSON with vector embeddings, exposes them via CLI and an MCP server, has history tracking, soft-delete (deprecate/activate) and automatic conflict detection by semantic similarity.
It is a sub-system, not an alternative to federated architecture. It solves the decision module with semantic search; it does not cover Context Packs, domains, style guides or project context. But if your pain is specifically "agents do not remember architecture decisions made last week", DecisionNode can solve this before you need to spin up Graphiti.
Obsidian as the general base. DecisionNode for the structured decision sub-module with semantic search. Graphiti only when you need entity relations and change history at the temporal graph level. No need to spin up everything at once.
The steps below are sequential. Each has an objective completion criterion.
mkdir federated-memory
cd federated-memory
mkdir -p 00-global 10-projects 20-domains 30-clients 40-workflows \
50-skills 60-context-packs 70-decisions 80-agent-adapters \
90-inbox 99-archive
git init
Done when: the folder structure exists and the git repository is initialized.
Copy the template from section 07 to 00-global/AGENT.md and adjust the domain list to yours.
Done when: the file exists, lists your real domains, and you can explain each rule out loud.
Create a folder for each domain in 20-domains/. Use the criterion from section 06: own vocabulary + outputs that should not mix with others.
Done when: you have between 3 and 6 domains. More than 8 is overhead. Fewer than 3 almost never justifies federated memory.
Create in 60-context-packs/:
- writing-style.md, writing tasks in your style
- code-work.md, code tasks with your conventions
- architecture-review.md, architecture analysis

Use the filled example from section 09 as a model. Each should fit on one screen.
Done when: the three packs exist, each has a goal, Use and Avoid lists, and the temporal validity field.
If using Claude Code, create 80-agent-adapters/claude/CLAUDE.md pointing to AGENT.md and a default Context Pack. Use the template from section 08. Also configure the capture hook in .claude/settings.json (PostToolUse running scripts/capture-to-inbox.mjs, per section 09c). For other clients, the adapter follows the same principle: point to the contract and wire a tool-agnostic capture hook.
Done when: you can run the agent in the vault directory, it reads the adapter automatically, and relevant actions produce suggestions in /90-inbox/suggested-memory.md without manual intervention.
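As a rough sketch of the hook wiring described in this step, the snippet below builds the object that would be written to .claude/settings.json and prints it as JSON. The matcher value "Write|Edit" is an assumption about which tool calls count as relevant, and the command assumes the agent is started from the vault root; adjust both to your setup.

```javascript
// Sketch of a PostToolUse hook entry for .claude/settings.json.
// The matcher below is an assumption; the script path comes from the guide.
const settings = {
  hooks: {
    PostToolUse: [
      {
        matcher: "Write|Edit", // assumed: capture only after file-writing tools
        hooks: [
          { type: "command", command: "node scripts/capture-to-inbox.mjs" },
        ],
      },
    ],
  },
};

// Print the JSON to paste into .claude/settings.json.
console.log(JSON.stringify(settings, null, 2));
```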
Skip this step if you are not going to use Hermes. The floor from steps 1 to 5 is already enough for a single client. To connect Hermes as an additional client, use the instructions from section 10 and add an AGENTS.md at the vault root referencing 00-global/AGENT.md.
Done when: running hermes inside the directory, it respects the "do not load everything" rule and uses Context Packs.
Skip this step if you do not have multiple external agents consuming the same base. If you need it, choose between Option A (Local REST API) or Option B (direct filesystem) from section 11. Configure in your preferred MCP client. Restrict writing to /90-inbox/.
Done when: an external agent (Claude Desktop, Cursor, ChatGPT with MCP) queries your vault and respects the write limits.
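The write limit can be illustrated with a path guard that any access layer might apply before accepting a write. This is a minimal sketch under stated assumptions: the function names are hypothetical, paths are vault-relative strings, and symlinks are not handled, so it complements rather than replaces operating system isolation.

```javascript
// Normalizes a vault-relative path without touching the filesystem:
// converts backslashes, drops "." and empty segments, resolves "..".
// Returns null when the path would escape the vault root.
function normalizeVaultPath(relPath) {
  const out = [];
  for (const seg of relPath.replace(/\\/g, "/").split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (out.length === 0) return null;
      out.pop();
    } else {
      out.push(seg);
    }
  }
  return out;
}

// Allows writes only to files inside 90-inbox/ (not the folder itself).
function isWriteAllowed(relPath) {
  const segs = normalizeVaultPath(relPath);
  return segs !== null && segs.length >= 2 && segs[0] === "90-inbox";
}

console.log(isWriteAllowed("90-inbox/suggested-memory.md")); // true
console.log(isWriteAllowed("90-inbox/../00-global/AGENT.md")); // false
```

Resolving ".." before the check is what stops a path that starts in the inbox from landing in another folder.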
Perform the same task with two different agents consuming the same memory. Check three things:
Done when: all three answers are "yes". If any is "no", the problem is in the Context Pack, not the agent.
Only execute this step when you feel temporal pain: decisions replacing decisions, questions about history, information validity. Before that, it is over-engineering.
The biggest risk of persistent memory is not lack of information. It is excess information with undue authority. Every architecture needs to separate raw conversation, hypothesis, draft and permanent memory.
Agent suggestions go to /90-inbox/.

Human approval only works if there is a set time to do it. Without a fixed ritual, /90-inbox/ becomes a trash bin: suggestions pile up, nothing becomes memory, and agents keep repeating the same questions. Set aside a short weekly block, 20 to 40 minutes usually suffices, to process the inbox to zero.
The repository ships two equivalent scripts in /scripts/ that walk through the review entry by entry:
# Linux / macOS
VAULT_PATH=/path/to/vault ./scripts/review-inbox.sh
# Windows (PowerShell)
$env:VAULT_PATH = 'C:\path\to\vault'
./scripts/review-inbox.ps1
For each pending suggestion the script shows the block and offers four decisions:
- Approve: writes the text to the Suggested destination and removes it from the inbox.
- Edit: opens the block in an editor ($EDITOR or notepad), adjusts text or destination and asks again.
- Reject: discards the suggestion.
- Defer: leaves the suggestion pending for a later review.

Every decision, including rejections and deferrals, is logged in /99-archive/review-log.md with date, domain, destination and summary. This log is what allows, weeks later, seeing how many suggestions became memory, how many were discarded and where agents err most frequently. If the rejection rate is high, the problem is not the inbox: it is the agent's capture rule.
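To make the log usable for the rejection-rate check described above, each entry needs a stable shape. The sketch below assumes a one-line, pipe-separated entry format and the decision labels approved, edited, rejected and deferred; the actual format written by the review scripts may differ.

```javascript
// Hypothetical log line format: the real review-log.md layout may differ.
function formatLogEntry({ date, decision, domain, destination, summary }) {
  return `- ${date} | ${decision} | ${domain} | ${destination} | ${summary}`;
}

// Share of rejected suggestions among those actually decided.
// Deferred entries are excluded because no decision was made yet.
function rejectionRate(entries) {
  const decided = entries.filter((e) => e.decision !== "deferred");
  if (decided.length === 0) return 0;
  const rejected = decided.filter((e) => e.decision === "rejected").length;
  return rejected / decided.length;
}

const sample = [
  { date: "2025-01-06", decision: "approved", domain: "code", destination: "20-domains/code/", summary: "naming rule" },
  { date: "2025-01-06", decision: "rejected", domain: "marketing", destination: "", summary: "one-off remark" },
];
console.log(sample.map(formatLogEntry).join("\n"));
console.log(rejectionRate(sample));
```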
| Anti-pattern | Symptom | Antidote |
|---|---|---|
| Single super brain | Marketing pulls code, code pulls research, everything mixes | Isolated domains from day one |
| Always-on context | Slow agent, hallucination, out-of-scope answers | Context Packs per task |
| Automatic memory without review | Hypotheses turn into facts, the base degrades in weeks | Mandatory inbox, human approval |
| Adapter as the primary source | Switching tools requires rewriting everything | AGENT.md is the source, the adapter only translates |
| Powerful orchestrator too early | Hermes/LangGraph doing things no one understands | Start with direct reading, add orchestration when it hurts |
| Unscoped MCP/API | Any agent writes to any folder | Write only to /90-inbox/, the rest read-only |
The reference implementation uses a Markdown vault versioned with Git. The architecture, however, was designed to be portable, and any part can be swapped without touching the contract.
Alternatives to Obsidian. Logseq, Foam, Dendron, VSCode with Markdown folders, bare Git repository. The requirement is to be readable, versionable, and easy for humans to review.
Other client agents. LangGraph for stateful flows with explicit graphs, CrewAI for role-based agent teams, Letta for persistent agents, Microsoft Agent Framework / AutoGen for multi-agent, Hermes. The requirement is to classify the task, choose the domain, select the Context Pack, and deliver context without becoming the owner of memory.
Access layer. Direct file reading, local CLI, REST API. MCP is recommended when multiple agents need to consume memory in a standardized way.
Machine synchronization. Git natively solves cross-machine sync: push on one machine, pull on another. Proprietary sync (iCloud, Dropbox, Notion) works but adds an intermediary that may conflict with Git's resolution model. The recommended path is Git + remote repository.
For the decisions sub-module: DecisionNode. If you want to replace the files in /70-decisions/ with something that has semantic search and automatic history, DecisionNode (github.com/decisionnode/DecisionNode) does exactly that: decisions in JSON, embeddings via Gemini, cosine similarity search, MCP server ready for Claude Code, Cursor, and Windsurf, conflict detection, reversible soft-delete. The requirement for using it as a replacement remains the same: it cannot write permanent memory without human approval, and the status: approved/superseded rule must be respected. In DecisionNode, this means configuring the agent in "strict" mode and reviewing what enters before activating.
There is a category of projects that takes the inverse approach to federated: concentrate memory, automation, integrations, and agents into a single monolithic application. They are often called "Personal AI", "Life OS", or "productivity super-app". They win on surface usability: ready-made UI, packaged integrations, visible community, near-zero adoption curve.
The cost appears later. Memory lives inside the application, in the application's format. Switching tools means starting from scratch. The architecture is centralizing by design, exactly the anti-pattern that motivated this guide. There is no neutral contract; the adapter is the product. There is no exposed mtime, so temporal validity cannot be checked by inspection and requires manual review; in the federated model, the client agent or a hook does what the application does not expose. The conflict rule is defined by the UI, not by an auditable file.
| Criterion | Federated memory | Centralizing systems / Life OS |
|---|---|---|
| Adoption speed | Slow, requires learning the contract | Fast, install, connect, use |
| Portability | Total (Markdown + Git) | Locked to app |
| Sovereignty | User | Product |
| Temporal validity by inspection | Yes (file mtime) | Not available |
| Switching cost | Swap adapter | Start over |
For those who prioritize adoption speed over sovereignty, the centralizing path is the right choice. For those who prioritize portability and governance, it is a problem dressed as a solution.
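The "temporal validity by inspection" row in the table can be made concrete with a staleness check on a file's modification time. This is a sketch only: the 90-day threshold is a hypothetical value, and the mtime is passed in as a number so the function stays pure (in Node it could come from fs.statSync(path).mtimeMs).

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns true when a file was last modified more than `maxAgeDays` ago.
// `mtimeMs` and `nowMs` are epoch milliseconds.
function isStale(mtimeMs, maxAgeDays, nowMs = Date.now()) {
  return nowMs - mtimeMs > maxAgeDays * DAY_MS;
}

const now = Date.now();
const MAX_AGE_DAYS = 90; // hypothetical review threshold, not from the guide
console.log(isStale(now - 100 * DAY_MS, MAX_AGE_DAYS, now)); // true
console.log(isStale(now - 10 * DAY_MS, MAX_AGE_DAYS, now)); // false
```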
You can swap tools. You cannot swap the principles: separate domains, neutral contract, lean context, human approval, write policy restricted to inbox, deterministic conflict resolution, portability.
# Decision: [short title]
Date: YYYY-MM-DD
Scope: [domain or project]
Status: proposed | approved | superseded | archived
Source: [conversation, meeting, document]
Decision: [the decision in one sentence]
Reason: [why this was decided]
Impacts: [what changes]
Supersedes: [link to previous decision, if any]
Related context packs: [packs that should reflect this decision]
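One way to use the Status and Supersedes fields of the decision template is to compute which decisions are currently in force. The sketch below assumes each decision has already been parsed into an object with id, status and supersedes (the id of the replaced decision, or null); that parsed shape and the function name are illustrative assumptions.

```javascript
// Returns approved decisions that no other approved decision supersedes.
// Assumed input shape: { id, status, supersedes } with supersedes = id or null.
function activeDecisions(decisions) {
  const replaced = new Set(
    decisions
      .filter((d) => d.status === "approved" && d.supersedes)
      .map((d) => d.supersedes)
  );
  return decisions.filter((d) => d.status === "approved" && !replaced.has(d.id));
}

const decisions = [
  { id: "d1", status: "approved", supersedes: null },
  { id: "d2", status: "approved", supersedes: "d1" },
  { id: "d3", status: "proposed", supersedes: "d2" },
];
console.log(activeDecisions(decisions).map((d) => d.id)); // [ 'd2' ]
```

A proposed decision does not displace anything until it is approved, which keeps the old memory authoritative until a human decides.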
# Project: [name]
Goal: [what this project should solve]
Current status: [where it stands now]
Stack: [technologies, tools]
Constraints: [technical limits, deadline, budget]
Important files: [relevant paths]
Relevant domains: [which memory domains apply]
Relevant context packs: [which packs to use]
Open questions: [what is still undecided]
Do not use: [what NOT to bring into this project]
# Memory Update Proposal
Source: [conversation, agent, document]
Suggested domain: [target domain]
Suggested memory type: fact | decision | preference | workflow | risk
Confidence: low | medium | high
Why this should become memory: [justification]
Proposed text: [the exact text to become memory]
Human decision: approved | edited | rejected
Reviewer: [name]
Decision date: YYYY-MM-DD
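An agent or capture hook could render the Memory Update Proposal template from structured data, rejecting values outside the allowed lists. This is a minimal sketch: the function name and the input property names (source, domain, type, confidence, why, text) are assumptions, while the field labels and allowed values come from the template above.

```javascript
const MEMORY_TYPES = ["fact", "decision", "preference", "workflow", "risk"];
const CONFIDENCE_LEVELS = ["low", "medium", "high"];

// Renders a proposal block; the human-review fields are left blank on purpose.
function renderProposal(p) {
  if (!MEMORY_TYPES.includes(p.type)) {
    throw new Error(`Unknown memory type: ${p.type}`);
  }
  if (!CONFIDENCE_LEVELS.includes(p.confidence)) {
    throw new Error(`Unknown confidence: ${p.confidence}`);
  }
  return [
    "# Memory Update Proposal",
    `Source: ${p.source}`,
    `Suggested domain: ${p.domain}`,
    `Suggested memory type: ${p.type}`,
    `Confidence: ${p.confidence}`,
    `Why this should become memory: ${p.why}`,
    `Proposed text: ${p.text}`,
    "Human decision:",
    "Reviewer:",
    "Decision date:",
  ].join("\n");
}

console.log(
  renderProposal({
    source: "conversation",
    domain: "code",
    type: "preference",
    confidence: "medium",
    why: "Repeated in three sessions",
    text: "Prefer small, single-purpose modules.",
  })
);
```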
- Is there an AGENT.md without reference to specific tools?
- Are the adapters (CLAUDE.md, codex.md, etc.) working?
- Does the agent read the contract (AGENT.md or adapter) automatically on startup?
- Does the agent send suggestions to /90-inbox/, without writing permanent memory on its own?
- Does /90-inbox/suggested-memory.md exist, receiving agent suggestions?
- Is the agent prevented from writing directly to /00-global/ or /20-domains/?
- If you use MCP, is writing restricted to /90-inbox/? (optional item, the floor does not require MCP)

The references below anchor the technical components. The guide does not depend on a specific tool, but uses concepts and patterns from these ecosystems.
Persistent memory is only useful when isolation exists. Without isolation, the system gets large, slow and confusing. With separate domains, neutral contract, Context Packs and blackboard as coordination pattern, memory becomes infrastructure. Not accumulated noise.
Final thesis. Do not create a super brain. Create a federated memory: human at the origin, segmented by domain, with a neutral contract as the interface and Git as the spine. Client agents read the contract, write to the inbox and never become the memory's owner. They do not need to talk to each other. The shared board is already the coordination.
A note on convergence. Tools like DecisionNode, developed independently, reached conclusions similar to this guide: structured decisions, multi-agent via MCP, auditable history, implicit blackboard pattern. This is not a coincidence. It is a sign that the problem is real and the architectural direction is correct. The market is converging toward federated memory. This guide tries to name the pattern before it becomes just a product feature.
Federated Memory for AI Agents · Guide v3.0
André Almeida · andrealmeidadc.com