Federated Memory for AI Agents

What changed in v3.0

v2 sold Hermes as an active core with four roles (router, memory manager, scope controller, policy engine). The field test and Hermes's own official documentation showed that this design did not exist: Hermes was never an orchestrator, it is a client. v3 repositions the architecture. Memory is a passive file, the client agent is an interchangeable piece, and Git is the spine that unifies versioning and synchronization. Governance comes in two declared modes: cooperative (contract, the recommended floor) and adversarial (enforcement through operating system isolation, below the agent). Capture runs through the agent's own hooks (PostToolUse running scripts/capture-to-inbox.mjs), which deposit suggestions into /90-inbox/. Auditing is risk-proportional: entries marked verified and low-risk self-promote by TTL, while hypotheses and high-risk entries require a human decision. MCP and Graphiti become optional layers, added when volume becomes painful, not floor components.

01 The problem is not AI, it is memory
02 What this guide builds
03 Architectural principles
04 The blackboard pattern
05 Implementation components
06 Obsidian structure
06b Multimodal, assets in the vault
07 AGENT.md: the neutral contract
08 Per-tool adapters
09 Context Packs with a real example
09b The agent as client
09c Automatic capture, inbox without manual intervention
10 Installation: connecting a client agent to the vault
11 Obsidian via MCP in practice
12 Operational flow
13 Graphiti: when it enters
14 Executable step by step
15 Minimum governance
16 Anti-patterns
17 Possible substitutions
18 Ready-made templates
19 Validation checklist
20 References and closing

01 / Storytelling / The problem is not lack of AI, it is lack of reliable memory

Every AI tool looks brilliant in the first few minutes. It understands the request, writes code, suggests campaigns, summarizes meetings and even seems to have followed the project from the start. The problem shows up in the second, third or tenth session: the AI forgets decisions, mixes context, repeats questions and needs to be re-educated constantly.

The natural reaction is to build a "super brain": dump everything in one place and let the agent read it. At first this seems to solve it. Then the system degrades. A marketing task pulls in coding conventions. One project contaminates another. Old memories enter as if they were still true. The agent knows too much and, precisely because of that, starts making better mistakes.

This guide proposes a different path: federated memory. Separate domains, readable files, neutral contract, context packages and controlled routing. It is not about letting an agent run the house. It is about creating a base any agent can consult without messing up the environment.

Summary in one sentence

The Markdown vault stores the reviewed memory, with Git as the spine for versioning and sync. Context Packs deliver the minimum sufficient context. The contract guides reading. Agents execute as interchangeable clients, without needing to talk to each other. MCP and Graphiti are optional evolution, when the pain appears.

02 / Scope / What this guide builds

Reference implementation using a Markdown vault versioned in Git. The architecture, however, is not locked to any tool. The goal is a portable memory layer for any current or future agent.

Knowledge base in Markdown, human-readable and versionable in Git.
Vaults or folders separated by domain, project or work area.
AGENT.md as a neutral contract, with no dependency on Claude Code, Codex, Cursor or any executor.
Context Packs from the start, because memory only becomes useful when delivered as lean context.
Short per-tool adapters (CLAUDE.md, codex.md, Cursor rules).
Git as the spine of versioning and cross-machine synchronization.
MCP server as an optional layer, exposing the vault in a standardized way for multiple agents.
Graphiti as a temporal upgrade when relations and changes over time matter.

What this guide is NOT

It is not a closed product. It is not a framework that needs to control all projects. It is not a proposal for automatic memory without review. Permanent memory must pass through human curation.

03 / Foundation / Architectural principles

Before the principles, separate three layers, because confusing them is the most common mistake: the model (the LLM that reasons), the code agent (Claude Code, Cursor, Codex, Hermes, which runs the loop and touches the files) and memory (the persistent, sovereign base that belongs to neither). The federated architecture exists to keep memory outside the agent.

Sovereign memory. The base belongs to the user. Agents come and go; context stays portable. Git is the spine: the vault is a repository, which delivers versioning, authorship, diff and cross-machine synchronization in a single piece.

Isolated domains. Marketing, programming, clients, products and research should not live in the same semantic soup. Isolation is structural, by folders, not an instruction the agent can ignore.

Neutral contract. AGENT.md describes how any agent should consume the memory. Adapters only translate for specific tools. No agent is the core; all are interchangeable clients.

Minimum sufficient context. The agent should receive what is needed for the task, not the entire history of your digital life.

Human as risk-proportional auditor. Capture is automatic; approval is not mandatory for everything. Verified low-risk entries promote on their own with TTL; hypotheses and high risk require human decision. The human is a quality filter, not a capture bottleneck.

Governance in two modes. In cooperative mode, the default, the rule lives in the contract and in the structure, and that is enough. Against an agent that may be hostile, the contract is not enough: enforcement comes from the operating system, below the agent (a read-only container except for the inbox, or a separate user). Never from a core on the agent's side.

04 / Core insight / The blackboard pattern: why agents do not need to talk

A common confusion in multi-agent architectures is assuming agents need to talk to each other. They do not. In a well-designed federated memory, coordination is asynchronous via shared state. This pattern has a name: blackboard architecture.

Think of a classroom blackboard. Different people write on it at different times. Whoever enters later reads what is there. Nobody needs to be present when another wrote. Coordination happens through the board.

Figure 1, Asynchronous coordination via blackboard

Practical implications of this choice:

There is no inter-agent communication protocol to design, debug or maintain.
No agent needs to be running when another queries the memory.
Swapping one agent for another does not break the system: the new one reads the same board.
Conflicts are resolved by a human via inbox, not by automatic consensus.
The risk of error cascade between agents disappears because there is no direct propagation.

Practical translation

When you use Claude Code in the morning and Codex at night, only the executor changes. The curated memory is the same. The night agent reads what was approved in the morning. No real-time orchestration. No message queue. No broker.
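
A minimal sketch of that morning and night handoff, assuming the `90-inbox/` and `70-decisions/` folders used in this guide. The function names (`suggest`, `approve`, `read_approved`) and the one-file-per-approved-entry layout are illustrative assumptions, not part of any agent's API. The two agents never call each other; the files are the only shared state.

```python
from __future__ import annotations
from pathlib import Path

def suggest(vault: Path, agent: str, text: str) -> None:
    """An agent proposes a memory: it only appends to the inbox."""
    inbox = vault / "90-inbox" / "suggested-memory.md"
    inbox.parent.mkdir(parents=True, exist_ok=True)
    with inbox.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{agent}] {text}\n")

def approve(vault: Path, entry_id: str, text: str) -> None:
    """The human reviewer promotes a suggestion into stable memory."""
    decisions = vault / "70-decisions"
    decisions.mkdir(parents=True, exist_ok=True)
    (decisions / f"{entry_id}.md").write_text(text + "\n", encoding="utf-8")

def read_approved(vault: Path) -> list[str]:
    """Any later agent reads the board; it needs no contact with the writer."""
    decisions = vault / "70-decisions"
    if not decisions.is_dir():
        return []
    return [p.read_text(encoding="utf-8").strip() for p in sorted(decisions.glob("*.md"))]
```

Until `approve` runs, `read_approved` returns nothing new: the suggestion sits in the inbox and does not reach the next agent.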

05 / Anatomy / Implementation components

Figure 2, Reference architecture layers

Component	Function	Replaceable by
Obsidian	Human-reviewed base in Markdown	Logseq, Foam, bare Git repository
Vaults / folders	Isolate domains and reduce contamination	Subfolders in the same vault
AGENT.md	Neutral consumption contract	No. It is the portability core.
Context Packs	Minimum context per task	Curated RAG, modular prompts
Adapters	Contract translation for the tool	Specific to each agent
Git	Versioning, authorship and sync of the vault	Essential. No substitute delivers all of it together.
MCP server	Standardized multi-agent access	Direct reading, CLI, REST API
Graphiti	Temporal memory and relations	Add only when it hurts

06 / Structure / How to organize Obsidian

Obsidian is the base because it works well with Markdown, links, backlinks, tags and human navigation. The v2 template uses eleven numbered folders, optimized for direct reading by client agents and for growing without becoming a mess.

Single vault, logical separation

In this guide, we use "vaults" in the sense of memory domains. The default implementation uses a single Obsidian vault with folders separated by domain. When there is a need for strong isolation, security or selective sharing, these domains can be promoted to separate physical vaults.

/federated-memory
├── 00-global/           # AGENT.md contract, general rules
│
├── 10-projects/         # active projects (start, middle, end)
│   └── project-a/
│       ├── PROJECT.md
│       ├── notes/
│       └── deliverables/
│
├── 20-domains/          # stable domains (engineering, writing, research...)
│   ├── engineering/
│   ├── writing/
│   └── research/
│
├── 30-clients/          # clients, candidate for separate physical vault
│   └── <client>/
│
├── 40-workflows/        # repeatable flows (release, incident, review...)
│
├── 50-skills/           # reusable capabilities
│
├── 60-context-packs/    # context packages per task
│   ├── linkedin-writing.md
│   ├── code-review.md
│   ├── research.md
│   └── planning.md
│
├── 70-decisions/        # formal decisions with approved/superseded status
│
├── 80-agent-adapters/   # per-agent adapters
│   ├── claude/{CLAUDE.md, AGENTS.md}
│   ├── cursor/.cursorrules
│   ├── codex/AGENTS.md
│   └── windsurf/.windsurfrules
│
├── 90-inbox/            # suggestions pending human review
│   └── suggested-memory.md
│
└── 99-archive/          # logs, obsolete packs, archived
    ├── review-log.md
    ├── pack-usage.log
    └── pack-status.md
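
The layout above can be scaffolded with a few lines of Python. This is a sketch, not the guide's own template installer: the function name `scaffold_vault` and the placeholder heading written into the inbox file are assumptions; only the folder names come from the tree.

```python
from __future__ import annotations
from pathlib import Path

# Top-level folders copied from the tree above.
FOLDERS = [
    "00-global", "10-projects", "20-domains", "30-clients", "40-workflows",
    "50-skills", "60-context-packs", "70-decisions", "80-agent-adapters",
    "90-inbox", "99-archive",
]

def scaffold_vault(root: Path) -> list[Path]:
    """Create the numbered folders and an inbox file; safe to re-run."""
    created = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    inbox = root / "90-inbox" / "suggested-memory.md"
    if not inbox.exists():
        # Placeholder heading (assumption); an existing inbox is never overwritten.
        inbox.write_text("# Suggested memory\n", encoding="utf-8")
    return created
```

Running `git init` in the same root afterwards turns the folder into the versioned vault the rest of the guide assumes.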

Criteria for creating a new domain

Objective criteria: create a domain when there is vocabulary, decisions or output style that should NOT appear in another domain's response. If "writing" uses "voice, hook, lead" and "engineering" uses "stack, latency, deploy", and crossing them confuses the agent, they are two domains. If two areas use the same vocabulary and make compatible decisions, it is a folder within a single domain, not a new domain.

When to promote to a separate physical vault

A separate physical vault makes sense

When there is a contractual isolation requirement (NDA, sensitive client data), independent synchronization (a different cloud account), or selective read sharing without exposing the rest. The typical case is 30-clients/.

A separate physical vault is overkill

When you are just starting, when domains exchange context frequently, or when you are a single person using a single machine. Start with folders in the same vault; promote later if the pain justifies it.

06b / Multimodal / Assets in the vault

The vault is Markdown by default. But multimodal support does not require changing the architecture, it only requires a convention for where assets live and how they are referenced.

Multimodal support is a capability of the LLM configured in the agent, not of the agent itself. The agent passes the file, the LLM processes it. Check the provider's updated documentation:

Anthropic: docs.anthropic.com
Google: ai.google.dev
OpenAI: platform.openai.com/docs
Zhipu/GLM: docs.z.ai
MoonshotAI/Kimi: platform.moonshot.cn
xAI/Grok: docs.x.ai

Most major LLMs today accept images, and many accept PDFs. Audio and video support varies by provider, so confirm each modality in the provider's documentation before relying on it.

Asset structure in the vault:

/federated-memory
  /20-domains/
    /engineering/
      /assets/
        arquitetura-v2.png
        decisao-banco.pdf
    /writing/
      /assets/
        exemplos-visuais/
  /10-projects/
    /project-a/
      /assets/
        wireframes/
        screenshots/
        bugs/
          bug-001/
            screenshot-erro.png
            tentativa-01-failed.md
            tentativa-02-success.md

Markdown reference rule: every asset is referenced in the .md file of the same domain or project. The Context Pack points to the .md. The agent loads the asset when the .md references it.

Example in a DECISION.md:

Reference diagram: ./assets/arquitetura-v2.png
The agent should load this image when analyzing stack decisions.

Rule for large assets

Videos, datasets and files above 10 MB should not live in the vault. Reference them by URL or external path. The vault must remain lightweight and versionable.
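
A small check can enforce this rule before a commit. The sketch below assumes the limit means 10 MiB (10 × 1024 × 1024 bytes) and that the `.git` directory should be ignored; the function name `oversized_files` is hypothetical.

```python
from __future__ import annotations
from pathlib import Path

LIMIT_BYTES = 10 * 1024 * 1024  # the 10 MB rule, read here as 10 MiB (assumption)

def oversized_files(vault: Path, limit: int = LIMIT_BYTES) -> list[Path]:
    """Return files under the vault larger than `limit`, skipping .git."""
    found = []
    for path in vault.rglob("*"):
        if ".git" in path.relative_to(vault).parts:
            continue  # Git's own objects are not vault content
        if path.is_file() and path.stat().st_size > limit:
            found.append(path)
    return sorted(found)
```

Any path this returns is a candidate for moving out of the vault and referencing by URL or external path.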

07 / Contract / AGENT.md: the most important piece

The AGENT.md is not a giant prompt. It is the contract that teaches any agent how to consume the memory without making a mess. It lives in 00-global/AGENT.md.

# AGENT.md

Purpose:
This repository contains the federated memory used by AI agents.
The memory is owned by the human user. Agents are consumers,
not owners.

Rules:
1. Do not load the entire memory base.
2. Start from the relevant Context Pack in /60-context-packs/.
3. If no Context Pack exists, ask which domain is relevant.
4. Permanent writes are forbidden outside /90-inbox/ in any
   execution mode (interactive, headless, scheduled).
5. Memory conflicts: the most recent entry with status: approved
   wins. Entries with status: superseded stay in history but are
   ignored at runtime. Never infer winner by file timestamp alone.
6. When unsure, create a suggested memory entry in
   /90-inbox/suggested-memory.md instead of guessing.

Folders:
- 00-global, 10-projects, 20-domains, 30-clients, 40-workflows,
  50-skills, 60-context-packs, 70-decisions, 80-agent-adapters,
  90-inbox, 99-archive

Decision frontmatter (in /70-decisions/):
- id, date, status (approved | superseded | pending), supersedes,
  domain, owner

Why this works

Hermes Agent loads context files like AGENTS.md and .hermes.md when assembling the system prompt. If you place your AGENT.md in the path Hermes reads, or reference it from its AGENTS.md, your rules apply automatically.

Hierarchical global rules: RULES.md

AGENT.md defines how the agent behaves. RULES.md defines the business and stack rules that apply to every project. They are two distinct files with distinct responsibilities: agent behavior versus developer and company standards.

Create 00-global/RULES.md with two blocks:

# RULES.md
# Global rules, apply to all projects.
# To override a rule for a specific project,
# register an override in that project's 70-decisions/.

## Company Rules
- stack: TypeScript, never MongoDB
- tests: TDD mandatory
- security: never commit secrets, use environment variables

## Dev Rules
- commits: English, imperative, max 72 characters
- PR: never larger than 400 lines
- review: self-review before opening PR

The agent loads RULES.md every session, right after AGENT.md. The full AGENT.md instructs this through its ## Global Rules section, which the excerpt above omits.

Project-level override

When a project needs to deviate from a global rule, the deviation is recorded in the project's 70-decisions/ with required fields. The RULES.md is never changed.

---
id: DEC-OVERRIDE-001
date: 2026-06-01
approved-by: André
status: approved
rule-override: company/stack/mongodb
---

# Override: MongoDB usage in project X

## Reason
Client requires MongoDB by contract. Migration planned for Q3 2027.

## Scope
Valid for this project only. Does not alter RULES.md.

## Review
review_date: 2026-12-01

Context hierarchy

The agent applies rules in this order: Company Rules (broadest, most stable) > Dev Rules > Project rules. An approved override in 70-decisions/ takes precedence over RULES.md. The agent executes it without questioning, because the deviation has already been decided and documented.
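
The override lookup can be sketched as follows, assuming each decision has already been parsed into a dict of its frontmatter fields. The keys `rule-override` and `status` come from the example above; the function names are hypothetical, and this sketch does not check `review_date`.

```python
from __future__ import annotations

def active_override(rule_key: str, decisions: list[dict]) -> dict | None:
    """Return the approved override for `rule_key`, or None if the global rule stands."""
    for decision in decisions:
        if decision.get("rule-override") == rule_key and decision.get("status") == "approved":
            return decision
    return None

def rule_applies(rule_key: str, decisions: list[dict]) -> bool:
    """A global rule from RULES.md applies unless an approved override exists."""
    return active_override(rule_key, decisions) is None
```

A pending or superseded override changes nothing: only `status: approved` lifts the global rule for that project.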

08 / Adapters / Per-tool translation

Each agent has its own context file convention. The adapter is short. It does not replace AGENT.md. It just points to it.

Figure 3, Adapters per agent

Example adapter for Claude Code:

# CLAUDE.md

Read the shared memory protocol at:
../00-global/AGENT.md

For this project, start with:
../10-projects/project-a/PROJECT.md

Use Context Packs before reading raw notes:
../60-context-packs/architecture-review.md

Do not modify permanent memory directly.
If a new memory seems useful, write a suggestion to:
../90-inbox/suggested-memory.md

09 / Packages / Context Packs with a filled example

Without a Context Pack, the agent needs to dig through the vault. With a Context Pack, it receives a lean, task-oriented package. This reduces noise, improves consistency and prevents cross-domain contamination.

Figure 4, Anatomy of a Context Pack

Filled example: `linkedin-writing.md`

# Context Pack: linkedin-writing

Goal:
Help an AI agent draft LinkedIn posts in André's voice for
the technology and AI community.

Use:
- /20-domains/writing/STYLE_GUIDE.md
- /20-domains/writing/voice-examples/*.md
- /20-domains/writing/hooks-that-worked.md
- Last 5 entries from /20-domains/writing/recent-posts.md

Avoid:
- /20-domains/engineering/*  (different vocabulary)
- /10-projects/*  (unless explicitly mentioned)
- Generic LinkedIn templates from external sources
- Em dashes (—). Never. Forbidden.

Sources of truth:
- Voice: informal, irreverent, Brazilian Portuguese
- Structure: strong opening hook, conclusion-first
- Length: 800-1500 characters
- Tone: critical, myth-busting, no flattery

Output expected:
- Single post, ready to publish
- No subtitle or headers inside the post
- No hashtag list at the end unless requested
- No em dashes under any circumstance

Confidence / validity:
- Style guide reviewed monthly. Last update: see file metadata.
- Voice examples should not be older than 90 days.

Source notes:
- /20-domains/writing/STYLE_GUIDE.md
- /20-domains/writing/hooks-that-worked.md

Validation: the feedback v1 was missing

Each v2 Context Pack includes the Validation: field. The client agent records each use in /99-archive/pack-usage.log via a post-execution hook with a classified result (useful / partial / bad); three consecutive bad marks flag the pack for review. Additionally, when assembling the pack, a hook inspects the mtime of the files listed under Use:. If any is older than 90 days (the default, configurable per pack), the output gets a temporal validity warning. Bad packs have a finite half-life; old content deserves checking before becoming the basis for new decisions.
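
Both checks are small enough to sketch. The code below assumes the usage results for one pack have already been read from the log into a list of strings in chronological order (the log format itself is not specified here), and it reads "three consecutive bad marks" as the three most recent results. The function names are hypothetical.

```python
from __future__ import annotations
import os
import time

def needs_review(results: list[str], threshold: int = 3) -> bool:
    """True when the last `threshold` recorded results are all 'bad'."""
    tail = results[-threshold:]
    return len(tail) == threshold and all(r == "bad" for r in tail)

def stale_files(paths: list[str], max_age_days: int = 90, now: float | None = None) -> list[str]:
    """Return the paths whose mtime is older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return [p for p in paths if os.path.getmtime(p) < cutoff]
```

One caveat on the mtime check: a fresh `git clone` resets file modification times, so a Git-synchronized vault may need the last commit date of each file instead.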

09b / The Agent as Client / Memory is passive, the agent is interchangeable

v1 and v2 of this guide called Hermes a core: first a "lightweight orchestrator", then an "active core with four roles". Field testing and the official Hermes documentation showed that this core does not exist as described. Hermes is a full code agent with its own memory, not a gatekeeper that mediates other agents. This section corrects that.

Separate three layers, because confusing them is the most common mistake: the model (the LLM that reasons), the code agent (Claude Code, Cursor, Codex, Hermes, which runs the loop and touches the files) and federated memory (the Markdown vault, passive and sovereign).

Memory is passive. It does not route, does not enforce policy, does not mediate anything. Code agents are interchangeable clients that read and write in the vault, guided by AGENT.md and by the folder structure. Hermes is one of them. None is the core. The four functions v2 assigned to a core still happen, but elsewhere.

1. Read routing: the agent itself

The agent pulls what it needs, guided by the contract and by static versioned files. There is no central router deciding for it.

2. Conflict resolution: structural, in the data

The conflict rule is not in a component, it is in the files. The most recent entry with status: approved wins; superseded stays in history and is ignored at runtime. Any agent reading the vault applies the same rule, because it lives in the data.

# Example decision in /70-decisions/
---
id: db-engine-postgres
date: 2026-03-15
status: approved
supersedes: [db-engine-mysql]
domain: engineering
owner: andre
---

# Decision: PostgreSQL as main database

## Context
MySQL does not support the JSONB and recursive CTE features we
need. We evaluated the migration 2 months ago.

## Decision
Postgres 16 as main database starting 2026-Q2.

The old decision with id: db-engine-mysql gets status: superseded and stays in history. Any agent assembling an engineering Context Pack ignores it, with no core needed for that.
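
Because the rule lives in the frontmatter, any client can apply it with a few lines. The sketch below is a dependency-free reading of that rule, assuming flat `key: value` frontmatter like the example above and ISO dates; `parse_frontmatter` and `winner` are hypothetical names. It deliberately uses the `date` field, never the file timestamp.

```python
from __future__ import annotations
from datetime import date

def parse_frontmatter(text: str) -> dict:
    """Parse the flat `key: value` block between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as `supersedes: [db-engine-mysql]`
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta

def winner(decisions: list[dict]) -> dict | None:
    """Most recent entry with status: approved; everything else is ignored."""
    approved = [d for d in decisions if d.get("status") == "approved"]
    return max(approved, key=lambda d: date.fromisoformat(d["date"]), default=None)
```

Pass `winner` only the decisions that compete on the same question; a superseded entry never wins, even when its date is later.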

3. Quality feedback and capture: client hooks

Pack usage logging and memory capture come from the client agent's hooks, tool-agnostic (section 09c), not from a central manager. Capture via hooks is the main path, not an alternative to Hermes.

4. Write policy: contract, not policy engine

In cooperative mode, the write policy is the contract: reading is open, and any permanent write outside /90-inbox/ becomes a suggestion instead. There is no hermes.policy.yml with semantic triggers; that was an invention of the old positioning. Real enforcement, against a hostile agent, comes from the operating system, below the agent: a container with a read-only mount except for the inbox, or a separate user with no write access to protected folders.
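
In cooperative mode, a capture script or tool wrapper can still refuse stray paths before writing. The sketch below is such a courtesy check, with a hypothetical function name; it runs on the agent's side, so it is not enforcement and does not replace the operating system isolation described above.

```python
from pathlib import Path

def is_write_allowed(vault: Path, target: Path) -> bool:
    """True only when `target` resolves to a location inside <vault>/90-inbox/."""
    inbox = (vault / "90-inbox").resolve()
    # Resolving first collapses any '..' segments, so
    # '90-inbox/../00-global/AGENT.md' is judged by where it really lands.
    resolved = (vault / target).resolve()
    try:
        resolved.relative_to(inbox)
    except ValueError:
        return False
    return resolved != inbox  # the folder itself is not a writable file
```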

Derived principle

Passive memory is not a weakness, it is the condition of sovereignty. A vault that needs an active core is tied to that core. A vault that is just files versioned in Git belongs to the user and works with any agent. When enforcement is needed, it comes from the operating system, never from the agent's side.

09c / Automatic capture / Inbox without manual intervention

The biggest curation bottleneck is remembering to ask the agent to save something important. Automatic capture solves this with the code agent's own hooks, independent of any central core. Since Claude Code is one of the architecture's clients, it serves as a concrete example, but the same pattern applies to any agent that exposes lifecycle hooks.

The PostToolUse hook fires after a tool call and passes the event context, via stdin, to a script. The script applies simple heuristics to identify relevant patterns (decision made, bug resolved, preference identified) and appends a classified suggestion to /90-inbox/suggested-memory.md. The human does not need to ask: the inbox feeds itself, and human approval remains a quality filter proportional to risk.

Configuration in Claude Code

Claude Code hooks are declared in .claude/settings.json, under the hooks key, the same file that holds permissions. There is no separate .claude/hooks.json. In the vault template, add the hooks block to template/.claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": ".*",
        "hooks": [
          {
            "type": "command",
            "command": "node scripts/capture-to-inbox.mjs"
          }
        ]
      }
    ]
  }
}

The matcher uses a regex over the tool name; .* fires after any call.

The scripts/capture-to-inbox.mjs script uses simple heuristics (regex on common strings) to identify three patterns:

Decision: "decidi", "decidimos", "resolved to", "decision:" → confidence: verified, risk: medium
Preference: "prefiro", "prefere", "I prefer", "preference:" → confidence: preference, risk: low
Bug resolved: "resolvido", "fixed", "bug resolved", "solved" → confidence: verified, risk: low

If no pattern is found, the script does nothing. There are no external dependencies beyond Node.js. The review ritual (scripts/review-inbox.{sh,ps1}) processes what the hook recorded, with TTL for verified+low and risk filtering.
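
The classification logic can be sketched in a few lines. The guide's actual script is Node.js (scripts/capture-to-inbox.mjs) and is not reproduced here; this Python version only illustrates the three heuristics, using the trigger strings listed above. The word-boundary matching, the first-match-wins ordering and the entry layout in `format_suggestion` are assumptions of this sketch.

```python
from __future__ import annotations
import re

# (label, pattern, confidence, risk); first match wins, in the order listed above.
RULES = [
    ("decision", re.compile(r"\b(?:decidi|decidimos|resolved to)\b|\bdecision:", re.I), "verified", "medium"),
    ("preference", re.compile(r"\b(?:prefiro|prefere|i prefer)\b|\bpreference:", re.I), "preference", "low"),
    ("bug-resolved", re.compile(r"\b(?:resolvido|fixed|bug resolved|solved)\b", re.I), "verified", "low"),
]

def classify(text: str) -> dict | None:
    """Return the first matching classification, or None when nothing matches."""
    for label, pattern, confidence, risk in RULES:
        if pattern.search(text):
            return {"type": label, "confidence": confidence, "risk": risk}
    return None

def format_suggestion(text: str, found: dict) -> str:
    """Render one inbox entry; this layout is an assumption, not the guide's format."""
    return (
        f"- type: {found['type']} | confidence: {found['confidence']} "
        f"| risk: {found['risk']}\n  {text.strip()}\n"
    )
```

Word boundaries keep "solved" from firing inside "resolved", so "We resolved to use Postgres" is classified as a decision, not a bug fix.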

Automatic capture ≠ absence of curation

Automatic capture does not eliminate human curation. It eliminates the bottleneck of remembering to ask. The inbox grows on its own. The review ritual remains the quality control mechanism, proportional to risk, not mandatory for everything.
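
The risk-proportional filter reduces to one decision per inbox entry. The sketch below assumes each entry is a dict with `confidence`, `risk` and a `captured_at` datetime, that the TTL is seven days, and that an entry left untouched for the whole TTL counts as accepted. None of these values come from the guide's review scripts; they are illustrative.

```python
from __future__ import annotations
from datetime import datetime, timedelta

def review_action(entry: dict, now: datetime, ttl: timedelta = timedelta(days=7)) -> str:
    """Return 'promote', 'wait' or 'human' for one inbox entry."""
    low_risk_verified = entry["confidence"] == "verified" and entry["risk"] == "low"
    if not low_risk_verified:
        return "human"    # hypotheses and anything above low risk need a decision
    if now - entry["captured_at"] >= ttl:
        return "promote"  # TTL elapsed with no objection
    return "wait"         # still inside the TTL window
```

Anything the source rule does not explicitly cover, such as a low-risk preference, falls to the human in this sketch: the conservative default.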

10 / Installation / Connect a client agent to the vault

The installation that matters is the vault: a Markdown folder turned into a Git repository, with AGENT.md and the folder structure. That is covered in the step-by-step of section 14, and depends on no agent. The code agents (Claude Code, Cursor, Codex, Hermes) are interchangeable clients that you point at this vault. This section shows how to connect a client, using Hermes as the example because it reads the contract natively. The same principle applies to the others, via adapters (section 08).

Hermes as a client

Hermes Agent is the NousResearch framework. It has a conversation loop, tool calling, its own memory and deployment in CLI, Telegram, Discord, WhatsApp and editors via Agent Client Protocol. What makes it a convenient client for this architecture is that it already reads files like AGENTS.md, SOUL.md and MEMORY.md when assembling the system prompt, so the federated contract loads with no glue code. Note that Hermes's built-in memory (MEMORY.md) belongs to Hermes, not to the vault: it competes with sovereign memory if you let it. The goal is to point Hermes at the vault, not to let the vault become Hermes's internal memory.

# clone and setup
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
python -m venv .venv
source .venv/bin/activate
pip install -e .

# run from the vault, so it reads the contract
cd /path/to/your/federated-memory
hermes

Pointing the client at the contract

Hermes reads context files from the working directory. Run Hermes from the vault root and keep an AGENTS.md there that points to your contract in 00-global/AGENT.md:

# AGENTS.md (vault root, read automatically by Hermes)

This is a federated memory base. Read the consumption protocol
before doing anything:

  00-global/AGENT.md

Before responding to any task:
1. Identify the domain (writing, engineering, automation, etc).
2. Load the corresponding Context Pack from /60-context-packs/.
3. Never write to /00-global/ or /20-domains/ without explicit
   human approval.
4. New memory suggestions go to /90-inbox/suggested-memory.md.

Capturing suggestions

To feed the inbox with no manual action, use the client agent's hook-based capture, described in section 09c. In Claude Code that is a PostToolUse hook in .claude/settings.json. In any client the principle is the same: a script that writes suggestions to /90-inbox/suggested-memory.md, never directly to approved memory. You need nothing beyond that to start.

The rule that holds for any client

The client can choose and load context. The client does not decide on its own what becomes permanent truth. Permanent writing outside /90-inbox/ is a suggestion, not a write. If the client can be hostile, this is not guaranteed by contract: it is guaranteed by operating system isolation (section 09b).

11 / Access / Obsidian via MCP in practice

The MCP server is optional. The client agent already reads the Markdown files directly through the filesystem, and the recommended floor (whitepaper section 05) does not require MCP. It enters when multiple external agents (Claude Desktop, Cursor, other MCP clients) need to query the same base without each implementing its own reader, exposing the vault through a standardized layer. There are two community-ready approaches.

Option A, Via Local REST API plugin

Install the "Local REST API" plugin in Obsidian. The plugin starts a local HTTP endpoint. An MCP server connects to this endpoint and exposes standardized tools.

Typical exposed tools:

list_vault_files, lists vault files
search_vault, search by content
create_vault_file, creates file (use with restriction in /90-inbox/)
read_vault_file, reads a specific file

Advantage. More reliable than pointing the agent directly at a directory, because tools are pre-defined. Disadvantage. Requires Obsidian to be open.

Option B, MCP server with direct filesystem

Implementations like obsidian-mcp-server access the vault directly on the filesystem, without needing Obsidian running. Works with the app closed, on a server, or via SSH.

Advantage. App-independent. Disadvantage. No access to Obsidian features (Dataview, plugins, resolved links), only raw Markdown.

MCP client configuration (Claude Desktop example)

{
  "mcpServers": {
    "obsidian-memory": {
      "command": "node",
      "args": [
        "/path/to/obsidian-mcp-server/build/index.js",
        "/path/to/federated-memory"
      ]
    }
  }
}

Scope warning

Exposing memory as a tool also exposes risk. Configure write permissions only for /90-inbox/. The folders /00-global/, /20-domains/ and /10-projects/ should be read-only for any agent. Interoperability without governance becomes an attack surface.

12 / Flow / How a request becomes execution

Figure 5, Complete operational flow with governance loop

The agent never writes permanent memory. When it identifies something that deserves to become truth, it writes a suggestion to the inbox. The human reviews, approves, edits or rejects. Only after that does the memory enter the stable base.

12b / Remote / Remote deployment: accessing memory from anywhere

So far everything runs locally. This works well for one machine. When you need two or more agents, on different machines, to access the same vault, the architecture needs a central point. That point is a VPS.

The good news: the separation of concerns in federated memory already solves half the problem. The vault is just files. The MCP server is just a process. Moving to a remote server does not change the contracts, only where the processes run.

Where each component lives

Figure 5b, remote architecture: vault and MCP on the VPS, agents accessing from outside

Component	Where it runs	Why
Vault (markdown files)	VPS	Single source of truth. No version conflict.
MCP Server	VPS (local port)	Never directly exposed. Access via tunnel.
Hermes Agent	VPS or local	If running on VPS, no transfer latency.
Claude Code	Local machine	Human interface. Connects via SSH tunnel.
Git remote	GitHub	Synchronization and history. Not the primary vault.

Synchronization via Git

The vault is a regular Git repository. Each machine that needs a local copy does clone and pull. The VPS is the origin.

# On the VPS, initial setup
git init --bare /vault-remote.git

# On the local vault, point to the VPS as origin
git remote add origin ssh://user@your-vps:/vault-remote.git
git push -u origin main

# On another machine, clone the vault
git clone ssh://user@your-vps:/vault-remote.git ~/vault

# Sync routine (before working)
git pull origin main

# After inbox approvals
git add 00-global/ 10-projects/ 20-domains/ 60-context-packs/
git commit -m "feat: approved memory, [domain]"
git push origin main

Completion criterion

Run git log --oneline -5 on the VPS and on another machine. The hashes must be identical. If they are not, the copies have diverged; resolve the divergence before continuing.
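
Comparing the two outputs by eye works for five lines; a small helper makes the check scriptable. The sketch below assumes you have captured both `git log --oneline` outputs as text. The function name is hypothetical, and hashes are compared by prefix because Git may abbreviate them to different lengths on different machines.

```python
from __future__ import annotations

def first_divergence(log_a: str, log_b: str) -> int | None:
    """Index of the first line whose commit hash differs, or None if the logs agree."""
    hashes_a = [line.split()[0] for line in log_a.splitlines() if line.strip()]
    hashes_b = [line.split()[0] for line in log_b.splitlines() if line.strip()]
    for index, (a, b) in enumerate(zip(hashes_a, hashes_b)):
        # Abbreviated hashes of the same commit share a prefix.
        if not (a.startswith(b) or b.startswith(a)):
            return index
    if len(hashes_a) != len(hashes_b):
        return min(len(hashes_a), len(hashes_b))
    return None
```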

Remote access to MCP server

The MCP server should not be exposed directly to the internet. Two options:

Option A, SSH tunnel (recommended for personal use)

MCP runs on localhost on the VPS. You create a tunnel that forwards a port on your machine to that port on the VPS.

# Create tunnel: port 3000 on the VPS appears as localhost:3000 on your machine
ssh -L 3000:localhost:3000 user@your-vps -N

# To keep it in the background
ssh -L 3000:localhost:3000 user@your-vps -fN

# To close
kill $(lsof -t -i:3000)

With the tunnel active, Claude Code sees the remote MCP as if it were local. No additional configuration.

Option B, HTTPS with authentication (for multiple agents)

If more than one machine needs to access MCP simultaneously, put a reverse proxy (nginx or Caddy) in front with TLS and basic authentication.

# Caddy, minimal configuration (~/.caddy/Caddyfile)
# Generate the password hash with: caddy hash-password
mcp.yourdomain.com {
    basicauth /* {
        agent $2a$14$HASH_GENERATED_WITH_caddy_hash
    }
    reverse_proxy localhost:3000
}

Do not do this

Do not expose the MCP server directly on a public port without authentication. MCP has read access (and inbox write access) to the entire vault. An open endpoint is an open vault.

Claude Code connecting to remote MCP

With the SSH tunnel active, Claude Code does not know it is talking to a remote server. Configure normally pointing to localhost.

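# .mcp.json at the project root (on the local machine); Claude Code reads project-scoped MCP servers from this file, not from .claude/settings.json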
{
  "mcpServers": {
    "obsidian-vault": {
      "command": "npx",
      "args": ["-y", "mcp-obsidian"],
      "env": {
        "OBSIDIAN_API_URL": "http://localhost:3000",
        "OBSIDIAN_API_KEY": "your-token-here"
      }
    }
  }
}

Completion criterion: run /mcp in Claude Code and see the server listed as connected. Ask the agent to read a file from the vault. If it returns, the tunnel is working.

Security considerations

What to protect	How
SSH access to VPS	SSH key mandatory. Disable password login (`PasswordAuthentication no` in sshd_config).
MCP server token	Environment variable, never hardcoded. Use `.env` outside the vault.
Writing to vault	MCP server with write permission restricted to `/90-inbox/`. Main vault read-only for the agent.
Vault in Git	Private repository. Check `.gitignore` to avoid committing `.env` or tokens.
MCP port	Never open on VPS firewall. Only accessible via tunnel or authenticated proxy.
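The write restriction in the table comes down to a path check. A minimal sketch, assuming vault-relative paths; `normalizeVaultPath` and `isWriteAllowed` are hypothetical names, and the check does not follow symlinks, so it complements OS-level isolation rather than replacing it:

```javascript
// Hypothetical guard: accepts a vault-relative path only if it resolves
// to a file inside 90-inbox/. Folder name from the guide; the rest is assumed.
function normalizeVaultPath(relativePath) {
  const parts = [];
  for (const segment of relativePath.replace(/\\/g, "/").split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null; // would escape the vault root
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts;
}

function isWriteAllowed(relativePath) {
  const parts = normalizeVaultPath(relativePath);
  // Must point at something inside 90-inbox/, not at the folder itself.
  return parts !== null && parts.length >= 2 && parts[0] === "90-inbox";
}

console.log(isWriteAllowed("90-inbox/suggested-memory.md"));
console.log(isWriteAllowed("90-inbox/../00-global/AGENT.md"));
```

Resolving `..` before comparing is the important part: a naive prefix test would accept `90-inbox/../00-global/AGENT.md`.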

# Verify that port 3000 is NOT publicly exposed
# On the VPS:
ss -tlnp | grep 3000
# Should show: 127.0.0.1:3000. If it shows 0.0.0.0:3000, block it in the firewall

# UFW (Ubuntu)
ufw deny 3000
ufw allow 22  # SSH stays open
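The same check can be scripted. A minimal sketch, assuming the usual `ss -tlnp` column layout (State, Recv-Q, Send-Q, Local Address:Port, ...); the function names are hypothetical and the sample lines are made up:

```javascript
// Hypothetical parser for `ss -tlnp` output.
// Assumes the 4th whitespace-separated column is "Local Address:Port".
function listenersOnPort(ssOutput, port) {
  const addresses = [];
  for (const line of ssOutput.split("\n")) {
    const columns = line.trim().split(/\s+/);
    if (columns[0] !== "LISTEN" || columns.length < 4) continue;
    const local = columns[3];
    const separator = local.lastIndexOf(":");
    if (separator === -1) continue;
    if (local.slice(separator + 1) === String(port)) {
      addresses.push(local.slice(0, separator));
    }
  }
  return addresses;
}

function isLoopback(address) {
  return address.startsWith("127.") || address === "[::1]";
}

// True when at least one listener on the port is bound to a non-loopback address.
function isPubliclyBound(ssOutput, port) {
  return listenersOnPort(ssOutput, port).some((address) => !isLoopback(address));
}

const sample = 'LISTEN 0 511 127.0.0.1:3000 0.0.0.0:* users:(("node",pid=1234,fd=20))';
console.log(isPubliclyBound(sample, 3000) ? "exposed" : "loopback only");
```

A bind on `0.0.0.0`, `*` or `[::]` is reported as exposed; the firewall rule is then a second line of defense, not the fix.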

12c / Harness
Harness Engineering, the layer between agent and world

Harness Engineering is the discipline of connecting the agent to the environment in a controlled way: which tools it can use, how output is captured, how failures are handled. In federated memory, the harness is not a separate piece. It is distributed across the architecture.

Harness mapping in federated architecture

Component	Harness function
`MCP server` (optional)	When present, exposes tools to the agent in a standardized way. Without it, the agent reads the files directly and the contract plus the hooks do the job
`AGENT.md`	Defines behavior rules (declarative harness), the agent reads and applies before any action
Adapters	Translate the neutral `AGENT.md` contract for each specific agent (Claude, Cursor, Windsurf...)
`capture-to-inbox.mjs`	Captures agent output via PostToolUse hook, fills the inbox automatically without manual intervention
`SESSION.lock`	Controls concurrent access per project, identifies agent, machine and user in each session
`review-inbox`	Processes what the agent produced and decides the destination, complete audit trail in `/99-archive/review-log.md`

What v3 already implements

Write permissions via the contract: restricted to /90-inbox/, with OS hardening in adversarial mode (section 09b)
Automatic capture via hooks (PostToolUse → inbox), the inbox fills itself without depending on the operator's memory
Per-project session lock identifying agent, machine and user (SESSION.lock)
Basic observability via session-log.md (who did what, when) and review-log.md (what was promoted or rejected)
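To make the session lock concrete, here is a minimal sketch of the acquire decision. The record shape is an assumption: the guide only states that SESSION.lock identifies agent, machine and user; `startedAt`, `sameHolder` and `tryAcquire` are hypothetical, and reading or writing the file atomically is left out:

```javascript
// Hypothetical lock record: { agent, machine, user, startedAt }.
// Only agent, machine and user come from the guide; the rest is assumed.
function sameHolder(a, b) {
  return a.agent === b.agent && a.machine === b.machine && a.user === b.user;
}

// `currentLock` is the parsed content of SESSION.lock, or null when the file is absent.
function tryAcquire(currentLock, requester, now) {
  if (currentLock === null || sameHolder(currentLock, requester)) {
    // Free, or already held by the same session: (re)write the lock.
    return { acquired: true, lock: { ...requester, startedAt: now.toISOString() } };
  }
  // Held by someone else: report who, so the operator can decide.
  return { acquired: false, heldBy: currentLock };
}

const holder = { agent: "claude-code", machine: "laptop", user: "andre" };
const other = { agent: "hermes", machine: "vps", user: "andre" };
const first = tryAcquire(null, holder, new Date());
console.log(tryAcquire(first.lock, other, new Date()).acquired);
```

The agent, machine and user values in the usage lines are illustrative only.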

What enters the roadmap (when the environment scales)

Toolset by domain: an agent in /30-clients/ only accesses client tools; an agent in /20-domains/engineering/ only accesses code tools. The access layer (direct filesystem or MCP, when in use) exposes different sets depending on the input context.
Full observability: tokens consumed per session, execution time, most used tools, memory capture success rate. Metrics that show whether the harness is working, not just whether the agent responded.

Harness Engineering is not about controlling the agent

It is about making the environment predictable enough for the agent to be reliable. Clear rules produce consistent behavior. An agent in an ambiguous environment is not more capable than one in a well-defined environment, just less predictable.

12d / Hive Mind
Hive Mind, shared knowledge between agents

As you add specialized agents to the ecosystem, a new problem arises: each agent learns useful things, creates skills, writes playbooks, makes technical decisions. If this knowledge stays isolated in each agent's private memory, the others will reinvent the wheel indefinitely.

The hive mind solves this with a federated layer for reading and discovering knowledge between agents. Each agent can consult what others have published, without having access to anyone's private memory.

Three levels of memory

Level	Where	Who accesses	What it contains
Private memory	`~/.hermes/` or internal equivalent	Only the agent itself	Session learnings, operational preferences, internal history
Published knowledge	`/50-skills/published/`	Any agent, read-only (direct file access or MCP)	Skills, playbooks, templates, patterns approved or promoted by TTL
Approved knowledge	`/70-decisions/` and `/20-domains/`	Any agent, read-only (direct file access or MCP)	Formal decisions, domain principles, global rules

Publication protocol

Before creating: consult /50-skills/INDEX.md, search by tag or domain to avoid duplication
Proposal: goes to /90-inbox/ with type: skill and complete metadata (author_agent, domain, confidence, risk, tags)
Automatic classification by confidence + risk:
- verified + low → automatically promoted to /published/ after 7 days
- verified + medium → lazy human approval
- hypothesis → stays in /proposed/ awaiting human decision
- high risk → explicit approval required
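The classification rules above can be sketched as a small routing function. This is an illustration under stated assumptions, not the repository's implementation: the route names, the `createdAt` field and the precedence (high risk is checked before confidence, so a high-risk hypothesis goes to explicit approval) are all hypothetical:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical router for inbox proposals with `confidence` and `risk` metadata.
function classifyProposal({ confidence, risk }) {
  if (risk === "high") return { route: "explicit_approval" };
  if (confidence === "hypothesis") return { route: "human_decision" }; // stays in /proposed/
  if (confidence === "verified" && risk === "low") return { route: "auto_promote", afterDays: 7 };
  if (confidence === "verified" && risk === "medium") return { route: "lazy_approval" };
  throw new Error(`Unknown confidence/risk combination: ${confidence}/${risk}`);
}

// True once a verified + low proposal has waited out its TTL.
// `createdAt` is an assumed metadata field holding an ISO timestamp.
function isDueForAutoPromotion(proposal, now) {
  const decision = classifyProposal(proposal);
  if (decision.route !== "auto_promote") return false;
  const age = now.getTime() - new Date(proposal.createdAt).getTime();
  return age >= decision.afterDays * DAY_MS;
}

console.log(classifyProposal({ confidence: "verified", risk: "low" }).route);
```

Unknown combinations throw instead of defaulting to a route, so a typo in the metadata cannot silently promote anything.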

Fundamental rules

Federated reading: any agent reads any shared area
Controlled writing: agent only writes to /90-inbox/
No agent directly alters another's memory
Improvements become proposals, forks or new versions with supersedes
INDEX.md is consulted before any new creation

Structure of /50-skills/

50-skills/
  INDEX.md              # navigable index by domain and agent
  README.md             # protocol, mandatory metadata, rules
  published/            # approved skills, read-only for agents
  proposed/             # skills pending review
  deprecated/           # old skills, never delete, keep traceability

The hive mind is not communication between agents

It is accumulated knowledge that any agent can discover and reuse. The federated vault is the shared board. The blackboard pattern at scale, not as real-time memory, but as a repository of persistent learning.

12e / Patterns
Recurring pattern memory per tool

Every tool has undocumented behaviors, known bugs and workarounds you discover in practice. Without structured memory, the agent repeats the same diagnosis every session, and you pay the cost in time and frustration. Worse: the workaround you discovered in one project does not reach the next.

The solution: a per-tool pattern ledger in /50-skills/tool-patterns/, separate from project and client context. What the agent learns about Brandcraft in one project applies to all others.

The 3-tier mechanism

Tier	Occurrence	Status	What happens
1	1st time	`observed`	Agent records symptom in inbox with `type: tool_pattern`. Problem documented, optional workaround.
2	2nd time	`auto_fix`	Agent finds the pattern in `tool-patterns/` and applies the documented fix without asking.
3	3rd+ time	`root_cause_pending`	Trigger for root cause analysis. The problem is too recurrent to be just a workaround.

Escalation is automatic: node scripts/escalate-patterns.mjs reads the inbox, increments the counter and updates the status. On the 3rd occurrence, the script explicitly alerts that human analysis is needed.
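The tier table maps directly onto a counter. A minimal sketch of that mapping, assuming each ledger entry carries an `occurrences` counter; the field and function names are hypothetical, and the actual scripts/escalate-patterns.mjs may be organized differently:

```javascript
// Maps an occurrence count to the status in the tier table.
function statusForOccurrences(count) {
  if (!Number.isInteger(count) || count < 1) {
    throw new RangeError(`Occurrence count must be a positive integer, got ${count}`);
  }
  if (count === 1) return "observed";
  if (count === 2) return "auto_fix";
  return "root_cause_pending";
}

// Returns an updated copy of the ledger entry after one more occurrence.
// `occurrences` and `needsHumanAnalysis` are assumed field names.
function recordOccurrence(pattern) {
  const occurrences = (pattern.occurrences ?? 0) + 1;
  const status = statusForOccurrences(occurrences);
  return { ...pattern, occurrences, status, needsHumanAnalysis: status === "root_cause_pending" };
}

let entry = { tool: "brandcraft" };
for (let i = 0; i < 3; i += 1) {
  entry = recordOccurrence(entry);
  console.log(entry.occurrences, entry.status);
}
```

Returning a copy rather than mutating the entry keeps each escalation step easy to log and review.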

Scope: tool, not project

This is the distinction that makes the mechanism useful. Project memory stays in /10-projects/. Client memory in /30-clients/. Tool patterns stay in /50-skills/tool-patterns/, available to any project, without contaminating any specific context.

50-skills/
├── published/          ← approved skills (procedures)
├── proposed/           ← skills awaiting approval
├── deprecated/         ← history
├── tool-patterns/      ← per-tool ledger  ← NEW
│   ├── README.md
│   ├── brandcraft.md   ← 1 file per tool
│   └── puppeteer.md
└── INDEX.md

Complete flow

1st occurrence: the agent encounters a problem with Brandcraft. Writes to the inbox:

---
type: tool_pattern
tool: brandcraft
symptom: API returns rate limit in non-standard header X-BrandCraft-Limit-Remaining
fix: check X-BrandCraft-Limit-Remaining before X-RateLimit-Remaining
confidence: verified
risk: low
---

Processes the pattern: node scripts/escalate-patterns.mjs creates tool-patterns/brandcraft.md with status: observed.
2nd occurrence (another project): the agent consults tool-patterns/brandcraft.md, finds the documented fix and applies it without re-diagnosing; the script moves the status to auto_fix.
3rd occurrence: the script escalates to root_cause_pending and warns that the problem needs to be solved at the source, not just worked around.
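Before any escalation, the inbox entry has to be recognized as a tool pattern. A minimal sketch of reading the frontmatter shown above, assuming flat `key: value` lines between two `---` markers; it is not a full YAML parser, and `parseFrontmatter` and `isToolPattern` are hypothetical names:

```javascript
// Reads flat `key: value` frontmatter; returns null when no block is found.
function parseFrontmatter(text) {
  const lines = text.split("\n").map((line) => line.trimEnd());
  if (lines[0] !== "---") return null;
  const end = lines.indexOf("---", 1);
  if (end === -1) return null;
  const fields = {};
  for (const line of lines.slice(1, end)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return fields;
}

// A tool pattern needs a tool and a symptom; the fix is optional on first sight.
function isToolPattern(fields) {
  return fields !== null
    && fields.type === "tool_pattern"
    && ["tool", "symptom"].every((key) => Boolean(fields[key]));
}

const entry = [
  "---",
  "type: tool_pattern",
  "tool: brandcraft",
  "symptom: API returns rate limit in non-standard header X-BrandCraft-Limit-Remaining",
  "confidence: verified",
  "risk: low",
  "---",
].join("\n");
console.log(isToolPattern(parseFrontmatter(entry)));
```

Entries with another type (preference, fact, decision) fail the check and stay on the normal review path.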

What is NOT a tool_pattern

Do not confuse with user preferences (preference), project facts (fact) or formal decisions (decision). Tool patterns are specific to external tool behaviors, not user choices, not business context.

Why 3 tiers and not just "log and fix"?

Because the cost of intervention should scale with frequency. The 1st time, the problem is still rare, so documenting it is enough. The 2nd time, it is recurring, so automation is justified. The 3rd time, it is a systemic pattern that requires root cause investigation, not another workaround.

13 / Time
Graphiti: when memory needs history

Graphiti does not replace Obsidian. It enters when memory stops being just notes and starts requiring history, relations, events and changes. Some memories are stable facts. Others change over time. This difference is critical.

Note that Graphiti does not duplicate the memory: it indexes the vault's content to answer temporal and relational questions that the files alone do not answer efficiently. The source of truth remains the Markdown; Graphiti is a derived index, not a substitute.

Figure 6, Graphiti as optional temporal layer

Cases where Graphiti justifies the effort:

Decisions that replace previous decisions and you need to track both.
Relations between people, projects, clients, stacks and agents.
Preferences that change over time (writing style evolves).
Project events, meetings, changes, commitments.
Queries like "what changed?" or "what is the most recent version of the truth?".

When NOT to add Graphiti

If your memory is predominantly stable facts (writing style, principles, patterns), Graphiti is overhead. Add it when temporal questions start appearing frequently and you feel Obsidian alone does not answer them.

Before Graphiti: smaller implementations that already solve part of the problem

Some features you would put in Graphiti already exist in tools focused only on decisions. DecisionNode (github.com/decisionnode/DecisionNode) is one example: it stores decisions as structured JSON with vector embeddings, exposes them via CLI and an MCP server, has history tracking, soft-delete (deprecate/activate) and automatic conflict detection by semantic similarity.

It is a sub-system, not an alternative to federated architecture. It solves the decision module with semantic search; it does not cover Context Packs, domains, style guides or project context. But if your pain is specifically "agents do not remember architecture decisions made last week", DecisionNode can solve this before you need to spin up Graphiti.

Pragmatic path

Obsidian as the general base. DecisionNode for the structured decision sub-module with semantic search. Graphiti only when you need entity relations and change history at the temporal graph level. No need to spin up everything at once.

14 / Execution
Executable step by step

The steps below are sequential. Each has an objective completion criterion.

Step 1, Create the vault

mkdir federated-memory
cd federated-memory
mkdir -p 00-global 10-projects 20-domains 30-clients 40-workflows \
         50-skills 60-context-packs 70-decisions 80-agent-adapters \
         90-inbox 99-archive
git init

Done when: the folder structure exists and the git repository is initialized.

Step 2, Write the AGENT.md

Copy the template from section 07 to 00-global/AGENT.md and adjust the domain list to yours.

Done when: the file exists, lists your real domains, and you can explain each rule out loud.

Step 3, Define real domains

Create a folder for each domain in 20-domains/. Use the criterion from section 06: own vocabulary + outputs that should not mix with others.

Done when: you have between 3 and 6 domains. More than 8 is overhead. Fewer than 3 almost never justifies federated memory.

Step 4, Three initial Context Packs

Create in 60-context-packs/:

writing-style.md, writing tasks in your style
code-work.md, code tasks with your conventions
architecture-review.md, architecture analysis

Use the filled example from section 09 as a model. Each should fit on one screen.

Done when: the three packs exist, each has a goal, Use and Avoid lists, and the temporal validity field.

Step 5, Adapter and capture hook for the first agent

If using Claude Code, create 80-agent-adapters/claude/CLAUDE.md pointing to AGENT.md and a default Context Pack. Use the template from section 08. Also configure the capture hook in .claude/settings.json (PostToolUse running scripts/capture-to-inbox.mjs, per section 09c). For other clients, the adapter follows the same principle: point to the contract and wire a tool-agnostic capture hook.

Done when: you can run the agent in the vault directory, it reads the adapter automatically, and relevant actions produce suggestions in /90-inbox/suggested-memory.md without manual intervention.

Step 6 (optional), Connect Hermes as an additional client

Skip this step if you are not going to use Hermes. The floor from steps 1 to 5 is already enough for a single client. To connect Hermes as an additional client, use the instructions from section 10 and add an AGENTS.md at the vault root referencing 00-global/AGENT.md.

Done when: hermes, run inside the vault directory, responds while respecting the "do not load everything" rule and using Context Packs.

Step 7 (optional), Expose via MCP

Skip this step if you do not have multiple external agents consuming the same base. If you need it, choose between Option A (Local REST API) and Option B (direct filesystem) from section 11. Configure it in your preferred MCP client. Restrict writing to /90-inbox/.

Done when: an external agent (Claude Desktop, Cursor, ChatGPT with MCP) queries your vault and respects the write limits.

Step 8, Validate with two different agents

Perform the same task with two different agents consuming the same memory. Check three things:

Is the result voice-consistent between the two?
Did both stay within the correct domain, without contaminating another?
Did both respect the rule of only suggesting memory in the inbox?

Done when: all three answers are "yes". If any is "no", the problem is in the Context Pack, not the agent.

Step 9 (optional), Add Graphiti

Only execute this step when you feel temporal pain: decisions replacing decisions, questions about history, information validity. Before that, it is over-engineering.

15 / Discipline
Minimum governance

The biggest risk of persistent memory is not lack of information. It is excess information with undue authority. Every architecture needs to separate raw conversation, hypothesis, draft and permanent memory.

Figure 7, Five-stage governance loop

All new memory starts as a suggestion in /90-inbox/.
Every permanent decision must have date, scope, source, status and approval owner.
Old memories need to be archivable, replaceable or markable as obsolete.
Agents consult memory but do not rewrite global principles without permission.
Simple logs suffice at the start: who asked, what was read, what was changed, why.
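The metadata rule for permanent decisions is easy to check mechanically before approval. A minimal sketch, assuming the decision has already been read into an object; the key names (`approvalOwner` in particular) are hypothetical, and the status values are the ones listed in the DECISION.md template:

```javascript
const REQUIRED_FIELDS = ["date", "scope", "source", "status", "approvalOwner"];
const ALLOWED_STATUSES = ["proposed", "approved", "superseded", "archived"];

// Returns a list of human-readable problems; an empty list means the
// decision carries the minimum metadata.
function decisionProblems(decision) {
  const problems = REQUIRED_FIELDS
    .filter((field) => !decision[field] || String(decision[field]).trim() === "")
    .map((field) => `missing ${field}`);
  if (decision.date && !/^\d{4}-\d{2}-\d{2}$/.test(decision.date)) {
    problems.push("date is not YYYY-MM-DD");
  }
  if (decision.status && !ALLOWED_STATUSES.includes(decision.status)) {
    problems.push(`unknown status: ${decision.status}`);
  }
  return problems;
}

// Illustrative values only.
console.log(decisionProblems({
  date: "2025-03-10",
  scope: "engineering",
  source: "architecture review meeting",
  status: "approved",
}));
```

Returning every problem at once, instead of failing on the first, suits a review session where the human fixes the entry in one pass.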

Weekly inbox review ritual

Human approval only works if there is a set time to do it. Without a fixed ritual, /90-inbox/ becomes a trash bin: suggestions pile up, nothing becomes memory, and agents keep repeating the same questions. Set aside a short weekly block, 20 to 40 minutes usually suffices, to process the inbox to zero.

The repository ships two equivalent scripts in /scripts/ that walk through the review entry by entry:

# Linux / macOS
VAULT_PATH=/path/to/vault ./scripts/review-inbox.sh

# Windows (PowerShell)
$env:VAULT_PATH = 'C:\path\to\vault'
./scripts/review-inbox.ps1

For each pending suggestion the script shows the block and offers four decisions:

Approve (a): appends the entry to the file pointed to in Suggested destination and removes it from the inbox.
Edit (e): opens the entry in the editor ($EDITOR or notepad) so you can adjust the text or destination, then asks again.
Reject (r): discards the suggestion. A weak hypothesis does not become memory.
Defer (d): keeps the entry in the inbox for the next review. Useful when you lack the context to decide.

Every decision, including rejections and deferrals, is logged in /99-archive/review-log.md with date, domain, destination and summary. This log is what allows, weeks later, seeing how many suggestions became memory, how many were discarded and where agents err most frequently. If the rejection rate is high, the problem is not the inbox: it is the agent's capture rule.
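The rejection rate mentioned above can be computed from the log once its entries are parsed. A minimal sketch, assuming each entry has been reduced to an object with `decision` and `domain`; the on-disk format of review-log.md is not assumed here, and leaving deferrals out of the rate is a choice of this sketch:

```javascript
// Hypothetical summary over parsed review-log entries.
function summarizeReviewLog(entries) {
  const counts = { approved: 0, rejected: 0, deferred: 0 };
  const rejectedByDomain = {};
  for (const { decision, domain } of entries) {
    if (!Object.prototype.hasOwnProperty.call(counts, decision)) {
      throw new Error(`Unknown decision: ${decision}`);
    }
    counts[decision] += 1;
    if (decision === "rejected") {
      rejectedByDomain[domain] = (rejectedByDomain[domain] || 0) + 1;
    }
  }
  // Deferred entries are still open, so they do not count toward the rate.
  const decided = counts.approved + counts.rejected;
  const rejectionRate = decided === 0 ? 0 : counts.rejected / decided;
  return { ...counts, rejectionRate, rejectedByDomain };
}

// Illustrative entries only.
const summary = summarizeReviewLog([
  { decision: "approved", domain: "writing" },
  { decision: "rejected", domain: "engineering" },
  { decision: "rejected", domain: "engineering" },
  { decision: "deferred", domain: "writing" },
]);
console.log(summary.rejectionRate.toFixed(2), summary.rejectedByDomain);
```

The per-domain breakdown points at where the capture rule needs tightening, which is the question the log exists to answer.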

16 / Common mistakes
Anti-patterns to avoid

Anti-pattern	Symptom	Antidote
Single super brain	Marketing pulls code, code pulls research, everything mixes	Isolated domains from day one
Always-on context	Slow agent, hallucination, out-of-scope answers	Context Packs per task
Automatic memory without review	Hypotheses turn into facts, the base degrades in weeks	Mandatory inbox, human approval
Adapter as the primary source	Switching tools requires rewriting everything	AGENT.md is the source, the adapter only translates
Powerful orchestrator too early	Hermes/LangGraph doing things no one understands	Start with direct reading, add orchestration when it hurts
Unscoped MCP/API	Any agent writes to any folder	Write only to /90-inbox/, the rest read-only

17 / Swaps
Replacements without changing the architecture

The reference implementation uses a Markdown vault versioned with Git. The architecture, however, was designed to be portable, and any part can be swapped without touching the contract.

Alternatives to Obsidian. Logseq, Foam, Dendron, VSCode with Markdown folders, bare Git repository. The requirement is to be readable, versionable, and easy for humans to review.

Other client agents. LangGraph for stateful flows with explicit graphs, CrewAI for role-based agent teams, Letta for persistent agents, Microsoft Agent Framework / AutoGen for multi-agent, Hermes. The requirement is to classify the task, choose the domain, select the Context Pack, and deliver context without becoming the owner of memory.

Access layer. Direct file reading, local CLI, REST API. MCP is recommended when multiple agents need to consume memory in a standardized way.

Machine synchronization. Git natively solves cross-machine sync: push on one machine, pull on another. Proprietary sync (iCloud, Dropbox, Notion) works but adds an intermediary that may conflict with Git's resolution model. The recommended path is Git + remote repository.

For the decisions sub-module: DecisionNode. If you want to replace the files in /70-decisions/ with something that has semantic search and automatic history, DecisionNode (github.com/decisionnode/DecisionNode) does exactly that: decisions in JSON, embeddings via Gemini, cosine similarity search, MCP server ready for Claude Code, Cursor, and Windsurf, conflict detection, reversible soft-delete. The requirement for using it as a replacement remains the same: it cannot write permanent memory without human approval, and the status: approved/superseded rule must be respected. In DecisionNode, this means configuring the agent in "strict" mode and reviewing what enters before activating.

Centralizing systems / Life OS, the opposite path

There is a category of projects that takes the inverse approach to federated: concentrate memory, automation, integrations, and agents into a single monolithic application. They are often called "Personal AI", "Life OS", or "productivity super-app". They win on surface usability: ready-made UI, packaged integrations, visible community, near-zero adoption curve.

The cost appears later. Memory lives inside the application, in the application's format. Switching tools means starting from scratch. The architecture is, by design, centralizing, exactly the anti-pattern that motivated this guide. There is no neutral contract; the adapter is the product. There is no exposed mtime, so checking temporal validity requires manual review; in the federated setup, the client agent or a hook does what the application does not expose. The conflict rule is defined by the UI, not by an auditable file.

Criterion	Federated memory	Centralizing systems / Life OS
Adoption speed	Slow, requires learning the contract	Fast, install, connect, use
Portability	Total (Markdown + Git)	Locked to app
Sovereignty	User	Product
Temporal validity by inspection	Yes (file mtime)	Not available
Switching cost	Swap adapter	Start over

For those who prioritize adoption speed over sovereignty, the centralizing path is the right choice. For those who prioritize portability and governance, it is a problem dressed as a solution.

Substitution criterion

You can swap tools. You cannot swap the principles: separate domains, neutral contract, lean context, human approval, write policy restricted to inbox, deterministic conflict resolution, portability.

18 / Templates
Ready-made templates to copy

DECISION.md

# Decision: [short title]

Date: YYYY-MM-DD
Scope: [domain or project]
Status: proposed | approved | superseded | archived
Source: [conversation, meeting, document]
Decision: [the decision in one sentence]
Reason: [why this was decided]
Impacts: [what changes]
Supersedes: [link to previous decision, if any]
Related context packs: [packs that should reflect this decision]
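The Status and Supersedes fields are enough to answer "which decision is in force?" without a graph database. A minimal sketch, assuming the decision files have been read into objects; `id` stands for the file name and is a hypothetical field, as are both function names:

```javascript
// Decisions in force: approved, and not superseded by another approved decision.
function currentDecisions(decisions) {
  const replaced = new Set(
    decisions
      .filter((d) => d.status === "approved")
      .map((d) => d.supersedes)
      .filter(Boolean),
  );
  return decisions.filter((d) => d.status === "approved" && !replaced.has(d.id));
}

// Walks the Supersedes links backwards from one decision, newest first.
function historyOf(decisions, id) {
  const byId = new Map(decisions.map((d) => [d.id, d]));
  const chain = [];
  const seen = new Set(); // guards against accidental cycles
  let current = byId.get(id);
  while (current && !seen.has(current.id)) {
    chain.push(current.id);
    seen.add(current.id);
    current = current.supersedes ? byId.get(current.supersedes) : undefined;
  }
  return chain;
}

// Illustrative data only.
const decisions = [
  { id: "2024-01-database", status: "superseded", supersedes: null },
  { id: "2024-06-database", status: "approved", supersedes: "2024-01-database" },
];
console.log(currentDecisions(decisions).map((d) => d.id));
console.log(historyOf(decisions, "2024-06-database"));
```

A proposed decision does not displace the one it names until it is approved, which keeps the human approval rule intact.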

PROJECT_CONTEXT.md

# Project: [name]

Goal: [what this project should solve]
Current status: [where it stands now]
Stack: [technologies, tools]
Constraints: [technical limits, deadline, budget]
Important files: [relevant paths]
Relevant domains: [which memory domains apply]
Relevant context packs: [which packs to use]
Open questions: [what is still undecided]
Do not use: [what NOT to bring into this project]

MEMORY_UPDATE_PROPOSAL.md

# Memory Update Proposal

Source: [conversation, agent, document]
Suggested domain: [target domain]
Suggested memory type: fact | decision | preference | workflow | risk
Confidence: verified | hypothesis
Risk: low | medium | high
Why this should become memory: [justification]
Proposed text: [the exact text to become memory]
Human decision: approved | edited | rejected
Reviewer: [name]
Decision date: YYYY-MM-DD

19 / Validation
Implementation checklist

Is there a clear and neutral AGENT.md without reference to specific tools?
Are domains truly separated, with their own vocabulary?
Are there at least three useful Context Packs, with goal, Use/Avoid lists and temporal validity?
Is the first adapter (CLAUDE.md, codex.md, etc.) working?
Does the first client agent read the contract (AGENT.md or adapter) automatically on startup?
Does the client agent only drop suggestions to /90-inbox/, without writing permanent memory on its own?
Does /90-inbox/suggested-memory.md exist, receiving agent suggestions?
Is there an explicit human approval rule before any writing to /00-global/ or /20-domains/?
If MCP is in use, is writing restricted to /90-inbox/? (optional item, the floor does not require MCP)
Was the system tested with at least two different agents consuming the same base?
Is there a plan for when to add Graphiti, and does it include "only when the temporal pain appears"?

20 / Sources
References and closing

The references below anchor the technical components. The guide does not depend on a specific tool, but uses concepts and patterns from these ecosystems.

Hermes Agent, NousResearch. github.com/NousResearch/hermes-agent and hermes-agent.nousresearch.com/docs
Model Context Protocol. modelcontextprotocol.io
Obsidian MCP servers (community). Implementations via the Local REST API plugin or via direct filesystem.
Claude Code Memory. docs.claude.com/en/docs/claude-code/memory
Graphiti, Zep. help.getzep.com/graphiti
DecisionNode. github.com/decisionnode/DecisionNode, reference implementation for the decisions sub-module with embeddings and semantic search via MCP.
Letta, Stateful agents. docs.letta.com
CrewAI. docs.crewai.com
Microsoft Agent Framework. learn.microsoft.com/agent-framework

CLOSING

Persistent memory is only useful when isolation exists. Without isolation, the system gets large, slow and confusing. With separate domains, neutral contract, Context Packs and blackboard as coordination pattern, memory becomes infrastructure. Not accumulated noise.

Final thesis. Do not create a super brain. Create a federated memory: human at the origin, segmented by domain, with a neutral contract as the interface and Git as the spine. Client agents read the contract, write to the inbox and never become the memory's owner. They do not need to talk to each other. The shared board is already the coordination.

A note on convergence. Tools like DecisionNode, developed independently, reached conclusions similar to this guide: structured decisions, multi-agent via MCP, auditable history, implicit blackboard pattern. This is not a coincidence. It is a sign that the problem is real and the architectural direction is correct. The market is converging toward federated memory. This guide tries to name the pattern before it becomes just a product feature.

Federated Memory for AI Agents · Guide v3.0
André Almeida · andrealmeidadc.com