Memory System Setup Guide¶
Complete setup instructions for the AgentX memory system.
Prerequisites¶
- Docker and Docker Compose (for containerized databases)
- Python 3.11+ with uv or pip
- OpenAI API key (for embeddings) or local model setup
Quick Start¶
1. Start Database Services¶
Use the provided Docker Compose configuration:
# Start all database services
docker-compose up -d
# Verify services are running
docker-compose ps
# Check logs
docker-compose logs -f neo4j postgres redis
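The containers can take a little while to accept connections after `docker-compose up -d` returns. As a minimal sketch (not part of AgentX), the following polls the host ports published by the Compose file in this guide until each one accepts a TCP connection. The function names, the timeout values, and the choice of a plain TCP check are assumptions; a successful connection shows only that the port is open, not that the database has finished starting.

```python
import socket
import time

# Host ports published by the Compose file in this guide.
SERVICES = {"neo4j-bolt": 7687, "postgres": 5432, "redis": 6379}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(host: str = "localhost", services=SERVICES,
                      deadline_s: float = 60.0, interval_s: float = 2.0) -> dict:
    """Poll until every port accepts connections or the deadline passes.

    Returns a mapping of service name to True (reachable) or False.
    """
    end = time.monotonic() + deadline_s
    status = {name: False for name in services}
    while True:
        for name, port in services.items():
            if not status[name]:
                status[name] = port_open(host, port)
        if all(status.values()) or time.monotonic() >= end:
            return status
        time.sleep(interval_s)
```

Calling `wait_for_services()` before the schema steps gives a quick signal that all three ports are up; the `docker-compose ps` health column remains the more reliable check.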
2. Initialize Neo4j Schema¶
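Open a Cypher session against the running Neo4j container, using either the Neo4j Browser or cypher-shell:
# Connect to Neo4j browser (open is macOS-specific; on Linux, try xdg-open)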
open http://localhost:7474
# Or use cypher-shell
docker exec -it agent-neo4j cypher-shell -u neo4j -p your_secure_password
Execute the following Cypher commands:
// ============================================
// CONSTRAINTS AND INDEXES
// ============================================
// Uniqueness constraints
CREATE CONSTRAINT conversation_id IF NOT EXISTS
FOR (c:Conversation) REQUIRE c.id IS UNIQUE;
CREATE CONSTRAINT entity_id IF NOT EXISTS
FOR (e:Entity) REQUIRE e.id IS UNIQUE;
CREATE CONSTRAINT fact_id IF NOT EXISTS
FOR (f:Fact) REQUIRE f.id IS UNIQUE;
CREATE CONSTRAINT goal_id IF NOT EXISTS
FOR (g:Goal) REQUIRE g.id IS UNIQUE;
CREATE CONSTRAINT user_id IF NOT EXISTS
FOR (u:User) REQUIRE u.id IS UNIQUE;
// Property indexes for fast lookups
CREATE INDEX entity_name IF NOT EXISTS FOR (e:Entity) ON (e.name);
CREATE INDEX entity_type IF NOT EXISTS FOR (e:Entity) ON (e.type);
CREATE INDEX fact_confidence IF NOT EXISTS FOR (f:Fact) ON (f.confidence);
CREATE INDEX goal_status IF NOT EXISTS FOR (g:Goal) ON (g.status);
CREATE INDEX turn_timestamp IF NOT EXISTS FOR (t:Turn) ON (t.timestamp);
// Full-text search indexes
CREATE FULLTEXT INDEX entity_search IF NOT EXISTS
FOR (e:Entity) ON EACH [e.name, e.aliases, e.description];
CREATE FULLTEXT INDEX fact_search IF NOT EXISTS
FOR (f:Fact) ON EACH [f.claim];
// ============================================
// VECTOR INDEXES
// ============================================
// Turn embeddings (episodic memory)
CREATE VECTOR INDEX turn_embeddings IF NOT EXISTS
FOR (t:Turn) ON (t.embedding)
OPTIONS {
indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}
};
// Entity embeddings (semantic memory)
CREATE VECTOR INDEX entity_embeddings IF NOT EXISTS
FOR (e:Entity) ON (e.embedding)
OPTIONS {
indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}
};
// Fact embeddings
CREATE VECTOR INDEX fact_embeddings IF NOT EXISTS
FOR (f:Fact) ON (f.embedding)
OPTIONS {
indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}
};
// Strategy embeddings (procedural memory)
CREATE VECTOR INDEX strategy_embeddings IF NOT EXISTS
FOR (s:Strategy) ON (s.embedding)
OPTIONS {
indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}
};
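The four vector index statements above differ only in index name and node label, and all of them hard-code 1536 dimensions. If the embedding model changes, every statement has to change with it. The sketch below generates the same statements from a single dimension value; the function name and the `VECTOR_INDEXES` mapping are illustrative assumptions, not part of AgentX.

```python
# Index names and node labels mirror the Cypher statements above.
VECTOR_INDEXES = {
    "turn_embeddings": "Turn",
    "entity_embeddings": "Entity",
    "fact_embeddings": "Fact",
    "strategy_embeddings": "Strategy",
}

# Similarity functions accepted by Neo4j vector indexes.
ALLOWED_SIMILARITY = {"cosine", "euclidean"}


def vector_index_statements(dimensions: int = 1536,
                            similarity: str = "cosine") -> list[str]:
    """Build one CREATE VECTOR INDEX statement per label."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if similarity not in ALLOWED_SIMILARITY:
        raise ValueError(f"unsupported similarity function: {similarity}")
    statements = []
    for name, label in VECTOR_INDEXES.items():
        statements.append(
            f"CREATE VECTOR INDEX {name} IF NOT EXISTS\n"
            f"FOR (n:{label}) ON (n.embedding)\n"
            "OPTIONS {indexConfig: {\n"
            f"  `vector.dimensions`: {dimensions},\n"
            f"  `vector.similarity_function`: '{similarity}'\n"
            "}};"
        )
    return statements
```

Because the statements use `IF NOT EXISTS`, an existing index with a different dimension is left unchanged; drop it first if the dimension needs to change.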
3. Initialize PostgreSQL Schema¶
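Connect to PostgreSQL and run the initialization script. With the Docker Compose configuration in this guide, ./init-scripts is mounted at /docker-entrypoint-initdb.d, so any script placed there before the first start runs automatically when the data directory is created. Run it manually only if it has not already been applied, because the CREATE TABLE statements below fail when the tables already exist: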
# Connect to PostgreSQL
docker exec -it agent-postgres psql -U agent -d agent_memory
# Or use a SQL file
docker exec -i agent-postgres psql -U agent -d agent_memory < init-scripts/01-init.sql
SQL initialization script:
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- For fuzzy text search
-- Conversation logs (append-only time series)
CREATE TABLE conversation_logs (
id BIGSERIAL PRIMARY KEY,
conversation_id UUID NOT NULL,
turn_index INTEGER NOT NULL,
timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
role VARCHAR(20) NOT NULL,
content TEXT NOT NULL,
content_hash VARCHAR(64),
token_count INTEGER,
model VARCHAR(100),
metadata JSONB DEFAULT '{}',
embedding vector(1536),
UNIQUE(conversation_id, turn_index)
);
-- BRIN index for time-range queries (very efficient for time-series)
CREATE INDEX idx_logs_timestamp ON conversation_logs USING BRIN (timestamp);
CREATE INDEX idx_logs_conversation ON conversation_logs (conversation_id);
CREATE INDEX idx_logs_embedding ON conversation_logs USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- Memory timeline (unified temporal index)
CREATE TABLE memory_timeline (
id BIGSERIAL PRIMARY KEY,
memory_type VARCHAR(50) NOT NULL,
neo4j_node_id VARCHAR(100),
event_time TIMESTAMPTZ NOT NULL,
summary TEXT,
embedding vector(1536),
importance_score FLOAT DEFAULT 0.5,
access_count INTEGER DEFAULT 0,
last_accessed TIMESTAMPTZ,
archived BOOLEAN DEFAULT FALSE,
metadata JSONB DEFAULT '{}'
);
CREATE INDEX idx_timeline_time ON memory_timeline USING BRIN (event_time);
CREATE INDEX idx_timeline_type ON memory_timeline (memory_type);
CREATE INDEX idx_timeline_importance ON memory_timeline (importance_score DESC);
CREATE INDEX idx_timeline_embedding ON memory_timeline USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- Tool invocations audit
CREATE TABLE tool_invocations (
id BIGSERIAL PRIMARY KEY,
conversation_id UUID NOT NULL,
turn_index INTEGER NOT NULL,
timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
tool_name VARCHAR(100) NOT NULL,
tool_input JSONB NOT NULL,
tool_output JSONB,
success BOOLEAN,
latency_ms INTEGER,
error_message TEXT
);
CREATE INDEX idx_tools_conversation ON tool_invocations (conversation_id);
CREATE INDEX idx_tools_name ON tool_invocations (tool_name);
CREATE INDEX idx_tools_timestamp ON tool_invocations USING BRIN (timestamp);
-- User preferences and profiles
CREATE TABLE user_profiles (
user_id VARCHAR(100) PRIMARY KEY,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
preferences JSONB DEFAULT '{}',
expertise_areas JSONB DEFAULT '[]',
communication_style JSONB DEFAULT '{}',
metadata JSONB DEFAULT '{}'
);
-- Function to update timestamp
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER user_profiles_updated
BEFORE UPDATE ON user_profiles
FOR EACH ROW
EXECUTE FUNCTION update_updated_at();
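The ivfflat indexes above use a fixed `lists = 100`. The pgvector documentation suggests starting with rows / 1000 lists for tables of up to one million rows and sqrt(rows) beyond that, with sqrt(lists) as a starting point for the number of probes at query time. A small sketch of that rule of thumb follows; the function names are hypothetical and the results are starting points to tune, not fixed values.

```python
import math


def recommended_ivfflat_lists(row_count: int) -> int:
    """Starting value for the ivfflat `lists` parameter.

    Follows the pgvector guidance: rows / 1000 up to 1M rows,
    sqrt(rows) for larger tables. Never returns less than 1.
    """
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)


def recommended_probes(lists: int) -> int:
    """Starting value for the number of probes: roughly sqrt(lists)."""
    if lists < 1:
        raise ValueError("lists must be at least 1")
    return max(1, math.isqrt(lists))
```

By this rule, `lists = 100` corresponds to a table of about 100,000 rows. An ivfflat index built on an empty table has no data to cluster, so it may be worth rebuilding the index after the tables hold a representative amount of data.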
4. Configure Environment Variables¶
Create a .env file in the project root:
# Neo4j Configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_secure_password
# PostgreSQL Configuration
POSTGRES_URI=postgresql://agent:your_secure_password@localhost:5432/agent_memory
# Redis Configuration
REDIS_URI=redis://localhost:6379
# Embedding Provider ("openai" or "local")
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-your-api-key-here
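# Local Embedding Model (if using local)
# Note: nomic-embed-text-v1.5 produces 768-dimensional vectors by default, so
# EMBEDDING_DIMENSIONS and every 1536-dimension index and column above must be changed to match.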
LOCAL_EMBEDDING_MODEL=nomic-ai/nomic-embed-text-v1.5
# Memory Settings
EPISODIC_RETENTION_DAYS=90
FACT_CONFIDENCE_THRESHOLD=0.7
SALIENCE_DECAY_RATE=0.95
MAX_WORKING_MEMORY_ITEMS=50
# Retrieval Settings
DEFAULT_TOP_K=10
RERANKING_ENABLED=true
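Misconfigured values in `.env` tend to surface later as connection or retrieval errors. The sketch below reads a subset of the variables above from the process environment and validates them up front. It is a standard-library stand-in: how AgentX actually loads its settings is not shown in this guide, the `MemorySettings` class and `load_settings` function are hypothetical names, and the code assumes the variables have already been exported (it does not parse the `.env` file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MemorySettings:
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    postgres_uri: str
    redis_uri: str
    embedding_provider: str
    embedding_dimensions: int
    fact_confidence_threshold: float
    default_top_k: int
    reranking_enabled: bool


def _require(env: Mapping[str, str], key: str) -> str:
    """Return env[key], raising if it is missing or empty."""
    value = env.get(key)
    if not value:
        raise ValueError(f"missing required setting: {key}")
    return value


def load_settings(env: Mapping[str, str] = os.environ) -> MemorySettings:
    """Read and validate settings; defaults mirror the example .env above."""
    settings = MemorySettings(
        neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        neo4j_user=env.get("NEO4J_USER", "neo4j"),
        neo4j_password=_require(env, "NEO4J_PASSWORD"),
        postgres_uri=_require(env, "POSTGRES_URI"),
        redis_uri=env.get("REDIS_URI", "redis://localhost:6379"),
        embedding_provider=env.get("EMBEDDING_PROVIDER", "openai"),
        embedding_dimensions=int(env.get("EMBEDDING_DIMENSIONS", "1536")),
        fact_confidence_threshold=float(env.get("FACT_CONFIDENCE_THRESHOLD", "0.7")),
        default_top_k=int(env.get("DEFAULT_TOP_K", "10")),
        reranking_enabled=env.get("RERANKING_ENABLED", "true").strip().lower()
        in {"1", "true", "yes", "on"},
    )
    if settings.embedding_provider not in {"openai", "local"}:
        raise ValueError("EMBEDDING_PROVIDER must be 'openai' or 'local'")
    if settings.embedding_provider == "openai" and not env.get("OPENAI_API_KEY"):
        raise ValueError("OPENAI_API_KEY is required when EMBEDDING_PROVIDER=openai")
    if settings.embedding_dimensions <= 0 or settings.default_top_k <= 0:
        raise ValueError("EMBEDDING_DIMENSIONS and DEFAULT_TOP_K must be positive")
    # Assumption: the confidence threshold is a probability-like value in [0, 1].
    if not 0.0 <= settings.fact_confidence_threshold <= 1.0:
        raise ValueError("FACT_CONFIDENCE_THRESHOLD must be between 0 and 1")
    return settings
```

The `pydantic-settings` package installed in the next step offers the same kind of validation with `.env` parsing built in, and may be a better fit in a real project.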
5. Install Python Dependencies¶
Add required dependencies to your project:
# Using uv (recommended)
uv add neo4j redis sqlalchemy psycopg2-binary pgvector pydantic-settings
# For OpenAI embeddings
uv add openai
# For local embeddings
uv add sentence-transformers
# Optional: for entity extraction
uv add spacy
python -m spacy download en_core_web_sm
Or add to pyproject.toml:
[project]
dependencies = [
"neo4j>=5.15.0",
"redis>=5.0.0",
"sqlalchemy>=2.0.0",
"psycopg2-binary>=2.9.0",
"pgvector>=0.2.0",
"pydantic-settings>=2.0.0",
"openai>=1.0.0", # For OpenAI embeddings
"sentence-transformers>=2.2.0", # For local embeddings
]
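Before moving on to verification, it can help to confirm that the packages above are importable from the active environment. The following sketch checks each one with the standard library; the `REQUIRED_MODULES` mapping and function name are illustrative, and the mapping is needed because some distribution names differ from their import names (for example, `psycopg2-binary` installs the `psycopg2` module).

```python
from importlib.util import find_spec

# Distribution name -> top-level import name.
REQUIRED_MODULES = {
    "neo4j": "neo4j",
    "redis": "redis",
    "sqlalchemy": "sqlalchemy",
    "psycopg2-binary": "psycopg2",
    "pgvector": "pgvector",
    "pydantic-settings": "pydantic_settings",
}


def missing_packages(modules=REQUIRED_MODULES) -> list[str]:
    """Return the distribution names whose modules cannot be found."""
    return [pkg for pkg, module in modules.items() if find_spec(module) is None]
```

An empty list from `missing_packages()` means every core dependency resolved; the optional packages (`openai`, `sentence-transformers`, `spacy`) can be checked the same way by passing a different mapping.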
6. Verify Installation¶
Test the memory system with the database services running and the .env file in place:
from agentx_ai.kit.agent_memory import AgentMemory, Turn
from uuid import uuid4
# Initialize memory
memory = AgentMemory(user_id="test_user", conversation_id=str(uuid4()))
# Store a test turn
turn = Turn(
conversation_id=memory.conversation_id,
index=0,
role="user",
content="Hello, this is a test message."
)
memory.store_turn(turn)
# Retrieve
context = memory.remember("test message")
print(context.to_context_string())
# Clean up
memory.close()
Running the Consolidation Worker¶
The background consolidation worker should run as a separate process:
# Development
python -m agentx_ai.kit.agent_memory.consolidation.worker
# Production (with supervisor/systemd)
# See the Production Deployment section below
Docker Compose Configuration¶
Create docker-compose.yml for the memory system databases:
version: '3.8'
services:
neo4j:
image: neo4j:5.15-community
container_name: agent-neo4j
ports:
- "7474:7474" # Browser
- "7687:7687" # Bolt
environment:
- NEO4J_AUTH=neo4j/your_secure_password
- NEO4J_PLUGINS=["apoc"]
- NEO4J_apoc_export_file_enabled=true
- NEO4J_apoc_import_file_enabled=true
- NEO4J_apoc_import_file_use__neo4j__config=true
- NEO4J_server_memory_heap_initial__size=512m
- NEO4J_server_memory_heap_max__size=2G
- NEO4J_server_memory_pagecache_size=1G
volumes:
- ./data/neo4j/data:/data
- ./data/neo4j/logs:/logs
- ./data/neo4j/plugins:/plugins
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:7474"]
interval: 10s
timeout: 5s
retries: 5
postgres:
image: pgvector/pgvector:pg16
container_name: agent-postgres
ports:
- "5432:5432"
environment:
- POSTGRES_USER=agent
- POSTGRES_PASSWORD=your_secure_password
- POSTGRES_DB=agent_memory
volumes:
- ./data/postgres:/var/lib/postgresql/data
- ./init-scripts:/docker-entrypoint-initdb.d
healthcheck:
test: ["CMD-SHELL", "pg_isready -U agent -d agent_memory"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
container_name: agent-redis
ports:
- "6379:6379"
command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru
volumes:
- ./data/redis:/data
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
# Optional: Redis GUI
redis-commander:
image: rediscommander/redis-commander:latest
container_name: agent-redis-gui
ports:
- "8081:8081"
environment:
- REDIS_HOSTS=local:redis:6379
depends_on:
- redis
Development Workflow¶
Testing Memory Operations¶
# Test episodic memory
from agentx_ai.kit.agent_memory import AgentMemory, Turn
from uuid import uuid4
memory = AgentMemory(user_id="dev_user", conversation_id=str(uuid4()))
# Add multiple turns
for i, content in enumerate(["Hello", "How are you?", "Tell me about Python"]):
turn = Turn(
conversation_id=memory.conversation_id,
index=i,
role="user" if i % 2 == 0 else "assistant",
content=content
)
memory.store_turn(turn)
# Test retrieval
context = memory.remember("Python programming", top_k=5)
print(f"Found {len(context.relevant_turns)} relevant turns")
# Test semantic memory
from agentx_ai.kit.agent_memory import Entity
entity = Entity(
name="Python",
type="ProgrammingLanguage",
description="High-level programming language"
)
memory.upsert_entity(entity)
# Test procedural memory
memory.record_tool_usage(
tool_name="code_interpreter",
tool_input={"code": "print('hello')"},
tool_output={"result": "hello"},
success=True,
latency_ms=150
)
Monitoring¶
Check database health:
# Neo4j stats
docker exec agent-neo4j cypher-shell -u neo4j -p your_secure_password "CALL dbms.listConfig() YIELD name, value WHERE name STARTS WITH 'server.memory' RETURN name, value"
# PostgreSQL stats
docker exec agent-postgres psql -U agent -d agent_memory -c "SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size FROM pg_tables WHERE schemaname NOT IN ('pg_catalog', 'information_schema') ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;"
# Redis stats
docker exec agent-redis redis-cli INFO memory
Troubleshooting¶
Neo4j Connection Issues¶
# Check if Neo4j is running
docker ps | grep neo4j
# Check logs
docker logs agent-neo4j
# Test connection
docker exec agent-neo4j cypher-shell -u neo4j -p your_secure_password "RETURN 'Connected' as status"
PostgreSQL Connection Issues¶
# Check if PostgreSQL is running
docker ps | grep postgres
# Check logs
docker logs agent-postgres
# Test connection
docker exec agent-postgres pg_isready -U agent -d agent_memory
Redis Connection Issues¶
# Check if Redis is running
docker ps | grep redis
# Test connection
docker exec agent-redis redis-cli ping
Vector Index Issues¶
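If vector searches are slow, first check that the vector indexes are in the ONLINE state, and rebuild an index only if it is not: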
// Check vector index status
SHOW INDEXES YIELD name, type, state WHERE type = "VECTOR";
// Rebuild vector index if needed
DROP INDEX turn_embeddings;
CREATE VECTOR INDEX turn_embeddings FOR (t:Turn) ON (t.embedding)
OPTIONS {indexConfig: {`vector.dimensions`: 1536, `vector.similarity_function`: 'cosine'}};
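Separately from index state, it is worth ruling out malformed embeddings. Every index and column in this guide is declared with 1536 dimensions, and pgvector rejects a vector of any other length for a `vector(1536)` column. How Neo4j treats mismatched vectors is not covered here, so the sketch below simply validates an embedding before it is written anywhere. The function name is hypothetical and the checks are assumptions about what a sane embedding looks like, not AgentX behavior.

```python
import math
from typing import Sequence


def validate_embedding(vector: Sequence[float],
                       expected_dim: int = 1536) -> list[float]:
    """Return the embedding as a list of floats, or raise ValueError.

    Checks the length against the index dimension, rejects NaN and
    infinite values, and rejects the all-zero vector, for which cosine
    similarity is undefined.
    """
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}"
        )
    values = [float(x) for x in vector]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("embedding contains NaN or infinite values")
    if all(x == 0.0 for x in values):
        raise ValueError("embedding is all zeros; cosine similarity is undefined")
    return values
```

Passing the value of `EMBEDDING_DIMENSIONS` as `expected_dim` keeps this check aligned with the configured model.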
Production Deployment¶
Using Supervisor¶
Create /etc/supervisor/conf.d/agentx-memory-worker.conf:
[program:agentx-memory-worker]
command=/path/to/venv/bin/python -m agentx_ai.kit.agent_memory.consolidation.worker
directory=/path/to/agentx-source
user=agentx
autostart=true
autorestart=true
stderr_logfile=/var/log/agentx/memory-worker.err.log
stdout_logfile=/var/log/agentx/memory-worker.out.log
environment=PATH="/path/to/venv/bin"
Using Systemd¶
Create /etc/systemd/system/agentx-memory-worker.service:
[Unit]
Description=AgentX Memory Consolidation Worker
After=network.target
[Service]
Type=simple
User=agentx
WorkingDirectory=/path/to/agentx-source
Environment="PATH=/path/to/venv/bin"
ExecStart=/path/to/venv/bin/python -m agentx_ai.kit.agent_memory.consolidation.worker
Restart=always
[Install]
WantedBy=multi-user.target
Then:
sudo systemctl daemon-reload
sudo systemctl enable agentx-memory-worker
sudo systemctl start agentx-memory-worker
sudo systemctl status agentx-memory-worker
Next Steps¶
- Implement entity extraction (see extraction/entities.py)
- Implement fact extraction (see extraction/facts.py)
- Configure monitoring and alerting
- Set up backup procedures for databases
- Tune database parameters for your workload