# Architecture Overview
AgentX is a two-tier system: a Django REST API backend and a Tauri desktop client. This document covers the backend architecture.
## System Architecture

```mermaid
graph TB
    subgraph Views["Views Layer"]
        V[views.py<br/>HTTP dispatch]
    end
    subgraph Agent["Agent Core"]
        AC[Agent]
        TP[TaskPlanner]
        SM[SessionManager]
        CM[ContextManager]
        OP[OutputParser]
    end
    subgraph Reasoning["Reasoning Framework"]
        RO[Orchestrator]
        CoT[Chain-of-Thought]
        ToT[Tree-of-Thought]
        ReAct[ReAct]
        Ref[Reflection]
    end
    subgraph Drafting["Drafting Framework"]
        Spec[Speculative]
        Pipe[Pipeline]
        Cand[Candidate]
    end
    subgraph Providers["Model Providers"]
        PR[ProviderRegistry]
        LMS[LM Studio]
        ANT[Anthropic]
        OAI[OpenAI]
    end
    subgraph MCP["MCP Client"]
        MCM[ClientManager]
        SR[ServerRegistry]
        TE[ToolExecutor]
        subgraph Transports
            STDIO[stdio]
            SSE[SSE]
            HTTP[Streamable HTTP]
        end
    end
    subgraph Prompts["Prompt System"]
        PM[PromptManager]
        Prof[Profiles]
        Sect[Sections]
        Comp[Composer]
    end
    subgraph Memory["Memory System"]
        MI[AgentMemory Interface]
        EP[Episodic]
        SEM[Semantic]
        PROC[Procedural]
        WM[Working]
        EXT[Extraction]
        CON[Consolidation]
        REC[RecallLayer]
    end
    subgraph Data["Data Layer"]
        Neo4j[("Neo4j")]
        PG[("PostgreSQL<br/>+ pgvector")]
        Redis[("Redis")]
    end
    TK[TranslationKit]

    V --> AC
    V --> TK
    V --> MCM
    V --> PR
    V --> PM
    V --> MI
    AC --> RO
    AC --> TE
    AC --> PM
    AC --> CM
    AC --> MI
    RO --> CoT & ToT & ReAct & Ref
    RO --> PR
    AC -.-> Spec & Pipe & Cand
    Spec & Pipe & Cand --> PR
    PR --> LMS & ANT & OAI
    MCM --> SR --> Transports
    MI --> EP & SEM & PROC & WM
    MI --> EXT & CON & REC
    EP & SEM & PROC --> Neo4j
    EP & SEM --> PG
    WM --> Redis
    CON --> EXT
```
## Request Lifecycle

A `POST /api/agent/chat` request follows this path:
```mermaid
sequenceDiagram
    participant C as Client
    participant V as views.py
    participant A as Agent
    participant PM as PromptManager
    participant S as SessionManager
    participant M as AgentMemory
    participant P as Provider
    participant MCP as ToolExecutor

    C->>V: POST /agent/chat {message, model, profile_id, session_id}
    V->>A: Agent(config)
    A->>S: get_or_create(session_id)
    A->>M: store_turn(user_turn)
    A->>M: remember(query)
    M-->>A: MemoryBundle
    A->>PM: get_system_prompt(profile_id)
    PM-->>A: composed system prompt
    A->>A: build messages (system + context + memory + user)
    A->>A: _get_tools_for_provider() → MCP tools
    loop Tool-use loop (max_tool_rounds)
        A->>P: complete(messages, tools)
        P-->>A: CompletionResult
        alt has tool_calls
            A->>MCP: call_tool_sync(name, args)
            MCP-->>A: ToolResult
            A->>A: append tool result to messages
        else no tool_calls
            Note over A: break loop
        end
    end
    A->>A: parse_output() → extract <think> tags
    A->>S: add_message(assistant)
    A->>M: store_turn(assistant_turn)
    A-->>V: AgentResult
    V-->>C: JSON response
```
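The tool-use loop at the heart of this sequence can be sketched in Python. This is a minimal illustration, not the actual implementation: `provider`, `tool_executor`, `CompletionResult`, and `MAX_TOOL_ROUNDS` are hypothetical stand-ins shaped after the names in the diagram.

```python
# Hypothetical sketch of the agent's tool-use loop. The provider and
# tool-executor objects stand in for the real Provider / ToolExecutor.
from dataclasses import dataclass, field

MAX_TOOL_ROUNDS = 5  # stands in for the real max_tool_rounds setting


@dataclass
class CompletionResult:
    content: str
    tool_calls: list = field(default_factory=list)  # [(name, args), ...]


def run_tool_loop(provider, tool_executor, messages, tools):
    """Call the model until it stops requesting tools or the round cap hits."""
    result = None
    for _ in range(MAX_TOOL_ROUNDS):
        result = provider.complete(messages, tools)
        if not result.tool_calls:
            break  # model produced a final answer; exit the loop
        for name, args in result.tool_calls:
            tool_result = tool_executor.call_tool_sync(name, args)
            # Append the tool output so the next round can use it
            messages.append({"role": "tool", "name": name,
                             "content": str(tool_result)})
    return result
```

The round cap bounds worst-case latency: a model that keeps requesting tools is cut off after `MAX_TOOL_ROUNDS` completions.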
## Module Index

| Module | Path | Purpose | Init |
|---|---|---|---|
| Agent | `agent/core.py` | Orchestrates reasoning, tools, memory, prompts | Per-request |
| TaskPlanner | `agent/planner.py` | Decomposes tasks into subtasks with goal tracking | Per-request |
| SessionManager | `agent/session.py` | Maintains conversation context across messages | Lazy singleton |
| ContextManager | `agent/context.py` | Token budgeting, memory injection, summarization | Per-request |
| OutputParser | `agent/output_parser.py` | Extracts `<think>` tags from model output | Stateless |
| Reasoning | `reasoning/orchestrator.py` | Selects and executes reasoning strategy | Per-request |
| Drafting | `drafting/` | Speculative decoding, pipelines, candidates | Per-request |
| Providers | `providers/registry.py` | Model-to-provider resolution, model registry | Lazy singleton |
| MCP | `mcp/client.py` | External tool server connections and execution | Lazy singleton |
| Prompts | `prompts/manager.py` | System prompt composition from profiles + sections | Lazy singleton |
| Memory | `kit/agent_memory/memory/interface.py` | Unified API for episodic/semantic/procedural/working memory | Lazy |
| RecallLayer | `kit/agent_memory/recall/layer.py` | Multi-strategy retrieval (hybrid, HyDE, entity-centric) | Per-query |
| Extraction | `kit/agent_memory/extraction/service.py` | LLM-based entity/fact extraction | Per-consolidation |
| Consolidation | `kit/agent_memory/consolidation/worker.py` | Background jobs for memory processing | Background thread |
| Translation | `kit/translation.py` | NLLB-200 translation + language detection | Lazy singleton |
| Config | `config.py` | Runtime config persistence to `data/config.json` | Lazy singleton |
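To make the OutputParser's job concrete, here is a minimal regex-based sketch of `<think>`-tag extraction. It is an illustration under assumed behavior; the real parser may handle malformed or nested tags differently.

```python
import re

# Non-greedy match so multiple <think> blocks are captured separately;
# DOTALL lets the reasoning span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def parse_output(raw: str) -> tuple[str, list[str]]:
    """Split model output into visible content and hidden reasoning."""
    thoughts = THINK_RE.findall(raw)
    content = THINK_RE.sub("", raw).strip()
    return content, thoughts
```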
## Design Decisions
**Lazy singletons** — Heavy subsystems (TranslationKit, MCP, Providers, Prompts) use `@lazy_singleton` to defer initialization until first use. Health checks can probe without triggering model loads via `get_if_initialized()`.
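A minimal sketch of what such a decorator could look like, assuming the real one exposes `get()` and `get_if_initialized()` on the decorated class (the names are taken from the text above; the implementation details are hypothetical):

```python
import threading


def lazy_singleton(cls):
    """Sketch: defer construction until first get(); let health checks
    peek at the instance without triggering an expensive load."""
    lock = threading.Lock()
    holder = {"instance": None}

    def get():
        with lock:  # double-checked construction, thread-safe
            if holder["instance"] is None:
                holder["instance"] = cls()
            return holder["instance"]

    def get_if_initialized():
        return holder["instance"]  # None until someone calls get()

    cls.get = staticmethod(get)
    cls.get_if_initialized = staticmethod(get_if_initialized)
    return cls


@lazy_singleton
class TranslationKit:
    def __init__(self):
        self.model = "loaded"  # stands in for an expensive model load
```

A health-check view can then call `TranslationKit.get_if_initialized()` and report "not loaded" instead of paying the load cost.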
**Sync Django + async MCP** — Django runs synchronously, while the MCP client uses asyncio internally. The bridge is `MCPClientManager.call_tool_sync()`, which runs async tool calls on a background event-loop thread. The streaming chat endpoint (`agent_chat_stream`) is the only async view.
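The bridge pattern can be sketched with the standard library's `asyncio.run_coroutine_threadsafe`: a daemon thread owns the event loop, and sync callers submit coroutines to it and block on the returned future. The class body below is a simplified stand-in, not the real `MCPClientManager`.

```python
import asyncio
import threading


class MCPClientManager:
    """Sketch of the sync-to-async bridge: a background thread owns the
    event loop; synchronous Django views submit coroutines to it."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, daemon=True)
        self._thread.start()

    async def call_tool(self, name, args):
        await asyncio.sleep(0)  # stands in for real async MCP I/O
        return {"tool": name, "args": args}

    def call_tool_sync(self, name, args, timeout=30):
        # Hand the coroutine to the loop thread; block on the
        # concurrent.futures.Future it returns.
        future = asyncio.run_coroutine_threadsafe(
            self.call_tool(name, args), self._loop)
        return future.result(timeout=timeout)
```

This keeps every view except the streaming one free of `async def`, at the cost of one long-lived thread.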
**Per-request Agent** — Each chat/run request creates a fresh `Agent` instance with its own config. Shared state (sessions, providers, MCP connections) lives in singletons. This keeps the Agent stateless and thread-safe.
**Memory is optional** — All memory operations are wrapped in `try/except`, so the system degrades gracefully when databases are unavailable. `enable_memory=False` skips all memory operations.
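One way to implement this degradation, sketched under assumptions: a decorator (here called `memory_safe`, a hypothetical name) that both honors an `enable_memory` flag and swallows backend failures, returning a safe default instead of failing the chat request.

```python
import functools
import logging

logger = logging.getLogger("agent.memory")


def memory_safe(default=None):
    """Sketch: wrap a memory operation so a missing or failing database
    degrades to a default value instead of raising into the request."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            if not getattr(self, "enable_memory", True):
                return default  # enable_memory=False skips the call
            try:
                return fn(self, *args, **kwargs)
            except Exception:
                logger.warning("memory op %s failed; continuing", fn.__name__)
                return default
        return wrapper
    return decorator


class AgentMemory:
    def __init__(self, enable_memory=True, backend=None):
        self.enable_memory = enable_memory
        self.backend = backend  # None simulates an unavailable database

    @memory_safe(default=[])
    def remember(self, query):
        return self.backend.search(query)  # raises if backend is None
```

The agent then treats an empty `MemoryBundle` the same whether memory is disabled, the database is down, or there is simply nothing to recall.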