An AI agent that thinks in stages, acts before you ask, and writes its own tools. Self-hosted. Open source. Windows.
Not another chatbot wrapper. A neuroscience-inspired cognitive system that perceives, remembers, plans, acts, and learns.
Cognitive architecture
Most agents are a while-loop around a chat API. Leontes runs a 5-stage pipeline inspired by Global Workspace Theory and Kahneman's dual-process model.
Extract entities, classify intent, detect urgency. No LLM needed. Fast pattern matching only.
Search four memory types. Resolve "Sarah" to a real person via the knowledge graph.
LLM picks tools and strategy. Can pause here to ask you a question, then resume.
Stream the response. Call tools. If the server crashes, it picks up from the last checkpoint.
Store what worked, who you mentioned, what you prefer. Next time you ask about Sarah, it already knows she's on the Alpha team.
Each stage checkpoints its state, so a server crash mid-pipeline resumes from the last checkpoint instead of starting over. Every decision is traced. Ask "why did you do that?" and get an answer drawn from the recorded trace. Built on Microsoft Agent Framework Workflows.
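The checkpoint-and-resume idea can be sketched in a few lines. This is an illustrative Python sketch, not Leontes' .NET code: the stage names (perceive, recall, plan, act, learn), the JSON checkpoint file, and the run_pipeline function are all assumptions made for the example, and each stage body is a placeholder.

```python
import json
from pathlib import Path

# Hypothetical stage functions; each one is a placeholder for real work.
def perceive(state):
    text = state["input"].strip()
    state["intent"] = "question" if text.endswith("?") else "statement"
    return state

def recall(state):
    state["memories"] = []  # a real system would search its memory stores here
    return state

def plan(state):
    state["plan"] = ["reply"]
    return state

def act(state):
    state["response"] = f"(reply to: {state['input']})"
    return state

def learn(state):
    state["stored"] = True
    return state

STAGES = [perceive, recall, plan, act, learn]

def run_pipeline(user_input, checkpoint_path):
    """Run all stages, saving state after each one so a restart can resume."""
    path = Path(checkpoint_path)
    if path.exists():
        # A checkpoint from an interrupted run exists: continue from it.
        saved = json.loads(path.read_text())
        state, next_stage = saved["state"], saved["next_stage"]
    else:
        state, next_stage = {"input": user_input, "trace": []}, 0
    for index in range(next_stage, len(STAGES)):
        stage = STAGES[index]
        state = stage(state)
        state["trace"].append(stage.__name__)  # minimal decision trace
        path.write_text(json.dumps({"state": state, "next_stage": index + 1}))
    path.unlink(missing_ok=True)  # run finished, checkpoint no longer needed
    return state
```

If the process dies after the third stage, the next call with the same checkpoint path skips straight to the fourth stage, and the trace still lists every stage that ran.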
Dual-process intelligence
Inspired by Kahneman's System 1 / System 2 model. Most OS events are handled by fast reflexes. The "conscious mind" only activates when something surprising happens.
System 1. Fast, local, free. Watches file downloads, clipboard, calendar, and active windows. Applies heuristic filters: regex, frequency analysis, time rules. No LLM calls. Handles most events by reflex.
System 2. Slow, deliberate, powerful. The full 5-stage cognitive pipeline. Only triggered when System 1 detects something it can't handle alone. Your agent notices when you copy an IBAN and asks if you want to find the matching invoice.
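A reflex filter of this kind needs no model at all. The Python sketch below is illustrative only: the Event type, the system1 function, and the "escalate" / "handled" return values are hypothetical names, and the IBAN check is the standard shape test plus the mod-97 checksum, not necessarily what Leontes uses.

```python
import re
from dataclasses import dataclass

# Country code, two check digits, then 11 to 30 alphanumeric characters.
IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}")

@dataclass
class Event:
    kind: str      # e.g. "clipboard", "download" (hypothetical event kinds)
    payload: str

def looks_like_iban(text):
    """Cheap local check: shape match, then the mod-97 checksum."""
    candidate = text.replace(" ", "").upper()
    if not IBAN_SHAPE.fullmatch(candidate):
        return False
    rearranged = candidate[4:] + candidate[:4]
    # Letters map to numbers A=10 ... Z=35; digits map to themselves.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1

def system1(event):
    """Return "escalate" only when the reflex layer sees something notable."""
    if event.kind == "clipboard" and looks_like_iban(event.payload):
        return "escalate"  # hand over to the slow, deliberate pipeline
    return "handled"       # everything else is dealt with by reflex
```

Copying "GB82 WEST 1234 5698 7654 32" (a widely used example IBAN) would escalate; copying an ordinary sentence would not.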
Capabilities
Modules that work together. Not features bolted on top of an LLM.
Ask about a meeting from two weeks ago and it remembers. Mention Sarah and it knows she's on the Alpha team. Four memory types (working, episodic, semantic, procedural) in PostgreSQL with pgvector.
Knowledge graph linking people, files, and projects. "Send the report to Sarah" finds the right person, file, and channel. Graph-augmented retrieval, not flat vector search.
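To show why a graph helps here, the sketch below resolves an ambiguous mention by preferring candidates linked to the current context. It is a toy in-memory Python example with made-up data: the node ids, surnames, the Beta project, the file name, and the resolve function are all hypothetical, and a real deployment would query the database rather than a dict.

```python
# Hypothetical graph: two people named Sarah, two projects, one file.
NODES = {
    "p1": {"type": "person", "name": "Sarah Chen"},
    "p2": {"type": "person", "name": "Sarah Lopez"},
    "t1": {"type": "project", "name": "Alpha"},
    "t2": {"type": "project", "name": "Beta"},
    "f1": {"type": "file", "name": "q3-report.pdf"},
}
EDGES = [
    ("p1", "member_of", "t1"),
    ("p2", "member_of", "t2"),
    ("f1", "belongs_to", "t1"),
]

def neighbors(node_id):
    """All nodes one hop away from node_id, ignoring edge direction."""
    linked = set()
    for source, _relation, target in EDGES:
        if source == node_id:
            linked.add(target)
        elif target == node_id:
            linked.add(source)
    return linked

def resolve(mention, node_type, context_id):
    """Pick the node whose name matches, preferring ones linked to the context."""
    candidates = [
        node_id for node_id, node in NODES.items()
        if node["type"] == node_type and mention.lower() in node["name"].lower()
    ]
    if not candidates:
        return None
    near = neighbors(context_id)
    # True sorts above False, so a graph-connected candidate wins.
    return max(candidates, key=lambda node_id: node_id in near)
```

A flat name search returns both Sarahs; with the Alpha project as context, the graph link breaks the tie.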
"What error is showing in that dialog?" It reads the UI tree via Windows UI Automation and answers from structure, not screenshots. Password fields and excluded apps are never captured.
The agent writes, compiles (Roslyn), tests, and registers new tools at runtime. You approve before anything runs. Unused tools are pruned automatically.
The agent can send notifications, ask mid-task questions, request permissions, and stream progress updates. Not just reactive chat.
Every response has a confidence score (0 to 1). When the agent is unsure, it asks. When it's confident, it acts. Ask "why did you do that?" and see the full decision trace.
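The act-or-ask rule reduces to a threshold check plus a record of why the choice was made. A minimal Python sketch, assuming a single cutoff; the 0.7 threshold, the decide function, and the record fields are illustrative assumptions, not values from Leontes.

```python
def decide(action, confidence, threshold=0.7):
    """Act when confident, ask when unsure, and record the reasoning."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    choice = "act" if confidence >= threshold else "ask"
    # The returned record doubles as a trace entry for later inspection.
    return {
        "action": action,
        "confidence": confidence,
        "threshold": threshold,
        "choice": choice,
    }
```

Keeping the threshold in the record means a later "why did you do that?" can be answered from the stored entry alone.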
Token budgets per feature. Automatic model routing: small model for simple tasks, large for complex reasoning. Background tasks throttle first; your chat never silently blocks.
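One way to read the routing rule: the large model is used only for complex work while budget remains, background tasks are deferred once the budget runs low, and interactive chat always gets at least the small model. The Python sketch below makes those assumptions explicit; the Budget class, the 20% reserve, and the tier names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    limit: int          # token allowance for one feature
    used: int = 0
    reserve: float = 0.2  # assumed: pressure starts when under 20% remains

    def under_pressure(self):
        return (self.limit - self.used) < self.limit * self.reserve

def route(complexity, budget, background=False):
    """Return "large", "small", or "defer" for a task."""
    pressure = budget.under_pressure()
    if background and pressure:
        return "defer"   # background work throttles first
    if complexity == "complex" and not pressure:
        return "large"
    return "small"       # chat is never blocked, only downgraded
```

Under this sketch a complex chat request still gets an answer when the budget is nearly spent; it is just served by the cheaper tier.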
LLM goes down? Sentinel heuristics, local tools, and memory retrieval keep working. Bounded queues with backpressure. Each pipeline stage degrades independently instead of failing the whole request.
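Both ideas, bounded queues and per-stage degradation, fit in a short sketch. This Python example is illustrative only: BoundedEventQueue, offer, and with_fallback are hypothetical names, and here backpressure is signalled simply by offer returning False so the producer can slow down or drop the event.

```python
import queue

class BoundedEventQueue:
    """Fixed-capacity queue that refuses new items instead of growing."""

    def __init__(self, capacity):
        self._items = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def offer(self, event):
        """Return True if queued, False if full (the backpressure signal)."""
        try:
            self._items.put_nowait(event)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self):
        """Remove and return the oldest event, or None if empty."""
        try:
            return self._items.get_nowait()
        except queue.Empty:
            return None

def with_fallback(primary, fallback):
    """Wrap a stage so a failure degrades that stage, not the whole request."""
    def run(*args):
        try:
            return primary(*args), "primary"
        except Exception:
            return fallback(*args), "degraded"
    return run
```

A stage wrapped this way could call the LLM as its primary path and a local heuristic as its fallback, reporting which one produced the result.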
CLI, Signal (E2E encrypted), Telegram (Bot API). Same brain, same memory. Talk from your terminal or message from your phone.
AG-UI for web frontends (CopilotKit compatible). MCP to connect external tool servers. A2A for agent-to-agent delegation. All three are open protocols.
Why did it suggest that? Open the trace. Per-stage timing, decision records, token usage, confidence scores. Replay any interaction and see exactly what it considered before choosing.
Personality, tone, and boundaries defined in a plain Markdown file. Two model tiers: large for deep reasoning, small for fast summaries. Per-stage temperature. Budget pressure automatically routes tasks to the cheaper tier.
Under the hood
Built on .NET 10, PostgreSQL 17 + pgvector, and the Microsoft Agent Framework. Clean architecture with dependencies flowing inward only.
The brain. Thinking Pipeline, HTTP endpoints, SSE streaming, auto-migration, rate limiting. Handles chat from CLI, Signal, and Telegram.
The senses. Windows Service running Sentinel monitors and messaging bridges. Watches your OS and forwards events to the brain.
The voice. dotnet tool installed as leontes. Chat, setup wizard, privacy controls, budget dashboard, telemetry viewer.
Inspired by: Global Workspace Theory (Dehaene), Dual-Process Theory (Kahneman), Generative Agents (Park et al.), Voyager (Wang et al.), Free Energy Principle (Friston).
The 5-stage Thinking Pipeline, Hierarchical Memory, Sentinel, Structural Vision, Persona, Resilience, Observability, and Cost Control are implemented end-to-end against a real LLM. Tool Forge and the AG-UI / MCP / A2A protocol layer are specified but not yet implemented, and are next on the roadmap.
Get in touch
The spec is public and PRs are welcome. Found a gap? Have a use case? Reach out or open an issue.