# Lilith v5.0
A dark-fantasy CLI agent that runs entirely on your hardware. Hybrid memory, swarm intelligence, MCP protocol, multi-provider LLM, batch mode, skills with hot-reload, and real-time dashboard. No cloud lock-in. No subscription. Control your PC from Telegram or the terminal.
```shell
$ cd Asgard/Lilith
$ cp .env.example .env   # Configure your API keys
$ lilith                 # Run from anywhere
```
## Why local-first?
Lilith is built on a simple belief: your data should stay on your machine. The LLM runs through LM Studio (localhost:1234). Memory is stored in local SQLite. Files never leave your drive. Telegram is just the remote control — the brain is at home.
:::info Key Facts
- No API keys required for the core. LM Studio serves the model locally. Kimi available as remote fallback.
- Works offline once LM Studio is loaded. Telegram needs internet, but the agent brain doesn't.
- 48GB RAM + RTX 3060 can run models up to ~27B parameters with CPU offload.
- Multi-provider — automatic fallback from LM Studio to Kimi to any OpenAI-compatible API.
- Model auto-detection — "auto" in config picks the best loaded model from LM Studio.
- 838 tests passing — Core, Memory, Swarm, MCP, Dashboard, CLI, TOML Config, Batch, E2E.
:::
## How it works

Lilith v5.0 is organized in Six Realms — each a module with its own law, all nourished by the same roots. The Orchestrator coordinates Skills, Memory, LLM Providers, Swarm, MCP, and Tools through a unified TOML configuration.
You (Telegram App) → Telegram Bot (Vanaheim) → [HTTP] → Gateway (FastAPI :8000) → Lilith (Orchestrator) → LM Studio (Local LLM)
## The Gateway
The Gateway is the single point of contact for all external interfaces. It exposes REST endpoints for Telegram, file system operations, scheduler tasks, agents, plugins, memory, swarm, skills, and MCP.
| Endpoint | Method | Purpose |
|---|---|---|
| `/api/telegram/chat` | POST | Main chat endpoint for Telegram messages |
| `/api/telegram/pregunta_rapida` | POST | Quick questions (no context persistence) |
| `/api/telegram/confirm` | POST | Human-in-the-loop confirmation |
| `/api/pc/fs` | GET/POST | File system operations |
| `/api/scheduler/tasks` | GET/POST | List or create scheduled tasks |
| `/api/agents` | GET/POST | List or spawn sub-agents |
| `/api/plugins` | GET | List available plugins and tools |
| `/api/memory/*` | Various | Vector memory read/write/search |
| `/api/swarm/*` | GET/POST | Swarm management: spawn, status, kill, save, load, history |
| `/api/skills` | GET/POST | List registered skills, trigger, hot-reload |
| `/api/mcp` | GET/POST | MCP server connection status and management |
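Calling the Gateway is plain HTTP. A minimal client sketch for the chat endpoint — the host/port come from the architecture diagram above, but the JSON field names (`chat_id`, `text`) are assumptions, not the confirmed schema:

```python
import json

GATEWAY = "http://localhost:8000"  # Gateway default from the architecture diagram

def build_chat_request(message: str, chat_id: int = 0) -> tuple[str, bytes]:
    """Build the URL and JSON body for the Gateway's chat endpoint.
    The payload field names are illustrative, not the confirmed schema."""
    url = f"{GATEWAY}/api/telegram/chat"
    body = json.dumps({"chat_id": chat_id, "text": message}).encode()
    return url, body
```

Send the result with any HTTP client (`urllib.request`, `httpx`, `curl`) as a POST with `Content-Type: application/json`.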
## Core Components
### 🧠 Hybrid Memory (FASE 2)
Three-layer memory: vector embeddings (sentence-transformers), knowledge graph (NetworkX), and full-text search (FTS5). Auto-compression, entity extraction, session search, and context injection into every prompt. The agent remembers who you are and what you did.
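To make the third layer concrete, here is a minimal sketch of the FTS5 full-text layer using Python's built-in `sqlite3` (assuming your Python's SQLite is compiled with FTS5, which most are). The table and function names are illustrative — the vector and graph layers would sit alongside this in the real module:

```python
import sqlite3

def fts_layer(notes: list[str]) -> sqlite3.Connection:
    """Build an in-memory FTS5 index -- one of the three memory layers."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE memories USING fts5(text)")
    db.executemany("INSERT INTO memories(text) VALUES (?)", [(n,) for n in notes])
    return db

def recall(db: sqlite3.Connection, query: str) -> list[str]:
    """Full-text recall, ordered by FTS5's built-in bm25 rank."""
    rows = db.execute(
        "SELECT text FROM memories WHERE memories MATCH ? ORDER BY rank", (query,)
    )
    return [r[0] for r in rows]
```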
### 🔮 Multi-Provider LLM (FASE 7)
LM Studio for local inference, Kimi (Moonshot) for remote, and any OpenAI-compatible provider. Automatic fallback when the primary provider fails. Model auto-detection via /models. Zero-config startup — just start LM Studio and go.
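The fallback logic boils down to "try providers in order, first success wins." A sketch of that idea — the provider names and callable shape are illustrative, not Lilith's actual provider interface:

```python
from collections.abc import Callable

def with_fallback(
    providers: list[tuple[str, Callable[[str], str]]], prompt: str
) -> str:
    """Try each provider in order (e.g. LM Studio -> Kimi -> any
    OpenAI-compatible API); return the first successful completion."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # a real client would narrow this
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```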
### 🤖 Swarm Intelligence (FASE 9)
Spawn LLM-powered specialist agents — researcher, coder, writer, critic — each with its own context and tool access. File locking prevents conflicts. Code shift notifications keep agents aware. Persistent sessions via SQLite. /swarm spawn, /swarm status, /swarm history.
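The file-locking idea can be sketched as a tiny ownership registry — this is the concept only, with hypothetical names; the real implementation lives in the swarm module:

```python
class FileLockRegistry:
    """Per-file locks so parallel swarm agents don't edit the same file."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def acquire(self, path: str, agent: str) -> bool:
        # setdefault records the first claimant; re-acquiring your own lock is fine
        holder = self._owners.setdefault(path, agent)
        return holder == agent

    def release(self, path: str, agent: str) -> None:
        # only the holder may release
        if self._owners.get(path) == agent:
            del self._owners[path]
```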
### ⚡ Skills & MCP (FASE 8)
Hot-reloadable skill packs with auto-trigger. Skills inject context into prompts when relevant. MCP (Model Context Protocol) connects external tool servers dynamically. 35+ native tools for files, system, network, browser, desktop, coding, and more.
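Auto-trigger means a skill's context is prepended only when the prompt looks relevant. A keyword-trigger sketch of that behavior — the `Skill` shape and matching rule are assumptions, not the actual skill-pack format:

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    triggers: set[str]  # words that activate the skill
    context: str        # text injected into the prompt when triggered

def inject_context(prompt: str, skills: list[Skill]) -> str:
    """Prepend the context of every skill whose trigger words appear
    in the prompt -- a sketch of auto-trigger, not the real matcher."""
    words = set(prompt.lower().split())
    hits = [s.context for s in skills if s.triggers & words]
    return "\n".join(hits + [prompt])
```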
### 📅 Task Scheduler
Cron-like scheduling with persistent SQLite storage. Create, list, run, and delete tasks via REST or CLI. The scheduler wakes up the agent at the right time to execute background jobs.
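The scheduler's tick reduces to "which tasks are due, and when do they run next?" A minimal sketch — the dict shape (`name`/`next_run`/`interval`) is illustrative, not the stored SQLite schema:

```python
from datetime import datetime, timedelta

def due_tasks(tasks: list[dict], now: datetime) -> list[dict]:
    """Return tasks whose next_run is at or before `now`."""
    return [t for t in tasks if t["next_run"] <= now]

def reschedule(task: dict) -> dict:
    """Advance a recurring task by its interval (seconds) after it runs."""
    task["next_run"] = task["next_run"] + timedelta(seconds=task["interval"])
    return task
```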
### 📊 Real-time Dashboard (FASE 10)
Web dashboard with WebSocket live updates, multi-pane layout, terminal widget, and system monitoring. Watch agent activity, memory recall, swarm coordination, and tool invocations as they happen. Dark fantasy aesthetic throughout.
### 🔌 Plugin Architecture
Hot-pluggable tools with dynamic discovery. Enable/disable plugins at runtime. Custom tools registered by dropping a Python file in the plugins directory. Dynamic Tool Registry integrates MCP and native tools seamlessly.
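"Drop a Python file in the plugins directory" can be sketched with stdlib `importlib` — this shows the discovery idea only; the real registry also validates and registers each tool:

```python
import importlib.util
from pathlib import Path

def discover_plugins(plugin_dir: Path) -> dict[str, object]:
    """Load every .py file in a directory as a module, keyed by filename."""
    plugins = {}
    for path in sorted(plugin_dir.glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin file's top level
        plugins[path.stem] = module
    return plugins
```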
### 📖 RAG Pipeline
Document ingestion with chunking, embedding via sentence-transformers, and semantic retrieval. Build a personal knowledge base the agent queries in real time. Index with /index, search with /search.
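The first step of ingestion, chunking, can be sketched as overlapping character windows before embedding — the size and overlap defaults here are illustrative, not Lilith's actual settings:

```python
def chunk(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows so no fact is cut at a boundary."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk is then embedded (e.g. with sentence-transformers) and indexed for semantic retrieval.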
### 🖥️ PC Control
35+ native tools: file system, process management, Windows automation, browser interaction, coding assistant, network operations, desktop control. The agent can literally use your computer.
### 📜 TOML Config & Resilience (FASE 10)
Unified configuration in ~/.lilith/config.toml. Priority: TOML > env vars > defaults. Circuit breaker for provider failures, graceful shutdown, error tracking, and automatic recovery.
## CLI & Telegram Commands
Control Lilith through the terminal CLI or remotely via Telegram. Both interfaces share the same Orchestrator, memory, and tools.
| Command | Description |
|---|---|
| `/start` | Initialize the bot and show the welcome message |
| `/status` | Check Gateway health, memory stats, and active agents |
| `/memory` | Inject a memory entry into the vector database |
| `/recall <query>` | Semantic search across all memories (vector + graph + FTS5) |
| `/tasks` | List scheduled tasks (via the Gateway scheduler API) |
| `/agents` | List active sub-agents and their stats |
| `/swarm spawn <task>` | Spawn a swarm of specialist agents for parallel work |
| `/swarm status` | Show active swarm agents and their progress |
| `/swarm history` | List past swarm sessions from SQLite |
| `/skills` | List registered skill packs and their status |
| `/mcp` | Show MCP server connection status and available tools |
| `/compact` | Compress old memories to free context window |
| `/index <path>` | Index a file or folder for RAG semantic search |
| `/search <query>` | Search indexed documents semantically |
| `/dashboard` | Start/stop the real-time web dashboard |
| `/stream` | Toggle streaming mode on/off |
| `/plugins` | List available plugins and tools |
| `/batch <prompt>` | Run Lilith in batch mode, with no interactive session |
| (any text) | Sent to Lilith for processing with full context |
## Recommended Specs
Lilith is designed to run on consumer hardware. You don't need a data center.
| | Minimum | Recommended | Reference Build |
|---|---|---|---|
| RAM | 16GB | 32GB+ | 48GB DDR4 |
| CPU | Any modern | 6+ core | AMD Ryzen 5 5500 (6c/12t) |
| GPU | — | NVIDIA 12GB+ VRAM | NVIDIA RTX 3060 12GB |
| Disk | ~10GB free | SSD | Dual SSD setup |
| Models | 7B params (Q4) | 13-27B params | 27B comfortably |
| Experience | Slow but functional | Fast responses, good context | Smooth |