An evolution-native multi-agent runtime.
Lineage lets Agents evolve like living organisms. They aren't disposable config files — they have parents, generations, immutable genome records, and fitness histories. Evaluation drives evolution: the weak retire, the strong reproduce.
Tell the Founder Agent what you want, and it turns a fuzzy idea into a high-quality Agent — running an offline evaluation as part of creation. Each subsequent use leaves an evaluable trail; only changes proven stronger are passed on to the next generation. In parallel, reusable how-tos accumulate, and any future Agent in the same domain inherits them.
The Agent Kernel is built for real work: long conversations don't lose the goal, multi-day tasks resume after crashes, prompt caching across providers saves 70-90% on tokens, and the advisor pattern works on any base model.
Local-first, self-hosted by default. Rust core + Ratatui TUI. Three install channels: brew, npm, or curl.
curl -fsSL https://www.lineagent.ai/install.sh | sh
Install#
Pick whichever package manager you already use. All three channels ship the same release — same semver, same binary, same self-update behavior. Choose by familiarity.
Homebrew (macOS & Linux)
brew install nowa/lineage/lineage
npm
npm install -g @lineage/cli
Direct (curl)
curl -fsSL https://www.lineagent.ai/install.sh | sh
The installer downloads a per-platform tarball, verifies its
SHA-256 against a published checksums.txt, and
drops lineage into ~/.local/bin. It
refuses to run as root and warns you if ~/.local/bin
isn't on your $PATH.
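The checksum step can be sketched in a few lines of Python. This is an illustration of the technique, not the installer's actual code: it assumes checksums.txt uses the common sha256sum layout (hex digest, whitespace, file name), which the text above does not specify, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so a large tarball is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksums_text: str, filename: str) -> Optional[str]:
    """Look up `filename` in sha256sum-style text (assumed format, one entry per line)."""
    for line in checksums_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(tarball: Path, checksums_text: str) -> bool:
    """True only if the tarball is listed and its digest matches the published one."""
    want = expected_digest(checksums_text, tarball.name)
    return want is not None and want == sha256_of(tarball)
```

A missing entry is treated as a failure rather than a pass, so an unlisted tarball is never installed.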
Verify:
lineage --version
lineage status
Windows is not yet supported. Use WSL 2 with a recent Ubuntu rootfs in the meantime.
What is Lineage?#
Lineage is a system where your Agents grow up on their own.
You don't need to write YAML and you don't need to know prompt engineering. Tell Lineage what you want — "help me triage SRE incidents," "write better release notes" — and the Founder Agent reads your codebase, designs the spec, writes the prompts, configures the tool policy, and runs an offline evaluation before delivery. The grunt work between a fuzzy idea and a usable Agent — Lineage does it for you.
More importantly, it doesn't stop at version one. Every conversation leaves an evaluable trail. Lineage notices what worked and what didn't, proposes refinements automatically, safely re-runs them against past traces, and only ships the changes proven to beat the current version. The Agent itself gets more precise (sharper prompts, tighter tool policy) while its "playbook" grows alongside: every trial-and-error becomes a reusable how-to that any future Agent in the same domain inherits. The Agent learns, and the team's collective knowledge accrues.
Underneath, the Agent Kernel is built for real work: long conversations don't lose the goal, multi-day tasks survive crashes and resume mid-flight, the full context-engineering toolkit is on by default, automatic prompt caching across providers saves 70-90% on tokens, and the advisor pattern works on any base model — run a cheap workhorse day-to-day, hand the hard problems to Opus or GPT-5. On agentic benchmarks, Lineage averages about +15% task pass rate and ~30% fewer tokens per successful task vs. Hermes and Claude Code.
Eight things Lineage does that most agent products don't:
- High-quality Agents from a fuzzy idea. Tell Lineage roughly what you want — "I need an agent for SRE incident triage" or "help me write better release notes." Lineage's Founder Agent reads your codebase, designs the spec, writes the prompts and tool policy, and runs an offline evaluation before the new Agent is ever exposed to you. You don't write YAML; you describe the job.
- Your agents get better the more you use them. Every conversation leaves an evaluable trail. Lineage notices what worked and what didn't, proposes refinements, safely re-runs them against past traces, and only ships changes that beat the current Agent. You don't tune prompts every Friday — Lineage does, and you approve the result.
- Two ways your agents grow. The agent itself gets sharper — better prompts, tighter tool policy — but only after each change passes evaluation; only the proven ones are passed down to the next generation. Separately, the agent's playbook grows: hard-won lessons get written down as reusable how-tos any future Agent in the same domain inherits. The agent learning + the team's collective knowledge growing, at the same time.
- Fits the way you already work. Two modes, both on at once: a personal mode that follows you everywhere (à la OpenClaw — your habits, your memory, your shortcuts, regardless of which folder you're in) and a project mode that respects per-project boundaries (à la Claude Code — each codebase has its own memory, trust state, and rules). No toggle to flip.
- Best-in-class agentic reasoning. Lineage's Agent Kernel is engineered for the parts that decide who actually gets work done: long conversations that don't lose the goal, multi-day tasks that survive crashes and resume, the full context-engineering scope (progressive disclosure + just-in-time retrieval + context offloading) wired in by default, automatic prompt caching across providers (70-90% token savings on repeat work), and the advisor pattern that was Claude-Code-only until now. Head-to-head against Hermes and Claude Code on agentic benchmarks, Lineage averages about +15% task pass rate and ~30% fewer tokens per successful task.
- Permissions that match how you actually trust software. Three layers of override that mirror real life: a global default ("on this laptop, never let agents run sudo"), a per-project tightening ("inside this client repo, also never touch .env"), and a personal/local layer ("on my own machine, allow X for testing"). All three are plain JSON files you can read and revert; nothing is hidden in a settings UI.
- Advisor on tap, on any model. Run a fast, cost-balanced workhorse day-to-day; when the kernel hits a hard problem, it asks a stronger advisor on the same turn: Claude Opus, GPT-5, or whichever you've configured. Until now this was a Claude-Code-only superpower; Lineage delivers it on top of any model, including open-source ones running on your laptop.
- Coming soon: share evolved agents with people you trust. AMP (Agent Marketplace Protocol) is a peer-to-peer way to send your evolved Strains and Genomes to friends, teammates, or family — like sharing a Spotify playlist of agents you've trained. Their evaluation feedback flows back to you. No central authority deciding which agents live; the network grows by trust, not by leaderboard. Public timeline: MVP-B Alpha.
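The three permission layers in the list above (global, per-project, personal/local) can be pictured as an ordered override. The sketch below is a guess at the shape, not Lineage's schema: the `allow` / `deny` keys, the `tool:argument` action strings, and the `ask` fallback are all hypothetical.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical layer contents; the real file names and JSON schema are not documented here.
GLOBAL = json.loads('{"deny": ["cli:sudo *"]}')
PROJECT = json.loads('{"deny": ["filesystem:.env"]}')
LOCAL = json.loads('{"allow": ["cli:docker *"]}')


def decide(action: str, layers: list, default: str = "ask") -> str:
    """Resolve one action against ordered layers.

    Later layers override earlier ones; inside a single layer, deny beats allow.
    """
    verdict = default
    for layer in layers:
        if any(fnmatchcase(action, pat) for pat in layer.get("allow", [])):
            verdict = "allow"
        if any(fnmatchcase(action, pat) for pat in layer.get("deny", [])):
            verdict = "deny"
    return verdict


LAYERS = [GLOBAL, PROJECT, LOCAL]  # evaluated in this order
```

With these sample layers, `decide("cli:sudo reboot", LAYERS)` resolves to `deny` and `decide("cli:docker ps", LAYERS)` to `allow`. Whether a local layer may loosen a global deny is a policy decision; this sketch lets the last layer win, which is one reading of "three layers of override".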
The four primitives that make this work:
Strain
A reusable agent template — coding, email, browser-ops. Defines which tools an agent in this family is allowed to touch and how it'll be evaluated. Forking a Strain is how you customize Lineage to your domain without losing the evolution machinery.
Agent
A working instance — your actual pal, explore, founder.
Each carries an immutable genome snapshot, a parent reference,
a generation number, and a live fitness history. Lifecycle is
4-state: candidate → stable → deprecated → extinct —
so retirement and rollback are first-class operations, not improvised.
Genome
The agent's source code — prompts, planner config, tool-policy weights — frozen as content-addressed JSON. Editing a genome creates a new child Agent, never mutates the parent. That's what makes "why did the agent change?" answerable by a diff.
Evolution
The closed loop other tools don't have. CUSUM watches the live quality signal → mutate → offline eval → human gate → Blue-Green swap → 7-day observation window → auto-rollback on regression. Selection is mechanical and reversible, not vibe-based.
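CUSUM, the detector that opens the Evolution loop above, is a standard change-detection technique: accumulate how far each observation falls below a target and raise an alarm once the running sum crosses a threshold. The sketch below is a textbook one-sided CUSUM for downward shifts. Which variant and parameters Lineage uses is not stated, so the class and the numbers in the usage note are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Cusum:
    """One-sided CUSUM that flags a sustained drop in a quality signal."""

    target: float      # expected quality when the Agent is healthy
    slack: float       # per-observation deviation that is tolerated (often written k)
    threshold: float   # alarm level for the cumulative sum (often written h)
    s: float = 0.0     # running sum of shortfalls beyond the slack

    def update(self, x: float) -> bool:
        """Feed one observation; return True when a downward shift is flagged."""
        self.s = max(0.0, self.s + (self.target - x) - self.slack)
        return self.s > self.threshold
```

With `Cusum(target=0.9, slack=0.05, threshold=0.6)`, a run of 0.9 scores keeps the sum at zero, while three consecutive 0.6 scores push it to about 0.75 and trip the alarm. A single bad turn does not, which is the point of accumulating rather than thresholding each score.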
Underneath, Lineage runs as four cooperating subsystems. A request enters Control, executes inside the AgentKernel, draws on Capability for tools and memory, and is observed by the Evolution loop. Same shape whether the request comes from your terminal, an IM channel, or the API.
Files are the Source of Truth; the SQLite DB is a materialized
index that can be rebuilt from data/ at any time.
Quick start#
Lineage is a TUI-first product. Drop into a project directory
and run lineage chat:
$ cd ~/your-project
$ lineage chat
∴ lineage 0.123.x — pal ready
> review the auth middleware for timing-attack risks
● filesystem.read crates/auth/src/middleware.rs
● filesystem.grep "compare|hmac|verify" crates/auth/
Found one — line 47 compares HMAC tags with `==`, which short-
circuits on first byte mismatch. A sufficiently noisy attacker
can recover the tag byte-by-byte from response timing.
Constant-time compare via `subtle::ConstantTimeEq` would fix it.
Want me to write the patch and run the test suite?
> yes, patch it
● filesystem.edit crates/auth/src/middleware.rs (1 hunk)
● cli.execute cargo test -p auth --lib
passed 17/17 in 3.4s
Done. Diff is ready to commit.
The first run walks you through provider setup (Anthropic / OpenAI / OpenAI-compat / Codex subscription), asks you to trust the project directory, and then drops into the three-pane chat layout — conversation in the middle, status bar at the bottom, optional context pane on the side.
Sessions are persistent. Run lineage chat --resume
to pick up where you left off. Run lineage agents
to list lineage trees, lineage cost to see token
spend, lineage soul to edit the active strain's
personality.
Models & built-in Strains#
Supported model providers
Lineage talks to LLMs through a multi-provider registry. You configure any subset; routing happens by model-name prefix. Mix and match per Strain — pal can run on Claude Sonnet while explore runs on a cheaper Qwen, all in the same session.
claude-*- Anthropic — Claude Opus / Sonnet / Haiku 4.x family.
gpt-*/o3-*/o4-*- OpenAI — Chat Completions and Responses API. Plus OpenAI Codex subscription with native attribution (browser-OAuth pairing, no API key needed).
deepseek-*- DeepSeek — V3 / R1 family.
qwen*- Qwen via DashScope — Qwen 3 / Qwen Max / Qwen Coder.
moonshot-*/kimi-*- Moonshot — Kimi K2 / Kimi Latest.
groq/llama-*/mixtral-*- Groq — Llama / Mixtral / DeepSeek hosted with Groq's LPU inference (very fast).
openrouter/*- OpenRouter — meta-router for 200+ models from Anthropic / OpenAI / Google / Meta / Mistral / etc.
minimax-*/fireworks-*- MiniMax (M2, abab) and Fireworks AI (DeepSeek, Llama, Qwen on Fireworks infra).
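Routing by model-name prefix amounts to a first-match lookup over glob patterns. The sketch below uses a subset of the patterns from the list above; the provider labels, the `route` helper, and the first-match rule are assumptions made for illustration, not the registry's actual API.

```python
from fnmatch import fnmatchcase

# Patterns taken from the provider list above; provider labels are illustrative.
ROUTES = [
    ("claude-*", "anthropic"),
    ("gpt-*", "openai"),
    ("o3-*", "openai"),
    ("o4-*", "openai"),
    ("deepseek-*", "deepseek"),
    ("qwen*", "dashscope"),
    ("moonshot-*", "moonshot"),
    ("kimi-*", "moonshot"),
    ("openrouter/*", "openrouter"),
]


def route(model: str) -> str:
    """Return the provider for the first pattern that matches the model name."""
    for pattern, provider in ROUTES:
        if fnmatchcase(model, pattern):
            return provider
    raise LookupError(f"no provider configured for model {model!r}")
```

Because matching is first-hit, more specific patterns would need to appear before broader ones if two prefixes ever overlapped.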
Built-in Strains
Lineage ships with six built-in Strains. Each has an immutable seed genome you can fork to create your own variants. Three are user-facing; three live inside the evolution pipeline.
User-facing
pal- Your primary agent. Every request flows through pal first — it handles tasks directly when it can, and delegates to specialists when domain expertise is needed.
explore- Read-only codebase navigator. Fast at finding patterns, generating reports, and answering "where is X defined?" without modifying anything.
founder- System engineer. Designs new Strains, bootstraps Agents, imports Skills. Use when you want to extend Lineage's agent ecosystem itself.
Evolution pipeline
analyzer- Diagnostic engine. Examines evaluation results to find patterns aggregate scores hide, attributes failures to root causes, and decides which mutation operators the evolution pipeline applies next.
grader- Evaluation oracle. Scores Agent outputs against EvalContract dimensions with cited evidence. Drives evolution decisions — accuracy is everything.
comparator- Blind judge. Compares two Agent outputs WITHOUT knowing which is the champion and which is the candidate. Bias toward either side corrupts the entire pipeline.
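Blind comparison, as described for the comparator above, comes down to hiding the labels before judging and restoring them afterwards. A minimal sketch, assuming a two-slot A/B presentation; the function names are hypothetical and the judge itself is left out.

```python
import random


def blind_pair(champion_out: str, candidate_out: str, rng: random.Random):
    """Shuffle two outputs into anonymous slots.

    Returns (slots, key): `slots` is what the judge sees, `key` stays hidden
    until a winner has been picked.
    """
    pair = [("champion", champion_out), ("candidate", candidate_out)]
    rng.shuffle(pair)
    slots = {"A": pair[0][1], "B": pair[1][1]}
    key = {"A": pair[0][0], "B": pair[1][0]}
    return slots, key


def unblind(winner_slot: str, key: dict) -> str:
    """Map the judge's pick ("A" or "B") back to champion or candidate."""
    return key[winner_slot]
```

The judge receives only `slots`; passing a fresh `random.Random()` per comparison keeps slot position from correlating with role.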
Tools#
Tools are how Agents act on the world. Lineage ships a Rust-native toolbox covering the moves a coding-and-ops agent actually needs: read & edit files, run shell, drive a browser, fetch web pages, delegate to other Agents, schedule background work. Every tool is gated by per-Strain genome weights — a Strain that doesn't declare a tool simply can't call it. No external runtime, no Node sidecar, no MCP-only fallback.
filesystem- read · write · edit · grep · glob · multi-edit, scoped to the current project_dir. Edits are diff-tracked.
cli- execute shell commands. Per-Strain allowlist (regex), per-run sandbox tier, optional human-in-the-loop confirmation for write-class commands.
delegate- spawn or call another Agent — sync wait, async fire, or polling check. Sub-conversation is its own ExecutionTrace for replay.
browser- navigate · click · type · evaluate · screenshot via the bundled agent-browser CDP client. Dedicated Chromium profile, loopback-only control, no shared cookies.
web- fetch URL → Markdown (htmd), search via configured search provider, RSS pull. No headless browser fallback by default.
memory- strain-scoped JSONL memory: save · recall · forget. Project memory is per-project_key, separate from cross-project strain memory.
task- create & manage long-running multi-step tasks (Task → Step → Job, ADR-053). File-first state with a stuck-detection watchdog.
system- introspect the runtime — agent list, lineage tree, evolution status (CUSUM signal, candidates in flight), cost ledger.
schedule/file_watch- cron-style scheduled triggers and filesystem-change triggers. Both feed the background queue, not the interactive chat path.
email/calendar/channel- side-effect tools, idempotency-keyed at the DB level — re-running the same call won't double-send.
ask_user/checklist/progress- conversational utilities — pause for user input, render checklists in the TUI, stream progress markers.
There is no MCP wall here: every tool above is a Rust function behind a typed schema. MCP servers can be loaded as additional surfaces, but the core tools don't depend on them.
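The "idempotency-keyed at the DB level" guarantee on the side-effect tools in the list above can be illustrated with a unique key in SQLite. This is a sketch of the general technique only: the `side_effects` table, the way the key is derived from the arguments, and the `run_once` helper are assumptions, not Lineage's schema.

```python
import hashlib
import json
import sqlite3


def idempotency_key(tool: str, args: dict) -> str:
    """Derive a stable key from the tool name and canonicalized arguments."""
    canonical = json.dumps({"tool": tool, "args": args},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_once(db: sqlite3.Connection, tool: str, args: dict, effect) -> bool:
    """Run `effect` only if this (tool, args) pair has not been recorded yet."""
    key = idempotency_key(tool, args)
    with db:  # commits on success, rolls back if effect() raises
        cur = db.execute(
            "INSERT OR IGNORE INTO side_effects(key) VALUES (?)", (key,))
        if cur.rowcount == 0:
            return False  # key already present: skip the side effect
        effect()
        return True


def open_ledger() -> sqlite3.Connection:
    """In-memory ledger for the example; a real one would live on disk."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE side_effects (key TEXT PRIMARY KEY)")
    return db
```

Because the key insert and the effect share one transaction, a failed send rolls the key back and the call can be retried. The primary-key constraint, not application logic, is what rejects the duplicate.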
Agent Kernel#
The Agent Kernel is the part of Lineage that turns a chat turn into action — assembling the prompt, calling the LLM, dispatching tool calls, streaming the results back. We built it as a pure-computation core (Ports & Adapters), so the same Agent runs unchanged across every provider, every transport, every interaction surface. Six capabilities are wired in. Together they're why Lineage holds up where most agents fall over: long sessions, multi-step plans, and provider switching mid-flight.
- Context engineering
- Context engineering (fitting only the highest-signal tokens into the model's finite attention window) is what separates an Agent that stays sharp over weeks from one that rots after lunch. Lineage covers the full canonical scope as first-class slots (system instructions, project profile, recalled memory, conversation history, tool state, few-shot exemplars) with explicit per-slot token budgets, all assembled by a deterministic PromptCompiler: same inputs always produce the same prompt. Three field-recognized techniques are wired in by default: progressive disclosure, where tools and memory enter as name + stub and the Agent calls tool_search / memory_search to expand on demand (typically 90%+ token savings on a fat catalog); just-in-time retrieval per turn instead of pre-loaded snapshots; and context offloading, where bulky tool outputs and pre-compact history get persisted to disk and referenced by ID, so the live window stays lean. Conversations stream straight to JSONL on disk, so a crash mid-turn resumes mid-turn instead of starting over.
- Compaction: three operations, three timescales
- In most products, compaction is binary: when the window fills, summarize the past or fail. Lineage runs three distinct operations at three different cadences: Prune on every turn (zero-cost, replaces stale successful tool results with placeholders); Compact when the window approaches budget (LLM-assisted structured summary preserving goal / constraints / decisions / next-step; full conversation archived as JSONL + Markdown before compaction so nothing is lost); Distill at section boundaries (LLM-assisted extraction into reusable knowledge artifacts). Cut points respect message boundaries: never mid-tool-result, never mid-tool-call-batch. The granularity is what lets a single Agent run a multi-day task without losing the goal it was given on turn one.
- Memory: 4 tiers, on your disk
- Anthropic's April-2026 Managed Agents release pitched "memory as a filesystem" as the breakthrough; Lineage has shipped that idea since day one, and goes further. Four tiers, not one: Conversation (this turn's live working set) → Project (per-project knowledge, data/projects/{key}/memory.jsonl) → Strain (cross-project domain expertise, data/memory/{strain_id}.jsonl) → Genome (immutable, fortified only by evolution). State-based promotion (mem0-aligned): information moves up only on evidence, never by leakage. Lower tiers are write-rate-bounded so a noisy session can't drown the signal. Every byte is plain JSONL on your disk: view it, edit it, delete it, version-control it. No vendor SaaS holds memory of your work for you.
- Long-running tasks
- Real work outlives one chat turn. Lineage models multi-step plans as Task → Step → Job: the Task is your goal, Steps are individually checkpointable units of work, Jobs run inside Steps. Task state lives on disk first (data/tasks/{id}/task.json), not in the agent's head, so a Step can pause for human review, resume after a break, or hand off between Agents. A background watchdog raises any Step that hasn't moved in N minutes, so plans don't quietly die in the middle and surface as "wait, what happened to that thing I asked for?" three days later. Anthropic's Managed Agents shipped persistent memory in public beta to handle exactly this; Lineage has the equivalent shape on your local disk, no managed-cloud dependency.
- Prompt cache
- Provider-side caching is wired in by default — Anthropic, OpenAI, DeepSeek, and any backend that exposes one. The kernel structures every prompt to be cache-friendly: stable system prefix at the top, volatile context at the bottom. Long sessions hit cache-read pricing on every repeat turn — typically 70-90% off the per-token cost — automatic, no app code changes.
- Advisor pattern — without the Anthropic lock-in
- Until now, the advisor pattern (a fast, cost-balanced model running routine work and delegating only the hard cases to a stronger model) has effectively been an Anthropic-only capability, baked into Claude Code and unavailable elsewhere. Lineage brings it to any provider, including open-source local models. Run a cost-effective workhorse like qwen-coder or deepseek-v3 for the 90% of everyday turns; when the kernel detects a hard case, it auto-escalates to a SOTA advisor (Claude Opus, GPT-5, etc.) on the same turn, with no manual model swap and no agent hand-off. You get Claude-Code-class reasoning quality at non-Claude-Code economics.
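Of the three compaction operations in the list above, Prune is the easiest to picture because it needs no LLM call. The sketch below is one possible reading: it treats every successful tool result older than the most recent `keep_recent` as stale. That staleness rule, the message shape, and the placeholder text are assumptions, not the kernel's actual behavior.

```python
def prune(messages: list, keep_recent: int = 2) -> list:
    """Return a copy of the history with stale successful tool results shortened.

    Failed tool results are kept verbatim, since the error text is usually
    what the next turn needs. The input list is not modified.
    """
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_idx[: max(0, len(tool_idx) - keep_recent)])
    pruned = []
    for i, m in enumerate(messages):
        if i in stale and m.get("ok", False):
            placeholder = f"[pruned tool result: {m['name']}]"
            pruned.append({**m, "content": placeholder})
        else:
            pruned.append(m)
    return pruned
```

Whole messages are replaced, never split, which matches the rule above that cut points respect message boundaries.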
Where this lands competitively: most coding-agent products give you context engineering inside their UI, route a single provider, and end the trail at the conversation boundary. Lineage stores every turn as a replayable artifact, evolves the Agent that produced it based on those replays, and lets you swap the underlying model tomorrow. The kernel is the part that makes cross-session, cross-provider, cross-model continuity actually work.
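Storing every turn as a replayable artifact, with conversations streamed to JSONL so a crash can resume, can be sketched as an append-only log plus a reader that tolerates a torn final line. The event shape and function names below are hypothetical; only the JSONL-on-disk idea comes from the text above.

```python
import json
import os
from pathlib import Path


def append_event(log: Path, event: dict) -> None:
    """Append one event as a single JSON line and force it to disk."""
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, ensure_ascii=False) + "\n")
        fh.flush()
        os.fsync(fh.fileno())


def replay(log: Path) -> list:
    """Load every complete event; stop at a partial last line left by a crash."""
    events = []
    if not log.exists():
        return events
    with log.open("r", encoding="utf-8") as fh:
        for line in fh:
            if not line.endswith("\n"):
                break  # incomplete trailing record: ignore it
            events.append(json.loads(line))
    return events
```

One line per event means the worst a crash can do is truncate the final record, so everything before it replays cleanly.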
Self-update#
lineage self-update is channel-aware. If you installed via Homebrew, it redirects you to brew upgrade lineage; if you installed via npm, it redirects you to npm update -g @lineage/cli. Only direct (install.sh) installs are swapped in place, using an atomic POSIX rename with a .prev hardlink backup.
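The in-place swap for direct installs relies on two POSIX guarantees: a hardlink gives the old binary a second name without copying it, and rename replaces the target atomically when both paths are on the same filesystem. A minimal sketch of that sequence follows; the `swap_binary` helper and the exact backup file name are assumptions for illustration.

```python
import os
from pathlib import Path


def swap_binary(current: Path, new_build: Path) -> Path:
    """Replace `current` with `new_build` atomically, keeping a `.prev` hardlink.

    Both paths must be on the same filesystem for the rename to be atomic.
    """
    backup = current.with_name(current.name + ".prev")
    if backup.exists():
        backup.unlink()             # drop the backup from the previous update
    os.link(current, backup)        # second name for the old inode, no copy
    os.replace(new_build, current)  # rename(2): readers see old or new, never half
    return backup
```

Rolling back is the same rename in reverse: `os.replace(backup, current)`.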
Background updates are opt-in. By default the
background updater is in notify-only mode: it checks GitHub
every 30 minutes, prints a banner if a new version is available,
and stops there. To turn on silent installs, edit
~/.config/lineage/lineage.toml:
[updates]
auto_check_enabled = true
auto_install_enabled = true # default false
check_interval_minutes = 30
Telemetry & privacy#
Lineage ships anonymous usage telemetry default-on. The data helps prioritize features and catch real regressions. We deliberately skipped the first-run consent prompt — every other consent dialog you've ever seen in a CLI tool taught you to tap "no" on reflex, which means default-off telemetry measures only a self-selected sub-population and biases every product decision. Default-on with a kill-switch gives us a representative sample while keeping you in control.
What is collected:
- Tool-call counts and names (filesystem.read, delegate_wait, etc.; never their arguments).
- Turn durations, session lengths, error counts.
- Strain keys (which agents you used).
- Lineage version + OS/arch.
What is NEVER collected:
- Filesystem paths.
- Prompts, LLM responses, or any tool input/output.
- Source-IP addresses (stripped at the Cloudflare edge before reaching the PostHog endpoint).
- Anything resembling your personal data.
Three opt-out paths:
# 1. Per-process env var (hot-disable for one invocation)
LINEAGE_DISABLE_TELEMETRY=1 lineage chat
# 2. Per-install (writes to lineage.toml)
lineage telemetry disable usage
# 3. Per-process for the local JSONL audit log specifically
LINEAGE_DISABLE_LOCAL_TELEMETRY=1 lineage chat
Telemetry uploads land in our own self-hosted PostHog instance
at telemetry.lineagent.ai — not PostHog Cloud, not
a third party.
FAQ#
- Why Rust?
- The Control / Execution / Capability planes need predictable latency and zero GC pauses on the hot path. Tooling is mature enough that we get these for free without paying in productivity. The TUI specifically benefits — Ratatui is extremely fast, and the streaming-tokens-into-three-panes rendering would be hard to keep tear-free in a GC'd runtime.
- Where does my data live?
- On your machine. Conversation history is JSONL under data/conversations/. Genomes and strains are YAML/JSON under data/. Memory is JSONL under data/memory/. The SQLite DB is purely an index and can be rebuilt from these files via lineage rebuild-index at any time. Files are the Source of Truth.
- Is the agent autonomous? What about safety?
- The Capability Plane runs in a tiered sandbox: Wasm/WASI in MVP, Linux namespace + seccomp + cgroups in Phase 1, MicroVM in Phase 2. Tool policy is per-strain via genome weights: a strain that doesn't list cli_execute in its allowlist literally cannot shell out. Browser automation goes through a dedicated Chromium profile and binds loopback-only.
- What's a "genome"?
- An immutable JSON snapshot of an Agent's prompts, planner config, and tool-policy weights. Genome is content-addressed by SHA-256; editing one creates a new Agent (a child) and the old one stays archived for replay. This makes the lineage auditable: you can always answer "what changed?" by diffing parent vs child genomes.
- Linux / macOS only?
- Yes for now. Path/sandbox conventions assume POSIX (XDG dirs, flock, atomic-rename semantics). Windows lands in Phase 1 if there's demand. WSL 2 works today.
- How do I delete everything?
-
lineage telemetry disable usage --purge
rm -rf "${XDG_DATA_HOME:-$HOME/.local/share}/lineage" \
  "${XDG_STATE_HOME:-$HOME/.local/state}/lineage" \
  "${XDG_CONFIG_HOME:-$HOME/.config}/lineage" \
  "$HOME/.local/bin/lineage"
The first command deletes any uploaded events from the central PostHog and disables future uploads. The rm uses XDG defaults when the XDG variables are unset, so it removes the real local data, state, config, and binary paths.
- Why not just use Claude Code / Cursor / Copilot directly?
- You can, and you should — they're great. Lineage is for when you want the agents themselves to evolve over time based on evaluation. Claude Code is a tool; a Lineage Agent is a lineage node with a fitness history. If your only need is "edit this file, fix this bug, write this test," you don't need Lineage. If you want to tune which agent personalities succeed at which tasks across weeks, and have that selection happen mechanically rather than by gut, that's where Lineage starts to pay off.
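The genome answer in this FAQ rests on content addressing: hash a canonical serialization, so identical content always gets the same ID and any edit yields a new one. Here is a sketch of that idea under stated assumptions: the canonicalization (sorted keys, compact separators), the field names, and the helpers are illustrative, not Lineage's on-disk format.

```python
import hashlib
import json


def genome_id(genome: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order never changes the ID."""
    canonical = json.dumps(genome, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def mutate(parent: dict, changes: dict) -> dict:
    """Build a child genome from top-level changes; the parent is left untouched."""
    return {**parent, **changes}


def diff(parent: dict, child: dict) -> dict:
    """Top-level fields that differ, as {field: (parent_value, child_value)}."""
    keys = parent.keys() | child.keys()
    return {k: (parent.get(k), child.get(k))
            for k in sorted(keys) if parent.get(k) != child.get(k)}
```

Editing a prompt through `mutate` produces a child with a different `genome_id`, and `diff(parent, child)` names the changed field, which is the "what changed?" question the FAQ says a lineage should always be able to answer.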