Use it now

AI agents that
write your reports
— all on your desktop.

One workspace, every model. Build a team of AI agents, plug in any tool, and keep every byte on your own machine.

Data stays on your device · English & 中文
Clankit · First chat

Notes

Manage Docs using AI

Autonomous Agent Architecture.md
title: Autonomous Agent Architecture
author: Research & Engineering
status: draft
version: 0.4.2
last_reviewed: 2026-05-07
tags: architecture, agents, llm, design-doc

Autonomous Agent Architecture

TL;DR

A reference architecture for production-grade autonomous agents combining LLM reasoning, Long-term Memory, structured Tool Use via MCP, and a hierarchical Planning layer. Designed for horizontal scale across Multi-Agent System deployments.


1. Overview

Modern Autonomous Agent systems are no longer monolithic prompt loops. They are distributed cognitive pipelines built on four pillars:

| Pillar | Responsibility | Primary Components |
| --- | --- | --- |
| Reasoning | Decompose intent, evaluate options | LLM, Reasoning, Prompt Engineering |
| Memory | Persist & retrieve context | Vector Database, Long-term Memory, Episodic Memory |
| Action | Interact with the world | Tool Use, MCP, Claude Code |
| Control | Plan, schedule, terminate | Planning, Agent Loop |

2. System Diagram

[Diagram. Nodes: User Request, Intent Router, Planner Agent, Fast Path (single-shot LLM), Long-term Memory, Vector Database, Worker: Research, Worker: Code, Worker: Verify, External Tools, Synthesizer, Critic LLM, Response. Edge labels: complex, simple, MCP (three edges), reject, accept.]

Design rationale

The split between Planner, Workers, and Critic is deliberate. A single LLM playing all three roles tends to confirm its own mistakes. See Multi-Agent System for trade-offs.
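
The separation of roles can be illustrated with a minimal sketch. Every name in it (`Subtask`, `plan`, `WORKERS`, `critic`, `run`) is a hypothetical stand-in, and the stubs return canned strings where a real system would call an LLM. What it shows is the control flow: a draft is only returned after a check that is independent of the code that produced it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Subtask:
    kind: str     # which worker should handle this subtask
    payload: str  # the text handed to that worker


def plan(request: str) -> list[Subtask]:
    """Planner stub: a real planner would ask an LLM to decompose the request."""
    return [Subtask("research", request), Subtask("code", request)]


# Worker stubs keyed by subtask kind (hypothetical; real workers would call tools).
WORKERS: dict[str, Callable[[str], str]] = {
    "research": lambda payload: f"notes on {payload}",
    "code": lambda payload: f"patch for {payload}",
}


def critic(draft: str) -> bool:
    """Critic stub: an independent acceptance check on the synthesized draft."""
    return bool(draft.strip())


def run(request: str, max_attempts: int = 3) -> str:
    for _ in range(max_attempts):
        results = [WORKERS[task.kind](task.payload) for task in plan(request)]
        draft = "\n".join(results)  # synthesizer step
        if critic(draft):           # "accept" edge
            return draft
        # "reject" edge: fall through and re-plan
    raise RuntimeError("critic rejected every attempt")
```

Capping the number of attempts keeps a persistently rejecting Critic from stalling the request indefinitely.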


3. Execution Lifecycle

The runtime follows a deterministic Agent Loop:

| Phase | Bounded by | Failure mode |
| --- | --- | --- |
| Retrieve | Context Window | stale embeddings |
| Plan | token budget | over-decomposition |
| Act | tool latency | infinite loops |
| Persist | storage I/O | memory bloat |
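
One way to read the lifecycle is as a loop in which every phase carries an explicit cap. The sketch below is an illustration under assumptions, not the runtime itself: `Budget`, `Agent`, and all limit values are hypothetical, and simple item counts stand in for the context window and token budget.

```python
from dataclasses import dataclass, field


@dataclass
class Budget:
    max_context_items: int = 4    # stands in for the context window
    max_plan_steps: int = 5       # stands in for the planning token budget
    max_tool_calls: int = 12      # guards the Act phase against infinite loops
    max_memory_items: int = 100   # guards the Persist phase against memory bloat


@dataclass
class Agent:
    budget: Budget = field(default_factory=Budget)
    memory: list[str] = field(default_factory=list)

    def retrieve(self, query: str) -> list[str]:
        hits = [item for item in self.memory if query in item]
        return hits[-self.budget.max_context_items:]  # keep the most recent hits

    def plan(self, query: str, context: list[str]) -> list[str]:
        # Stub planner: always proposes three steps, then applies the cap.
        steps = [f"step {i} for {query}" for i in range(1, 4)]
        return steps[: self.budget.max_plan_steps]

    def act(self, steps: list[str]) -> list[str]:
        results = []
        for calls, step in enumerate(steps, start=1):
            if calls > self.budget.max_tool_calls:
                break  # stop once the tool-call cap is reached
            results.append(f"done: {step}")
        return results

    def persist(self, results: list[str]) -> None:
        self.memory.extend(results)
        overflow = max(0, len(self.memory) - self.budget.max_memory_items)
        del self.memory[:overflow]  # evict the oldest entries first

    def run(self, query: str) -> list[str]:
        context = self.retrieve(query)
        steps = self.plan(query, context)
        results = self.act(steps)
        self.persist(results)
        return results
```

Evicting the oldest entries is the simplest retention policy; it would also discard rare but important events, so a production system would likely need something more selective.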

4. Memory Hierarchy

  • Working Memory: ≤ context window
  • Short-term Memory: session-scoped
  • Episodic Memory: event log
  • Long-term Memory: vectorized
  • Knowledge Base: indexed corpus

Each tier has different access latency and retention:


5. Tool Layer

Tools are exposed through the MCP protocol. A canonical handler signature:

@mcp.tool(name="search_docs")
def search_docs(query: str, top_k: int = 5) -> list[Document]:
    """Semantic search over the engineering Knowledge Base."""
    embedding = embed(query)              # see Embedding.md
    return vector_store.query(embedding, k=top_k)
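
The handler above relies on a `vector_store.query(embedding, k=...)` call whose implementation is not shown. As a rough illustration of what such a call might do, here is an in-memory stand-in that ranks stored texts by cosine similarity. `InMemoryVectorStore` and `cosine` are hypothetical names, embeddings are plain lists of floats, and a real deployment would use a Vector Database rather than a linear scan.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for the vector_store used by search_docs."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self._rows.append((text, embedding))

    def query(self, embedding: list[float], k: int = 5) -> list[str]:
        # Rank every stored row by similarity to the query, highest first.
        ranked = sorted(
            self._rows,
            key=lambda row: cosine(embedding, row[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]
```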

Common tool categories:

| Category | Examples | Risk profile |
| --- | --- | --- |
| Read-only | search, fetch, list | low |
| Stateful | write, mutate, delete | high (gate behind Critic) |
| External | API calls, payments | medium (rate limit) |

Sandbox boundaries

Stateful tools must require explicit confirmation in the Agent Loop. A single mis-planned subtask can corrupt downstream Long-term Memory.
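
A confirmation gate of this kind could be sketched as a decorator that consults an approval callback before the tool body runs. The names `stateful`, `ConfirmationRequired`, and `make_delete` are hypothetical, and the callback stands in for whatever approval step the Agent Loop provides (a Critic verdict, a user prompt, or both).

```python
import functools
from typing import Any, Callable


class ConfirmationRequired(Exception):
    """Raised when a stateful tool call was not approved."""


def stateful(confirm: Callable[[str, dict[str, Any]], bool]):
    """Wrap a tool so it only runs after confirm(tool_name, kwargs) returns True."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            if not confirm(fn.__name__, kwargs):
                raise ConfirmationRequired(f"{fn.__name__} was not approved")
            return fn(**kwargs)

        return wrapper

    return decorator


def make_delete(store: dict[str, Any], confirm: Callable[[str, dict[str, Any]], bool]):
    """Build a gated delete tool over a plain dict (illustrative only)."""

    @stateful(confirm)
    def delete_key(*, key: str) -> None:
        del store[key]

    return delete_key
```

Because the check happens before the wrapped function is entered, a denied call leaves the underlying store untouched.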


6. Quality Metrics

We track four signals end-to-end:

$$\text{TaskScore} = \alpha \cdot \text{Accuracy} + \beta \cdot \text{Latency}^{-1} + \gamma \cdot \text{ToolEfficiency}$$

| Metric | Target | Owner |
| --- | --- | --- |
| Plan depth | ≤ 5 levels | Planner |
| Tool calls / task | ≤ 12 | Worker |
| Memory hit rate | ≥ 0.6 | Retrieval |
| Critic acceptance | ≥ 0.85 | Critic |
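
As a worked illustration, the score and the targets can be evaluated together. The weights, the latency unit (seconds), and the metric key names below are assumptions made for the example; the note does not specify them.

```python
def task_score(
    accuracy: float,
    latency_s: float,
    tool_efficiency: float,
    alpha: float = 0.6,  # assumed weight, not given in the note
    beta: float = 0.2,   # assumed weight
    gamma: float = 0.2,  # assumed weight
) -> float:
    """TaskScore = alpha * Accuracy + beta / Latency + gamma * ToolEfficiency."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return alpha * accuracy + beta / latency_s + gamma * tool_efficiency


# Targets from the metrics table; the key names are hypothetical.
TARGETS = {
    "plan_depth": ("<=", 5),
    "tool_calls_per_task": ("<=", 12),
    "memory_hit_rate": (">=", 0.6),
    "critic_acceptance": (">=", 0.85),
}


def violations(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that miss their target, in table order."""
    missed = []
    for name, (op, bound) in TARGETS.items():
        value = metrics[name]
        ok = value <= bound if op == "<=" else value >= bound
        if not ok:
            missed.append(name)
    return missed
```

With these assumed weights, a task with accuracy 1.0, latency 2.0 s, and tool efficiency 0.5 scores 0.6 + 0.1 + 0.1 = 0.8.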

7. Reference Implementations


8. Open Questions

  1. How do we bound Episodic Memory growth without losing rare-but-critical events?
  2. Should the Critic LLM share weights with the Planner, or stay distinct?
  3. Can Fine-tuning on past Agent Loop traces close the Reasoning gap on long-horizon tasks?
  4. What's the right granularity for MCP tools — atomic ops vs composite workflows?

9. Related Notes


Quote

"The bitter lesson: general methods that leverage computation are ultimately the most effective." — Rich Sutton

AI News

Live RSS feeds from your configured sources.

Top Stories
  1. Running Codex safely at OpenAI (OpenAI, 2h ago)
  2. The latest AI news we announced in April 2026 (Google AI, 5h ago)
  3. Granite 4.1 LLMs: How They're Built (Hugging Face, 7h ago)
  4. Nvidia has already committed $40B to equity AI deals this year (TechCrunch, 9h ago)
  5. LLMorphism: When humans see themselves as language models (Hacker News, 12h ago)
Google AI · 20
  1. See what happens when creative legends use AI to make ads
  2. 5 gardening tips you can try right in Search
  3. Google partners with XPRIZE on $3.5M Future Vision film competition
  4. The latest AI news we announced in April 2026
  5. Reduce friction with Webhooks in Gemini API
OpenAI · 947
  1. Running Codex safely at OpenAI
  2. Scaling Trusted Access for Cyber with GPT-5.5
  3. Parloa builds service agents with the Realtime API
  4. Advancing voice intelligence with new models in the API
  5. Testing ads in ChatGPT
Hugging Face · 777
  1. OncoAgent: A Dual-Tier Multi-Agent Framework
  2. EMO: Pretraining mixture of experts for emergent modularity
  3. vLLM V0 to V1: Correctness Before Corrections in RL
  4. Adding Benchmaxxer Repellant to the Open ASR Leaderboard
  5. Granite 4.1 LLMs: How They're Built
Hacker News · 20
  1. Rotten Dot Com
  2. LLMorphism: When humans see themselves as language models
  3. Gen Z Resentment Toward AI Grows
  4. Task Paralysis and AI
  5. Debian must ship reproducible packages
TechCrunch · 20
  1. Voice AI in India is hard. Wispr Flow is betting on it anyway.
  2. So you've heard these AI terms and nodded along
  3. Fintech startup Parker files for bankruptcy
  4. GM agrees to pay $12.75M in California driver privacy settlement
  5. Nvidia has committed $40B to equity AI deals this year
The Verge · 10
  1. Dyson's powerful 360 Vis Nav robovac is down to $279.99
  2. These great digital gifts will arrive in time for Mother's Day
  3. Let it snow
  4. The game that makes me actually want to exercise
  5. Quantum dot TVs beat RGB LED TVs

Configuration

Providers, paths, voice, knowledge, security and more

Google

Display name override (max 20 chars)
Credentials
Your API key stays on this device. ClanKit never intercepts, logs, or sends it to our servers — it only travels directly between this app and the AI provider you chose.
Available Models
50 models loaded
Default model: gemini-2.5-flash
  • Gemini 2.5 Flash: 1.05M · 66K · In: $0.30 · Out: $2.50
  • Gemini 2.5 Pro: 1.05M · 66K · In: $1.25 · Out: $10.00
  • Gemini 2.0 Flash: 1.05M · 8K · In: $0.10 · Out: $0.40
  • Gemini 2.0 Flash 001: 1.05M · 8K · In: $0.15 · Out: $0.60
  • Gemini 2.0 Flash-Lite 001: 1.05M · 8K · In: $0.07 · Out: $0.30
  • Gemini 2.5 Flash-Lite: 1.05M · 66K · In: $0.10 · Out: $0.40
  • Gemini 3 Pro Preview: 1.05M · 66K · In: $2.00 · Out: $12.00
  • Gemini 3 Flash Preview: 1.05M · 66K · In: $0.50 · Out: $3.00
  • Imagen 4: 480 · 8K · No pricing
  • Veo 3: 480 · 8K · No pricing
  • Gemma 4 31B IT: 262K · 33K · No pricing
Clankit · A team of agents in your dock
AI Doc

A writing surface AI can edit precisely.

Markdown / rich-text, native. Ask any agent to rewrite a paragraph, fix tone, expand an outline, or fact-check. Every edit is one-click revert. Your draft is never overwritten by accident.

  • Targeted replacements, not whole-file rewrites
  • Inline suggestions, accept / revert per edit
  • Pre-edit snapshot for any tool-based file write
Obsidian sync

One Markdown, two homes that stay in sync.

Author drafts inside Clankit, or open your existing Obsidian vault — same files, same folders, two-way sync. Edit on either side and the other catches up.

Obsidian · Vault sync
Focus mode

Just you, the page, and an agent on call.

Hide the navigation, the chat, every panel — the writing surface takes over the screen, and any agent stays one keystroke away.

Beyond Markdown

One draft, every format you need to ship.

Render diagrams with Draw.io, export a polished web page, hand off a slide deck — all from the same Markdown source. No copy-paste between tools.

Outputs · Diagram / HTML / PPT

Plug into the models & platforms you already use

Anthropic · OpenAI · OpenRouter · DeepSeek · Qwen · Moonshot · Microsoft Teams · Feishu · WeChat · WhatsApp · MCP servers
Your digital workforce

Meet your agents. Shape every detail.

A roster on one side, full anatomy on the other — every part configurable.

Clankit · Agents

System Agents

19

AI characters you create — friends, mentors, assistants. Each has its own personality, skills, knowledge and memory. Chat with them one-on-one or add them to a group.

Your full agent roster.

Browse every persona at a glance — pinned favourites, recents, and the full library. Switch in one tap.

BodyView · Anatomy

Configure every part — visually.

Brain · model · voice · memory · skills · tools · MCP · RAG. Click any body part to jump straight to its config.

Import · WhatsApp · WeChat

Turn a chat history into a digital twin.

Import a conversation export. Clankit reconstructs voice, vocabulary, and recurring topics — fully on-device. Files never leave your machine.

Persona · Insight

Understand the person, not just the words.

Surface values, decision style, emotional tells, recurring concerns, even relationship dynamics. Every claim links back to evidence.

Role-play

Two ways to play.

Real friends rebuilt from chat history, or wholly fictional cast members let loose — both off-the-cuff, both yours.

Imported friend · Catch-up

Catch up with an old friend you rebuilt from chat history.

Drop a WhatsApp or WeChat export and Clankit reconstructs their voice, vocabulary, and recurring topics — then you keep talking. The replies land like the friend you remember.

Fictional cast · Crossover gossip

Drop fictional personas into your own life — and watch the chaos.

Mix invented characters with your own user profile, then let them gossip across timelines. Pure absurd-comedy fuel — script ideas, role-play sessions, party games at 2am.

More than a chat window

Built for serious work.

Voice call

Pick up the phone and talk to your agent.

Real-time voice for hands-free work. Brainstorm on a walk, dictate a report, run a code review while cooking. Local STT, LLM streaming, on-device synthesis. Persistent waveforms — even if the chat closes mid-call.

  • Barge in anytime — start talking and the AI stops mid-sentence
  • Floating call panel — switch chats or pages, the call follows you
  • Every line auto-saved into the chat and the agent's long-term memory (full-text + semantic search)
  • Local STT (SenseVoice) + free Edge TTS — zero extra cost; or skip the model download with Whisper API
Voice · Live call
Capability stack

Skills, tools, MCP, RAG — wire it all in.

Compose every agent's capabilities like Lego: drop in Skills, register tools, plug into any MCP server, attach a RAG knowledge base. No code, no plugin store hoops — just toggles and paths.

  • Skills marketplace · install / share / version
  • MCP over stdio / HTTP / SSE — auto-registered
  • Local RAG with on-device embeddings
Skills · Tools · MCP · RAG
Up and running in minutes

Four small steps. Maybe five minutes.

No accounts. No SaaS sign-ups. No cloud lock-in. Install, paste a key, ship work.

  1. Install in one click.

    Download the signed Windows or macOS build. Open it. You're in.

  2. Bring your model.

    Paste an API key: Anthropic, OpenAI, OpenRouter, DeepSeek, or any OpenAI-compatible endpoint.

  3. Build the agent.

    Tap each body part: head for model, heart for memory, arms for tools. Visual, no menu digging.

  4. Send the first message.

    Type, drop a file, hit a voice call, and your agent reads, runs, replies. You're already shipping.

Everything else

Twelve more capabilities, zero fluff.

The supporting cast that makes the headline features actually work in production.

Long-term memory

Per-agent durable memory with FTS5 + semantic search. Agents remember across sessions.

Roleplay mode

A dedicated mode for character chat, story-driven conversations, and persona play.

Productivity mode

Compact, dense layout tuned for long sessions: planners, todos, references, focus.

3D model preview

Drop a .glb / .obj into chat, rotate and inspect inline via Babylon.js.

Relationship verdict

Persona analysis surfaces an evidence-cited verdict on relationship dynamics.

MCP servers

Connect any Model Context Protocol server via stdio / HTTP / SSE — auto-registered as tools.

Skills marketplace

Browse, install, share Skills — bundled prompts + tools that agents pick up automatically.

Multilingual UI

English & Simplified Chinese first-class — agent output respects your config language.

Silent auto-update

Background NSIS / DMG updates with delta blockmaps. SHA-512 verified before install.

Cross-platform

Runs on Windows and macOS (Intel & Apple Silicon).

Open standards

Personas as JSON, Skills as Markdown + scripts, data as SQLite. No black-box formats.

Observable runtime

Every chunk, tool call, token-usage snapshot is inspectable. Debug agents like real software.

Encrypted at rest: DPAPI · Keychain
100% local data: SQLite · Vectra · Your folder
Permission gating: Approve every sensitive call
Zero telemetry: No analytics · No tracking
Your keys, your bill: No middleman · Direct calls
FAQ

Questions, answered.

Is Clankit really 100% local?

Yes. The desktop app stores all chats, agents, memory, and embeddings on your device. The only network calls are the ones you initiate by configuring an LLM provider — and those go directly to that provider, not to us. We do not run a cloud, do not collect telemetry, and do not see your API keys.

Which LLM providers can I use?

Anthropic, OpenAI, OpenRouter, DeepSeek, Qwen, Moonshot, Together, Groq, Ollama (local models), and any OpenAI-compatible API. Just paste the base URL and key.

Where does Clankit fit alongside other AI tools?

Clankit isn't trying to replace any of them. Specialised coding assistants, browser-based chatbots, and document AIs all have their place. What Clankit gives you is a single desktop workspace that's easy to install, easy to use — bring any LLM, build any team of agents, plug in your own tools and knowledge, all running locally on your machine.

Do I need a Clankit account? What is "Sign in with Google" used for?

No — you can hit Skip on the launch dialog and use Clankit fully unauthenticated. A Clankit account is only needed for optional features: license activation, paid plans (when they launch), and update-channel management. Accounts can be created with Sign in with Google or email. When you choose Google, Clankit only requests the minimum identity scopes — openid, profile, email — and never accesses Gmail, Drive, Calendar, or Contacts. Full details in our Privacy policy.

Is it open source?

The desktop frontend is open source on GitHub. Free for personal use during the launch period; commercial use requires the ClanKit Community License — contact us at hello@clankit.app for details.

What platforms are supported?

Both Windows and macOS are supported, with native builds for Intel and Apple Silicon Macs.