ThinHarness is a minimal, opinionated Python harness for purpose-built agents — it owns the agent loop and leaves the rest of your app to you.
It exists for the gap between building the agent loop yourself and adopting a large runtime where the loop comes bundled with assumptions you don't need and can't easily change.
Why ThinHarness:
Focused scope: filesystem tools, structured output, tool retries, approvals, subagents, parallel LLM calls, skills, MCP, limits, streaming, and OTel tracing. Nothing more.
You keep your stack: orchestration, auth, storage, and deployment stay yours, because they become product-specific anyway.
Small enough to fork: full harness-level feature coverage in under 8k LOC, matching frameworks 5–10× its size; each major feature in its own file with no hidden dependencies.
Proven on real work: on a LongMemEval-V2 subset it outperformed the paper's task-tuned Codex harness (74.0% vs 72.4%) using ~46% fewer tokens, with only built-in tools.
Pre-1.0, MIT. I'd love feedback from people building agent workflows in Python.
Built with