A retrieval-augmented QA system over enterprise documents (Confluence, Google Drive, Slack, GitHub, Jira) that answers with grounded, cited responses — and abstains when the answer isn't there.
The headline feature is a cost-bounded agentic retrieval loop: the agent retrieves, asks itself whether it has enough context, and reformulates the query until it does. What sets it apart from most 'agentic RAG' demos is that it provably cannot overspend — a reserve-based budget (spent + next_call + reserve ≤ cap), a tiktoken-counted context cap, and a hard iteration limit. In a live run a $0.0035 cap halted the loop at $0.002012.
It ships with an honest eval harness that scores retrieval recall, faithfulness, and abstention-correctness per source-type and per question-type — no cherry-picking. It's perfect on abstention (never fabricates) and shows its weak spots rather than hiding them.
Stack: Django/DRF, pgvector hybrid retrieval (dense + Postgres FTS fused by Reciprocal Rank Fusion), Next.js, Docker. Evaluated on EnterpriseRAG-Bench.
Built with