Day 104 of 2026 building!
I built LeakLab: a production-style interactive app that demonstrates how LLM security failures happen in real systems and how to fix them with layered guardrails.
Most teams still rely too much on prompts like:
“Never reveal confidential information.”
That helps, but it’s not a security boundary.
LeakLab shows this live through a gamified 6-level flow:
- Level 1: Break the AI and force a secret leak
- Level 2: Inspect retrieved context + full prompt to see root cause
- Level 3: Enable guardrails dynamically
- Level 4: Try multiple attack modes (prompt injection, roleplay, multi-turn, reconstruction)
- Level 5: Visualize the full security pipeline
- Level 6: Compare before vs after side-by-side
What makes it useful for developers
It uses a realistic secret-exfiltration simulation where sensitive data exists in RAG context and memory.
Then it introduces practical controls:
- Input Filter
- Context Sanitizer
- Access Control (guest vs admin)
- Output Validator
- LLM Critic (second model pass)
The key takeaway is immediate:
Security comes from controlling data and context flow, not only model instructions.
### Stack
- Python
- Streamlit
- OpenAI-compatible APIs (OpenAI, Ollama, Featherless AI)
Why I built this
I wanted a live-demo friendly app for conferences, hackathons, and team training that moves beyond theory and makes LLM security visible in under 5 minutes.
If you’re building AI products, one question to ask today:
> Can an adversarial user exfiltrate sensitive context through prompt injection and multi-turn attacks?
If yes, redesign your pipeline, not just your system prompt.
Github and a technical blog link are in the first comment below!
#DailyBuild2026
Your upvotes and feedback are welcome!
Words have more power than we think. Be kind.