Post by Harish Kotra

Id Verified
Harish Kotra
@harishh • #show  • 2mo

Day 104 of 2026 building!

I built LeakLab: a production-style interactive app that demonstrates how LLM security failures happen in real systems and how to fix them with layered guardrails.


Most teams still rely too much on prompts like:

“Never reveal confidential information.”


That helps, but it’s not a security boundary.


LeakLab shows this live through a gamified 6-level flow:


- Level 1: Break the AI and force a secret leak

- Level 2: Inspect retrieved context + full prompt to see root cause

- Level 3: Enable guardrails dynamically

- Level 4: Try multiple attack modes (prompt injection, roleplay, multi-turn, reconstruction)

- Level 5: Visualize the full security pipeline

- Level 6: Compare before vs after side-by-side


What makes it useful for developers


It uses a realistic secret-exfiltration simulation where sensitive data exists in RAG context and memory.


Then it introduces practical controls:


- Input Filter

- Context Sanitizer

- Access Control (guest vs admin)

- Output Validator

- LLM Critic (second model pass)


The key takeaway is immediate:


Security comes from controlling data and context flow, not only model instructions.


### Stack


- Python

- Streamlit

- OpenAI-compatible APIs (OpenAI, Ollama, Featherless AI)


Why I built this


I wanted a live-demo friendly app for conferences, hackathons, and team training that moves beyond theory and makes LLM security visible in under 5 minutes.


If you’re building AI products, one question to ask today:


> Can an adversarial user exfiltrate sensitive context through prompt injection and multi-turn attacks?


If yes, redesign your pipeline, not just your system prompt.


Github and a technical blog link are in the first comment below!


#DailyBuild2026

Your upvotes and feedback are welcome!

Words have more power than we think. Be kind.