Built a distributed incident root-cause analysis platform for DevOps/SRE teams that ingests live Prometheus alerts.
→ Feedback-aware RAG pipeline (LlamaIndex + ChromaDB) auto-generates root-cause hypotheses, cutting mean triage time by ~70%
→ Non-blocking FastAPI endpoints sustain sub-200ms P99 latency under high-throughput alert bursts
→ Multi-tenancy, JWT auth, and rate limiting; deployed on Render + Vercel with CI/CD via GitHub Actions
→ Built a chaos simulator to stress-test incident detection across microservices
Stack: Python · FastAPI · LlamaIndex · ChromaDB · PostgreSQL · Prometheus · Docker · Next.js
Live URL: https://sentinel-sre-zeta.vercel.app