- Architected and deployed a multi-service enterprise RAG platform integrating internal knowledge (Confluence) with validation guardrails, hybrid retrieval (Pinecone dense search + MS-MARCO MiniLM reranking), and intent-aware routing to deliver reliable, citable AI responses while reducing unnecessary vector calls and latency.
- Designed a decoupled ingestion pipeline (webhooks → staged state → deferred re-indexing) with retry logic and idempotent upserts to enable non-blocking, scalable indexing across enterprise data sources.
- Implemented enterprise-grade trust and reliability layers using DeBERTa-MNLI–based contradiction detection, Guardrails output validation, and modular Dockerized services (LLM engine, indexer, reranker, NLI, analytics) with fail-safe logging and non-blocking error handling.
- Instrumented full-pipeline observability using ClickHouse + Metabase to track stage-level latency, status, and trace IDs, and integrated PromptLayer + LangSmith for prompt tracing, evaluation, and performance monitoring.

Demo Link - https://drive.google.com/file/d/1owk99_p45ifnRXTgqgfJuUVEkU6i58fo/view
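
A minimal sketch of the ingestion flow described above (webhooks → staged state → deferred re-indexing, with retries and idempotent upserts). It is illustrative only: the in-memory dictionaries stand in for the real staging store and the Pinecone index, and every name (`on_webhook`, `with_retry`, `upsert`, `reindex_staged`) is hypothetical rather than taken from the project.

```python
import time
from typing import Callable, Dict

# Hypothetical in-memory stand-ins for the staging store and the vector index.
staged: Dict[str, str] = {}  # page_id -> latest content awaiting re-indexing
index: Dict[str, str] = {}   # record_id -> indexed content


def on_webhook(page_id: str, content: str) -> None:
    """Stage the change and return at once; no indexing work happens here."""
    staged[page_id] = content  # a later event for the same page overwrites the earlier one


def with_retry(fn: Callable[[], None], attempts: int = 3, base_delay: float = 0.0) -> None:
    """Call fn, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            fn()
            return
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** attempt)


def upsert(record_id: str, content: str) -> None:
    """Idempotent write: the key is deterministic, so repeats overwrite instead of duplicating."""
    index[record_id] = content


def reindex_staged(upsert_fn: Callable[[str, str], None] = upsert) -> int:
    """Deferred job: index every staged page, clearing each entry only after it succeeds."""
    done = 0
    for page_id in list(staged):
        content = staged[page_id]
        with_retry(lambda: upsert_fn(page_id, content))
        del staged[page_id]
        done += 1
    return done
```

Because the record key is derived from the page ID alone, a retried or replayed event leaves the index in the same state as a single delivery, which is what makes the retries safe.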