⚖️ Legal Document Analyzer – Optimized LLM + RAG Backend

Spring Boot, Gemini LLM, RAG, PgVector, Redis, JWT

I rebuilt and optimized my Legal Document Analyzer into a production-oriented LLM system that performs clause extraction, summarization, risk detection, and document-specific Q&A using Retrieval-Augmented Generation (RAG).

I redesigned the LLM response pipeline so that prompt templates produce structured outputs, which the backend parses directly into a relational schema mapped to legal headings. This eliminated large JSON parsing on the frontend, reduced payload size, and improved query performance and UI rendering efficiency.

To enable document-specific querying, I implemented a full RAG pipeline using embeddings stored in PostgreSQL with PgVector. Documents are chunked and embedded; each user query is converted to a vector and matched by similarity search to retrieve the top-K relevant chunks as context before LLM inference. Grounding answers in the retrieved passages reduced hallucinations and improved answer accuracy, and sending only the top-K chunks rather than the whole document lowered token usage and cost.

Redis-based fixed-window rate limiting protects the system and controls LLM cost (2 analysis requests per user per day; 5 standard API calls per hour). The backend uses stateless JWT authentication and modular AI service integration.

This rebuild provided hands-on experience with real-world RAG pipelines, vector search, LLM cost control, and scalable AI backend architecture.

Key Highlights: RAG + PgVector • Structured LLM outputs • Redis rate limiting • Cost-aware LLM design
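
To make the retrieval step above concrete (embed the query, rank stored chunks by similarity, keep the top K), here is a minimal in-memory sketch in Java. It is an illustration under assumptions, not the project's actual code: the class and method names (`TopKRetriever`, `Chunk`, `topK`) are hypothetical, and cosine similarity is assumed as the metric, since the write-up does not say which one is used. With PgVector the ranking would normally be done by the database itself, for example with an `ORDER BY embedding <=> ? LIMIT k` query, where `<=>` is PgVector's cosine-distance operator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory sketch of top-K retrieval by cosine similarity (all names are hypothetical). */
class TopKRetriever {

    /** A stored chunk: its text plus its embedding vector. */
    static class Chunk {
        final String text;
        final double[] embedding;

        Chunk(String text, double[] embedding) {
            this.text = text;
            this.embedding = embedding;
        }
    }

    /** Cosine similarity between two vectors of equal length. */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // a zero vector has no direction; treat it as dissimilar
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to k chunks ordered from most to least similar to the query vector. */
    public static List<Chunk> topK(double[] query, List<Chunk> chunks, int k) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        // Negate the score so the ascending sort puts the highest similarity first.
        sorted.sort(Comparator.comparingDouble((Chunk c) -> -cosine(query, c.embedding)));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }
}
```

Only the texts of the returned chunks would then be placed in the LLM prompt as context, which is where the token savings come from. A linear scan like this is fine for illustration; a real deployment would rely on the database's vector index instead.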
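
The fixed-window rate limiting mentioned above can be sketched in the same spirit. In this hypothetical Java example a `HashMap` stands in for Redis so the logic is runnable on its own; the class name `FixedWindowLimiter`, the key format, and the explicit clock parameter are assumptions made for illustration. In Redis the counter increment would typically be an `INCR` on the per-window key, with an `EXPIRE` set so that old windows clean themselves up.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory sketch of a fixed-window rate limiter (hypothetical names; a Map stands in for Redis). */
class FixedWindowLimiter {
    private final int limit;          // maximum requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private final Map<String, Integer> counters = new HashMap<>();

    public FixedWindowLimiter(int limit, long windowMillis) {
        if (limit < 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("limit must be >= 0 and window > 0");
        }
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /**
     * Counts one request for the user in the window containing nowMillis.
     * Returns true if the request is within the limit, false otherwise.
     */
    public synchronized boolean tryAcquire(String userId, long nowMillis) {
        long windowIndex = nowMillis / windowMillis;   // which fixed window we are in
        String key = userId + ":" + windowIndex;       // one counter per user per window
        // Plays the role of Redis INCR: create the counter at 1 or add 1 to it.
        int count = counters.merge(key, 1, Integer::sum);
        // Note: this sketch never evicts old keys; in Redis an EXPIRE would handle that.
        return count <= limit;
    }
}
```

For the analysis limit quoted above, this would be constructed as `new FixedWindowLimiter(2, 86_400_000L)`: the first two calls for a user within a day return `true`, the third returns `false`, and the counter starts again in the next window. Fixed windows are simple and cheap, at the cost of allowing a short burst across a window boundary.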