Architected an optimized, enterprise-grade Retrieval-Augmented Generation (RAG) system that parses and reasons over complex research documents of 200+ pages. Moving beyond naive text splitting to semantic-aware retrieval improved context precision and generative faithfulness.

1. Engineered a multi-stage ingestion pipeline using LlamaParse to convert large PDFs to Markdown while preserving complex tables and layouts.
2. Replaced fixed token-window splitting with dynamic Semantic Chunking (via Gemma-3 embeddings), which partitions documents at contextual shifts.
3. Reduced vector search latency and storage by 3x using Matryoshka Representation Learning (MRL), truncating dense vectors to 256 dimensions with no observed semantic loss.
4. Deployed a Dockerized Weaviate v4 vector database supporting Hybrid Search, Multi-Query Expansion, and Cross-Encoder Reranking.
5. Integrated DeepSeek-R1 (via Ollama) as the generative reasoning engine to synthesize reranked context into grounded, source-cited responses.
6. Validated the architecture with RAGAS, achieving 0.98 Context Recall, 0.93 Context Precision, and 0.72 Faithfulness.
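
The MRL truncation in item 3 can be illustrated with a minimal sketch. It is not the project's actual code: the function names `truncate_embedding`, `cosine`, and `storage_ratio` are hypothetical, and the 768-dimensional source vector is an assumption chosen because it would be consistent with the stated 3x reduction (768 / 256 = 3). The idea is to keep the leading 256 components of an MRL-trained embedding and re-normalize so that cosine similarity still behaves as expected.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dims: int = 256) -> List[float]:
    """Keep the first `dims` components and L2-normalize the result.

    Hypothetical helper: MRL-trained models pack the most useful
    information into the leading dimensions, so a prefix is usable
    as a smaller embedding once it is re-normalized.
    """
    if dims <= 0 or dims > len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def storage_ratio(full_dims: int, kept_dims: int) -> float:
    """Per-vector storage reduction factor at a fixed numeric precision."""
    return full_dims / kept_dims


if __name__ == "__main__":
    # Assumed 768-dimensional input; the values are synthetic placeholders.
    full = [float(i % 7 + 1) for i in range(768)]
    small = truncate_embedding(full, 256)
    print(len(small), storage_ratio(len(full), len(small)))
```

Truncation of this kind is only appropriate for embeddings trained with a Matryoshka objective; cutting the tail off an ordinary embedding can degrade retrieval quality far more, which is why retrieval metrics should be checked at the reduced dimension.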