Designed and developed a generative AI-powered medical chatbot using LangChain and GPT models, integrated with Pinecone for vector-based retrieval of relevant medical contexts. Deployed a Flask-based backend with AWS CI/CD pipelines, improving response relevance and reducing end-to-end response latency by approximately 35% through optimized retrieval and prompt orchestration.