Most AI apps stop at “upload → ask → answer”.
I’m building Cortex to go deeper: a scalable data processing and retrieval system.
Here’s what’s happening under the hood:
Users upload documents → stored via pre-signed URLs (S3) → metadata persisted → async processing kicked off via a queue (BullMQ).
A worker pipeline then:
• Downloads & buffers the file
• Extracts text (PDF parsing)
• Chunks the data
• Generates embeddings
• Stores everything in a vector database
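To make the chunking step concrete, here is a minimal TypeScript sketch of fixed-size chunking with overlap. It is an illustration only, not Cortex's actual code: the `chunkText` name, the 500-character chunk size, and the 50-character overlap are all assumptions.

```typescript
// Hypothetical helper: splits extracted text into overlapping character chunks.
// chunkSize and overlap are illustrative defaults, not Cortex's real settings.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this window has reached the end of the text,
    // so no trailing chunk made only of overlap is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap means a sentence cut at a chunk boundary still appears in full in at least one neighbouring chunk, which tends to help retrieval. A real pipeline might split on sentences or tokens instead of raw characters.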
On the query side:
User question → embedding → similarity search → context retrieval → LLM response
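The similarity search step can be sketched in a few lines of TypeScript. This is a brute-force, in-memory version for illustration only; a vector database would normally do this with an index. The `StoredChunk` shape and the `cosineSimilarity` and `topK` names are assumptions, not part of Cortex.

```typescript
// Hypothetical record shape for a chunk stored alongside its embedding.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Returns the k chunks most similar to the query embedding, best first.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The text of the returned chunks is what would be passed to the LLM as context for the answer.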
All of this is stitched together with:
• Async job processing (BullMQ)
• Real-time updates via SSE
• Modular services sharing a common data layer
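For the real-time updates, a processing status change could be serialized into the SSE wire format roughly like this. The `JobUpdate` fields, the status values, and the `formatSseEvent` name are hypothetical and chosen for illustration; only the `event:` / `data:` framing with a blank-line terminator comes from the SSE format itself.

```typescript
// Hypothetical status values for a document moving through the pipeline.
type JobStatus = "queued" | "processing" | "completed" | "failed";

// Hypothetical payload sent to the client on each progress change.
interface JobUpdate {
  documentId: string;
  status: JobStatus;
  progress: number; // 0 to 100
}

// Builds one SSE message: an event name line, a data line, and a blank line
// that tells the browser's EventSource the message is complete.
// JSON.stringify output contains no raw newlines, so one data line is enough.
function formatSseEvent(event: string, payload: JobUpdate): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}
```

A server would write this string to a response opened with the `Content-Type: text/event-stream` header, once per update coming out of the worker.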
Currently in Phase 1, focusing on getting the ingestion + processing pipeline rock solid.
The goal is to evolve this into a production-grade knowledge engine, not just another chatbot.
Would genuinely love feedback to improve on the current implementation.
Built with