Building a RAG chatbot with Laravel and OpenAI - my approach

James Wilson · AI & Machine Learning
I just finished building a RAG (Retrieval-Augmented Generation) chatbot using Laravel as the backend and wanted to share my architecture.

Stack:
- Laravel 11 for the API and admin panel
- OpenAI GPT-4 for generation
- Pinecone for vector storage
- Laravel Horizon for async embedding jobs

How it works:
1. Admin uploads documents via a Filament panel
2. A queued job chunks the document and generates embeddings via OpenAI
3. Embeddings are stored in Pinecone with metadata
4. A user asks a question; relevant chunks are retrieved and sent as context to GPT-4

Key learnings:
- A chunk size of 500 tokens with 50-token overlap works best for my use case
- Always include source attribution in responses for trust
- Cache frequent queries in Redis to save on API costs

Happy to answer questions about the implementation.
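The chunking step (500-token chunks, 50-token overlap) can be sketched like this. This is a language-agnostic Python illustration, not James's Laravel job; the whitespace-split "tokens" in the demo are a crude stand-in for a real tokenizer such as tiktoken:

```python
def chunk_text(tokens, chunk_size=500, overlap=50):
    """Split a token sequence into overlapping chunks.

    Consecutive chunks advance by (chunk_size - overlap) tokens,
    so each chunk shares `overlap` tokens with the previous one.
    Requires overlap < chunk_size.
    """
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):  # last chunk reached the end
            break
    return chunks

# Crude demo: whitespace words stand in for real tokens.
words = "a long document split into small overlapping windows of text".split()
pieces = chunk_text(words, chunk_size=4, overlap=1)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, which is why it helps retrieval quality.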

3 Replies

Best Answer
Aisha Khan 1 week ago
Nice architecture, James! A few suggestions from my experience:

1. Chunk overlap is critical. 50 tokens is good; I have found 10-15% overlap works well in general.
2. Hybrid search. Combine vector similarity with keyword matching (BM25) for better retrieval results.
3. Re-ranking. After retrieval, re-rank chunks by relevance before sending them to the LLM.

Also, have you considered using pgvector instead of Pinecone? It keeps everything in PostgreSQL and avoids another external SaaS dependency.
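One common way to combine vector and BM25 result lists (not necessarily what Aisha had in mind) is reciprocal rank fusion, which needs only the ranked IDs from each retriever, not comparable scores. A minimal sketch:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant commonly used for RRF. Ids found by both
    retrievers rise toward the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["c3", "c1", "c7"]  # ranked by cosine similarity
bm25_hits = ["c1", "c9", "c3"]    # ranked by BM25 keyword score
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Here "c1" and "c3" appear in both lists, so they outrank the chunks found by only one retriever.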
Sofia Martinez 6 days ago
This is exactly what I want to build! Quick question: how do you handle document updates? If a document changes, do you re-embed the entire thing or just the modified chunks?
James Wilson 5 days ago
Good question, Sofia! I delete all chunks for that document and re-embed it entirely. Trying to diff individual chunks gets complicated fast because chunk boundaries can shift when content changes. The re-embedding cost is minimal with batched API calls: about $0.01 per 100 pages of text.
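The delete-then-re-embed flow can be sketched against a toy in-memory store; the `VectorStore` class and `embed` callback below are hypothetical stand-ins, not Pinecone's or OpenAI's actual APIs:

```python
class VectorStore:
    """Toy in-memory stand-in for a vector DB with metadata filtering."""

    def __init__(self):
        self.records = {}  # id -> (vector, metadata)

    def delete_by_doc(self, doc_id):
        """Drop every chunk whose metadata points at doc_id."""
        self.records = {k: v for k, v in self.records.items()
                        if v[1]["doc_id"] != doc_id}

    def upsert(self, rec_id, vector, metadata):
        self.records[rec_id] = (vector, metadata)


def reindex_document(store, doc_id, chunks, embed):
    """Drop all existing chunks for doc_id, then embed and store new ones.

    Full re-embedding avoids diffing: chunk boundaries shift when the
    document changes, so matching old chunks to new ones is unreliable.
    """
    store.delete_by_doc(doc_id)
    for i, chunk in enumerate(chunks):
        store.upsert(f"{doc_id}#{i}", embed(chunk),
                     {"doc_id": doc_id, "text": chunk})
```

With Pinecone the delete step would be a metadata-filtered delete on the document ID, and `embed` would batch chunks into one embeddings API call, which is where the cost savings come from.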
