Build an OpenAI RAG system with document upload, semantic search and caching
Overview
This workflow implements a Retrieval-Augmented Generation (RAG) system that covers both document ingestion and natural-language querying.
It allows users to upload documents, convert them into vector embeddings, and query them using natural language. For each query, the system retrieves the most relevant document chunks and passes them to an LLM, which generates an answer grounded in that context. Recent query results are cached to improve response time and reduce API costs.
This workflow is ideal for building AI knowledge bases, document assistants, and internal search systems.
How It Works
1. Input & Configuration
Receives requests via webhook (rag-system)
Supports two actions:
upload → process documents
query → answer questions
Defines:
Chunk size & overlap
TopK retrieval count
Database table names
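The configuration and routing step can be pictured with a minimal Python sketch. This is an illustration only, not the workflow's actual n8n nodes: the names RagConfig and route_request and the returned flow labels are hypothetical, while the default values and table names are taken from the Setup Instructions section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical container for the settings this step defines."""
    chunk_size: int = 1000       # example value from "Configure Parameters"
    chunk_overlap: int = 200     # example value from "Configure Parameters"
    top_k: int = 5               # example value from "Configure Parameters"
    vector_table: str = "documents"
    cache_table: str = "query_cache"
    upload_log_table: str = "upload_log"


def route_request(payload: dict) -> str:
    """Return a label for the flow that should handle this webhook payload."""
    action = payload.get("action")
    if action == "upload":
        return "document_upload_flow"
    if action == "query":
        return "query_flow"
    # Anything else is rejected rather than silently ignored.
    raise ValueError(f"Unsupported action: {action!r}")
```

Keeping the settings in one immutable object makes it easier to pass the same chunking and retrieval parameters to both flows.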
Document Upload Flow
Text Extraction
Extracts text from uploaded PDF documents
Text Chunking
Splits text into overlapping chunks for better retrieval accuracy
Document Structuring
Converts chunks into structured documents
Embedding Generation
Generates vector embeddings using OpenAI
Vector Storage
Stores embeddings in PGVector (Postgres)
Upload Logging
Logs document metadata (user, filename, timestamp)
Response
Returns success message via webhook
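The chunking and structuring steps above can be sketched as follows. This assumes simple fixed-size character chunking; the workflow's actual text splitter may split on different boundaries. The function names and the shape of the document dictionaries are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    stride = chunk_size - overlap  # how far the window advances each step
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += stride
    return chunks


def to_documents(chunks: list[str], user_id: str, filename: str) -> list[dict]:
    """Attach metadata to each chunk (assumed shape, for illustration only)."""
    return [
        {
            "text": chunk,
            "metadata": {"user_id": user_id, "filename": filename, "chunk_index": i},
        }
        for i, chunk in enumerate(chunks)
    ]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is why overlapping chunks tend to retrieve better than disjoint ones.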
Query Flow
Cache Check
Checks whether a result for the same query was cached within the last hour
Cache Routing
If cached → return cached response
If not → proceed to retrieval
Cache Hit Flow
Format Cached Response
Standardizes cached output format
Respond to User
Returns cached answer with cached: true
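The cache check and cache-hit response can be sketched as below. This is a minimal in-memory stand-in: the workflow keeps its cache in a Postgres table, and the key normalization, the per-user keying, and the class and function names here are all assumptions made for illustration.

```python
import hashlib
import time

CACHE_TTL_SECONDS = 3600  # the one-hour window from the Cache Check step


def cache_key(user_id: str, query: str) -> str:
    """Build a stable key; lowercasing and whitespace folding are assumptions."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(f"{user_id}:{normalized}".encode("utf-8")).hexdigest()


class QueryCache:
    """In-memory stand-in for the query cache table."""

    def __init__(self, ttl_seconds: float = CACHE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._rows = {}      # key -> (answer, created_at)

    def put(self, user_id: str, query: str, answer: str) -> None:
        self._rows[cache_key(user_id, query)] = (answer, self._clock())

    def get(self, user_id: str, query: str):
        """Return a standardized cached response, or None on a miss or expiry."""
        row = self._rows.get(cache_key(user_id, query))
        if row is None:
            return None
        answer, created_at = row
        if self._clock() - created_at > self._ttl:
            return None  # entry is older than the TTL, treat it as a miss
        return {"answer": answer, "cached": True}
```

A None result corresponds to the cache-miss branch, where the request continues to retrieval.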
Cache Miss Flow
Vector Retrieval
Retrieves the TopK most relevant document chunks from PGVector
AI Answer Generation
Passes the retrieved chunks to the LLM as context
Generates an answer grounded in that context
Cache Storage
Saves query + response in database for reuse
Response
Returns generated answer with cached: false
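The retrieval and answer-generation steps can be illustrated with a pure-Python sketch. In the workflow the similarity search runs inside PGVector and the prompt is assembled by the LLM node; the code below only mimics that logic. Cosine similarity is an assumed metric, and the function names and prompt wording are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, rows, top_k: int = 5) -> list[str]:
    """rows is a list of (text, embedding) pairs; return the top_k closest texts."""
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a context-grounded prompt (wording is illustrative)."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The prompt returned by build_prompt would be sent to the chat model, and the resulting answer stored in the cache and returned with cached: false.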
Setup Instructions
Webhook Setup
Configure endpoint (rag-system)
Send payload with:
action: upload / query
user_id
document (for upload) or query (for query)
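A small sketch of how the payload could be checked before it reaches either flow. The field names come from the list above; the validation rules, the error messages, and the function name are assumptions, and how the document itself is encoded in the request is not specified here.

```python
# Fields each action needs, based on the payload description above.
REQUIRED_FIELDS = {
    "upload": ("user_id", "document"),
    "query": ("user_id", "query"),
}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    action = payload.get("action")
    if action not in REQUIRED_FIELDS:
        return [f"action must be one of {sorted(REQUIRED_FIELDS)}"]
    errors = []
    for field in REQUIRED_FIELDS[action]:
        if not payload.get(field):  # missing, None, or empty string
            errors.append(f"missing field: {field}")
    return errors
```

For example, a query request might look like {"action": "query", "user_id": "u1", "query": "What does the contract say about renewal?"}, where the values are made up.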
OpenAI Setup
Add API credentials for:
Embeddings
Chat model
Postgres + PGVector
Enable PGVector extension
Create tables:
documents
query_cache
upload_log
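One possible schema for these tables, held as SQL strings in Python. Only the table names and the use of the PGVector extension come from this document; every column name and type is an assumption, as is the embedding dimension of 1536 (the output size of OpenAI's text-embedding-3-small). The documents table in particular should match whatever layout the vector store node expects.

```python
EMBEDDING_DIM = 1536  # assumption: must equal the output size of the embedding model

SCHEMA_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;

-- Chunk text, metadata, and embedding (column layout is an assumption).
CREATE TABLE IF NOT EXISTS documents (
    id        BIGSERIAL PRIMARY KEY,
    text      TEXT NOT NULL,
    metadata  JSONB NOT NULL DEFAULT '{{}}',
    embedding VECTOR({EMBEDDING_DIM}) NOT NULL
);

-- Cached query/response pairs, timestamped for the one-hour check.
CREATE TABLE IF NOT EXISTS query_cache (
    id         BIGSERIAL PRIMARY KEY,
    user_id    TEXT NOT NULL,
    query      TEXT NOT NULL,
    response   TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- One row per upload: user, filename, timestamp.
CREATE TABLE IF NOT EXISTS upload_log (
    id          BIGSERIAL PRIMARY KEY,
    user_id     TEXT NOT NULL,
    filename    TEXT NOT NULL,
    uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""

# Example cache lookup for the last hour; %s placeholders follow psycopg style.
CACHE_LOOKUP_SQL = """
SELECT response FROM query_cache
WHERE user_id = %s AND query = %s
  AND created_at > now() - INTERVAL '1 hour'
ORDER BY created_at DESC
LIMIT 1;
"""
```

If a different embedding model is used, EMBEDDING_DIM has to change with it, because a PGVector column rejects vectors of the wrong length.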
Configure Parameters
Adjust:
Chunk size (e.g., 1000)
Overlap (e.g., 200)
TopK (e.g., 5)
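To see how chunk size and overlap affect the number of embeddings generated, here is a rough estimate. It assumes fixed-size character chunking where each chunk starts chunk_size minus overlap characters after the previous one; the function name is hypothetical and a real splitter may produce slightly different counts.

```python
import math


def estimate_chunk_count(text_length: int, chunk_size: int = 1000, overlap: int = 200) -> int:
    """Estimate how many overlapping chunks a text of text_length characters yields."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1
    stride = chunk_size - overlap  # new characters covered by each extra chunk
    return math.ceil((text_length - overlap) / stride)
```

With the example values (1000 and 200), each additional chunk covers 800 new characters, so a 10,000-character document yields 13 chunks instead of the 10 it would produce with no overlap. A larger overlap improves continuity between chunks but increases the number of embedding calls and stored rows.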
Optional Enhancements
Add authentication layer
Add multi-tenant filtering (user_id)
Use Cases
AI document search systems
Internal knowledge base assistants
Customer support knowledge retrieval
Legal or compliance document analysis
SaaS AI chat with custom data
Requirements
OpenAI API key
Postgres database with PGVector
n8n instance (cloud or self-hosted)
Key Features
Full RAG architecture (upload + query)
PDF document ingestion pipeline
Semantic search with vector embeddings
Context-aware AI responses
Query caching for performance optimization
Per-user metadata (user_id) that supports optional multi-user filtering
Scalable and modular design
Summary
A RAG workflow that covers document ingestion, semantic search, and question answering. It combines a PGVector vector store, OpenAI embedding and chat models, and a one-hour query cache to answer questions from uploaded documents while avoiding repeated retrieval and generation for identical queries.