Build an OpenAI RAG system with document upload, semantic search and caching

Built by

ResilNext

Created on June 07, 2026

Description

Overview
This workflow implements a complete Retrieval-Augmented Generation (RAG) system for document ingestion and intelligent querying.

It lets users upload documents, convert them into vector embeddings, and query them in natural language. For each query, the system retrieves the most relevant document chunks and passes them to the model as context, so answers are grounded in the uploaded content. Responses are cached so that repeated queries skip retrieval and generation, which reduces latency and API cost.

This workflow is ideal for building AI knowledge bases, document assistants, and internal search systems.

How It Works

Input & Configuration
Receives requests via webhook (rag-system)
Supports two actions:
upload → process documents
query → answer questions
Defines:
Chunk size & overlap
TopK retrieval count
Database table names
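
As a rough illustration, the configuration values listed above could be grouped as follows. This is a minimal sketch, not the workflow's actual node configuration: the class name RagConfig and its field names are hypothetical, the defaults are taken from the example values and table names given under Setup Instructions, and the validation rule is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical container for the workflow's tunable settings."""

    chunk_size: int = 1000        # characters per chunk (example value from the setup notes)
    chunk_overlap: int = 200      # characters shared between neighbouring chunks
    top_k: int = 5                # number of chunks retrieved per query
    documents_table: str = "documents"
    cache_table: str = "query_cache"
    upload_log_table: str = "upload_log"

    def __post_init__(self) -> None:
        # Assumed sanity checks: overlap must be smaller than the chunk size,
        # otherwise the splitter could never advance through the text.
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")
        if self.top_k <= 0:
            raise ValueError("top_k must be positive")
```

Keeping these values in one place makes it easier to tune chunking and retrieval without touching the rest of the flow.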

Document Upload Flow

Text Extraction
Extracts text from uploaded PDF documents

Text Chunking
Splits text into overlapping chunks for better retrieval accuracy
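
The idea of overlapping chunks can be sketched with a plain sliding window. Note that this is a simplification: the workflow uses the Recursive Character Text Splitter node, which also tries to break on natural separators, whereas the hypothetical chunk_text function below cuts at fixed character offsets.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With a chunk size of 1000 and an overlap of 200, each chunk starts 800 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.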

Document Structuring
Converts chunks into structured documents

Embedding Generation
Generates vector embeddings using OpenAI

Vector Storage
Stores embeddings in PGVector (Postgres)

Upload Logging
Logs document metadata (user, filename, timestamp)

Response
Returns success message via webhook

Query Flow

Cache Check
Checks whether a result for the same query was cached within the last hour

Cache Routing
If cached → return cached response
If not → proceed to retrieval

Cache Hit Flow

Format Cached Response
Standardizes cached output format

Respond to User
Returns cached answer with cached: true

Cache Miss Flow

Vector Retrieval
Retrieves top relevant document chunks from PGVector
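
Conceptually, top-K retrieval ranks stored chunks by similarity to the query embedding. The sketch below does this in memory with cosine similarity. It is an assumption that cosine is the metric in use (PGVector also supports other distances), and the function names and the chunk dictionary layout are hypothetical. In the workflow itself the ranking happens inside Postgres.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding: list[float], chunks: list[dict], k: int = 5) -> list[dict]:
    """Return the k chunks whose embeddings are most similar to the query.

    Each chunk is assumed to look like {"text": str, "embedding": list[float]}.
    """
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk["embedding"]),
        reverse=True,  # most similar first
    )
    return ranked[:max(k, 0)]
```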

AI Answer Generation
Passes the retrieved chunks to the LLM as context
Generates an answer grounded in that context
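
One generic way to give a model retrieved context is to place the chunks in the prompt ahead of the question. The sketch below shows that pattern only. In this workflow the AI Agent node calls the Vector Store Question Answer Tool, so the actual prompt is assembled by those nodes. The function name build_prompt and the instruction wording are hypothetical.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the given chunks only."""
    # Number the chunks so the model (and a human reader) can refer to them.
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```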

Cache Storage
Saves the query and response in the query_cache table for reuse

Response
Returns generated answer with cached: false
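
The query flow above, from cache check to response, can be sketched as follows. An in-memory dictionary stands in for the query_cache table, and the generate callback stands in for the retrieval and LLM steps. The names QueryCache, cache_key and answer_query are hypothetical, and keying the cache on user_id plus a normalized query string is an assumption: the description does not say how cached queries are matched.

```python
import hashlib
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 3600  # cached answers are reused for one hour


def cache_key(user_id: str, query: str) -> str:
    """Hash the user and a whitespace/case-normalized query into a lookup key."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(f"{user_id}\n{normalized}".encode("utf-8")).hexdigest()


class QueryCache:
    """In-memory stand-in for the query_cache table."""

    def __init__(self, ttl_seconds: float = CACHE_TTL_SECONDS) -> None:
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, user_id: str, query: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self._entries.get(cache_key(user_id, query))
        if entry is None:
            return None
        stored_at, answer = entry
        # Entries older than the TTL count as a miss.
        return answer if now - stored_at <= self._ttl else None

    def put(self, user_id: str, query: str, answer: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries[cache_key(user_id, query)] = (now, answer)


def answer_query(cache: QueryCache, user_id: str, query: str,
                 generate: Callable[[str], str], now: Optional[float] = None) -> dict:
    """Return a cached answer if one is fresh, otherwise generate and store one."""
    cached = cache.get(user_id, query, now)
    if cached is not None:
        return {"answer": cached, "cached": True}
    answer = generate(query)  # retrieval + LLM call in the real workflow
    cache.put(user_id, query, answer, now)
    return {"answer": answer, "cached": False}
```

The optional now argument exists only to make the expiry behaviour easy to exercise without waiting an hour.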

Setup Instructions

Webhook Setup
Configure endpoint (rag-system)
Send payload with:
action: upload or query
user_id
document (for upload) or query (for query)
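
A minimal sketch of how the payload fields above might be checked before routing. The function name validate_payload and the error messages are hypothetical, and the rule that document is required for upload and query for query is inferred from the field list rather than stated by the workflow.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems with a webhook payload; empty means it looks valid."""
    errors: list[str] = []
    action = payload.get("action")

    if action not in ("upload", "query"):
        errors.append("action must be 'upload' or 'query'")
    if not payload.get("user_id"):
        errors.append("user_id is required")
    # Each action needs its own content field.
    if action == "upload" and not payload.get("document"):
        errors.append("document is required for upload")
    if action == "query" and not payload.get("query"):
        errors.append("query is required for query")
    return errors
```

For example, a query request could carry {"action": "query", "user_id": "u1", "query": "What does the contract say about renewal?"}, where the values shown are made up for illustration.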

OpenAI Setup
Add API credentials for:
Embeddings
Chat model

Postgres + PGVector
Enable PGVector extension
Create tables:
documents
query_cache
upload_log
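
The statements below sketch one possible schema for these tables, held as Python strings so they can be run with any Postgres driver. Only the table names come from the workflow. Every column name is an assumption, as is the embedding dimension of 1536, which depends on the embedding model selected. The vector store node may expect or create a different layout for the documents table, so treat this as a starting point rather than the workflow's actual schema.

```python
EMBEDDING_DIM = 1536  # assumption: must match the embedding model in use

SCHEMA_STATEMENTS = [
    # Enables the vector column type provided by PGVector.
    "CREATE EXTENSION IF NOT EXISTS vector",
    # Chunk text, metadata and embedding (hypothetical columns).
    f"""CREATE TABLE IF NOT EXISTS documents (
        id BIGSERIAL PRIMARY KEY,
        text TEXT NOT NULL,
        metadata JSONB NOT NULL DEFAULT '{{}}',
        embedding vector({EMBEDDING_DIM})
    )""",
    # Cached answers, looked up by query and filtered by age.
    """CREATE TABLE IF NOT EXISTS query_cache (
        id BIGSERIAL PRIMARY KEY,
        user_id TEXT,
        query TEXT NOT NULL,
        response TEXT NOT NULL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    )""",
    # One row per upload: user, filename, timestamp.
    """CREATE TABLE IF NOT EXISTS upload_log (
        id BIGSERIAL PRIMARY KEY,
        user_id TEXT,
        filename TEXT NOT NULL,
        uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now()
    )""",
]

# Hypothetical lookup matching the one-hour cache window (%s is a driver placeholder).
CACHE_LOOKUP_SQL = (
    "SELECT response FROM query_cache "
    "WHERE query = %s AND created_at > now() - interval '1 hour' "
    "ORDER BY created_at DESC LIMIT 1"
)
```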

Configure Parameters
Adjust:
Chunk size (e.g., 1000)
Overlap (e.g., 200)
TopK (e.g., 5)

Optional Enhancements
Add authentication layer
Add multi-tenant filtering (user_id)
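
Both enhancements can be outlined in a few lines. This is a sketch under stated assumptions: the shared-secret check is just one simple form of authentication, the function names are hypothetical, and the filter assumes each stored chunk carries its owner's user_id in a metadata dictionary.

```python
import hmac


def is_authorized(provided_key: str, expected_key: str) -> bool:
    """Compare a caller-supplied key with the expected one in constant time."""
    return hmac.compare_digest(provided_key.encode("utf-8"), expected_key.encode("utf-8"))


def filter_by_user(chunks: list[dict], user_id: str) -> list[dict]:
    """Keep only chunks whose metadata marks them as belonging to user_id."""
    return [
        chunk for chunk in chunks
        if chunk.get("metadata", {}).get("user_id") == user_id
    ]
```

Applying the user filter before ranking keeps one tenant's documents from appearing in another tenant's results.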

Use Cases

AI document search systems
Internal knowledge base assistants
Customer support knowledge retrieval
Legal or compliance document analysis
SaaS AI chat with custom data

Requirements

OpenAI API key
Postgres database with PGVector
n8n instance (cloud or self-hosted)

Key Features

Full RAG architecture (upload + query)
PDF document ingestion pipeline
Semantic search with vector embeddings
Context-aware AI responses
Query caching for performance optimization
Multi-user support via metadata filtering (optional, by user_id)
Scalable and modular design

Summary

A complete RAG-based AI system that enables document ingestion, semantic search, and intelligent query answering. It combines vector databases, LLMs, and caching to deliver fast, accurate, and scalable AI-powered knowledge retrieval.

Nodes Used (8)

AI Agent

@n8n/n8n-nodes-langchain.agent

Default Data Loader

@n8n/n8n-nodes-langchain.documentDefaultDataLoader

Embeddings OpenAI

@n8n/n8n-nodes-langchain.embeddingsOpenAi

OpenAI Chat Model

@n8n/n8n-nodes-langchain.lmChatOpenAi

Postgres

n8n-nodes-base.postgres

Postgres PGVector Store

@n8n/n8n-nodes-langchain.vectorStorePGVector

Recursive Character Text Splitter

@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter

Vector Store Question Answer Tool

@n8n/n8n-nodes-langchain.toolVectorStore
