Evaluate RAG Response Accuracy with OpenAI: Document Groundedness Metric

Built by

Jimleuk

Created on June 05, 2026

Description

This n8n template demonstrates how to calculate the evaluation metric "RAG document groundedness", which in this scenario measures the agent's ability to provide or reference only information included in the retrieved vector store documents.

The scoring approach is adapted from https://cloud.google.com/vertex-ai/generative-ai/docs/models/metrics-templates#pointwise_groundedness

How it works
This evaluation works best for an agent that requires document retrieval from a vector store or similar source.
For our scoring, we collect the agent's response and the retrieved documents, then use an LLM to assess whether the former is based on the latter.
A key factor is to look out for information in the response that is not mentioned in the documents.
A high score indicates that the LLM adheres to and aligns with the retrieved documents, whereas a low score could signal an inadequate prompt or model hallucination.
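
The scoring step described above can be sketched in Python as follows. This is only an illustration and not the template's actual implementation, which is built from n8n nodes. The prompt wording, the 0/1 score scale, the `score` and `reasoning` JSON fields, and the `judge` callable (a stand-in for whatever LLM call you use) are all assumptions made for the example.

```python
import json
from typing import Callable, Sequence

# Hypothetical judge prompt; the wording is an assumption, not the template's prompt.
PROMPT_TEMPLATE = """You are grading whether a response is grounded in the documents.
Score 1 if every claim in the response is supported by the documents.
Score 0 if the response contains information not found in the documents.
Reply with JSON only: {{"score": 0 or 1, "reasoning": "<short explanation>"}}

## Documents
{context}

## Response
{response}
"""


def build_groundedness_prompt(response: str, documents: Sequence[str]) -> str:
    """Combine the retrieved documents and the agent's response into one judge prompt."""
    context = "\n\n".join(
        f"[Document {i}]\n{doc}" for i, doc in enumerate(documents, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, response=response)


def parse_judgement(raw: str) -> dict:
    """Validate the judge's JSON reply (assumed shape: score in {0, 1} plus reasoning)."""
    data = json.loads(raw)
    score = data.get("score")
    if score not in (0, 1):
        raise ValueError(f"unexpected score: {score!r}")
    return {"score": int(score), "reasoning": str(data.get("reasoning", ""))}


def score_groundedness(
    response: str,
    documents: Sequence[str],
    judge: Callable[[str], str],
) -> dict:
    """Ask the judge (any function mapping a prompt to raw LLM text) for a score."""
    return parse_judgement(judge(build_groundedness_prompt(response, documents)))
```

Passing the LLM call in as `judge` keeps the sketch independent of any particular client library, so the scoring logic can be exercised with a stub before wiring in a real model.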

Requirements
n8n version 1.94+
Check out this Google Sheet for sample data: https://docs.google.com/spreadsheets/d/1YOnu2JJjlxd787AuYcg-wKbkjyjyZFgASYVV0jsij5Y/edit?usp=sharing

Nodes Used (10)

AI Agent

@n8n/n8n-nodes-langchain.agent

Basic LLM Chain

@n8n/n8n-nodes-langchain.chainLlm

Default Data Loader

@n8n/n8n-nodes-langchain.documentDefaultDataLoader

Embeddings OpenAI

@n8n/n8n-nodes-langchain.embeddingsOpenAi

Evaluation

n8n-nodes-base.evaluation

HTTP Request

n8n-nodes-base.httpRequest

OpenAI Chat Model

@n8n/n8n-nodes-langchain.lmChatOpenAi

Recursive Character Text Splitter

@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter

Simple Vector Store

@n8n/n8n-nodes-langchain.vectorStoreInMemory

Structured Output Parser

@n8n/n8n-nodes-langchain.outputParserStructured
