Extract structured invoice JSON from PDFs with Mistral OCR and an LLM API

Built by Venkata V
Created on June 05, 2026

Description

An n8n workflow that exposes an API for turning unstructured invoice PDFs into JSON output, using OCR for text recognition and an LLM for data extraction.

What this workflow does
Accepts a PDF or image upload via Webhook as binary property "data"
Runs OCR with the Mistral OCR node
Normalizes OCR text
Sends OCR text to an LLM to extract structured JSON
Cleans and normalizes the JSON
Returns either:
status: ok
status: review_needed
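
The text-normalization and JSON-cleaning steps above could look roughly like the following Python sketch. It is illustrative only: the workflow's actual Code nodes are not shown here, and the function names and sample field names (invoice_number, total) are hypothetical.

```python
import json
import re


def normalize_ocr_text(text: str) -> str:
    """Collapse repeated spaces/tabs and excess blank lines in OCR output."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)       # runs of spaces/tabs -> one space
    text = re.sub(r" ?\n ?", "\n", text)      # drop spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)    # at most one blank line in a row
    return text.strip()


def parse_llm_json(reply: str) -> dict:
    """Parse the JSON object in an LLM reply, tolerating code fences or chatter.

    Assumption: the reply contains exactly one top-level JSON object.
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM reply")
    return json.loads(reply[start:end + 1])
```

Slicing from the first "{" to the last "}" is a deliberately simple way to discard Markdown fences or stray commentary that some models wrap around their JSON; a stricter parser may be preferable in production.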

Setup
Import the workflow JSON into n8n
Create/attach Mistral AI credentials on the "Mistral OCR" node
Create/attach credentials for your chosen LLM on the OCR-text-to-JSON conversion node
Activate the workflow
POST a file to:
/webhook/ocr-to-json
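
As a rough sketch of the final step, the following Python builds a multipart upload for the webhook path above using only the standard library. It assumes the Webhook node maps the multipart field named "data" to the binary property "data"; the helper name build_upload_request and the base URL are hypothetical and should be replaced with your own n8n host.

```python
import urllib.request
import uuid


def build_upload_request(base_url: str, filename: str, content: bytes,
                         content_type: str = "application/pdf") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one file in the field "data"."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/webhook/ocr-to-json",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Usage (assumed local n8n instance; adjust host and file path):
# with open("invoice.pdf", "rb") as f:
#     req = build_upload_request("http://localhost:5678", "invoice.pdf", f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```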

Notes
This starter is tuned for invoices/documents but can be adapted for receipts, purchase orders, or forms.
Depending on your installed n8n version, the Mistral node parameter names may need minor adjustment after import.
The workflow returns review_needed when confidence is below 0.5.
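
The threshold rule can be expressed in a few lines. This is a minimal sketch, not the workflow's actual code: it assumes the extracted JSON carries a numeric "confidence" field, treats a missing or invalid value as 0, and uses a hypothetical response shape.

```python
REVIEW_THRESHOLD = 0.5  # threshold stated in the note above


def build_response(extracted: dict) -> dict:
    """Return status "review_needed" when confidence is below the threshold."""
    try:
        confidence = float(extracted.get("confidence"))
    except (TypeError, ValueError):
        # Assumption: a missing or non-numeric confidence should trigger review.
        confidence = 0.0
    status = "review_needed" if confidence < REVIEW_THRESHOLD else "ok"
    return {"status": status, "confidence": confidence, "data": extracted}
```

Under this reading, a confidence of exactly 0.5 returns ok, since only values strictly below 0.5 are flagged.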

Nodes Used (3)

Code (n8n-nodes-base.code)
HTTP Request (n8n-nodes-base.httpRequest)
Mistral AI (n8n-nodes-base.mistralAi)