Extract invoice data to Excel using Google Gemini, OCR, and Telegram
🚀 1. HOW IT WORKS
This workflow automatically extracts structured data from invoices sent via Telegram (PDF or image) and saves it to Excel.
A user sends an invoice to a Telegram bot
The workflow detects the file type (PDF or image)
For PDFs: extracts text directly from the file, falling back to OCR if needed
For images: uses OCR to extract text
The extracted text is cleaned and processed
AI (Google Gemini) converts the raw text into structured JSON data
The data is validated and formatted
Valid data is saved to Excel (Microsoft Excel or Google Sheets)
A confirmation message is sent back via Telegram
This eliminates manual data entry and speeds up invoice processing.
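The file-type branching and the PDF-to-OCR fallback described above can be sketched in plain Python, outside n8n. This is a minimal illustration, not the workflow's actual node logic: the function names, the magic-byte check, and the 20-character threshold for deciding that a PDF has no usable text layer are all assumptions, and `pdf_text` and `ocr` are hypothetical callables standing in for whatever text extractor and OCR engine you use.

```python
from typing import Callable

# Assumed threshold: below this many characters, treat the PDF as scanned.
MIN_TEXT_CHARS = 20


def detect_file_type(data: bytes) -> str:
    """Classify a file as 'pdf', 'image', or 'unknown' from its leading bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):  # PNG signature
        return "image"
    if data.startswith(b"\xff\xd8\xff"):  # JPEG start-of-image marker
        return "image"
    return "unknown"


def extract_text(
    data: bytes,
    pdf_text: Callable[[bytes], str],  # hypothetical direct PDF text extractor
    ocr: Callable[[bytes], str],       # hypothetical OCR engine wrapper
) -> str:
    """Route the file to direct extraction or OCR, with OCR as the PDF fallback."""
    kind = detect_file_type(data)
    if kind == "pdf":
        text = pdf_text(data).strip()
        if len(text) >= MIN_TEXT_CHARS:
            return text
        # Little or no embedded text: the PDF is probably a scan.
        return ocr(data).strip()
    if kind == "image":
        return ocr(data).strip()
    raise ValueError("unsupported file type: expected a PDF, PNG, or JPEG")
```

Passing the extractors in as arguments keeps the routing logic independent of any particular OCR library, which mirrors the choice between a Tesseract node and an external OCR API in the setup steps.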
⚙️ 2. SETUP INSTRUCTIONS
Prerequisites:
Telegram Bot
Create a bot using BotFather
Copy the Bot Token
Google Gemini API Key
Get an API key from Google AI Studio
Excel / Google Sheets
Prepare a sheet with columns:
invoice_number, date, vendor, total, tax, items
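The Output Example section names the total field `total_amount`, while the column list above uses `total`, so writing a row needs one explicit rename. The following is a minimal sketch of that mapping in plain Python, not the workflow's own node configuration: `COLUMNS`, `FIELD_FOR_COLUMN`, and `to_row` are hypothetical names, and storing the `items` list as a JSON string in a single cell is an assumption about how you want the sheet laid out.

```python
import json

# Sheet columns, in the order listed in the prerequisites.
COLUMNS = ["invoice_number", "date", "vendor", "total", "tax", "items"]

# Columns whose name differs from the JSON field that feeds them.
FIELD_FOR_COLUMN = {"total": "total_amount"}


def to_row(record: dict) -> list:
    """Build one sheet row from an extracted invoice record."""
    row = []
    for column in COLUMNS:
        field = FIELD_FOR_COLUMN.get(column, column)
        value = record.get(field, "")  # missing fields become empty cells
        if isinstance(value, (list, dict)):
            # Assumption: nested data is stored as a JSON string in one cell.
            value = json.dumps(value, ensure_ascii=False)
        row.append(value)
    return row
```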
Setup Steps:
Configure Telegram Trigger Node
Paste your Bot Token
Configure File Download
Ensure the downloaded file is passed to the next node as binary data
Configure OCR
Use a Tesseract OCR node or an external OCR API
Configure Google Gemini Node
Add your API key
Use the provided prompt for structured extraction
Configure Excel / Google Sheets Node
Connect your account
Map each extracted field to its sheet column (for example, total_amount to total)
Test Workflow
Send a sample invoice via Telegram
Activate Workflow
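When testing the Gemini step, it helps to be tolerant of replies that are not bare JSON. Language models sometimes wrap the object in a Markdown code fence or add a sentence around it, so a parsing step can cut out the outermost braces before decoding. The sketch below shows one way to do this in plain Python. `parse_model_json` is a hypothetical helper, not part of n8n or the Gemini API, and it assumes the reply contains exactly one top-level JSON object.

```python
import json


def parse_model_json(reply: str) -> dict:
    """Extract and decode the single JSON object contained in a model reply."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in the model reply")
    # Everything outside the outermost braces (fences, prose) is discarded.
    data = json.loads(reply[start : end + 1])
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

A decoding failure raises `ValueError` (`json.JSONDecodeError` is a subclass), so a caller can catch one exception type and send an error message back to the Telegram chat instead of writing a bad row.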
🛠 Requirements
n8n (self-hosted or cloud)
Telegram Bot Token
Google Gemini API Key
Excel 365 or Google Sheets
📂 Output Example
{
  "invoice_number": "INV-001",
  "date": "2025-01-01",
  "vendor": "ABC Company",
  "total_amount": "100000",
  "tax": "10000",
  "items": [
    {
      "name": "Service A",
      "quantity": "1",
      "price": "100000"
    }
  ]
}
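The "validated and formatted" step can be illustrated against the shape of this output. The description does not spell out the workflow's validation rules, so the checks below are assumptions: every top-level field is required, `date` must be an ISO date, `total_amount` and `tax` must parse as non-negative numbers, and each item needs `name`, `quantity`, and `price`. All function and constant names are hypothetical.

```python
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Optional

REQUIRED_FIELDS = ("invoice_number", "date", "vendor", "total_amount", "tax", "items")
ITEM_FIELDS = ("name", "quantity", "price")


def amount_error(label: str, value: object) -> Optional[str]:
    """Return an error message if value is not a non-negative number, else None."""
    try:
        amount = Decimal(str(value))
    except InvalidOperation:
        return f"{label} is not a number"
    if not amount.is_finite() or amount < 0:
        return f"{label} must be a finite, non-negative number"
    return None


def validate_invoice(record: dict) -> list:
    """Return a list of problems found in the record; an empty list means valid."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if record.get(name) in (None, "")
    ]
    if errors:
        return errors  # later checks assume every field is present
    try:
        date.fromisoformat(record["date"])
    except (TypeError, ValueError):
        errors.append("date is not a valid ISO date (expected YYYY-MM-DD)")
    for name in ("total_amount", "tax"):
        problem = amount_error(name, record[name])
        if problem:
            errors.append(problem)
    items = record["items"]
    if not isinstance(items, list) or not items:
        errors.append("items must be a non-empty list")
        return errors
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"items[{index}] is not an object")
            continue
        for name in ITEM_FIELDS:
            if item.get(name) in (None, ""):
                errors.append(f"items[{index}] is missing {name}")
    return errors
```

Returning a list of messages rather than raising on the first problem makes it possible to report every issue in a single Telegram reply.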
🚀 Use Cases
Invoice automation
Accounts payable automation
Financial data entry automation
Document digitization
💡 Notes
Works best with clear, high-quality images, because OCR accuracy depends on image quality
Using Gemini to structure the OCR text improves extraction accuracy compared with relying on raw OCR output alone