Extract Arabic Text from PDFs with Mistral OCR, Telegram Bot & Google Docs

Built by Abdulrahman Alhalabi

Created on June 09, 2026

Description

Arabic OCR Telegram Bot

How it Works

Receive PDF Files - Users send PDF documents via Telegram to the bot
OCR Processing - Mistral AI's OCR service extracts Arabic text from document pages
Text Organization - The extracted content is ordered by page and formatted with page numbers
Create Google Doc - Generates a formatted document with all extracted text
Deliver Results - Sends users a clickable link to their processed document
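The OCR and text-organization steps above can be sketched as follows. This is a minimal illustration, not the workflow's actual Code node: the endpoint, the model name, and the `pages`, `index`, and `markdown` response fields reflect Mistral's OCR API as commonly documented and should be checked against the current reference. `build_ocr_request` and `organize_pages` are hypothetical helper names, and the plain "Page N" header is a placeholder for whatever format the workflow really uses.

```python
from typing import Any

# Assumed endpoint and model name; confirm both against Mistral's current docs.
MISTRAL_OCR_URL = "https://api.mistral.ai/v1/ocr"
MISTRAL_OCR_MODEL = "mistral-ocr-latest"


def build_ocr_request(pdf_url: str) -> dict[str, Any]:
    """Build the JSON body for an OCR call on a PDF reachable by URL."""
    return {
        "model": MISTRAL_OCR_MODEL,
        "document": {"type": "document_url", "document_url": pdf_url},
    }


def organize_pages(ocr_response: dict[str, Any]) -> str:
    """Join per-page text in page order, each under a numbered header."""
    # Sort by the reported index, then number by position so the result
    # does not depend on whether the API counts pages from 0 or 1.
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p["index"])
    sections = []
    for number, page in enumerate(pages, start=1):
        text = page.get("markdown", "").strip()
        sections.append(f"--- Page {number} ---\n{text}")
    return "\n\n".join(sections)
```

In the workflow, the HTTP Request node would POST the body from `build_ocr_request` to the OCR endpoint with the API key as a bearer token, and the string returned by `organize_pages` is what would be written into the Google Doc.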

Set up Steps

Setup Time: ~20 minutes

Create Telegram Bot - Get bot token from @BotFather on Telegram
Configure APIs - Set up Mistral AI OCR and Google Docs API credentials
Set Folder Permissions - Create a Google Drive folder for storing results and make sure the Google account used by the workflow can write to it
Test Bot - Send a sample Arabic PDF to verify OCR accuracy
Deploy Webhook - Activate the Telegram webhook for real-time processing
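When the workflow is activated, n8n's Telegram trigger typically registers the webhook for you, so no manual call should be needed. For checking the token or debugging by hand, the sketch below builds the underlying Telegram Bot API URLs. The helper names are hypothetical, and the token and webhook address in the usage note are placeholders.

```python
from urllib.parse import urlencode

TELEGRAM_API = "https://api.telegram.org"


def set_webhook_url(bot_token: str, webhook_url: str) -> str:
    """URL that registers webhook_url as the bot's update endpoint."""
    query = urlencode({"url": webhook_url})
    return f"{TELEGRAM_API}/bot{bot_token}/setWebhook?{query}"


def file_download_url(bot_token: str, file_path: str) -> str:
    """URL for downloading a file once getFile has returned its file_path."""
    return f"{TELEGRAM_API}/file/bot{bot_token}/{file_path}"
```

For example, `set_webhook_url("<token>", "https://your-n8n-host/webhook/...")` gives a URL you can open once to point the bot at a public n8n instance. Telegram requires the webhook address to be served over HTTPS.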

Detailed API configuration and Arabic text handling notes are included as sticky notes within the workflow.

What You'll Need:
Telegram Bot Token (free from @BotFather)
Mistral AI API key (OCR service)
Google Docs/Drive API credentials
Google Drive folder for document storage
Sample Arabic PDF files for testing

Key Features:
Real-time progress updates (5-step process notifications)
Automatic page numbering in Arabic
Direct Google Docs integration
Error handling for non-PDF files
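Two of these features, Arabic page numbering and rejecting non-PDF uploads, can be illustrated with a short sketch. The label wording ("صفحة" followed by Arabic-Indic digits) is an assumption about the output format, and `arabic_page_label` and `is_pdf` are hypothetical helpers rather than code taken from the workflow. The `mime_type` and `file_name` keys are the fields Telegram includes on a document attachment.

```python
# Translation table from ASCII digits to Arabic-Indic digits (U+0660 to U+0669).
ARABIC_INDIC_DIGITS = str.maketrans("0123456789", "٠١٢٣٤٥٦٧٨٩")


def arabic_page_label(page_number: int) -> str:
    """Return an Arabic page label with Arabic-Indic digits."""
    return "صفحة " + str(page_number).translate(ARABIC_INDIC_DIGITS)


def is_pdf(document: dict) -> bool:
    """Accept a Telegram document object only if it looks like a PDF."""
    # Either field may be missing, so fall back to an empty string.
    mime = (document.get("mime_type") or "").lower()
    name = (document.get("file_name") or "").lower()
    return mime == "application/pdf" or name.endswith(".pdf")
```

A bot would call `is_pdf` on the incoming attachment first and reply with an error message when it returns `False`, so that no OCR request is spent on unsupported files.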

Nodes Used (4)

Code

n8n-nodes-base.code

Google Docs

n8n-nodes-base.googleDocs

HTTP Request

n8n-nodes-base.httpRequest

Telegram

n8n-nodes-base.telegram
