Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini

Go to Workflow
0 views
Built by Ravi Patel Ravi Patel
Created on June 15, 2026

Description

Quick Overview
This workflow manually runs to read a list of webpage URLs from Google Sheets, scrape each page with ScrapingBee, and use Google Gemini to extract structured product data from screenshots with an HTML fallback, then append the results back into a Google Sheets sheet.

How it works
Runs when you manually trigger the workflow.
Reads the list of URLs to scrape from a Google Sheets spreadsheet.
Fetches a full-page screenshot for each URL using the ScrapingBee API.
Sends the screenshot (and the URL for context) to a Google Gemini model to extract product details into a structured JSON format, calling a ScrapingBee HTML fetch tool when screenshot extraction is incomplete.
Converts any fetched HTML to Markdown and returns it to the Gemini agent to complete the extraction.
Splits the extracted product array into individual items and appends them as new rows in the Google Sheets “Results” sheet.

Setup
Create a Google Sheets service account connection in n8n and set the target spreadsheet and the “List of URLs” and “Results” sheet selections.
Add a Google Gemini (PaLM) API credential in n8n and ensure the selected model (gemini-1.5-pro-latest) is available for your account.
Add your ScrapingBee API key in both ScrapingBee HTTP requests (screenshot and HTML) and confirm the target URLs are reachable from your n8n environment.
Ensure the “Results” sheet columns match the structured output fields (for example: product_title, product_price, product_brand, promo, promo_percentage/promo_percent) or update the output schema and column mappings accordingly.

Nodes Used (6)

AI Agent
@n8n/n8n-nodes-langchain.agent
Call n8n Workflow Tool
@n8n/n8n-nodes-langchain.toolWorkflow
Google Gemini Chat Model
@n8n/n8n-nodes-langchain.lmChatGoogleGemini
Google Sheets
n8n-nodes-base.googleSheets
HTTP Request
n8n-nodes-base.httpRequest
Structured Output Parser
@n8n/n8n-nodes-langchain.outputParserStructured