LOW / CODEBlueprints

operationsai

AI Document Data Extractor

Extract structured data from PDFs, invoices, or contracts using Claude AI. The workflow monitors a Gmail inbox for attachments, sends document content to Claude for extraction, and populates a Google Sheet with the structured results.

Intermediate~25 minn8nMake.com 561 downloads 1937 views

Workflow Flow

New Email with Attachment

Extract Document Text

Claude AI Extracts Data

Populate Google Sheet

Notify on Slack

Setup Instructions

1. Set up a Gmail filter or label (e.g., "Documents to Process") for emails containing attachments you want to extract data from. 2. In your automation platform, create a workflow triggered by "New Email" in Gmail matching your label or filter. 3. Add a Gmail node to download the attachment. For PDFs, add a text extraction step (many platforms have built-in PDF-to-text, or use a dedicated service). 4. Add an HTTP Request node to call the Claude API (POST https://api.anthropic.com/v1/messages). Set headers: x-api-key to your Anthropic API key, anthropic-version to 2023-06-01, content-type to application/json. 5. Set the model to "claude-sonnet-4-20250514" and max_tokens to 1024. Write a prompt that describes the exact fields you want extracted (e.g., vendor_name, invoice_number, date, line_items, total_amount, due_date). Ask Claude to return a JSON object with these fields. Include an example of the expected output format in the prompt. 6. Add a Google Sheets node to append a new row with the extracted data. Map each JSON field to the corresponding column in your sheet. 7. Add a Slack notification to confirm the extraction was successful, including the document name and key extracted values. 8. Test with a sample invoice email. Verify the extracted data appears correctly in Google Sheets. Adjust the prompt if fields are missing or incorrectly parsed.

Troubleshooting

**Claude misidentifies fields in documents:** Be very specific in your prompt about field names and formats. Include an example of expected output. For invoices, specify date format (YYYY-MM-DD), currency format, and how to handle multiple line items. **PDF text extraction produces garbled output:** Some PDFs are image-based (scanned documents). These require OCR before Claude can process them. Add a Google Cloud Vision or Tesseract OCR step before the Claude API call. **Google Sheets row has wrong column mapping:** Double-check that the JSON field names match your column headers exactly. Use a Function node to explicitly map fields to columns in the correct order before the Sheets append step. **Token limit exceeded on large documents:** Invoices are usually short, but contracts can be very long. For large documents, extract only the relevant pages or sections before sending to Claude. Summarize lengthy clauses instead of sending raw text.

Further Reading

Blueprint guideHow to Use AI to Extract Structured Data From Any DocumentRead the guide →Related readingBest AI Tools for Document Processing and Data ExtractionRead the guide →Related readingHow to Use AI to Extract Data From Invoices Without Manual EntryRead the guide →

Related Blueprints

Multi-Step Approval Workflow

Route approval requests through a sequential chain of approvers via Slack, log decisions in Google Sheets, and send final confirmation emails.

slackgmailgoogle-sheets

Advanced~40 min

Inventory Low Stock Alerts

Monitor inventory levels in Google Sheets and automatically send Slack and email alerts when any item drops below its reorder threshold.

google-sheetsslackgmail

Beginner~15 min

AI Personalized Sales Email Drafter

Generate hyper-personalized outreach emails using Claude AI based on prospect data from Google Sheets. Each email is tailored to the prospect's role, company, and pain points before being drafted in Gmail.

gmailgoogle-sheetsslack

Beginner~15 min

AI Deal Intelligence Briefing

Before a sales call, Claude AI pulls prospect info and recent context from Google Sheets, generates a comprehensive prep briefing with talking points, and posts it to Slack so your team walks into every call prepared.

google-sheetsslackgmail

Intermediate~20 min

Need a custom version?

We can build a tailored automation workflow for your specific needs.

New blueprints weekly

Get notified when we publish new automation workflows.