Extract Web Data With Local LLMs (Ollama + CrawlForge)
AI Engineering
CrawlForge Team
Engineering Team
May 24, 2026
9 min read

Quick Answer

extract_with_llm is a CrawlForge tool that runs LLM-powered web extraction against a local Ollama instance by default, with optional fallback to OpenAI or Anthropic. No external API key is required, your scraped data never leaves your machine, and per-extraction LLM cost drops from cents to zero. It costs 3 CrawlForge credits per call regardless of model.

A web extraction task should not require sending your scraped data to OpenAI. With extract_with_llm and a local Ollama install, structured extraction now runs entirely on your machine -- no API key, no per-token cost, no third party seeing your content. This post is the deep dive: why local matters, how to set it up, and when to use it versus the alternatives.

Table of Contents

  • What Is LLM Web Extraction?
  • Why Local Ollama (and Not OpenAI)?
  • Setup: Ollama + extract_with_llm in 5 Minutes
  • list_ollama_models: Discover What Is On Your Machine
  • Schema-Driven Extraction
  • When to Use Ollama vs OpenAI vs Anthropic
  • Cost Comparison
  • Limitations and Accuracy

What Is LLM Web Extraction?

LLM web extraction is the workflow where you hand a model two things -- the raw HTML of a page and a schema describing what you want -- and the model returns structured JSON. It is the modern replacement for hand-written CSS selectors. When a site's layout changes, the selectors break; the LLM adapts.
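To make the "HTML plus schema in, JSON out" idea concrete, here is a minimal sketch of the request you could send straight to a local Ollama server's /api/chat endpoint, which accepts a JSON Schema in its format field. This is not CrawlForge's internal implementation: the function names, the prompt wording, and the default model are assumptions made for illustration.

```typescript
// A JSON Schema object, kept loose on purpose for this sketch.
type JsonSchema = Record<string, unknown>;

interface OllamaChatRequest {
  model: string;
  stream: boolean;
  format: JsonSchema; // Ollama constrains the reply to this schema
  messages: { role: "system" | "user"; content: string }[];
}

// Build the body for POST /api/chat on a local Ollama server.
// The prompt text is illustrative, not CrawlForge's actual prompt.
function buildExtractionRequest(
  html: string,
  schema: JsonSchema,
  model: string = "llama3.1:8b",
): OllamaChatRequest {
  return {
    model,
    stream: false, // ask for one complete response instead of chunks
    format: schema,
    messages: [
      {
        role: "system",
        content: "Extract the requested fields from the HTML. Reply with JSON only.",
      },
      { role: "user", content: html },
    ],
  };
}

// The model's reply arrives as a JSON string in message.content.
function parseExtraction(content: string): unknown {
  return JSON.parse(content);
}
```

The selectors-versus-LLM difference shows up here: nothing in the request depends on where the title sits in the markup, only on what the schema asks for.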

CrawlForge has had this for a while as extract_structured (CSS-first, LLM fallback). The new extract_with_llm tool is LLM-first: every extraction is an LLM call. What changed in v4.2.2 is the default provider. Previously you needed an OpenAI or Anthropic key. Now you do not.

Why Local Ollama (and Not OpenAI)?

Three reasons, in descending order of importance for most teams.

1. Your scraped data stays on your machine. When you call OpenAI, the page content goes to OpenAI. For competitive intelligence, legal docs, customer data, or anything regulated, that is often a deal-breaker. Local Ollama means localhost only.

2. The LLM is free. A gpt-4o-mini call costs a fraction of a cent per extraction (roughly $0.0005 for a ~3K-token page). Sounds like nothing -- until you are running 10,000 extractions a day. Ollama is $0 plus electricity. You still pay 3 CrawlForge credits per call regardless of provider.

3. No new API key to manage. If you have Ollama, you have a provider. No signup, no billing alerts, no rotation.

The trade-off is model quality. llama3.1:8b on a laptop will not match gpt-4o on hard pages. For the 80% case (clean HTML, simple schema), it is fine. For the remaining 20% (heavily formatted or ambiguous pages), you can fall back to OpenAI or Anthropic in the same call.

Setup: Ollama + extract_with_llm in 5 Minutes

Bash

That is the Ollama side. Now from CrawlForge (via the CLI, the MCP server, or the API -- any path works):

Typescript
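As a rough sketch of the MCP path, the snippet below builds the JSON-RPC tools/call message that an MCP client sends to a server. The envelope follows the MCP specification; the argument names url, model, and schema are assumptions for illustration and should be checked against the CrawlForge API reference. The provider value "ollama" is the one named in this post.

```typescript
// JSON-RPC envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical arguments: only the tool name and `provider` come from the post.
const request = buildToolCall(1, "extract_with_llm", {
  url: "https://example.com/product/123", // assumed parameter name
  provider: "ollama",
  model: "llama3.1:8b", // assumed parameter name
  schema: {
    // assumed parameter name
    type: "object",
    properties: { title: { type: "string" }, price: { type: "number" } },
    required: ["title"],
  },
});
```

An MCP client library normally builds this envelope for you; seeing it spelled out mostly helps when debugging what actually goes over the wire.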

From the CLI:

Bash

list_ollama_models: Discover What Is On Your Machine

Before running a big batch, sanity-check what is actually installed:

Bash

Or via MCP/SDK:

Typescript

This costs 0 credits. It is a discovery helper, not an extraction call. Useful in onboarding flows, CI smoke tests ("is Ollama set up correctly?"), and as the first step in any local-LLM pipeline.
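If you want the same check without going through CrawlForge, Ollama exposes its installed models at GET /api/tags. The sketch below parses that response and tests for a specific model. The helper names and the injected getJson function are assumptions made so the example stays testable without a network; whether list_ollama_models uses this endpoint internally is not something this post confirms.

```typescript
// Shape of Ollama's GET /api/tags response (only the field used here).
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Minimal "fetch JSON" signature so the sketch can run without a network.
type GetJson = (url: string) => Promise<unknown>;

function modelNames(tags: OllamaTagsResponse): string[] {
  return tags.models.map((m) => m.name);
}

function hasModel(tags: OllamaTagsResponse, wanted: string): boolean {
  return modelNames(tags).some((n) => n === wanted);
}

// Ask a running Ollama server which models are installed.
async function listLocalModels(
  getJson: GetJson,
  baseUrl: string = "http://localhost:11434",
): Promise<string[]> {
  const body = (await getJson(baseUrl + "/api/tags")) as OllamaTagsResponse;
  return modelNames(body);
}
```

In a CI smoke test, you would call listLocalModels with a real HTTP helper and fail the job if the model your batch expects is not in the list.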

Schema-Driven Extraction

The schema is a standard JSON Schema object. The LLM reads it and produces matching output. Two patterns work well:

Pattern 1: Flat schema for simple data

Typescript
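A flat schema for a product page might look like the sketch below. The field names are hypothetical and chosen only to illustrate the pattern; the structure itself is plain JSON Schema.

```typescript
// Hypothetical flat schema: every field is a top-level scalar.
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    price: { type: "number" },
    currency: { type: "string" },
    inStock: { type: "boolean" },
  },
  required: ["title", "price"],
};

// The TypeScript shape a conforming result would have.
interface Product {
  title: string;
  price: number;
  currency?: string;
  inStock?: boolean;
}

// Example of a result that satisfies the schema (made-up values).
const exampleProduct: Product = { title: "Widget", price: 19.99, currency: "USD" };
```

Keeping only the truly necessary fields in required gives a small local model room to omit what it cannot find instead of inventing a value.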

Pattern 2: Nested schema for richer data

Typescript
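A nested schema adds objects and arrays of objects. The sketch below uses hypothetical field names for an article page and includes a small helper, countLeafFields (also an illustrative name), that counts how many scalar fields the model is actually being asked to fill.

```typescript
// Loose JSON Schema node type, enough for this sketch.
type SchemaNode = {
  type?: string;
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  required?: string[];
};

// Hypothetical nested schema: an author object plus an array of comments.
const articleSchema: SchemaNode = {
  type: "object",
  properties: {
    headline: { type: "string" },
    author: {
      type: "object",
      properties: { name: { type: "string" }, profileUrl: { type: "string" } },
      required: ["name"],
    },
    tags: { type: "array", items: { type: "string" } },
    comments: {
      type: "array",
      items: {
        type: "object",
        properties: { user: { type: "string" }, text: { type: "string" } },
        required: ["user", "text"],
      },
    },
  },
  required: ["headline", "author"],
};

// Count scalar leaves, descending into objects and array item schemas.
function countLeafFields(node: SchemaNode): number {
  if (node.type === "object" && node.properties) {
    let total = 0;
    for (const key of Object.keys(node.properties)) {
      total += countLeafFields(node.properties[key] as SchemaNode);
    }
    return total;
  }
  if (node.type === "array" && node.items) {
    return countLeafFields(node.items);
  }
  return 1;
}
```

A leaf count is a crude but handy proxy for schema size when deciding whether to split one extraction into several.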

The smaller the schema, the more reliable the extraction -- especially on local 8B models. If you find a model hallucinating fields, simplify.

When to Use Ollama vs OpenAI vs Anthropic

| Scenario | Recommended provider | Why |
|---|---|---|
| High-volume batch extraction | Ollama | Marginal LLM cost = $0 |
| Sensitive or regulated data | Ollama | Localhost only |
| Complex, multi-page reasoning | OpenAI (gpt-4o) | Best reasoning on hard pages |
| Long context (50K+ tokens) | Anthropic (Claude) | 200K context window |
| Quick prototype, low volume | Ollama | Zero setup beyond install |
| Ambiguous content (sarcasm, slang) | OpenAI or Anthropic | Larger model = better disambiguation |

The good news: switching providers is one parameter change. You can prototype on Ollama, hit a hard page, switch to provider: "openai" for that one call, and continue.
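The decision table can be folded into a small routing function. This is one possible encoding, not a CrawlForge feature: the Job fields and the order of the checks are assumptions, while the provider values and the 50K-token threshold come from the table.

```typescript
type Provider = "ollama" | "openai" | "anthropic";

// Hypothetical description of a single extraction job.
interface Job {
  sensitive: boolean; // regulated or confidential content
  estimatedTokens: number; // rough size of the page content
  hardPage: boolean; // ambiguous content or multi-page reasoning
}

function chooseProvider(job: Job): Provider {
  // Privacy wins over everything else: the data must stay on localhost.
  if (job.sensitive) return "ollama";
  // Long pages go to the provider with the largest context window.
  if (job.estimatedTokens >= 50000) return "anthropic";
  // Hard pages benefit from a frontier model.
  if (job.hardPage) return "openai";
  // Default: zero marginal LLM cost.
  return "ollama";
}
```

Checking sensitivity first is a deliberate choice: a sensitive page that is also long or ambiguous still stays local, even if accuracy suffers.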

Cost Comparison

Cost per 1,000 extractions, assuming an average page (~3K tokens in, ~200 tokens out):

| Provider | Model | LLM cost per 1K | CrawlForge credits per 1K | Total per 1K |
|---|---|---|---|---|
| Ollama (local) | llama3.1:8b | $0 (electricity) | 3,000 credits | 3,000 credits |
| Ollama (local) | llama3.1:70b | $0 (electricity, more) | 3,000 credits | 3,000 credits |
| OpenAI | gpt-4o-mini | ~$0.45 | 3,000 credits | 3,000 credits + $0.45 |
| OpenAI | gpt-4o | ~$8.50 | 3,000 credits | 3,000 credits + $8.50 |
| Anthropic | claude-haiku-4-5 | ~$0.30 | 3,000 credits | 3,000 credits + $0.30 |
| Anthropic | claude-sonnet-4-6 | ~$4.50 | 3,000 credits | 3,000 credits + $4.50 |

CrawlForge credits are flat across providers -- you pay for the orchestration, not the model. At our Hobby tier (10,000 credits/$19/mo), that is roughly 3,300 LLM extractions per month for $19 + $0 (if you use Ollama).
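The arithmetic is simple enough to put in a few helper functions. The 3-credits-per-call figure comes from this post; the function names are illustrative, and token prices are left as caller-supplied parameters because vendor pricing changes.

```typescript
const CREDITS_PER_CALL = 3; // flat across providers, per this post

// Credits consumed by a batch of extractions.
function creditsFor(extractions: number): number {
  return extractions * CREDITS_PER_CALL;
}

// How many extractions fit inside a monthly credit budget.
function extractionsWithin(creditBudget: number): number {
  return Math.floor(creditBudget / CREDITS_PER_CALL);
}

// LLM spend in dollars for a batch. Prices are dollars per million tokens
// and must be supplied by the caller from the vendor's current price list.
function llmCost(
  extractions: number,
  inputTokens: number,
  outputTokens: number,
  inputPricePerM: number,
  outputPricePerM: number,
): number {
  const perCall = inputTokens * inputPricePerM + outputTokens * outputPricePerM;
  return (extractions * perCall) / 1000000;
}
```

For example, creditsFor(1000) gives the 3,000 credits in the table, extractionsWithin(10000) gives 3,333 for the Hobby tier, and llmCost with both prices set to 0 models the Ollama rows.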

Limitations and Accuracy

What local Ollama (llama3.1:8b) does well in our internal benchmarks:

  • Clean product pages (Amazon, Shopify): ~95% field accuracy
  • Article metadata (Substack, Medium): ~92%
  • Forum threads (HN, Reddit): ~88%

What it struggles with:

  • Heavily JS-rendered pages with minimal HTML (use scrape_with_actions first, then extract)
  • Pages with ambiguous structure (use a frontier model)
  • Schemas with 20+ fields (split into multiple extractions)
  • Non-English content (use a multilingual model like mistral)

If accuracy is critical and Ollama is missing fields, the fix is usually: simpler schema, or upgrade to provider: "openai" for that call.
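One way to automate that escalation is to check the local result for missing required fields and retry once with a hosted provider. This is a sketch under assumptions: run stands in for whatever function performs the extraction in your code, and treating null or empty strings as "missing" is a design choice, not CrawlForge behavior.

```typescript
type Provider = "ollama" | "openai" | "anthropic";
type ExtractionResult = Record<string, unknown>;

// Caller-supplied function that performs one extraction with a given provider.
type RunExtraction = (provider: Provider) => Promise<ExtractionResult>;

// Required keys that are absent, null, or empty in the result.
function missingRequired(result: ExtractionResult, required: string[]): string[] {
  return required.filter((key) => {
    const value = result[key];
    return value === undefined || value === null || value === "";
  });
}

// Try locally first; escalate to a hosted model only if fields are missing.
async function extractWithEscalation(
  run: RunExtraction,
  required: string[],
): Promise<{ provider: Provider; data: ExtractionResult }> {
  const local = await run("ollama");
  if (missingRequired(local, required).length === 0) {
    return { provider: "ollama", data: local };
  }
  const hosted = await run("openai");
  return { provider: "openai", data: hosted };
}
```

Because the escalation sends the page to a third party, sensitive jobs should skip the second step and simplify the schema instead.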


Ready to extract data without leaving your machine? Start free with 1,000 credits and try extract_with_llm with your local Ollama. New to CrawlForge? Read the v4.2.2 launch announcement for context, or the CLI guide for a terminal-first workflow.

Tags

extract-with-llm, Ollama, local-AI, ai-engineering, structured-extraction, privacy


Frequently Asked Questions

Does extract_with_llm cost credits if Ollama is free?

Yes -- 3 CrawlForge credits per call regardless of which LLM provider you use. The credit covers orchestration: fetching the page, handling anti-bot defenses, validating the response against your schema, and returning structured JSON. The LLM cost itself is separate: $0 with Ollama, or per-token pricing with OpenAI/Anthropic.

Which Ollama models work best for web extraction?

llama3.1:8b is the recommended starting point -- fast, accurate enough for most pages, and runs on consumer hardware (8GB RAM minimum). For higher accuracy on complex pages, try llama3.1:70b (requires 48GB RAM). For multilingual extraction, mistral or qwen2.5 work well. Avoid models below 7B parameters -- they hallucinate fields.

Can I use a remote Ollama instance instead of localhost?

Yes. Set the OLLAMA_HOST environment variable to your remote Ollama URL before calling extract_with_llm. This is useful if you run Ollama on a dedicated GPU server and your CrawlForge worker is on a different machine. The connection is still direct -- CrawlForge does not proxy it.
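If your own scripts also talk to that Ollama instance, a small resolver keeps the base URL consistent with OLLAMA_HOST. This is a client-side sketch only: how CrawlForge itself parses the variable is not described here, and the function name and normalization rules are assumptions.

```typescript
const DEFAULT_OLLAMA_URL = "http://localhost:11434";

// Turn an OLLAMA_HOST value into a base URL, falling back to localhost.
// Accepts either a full URL or a bare host:port pair.
function resolveOllamaUrl(env: Record<string, string | undefined>): string {
  const raw = (env["OLLAMA_HOST"] ?? "").trim();
  if (raw === "") return DEFAULT_OLLAMA_URL;
  // Add a scheme when only host:port was given.
  const withScheme = /^https?:\/\//.test(raw) ? raw : "http://" + raw;
  // Drop trailing slashes so paths can be appended safely.
  return withScheme.replace(/\/+$/, "");
}

// Typical usage in Node: resolveOllamaUrl(process.env)
```

Passing the environment in as an argument, instead of reading it inside the function, makes the resolver easy to test with different settings.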

How does extract_with_llm compare to extract_structured?

extract_structured is CSS-first: you provide selectors, it returns matched data, and falls back to an LLM only if selectors fail. extract_with_llm is LLM-first: every extraction is a model call. Use extract_structured when you know the selectors and they are stable; use extract_with_llm when the page layout is unknown or changes often.

Is my data sent anywhere when I use Ollama?

No third party sees your data when provider is "ollama". The scraped HTML goes from CrawlForge to your Ollama instance (localhost or your own server) and the response comes back. Nothing is logged or stored on our side beyond standard request metadata. With provider "openai" or "anthropic", the HTML is sent to that vendor under their respective terms.

What if the page has anti-bot protection?

Use stealth_mode first to fetch the page, then pass the HTML directly to extract_with_llm via its "html" parameter (skips the fetch step). Or combine in a single call: extract_with_llm with stealth: true makes the fetch use residential proxies and fingerprint rotation, then runs LLM extraction on the result.
