Does extract_with_llm cost credits if Ollama is free?

Yes -- 3 CrawlForge credits per call regardless of which LLM provider you use. The credit covers orchestration: fetching the page, handling anti-bot defenses, validating the response against your schema, and returning structured JSON. The LLM cost itself is separate: $0 with Ollama, or per-token pricing with OpenAI/Anthropic.

Which Ollama models work best for web extraction?

llama3.1:8b is the recommended starting point -- it is fast, accurate enough for most pages, and runs on consumer hardware (8GB RAM minimum). For higher accuracy on complex pages, try llama3.1:70b (requires 48GB RAM). For multilingual extraction, mistral or qwen2.5 work well. Avoid models below 7B parameters -- they tend to hallucinate fields.

Can I use a remote Ollama instance instead of localhost?

Yes. Set the OLLAMA_HOST environment variable to your remote Ollama URL before calling extract_with_llm. This is useful if you run Ollama on a dedicated GPU server and your CrawlForge worker is on a different machine. The connection is still direct -- CrawlForge does not proxy it.
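As a rough illustration of what pointing at a remote instance involves, the sketch below resolves a base URL from OLLAMA_HOST and builds the /api/tags URL you could probe before a batch run. The helper names are hypothetical, and how CrawlForge itself parses OLLAMA_HOST is an assumption here; the sketch only assumes that a bare "host:port" value means plain HTTP.

```typescript
// Hypothetical helpers for illustration; not part of CrawlForge or Ollama.
const DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434";

type Env = Record<string, string | undefined>;

// Turn an OLLAMA_HOST value into a base URL without a trailing slash.
function resolveOllamaBaseUrl(env: Env): string {
  const raw = (env["OLLAMA_HOST"] ?? "").trim();
  if (raw === "") return DEFAULT_OLLAMA_URL;
  // Assumption: a value without a scheme ("host:port") is served over HTTP.
  const withScheme = /^https?:\/\//i.test(raw) ? raw : `http://${raw}`;
  return withScheme.replace(/\/+$/, "");
}

// URL of the model-listing endpoint used in the setup section's curl check.
function tagsUrl(env: Env): string {
  return `${resolveOllamaBaseUrl(env)}/api/tags`;
}

console.log(tagsUrl({}));
console.log(tagsUrl({ OLLAMA_HOST: "gpu-box.internal:11434" }));
```

In a real script you would pass process.env instead of a literal object; the environment is a parameter here only so the function can be exercised without touching the machine's configuration.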

How does extract_with_llm compare to extract_structured?

extract_structured is CSS-first: you provide selectors, it returns the matched data, and it falls back to an LLM only if the selectors fail. extract_with_llm is LLM-first: every extraction is a model call. Use extract_structured when you know the selectors and they are stable; use extract_with_llm when the page layout is unknown or changes often.

Is my data sent anywhere when I use Ollama?

No third party sees your data when provider is "ollama". The scraped HTML goes from CrawlForge to your Ollama instance (localhost or your own server) and the response comes back. Nothing is logged or stored on our side beyond standard request metadata. With provider "openai" or "anthropic", the HTML is sent to that vendor under their respective terms.
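If you want a guard in your own pipeline before sending sensitive pages, one option is to check that the configured Ollama URL is either loopback or a host you explicitly trust. This is a minimal sketch under that assumption; the function names and the allowlist idea are hypothetical and not a CrawlForge feature.

```typescript
// Hypothetical guard for illustration; not a CrawlForge API.

// True for "localhost", 127.x.x.x, and the IPv6 loopback address.
function isLoopbackHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  return (
    host === "localhost" ||
    host === "[::1]" ||
    /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host)
  );
}

// True when the URL points at this machine or at a host you listed yourself.
function isTrustedOllamaUrl(url: string, allowedHosts: string[]): boolean {
  const hostname = new URL(url).hostname; // throws on a malformed URL
  return (
    isLoopbackHost(hostname) ||
    allowedHosts.indexOf(hostname.toLowerCase()) !== -1
  );
}

console.log(isTrustedOllamaUrl("http://127.0.0.1:11434", []));
console.log(isTrustedOllamaUrl("http://gpu-box.internal:11434", ["gpu-box.internal"]));
console.log(isTrustedOllamaUrl("https://api.openai.com/v1", []));
```

The check is deliberately strict: anything not on loopback must be named in the allowlist, so a mistyped host fails closed rather than silently sending data elsewhere.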

What if the page has anti-bot protection?

Use stealth_mode first to fetch the page, then pass the HTML directly to extract_with_llm via its "html" parameter (skips the fetch step). Or combine in a single call: extract_with_llm with stealth: true makes the fetch use residential proxies and fingerprint rotation, then runs LLM extraction on the result.

Extract Web Data With Local LLMs (Ollama + CrawlForge)

A web extraction task should not require sending your scraped data to OpenAI. With extract_with_llm and a local Ollama install, structured extraction now runs entirely on your machine -- no API key, no per-token cost, no third party seeing your content. This post is the deep dive: why local matters, how to set it up, and when to use it versus the alternatives.

What Is LLM Web Extraction?
Why Local Ollama (and Not OpenAI)?
Setup: Ollama + extract_with_llm in 5 Minutes
list_ollama_models: Discover What Is On Your Machine
Schema-Driven Extraction
When to Use Ollama vs OpenAI vs Anthropic
Cost Comparison
Limitations and Accuracy

What Is LLM Web Extraction?

LLM web extraction is the workflow where you hand a model two things -- the raw HTML of a page and a schema describing what you want -- and the model returns structured JSON. It is the modern replacement for hand-written CSS selectors. When a site's layout changes, the selectors break; the LLM adapts.

CrawlForge has had this for a while as extract_structured (CSS-first, LLM fallback). The new extract_with_llm tool is LLM-first: every extraction is an LLM call. What changed in v4.2.2 is the default provider. Previously you needed an OpenAI or Anthropic key. Now you do not.

Why Local Ollama (and Not OpenAI)?

Three reasons, in descending order of importance for most teams.

1. Your scraped data stays on your machine. When you call OpenAI, the page content goes to OpenAI. For competitive intelligence, legal docs, customer data, or anything regulated, that is often a deal-breaker. Local Ollama means localhost only.

2. The LLM is free. A gpt-4o-mini call costs roughly $0.0005 per extraction at the page size used in the cost comparison below (~$0.45 per 1,000 extractions). Sounds like nothing -- until you are running 10,000 extractions a day. Ollama is $0 plus electricity. You still pay 3 CrawlForge credits per call regardless of provider.

3. No new API key to manage. If you have Ollama, you have a provider. No signup, no billing alerts, no rotation.

The trade-off is model quality. llama3.1:8b on a laptop will not match gpt-4o on hard pages. For the 80% case (clean HTML, simple schema), it is fine. For the other 20% (heavily formatted or ambiguous pages), you can fall back to OpenAI or Anthropic by changing the provider parameter for that call.

Setup: Ollama + extract_with_llm in 5 Minutes

```bash
# 1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull a model (8B is a good starting point)
ollama pull llama3.1:8b

# 3. Verify it is running
curl http://127.0.0.1:11434/api/tags
```

That is the Ollama side. Now from CrawlForge (via the CLI, the MCP server, or the API -- any path works):

```typescript
import { CrawlForge } from 'crawlforge-mcp-server';

const cf = new CrawlForge({ apiKey: process.env.CRAWLFORGE_API_KEY });

const result = await cf.extract_with_llm({
  url: 'https://news.ycombinator.com/item?id=39820700',
  schema: {
    type: 'object',
    properties: {
      title: { type: 'string' },
      points: { type: 'number' },
      author: { type: 'string' },
      comments_count: { type: 'number' },
    },
    required: ['title', 'points'],
  },
  provider: 'ollama',
  model: 'llama3.1:8b',
});

console.log(result);
// { title: "Show HN: ...", points: 412, author: "patio11", comments_count: 187 }
```

From the CLI:

```bash
crawlforge extract https://news.ycombinator.com/item?id=39820700 \
  --schema schema.json \
  --provider ollama \
  --model llama3.1:8b
```

list_ollama_models: Discover What Is On Your Machine

Before running a big batch, sanity-check what is actually installed:

```bash
crawlforge extract --list-ollama-models
```

Or via MCP/SDK:

```typescript
const models = await cf.list_ollama_models();
// [{ name: 'llama3.1:8b', size: '4.7GB', modified: '2026-05-12' }, ...]
```

This costs 0 credits. It is a discovery helper, not an extraction call. Useful in onboarding flows, CI smoke tests ("is Ollama set up correctly?"), and as the first step in any local-LLM pipeline.
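For a CI smoke test that talks to Ollama directly, the same question can be answered from the raw /api/tags response used in the setup check. The sketch below assumes only the response fields it reads (a models array whose entries have name, size, and modified_at); the helper name and the sample payload are illustrative, and this is not the return shape of list_ollama_models.

```typescript
// Only the fields this sketch reads; the real /api/tags response carries more.
interface OllamaTagsResponse {
  models: { name: string; size: number; modified_at: string }[];
}

// Hypothetical helper: names from `required` that are not installed.
// Matching is exact, so include the tag (for example "mistral:latest").
function missingModels(tags: OllamaTagsResponse, required: string[]): string[] {
  const installed = tags.models.map((m) => m.name);
  return required.filter((name) => installed.indexOf(name) === -1);
}

// Illustrative payload, not captured from a real machine.
const sample: OllamaTagsResponse = {
  models: [
    { name: "llama3.1:8b", size: 4700000000, modified_at: "2026-05-12T00:00:00Z" },
  ],
};

console.log(missingModels(sample, ["llama3.1:8b", "mistral:latest"]));
// [ 'mistral:latest' ]
```

In a pipeline you would fetch the JSON from your Ollama URL, pass it in, and fail the job when the returned list is non-empty.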

Schema-Driven Extraction

The schema is a standard JSON Schema object. The LLM reads it and produces matching output. Two patterns work well:

Pattern 1: Flat schema for simple data

```typescript
// Flat schema: one level of primitive fields.
const flatSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    price_usd: { type: 'number' },
    in_stock: { type: 'boolean' },
  },
};
```

Pattern 2: Nested schema for richer data

```typescript
// Nested schema: an object within an object, plus an array of review objects.
const nestedSchema = {
  type: 'object',
  properties: {
    product: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        price: {
          type: 'object',
          properties: {
            amount: { type: 'number' },
            currency: { type: 'string' },
          },
        },
      },
    },
    reviews: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          author: { type: 'string' },
          rating: { type: 'number' },
          text: { type: 'string' },
        },
      },
    },
  },
};
```

The smaller the schema, the more reliable the extraction -- especially on local 8B models. If you find a model hallucinating fields, simplify.
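To notice hallucinated or missing fields on your side, a small check of the returned object against a flat schema can help. The sketch below is a hypothetical client-side helper covering only flat schemas with string, number, and boolean fields; it is not how CrawlForge validates responses and is no substitute for a full JSON Schema validator.

```typescript
type PrimitiveType = "string" | "number" | "boolean";

interface FlatSchema {
  type: "object";
  properties: { [field: string]: { type: PrimitiveType } };
  required?: string[];
}

// Hypothetical helper: returns a list of human-readable problems (empty = OK).
function checkAgainstSchema(
  schema: FlatSchema,
  output: { [field: string]: unknown }
): string[] {
  const problems: string[] = [];
  // Required fields the model left out.
  for (const field of schema.required ?? []) {
    if (!(field in output)) problems.push(`missing required field: ${field}`);
  }
  // Fields the model invented, or returned with the wrong primitive type.
  for (const field of Object.keys(output)) {
    const spec = schema.properties[field];
    if (spec === undefined) {
      problems.push(`unexpected field: ${field}`);
    } else if (typeof output[field] !== spec.type) {
      problems.push(`wrong type for ${field}: expected ${spec.type}`);
    }
  }
  return problems;
}

const storySchema: FlatSchema = {
  type: "object",
  properties: { title: { type: "string" }, points: { type: "number" } },
  required: ["title", "points"],
};

// Illustrative model output: points came back as a string, karma was invented.
const suspectOutput = { title: "Show HN: ...", points: "412", karma: 9 };
console.log(checkAgainstSchema(storySchema, suspectOutput));
// [ 'wrong type for points: expected number', 'unexpected field: karma' ]
```

A non-empty result is a reasonable trigger for retrying with a simpler schema or a different provider.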

When to Use Ollama vs OpenAI vs Anthropic

| Scenario | Recommended provider | Why |
| --- | --- | --- |
| High-volume batch extraction | Ollama | Marginal LLM cost = $0 |
| Sensitive or regulated data | Ollama | Localhost only |
| Complex, multi-page reasoning | OpenAI (`gpt-4o`) | Best reasoning on hard pages |
| Long context (50K+ tokens) | Anthropic (Claude) | 200K context window |
| Quick prototype, low volume | Ollama | Zero setup beyond install |
| Ambiguous content (sarcasm, slang) | OpenAI or Anthropic | Larger model = better disambiguation |

The good news: switching providers is one parameter change. You can prototype on Ollama, hit a hard page, switch to provider: "openai" for that one call, and continue.
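The scenarios above can be encoded as a small routing function so a batch job picks the provider per page. This is a sketch of one possible encoding: the Job fields, the function name, and the precedence order (privacy first, then context length, then reasoning difficulty) are assumptions, not CrawlForge behavior.

```typescript
type Provider = "ollama" | "openai" | "anthropic";

// Hypothetical description of one extraction job.
interface Job {
  sensitiveData: boolean;      // regulated or confidential content
  contextTokens: number;       // rough size of the page in tokens
  needsHardReasoning: boolean; // complex or ambiguous content
}

// Assumed precedence: privacy beats everything, then long context, then reasoning.
function recommendProvider(job: Job): Provider {
  if (job.sensitiveData) return "ollama";             // keep data on localhost
  if (job.contextTokens >= 50000) return "anthropic"; // long-context row
  if (job.needsHardReasoning) return "openai";        // hard-page row
  return "ollama";                                    // default: $0 LLM cost
}

console.log(
  recommendProvider({ sensitiveData: false, contextTokens: 3000, needsHardReasoning: false })
);
// ollama
```

The returned string can be passed as the provider value in the extract_with_llm call, which keeps the switch down to the single parameter change described above.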

Cost Comparison

Cost per 1,000 extractions, assuming an average page (~3K tokens in, ~200 tokens out):

| Provider | Model | LLM cost per 1K extractions | CrawlForge credits per 1K extractions | Total per 1K extractions |
| --- | --- | --- | --- | --- |
| Ollama (local) | `llama3.1:8b` | $0 (electricity) | 3,000 credits | 3,000 credits |
| Ollama (local) | `llama3.1:70b` | $0 (electricity, more) | 3,000 credits | 3,000 credits |
| OpenAI | `gpt-4o-mini` | ~$0.45 | 3,000 credits | 3,000 credits + $0.45 |
| OpenAI | `gpt-4o` | ~$8.50 | 3,000 credits | 3,000 credits + $8.50 |
| Anthropic | `claude-haiku-4-5` | ~$0.30 | 3,000 credits | 3,000 credits + $0.30 |
| Anthropic | `claude-sonnet-4-6` | ~$4.50 | 3,000 credits | 3,000 credits + $4.50 |

CrawlForge credits are flat across providers -- you pay for the orchestration, not the model. At our Hobby tier (10,000 credits for $19/mo), that is roughly 3,300 LLM extractions per month. With Ollama, the $19 is the whole bill apart from electricity.
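The arithmetic behind these figures is easy to reproduce for your own volumes. The sketch below takes the 3-credits-per-call figure from this post; the function names are hypothetical, and the per-million-token prices in the example are placeholders rather than any vendor's actual rates, so substitute current pricing before relying on the result.

```typescript
const CREDITS_PER_CALL = 3; // flat cost per extract_with_llm call, per this post

// How many extractions a monthly credit allowance covers.
function extractionsPerMonth(monthlyCredits: number): number {
  return Math.floor(monthlyCredits / CREDITS_PER_CALL);
}

// LLM spend in USD for `calls` extractions, given prices per million tokens.
function llmCostUsd(
  calls: number,
  inputTokensPerCall: number,
  outputTokensPerCall: number,
  inputPricePerMillion: number,
  outputPricePerMillion: number
): number {
  const tokenCost =
    inputTokensPerCall * inputPricePerMillion +
    outputTokensPerCall * outputPricePerMillion;
  return (calls * tokenCost) / 1e6;
}

console.log(extractionsPerMonth(10000)); // 3333

// Placeholder prices ($1 in, $2 out per million tokens), not real vendor rates.
console.log(llmCostUsd(1000, 3000, 200, 1, 2));

// Local Ollama: both token prices are zero.
console.log(llmCostUsd(1000, 3000, 200, 0, 0)); // 0
```

Output tokens are a small share of the total at this page size, which is why the input price dominates the per-1K figures.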

Limitations and Accuracy

What local Ollama (llama3.1:8b) does well in our internal benchmarks:

- Clean product pages (Amazon, Shopify): ~95% field accuracy
- Article metadata (Substack, Medium): ~92%
- Forum threads (HN, Reddit): ~88%

What it struggles with:

- Heavily JS-rendered pages with minimal HTML (use scrape_with_actions first, then extract)
- Pages with ambiguous structure (use a frontier model)
- Schemas with 20+ fields (split into multiple extractions)
- Non-English content (use a multilingual model like mistral)

If accuracy is critical and Ollama is missing fields, the fix is usually a simpler schema or switching to provider: "openai" for that call.
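Splitting a large schema into several smaller extractions, as suggested for schemas with 20+ fields, can be mechanical. The following is a minimal sketch assuming a top-level object schema; splitSchema and mergeResults are hypothetical helpers, and each resulting part would be sent as its own extract_with_llm call (so each part also costs its own 3 credits).

```typescript
interface ObjectSchema {
  type: "object";
  properties: { [field: string]: unknown };
  required?: string[];
}

// Split top-level properties into schemas of at most `maxFields` fields each.
function splitSchema(schema: ObjectSchema, maxFields: number): ObjectSchema[] {
  if (maxFields < 1) throw new Error("maxFields must be at least 1");
  const names = Object.keys(schema.properties);
  const required = schema.required ?? [];
  const parts: ObjectSchema[] = [];
  for (let i = 0; i < names.length; i += maxFields) {
    const chunk = names.slice(i, i + maxFields);
    const properties: { [field: string]: unknown } = {};
    for (const name of chunk) properties[name] = schema.properties[name];
    parts.push({
      type: "object",
      properties,
      // Keep only the required fields that belong to this part.
      required: required.filter((r) => chunk.indexOf(r) !== -1),
    });
  }
  return parts;
}

// Combine the per-part results back into one object (later parts win on clashes).
function mergeResults(
  results: { [field: string]: unknown }[]
): { [field: string]: unknown } {
  const merged: { [field: string]: unknown } = {};
  for (const result of results) {
    for (const key of Object.keys(result)) merged[key] = result[key];
  }
  return merged;
}

const wide: ObjectSchema = {
  type: "object",
  properties: { a: {}, b: {}, c: {}, d: {}, e: {} },
  required: ["a", "e"],
};
console.log(splitSchema(wide, 2).map((p) => Object.keys(p.properties)));
// [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```

Grouping related fields into the same part (for example, all price fields together) is likely to work better than a purely positional split; the positional version is shown only because it is the simplest.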

Ready to extract data without leaving your machine? Start free with 1,000 credits and try extract_with_llm with your local Ollama. New to CrawlForge? Read the v4.2.2 launch announcement for context, or the CLI guide for a terminal-first workflow.

About the Author

CrawlForge Team
