On this page
Claude is exceptional at reading, reasoning over, and summarizing web content -- but out of the box it cannot fetch a single live page. Web scraping with Claude only becomes possible once you bridge that gap with the Model Context Protocol (MCP). Connect CrawlForge MCP and Claude gains 23 scraping tools it can call on demand: fetch a URL, extract clean article text, pull structured data with CSS selectors, bypass anti-bot systems, or run multi-source deep research. No Python, no Playwright boilerplate, no scraping code at all.
This is the hub guide for every Claude web scraping workflow. Whether you use Claude Desktop, Claude Code in your terminal, or the raw Anthropic API, this page shows you the setup and links to the deep-dive tutorial for each path.
Quick Start: 2-Minute Setup
You only need three things: Node.js 18+, a Claude surface (Desktop, Code, or API access), and a free CrawlForge API key. Sign up at crawlforge.dev/signup -- you get 1,000 credits with no credit card.
For Claude Desktop, add CrawlForge to your config file:
{
"mcpServers": {
"crawlforge": {
"command": "npx",
"args": ["-y", "@crawlforge/mcp-server"],
"env": {
"CRAWLFORGE_API_KEY": "cf_live_YOUR_API_KEY_HERE"
}
}
}
}For Claude Code, the fastest path is the setup wizard:
npm install -g crawlforge-mcp-server
npx crawlforge-setup # paste your cf_live_ key when promptedRestart Claude, and you are scraping. Ask it plainly: "Fetch https://news.ycombinator.com and give me the top 5 story titles." Claude Desktop supports MCP servers on every plan, including Free; Claude Code requires a paid Claude plan or API billing -- either way, there is no separate scraping subscription. The rest of this guide explains how each path works and what you can build.
Table of Contents
- How Claude Scrapes the Web
- Claude Desktop Setup
- Claude Code Setup
- Claude API: Build Your Own Agent
- What You Can Build
- Scraping Protected and JavaScript-Heavy Sites
- Credits and Costs
- Frequently Asked Questions
How Claude Scrapes the Web
Claude has no native network access. Ask the model to "read this page" and it will tell you it cannot open URLs -- its knowledge ends at its training cutoff. Web scraping with Claude works by giving the model tools it can call, and the standard for those tools is the Model Context Protocol.
MCP is Anthropic's open standard for connecting AI assistants to external systems. An MCP server advertises a set of tools (each with a name, description, and JSON input schema); the client (Claude Desktop, Claude Code, or your own API loop) shows those tools to the model. When a prompt needs live data, Claude emits a structured tool call, the client executes it, and the result flows back into the conversation. If you are new to the protocol, start with our MCP protocol explainer for developers.
CrawlForge is an MCP server purpose-built for web scraping. Instead of one generic "fetch" function, it exposes 23 specialized tools -- from fetch_url (raw HTML, 1 credit) to deep_research (multi-source synthesis, 10 credits). Claude picks the right tool for each request automatically. For the full architecture and tool catalogue, read the complete guide to MCP web scraping.
The key mental model: you describe the outcome in plain English, and Claude orchestrates the tools. You never write a scraping script. Claude decides whether to fetch, extract, crawl, or research -- and chains tools together when a task needs several steps.
Claude Desktop Setup
Claude Desktop is the no-terminal path. It reads MCP servers from a single JSON file and exposes their tools through the chat interface. This is the best option if you want to scrape conversationally without writing any code.
The config file lives at:
- macOS:
~/Library/Application Support/Claude/claude_desktop_config.json - Windows:
%APPDATA%\Claude\claude_desktop_config.json - Linux:
~/.config/Claude/claude_desktop_config.json
Add the CrawlForge block from the Quick Start above, replace the placeholder with your cf_live_ key, then quit Claude Desktop completely (not just close the window) and reopen it. A tools icon appears in the input box once the server loads. Test it with:
Extract the main content from https://example.com and summarize it in three bullets.
Claude calls extract_content (2 credits) and returns clean prose with no ads or navigation. For the full walkthrough -- including troubleshooting the "no tools found" error and five worked example prompts -- see how to add web scraping to Claude Desktop.
Claude Code Setup
Claude Code is Anthropic's terminal-based agent. It can edit files, run shell commands, and write tests -- but like Desktop, it cannot fetch live pages until you connect an MCP server. Once CrawlForge is registered, Claude Code can scrape, save results to disk, and feed them straight into code in a single session.
Install the server globally and run the wizard:
npm install -g crawlforge-mcp-server
crawlforge-mcp-server --version
# crawlforge-mcp-server 3.0.16
npx crawlforge-setup # writes the MCP entry and stores your keyRestart Claude Code, then confirm the connection:
/mcp
You should see crawlforge listed as connected with its tools available. Now ask Claude Code to scrape: "Fetch https://news.ycombinator.com and return the top 5 stories as a JSON array." It calls fetch_url (1 credit), parses the HTML, and writes valid JSON.
If you are brand new to Claude Code, the beginner's installation guide covers Node setup, manual config files, and the /mcp add command step by step. For a task-focused walkthrough -- scraping a pricing page, then escalating to JavaScript-rendered sites -- read how to scrape websites with Claude Code.
Claude API: Build Your Own Agent
If you are building a product instead of working interactively, skip Desktop and Code entirely and wire CrawlForge into the Anthropic Claude API. The API supports native tool use: you pass Claude a set of tool definitions, and the model returns structured tool_use blocks that your code executes against the CrawlForge REST API.
The loop is simple: send the user prompt plus tool definitions, receive a tool_use block, call CrawlForge, return the tool_result, and let Claude continue. Here is the core of it in TypeScript:
import Anthropic from '@anthropic-ai/sdk';
const claude = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
// Call the CrawlForge REST API for a given tool
async function runCrawlForgeTool(name: string, input: Record<string, unknown>) {
const res = await fetch(`https://crawlforge.dev/api/v1/tools/${name}`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.CRAWLFORGE_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify(input),
});
return res.json();
}
const message = await claude.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 1024,
tools: [
{
name: 'extract_content',
description: 'Extract clean readable content from a web page. Costs 2 credits.',
input_schema: {
type: 'object',
properties: { url: { type: 'string', description: 'The URL to read' } },
required: ['url'],
},
},
],
messages: [
{ role: 'user', content: 'What is on the Hacker News front page right now?' },
],
});
// When Claude returns a tool_use block, execute it and feed the result back
const toolUse = message.content.find((b) => b.type === 'tool_use');
if (toolUse && toolUse.type === 'tool_use') {
const result = await runCrawlForgeTool(toolUse.name, toolUse.input as Record<string, unknown>);
console.log(result); // feed this back as a tool_result to continue the conversation
}This pattern scales from a single tool to all 23. The full production version -- defining multiple tool schemas, handling the multi-turn tool-use loop, and streaming responses -- lives in how to use CrawlForge with the Anthropic Claude API.
What You Can Build
The setup is the boring part. Here is what web scraping with Claude actually unlocks, with example prompts you can paste today.
1. A Research Assistant
Point Claude at a topic and let it search, fetch, and synthesize across sources instead of guessing from stale training data.
Research "state of WebGPU browser support in 2026". Search the web, read the top
sources, and give me a 5-bullet summary with a citation link after each bullet.
Claude chains search_web (5 credits) with extract_content (2 credits each) per source. The full project -- prompt design, source ranking, and citation formatting -- is in building an AI research assistant with Claude and MCP.
2. A Competitive Intelligence Agent
Track competitor pages -- pricing, feature lists, changelogs -- and let Claude flag what changed.
Scrape the pricing pages for these three competitors and build a comparison table
of plan name, monthly price, and included seats. Highlight anything cheaper than ours.
Claude uses scrape_structured (2 credits) for clean tabular extraction. See the end-to-end build in build a competitive intelligence agent with Claude and CrawlForge.
3. Deep Research on a Single Question
When a question needs breadth and source verification rather than a quick lookup, hand it to deep_research.
Do deep research on "regulatory changes affecting EU AI startups in 2026" and
return a structured report with conflicting viewpoints noted.
This single tool (10 credits) searches, fetches, cross-checks, and synthesizes with citations. Read what it does under the hood in introducing deep research.
4. An Automated Price Monitor
Combine scraping with change tracking so Claude tells you when a price moves -- without re-reading everything yourself.
Set up a daily check on this product page. Extract the current price and alert me
only when it drops below $80.
Claude pairs scrape_structured (2 credits) with track_changes (3 credits). The complete system -- selectors, scheduling, and alerting -- is in build an AI price monitoring system.
Scraping Protected and JavaScript-Heavy Sites
Many sites either render content with client-side JavaScript or sit behind anti-bot systems like Cloudflare, DataDome, or PerimeterX. A plain fetch_url returns an empty shell or a challenge page. CrawlForge gives Claude two escalation tools for these cases.
scrape_with_actions (5 credits) drives a real browser: it can wait for content to load, click buttons, fill forms, and scroll before extracting. Use it for single-page apps and login-gated or interaction-gated content.
stealth_mode (5 credits) adds fingerprint randomization, residential proxy rotation, and human-behavior simulation to get past bot detection on otherwise-public pages.
This pricing page is behind Cloudflare and loads its table with JavaScript. Use
stealth mode to load it, wait for the pricing table to render, then extract the
plan names and prices into JSON.
The right order is: try fetch_url first (1 credit -- many "protected" pages serve content to well-formed requests), then escalate. The tradeoffs of fingerprint rotation and when each tool wins are covered in our stealth mode deep dive. One honest limitation: CrawlForge will not solve interactive CAPTCHAs or scrape content behind a paywall you have not paid for -- and you should not ask it to.
Credits and Costs
CrawlForge uses a credit model: every tool call deducts a fixed number of credits, so cheap operations stay cheap. Here are the real per-tool costs.
| Credits | Tools |
|---|---|
| 0 | list_ollama_models |
| 1 | fetch_url, extract_text, extract_links, extract_metadata, scrape_template |
| 2 | scrape_structured, extract_content, map_site, process_document, localization |
| 3 | track_changes, analyze_content, extract_structured, extract_with_llm |
| 4 | summarize_content, crawl_deep |
| 5 | stealth_mode, scrape_with_actions, batch_scrape, search_web, generate_llms_txt |
| 10 | deep_research |
The plans scale the monthly credit allowance:
| Plan | Price | Credits | Best for |
|---|---|---|---|
| Free | $0 | 1,000 (one-time) | Trying it out, light personal use |
| Hobby | $19/mo | 5,000/mo | Side projects and regular scraping |
| Professional | $99/mo | 50,000/mo | Production agents and teams |
| Business | $399/mo | 250,000/mo | High-volume pipelines with SLA |
Two habits keep costs low. First, prefer the cheapest tool that works -- fetch_url (1 credit) over search_web (5 credits) when you already know the URL. Second, use batch_scrape (5 credits) for many URLs instead of firing individual calls. The full breakdown lives on the pricing page, and you can watch usage in the dashboard.
Ready to Start?
Web scraping with Claude takes about two minutes to set up and zero lines of scraping code. Pick your path -- Claude Desktop, Claude Code, or the Claude API -- connect CrawlForge, and let Claude pick from 23 tools on demand.
Start free with 1,000 credits -- no credit card required.