GitHub Copilot's agent mode in VS Code can read your repo, run terminal commands, and edit files -- but it cannot fetch live web pages reliably. Add CrawlForge MCP and Copilot gains 20 scraping tools, from fetch_url to deep_research.
This guide walks through enabling web scraping in GitHub Copilot agents step by step, with working TypeScript examples and troubleshooting for the common gotchas.
Table of Contents
- The Problem: Copilot's Limited Web Access
- Prerequisites
- Step 1: Install CrawlForge MCP
- Step 2: Enable Copilot Agent Mode
- Step 3: Register the MCP Server
- Step 4: Verify Tools Are Available
- Step 5: Your First Scrape
- Full Example: Generate a REST Client from Live Docs
- Advanced: Multi-Tool Workflows
- Troubleshooting
- FAQ
The Problem: Copilot's Limited Web Access
Copilot Chat has a @web participant but it rewrites your query into Bing search -- no raw page access, no structured extraction, no anti-bot bypass. In agent mode, Copilot can run shell commands, so you could pipe curl | pandoc through a tool call, but that misses JavaScript-rendered pages and trips anti-bot systems within minutes.
MCP solves this. Since VS Code 1.102 shipped general-availability MCP support in mid-2025, Copilot agents can call any MCP server you register. CrawlForge exposes 20 scraping tools as MCP, so scraping becomes a first-class agent action.
Prerequisites
- VS Code 1.102+ -- check with `code --version`
- GitHub Copilot subscription (Individual, Business, or Enterprise) with agent mode enabled
- Node.js 18+
- CrawlForge account -- free at crawlforge.dev/signup
Step 1: Install CrawlForge MCP
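Install the server globally so VS Code can spawn it. The npm package name `crawlforge-mcp-server` is an assumption here; check the CrawlForge docs if your install instructions differ:

```shell
# Install the CrawlForge MCP server globally
npm install -g crawlforge-mcp-server

# Confirm the binary is on your PATH
which crawlforge-mcp-server
```

If you prefer not to install globally, `npx -y crawlforge-mcp-server` in the MCP config works too, at the cost of a slower first launch.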
Step 2: Enable Copilot Agent Mode
1. Open VS Code settings (`Cmd+,` / `Ctrl+,`).
2. Search for `chat.agent.enabled` and toggle it on.
3. Open Copilot Chat (`Ctrl+Alt+I` / `Cmd+Ctrl+I`) and switch the mode dropdown from "Ask" to "Agent."
Agent mode is what enables MCP tool calls. In "Ask" or "Edit" modes, Copilot ignores registered MCP servers.
Step 3: Register the MCP Server
VS Code supports MCP at two scopes:
- Workspace -- `.vscode/mcp.json` in the repo root (team-shared via git)
- User -- editable via the "MCP: Add Server" command palette entry (personal)
For team scraping, use the workspace config:
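A minimal `.vscode/mcp.json`, assuming the server is published as `crawlforge-mcp-server` on npm and reads its key from the `CRAWLFORGE_API_KEY` environment variable:

```json
{
  "servers": {
    "crawlforge": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "crawlforge-mcp-server"],
      "env": {
        "CRAWLFORGE_API_KEY": "${env:CRAWLFORGE_API_KEY}"
      }
    }
  }
}
```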
Then export the key in your shell (do not commit it):
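The key value below is a placeholder; substitute your own and add the line to `~/.zshrc` or `~/.bashrc` so it survives new terminals:

```shell
# Placeholder key -- replace with the key from your CrawlForge dashboard
export CRAWLFORGE_API_KEY="cf_live_your_key_here"
```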
Or use ${input:apiKey} and VS Code will prompt once per workspace.
Step 4: Verify Tools Are Available
1. Open Copilot Chat.
2. Click the tool icon in the chat header.
3. Confirm `crawlforge` is listed with 20 tools. Uncheck any you want to hide.
If the list is empty, jump to Troubleshooting.
Step 5: Your First Scrape
In Copilot Chat (agent mode), paste:
#crawlforge Fetch https://news.ycombinator.com and list the top 5 story titles with their URLs.
The #crawlforge hint nudges Copilot toward the right tool. Copilot calls fetch_url (1 credit), receives HTML, parses titles, and returns them inline.
Full Example: Generate a REST Client from Live Docs
Here is a concrete workflow: auto-generate a typed Stripe client from the live documentation.
Prompt Copilot agent:
#crawlforge Use extract_content on https://docs.stripe.com/api/charges/create.
Then in src/clients/stripe-charges.ts, write a typed function createCharge(params)
that calls the Stripe API with the exact fields documented on that page.
Add a Zod schema for params.
Copilot issues this MCP call:
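The call is roughly the following (the exact argument names are an assumption; `format` in particular may differ in the real tool schema):

```json
{
  "tool": "extract_content",
  "arguments": {
    "url": "https://docs.stripe.com/api/charges/create",
    "format": "markdown"
  }
}
```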
Copilot reads the markdown and writes:
Cost: 2 credits (one extract_content call). Repeat for every Stripe endpoint and you have a full typed client derived from live docs.
Advanced: Multi-Tool Workflows
Copilot agents excel at chaining tools. Example prompt:
#crawlforge 1) search_web for "OpenTelemetry Node.js auto-instrumentation 2026"
2) extract_content from the top 3 results
3) Summarize the differences into docs/otel-options.md with a decision matrix.
Copilot runs the three MCP calls sequentially, feeds results forward, and writes the markdown. Total cost: ~11 credits (5 + 2 + 2 + 2).
Credit Quick Reference
| Task | Tool | Credits |
|---|---|---|
| Static HTML fetch | fetch_url | 1 |
| Clean article text | extract_content | 2 |
| CSS-selector fields | scrape_structured | 2 |
| Discover URLs | map_site | 2 |
| Web search | search_web | 5 |
| Anti-bot bypass | stealth_mode | 5 |
| Deep research | deep_research | 10 |
Troubleshooting
Tools list is empty -- Agent mode is off. Enable chat.agent.enabled in settings and switch chat mode to "Agent."
"Spawn ENOENT: crawlforge-mcp-server" -- VS Code cannot find the binary. Use an absolute path in .vscode/mcp.json: "command": "/usr/local/bin/crawlforge-mcp-server".
Copilot ignores the MCP server -- Prefix your prompt with #crawlforge or explicitly name a tool: "Use fetch_url to...". Copilot sometimes picks built-in tools by default.
401 Unauthorized -- The env var is not reaching the server. Check echo $CRAWLFORGE_API_KEY in the same shell that launched VS Code. On macOS, GUI-launched VS Code does not inherit shell env -- launch via code . from the terminal.
Workspace MCP config not loading -- VS Code does not auto-trust .vscode/mcp.json. Open the file, click the "Trust" notification, then reload the window.
Next Steps
- Read the 20-tools overview to see the full toolkit
- Study the MCP protocol explainer for how VS Code talks to servers
- See getting started docs for the REST API
- Compare to Firecrawl alternative if you are evaluating scraping vendors
Start free with 1,000 credits at crawlforge.dev/signup.