Cursor IDE is great at reasoning about your code but cannot see the live web. Add CrawlForge via Cursor's Model Context Protocol (MCP) integration and Composer gains 20 scraping tools -- no Python script, no curl, no leaving the editor.
This guide walks through setting up web scraping in Cursor IDE step by step, with runnable examples for research, structured extraction, and competitor monitoring.
Table of Contents
- Why Scrape from Inside Cursor?
- Prerequisites
- Step 1: Install the MCP Server
- Step 2: Configure Cursor's MCP Settings
- Step 3: Restart and Verify
- Step 4: Your First Scrape in Composer
- Full Example: Build a Competitor Price Tracker
- Workflow: Use Scrapes to Write Code
- Troubleshooting
- FAQ
Why Scrape from Inside Cursor?
Cursor Composer treats MCP tools as first-class actions: it picks the right tool for a task, passes typed arguments, and feeds results back into the conversation. When you scrape websites in Cursor IDE through CrawlForge, the extracted data is immediately available for Cursor to generate tests, write TypeScript interfaces, or update dashboards. No copy-paste, no context switch.
If you already use Cursor rules to shape Composer's behavior, MCP tools slot right in -- rules describe how to code, tools expose what Composer can do.
Prerequisites
- Cursor IDE 0.42+ -- download from cursor.com
- Node.js 18+ -- run node --version to check
- CrawlForge account -- free at crawlforge.dev/signup
Step 1: Install the MCP Server
Install the server globally from npm, then confirm it is on your PATH:
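A minimal sketch, assuming the package is published to npm under the same name as the crawlforge-mcp-server binary referenced in Troubleshooting below:

```shell
# Install the MCP server globally (package name assumed to match its binary).
npm install -g crawlforge-mcp-server

# Confirm the binary resolves on your PATH.
which crawlforge-mcp-server
```

If `which` prints nothing, see the PATH fix in Troubleshooting.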
Step 2: Configure Cursor's MCP Settings
Cursor reads MCP servers from ~/.cursor/mcp.json. Create it if it does not exist:
Paste this config (replace the key):
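A minimal mcp.json sketch, assuming Cursor's standard mcpServers shape; the server name, command, and CRAWLFORGE_API_KEY variable match the ones referenced later in this guide, but the placeholder key is yours to replace:

```json
{
  "mcpServers": {
    "crawlforge": {
      "command": "crawlforge-mcp-server",
      "env": {
        "CRAWLFORGE_API_KEY": "your-api-key-here"
      }
    }
  }
}
```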
On Windows the file lives at %USERPROFILE%\.cursor\mcp.json and the command should be crawlforge-mcp-server.cmd.
Step 3: Restart and Verify
- Quit Cursor completely (Cmd+Q on macOS).
- Reopen the project.
- Go to Settings -> Features -> MCP. You should see crawlforge with a green dot and 20 tools listed.
If the server is red or the tools list is empty, skip to Troubleshooting.
Step 4: Your First Scrape in Composer
Open Composer (Cmd+I) and paste:
Use CrawlForge to fetch https://news.ycombinator.com and list the top 5 story titles.
Composer proposes a fetch_url call (1 credit); once you approve the tool call, it parses the HTML and returns a clean list of titles.
Full Example: Build a Competitor Price Tracker
Say you want to track pricing changes on a competitor SaaS. Open Composer and paste:
Use scrape_structured to pull pricing from https://competitor.example.com/pricing.
Fields: plan (h3), price (.price), features (ul li).
Then generate a TypeScript type for the response.
Cursor issues a scrape_structured call with your selectors, returns JSON, and emits this TypeScript in the next editor chunk:
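Given the selectors in the prompt (plan from h3, price from .price, features from ul li), the emitted type might look like this sketch; the exact field names and the array wrapper are assumptions:

```typescript
// Hypothetical shape for the scrape_structured response; field names
// mirror the selector labels in the prompt above.
interface PricingPlan {
  plan: string;       // h3 heading text
  price: string;      // raw .price text, e.g. "$49/mo"
  features: string[]; // one entry per ul li
}

type PricingResponse = PricingPlan[];

// Example value in that shape:
const sample: PricingResponse = [
  { plan: "Pro", price: "$49/mo", features: ["Unlimited seats", "API access"] },
];
```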
Total cost: 2 credits per run. Schedule it via Vercel Cron or GitHub Actions and you have a free-tier price tracker.
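For the GitHub Actions route, a minimal scheduled-workflow sketch; the script path and secret name are assumptions, and the script itself would call the CrawlForge API directly rather than going through Composer:

```yaml
name: price-tracker
on:
  schedule:
    - cron: "0 8 * * *" # daily at 08:00 UTC
jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: node scripts/track-prices.mjs # hypothetical script hitting the CrawlForge API
        env:
          CRAWLFORGE_API_KEY: ${{ secrets.CRAWLFORGE_API_KEY }}
```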
Workflow: Use Scrapes to Write Code
The real unlock is feeding scraped data into Cursor's code generation. Proven patterns:
- Type generation from live APIs: "Fetch https://api.example.com/users, then generate a Zod schema matching the response."
- Test fixtures from real pages: "Scrape the top 3 articles from Hacker News and save them as JSON fixtures in tests/fixtures/."
- Documentation extraction: "Use extract_content on the React docs for useState, then write an idiomatic example that matches."
- Competitor feature parity: "Use map_site on competitor.com and flag any URL patterns we do not have in our own sitemap."
Each pattern is 1-5 credits per run and keeps you inside Cursor.
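The fixtures pattern can be sketched in plain TypeScript; the Article shape and toFixture helper are hypothetical, standing in for whatever Composer generates from the scraped data:

```typescript
// Hypothetical shape of a scraped Hacker News item.
interface Article {
  title: string;
  url: string;
  points: number;
}

// Turn scraped items into a deterministic JSON fixture: sort by points
// so the fixture is stable across runs, then keep the top N.
function toFixture(articles: Article[], limit = 3): string {
  const top = [...articles].sort((a, b) => b.points - a.points).slice(0, limit);
  return JSON.stringify(top, null, 2);
}
```

Write the returned string to tests/fixtures/ and your tests never hit the network.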
Credit Costs Summary
| Operation | Tool | Credits |
|---|---|---|
| Fetch HTML | fetch_url | 1 |
| Clean text | extract_text | 1 |
| Readable article | extract_content | 2 |
| CSS-selector extract | scrape_structured | 2 |
| Sitemap discovery | map_site | 2 |
| Web search | search_web | 5 |
| SPA with clicks | scrape_with_actions | 5 |
| Anti-bot bypass | stealth_mode | 5 |
Troubleshooting
Tools list empty in Cursor Settings -- Cursor caches MCP config. Fully quit (Cmd+Q), then reopen. Check ~/.cursor/logs/ for parse errors.
"Command not found: crawlforge-mcp-server" -- npm's global bin is not on Cursor's PATH. Fix by setting an absolute path in mcp.json: "command": "/usr/local/bin/crawlforge-mcp-server".
Every call returns 401 -- API key missing or still the placeholder. Verify with: curl -H "Authorization: Bearer $CRAWLFORGE_API_KEY" https://crawlforge.dev/api/v1/credits/balance.
Cursor asks for approval on every tool call -- That is expected default behavior. Enable "Auto-approve for trusted servers" in MCP settings if you want Composer to run scrapes silently.
Composer ignores the MCP tool -- Explicitly prompt: "Use CrawlForge's scrape_structured tool to...". Cursor sometimes defaults to its built-in web fetch, which is less capable.
Next Steps
- Read the Cursor rules guide to optimize Composer behavior for scraping
- Browse the 20-tools overview to see what else you can automate
- Check the getting started docs for API reference and credit pricing
- Compare vendors at Firecrawl alternative
Start free with 1,000 credits at crawlforge.dev/signup.