Most AI tools love to be agents. The CrawlForge CLI is built for the opposite: scriptable, terminal-first, predictable. You install it, set an environment variable, and every one of CrawlForge's 23 tools becomes a shell command. JSON in, JSON out. Pipe to jq, schedule with cron, run in CI -- it works the same way everywhere.
Table of Contents
- What Is the CrawlForge CLI?
- Install in 30 Seconds
- The 15 Commands at a Glance
- Your First Scrape
- Piping JSON Output to jq
- Scheduling With Cron
- CLI vs MCP vs Raw API
- Three Real-World Workflows
- Global Flags Reference
- What It Costs
What Is the CrawlForge CLI?
The CrawlForge CLI is a standalone npm package (@crawlforge/cli) that exposes all 23 CrawlForge tools as terminal commands. It is not a wrapper around an MCP server, and it does not need a long-running process. You type crawlforge scrape <url>, it makes an HTTPS call to CrawlForge's API, and prints JSON to stdout. That is the entire story.
It exists because half the scraping work people do is not agent-shaped. Cron jobs, CI steps, one-off research, ad-hoc pulls from a shell -- those want plain old commands, not a JSON-RPC handshake.
Install in 30 Seconds
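Run npm install -g @crawlforge/cli, then set the CRAWLFORGE_API_KEY environment variable to your API key. That is it. No config file, no auth flow, no service to start. If you do not have an API key yet, grab one at crawlforge.dev/signup -- you get 1,000 free credits on signup.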
To make the env var permanent on macOS or Linux:
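One way to do that is to append an export line to your shell profile. The helper below is a minimal sketch, not part of the CLI: the function name persist_crawlforge_key and the default of ~/.profile are assumptions, and zsh users would pass ~/.zshrc instead. It skips the append if the profile already exports the variable, so running it twice is harmless.

```shell
# Hypothetical helper: persist CRAWLFORGE_API_KEY in a shell profile.
# Usage: persist_crawlforge_key "<your-key>" "$HOME/.zshrc"
persist_crawlforge_key() {
  key="$1"
  profile="${2:-$HOME/.profile}"   # assumed default; pass your own profile path
  line="export CRAWLFORGE_API_KEY=\"$key\""
  # Append only if the profile does not already export the variable.
  if ! grep -q '^export CRAWLFORGE_API_KEY=' "$profile" 2>/dev/null; then
    printf '%s\n' "$line" >> "$profile"
  fi
}
```

Open a new terminal (or source the profile) for the change to take effect.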
On Windows (PowerShell), one option is setx CRAWLFORGE_API_KEY "<your-key>", which persists the variable for new sessions.
The 15 Commands at a Glance
Every command maps to one or more CrawlForge tools:
| Command | Primary tool | Credits | Example |
|---|---|---|---|
| scrape | fetch_url, extract_content | 1-2 | crawlforge scrape https://example.com |
| search | search_web | 5 | crawlforge search "MCP servers 2026" |
| crawl | crawl_deep | 4 | crawlforge crawl https://docs.example.com --depth 3 |
| map | map_site | 2 | crawlforge map https://example.com |
| extract | extract_with_llm | 3 | crawlforge extract <url> --schema schema.json |
| track | track_changes | 3 | crawlforge track <url> --baseline |
| analyze | analyze_content | 3 | crawlforge analyze <url> |
| research | deep_research | 10 | crawlforge research "AI agents in 2026" |
| stealth | stealth_mode | 5 | crawlforge stealth <url> |
| batch | batch_scrape | 5 | crawlforge batch urls.txt |
| actions | scrape_with_actions | 5 | crawlforge actions <url> --steps steps.json |
| localize | localization | 2 | crawlforge localize <url> --country DE |
| llmstxt | generate_llms_txt | 5 | crawlforge llmstxt https://example.com |
| template | scrape_template | 1 | crawlforge template amazon --url <url> |
| monitor | track_changes | 3 | crawlforge monitor <url> --interval 24h |
Your First Scrape
The simplest possible call is crawlforge scrape https://example.com.
What comes back is the page's main content as JSON.
Want just the URLs? Pipe the output to jq.
Want it in a file? Add --output <file>, or redirect stdout.
Piping JSON Output to jq
This is the workflow that makes the CLI worth installing. Everything outputs JSON, and JSON pipes into anything.
Get the top 10 HN story titles:
Search the web and extract URLs:
Scrape a page and count words:
Batch scrape, then filter for error responses:
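The search, word-count, and batch examples above might look like the following. Treat this as a sketch under assumptions: the function names are made up for illustration, and the JSON field names (a top-level results array with url and error fields, a top-level content string) are guesses at the response shape, not documented output. Run the command once with --pretty to see the real fields and adjust the jq filters to match.

```shell
# Print one result URL per line for a search query.
# Assumes a top-level "results" array whose items carry "url".
search_urls() {
  crawlforge search "$1" --json | jq -r '.results[]?.url // empty'
}

# Count the words in a page's extracted text.
# Assumes the text lives in a top-level "content" string.
count_words() {
  crawlforge scrape "$1" --json | jq -r '.content // ""' | wc -w | tr -d ' '
}

# Keep only the failed items from a batch run, one compact JSON object per line.
# Assumes each failed item in "results" carries a non-null "error" field.
batch_errors() {
  crawlforge batch "$1" --json | jq -c '.results[]? | select(.error != null)'
}
```

For example, search_urls "MCP servers 2026" would print the result URLs, and batch_errors urls.txt would print only the items that failed.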
The pattern: --json gives you machine-readable output, then jq slices and dices.
Scheduling With Cron
A daily check on a competitor's pricing page:
A nightly research run:
A weekly llms.txt regeneration for your own site:
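The three schedules above could be written as crontab entries like the ones this helper prints. It is a sketch with placeholder values: the function name, URLs, output paths, and run times are all assumptions. The commands and the --output flag come from the tables in this post. Two cron facts matter here: cron does not read your shell profile, so the key is set inside the crontab, and cron's PATH is minimal, so you may need the absolute path from command -v crawlforge.

```shell
# Hypothetical helper: print crontab entries for the three jobs described above.
print_crawlforge_crontab() {
  cat <<'EOF'
# cron does not load your shell profile, so set the key here (placeholder value).
CRAWLFORGE_API_KEY=your-api-key
# Daily at 06:00: check a competitor's pricing page.
0 6 * * * crawlforge track https://competitor.example/pricing --output "$HOME/crawlforge/pricing.json"
# Nightly at 02:00: research run.
0 2 * * * crawlforge research "AI agents in 2026" --output "$HOME/crawlforge/research.json"
# Sundays at 03:00: regenerate llms.txt output for your own site.
0 3 * * 0 crawlforge llmstxt https://example.com --output "$HOME/crawlforge/llmstxt.json"
EOF
}
```

To append these to your existing crontab rather than replace it, something like ( crontab -l 2>/dev/null; print_crawlforge_crontab ) | crontab - should work. Create the output directory first.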
In CI? Use the same commands in your GitHub Actions YAML. The CLI reads CRAWLFORGE_API_KEY from the environment, so store the key as a repository secret and map it into the step's env.
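In a CI script it can help to fail fast when the key is missing, before any credits are spent or a job runs halfway. A minimal guard, assuming nothing beyond the CRAWLFORGE_API_KEY variable named above (the function name is hypothetical):

```shell
# Hypothetical guard: stop early if the API key is not in the environment.
require_api_key() {
  if [ -z "${CRAWLFORGE_API_KEY:-}" ]; then
    echo "CRAWLFORGE_API_KEY is not set" >&2
    return 1
  fi
}
```

Call it at the top of the step, for example require_api_key || exit 1, and put the crawlforge commands after it.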
CLI vs MCP vs Raw API: When to Use Each
| Workflow | Use the CLI | Use MCP | Use Raw API |
|---|---|---|---|
| One-off scrape from your terminal | yes | no | no |
| Cron job or CI step | yes | no | only if you need to |
| Claude / Cursor / Windsurf agent | no | yes | no |
| Embedded in a Node/Python service | no | only if MCP-shaped | yes |
| Long-running background worker | no | no | yes |
| Quick exploration of an unfamiliar site | yes | maybe | no |
Rule of thumb: if a human is typing the command, use the CLI. If an LLM is selecting the tool, use MCP. If a server is calling it in a loop, use the raw API.
Three Real-World Workflows
1. Competitive Pricing Monitor
A shell script that runs daily, scrapes three competitor pricing pages, diffs against yesterday's snapshot, and posts to Slack if anything changed.
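A sketch of that script, under stated assumptions. The names check_page and notify, the SLACK_WEBHOOK_URL variable, and the snapshot layout are hypothetical. The Slack call uses a standard incoming webhook, which accepts a JSON body with a text field. The comparison is byte-for-byte, so it assumes track output is identical between runs when the page has not changed. If the response includes timestamps, compare a jq-selected field instead.

```shell
# Hypothetical helper: post a message to a Slack incoming webhook.
# No JSON escaping is done here, which is fine for plain URLs.
notify() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "{\"text\": \"$1\"}" "$SLACK_WEBHOOK_URL" >/dev/null
}

# Fetch one page, compare with the previous snapshot, alert on a difference.
check_page() {
  url="$1"
  dir="$2"
  # Turn the URL into a safe file name.
  name=$(printf '%s' "$url" | tr -c 'A-Za-z0-9' '_')
  old="$dir/$name.json"
  new="$dir/$name.new"

  crawlforge track "$url" --json > "$new" || return 1

  # The first run has no snapshot yet, so it only records a baseline.
  if [ -f "$old" ] && ! cmp -s "$old" "$new"; then
    notify "Pricing page changed: $url"
  fi
  mv "$new" "$old"
}
```

The daily job then calls check_page once per competitor URL with a shared snapshot directory, for example check_page https://competitor.example/pricing "$HOME/crawlforge/snapshots".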
Cost: ~9 credits per day (3 competitors × 3 credits for track).
2. Lead Enrichment From a CSV
Read a CSV of company domains, scrape each homepage for contact info, write enriched data back.
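A minimal version of that loop, assuming a CSV with no header row and the domain in the first column. The function name enrich_domains is hypothetical. Because the response fields are not listed in this post, the sketch saves one raw JSON file per domain rather than guessing at contact fields. Merging selected fields back into the CSV is a jq step you would add once you have seen the real output.

```shell
# Hypothetical helper: scrape the homepage of every domain in a CSV.
# Usage: enrich_domains companies.csv out/
enrich_domains() {
  csv="$1"
  outdir="$2"
  mkdir -p "$outdir"
  # Read the first comma-separated field of each row; ignore the rest.
  # The "|| [ -n ... ]" part handles a last line with no trailing newline.
  while IFS=, read -r domain _rest || [ -n "$domain" ]; do
    [ -z "$domain" ] && continue
    # </dev/null keeps the command from consuming the loop's input.
    crawlforge scrape "https://$domain" --json > "$outdir/$domain.json" < /dev/null \
      || echo "scrape failed for $domain" >&2
  done < "$csv"
}
```

Failures are logged to stderr and the loop keeps going, so one dead domain does not stop the run.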
Cost: 1 credit per company.
3. Research Report Pipeline
A weekly Sunday cron that runs a research query, summarizes the result, and emails it to the team.
Cost: 13 credits per run (10 for research, 3 for analyze).
Global Flags Reference
These work on every command:
- --json -- machine-readable output (default for piping; use --pretty for human-readable JSON)
- --output <file> -- write to file instead of stdout
- --timeout <ms> -- override default 30s timeout
- --verbose -- print debug info to stderr
- --api-key <key> -- override the env var
What It Costs
The CLI itself is free. You pay only for the underlying tool calls, billed against your existing credit balance. No extra subscription, no per-invocation fee. A daily cron that runs track against three URLs and research once a week costs roughly 310 credits per month (about 270 for track at 9 credits per day, plus about 40 for four research runs) -- well within the free tier.
Ready to install? Get your free API key at crawlforge.dev/signup and run npm install -g @crawlforge/cli. New here? Read the v4.2.2 launch announcement for everything new, or the original MCP quickstart for the MCP version instead.