Is the CrawlForge CLI free?

The CLI package itself is free and open. You pay only for the underlying tool calls billed against your normal CrawlForge credit balance, the same as you would from MCP or the raw API. There is no extra per-invocation fee.

Do I need a CrawlForge API key to use the CLI?

Yes. The CLI reads the CRAWLFORGE_API_KEY environment variable on every call. Get a free key at crawlforge.dev/signup (no credit card required) and set it once in your shell profile.

Can I use the CrawlForge CLI in CI/CD pipelines?

Yes -- this is one of its primary use cases. Install via "npm install -g crawlforge-mcp-server" in your CI runner, set CRAWLFORGE_API_KEY as a repository secret, and run any command. It works the same in GitHub Actions, GitLab CI, CircleCI, and Jenkins.

How is the CrawlForge CLI different from curl?

curl gives you raw HTML. The CrawlForge CLI returns structured JSON: cleaned content, extracted metadata, links, headings, and tool-specific fields like search results, research summaries, or template-scraped product data. It also handles anti-bot defenses, stealth mode, and browser automation -- all things curl cannot do.

Does the CLI support all 26 CrawlForge tools?

Yes. The 15 commands cover all 26 tools (some commands expose multiple tools via flags). For example, "crawlforge extract" maps to extract_with_llm by default and extract_structured with the --css flag.

Can the CrawlForge CLI output structured data for parsing?

Yes -- pass --json on any command and the output is clean JSON suitable for piping into jq or any JSON-aware tool. Use --pretty for human-readable formatting, or redirect stdout to a file (crawlforge scrape --pretty > out.json).

Web Scraping From the CLI: The CrawlForge CLI Guide

Most AI tools love to be agents. The CrawlForge CLI is built for the opposite: scriptable, terminal-first, predictable. You install it, set an environment variable, and every one of CrawlForge's 26 tools becomes a shell command. JSON in, JSON out. Pipe to jq, schedule with cron, run in CI -- it works the same way everywhere.

What Is the CrawlForge CLI?
Install in 30 Seconds
The 15 Commands at a Glance
Your First Scrape
Piping JSON Output to jq
Scheduling With Cron
CLI vs MCP vs Raw API
Three Real-World Workflows
Global Flags Reference
What It Costs

What Is the CrawlForge CLI?

The CrawlForge CLI ships inside the crawlforge-mcp-server package as the crawlforge command and exposes all 26 CrawlForge tools as terminal commands. A single global install gives you both the MCP server and the CLI. It does not need a long-running process or an MCP client: you type crawlforge scrape <url>, it makes an HTTPS call to CrawlForge's API, and prints JSON to stdout. That is the entire story.

It exists because half the scraping work people do is not agent-shaped. Cron jobs, CI steps, one-off research, ad-hoc pulls from a shell -- those want plain old commands, not a JSON-RPC handshake.

Install in 30 Seconds

Bash

npm install -g crawlforge-mcp-server
export CRAWLFORGE_API_KEY="cf_live_your_key_here"
crawlforge --help

That is it. No config file, no auth flow, no service to start. If you do not have an API key yet, grab one at crawlforge.dev/signup -- you get 1,000 free credits on signup.

To make the env var permanent on macOS or Linux:

Bash

echo 'export CRAWLFORGE_API_KEY="cf_live_..."' >> ~/.zshrc
source ~/.zshrc

On Windows (PowerShell):

Powershell

[Environment]::SetEnvironmentVariable("CRAWLFORGE_API_KEY", "cf_live_...", "User")

The 15 Commands at a Glance

Every command maps to one or more CrawlForge tools:

Command	Primary tool	Credits	Example
`scrape`	`fetch_url`, `extract_content`	1-2	`crawlforge scrape https://example.com`
`search`	`search_web`	5	`crawlforge search "MCP servers 2026"`
`crawl`	`crawl_deep`	4	`crawlforge crawl https://docs.example.com --depth 3`
`map`	`map_site`	2	`crawlforge map https://example.com`
`extract`	`extract_with_llm`	3	`crawlforge extract <url> --schema schema.json`
`track`	`track_changes`	3	`crawlforge track <url> --threshold 10`
`analyze`	`analyze_content`	3	`crawlforge analyze <url>`
`research`	`deep_research`	10	`crawlforge research "AI agents in 2026"`
`stealth`	`stealth_mode`	5	`crawlforge stealth <url>`
`batch`	`batch_scrape`	5	`crawlforge batch urls.txt`
`actions`	`scrape_with_actions`	5	`crawlforge actions <url> --script steps.json`
`localize`	`localization`	2	`crawlforge localize <url> --country DE`
`llmstxt`	`generate_llms_txt`	5	`crawlforge llmstxt https://example.com`
`template`	`scrape_template`	1	`crawlforge template amazon-product <url>`
`monitor`	`track_changes`	3	`crawlforge monitor <url> --interval 3600`

Your First Scrape

The simplest possible call:

Bash

crawlforge scrape https://news.ycombinator.com

What comes back is the page's main content as JSON:

Json

{
  "url": "https://news.ycombinator.com",
  "title": "Hacker News",
  "content": "Hacker News new | past | comments | ask...",
  "links": ["https://news.ycombinator.com/from?site=...", "..."],
  "fetched_at": "2026-05-21T10:14:33Z",
  "credits_used": 1
}

Want just the URLs? Pipe to jq:

Bash

crawlforge scrape https://news.ycombinator.com --json | jq '.links[]'

Want it in a file? Redirect stdout:

Bash

crawlforge scrape https://news.ycombinator.com --pretty > hn.json

Piping JSON Output to jq

This is the workflow that makes the CLI worth installing. Everything outputs JSON, and JSON pipes into anything.

Get the HN front-page story titles:

Bash

crawlforge template hacker-news-front-page https://news.ycombinator.com --json \
  | jq -r '.stories[] | .title'

Search the web and extract URLs:

Bash

crawlforge search "best web scraping libraries 2026" --json \
  | jq '.results[] | .url'

Scrape a page and count words:

Bash

crawlforge scrape https://example.com --json \
  | jq -r '.content' \
  | wc -w

Batch scrape, then filter for error responses:

Bash

crawlforge batch urls.txt --json \
  | jq '.results[] | select(.status_code >= 400)'

The pattern: --json gives you machine-readable output, then jq slices and dices.

Scheduling With Cron

A daily check on a competitor's pricing page:

Bash

# crontab -e
0 9 * * * /usr/local/bin/crawlforge track https://competitor.com/pricing --json > /var/log/pricing.json

A nightly research run:

Bash

0 2 * * * /usr/local/bin/crawlforge research "AI tooling news" --depth standard --pretty > /var/log/research.json

A weekly llms.txt regeneration for your own site:

Bash

0 3 * * 0 /usr/local/bin/crawlforge llmstxt https://yoursite.com --include-full > /var/www/yoursite.com/llms.txt

In CI? Use the same commands in your GitHub Actions YAML. The CLI checks CRAWLFORGE_API_KEY first, so just set it as a repository secret.

Yaml

# .github/workflows/daily-research.yml
- name: Run weekly research
  env:
    CRAWLFORGE_API_KEY: ${{ secrets.CRAWLFORGE_API_KEY }}
  run: |
    npm install -g crawlforge-mcp-server
    crawlforge research "industry news" --depth standard --pretty > report.json

CLI vs MCP vs Raw API: When to Use Each

Workflow	Use the CLI	Use MCP	Use Raw API
One-off scrape from your terminal	yes	no	no
Cron job or CI step	yes	no	only if you need to
Claude / Cursor / Windsurf agent	no	yes	no
Embedded in a Node/Python service	no	only if MCP-shaped	yes
Long-running background worker	no	no	yes
Quick exploration of an unfamiliar site	yes	maybe	no

Rule of thumb: if a human is typing the command, use the CLI. If an LLM is selecting the tool, use MCP. If a server is calling it in a loop, use the raw API.

Three Real-World Workflows

1. Competitive Pricing Monitor

A shell script that runs daily, scrapes three competitor pricing pages, diffs against yesterday's snapshot, and posts to Slack if anything changed.

Bash

#!/bin/bash
for url in $(cat competitors.txt); do
  crawlforge track "$url" --json \
    > "snapshots/$(date +%F)-$(basename $url).json"
done

# Diff against yesterday's snapshot
diff "snapshots/$(date -v-1d +%F)-pricing.json" \
     "snapshots/$(date +%F)-pricing.json" \
  || curl -X POST $SLACK_WEBHOOK -d '{"text": "Pricing changed"}'

Cost: ~9 credits per day (3 competitors × 3 credits for track).

2. Lead Enrichment From a CSV

Read a CSV of company domains, scrape each homepage for contact info, write enriched data back.

Bash

while IFS=, read -r company domain; do
  data=$(crawlforge scrape "https://$domain" --json)
  email=$(echo "$data" | jq -r '.metadata.contact_email // empty')
  echo "$company,$domain,$email" >> enriched.csv
done < companies.csv

Cost: 1 credit per company.

3. Research Report Pipeline

A weekly Sunday cron that runs a research query and emails the synthesized summary to the team.

Bash

crawlforge research "AI agent frameworks news this week" --depth deep --pretty > report.json
jq -r '.summary' report.json \
  | mail -s "Weekly AI report" team@example.com

Cost: 10 credits per run (research includes the synthesized summary).

Global Flags Reference

These work on every command:

--json -- compact, machine-readable JSON (pipe-friendly)
--pretty -- pretty-printed JSON
--quiet -- suppress all stdout output (exit code only)
--api-key <key> -- override the CRAWLFORGE_API_KEY env var
--timeout <ms> -- override the default 30s timeout

To write results to a file, redirect stdout: crawlforge scrape <url> --pretty > out.json.

What It Costs

The CLI itself is free. You pay only for the underlying tool calls, billed against your existing credit balance. No extra subscription, no per-invocation fee. A daily cron that runs track against three URLs and research once a week costs roughly 100 credits per month -- well within the free tier.

Ready to install? Get your free API key at crawlforge.dev/signup and run npm install -g crawlforge-mcp-server. New here? Read the v4.2.2 launch announcement for everything new, or the original MCP quickstart for the MCP version instead.

What Is the CrawlForge CLI?
Install in 30 Seconds
The 15 Commands at a Glance
Your First Scrape
Piping JSON Output to jq
Scheduling With Cron
CLI vs MCP vs Raw API
Three Real-World Workflows
Global Flags Reference
What It Costs

What Is the CrawlForge CLI?

It exists because half the scraping work people do is not agent-shaped. Cron jobs, CI steps, one-off research, ad-hoc pulls from a shell -- those want plain old commands, not a JSON-RPC handshake.

Install in 30 Seconds

Bash

npm install -g crawlforge-mcp-server
export CRAWLFORGE_API_KEY="cf_live_your_key_here"
crawlforge --help

That is it. No config file, no auth flow, no service to start. If you do not have an API key yet, grab one at crawlforge.dev/signup -- you get 1,000 free credits on signup.

To make the env var permanent on macOS or Linux:

Bash

echo 'export CRAWLFORGE_API_KEY="cf_live_..."' >> ~/.zshrc
source ~/.zshrc

On Windows (PowerShell):

Powershell

[Environment]::SetEnvironmentVariable("CRAWLFORGE_API_KEY", "cf_live_...", "User")

The 15 Commands at a Glance

Every command maps to one or more CrawlForge tools:

Command	Primary tool	Credits	Example
`scrape`	`fetch_url`, `extract_content`	1-2	`crawlforge scrape https://example.com`
`search`	`search_web`	5	`crawlforge search "MCP servers 2026"`
`crawl`	`crawl_deep`	4	`crawlforge crawl https://docs.example.com --depth 3`
`map`	`map_site`	2	`crawlforge map https://example.com`
`extract`	`extract_with_llm`	3	`crawlforge extract <url> --schema schema.json`
`track`	`track_changes`	3	`crawlforge track <url> --threshold 10`
`analyze`	`analyze_content`	3	`crawlforge analyze <url>`
`research`	`deep_research`	10	`crawlforge research "AI agents in 2026"`
`stealth`	`stealth_mode`	5	`crawlforge stealth <url>`
`batch`	`batch_scrape`	5	`crawlforge batch urls.txt`
`actions`	`scrape_with_actions`	5	`crawlforge actions <url> --script steps.json`
`localize`	`localization`	2	`crawlforge localize <url> --country DE`
`llmstxt`	`generate_llms_txt`	5	`crawlforge llmstxt https://example.com`
`template`	`scrape_template`	1	`crawlforge template amazon-product <url>`
`monitor`	`track_changes`	3	`crawlforge monitor <url> --interval 3600`

Your First Scrape

The simplest possible call:

Bash

crawlforge scrape https://news.ycombinator.com

What comes back is the page's main content as JSON:

Json

{
  "url": "https://news.ycombinator.com",
  "title": "Hacker News",
  "content": "Hacker News new | past | comments | ask...",
  "links": ["https://news.ycombinator.com/from?site=...", "..."],
  "fetched_at": "2026-05-21T10:14:33Z",
  "credits_used": 1
}

Want just the URLs? Pipe to jq:

Bash

crawlforge scrape https://news.ycombinator.com --json | jq '.links[]'

Want it in a file? Redirect stdout:

Bash

crawlforge scrape https://news.ycombinator.com --pretty > hn.json

Piping JSON Output to jq

This is the workflow that makes the CLI worth installing. Everything outputs JSON, and JSON pipes into anything.

Get the HN front-page story titles:

Bash

crawlforge template hacker-news-front-page https://news.ycombinator.com --json \
  | jq -r '.stories[] | .title'

Search the web and extract URLs:

Bash

crawlforge search "best web scraping libraries 2026" --json \
  | jq '.results[] | .url'

Scrape a page and count words:

Bash

crawlforge scrape https://example.com --json \
  | jq -r '.content' \
  | wc -w

Batch scrape, then filter for error responses:

Bash

crawlforge batch urls.txt --json \
  | jq '.results[] | select(.status_code >= 400)'

The pattern: --json gives you machine-readable output, then jq slices and dices.

Scheduling With Cron

A daily check on a competitor's pricing page:

Bash

# crontab -e
0 9 * * * /usr/local/bin/crawlforge track https://competitor.com/pricing --json > /var/log/pricing.json

A nightly research run:

Bash

0 2 * * * /usr/local/bin/crawlforge research "AI tooling news" --depth standard --pretty > /var/log/research.json

A weekly llms.txt regeneration for your own site:

Bash

0 3 * * 0 /usr/local/bin/crawlforge llmstxt https://yoursite.com --include-full > /var/www/yoursite.com/llms.txt

In CI? Use the same commands in your GitHub Actions YAML. The CLI checks CRAWLFORGE_API_KEY first, so just set it as a repository secret.

Yaml

# .github/workflows/daily-research.yml
- name: Run weekly research
  env:
    CRAWLFORGE_API_KEY: ${{ secrets.CRAWLFORGE_API_KEY }}
  run: |
    npm install -g crawlforge-mcp-server
    crawlforge research "industry news" --depth standard --pretty > report.json

CLI vs MCP vs Raw API: When to Use Each

Workflow	Use the CLI	Use MCP	Use Raw API
One-off scrape from your terminal	yes	no	no
Cron job or CI step	yes	no	only if you need to
Claude / Cursor / Windsurf agent	no	yes	no
Embedded in a Node/Python service	no	only if MCP-shaped	yes
Long-running background worker	no	no	yes
Quick exploration of an unfamiliar site	yes	maybe	no

Rule of thumb: if a human is typing the command, use the CLI. If an LLM is selecting the tool, use MCP. If a server is calling it in a loop, use the raw API.

Three Real-World Workflows

1. Competitive Pricing Monitor

A shell script that runs daily, scrapes three competitor pricing pages, diffs against yesterday's snapshot, and posts to Slack if anything changed.

Bash

#!/bin/bash
for url in $(cat competitors.txt); do
  crawlforge track "$url" --json \
    > "snapshots/$(date +%F)-$(basename $url).json"
done

# Diff against yesterday's snapshot
diff "snapshots/$(date -v-1d +%F)-pricing.json" \
     "snapshots/$(date +%F)-pricing.json" \
  || curl -X POST $SLACK_WEBHOOK -d '{"text": "Pricing changed"}'

Cost: ~9 credits per day (3 competitors × 3 credits for track).

2. Lead Enrichment From a CSV

Read a CSV of company domains, scrape each homepage for contact info, write enriched data back.

Bash

while IFS=, read -r company domain; do
  data=$(crawlforge scrape "https://$domain" --json)
  email=$(echo "$data" | jq -r '.metadata.contact_email // empty')
  echo "$company,$domain,$email" >> enriched.csv
done < companies.csv

Cost: 1 credit per company.

3. Research Report Pipeline

A weekly Sunday cron that runs a research query and emails the synthesized summary to the team.

Bash

crawlforge research "AI agent frameworks news this week" --depth deep --pretty > report.json
jq -r '.summary' report.json \
  | mail -s "Weekly AI report" team@example.com

Cost: 10 credits per run (research includes the synthesized summary).

Global Flags Reference

These work on every command:

--json -- compact, machine-readable JSON (pipe-friendly)
--pretty -- pretty-printed JSON
--quiet -- suppress all stdout output (exit code only)
--api-key <key> -- override the CRAWLFORGE_API_KEY env var
--timeout <ms> -- override the default 30s timeout

To write results to a file, redirect stdout: crawlforge scrape <url> --pretty > out.json.

On this page

Table of Contents

What Is the CrawlForge CLI?

Install in 30 Seconds

The 15 Commands at a Glance

Your First Scrape

Piping JSON Output to jq

Scheduling With Cron

CLI vs MCP vs Raw API: When to Use Each

Three Real-World Workflows

1. Competitive Pricing Monitor

2. Lead Enrichment From a CSV

3. Research Report Pipeline

Global Flags Reference

What It Costs

Try this yourself — no signup needed

Tags

About the Author

CrawlForge Team

Stay updated with the latest insights

Frequently Asked Questions

Related Articles

How to Use CrawlForge with Make and Zapier

How to Scrape Websites with Claude Code (2026 Guide)

How to Build a Web-Scraping MCP Server in TypeScript (2026)

On this page

Table of Contents

What Is the CrawlForge CLI?

Install in 30 Seconds

The 15 Commands at a Glance

Your First Scrape

Piping JSON Output to jq

Scheduling With Cron

CLI vs MCP vs Raw API: When to Use Each

Three Real-World Workflows

1. Competitive Pricing Monitor

2. Lead Enrichment From a CSV

3. Research Report Pipeline

Global Flags Reference

What It Costs

Try this yourself — no signup needed

Tags

About the Author

CrawlForge Team

Stay updated with the latest insights

Frequently Asked Questions

Related Articles

How to Use CrawlForge with Make and Zapier

How to Scrape Websites with Claude Code (2026 Guide)

How to Build a Web-Scraping MCP Server in TypeScript (2026)