CrawlForge API Reference

crawl_deep

Discover and crawl entire websites with intelligent breadth-first search, URL filtering, and configurable depth control. Respects robots.txt and crawl delays.

Use Cases

Site Architecture Analysis

Discover all pages and understand website structure for SEO audits

Content Discovery

Find all blog posts, products, or documentation pages automatically

Competitive Intelligence

Map competitor websites and discover new products or features

Broken Link Detection

Crawl sites to find 404s, redirects, and broken internal links

Data Migration

Discover all pages before migrating or archiving a website

Sitemap Generation

Create comprehensive sitemaps for SEO or documentation

Endpoint

POST /api/v1/tools/crawl_deep
Auth Required
2 req/s on Free plan
4 credits

Parameters

url (string, required)
Starting URL for the crawl (must be same domain).
Example: https://example.com

maxDepth (number, optional, default: 3)
Maximum crawl depth (1-10 levels).
Example: 5

maxPages (number, optional, default: 100)
Maximum pages to crawl (1-1000).
Example: 500

includePatterns (string[], optional)
Only crawl URLs matching these regex patterns.
Example: ["/blog/.*", "/products/.*"]

excludePatterns (string[], optional)
Skip URLs matching these regex patterns.
Example: ["/admin/.*", ".*\\.(pdf|zip)$"]

respectRobotsTxt (boolean, optional, default: true)
Respect robots.txt directives.
Example: true

sameDomain (boolean, optional, default: true)
Only crawl URLs on the same domain as the starting URL.
Example: true

crawlDelay (number, optional, default: 1000)
Delay between requests in milliseconds (100-5000).
Example: 2000

Request Examples

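Below is a minimal cURL sketch. The host placeholder (YOUR_CRAWLFORGE_HOST), the Authorization: Bearer header, and the CRAWLFORGE_API_KEY variable are assumptions; substitute the endpoint host and authentication details from your CrawlForge account.

# Minimal crawl_deep request sketch.
# Assumption: the API key is exported as CRAWLFORGE_API_KEY and sent as a Bearer token.
curl -X POST "https://YOUR_CRAWLFORGE_HOST/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $CRAWLFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "url": "https://example.com",
        "maxDepth": 5,
        "maxPages": 500,
        "excludePatterns": [".*\\.(pdf|zip)$"],
        "crawlDelay": 2000
      }'

Only url is required; the other fields override the defaults listed in the Parameters section above.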

Response Example

200 OK (45,200 ms)
{
  "success": true,
  "data": {
    "startUrl": "https://example.com",
    "pagesDiscovered": 487,
    "pagesCrawled": 487,
    "maxDepthReached": 5,
    "robotsTxtRespected": true,
    "crawlStarted": "2025-10-01T12:00:00Z",
    "crawlCompleted": "2025-10-01T12:00:45Z",
    "urls": [
      {
        "url": "https://example.com",
        "depth": 0,
        "status": 200,
        "title": "Example Domain",
        "linksFound": 15
      },
      {
        "url": "https://example.com/blog",
        "depth": 1,
        "status": 200,
        "title": "Blog - Example",
        "linksFound": 42
      },
      {
        "url": "https://example.com/blog/post-1",
        "depth": 2,
        "status": 200,
        "title": "First Blog Post",
        "linksFound": 8
      }
    ],
    "statistics": {
      "status200": 450,
      "status301": 20,
      "status404": 15,
      "status500": 2,
      "avgResponseTime": 234,
      "totalSize": 12500000
    }
  },
  "credits_used": 4,
  "credits_remaining": 996,
  "processing_time": 45200
}

Field Descriptions

data.pagesDiscovered: Total unique URLs found during the crawl
data.pagesCrawled: Number of pages successfully fetched
data.maxDepthReached: Maximum depth level reached
data.urls: Array of all discovered URLs with metadata
data.statistics: Aggregate crawl statistics
credits_used: 4 credits per crawl request (flat fee)
processing_time: Total crawl duration in milliseconds (varies by site size)
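
As a post-processing sketch, assuming the response above is saved to response.json and jq is installed, the per-URL metadata and aggregate statistics can be filtered locally, for example to list pages that returned 404 during a broken-link audit:

# List every crawled URL that returned a 404, with its depth.
jq -r '.data.urls[] | select(.status == 404) | "\(.depth)\t\(.url)"' response.json

# Summarize the aggregate status counts.
jq '.data.statistics' response.json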

Error Handling

Robots.txt Blocked (403 Forbidden)

The site's robots.txt disallows crawling. Set respectRobotsTxt=false to override (use responsibly).

Max Pages Reached (200 OK with warning)

The crawl stopped at the maxPages limit. Increase the limit or use includePatterns/excludePatterns to narrow the crawl.

Invalid Pattern (400 Bad Request)

includePatterns or excludePatterns contains invalid regex. Check pattern syntax.

Insufficient Credits (402 Payment Required)

Credits are reserved upfront based on an estimate. Add more credits before starting large crawls.
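
A shell sketch that branches on the documented status codes (same assumed placeholder host, Bearer auth header, and CRAWLFORGE_API_KEY variable as in the request example above):

# Capture the HTTP status code alongside the response body.
status=$(curl -s -o response.json -w "%{http_code}" \
  -X POST "https://YOUR_CRAWLFORGE_HOST/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $CRAWLFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "maxDepth": 3}')

case "$status" in
  200) echo "Crawl finished; see response.json (check for a maxPages warning)." ;;
  400) echo "Invalid includePatterns/excludePatterns regex; fix the pattern syntax." ;;
  402) echo "Insufficient credits; top up before retrying." ;;
  403) echo "Blocked by robots.txt; retry with respectRobotsTxt=false only if appropriate." ;;
  *)   echo "Unexpected status: $status" ;;
esac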

Pro Tip: Use includePatterns to crawl only the sections you need (e.g., /blog/); this saves credits and reduces crawl time. Set crawlDelay to 1-2 seconds (1000-2000 ms) to avoid overwhelming smaller sites.
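
For example, a request scoped to the blog section with a 2-second delay might look like the following (same assumed host and auth header as above):

# Crawl only /blog/ URLs, two levels deep, with a 2-second delay between requests.
curl -X POST "https://YOUR_CRAWLFORGE_HOST/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $CRAWLFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "url": "https://example.com",
        "maxDepth": 2,
        "includePatterns": ["/blog/.*"],
        "crawlDelay": 2000
      }'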

Credit Cost

4 credits per request
Flat fee per crawl request, regardless of how many pages are discovered. Crawl up to 1,000 pages per request.

What's Included:

  • Up to 1,000 pages per crawl
  • Configurable depth (1-10 levels)
  • URL pattern filtering
  • robots.txt handling
  • Full crawl statistics

Plan Recommendations:

Free Plan: 1,000 credits = 250 crawl requests

Hobby Plan: 5,000 credits = 1,250 crawl requests ($19/mo)

Professional Plan: 50,000 credits = 12,500 crawl requests ($99/mo)

Related Tools

  • map_site: Fast sitemap discovery without a full crawl (2 credits)
  • batch_scrape: Scrape discovered URLs in parallel (5 credits)
  • extract_links: Extract links from a single page (1 credit)
  • screenshot: Capture screenshots of discovered pages (2 credits)

Ready to try crawl_deep? Sign up for free and get 1,000 credits to start building.