Crawling · 1 credit per page

crawl_deep

Discover and crawl entire websites with intelligent breadth-first search, URL filtering, and configurable depth control. Respects robots.txt and crawl delays.

Use Cases

Site Architecture Analysis

Discover all pages and understand website structure for SEO audits

Content Discovery

Find all blog posts, products, or documentation pages automatically

Competitive Intelligence

Map competitor websites and discover new products or features

Broken Link Detection

Crawl sites to find 404s, redirects, and broken internal links

Data Migration

Discover all pages before migrating or archiving a website

Sitemap Generation

Create comprehensive sitemaps for SEO or documentation

Endpoint

POST /api/v1/tools/crawl_deep
Auth Required
2 req/s on Free plan
1 credit

Parameters

url (string, required)
Starting URL for the crawl (must be same domain).
Example: https://example.com

maxDepth (number, optional, default: 3)
Maximum crawl depth (1-10 levels).
Example: 5

maxPages (number, optional, default: 100)
Maximum pages to crawl (1-1000).
Example: 500

includePatterns (string[], optional)
Only crawl URLs matching these regex patterns.
Example: ["/blog/.*", "/products/.*"]

excludePatterns (string[], optional)
Skip URLs matching these regex patterns.
Example: ["/admin/.*", ".*\\.(pdf|zip)$"]

respectRobotsTxt (boolean, optional, default: true)
Respect robots.txt directives.
Example: true

sameDomain (boolean, optional, default: true)
Only crawl URLs on the same domain.
Example: true

crawlDelay (number, optional, default: 1000)
Delay between requests in milliseconds (100-5000).
Example: 2000

Request Examples

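The example below is a minimal Bash sketch using curl. The base URL (https://api.example.com) and the Bearer-token Authorization header are assumptions; substitute the host and credential format from your own account.

# Sketch of a crawl_deep request. The base URL and auth header format are
# assumptions; only the path /api/v1/tools/crawl_deep comes from this page.
curl -X POST "https://api.example.com/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "maxDepth": 5,
    "maxPages": 500,
    "includePatterns": ["/blog/.*", "/products/.*"],
    "excludePatterns": ["/admin/.*", ".*\\.(pdf|zip)$"],
    "respectRobotsTxt": true,
    "crawlDelay": 2000
  }'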

Response Example

200 OK (45,200 ms)
{
  "success": true,
  "data": {
    "startUrl": "https://example.com",
    "pagesDiscovered": 487,
    "pagesCrawled": 487,
    "maxDepthReached": 5,
    "robotsTxtRespected": true,
    "crawlStarted": "2025-10-01T12:00:00Z",
    "crawlCompleted": "2025-10-01T12:00:45Z",
    "urls": [
      {
        "url": "https://example.com",
        "depth": 0,
        "status": 200,
        "title": "Example Domain",
        "linksFound": 15
      },
      {
        "url": "https://example.com/blog",
        "depth": 1,
        "status": 200,
        "title": "Blog - Example",
        "linksFound": 42
      },
      {
        "url": "https://example.com/blog/post-1",
        "depth": 2,
        "status": 200,
        "title": "First Blog Post",
        "linksFound": 8
      }
    ],
    "statistics": {
      "status200": 450,
      "status301": 20,
      "status404": 15,
      "status500": 2,
      "avgResponseTime": 234,
      "totalSize": 12500000
    }
  },
  "credits_used": 487,
  "credits_remaining": 513,
  "processing_time": 45200
}

Field Descriptions

data.pagesDiscovered: Total unique URLs found during the crawl
data.pagesCrawled: Number of pages successfully fetched
data.maxDepthReached: Maximum depth level reached
data.urls: Array of all discovered URLs with metadata
data.statistics: Aggregate crawl statistics
credits_used: 1 credit per page crawled (not per page discovered)
processing_time: Total crawl duration in milliseconds (varies by site size)
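
As a usage sketch, the broken-link use case can be served directly from this payload. Assuming the response body was saved to a file named response.json, a jq one-liner lists the URLs that returned 404:

# Print every crawled URL that came back as a 404
# (response.json is assumed to hold the JSON body shown above).
jq -r '.data.urls[] | select(.status == 404) | .url' response.json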

Error Handling

Robots.txt Blocked (403 Forbidden)

The site's robots.txt disallows crawling. Set respectRobotsTxt=false to override (use responsibly).
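
If you have explicit permission to crawl the site anyway, the override is a single flag. The sketch below reuses the assumed base URL and auth header from the request example above:

# Override robots.txt (only with the site owner's permission).
curl -X POST "https://api.example.com/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "respectRobotsTxt": false}'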

Max Pages Reached (200 OK with warning)

Crawl stopped at maxPages limit. Increase limit or filter URLs more specifically.

Invalid Pattern (400 Bad Request)

includePatterns or excludePatterns contains invalid regex. Check pattern syntax.
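
Note that patterns are regular expressions embedded in JSON strings, so backslashes must be doubled; a stray single backslash or an unclosed group is what typically triggers this error. A sketch with correctly escaped patterns (same assumed base URL and auth header as above):

# The JSON string ".*\\.(pdf|zip)$" becomes the regex .*\.(pdf|zip)$ after parsing.
curl -X POST "https://api.example.com/api/v1/tools/crawl_deep" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "excludePatterns": ["/admin/.*", ".*\\.(pdf|zip)$"]}'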

Insufficient Credits (402 Payment Required)

Credits are reserved upfront based on the estimated crawl size. Add more credits before starting large crawls.

Credit Cost

1 credit per page crawled
Credits charged for successfully crawled pages only. Failed requests (404s, timeouts) don't consume credits.

Example Costs:

Small site (50 pages): 50 credits

Medium site (500 pages): 500 credits

Large site (1000 pages max): 1,000 credits

Plan Recommendations:

Free Plan: 1,000 credits = 1,000 pages or 20 small sites

Hobby Plan: 5,000 credits = 5,000 pages or 100 small sites ($19/mo)

Professional Plan: 50,000 credits = 50 large sites ($99/mo)

Related Tools