Intermediate Guide · 12 min read

Batch Processing Guide

Scale web scraping to thousands of URLs with efficient queue management, error recovery, and performance optimization strategies.

In This Guide
1. Using batch_scrape Tool
2. Queue Management
3. Error Recovery
4. Performance Optimization

1. Using batch_scrape Tool

The batch_scrape tool accepts up to 50 URLs per request and scrapes them concurrently, with built-in rate limiting and webhook notifications.

Basic Batch Scraping
Cost: 1 credit per URL (50 URLs = 50 credits)
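As a minimal sketch of preparing a basic batch, the helper below validates a URL list against the 50-URL limit and builds a request payload. The option names formats, onlyMainContent, and maxConcurrency come from this guide; the payload layout, the `urls` field name, and the helper names are assumptions for illustration, not the documented API.

```typescript
// Hypothetical request shape. Only formats, onlyMainContent, and
// maxConcurrency are option names that appear in this guide.
interface BatchScrapeOptions {
  formats?: string[];
  onlyMainContent?: boolean;
  maxConcurrency?: number;
}

interface BatchScrapeRequest extends BatchScrapeOptions {
  urls: string[]; // assumed field name
}

const MAX_URLS_PER_BATCH = 50; // per-batch limit stated in this guide

function buildBatchRequest(
  urls: string[],
  options: BatchScrapeOptions = {}
): BatchScrapeRequest {
  if (urls.length === 0) {
    throw new Error("At least one URL is required");
  }
  if (urls.length > MAX_URLS_PER_BATCH) {
    throw new Error(`A batch may contain at most ${MAX_URLS_PER_BATCH} URLs`);
  }
  // Catch obviously malformed entries before any credits are spent.
  for (const url of urls) {
    if (!/^https?:\/\//.test(url)) {
      throw new Error(`Not an http(s) URL: ${url}`);
    }
  }
  // Conservative defaults; caller-supplied options override them.
  return {
    urls,
    formats: ["markdown"],
    onlyMainContent: true,
    maxConcurrency: 5,
    ...options,
  };
}

// The guide prices a batch at 1 credit per URL.
function estimateCredits(request: BatchScrapeRequest): number {
  return request.urls.length;
}
```

The resulting object can be serialized with JSON.stringify and sent with whatever client or endpoint your CrawlForge setup uses.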
Async Processing with Webhooks
Ideal for large jobs (100+ URLs): you are notified when processing completes
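One way to consume a completion webhook is to validate the incoming body and reduce it to lists of succeeded and failed URLs. The payload CrawlForge actually sends is not shown in this guide, so the field names below (batchId, status, results, success, error) are hypothetical; adjust them to the real payload before relying on this sketch.

```typescript
// Hypothetical webhook payload: every field name here is an assumption.
interface BatchWebhookPayload {
  batchId: string;
  status: "completed" | "failed";
  results: Array<{ url: string; success: boolean; error?: string }>;
}

interface BatchSummary {
  batchId: string;
  succeeded: string[];
  failed: string[];
}

// Parse the raw request body and reject anything that does not look
// like the assumed payload shape.
function parseWebhookBody(body: string): BatchWebhookPayload {
  const data = JSON.parse(body);
  if (
    data === null ||
    typeof data !== "object" ||
    typeof data.batchId !== "string" ||
    !Array.isArray(data.results)
  ) {
    throw new Error("Unexpected webhook payload shape");
  }
  return data as BatchWebhookPayload;
}

// Split the results so that failed URLs can be re-queued later.
function summarizeWebhook(payload: BatchWebhookPayload): BatchSummary {
  const succeeded: string[] = [];
  const failed: string[] = [];
  for (const result of payload.results) {
    (result.success ? succeeded : failed).push(result.url);
  }
  return { batchId: payload.batchId, succeeded, failed };
}
```

Keeping the parsing separate from the HTTP server means the same functions work behind any web framework.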

2. Queue Management

Process thousands of URLs by chunking them into batches and managing a queue.

Chunking Strategy
Break large URL lists into manageable batches
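The chunking step can be sketched as a small generic helper plus a sequential driver. The `handler` callback stands in for whatever function submits one batch to batch_scrape; it and the helper names are illustrative assumptions.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run the batches one after another so that only one batch request is
// in flight at a time. `handler` is a placeholder for your batch call.
async function processInBatches<T, R>(
  items: readonly T[],
  size: number,
  handler: (batch: T[]) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    results.push(await handler(batch));
  }
  return results;
}
```

For example, chunking 120 URLs with a size of 50 yields three batches of 50, 50, and 20 URLs.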
Pro Tip: Use Redis or a database to store your queue. This allows you to resume processing if your script crashes or needs to restart.
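A resumable queue along these lines might checkpoint its state after every batch. In this sketch the CheckpointStore interface is hypothetical and the MemoryStore stand-in only keeps state in process memory; a real deployment would implement the same two methods on top of Redis or a database.

```typescript
interface QueueState {
  pending: string[][]; // batches not yet processed
  completed: number; // number of finished batches
}

// Hypothetical storage interface: back it with Redis, a database, or a file.
interface CheckpointStore {
  load(): Promise<string | null>;
  save(serialized: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without external services.
class MemoryStore implements CheckpointStore {
  private value: string | null = null;
  async load(): Promise<string | null> {
    return this.value;
  }
  async save(serialized: string): Promise<void> {
    this.value = serialized;
  }
}

// Pure helper: the state after the first pending batch has finished.
function markFirstBatchDone(state: QueueState): QueueState {
  return { pending: state.pending.slice(1), completed: state.completed + 1 };
}

// Resume from a saved checkpoint if one exists, otherwise start fresh.
async function runQueue(
  batches: string[][],
  store: CheckpointStore,
  processBatch: (batch: string[]) => Promise<void>
): Promise<QueueState> {
  const saved = await store.load();
  let state: QueueState =
    saved !== null
      ? (JSON.parse(saved) as QueueState)
      : { pending: batches, completed: 0 };
  while (state.pending.length > 0) {
    const batch = state.pending[0];
    if (batch === undefined) break;
    await processBatch(batch); // a throw here leaves the checkpoint untouched
    state = markFirstBatchDone(state);
    await store.save(JSON.stringify(state)); // checkpoint after every batch
  }
  return state;
}
```

Because the checkpoint is written only after a batch succeeds, a crash mid-batch causes that batch to be retried on restart, so processing should be safe to repeat.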

3. Error Recovery

Handle failures gracefully with retry logic and error tracking.

Robust Error Handling
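Retry logic with error tracking could look like the following sketch, which uses exponential backoff between attempts. The `scrapeOne` callback is a placeholder for your actual scrape call, and the option and record names are assumptions rather than part of the CrawlForge API.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts per URL, including the first
  baseDelayMs: number; // delay before the second attempt
}

interface ScrapeFailure {
  url: string;
  attempts: number;
  error: string;
}

// Exponential backoff: base, 2x base, 4x base, ...
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Try each URL up to maxAttempts times; record the URLs that never succeed
// instead of aborting the whole run.
async function scrapeWithRetry<T>(
  urls: string[],
  scrapeOne: (url: string) => Promise<T>, // placeholder for the real call
  options: RetryOptions
): Promise<{ results: Array<{ url: string; data: T }>; failures: ScrapeFailure[] }> {
  const results: Array<{ url: string; data: T }> = [];
  const failures: ScrapeFailure[] = [];
  for (const url of urls) {
    for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
      try {
        results.push({ url, data: await scrapeOne(url) });
        break;
      } catch (err) {
        if (attempt === options.maxAttempts) {
          const message = err instanceof Error ? err.message : String(err);
          failures.push({ url, attempts: attempt, error: message });
        } else {
          await sleep(backoffDelay(attempt, options.baseDelayMs));
        }
      }
    }
  }
  return { results, failures };
}
```

This sketch retries every error. In practice you would likely skip retries for permanent failures (for example, a page that does not exist) and persist the failures list for later inspection.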

4. Performance Optimization

Maximize throughput and minimize costs with these optimization strategies.

Optimization Best Practices

Optimize Concurrency

Start with maxConcurrency: 5; increase to 10 on Professional/Business plans

Use onlyMainContent

Set onlyMainContent: true to reduce response size by 60-80%

Choose Minimal Formats

Use formats: ["markdown"] instead of multiple formats (html, text, screenshot)

Cache Results

Store scraped data in Redis/database to avoid re-scraping same URLs

Avoid Over-Batching

Don't exceed 50 URLs per batch; split larger lists into multiple requests instead

Don't Ignore Rate Limits

Respect your plan's rate limits (Free: 5/s, Hobby: 10/s, Pro: 50/s, Business: 100/s)
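Several of these practices can be captured in a few helpers, sketched below. The rate limits and option values are the ones listed above; the plan identifiers, the treatment of "Pro" as the Professional plan, and the function names are assumptions for illustration.

```typescript
type Plan = "free" | "hobby" | "pro" | "business";

// Per-second rate limits as listed in this guide.
const RATE_LIMIT_PER_SECOND: Record<Plan, number> = {
  free: 5,
  hobby: 10,
  pro: 50,
  business: 100,
};

// Options following the practices above: markdown only, main content
// only, and higher concurrency on the two larger plans.
function optionsForPlan(plan: Plan): {
  maxConcurrency: number;
  onlyMainContent: boolean;
  formats: string[];
} {
  return {
    maxConcurrency: plan === "pro" || plan === "business" ? 10 : 5,
    onlyMainContent: true,
    formats: ["markdown"],
  };
}

// Minimum spacing between request starts that stays within the limit.
function minIntervalMs(plan: Plan): number {
  return Math.ceil(1000 / RATE_LIMIT_PER_SECOND[plan]);
}

// Drop duplicates and URLs that are already cached. `cache` can be a Set
// or any wrapper around Redis or a database that exposes has().
function uncachedUrls(
  urls: string[],
  cache: { has(url: string): boolean }
): string[] {
  const seen = new Set<string>();
  return urls.filter((url) => {
    if (seen.has(url) || cache.has(url)) return false;
    seen.add(url);
    return true;
  });
}
```

Filtering against the cache before chunking keeps already-scraped URLs from consuming credits a second time.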

Expected Performance

Small Batch (10 URLs): ~5 seconds (maxConcurrency: 5)

Medium Batch (50 URLs): ~15 seconds (maxConcurrency: 10)

Large Batch (500 URLs): ~3 minutes (10 batches × 50 URLs)

Massive Batch (5,000 URLs): ~30 minutes (100 batches × 50 URLs)
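The large-job figures above follow from simple arithmetic: the number of batches is the URL count divided by 50, rounded up, and both the 500-URL and 5,000-URL figures work out to roughly 18 seconds per full batch (180 s / 10 and 1,800 s / 100). The estimator below is a rough sketch built on that derived per-batch time, which is an assumption and not a guarantee; actual times depend on the target sites and your plan.

```typescript
const MAX_URLS_PER_BATCH = 50; // per-batch limit stated in this guide

// Assumption derived from the figures above:
// ~3 minutes for 10 full batches is about 18 seconds per batch.
const SECONDS_PER_FULL_BATCH = 18;

interface JobEstimate {
  batches: number;
  seconds: number;
  credits: number;
}

// Estimate a job in which batches run one after another.
function estimateJob(urlCount: number): JobEstimate {
  const batches = Math.ceil(urlCount / MAX_URLS_PER_BATCH);
  return {
    batches,
    seconds: batches * SECONDS_PER_FULL_BATCH,
    credits: urlCount, // 1 credit per URL
  };
}
```

Because every batch is costed as a full one, the estimate is pessimistic for small or partial batches; a 10-URL batch is listed above at about 5 seconds, not 18.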
