# Batch Processing Guide
Scale web scraping to thousands of URLs with efficient queue management, error recovery, and performance optimization strategies.
## 1. Using the `batch_scrape` Tool
The `batch_scrape` tool handles up to 50 URLs concurrently, with built-in rate limiting and webhook notifications.
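As a minimal sketch of what a call might look like: the `BatchScrapeOptions` interface, the `batchScrape` wrapper, and the endpoint URL below are illustrative assumptions, not the tool's confirmed API. The option names mirror the parameters referenced elsewhere in this guide.

```typescript
// Hypothetical TypeScript wrapper around the batch_scrape tool.
interface BatchScrapeOptions {
  urls: string[];            // up to 50 URLs per call
  maxConcurrency?: number;   // concurrent scrapes within the batch
  formats?: string[];        // e.g. ["markdown"]
  onlyMainContent?: boolean; // strip navigation/boilerplate from results
  webhook?: string;          // notified when the batch completes
}

async function batchScrape(opts: BatchScrapeOptions): Promise<unknown[]> {
  // Illustrative HTTP shape; substitute your real endpoint and auth header.
  const res = await fetch("https://api.example.com/v1/batch_scrape", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(opts),
  });
  if (!res.ok) throw new Error(`batch_scrape failed: HTTP ${res.status}`);
  return res.json();
}
```

Later sketches in this guide reuse this `batchScrape` wrapper.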
## 2. Queue Management
Process thousands of URLs by chunking them into batches and managing a queue.
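A minimal sketch of that pattern, assuming the hypothetical `batchScrape` wrapper above and the 50-URL per-call limit:

```typescript
// Split a large URL list into batches of at most 50, then drain them
// as a simple FIFO queue, one batch_scrape call per batch.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function processQueue(urls: string[]): Promise<void> {
  const queue = chunk(urls, 50);
  while (queue.length > 0) {
    const batch = queue.shift()!; // safe: length checked above
    await batchScrape({ urls: batch, maxConcurrency: 10 });
  }
}
```

Processing batches sequentially keeps the overall request rate predictable; the per-URL concurrency lives inside each batch.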
## 3. Error Recovery
Handle failures gracefully with retry logic and error tracking.
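One common approach is exponential backoff with a failure log. The sketch below assumes the `batchScrape` wrapper from section 1 and is illustrative rather than prescriptive:

```typescript
// Retry a failed batch with exponential backoff (1s, 2s, 4s, ...); URLs
// that still fail after the last attempt are recorded for later review.
const failedUrls: { url: string; error: string }[] = [];

async function scrapeWithRetry(urls: string[], maxRetries = 3): Promise<void> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      await batchScrape({ urls, maxConcurrency: 5 });
      return; // success: nothing more to do
    } catch (err) {
      if (attempt === maxRetries) {
        // Track failures instead of aborting the whole job.
        for (const url of urls) {
          failedUrls.push({ url, error: String(err) });
        }
        return;
      }
      const delayMs = 1000 * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Failed URLs can then be re-queued in a smaller batch or inspected by hand.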
## 4. Performance Optimization
Maximize throughput and minimize costs with these optimization strategies:

- **Optimize concurrency.** Start with `maxConcurrency: 5`, and increase to 10 on Professional/Business plans.
- **Use onlyMainContent.** Set `onlyMainContent: true` to reduce response size by 60-80%.
- **Choose minimal formats.** Request `formats: ["markdown"]` instead of multiple formats (`html`, `text`, `screenshot`).
- **Cache results.** Store scraped data in Redis or a database to avoid re-scraping the same URLs (see the sketch after this list).
- **Avoid over-batching.** Don't exceed 50 URLs per batch; split larger jobs into multiple requests.
- **Don't ignore rate limits.** Respect your plan's rate limits (Free: 5/s, Hobby: 10/s, Pro: 50/s, Business: 100/s).
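Putting these tips together, a cost-conscious batch call might look like the sketch below. The in-memory `Map` stands in for Redis or a database, and `batchScrape` is the hypothetical wrapper from section 1, so treat the shape as illustrative rather than the tool's confirmed API:

```typescript
// Skip URLs that were already scraped, then request the leanest output:
// markdown only, main content only, concurrency tuned to the plan tier.
const cache = new Map<string, unknown>(); // stand-in for Redis or a database

async function scrapeUncached(urls: string[]): Promise<void> {
  const fresh = urls.filter((url) => !cache.has(url));
  if (fresh.length === 0) return; // everything was served from cache

  const batch = fresh.slice(0, 50); // never exceed 50 URLs per batch
  const results = await batchScrape({
    urls: batch,
    maxConcurrency: 5,       // raise to 10 on Professional/Business plans
    formats: ["markdown"],   // a single format keeps responses small
    onlyMainContent: true,   // cuts response size by roughly 60-80%
  });

  // Persist results so repeat requests for these URLs hit the cache.
  results.forEach((result, i) => cache.set(batch[i], result));
}
```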
Expected timings by batch size:

| Batch size | Approx. time | Configuration |
| --- | --- | --- |
| Small (10 URLs) | ~5 seconds | `maxConcurrency: 5` |
| Medium (50 URLs) | ~15 seconds | `maxConcurrency: 10` |
| Large (500 URLs) | ~3 minutes | 10 batches × 50 URLs |
| Massive (5,000 URLs) | ~30 minutes | 100 batches × 50 URLs |

The large-batch figures assume sequential batches of 50: for example, 100 batches at roughly 15-18 seconds each works out to about 30 minutes.