Advanced Guide · 15 min read

Advanced Scraping Techniques

Master complex scraping scenarios including dynamic content, authentication-protected pages, JavaScript rendering, and AJAX handling with CrawlForge MCP.

In This Guide

1. Dynamic Content & JavaScript

2. Authentication & Sessions

3. AJAX & Infinite Scroll

4. Rate Limit Handling

1. Dynamic Content & JavaScript

Many modern websites render content with JavaScript after the initial page load. Use scrape_with_actions to wait for dynamic elements.

When to Use Browser Automation

Single-Page Apps (SPAs): React, Vue, Angular apps that load data asynchronously

Lazy Loading: Images, videos, or content that loads on scroll

Interactive Elements: Dropdowns, modals, or tabs that reveal content

Static HTML: No browser automation needed. Use fetch_url instead (5x cheaper).

Example: Scraping a React SPA

Cost: 5 credits
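As a rough sketch of what such a call might look like, the TypeScript below builds an MCP `tools/call` request for scrape_with_actions. The JSON-RPC envelope follows the MCP convention for tool calls. The argument names (`url`, `actions`, `type`, `selector`, `timeout`), the URL, and the selector are assumptions made for illustration and are not CrawlForge's confirmed schema, so check the tool's published input schema before relying on them.

```typescript
// Hypothetical action shape; the real scrape_with_actions schema may differ.
interface WaitAction {
  type: "wait";
  selector: string;
  timeout?: number; // milliseconds (assumed unit)
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in a JSON-RPC 2.0 "tools/call" envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Wait for the (assumed) product grid to render before the page is extracted.
const actions: WaitAction[] = [
  { type: "wait", selector: ".product-grid", timeout: 10000 },
];

const spaRequest = buildToolCall(1, "scrape_with_actions", {
  url: "https://example.com/app", // placeholder URL
  actions,
});

console.log(JSON.stringify(spaRequest, null, 2));
```

The key idea is the explicit wait step: the scrape should not read the DOM until the element that holds the data exists.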

Pro Tip: Always try fetch_url first. Many SPAs pre-render content in the initial HTML or expose API endpoints you can call directly.
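The "static first" advice can be sketched as a small fallback helper. This is a minimal illustration, not part of CrawlForge: `needsBrowser`, `fetchStaticFirst`, and the two injected fetcher callbacks are hypothetical names, and the marker-text check is only one simple heuristic for detecting client-side rendering.

```typescript
// Returns true when the expected marker text is missing from the raw HTML,
// which suggests the content is rendered client-side and needs a browser.
function needsBrowser(rawHtml: string, marker: string): boolean {
  return !rawHtml.toLowerCase().includes(marker.toLowerCase());
}

type Fetcher = (url: string) => Promise<string>;

// Try the cheap static fetch first; fall back to the browser-based fetch
// only when the marker is absent from the initial HTML.
async function fetchStaticFirst(
  url: string,
  marker: string,
  fetchStatic: Fetcher, // e.g. a wrapper around fetch_url (assumed)
  fetchRendered: Fetcher, // e.g. a wrapper around scrape_with_actions (assumed)
): Promise<{ html: string; usedBrowser: boolean }> {
  const html = await fetchStatic(url);
  if (!needsBrowser(html, marker)) {
    return { html, usedBrowser: false };
  }
  return { html: await fetchRendered(url), usedBrowser: true };
}
```

Pick a marker that only appears once the real content is present, such as a product name or a price label, rather than text that also lives in the page shell.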

2. Authentication & Sessions

Scrape pages behind login forms or API authentication using cookies, headers, or automated form submission.

Strategy 1: Cookie Authentication

Best for sites where you can obtain session cookies manually, for example by copying them from a logged-in browser session.
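A minimal sketch of the cookie approach: serialize the cookies you copied into a standard `Cookie` request header and send it with each request. The helper names and the cookie names and values are placeholders, and how the header is passed to a CrawlForge tool is not shown because that parameter is not documented here.

```typescript
// Serialize name/value pairs into a Cookie header value ("a=1; b=2").
// Values are assumed to be already encoded exactly as the browser stored them.
function buildCookieHeader(cookies: Record<string, string>): string {
  return Object.keys(cookies)
    .map((cookieName) => `${cookieName}=${cookies[cookieName]}`)
    .join("; ");
}

// Build request headers that carry an existing session.
function sessionHeaders(cookies: Record<string, string>): Record<string, string> {
  return { Cookie: buildCookieHeader(cookies) };
}

// Placeholder cookies copied from a logged-in browser session.
const headers = sessionHeaders({ session_id: "abc123", csrf_token: "xyz789" });
console.log(headers.Cookie); // session_id=abc123; csrf_token=xyz789
```

Session cookies expire, so expect to refresh them when requests start returning the login page again.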

Strategy 2: Automated Login with Forms

Automate the entire login process with form_submit.

Security Note: Never hardcode credentials. Use environment variables and rotate them regularly. Consider using OAuth or API tokens when available.
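One way to follow this note is to load credentials from the environment and fail fast when they are missing. The sketch below assumes hypothetical variable names (`SCRAPER_USERNAME`, `SCRAPER_PASSWORD`); the returned values could then be supplied as the field values of an automated login.

```typescript
interface Credentials {
  username: string;
  password: string;
}

// Read login credentials from an environment-like object and fail fast if
// either is missing, so secrets never need to live in source code.
// The variable names below are hypothetical; use whatever your deployment defines.
function loadCredentials(env: Record<string, string | undefined>): Credentials {
  const username = env["SCRAPER_USERNAME"];
  const password = env["SCRAPER_PASSWORD"];
  if (!username || !password) {
    throw new Error("Set SCRAPER_USERNAME and SCRAPER_PASSWORD before running");
  }
  return { username, password };
}

// In Node.js, pass process.env:
//   const creds = loadCredentials(process.env);
```

Taking the environment as a parameter keeps the function easy to test and avoids reading global state in more than one place.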

3. AJAX & Infinite Scroll

Capture content that loads as you scroll or click "Load More" buttons.

Infinite Scroll Example

Cost: 5 credits
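The control flow behind infinite-scroll scraping can be sketched independently of any particular tool: keep triggering the next load until a round produces nothing new or a safety limit is hit. In this sketch `collectUntilStable` and the injected `loadMore` callback are hypothetical; in practice the callback would scroll the page or click "Load More" and then re-extract the item list.

```typescript
// Triggers the next load and returns every item visible so far.
// Injected so the loop can be tested without a real browser.
type LoadMore<T> = () => Promise<T[]>;

// Keep loading until a round adds no new items or maxRounds is reached.
async function collectUntilStable<T>(
  loadMore: LoadMore<T>,
  maxRounds = 20,
): Promise<T[]> {
  let items: T[] = [];
  for (let round = 0; round < maxRounds; round++) {
    const next = await loadMore();
    if (next.length <= items.length) {
      break; // nothing new appeared, so assume the end of the feed
    }
    items = next;
  }
  return items;
}
```

The `maxRounds` cap matters: some feeds never end, and every extra round costs time and possibly credits.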

4. Rate Limit Handling

When a request returns HTTP 429 (Too Many Requests), retry it with exponential backoff: wait longer after each failed attempt, and give up after a fixed number of retries.

Retry Logic Example
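A minimal sketch of retry with exponential backoff, assuming the request is wrapped in a function that resolves to an object with a numeric `status`. The names `backoffDelay`, `withRetry`, and `HttpResult`, and the default delays, are illustrative choices rather than CrawlForge APIs.

```typescript
interface HttpResult {
  status: number;
  body: string;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `request` until it returns something other than 429 or retries run out.
async function withRetry(
  request: () => Promise<HttpResult>,
  maxRetries = 4,
  wait: (ms: number) => Promise<void> = sleep, // injectable for tests
): Promise<HttpResult> {
  for (let attempt = 0; ; attempt++) {
    const res = await request();
    if (res.status !== 429 || attempt >= maxRetries) {
      return res; // success, a non-rate-limit response, or retries exhausted
    }
    await wait(backoffDelay(attempt));
  }
}
```

With the defaults the waits are 500 ms, 1 s, 2 s, and 4 s. Adding random jitter to each delay, and honoring a `Retry-After` header when the server sends one, are common refinements left out here for brevity.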

Next Steps

Continue with more advanced guides:

Batch Processing: Scale to thousands of URLs

Stealth Techniques: Bypass anti-bot systems
