Dify is an open-source LLM app development platform that lets you build AI applications with a visual workflow editor. By adding CrawlForge as a custom tool, your Dify workflows gain the ability to scrape websites, search the web, and extract structured data -- all without writing code.
This guide covers both the no-code approach (Dify's visual tool configuration) and the API-based approach for advanced integrations.
Table of Contents
- What Is Dify?
- Prerequisites
- Step 1: Set Up a Custom Tool Provider
- Step 2: Define CrawlForge Tool Schemas
- Step 3: Build a Web Research Workflow
- Step 4: Build a Content Extraction Pipeline
- Step 5: Handle Authentication and Errors
- Credit Cost Reference
- CrawlForge Tools Available in Dify
- Next Steps
What Is Dify?
Dify is a production-ready platform for building LLM applications. It provides a visual workflow builder, agent orchestration, RAG pipeline management, and a library of 50+ built-in tools. Dify supports custom tool integration through OpenAPI specifications, which means any REST API -- including CrawlForge -- can be added as a tool.
Dify's native MCP integration also means you can connect CrawlForge as an MCP server directly. This guide covers both approaches.
Prerequisites
- Dify instance -- either Dify Cloud or self-hosted via Docker
- A CrawlForge account with an API key (1,000 free credits)
- Admin access to your Dify workspace
Step 1: Set Up a Custom Tool Provider
In your Dify dashboard, navigate to Tools > Custom Tools > Create Custom Tool.
Paste the following OpenAPI specification to register CrawlForge's core tools:
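A minimal spec along these lines covers the core endpoints. The base URL and parameter schemas below are illustrative sketches, not authoritative definitions -- confirm both against the CrawlForge API Reference before importing:

```yaml
openapi: 3.0.1
info:
  title: CrawlForge
  description: Web scraping, search, and structured data extraction
  version: "1.0.0"
servers:
  # Hypothetical base URL -- use the one shown in your CrawlForge dashboard
  - url: https://api.crawlforge.dev
paths:
  /fetch_url:
    post:
      operationId: fetch_url
      summary: Fetch a web page (1 credit). Use when the user provides a specific URL.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [url]
              properties:
                url:
                  type: string
                  description: The page to fetch
      responses:
        "200":
          description: Page content
  /search_web:
    post:
      operationId: search_web
      summary: Search the web (5 credits). Use to find pages on a topic.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [query]
              properties:
                query:
                  type: string
                  description: Search query
      responses:
        "200":
          description: Search results
  # /extract_content and /scrape_structured follow the same POST + JSON pattern;
  # see the CrawlForge API reference for their exact parameter schemas.
```

Dify generates one tool card per `operationId`, so keep the IDs matched to the endpoint names above.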
Set the authentication to Bearer Token and enter your CrawlForge API key (cf_live_...).
Step 2: Define CrawlForge Tool Schemas
After importing the OpenAPI spec, Dify automatically generates tool cards for each endpoint. Configure each tool with descriptive names so the LLM agent can select them correctly:
| Dify Tool Name | CrawlForge Endpoint | Credits | When the Agent Should Use It |
|---|---|---|---|
| Fetch Web Page | /fetch_url | 1 | User provides a specific URL to read |
| Extract Content | /extract_content | 2 | Need clean, readable text from a page |
| Search the Web | /search_web | 5 | Need to find pages on a topic |
| Extract Structured Data | /scrape_structured | 2 | Need specific data points via CSS selectors |
For each tool in Dify, add a clear description that includes the credit cost. This helps the LLM agent make cost-efficient decisions.
Step 3: Build a Web Research Workflow
In Dify's workflow editor, create a new workflow with these nodes:
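One possible arrangement is sketched below. Node names and variable references are illustrative -- Dify resolves variables by node ID, so adapt the `{{#...#}}` references to your workflow's actual node IDs:

```
Start            (user question)
  → Search the Web        search_web, query: {{#start.question#}}     (5 credits)
  → Iteration             over the top 3 search results
      → Extract Content   extract_content, url: {{#item.url#}}        (2 credits each)
  → LLM                   summarize the extracted text, cite sources
  → End                   final answer
```

A run of this shape costs roughly 11 credits (one search plus three extractions), which is worth noting in the LLM node's prompt if you want the agent to stay cost-aware.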
The visual workflow in Dify makes this a drag-and-drop operation. Each node connects to the next, with data flowing through template variables.
Step 4: Build a Content Extraction Pipeline
For recurring data extraction tasks, build a pipeline workflow:
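A typical shape, sketched with illustrative node names (the webhook destination is an assumption -- point it at your own endpoint):

```
Start                      (scheduled trigger or URL-list input)
  → Extract Structured Data   scrape_structured, url + CSS selectors   (2 credits per page)
  → Code                      validate and reshape the returned JSON
  → HTTP Request              POST the cleaned records to your webhook or database
  → End
```

For larger URL lists, swapping the per-page node for `batch_scrape` (5 credits) avoids looping over individual calls.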
Step 5: Handle Authentication and Errors
Authentication
CrawlForge uses Bearer token authentication. In Dify, set this once at the custom tool provider level:
- Go to Tools > Custom Tools > CrawlForge
- Click Configure Authorization
- Select API Key (Bearer)
- Enter your CrawlForge API key
All tool calls within workflows automatically include the auth header.
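If you take the API-based approach instead -- for example, calling CrawlForge from a Dify Code node -- the same Bearer scheme applies. A minimal stdlib sketch; the base URL is an assumption, so substitute the one from your CrawlForge dashboard:

```python
import json
import urllib.request

def crawlforge_headers(api_key: str) -> dict:
    """Build the auth headers CrawlForge expects (Bearer token)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

def fetch_url(api_key: str, url: str,
              base: str = "https://api.crawlforge.dev") -> dict:
    """POST to /fetch_url (1 credit). Base URL here is hypothetical."""
    req = urllib.request.Request(
        f"{base}/fetch_url",
        data=json.dumps({"url": url}).encode(),
        headers=crawlforge_headers(api_key),
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Keep the key in an environment variable or Dify's credential store rather than hard-coding it in a node.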
Error Handling
Add error handling nodes in your Dify workflow for common scenarios:
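The scenarios worth branching on can be expressed as a small Code-node sketch. The status codes are standard HTTP; the branch labels are illustrative, so name them after your own downstream nodes:

```python
def route_error(status_code: int) -> str:
    """Map a CrawlForge HTTP status to a Dify workflow branch.

    Branch labels are illustrative -- match them to your own nodes.
    """
    if status_code == 402:           # credit exhaustion: stop and notify
        return "notify_user"
    if status_code == 429:           # rate limited: let the retry back off
        return "retry"
    if status_code in (401, 403):    # bad or revoked API key
        return "check_credentials"
    if status_code >= 500:           # transient upstream failure
        return "retry"
    return "continue"                # success or recoverable response
```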
Dify's built-in retry mechanism handles transient failures automatically. For credit exhaustion errors (HTTP 402), route to a notification node that alerts the user.
Credit Cost Reference
| Credits | Tools | Dify Workflow Use Case |
|---|---|---|
| 1 | fetch_url, extract_text, extract_links, extract_metadata | Simple page fetching triggers |
| 2 | scrape_structured, extract_content, summarize_content, generate_llms_txt | Extraction pipeline nodes |
| 3 | map_site, process_document, analyze_content, localization | Site audit workflows |
| 5 | search_web, crawl_deep, batch_scrape, scrape_with_actions, stealth_mode | Research and bulk workflows |
| 10 | deep_research | Comprehensive analysis workflows |
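To budget a workflow before running it, the per-call costs above can be tallied in a small helper (credit values taken from the table; the function itself is a hypothetical sketch, not part of the CrawlForge SDK):

```python
# Credit cost per CrawlForge tool, from the reference table above
CREDIT_COSTS = {
    "fetch_url": 1, "extract_text": 1, "extract_links": 1, "extract_metadata": 1,
    "scrape_structured": 2, "extract_content": 2, "summarize_content": 2,
    "generate_llms_txt": 2,
    "map_site": 3, "process_document": 3, "analyze_content": 3, "localization": 3,
    "search_web": 5, "crawl_deep": 5, "batch_scrape": 5,
    "scrape_with_actions": 5, "stealth_mode": 5,
    "deep_research": 10,
}

def workflow_cost(calls: list[str]) -> int:
    """Total credits a sequence of tool calls will consume."""
    return sum(CREDIT_COSTS[tool] for tool in calls)

# A research workflow: one search plus three content extractions
workflow_cost(["search_web"] + ["extract_content"] * 3)  # → 11
```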
CrawlForge Tools Available in Dify
All 18 CrawlForge tools can be registered in Dify. The most commonly used in visual workflows are:
| Tool | Credits | Why It Works Well in Dify |
|---|---|---|
| search_web | 5 | Natural starting point for research workflows |
| extract_content | 2 | Clean output feeds directly into LLM nodes |
| scrape_structured | 2 | CSS selectors return predictable, structured JSON |
| fetch_url | 1 | Cheapest option for simple page access |
| batch_scrape | 5 | Handles loops more efficiently than individual calls |
Next Steps
- Dify Documentation -- official Dify platform docs
- CrawlForge API Reference -- endpoint schemas for all 18 tools
- Complete MCP Guide -- understanding MCP protocol integration
- CrawlForge Pricing -- credit packs starting at $19/month
Add web scraping to your Dify apps today. Get your free API key with 1,000 credits and register CrawlForge as a custom tool in Dify. No code required.