LlamaIndex Integration
Integrate CrawlForge MCP with LlamaIndex to build data connectors, indexes, and query engines with web scraping capabilities. Perfect for RAG applications and knowledge bases.
Use Cases
Web Data Connectors
Create data connectors that fetch and index web content automatically
Knowledge Bases
Build searchable knowledge bases from web pages and documents
Query Engines
Create query engines with real-time web data retrieval
Document Processing
Extract and process documents from URLs for indexing
Installation
Install LlamaIndex and the CrawlForge MCP adapter.
You'll also need a CrawlForge API key from the dashboard.
Web Data Connector
Use CrawlForge as a data connector to fetch and load web documents.
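The connector pattern described above can be sketched as follows. This is an illustration under stated assumptions, not CrawlForge's own code: the `ToolCaller` type stands in for however your MCP client invokes a tool by name, the `{ url }` argument shape is assumed, and `WebDocument` is a plain object whose `text` and `metadata` fields mirror what you would pass to LlamaIndex's `Document` constructor.

```typescript
// Hypothetical signature: however your MCP client invokes a CrawlForge tool.
type ToolCaller = (tool: string, args: Record<string, unknown>) => Promise<string>;

// Plain object with the same fields you would hand to `new Document({ text, metadata })`.
interface WebDocument {
  text: string;
  metadata: { url: string; tool: string; fetchedAt: string };
}

// Fetch each URL with the chosen tool and wrap the result as a document.
async function loadWebDocuments(
  callTool: ToolCaller,
  urls: string[],
  tool: "extract_content" | "extract_text" = "extract_content",
): Promise<WebDocument[]> {
  const docs: WebDocument[] = [];
  for (const url of urls) {
    const text = await callTool(tool, { url }); // `{ url }` argument shape is an assumption
    if (text.trim().length === 0) continue; // skip pages that yielded no text
    docs.push({ text, metadata: { url, tool, fetchedAt: new Date().toISOString() } });
  }
  return docs;
}

// Stub caller so the sketch runs without network access or an API key.
const stubCaller: ToolCaller = async (_tool, args) => `Content of ${String(args["url"])}`;
```

Passing the tool name as a parameter keeps the choice between `extract_content` and `extract_text` at the call site, and swapping `stubCaller` for a real client should be the only change needed to fetch live pages.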
Tip: Use extract_content for clean article extraction or extract_text for full page text.
Vector Store Index
Create a vector store index from web documents for semantic search.
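In LlamaIndex itself, an index over loaded documents is typically built with `VectorStoreIndex.fromDocuments`. The sketch below is deliberately not that API. It is a toy in-memory index that shows what semantic search over web documents does, with a term-frequency vector swapped in for a real embedding model. All names in it (`embed`, `cosine`, `ToyVectorIndex`) are hypothetical.

```typescript
type Vector = Map<string, number>;

// Toy "embedding": term-frequency counts over lowercase word tokens.
// A real index would call an embedding model here instead.
function embed(text: string): Vector {
  const vec: Vector = new Map<string, number>();
  for (const token of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vec.set(token, (vec.get(token) ?? 0) + 1);
  }
  return vec;
}

// Cosine similarity between two sparse vectors; 0 if either is empty.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((weight, term) => {
    dot += weight * (b.get(term) ?? 0);
    normA += weight * weight;
  });
  b.forEach((weight) => {
    normB += weight * weight;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class ToyVectorIndex {
  private docs: { url: string; vector: Vector }[] = [];

  // Store the document's vector alongside its source URL.
  add(text: string, url: string): void {
    this.docs.push({ url, vector: embed(text) });
  }

  // Rank stored documents by similarity to the query, best first.
  search(query: string, topK = 2): { url: string; score: number }[] {
    const q = embed(query);
    return this.docs
      .map((d) => ({ url: d.url, score: cosine(q, d.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}

const index = new ToyVectorIndex();
index.add("LlamaIndex builds vector indexes for retrieval", "https://example.com/index");
index.add("Recipes for sourdough bread and pizza dough", "https://example.com/bread");
const hits = index.search("vector retrieval", 1);
```

Term counts only match exact words, so this toy index misses synonyms that a learned embedding would catch. It is meant to show the add-then-search flow, not to replace a real vector store.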
Query Engine with Tools
Create a query engine that can fetch real-time web data on demand.
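One way to picture the tool side of such a query engine is sketched below. It assumes a hypothetical `ToolCaller` function for reaching CrawlForge and shows only the part that is independent of any agent framework: each tool carries a name and a description the agent reads when deciding what to call, and a dispatcher runs whichever tool was selected. The `WebTool` shape, `makeWebTools`, `dispatch`, and its `verbose` flag are illustrative names, not LlamaIndex or CrawlForge APIs.

```typescript
// Hypothetical signature: however your MCP client invokes a CrawlForge tool.
type ToolCaller = (tool: string, args: Record<string, unknown>) => Promise<string>;

interface WebTool {
  name: string;
  description: string; // the agent reads this to decide when the tool applies
  call: (args: { url: string }) => Promise<string>;
}

// Wrap two of the CrawlForge tools named in this guide as agent-facing tools.
function makeWebTools(callTool: ToolCaller): WebTool[] {
  return [
    {
      name: "extract_content",
      description: "Fetch a URL and return the clean article content.",
      call: (args) => callTool("extract_content", args),
    },
    {
      name: "extract_text",
      description: "Fetch a URL and return the full page text.",
      call: (args) => callTool("extract_text", args),
    },
  ];
}

// Run the tool the agent selected; optionally log the selection.
async function dispatch(
  tools: WebTool[],
  toolName: string,
  args: { url: string },
  verbose = false,
): Promise<string> {
  const tool = tools.find((t) => t.name === toolName);
  if (!tool) throw new Error(`Unknown tool: ${toolName}`);
  if (verbose) console.log(`[tool] ${toolName} ${JSON.stringify(args)}`);
  return tool.call(args);
}
```

In a real agent, the language model produces `toolName` and `args`. Keeping the descriptions short and distinct is what lets it tell the tools apart.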
Agent Tips: The agent will automatically choose which tools to use based on the query. Set verbose=true to see tool selection.
Custom Web Retriever
Build a custom retriever that fetches web data based on queries.
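A retriever of this kind might look roughly like the sketch below. It assumes two injected functions, both hypothetical: `findUrls`, which turns a query into candidate URLs (for example via a search tool), and `callTool`, which fetches page content. Results are returned as scored nodes, similar in spirit to the node-with-score pairs LlamaIndex retrievers produce. The keyword-overlap score is a simple stand-in for a real relevance model.

```typescript
// Hypothetical signatures for the two injected dependencies.
type ToolCaller = (tool: string, args: Record<string, unknown>) => Promise<string>;
type UrlFinder = (query: string) => Promise<string[]>;

interface ScoredNode {
  text: string;
  url: string;
  score: number; // fraction of query terms found in the page, 0 to 1
}

function createWebRetriever(callTool: ToolCaller, findUrls: UrlFinder, topK = 3) {
  return {
    // Find candidate pages, fetch each one, and rank them against the query.
    async retrieve(query: string): Promise<ScoredNode[]> {
      const urls = await findUrls(query);
      const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
      const nodes: ScoredNode[] = [];
      for (const url of urls) {
        const text = await callTool("extract_content", { url }); // argument shape assumed
        const lower = text.toLowerCase();
        const matched = terms.filter((t) => lower.includes(t)).length;
        nodes.push({ text, url, score: terms.length === 0 ? 0 : matched / terms.length });
      }
      // Highest score first, truncated to the requested number of nodes.
      return nodes.sort((a, b) => b.score - a.score).slice(0, topK);
    },
  };
}
```

Because both dependencies are injected, the retriever can be exercised with stubs in tests and pointed at live tools in production without changing its logic.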
Batch Processing with Async
Process multiple URLs efficiently with async batch operations.
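As a rough sketch of batch processing, the code below splits a URL list into fixed-size batches and runs the batches concurrently with `Promise.all`. The `BatchScraper` type is a hypothetical wrapper around a `batch_scrape` call. Its input and output shapes, and the default batch size of 10, are assumptions made for the example.

```typescript
interface ScrapedPage {
  url: string;
  text: string;
}

// Hypothetical wrapper around a batch_scrape call: many URLs in, many pages out.
type BatchScraper = (urls: string[]) => Promise<ScrapedPage[]>;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Issue one batch call per chunk, all in flight at once, and merge the results.
async function scrapeAll(
  batchScrape: BatchScraper,
  urls: string[],
  batchSize = 10,
): Promise<ScrapedPage[]> {
  const batches = chunk(urls, batchSize);
  // Promise.all preserves the order of the input promises.
  const results = await Promise.all(batches.map((batch) => batchScrape(batch)));
  const pages: ScrapedPage[] = [];
  for (const batchPages of results) {
    pages.push(...batchPages);
  }
  return pages;
}
```

`Promise.all` rejects as soon as any batch fails, so a pipeline that should tolerate partial failures would need `Promise.allSettled` or per-batch error handling instead.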
Performance Tip: Use batch_scrape for processing multiple URLs. It is optimized for parallel execution and costs only 1 credit per URL.
Best Practices
- Choose Efficient Tools: Use batch_scrape for multiple URLs, extract_content for clean text
- Implement Caching: Cache indexed documents to avoid redundant fetches and save credits
- Use Async Operations: Leverage async/await for parallel processing to speed up bulk operations
- Monitor Credits: Track credit usage in document metadata and set up alerts in your dashboard
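The caching and credit-monitoring practices above can be combined in one small wrapper, sketched here under stated assumptions. `ToolCaller` is a hypothetical stand-in for your MCP client, and the cost of one credit per uncached call is a placeholder: check the actual pricing of each tool before relying on the counter.

```typescript
// Hypothetical signature: however your MCP client invokes a CrawlForge tool.
type ToolCaller = (tool: string, args: Record<string, unknown>) => Promise<string>;

// Wrap a tool caller with an in-memory cache and a running credit estimate.
function withCache(callTool: ToolCaller, creditsPerCall = 1) {
  const cache = new Map<string, string>();
  let creditsUsed = 0;

  const cachedCall: ToolCaller = async (tool, args) => {
    const key = `${tool}:${JSON.stringify(args)}`;
    const hit = cache.get(key);
    if (hit !== undefined) return hit; // cache hit: no fetch, no credits

    const result = await callTool(tool, args);
    cache.set(key, result);
    creditsUsed += creditsPerCall; // assumed cost; only successful fetches are counted
    return result;
  };

  return { callTool: cachedCall, getCreditsUsed: () => creditsUsed };
}
```

The cache key depends on the serialized arguments, so the same URL passed with differently ordered options would be fetched twice. Two concurrent requests for the same uncached key will also both reach the upstream tool. A production version would likely add expiry and share in-flight requests.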
Ready to build with LlamaIndex?
Explore all 23 CrawlForge tools or check out other integrations.