LlamaIndex Integration
Integrate CrawlForge MCP with LlamaIndex to build data connectors, indexes, and query engines with web scraping capabilities. This setup suits retrieval-augmented generation (RAG) applications and knowledge bases.
Use Cases
Create data connectors that fetch and index web content automatically
Build searchable knowledge bases from web pages and documents
Create query engines with real-time web data retrieval
Extract and process documents from URLs for indexing
Installation
Install LlamaIndex and the CrawlForge MCP adapter.
Web Data Connector
Use CrawlForge as a data connector to fetch and load web documents.
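The connector pattern described here can be sketched as follows. This is a minimal illustration, not the adapter's actual API: `WebDocument`, `WebDataConnector`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real CrawlForge tool call so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class WebDocument:
    """Minimal document record: page text plus metadata about where it came from."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class WebDataConnector:
    """Loads web pages into documents through an injected fetch function.

    `fetch` is a hypothetical stand-in for a CrawlForge tool call;
    it takes a URL and returns the page text.
    """

    def __init__(self, fetch: Callable[[str], str], tool_name: str = "extract_content"):
        self.fetch = fetch
        self.tool_name = tool_name

    def load_data(self, urls: Iterable[str]) -> List[WebDocument]:
        docs = []
        for url in urls:
            text = self.fetch(url).strip()
            if not text:  # skip pages that yielded no usable text
                continue
            docs.append(
                WebDocument(text=text, metadata={"source": url, "tool": self.tool_name})
            )
        return docs


def fake_fetch(url: str) -> str:
    """Offline stub so the sketch runs without network access or credentials."""
    pages = {
        "https://example.com/a": "  Alpha article body.  ",
        "https://example.com/empty": "   ",
    }
    return pages.get(url, "")


connector = WebDataConnector(fake_fetch)
documents = connector.load_data(["https://example.com/a", "https://example.com/empty"])
print(len(documents), documents[0].metadata["source"])
```

In a LlamaIndex project, each record would typically be converted to a `llama_index.core.Document` carrying the same text and metadata.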
Use extract_content for clean article extraction or extract_text for full page text.
Vector Store Index
Create a vector store index from web documents for semantic search.
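To make the idea concrete, here is a toy stand-in for a vector store, written with the standard library only. It uses word-count vectors and cosine similarity, so it matches on shared words rather than meaning; a real setup would rely on an embedding model, for example through LlamaIndex's `VectorStoreIndex.from_documents`. The class and function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (not a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """Stores one vector per document and ranks documents against a query."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, Counter]] = []

    def add(self, source: str, text: str) -> None:
        self._entries.append((source, embed(text)))

    def search(self, query: str, top_k: int = 2) -> List[Tuple[str, float]]:
        q = embed(query)
        scored = [(source, cosine(q, vec)) for source, vec in self._entries]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


index = ToyVectorIndex()
index.add("https://example.com/pricing", "Pricing plans and credit costs for scraping")
index.add("https://example.com/docs", "Installation guide and API reference")
results = index.search("credit costs", top_k=1)
print(results[0][0])
```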
Query Engine with Tools
Create a query engine that can fetch real-time web data on demand.
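The routing idea behind a tool-enabled query engine can be sketched as below. In a real agent an LLM chooses the tool; here a simple keyword rule stands in for that decision so the example runs offline. `ToolQueryEngine`, `index_lookup`, and `live_fetch` are hypothetical names, and `live_fetch` only imitates a CrawlForge call made at query time.

```python
from typing import Callable, Dict, List


class ToolQueryEngine:
    """Routes a query to one of several tools and returns that tool's answer."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], verbose: bool = False):
        self.tools = tools
        self.verbose = verbose
        self.trace: List[str] = []  # names of the tools selected so far

    def select_tool(self, query: str) -> str:
        # Hypothetical rule: queries asking for fresh data go to the live fetcher.
        wants_live = any(
            word in query.lower() for word in ("latest", "today", "current")
        )
        return "live_fetch" if wants_live else "index_lookup"

    def query(self, text: str) -> str:
        name = self.select_tool(text)
        self.trace.append(name)
        if self.verbose:
            print(f"[tool selected] {name}")
        return self.tools[name](text)


def index_lookup(query: str) -> str:
    """Stand-in for answering from an already built index."""
    return f"indexed answer for: {query}"


def live_fetch(query: str) -> str:
    """Stand-in for fetching web data at query time."""
    return f"live web answer for: {query}"


engine = ToolQueryEngine(
    {"index_lookup": index_lookup, "live_fetch": live_fetch}, verbose=True
)
print(engine.query("What is the latest release?"))
print(engine.query("How do I install the adapter?"))
```

Keeping a trace of selected tools makes it easy to check afterwards which queries triggered a live fetch and therefore consumed credits.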
Set verbose=true to see tool selection.
Custom Web Retriever
Build a custom retriever that fetches web data based on queries.
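A query-driven retriever might look roughly like this. The sketch assumes two injected callables, `search` (query to candidate URLs) and `fetch` (URL to text), both hypothetical stand-ins for CrawlForge tools; scoring is plain term overlap, chosen only to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScoredNode:
    source: str
    text: str
    score: float


class WebRetriever:
    """Finds candidate pages for a query, fetches them, and ranks them."""

    def __init__(
        self,
        search: Callable[[str], List[str]],
        fetch: Callable[[str], str],
        top_k: int = 2,
    ):
        self.search = search
        self.fetch = fetch
        self.top_k = top_k

    def retrieve(self, query: str) -> List[ScoredNode]:
        terms = set(query.lower().split())
        nodes = []
        for url in self.search(query):
            text = self.fetch(url)
            words = set(text.lower().split())
            # Fraction of query terms that appear in the page text.
            score = len(terms & words) / len(terms) if terms else 0.0
            nodes.append(ScoredNode(url, text, score))
        nodes.sort(key=lambda node: node.score, reverse=True)
        return nodes[: self.top_k]


# Offline stubs so the sketch runs without network access.
PAGES = {
    "https://example.com/rag": "rag pipelines retrieve context before generation",
    "https://example.com/css": "css grid layout basics",
}

retriever = WebRetriever(search=lambda q: list(PAGES), fetch=PAGES.__getitem__, top_k=1)
hits = retriever.retrieve("rag context")
print(hits[0].source, hits[0].score)
```

To plug such logic into LlamaIndex itself, the usual route is to subclass `BaseRetriever` and return `NodeWithScore` objects from its `_retrieve` method.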
Batch Processing with Async
Process multiple URLs efficiently with async batch operations.
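One way to structure the client side of an async batch is shown below, using `asyncio.gather` with a semaphore to cap concurrency. `fake_scrape` is a hypothetical offline stub for a single scrape call, and the concurrency limit of 3 is an arbitrary illustrative value.

```python
import asyncio
from typing import Dict, List


async def fake_scrape(url: str) -> str:
    """Offline stub for one scrape call; fails for one URL to show error handling."""
    await asyncio.sleep(0.01)  # simulate network latency
    if "bad" in url:
        raise ValueError(f"could not fetch {url}")
    return f"content of {url}"


async def scrape_all(urls: List[str], max_concurrency: int = 3) -> Dict[str, str]:
    """Scrape URLs concurrently, never running more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> str:
        async with semaphore:
            return await fake_scrape(url)

    outcomes = await asyncio.gather(
        *(bounded(url) for url in urls), return_exceptions=True
    )
    # Keep successes only; a failed URL is dropped instead of aborting the batch.
    return {
        url: outcome
        for url, outcome in zip(urls, outcomes)
        if not isinstance(outcome, Exception)
    }


urls = ["https://example.com/1", "https://example.com/bad", "https://example.com/2"]
results = asyncio.run(scrape_all(urls))
print(sorted(results))
```

Passing `return_exceptions=True` lets one failing URL be reported without cancelling the rest of the batch.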
Use batch_scrape for processing multiple URLs: it is optimized for parallel execution and costs only 1 credit per URL.
Best Practices
Choose Efficient Tools
Use batch_scrape for multiple URLs, extract_content for clean text
Implement Caching
Cache indexed documents to avoid redundant fetches and save credits
Use Async Operations
Leverage async/await for parallel processing to speed up bulk operations
Monitor Credits
Track credit usage in document metadata and set up alerts in your dashboard
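The caching and credit-tracking practices can be combined in a small wrapper like the one below. It is a sketch under stated assumptions: `CachedFetcher` is a hypothetical helper, the cache lives in memory only, and the cost of 1 credit per fetch is an illustrative default rather than a confirmed price for every tool.

```python
import hashlib
from typing import Callable, Dict


class CachedFetcher:
    """Wraps a fetch function with an in-memory cache and a running credit count."""

    def __init__(self, fetch: Callable[[str], str], credits_per_fetch: int = 1):
        self._fetch = fetch
        self._cache: Dict[str, str] = {}
        self.credits_per_fetch = credits_per_fetch  # assumed cost, adjust per tool
        self.credits_used = 0

    @staticmethod
    def _key(url: str) -> str:
        # Hash the URL so cache keys have a fixed length.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url: str) -> Dict[str, object]:
        key = self._key(url)
        cached = key in self._cache
        if not cached:
            self._cache[key] = self._fetch(url)
            self.credits_used += self.credits_per_fetch
        return {
            "text": self._cache[key],
            "metadata": {
                "source": url,
                "from_cache": cached,
                "credits_used_total": self.credits_used,
            },
        }


fetcher = CachedFetcher(lambda url: f"page body for {url}")
first = fetcher.get("https://example.com/a")
second = fetcher.get("https://example.com/a")
print(first["metadata"]["from_cache"], second["metadata"]["from_cache"], fetcher.credits_used)
```

Because the credit total travels with each document's metadata, it can be inspected later alongside the indexed content.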