CrawlForge
AI Tool · 5 credits

generate_llms_txt

Crawl a site, analyze its structure, and emit a standards-compliant llms.txt file (and, optionally, llms-full.txt) that defines how AI models should interact with your content. Compliance levels range from permissive to strict (basic, standard, strict).

Use Cases

Ship AI-Ready Documentation

Publish llms.txt alongside your docs so that Claude, ChatGPT, and other AI crawlers find clear guidelines.

AI Compliance Publishing

Use strict compliance to set training-data, caching, and attribution rules in one place.

Bot Policy Generation

Add custom guidelines and restrictions for specific AI user agents on your domain.

Endpoint

POST /api/v1/tools/generate_llms_txt
Auth Required
2 req/s on Free plan
5 credits
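
The 2 req/s limit on the Free plan can be respected client-side by spacing calls at least half a second apart. The following is a minimal sketch, not part of any CrawlForge SDK: the `Throttle` class and its parameters are hypothetical names, and how the server responds when the limit is exceeded is not described here, so the sketch only avoids hitting the limit in the first place.

```python
import time


class Throttle:
    """Hypothetical client-side limiter: spaces calls at least 1/rate seconds apart."""

    def __init__(self, rate_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # None until the first call has been made

    def wait(self):
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            # Re-read the clock in case the sleep overshot.
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval
        return delay
```

Call `throttle.wait()` immediately before each request. The injectable `clock` and `sleep` arguments exist only so the behavior can be checked without real delays.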

Parameters

Heavy operation: This tool may crawl up to 500 pages. It uses the reservation system so credits are held for the duration of the job.
url (string, required)
The website URL to generate llms.txt for.
Example: https://example.com

format (string, optional, default: both)
Output format: "both" | "llms-txt" | "llms-full-txt"
Example: both

complianceLevel (string, optional, default: standard)
Compliance level for generated guidelines: "basic" | "standard" | "strict"
Example: standard

analysisOptions (object, optional, no default)
Website analysis options: maxDepth (1-5), maxPages (10-500), respectRobots, detectAPIs, analyzeContent, checkSecurity.
Example: {"maxDepth": 3, "maxPages": 100, "detectAPIs": true}

outputOptions (object, optional, no default)
Output customization: organizationName, contactEmail, customGuidelines, customRestrictions, includeDetailed, includeAnalysis.
Example: {"organizationName": "Example Inc.", "contactEmail": "ai@example.com"}
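
Because a rejected request still costs a round trip, it can help to check the documented enums and ranges before sending. The sketch below assembles a request body from the parameters listed above. `build_payload` and its argument names are hypothetical helpers, and the checks mirror only the limits stated in this section (maxDepth 1-5, maxPages 10-500); the server may enforce additional rules.

```python
FORMATS = {"both", "llms-txt", "llms-full-txt"}
COMPLIANCE_LEVELS = {"basic", "standard", "strict"}


def build_payload(url, output_format="both", compliance_level="standard",
                  analysis_options=None, output_options=None):
    """Return a request body dict; raise ValueError on values outside the documented limits."""
    if not url:
        raise ValueError("url is required")
    if output_format not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}")
    if compliance_level not in COMPLIANCE_LEVELS:
        raise ValueError(f"complianceLevel must be one of {sorted(COMPLIANCE_LEVELS)}")

    analysis = dict(analysis_options or {})
    max_depth = analysis.get("maxDepth")
    if max_depth is not None and not 1 <= max_depth <= 5:
        raise ValueError("maxDepth must be between 1 and 5")
    max_pages = analysis.get("maxPages")
    if max_pages is not None and not 10 <= max_pages <= 500:
        raise ValueError("maxPages must be between 10 and 500")

    payload = {
        "url": url,
        "format": output_format,
        "complianceLevel": compliance_level,
    }
    # Optional objects are sent only when the caller supplied them.
    if analysis:
        payload["analysisOptions"] = analysis
    if output_options:
        payload["outputOptions"] = dict(output_options)
    return payload
```

The returned dict can be passed directly as the JSON body of the POST request.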

Request Examples

cURL — both formats, standard compliance

terminal (Bash)
curl -X POST https://crawlforge.dev/api/v1/tools/generate_llms_txt \
  -H "X-API-Key: cf_test_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "both",
    "complianceLevel": "standard",
    "outputOptions": {
      "organizationName": "Example Inc.",
      "contactEmail": "ai@example.com"
    }
  }'

TypeScript — strict with custom guidelines

generateLlmsTxt.ts (TypeScript)
import { promises as fs } from 'node:fs'; // needed for fs.writeFile

const response = await fetch('https://crawlforge.dev/api/v1/tools/generate_llms_txt', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.CRAWLFORGE_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://docs.example.com',
    format: 'both',
    complianceLevel: 'strict',
    analysisOptions: {
      maxDepth: 4,
      maxPages: 250,
      detectAPIs: true,
      analyzeContent: true,
    },
    outputOptions: {
      organizationName: 'Example Inc.',
      contactEmail: 'ai@example.com',
      customGuidelines: [
        'AI crawlers must respect robots.txt',
        'Cache responses for up to 24 hours',
      ],
      customRestrictions: [
        'No training on user-submitted content',
      ],
      includeAnalysis: true,
    },
  }),
});

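// Stop on a non-2xx response instead of writing undefined content to disk.
if (!response.ok) {
  throw new Error(`generate_llms_txt failed with HTTP ${response.status}`);
}

const { data } = await response.json();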
await fs.writeFile('public/llms.txt', data.files['llms.txt']);
await fs.writeFile('public/llms-full.txt', data.files['llms-full.txt']);

Python

generate_llms_txt.py (Python)
import os

import requests

response = requests.post(
    'https://crawlforge.dev/api/v1/tools/generate_llms_txt',
    headers={
        'X-API-Key': os.environ['CRAWLFORGE_API_KEY'],
        'Content-Type': 'application/json',
    },
    json={
        'url': 'https://example.com',
        'format': 'llms-txt',
        'complianceLevel': 'basic',
    },
)

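response.raise_for_status()  # stop on a non-2xx response
data = response.json()['data']
# Explicit UTF-8: the generated text may contain non-ASCII characters.
with open('public/llms.txt', 'w', encoding='utf-8') as f:
    f.write(data['files']['llms.txt'])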

Response Example

200 OK · 4.1s
{
  "success": true,
  "data": {
    "url": "https://example.com",
    "hostname": "example.com",
    "compliance_level": "standard",
    "files": {
      "llms.txt": "# llms.txt for Example Inc.\n# Generated by CrawlForge — compliance: standard\n\nUser-Agent: *\nAllow: /\n\nContact: ai@example.com",
      "llms-full.txt": "# llms.txt for Example Inc.\n..."
    }
  },
  "credits_used": 5,
  "credits_remaining": 995,
  "processing_time": 4100
}
Field Descriptions
data.files: Ready-to-publish text content for each file
data.compliance_level: Echoes the level you requested
credits_used: Flat 5 credits per call regardless of pages crawled
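
Since data.files maps each file name to its ready-to-publish content, saving the result comes down to a loop over that object. The sketch below is one way to do it. `save_files` is a hypothetical helper, and it assumes that a failed call reports "success": false; only the success shape is shown in this section.

```python
from pathlib import Path


def save_files(body, out_dir):
    """Write each entry of data.files into out_dir and return the written paths."""
    # Assumption: failures are signalled by a falsy "success" field.
    if not body.get("success"):
        raise RuntimeError("generate_llms_txt did not report success")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for name, content in body["data"]["files"].items():
        # Use only the base name, in case a key ever contains a path.
        target = out / Path(name).name
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

For example, `save_files(response.json(), "public")` writes whichever files the requested format produced, so the same call works for "both", "llms-txt", and "llms-full-txt".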

Credit Cost

5 credits per request
Flat 5 credits no matter how many pages the crawler visits.

Tip: Pair with map_site (2 credits) when you just need the URL inventory before generating guidelines.
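
Because the cost is flat, budgeting is simple integer arithmetic. As a rough sketch using only the prices quoted on this page (5 credits for generate_llms_txt, 2 for map_site), the hypothetical helper below estimates how many runs a credit balance covers.

```python
GENERATE_LLMS_TXT_COST = 5  # flat cost per call, from this page
MAP_SITE_COST = 2           # cost of the optional map_site call, from this page


def runs_affordable(credits, with_map_site=False):
    """Return how many complete runs fit into the given credit balance."""
    per_run = GENERATE_LLMS_TXT_COST
    if with_map_site:
        per_run += MAP_SITE_COST
    return credits // per_run


print(runs_affordable(995))                      # 199
print(runs_affordable(995, with_map_site=True))  # 142
```

With the 995 credits remaining in the response example, that is 199 further generate_llms_txt calls, or 142 runs when each is preceded by map_site.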

Related Tools

map_site: Discover URLs before generating llms.txt (2 credits)
crawl_deep: Deep BFS crawl with content extraction (4 credits)