generate_llms_txt

Crawl a site, analyze its structure, and emit a standards-compliant llms.txt file (and, optionally, llms-full.txt) that defines how AI models should interact with your content. Compliance levels range from permissive ("basic") through "standard" to "strict".

Use Cases

Ship AI-Ready Documentation

Publish llms.txt alongside your docs so Claude, ChatGPT, and other crawlers read clean guidelines.

AI Compliance Publishing

Use strict compliance to set training-data, caching, and attribution rules in one place.

Bot Policy Generation

Add custom guidelines and restrictions for specific AI user agents on your domain.

Endpoint

POST /api/v1/tools/generate_llms_txt

Auth Required

2 req/s on Free plan

5 credits
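
The Free plan limit of 2 requests per second can be respected on the client side. The sketch below is one minimal way to space out calls in Python. The `Throttle` class and its `wait` method are hypothetical helpers, not part of the CrawlForge API, and this page does not say how the server responds when the limit is exceeded.

```python
import time


class Throttle:
    """Hypothetical client-side helper: allows at most `rate` calls per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate   # seconds between consecutive calls
        self._clock = clock              # injectable for testing
        self._sleep = sleep
        self._next_allowed = 0.0         # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following call one interval after this one starts.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


# 2 req/s on the Free plan; call throttle.wait() before each request.
throttle = Throttle(rate=2)
```

Calling `throttle.wait()` immediately before each POST keeps consecutive requests at least half a second apart.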

Parameters

Heavy operation: This tool may crawl up to 500 pages. It uses the reservation system so credits are held for the duration of the job.

Name	Type	Required	Default	Description
url	string	Required	-	The website URL to generate llms.txt for. Example: https://example.com
format	string	Optional	both	Output format: "both" | "llms-txt" | "llms-full-txt". Example: both
complianceLevel	string	Optional	standard	Compliance level for generated guidelines: "basic" | "standard" | "strict". Example: standard
analysisOptions	object	Optional	-	Website analysis options: maxDepth (1-5), maxPages (10-500), respectRobots, detectAPIs, analyzeContent, checkSecurity. Example: {"maxDepth": 3, "maxPages": 100, "detectAPIs": true}
outputOptions	object	Optional	-	Output customization: organizationName, contactEmail, customGuidelines, customRestrictions, includeDetailed, includeAnalysis. Example: {"organizationName": "Example Inc.", "contactEmail": "ai@example.com"}
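
The ranges and allowed values in the table above can be checked locally before any credits are reserved. The following is a minimal sketch, assuming the documented limits are the only constraints. `build_payload` is a hypothetical helper, the http(s) check on `url` is an added assumption, and the server's own validation behavior is not described on this page.

```python
ALLOWED_FORMATS = {"both", "llms-txt", "llms-full-txt"}
ALLOWED_LEVELS = {"basic", "standard", "strict"}


def build_payload(url, output_format="both", compliance_level="standard",
                  analysis_options=None, output_options=None):
    """Hypothetical helper: build the JSON body, raising ValueError on bad input."""
    # Assumption: the API expects an absolute http(s) URL.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if compliance_level not in ALLOWED_LEVELS:
        raise ValueError(f"complianceLevel must be one of {sorted(ALLOWED_LEVELS)}")

    payload = {
        "url": url,
        "format": output_format,
        "complianceLevel": compliance_level,
    }
    if analysis_options:
        depth = analysis_options.get("maxDepth")
        if depth is not None and not 1 <= depth <= 5:
            raise ValueError("maxDepth must be between 1 and 5")
        pages = analysis_options.get("maxPages")
        if pages is not None and not 10 <= pages <= 500:
            raise ValueError("maxPages must be between 10 and 500")
        payload["analysisOptions"] = dict(analysis_options)
    if output_options:
        payload["outputOptions"] = dict(output_options)
    return payload
```

The returned dict can be passed directly as the `json=` argument of `requests.post`.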

Request Examples

cURL — both formats, standard compliance

Terminal (Bash)

curl -X POST https://crawlforge.dev/api/v1/tools/generate_llms_txt \
  -H "X-API-Key: cf_test_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "both",
    "complianceLevel": "standard",
    "outputOptions": {
      "organizationName": "Example Inc.",
      "contactEmail": "ai@example.com"
    }
  }'

TypeScript — strict with custom guidelines

generateLlmsTxt.ts (TypeScript)

import { promises as fs } from 'node:fs';

const response = await fetch('https://crawlforge.dev/api/v1/tools/generate_llms_txt', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.CRAWLFORGE_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://docs.example.com',
    format: 'both',
    complianceLevel: 'strict',
    analysisOptions: {
      maxDepth: 4,
      maxPages: 250,
      detectAPIs: true,
      analyzeContent: true,
    },
    outputOptions: {
      organizationName: 'Example Inc.',
      contactEmail: 'ai@example.com',
      customGuidelines: [
        'AI crawlers must respect robots.txt',
        'Cache responses for up to 24 hours',
      ],
      customRestrictions: [
        'No training on user-submitted content',
      ],
      includeAnalysis: true,
    },
  }),
});

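// Fail fast on non-2xx responses before reading the body
if (!response.ok) {
  throw new Error(`generate_llms_txt failed: HTTP ${response.status}`);
}

const { data } = await response.json();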
await fs.writeFile('public/llms.txt', data.files['llms.txt']);
await fs.writeFile('public/llms-full.txt', data.files['llms-full.txt']);

Python

generate_llms_txt.py (Python)

import os

import requests

response = requests.post(
    'https://crawlforge.dev/api/v1/tools/generate_llms_txt',
    headers={
        'X-API-Key': os.environ['CRAWLFORGE_API_KEY'],
        'Content-Type': 'application/json',
    },
    json={
        'url': 'https://example.com',
        'format': 'llms-txt',
        'complianceLevel': 'basic',
    },
)

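# Raise on HTTP errors instead of failing later on a missing key
response.raise_for_status()
data = response.json()['data']
# The generated text may contain non-ASCII characters, so write UTF-8 explicitly
with open('public/llms.txt', 'w', encoding='utf-8') as f:
    f.write(data['files']['llms.txt'])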

Response Example

200 OK (4.1s)

{
  "success": true,
  "data": {
    "url": "https://example.com",
    "hostname": "example.com",
    "compliance_level": "standard",
    "files": {
      "llms.txt": "# llms.txt for Example Inc.\n# Generated by CrawlForge — compliance: standard\n\nUser-Agent: *\nAllow: /\n\nContact: ai@example.com",
      "llms-full.txt": "# llms.txt for Example Inc.\n..."
    }
  },
  "credits_used": 5,
  "credits_remaining": 995,
  "processing_time": 4100
}

Field Descriptions

data.files: Ready-to-publish text content for each file

data.compliance_level: Echoes the level you requested

credits_used: Flat 5 credits per call regardless of pages crawled
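
The fields described above can be consumed generically. The sketch below writes whatever entries appear in `data.files` to a directory, so it does not depend on which `format` was requested. `save_llms_files` is a hypothetical helper, and the assumption that `data.files` holds only the requested files is not confirmed by this page.

```python
from pathlib import Path


def save_llms_files(body, out_dir):
    """Hypothetical helper: write each data.files entry to out_dir, return names written."""
    if not body.get("success"):
        raise RuntimeError("request did not succeed")

    files = body["data"]["files"]
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    written = []
    for name, content in files.items():
        # Refuse names with path components so nothing is written outside out_dir.
        if Path(name).name != name:
            raise ValueError(f"unexpected file name: {name!r}")
        (target / name).write_text(content, encoding="utf-8")
        written.append(name)
    return written
```

With the response shown earlier, `save_llms_files(response.json(), "public")` would create `public/llms.txt` and `public/llms-full.txt`. Because the cost is a flat 5 credits per call, `body["credits_remaining"] // 5` gives the number of further calls the balance covers.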

Credit Cost