crawlfix.ai

Crawlfix documentation

The short version: point Crawlfix at a URL and it renders the page like a real browser, diffs the rendered DOM against the raw HTML, and tells you what crawlers and AI agents are missing.

What Crawlfix checks

Each scan covers three planes:

  • SEO. Indexability (robots, meta, canonical), metadata drift between raw and rendered, hydration mismatches, schema.org markup, link graph, performance signals (LCP candidate, render-blocking CSS / JS).
  • AEO. Answer-engine readiness for Perplexity, ChatGPT search, Google AI Overviews. Looks at FAQ schema, Article / NewsArticle markup, structured citation surface, llms.txt presence, fact density.
  • GEO. Generative-engine optimization. Verifies that key facts (pricing, product names, headings) survive in the no-JS DOM, since most LLM crawlers do not execute JavaScript.
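The GEO check can be approximated locally: strip tags from the raw (no-JS) HTML and test whether your key facts survive in the remaining text. A minimal sketch, assuming a crude regex tag strip (the real scanner's rules are more thorough, and `key_facts` and the sample HTML are illustrative):

```python
import re

def facts_in_raw_html(raw_html: str, key_facts: list[str]) -> dict[str, bool]:
    """Report which key facts survive in the no-JS DOM (raw HTML text)."""
    text = re.sub(r"<[^>]+>", " ", raw_html)  # crude tag strip
    text = " ".join(text.split()).lower()     # normalize whitespace + case
    return {fact: fact.lower() in text for fact in key_facts}

# A page that renders its pricing with JavaScript leaves the raw HTML empty:
raw = "<html><body><h1>Acme Widgets</h1><div id='app'></div></body></html>"
print(facts_in_raw_html(raw, ["Acme Widgets", "$29/mo"]))
# {'Acme Widgets': True, '$29/mo': False}
```

If the pricing shows up only in the rendered DOM, most LLM crawlers will never see it.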

The scanner also runs a trust & safety pass (Web Risk, URLhaus, sensitive-file probes, security headers) so the report flags issues that would scare an enterprise buyer regardless of SEO.

Running a scan

Three ways:

  1. The marketing site for a free, anonymous one-off.
  2. The dashboard for repeat scans against your own domains.
  3. The run_scan MCP tool from any AI agent.

A typical scan takes 20-40 seconds: fetch the raw HTML, boot headless Chromium, wait for network idle plus 800 ms, snapshot the DOM, run detection rules, generate fix recipes via DigitalOcean GenAI.
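The pipeline above can be sketched as a sequence of injected stages. The stub fetch/render callables and the toy rule are illustrative; the real scanner drives headless Chromium for the render step:

```python
def run_scan(url, fetch_raw, render_dom, rules):
    """Sketch of the scan pipeline with injected steps."""
    raw = fetch_raw(url)        # 1. raw HTML, as a no-JS crawler sees it
    rendered = render_dom(url)  # 2-3. headless browser: network idle + 800 ms, snapshot
    issues = [issue for rule in rules for issue in rule(raw, rendered)]  # 4. rules
    return {"url": url, "issues": issues}

# toy detection rule: flag a <title> that exists only after JS runs
def title_drift(raw, rendered):
    if "<title>" in rendered and "<title>" not in raw:
        yield "title-only-in-rendered-dom"

report = run_scan(
    "https://example.com",
    fetch_raw=lambda u: "<html><body></body></html>",
    render_dom=lambda u: "<html><head><title>Hi</title></head></html>",
    rules=[title_drift],
)
print(report["issues"])  # ['title-only-in-rendered-dom']
```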

Reading a report

The report opens with a risk score (0-100), a label (Low / Medium / High / Critical), and an AI verdict. Below that, issues are grouped by severity and category. Click any issue to see:

  • What the scanner saw, raw vs rendered.
  • Why it matters, in one or two sentences.
  • A framework-specific fix recipe.
  • A copy-paste prompt for your AI coding agent.

MCP setup

Crawlfix ships an MCP server so AI clients (Claude Code, Cursor, Windsurf, ChatGPT Desktop, custom agents) can call the audit toolset directly.

Add to your client config:

{
  "mcpServers": {
    "crawlfix": {
      "url": "https://crawlfix.ai/api/mcp",
      "headers": { "Authorization": "Bearer $CRAWLFIX_MCP_TOKEN" }
    }
  }
}

Get a token from Settings > MCP, or call the login tool first and the device-flow handler will mint one.

Available tools:

login                    Start the device-flow login (no auth)
login_poll               Poll for the resulting MCP token
run_scan                 Trigger a crawl + audit for a domain
get_audit_history        List the caller's past audits
get_issues               Issues for an audit (gated for free tier)
get_fix_prompts          AI-generated fix prompts (paid)
verify_fix               Re-crawl a page to verify a fix
compare_competitor       Side-by-side audit of two URLs (paid)
link_repo                GitHub OAuth read-only flow (paid)
export_report            Signed URL to PDF / JSON / HTML (paid)
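From a custom agent, a tool invocation is a JSON-RPC 2.0 `tools/call` request sent to the MCP endpoint. A sketch of the request body for `run_scan` (the `domain` argument name is an assumption; check the tool schema your client receives):

```python
import json

def mcp_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 tools/call request body for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

body = mcp_tool_call("run_scan", {"domain": "example.com"})
print(body)
```

POST this to https://crawlfix.ai/api/mcp with the same Authorization header shown in the client config.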

REST API

Base URL https://crawlfix.ai/api. Bearer auth using the same MCP token.

POST /scan                       Start a scan
GET  /scan/{scan_id}             Scan status + summary
POST /scan/{scan_id}/share       Generate a public share slug
POST /scan/{scan_id}/verify      Verify on a preview URL
GET  /v1/audits/{websiteId}      Recent audits for a website
GET  /v1/audits/{websiteId}/trend  Score trend over time
GET  /v1/audits/compare?a=&b=    Diff two audits

Send Authorization: Bearer cfix_…. Tokens are scoped to your account; the dashboard is the only place you can mint or revoke them.
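A typical flow is start-then-poll. This sketch takes an injected `http` callable so it can be exercised without the network; the `url` payload field and the `completed` status value are assumptions about the API, not the documented wire format:

```python
import time

def start_and_wait(http, base="https://crawlfix.ai/api",
                   url="https://example.com", poll_every=2.0, timeout=60.0):
    """Start a scan via POST /scan, then poll GET /scan/{scan_id} until done.

    `http` is any callable (method, path, json_body) -> dict.
    """
    scan = http("POST", f"{base}/scan", {"url": url})
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = http("GET", f"{base}/scan/{scan['scan_id']}", None)
        if status.get("status") == "completed":
            return status
        time.sleep(poll_every)
    raise TimeoutError("scan did not finish in time")

# exercise with a fake client that replays three responses
calls = iter([
    {"scan_id": "abc"},
    {"status": "running"},
    {"status": "completed", "risk_score": 12},
])
result = start_and_wait(lambda m, p, j: next(calls), poll_every=0.0)
print(result["risk_score"])  # 12
```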

Fix recipes

Every detected issue ships with a recipe. A recipe has three parts:

  1. Summary. Plain-language explanation of the underlying cause.
  2. Framework-specific code. The exact pattern for your stack: Next.js App Router, Pages Router, Nuxt, SvelteKit, Vue, Angular, Remix, plain React.
  3. Acceptance criteria. A short checklist your fix has to satisfy. verify_fix evaluates against this list.
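In a report payload you can picture a recipe as a small record with those three parts. The field names here are illustrative, not the wire format:

```python
recipe = {
    "summary": "The <title> tag is set client-side, so no-JS crawlers see an empty title.",
    "code": {
        "framework": "nextjs-app-router",
        "snippet": "export const metadata = { title: 'Pricing - Acme' }",
    },
    "acceptance_criteria": [
        "Raw HTML contains a non-empty <title>",
        "Raw and rendered titles match",
    ],
}

def is_complete(r: dict) -> bool:
    """A recipe needs all three parts before verify_fix can evaluate it."""
    return bool(r.get("summary")) and bool(r.get("code")) and bool(r.get("acceptance_criteria"))

print(is_complete(recipe))  # True
```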

Recipes are ours: we curate them by hand and iterate on them from real customer audits. The AI verdict generator can compose existing recipes; it cannot invent new ones. That is intentional: a hallucinated fix recipe in production code is a problem we do not want to ship.

The full library is browsable from the report viewer once you have run a scan. We will publish a standalone index here as it stabilizes.

Scoring

The risk score is a weighted sum of issue severities, normalized to 0-100. Critical issues carry a weight of 25, high 12, medium 5, low 1, info 0. Homepages and pricing pages get a small additional weight on indexability and metadata issues, since those are the URLs that hurt most when they go dark.

Risk labels: 0-20 Low, 21-50 Medium, 51-75 High, 76-100 Critical.
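Under those weights, the score and label can be sketched like this. Capping the weighted sum at 100 is an assumption about the normalization, and the page-type weighting is omitted:

```python
WEIGHTS = {"critical": 25, "high": 12, "medium": 5, "low": 1, "info": 0}

def risk_score(severities: list[str]) -> int:
    """Weighted sum of issue severities, capped at 100 (assumed normalization)."""
    return min(100, sum(WEIGHTS[s] for s in severities))

def risk_label(score: int) -> str:
    if score <= 20:
        return "Low"
    if score <= 50:
        return "Medium"
    if score <= 75:
        return "High"
    return "Critical"

score = risk_score(["critical", "high", "medium", "medium"])  # 25 + 12 + 5 + 5 = 47
print(score, risk_label(score))  # 47 Medium
```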

Limits & abuse

Crawlfix refuses to scan localhost, private IP ranges, or cloud metadata endpoints. We send User-Agent: CrawlFixBot/1.0 (+https://crawlfix.ai) and respect target-site robots.txt. Per-IP rate limits guard the public scan endpoint and the magic-link issuer.
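The private-target refusal can be sketched with the standard library. This is a simplified guard; the real scanner also has to handle redirects and DNS rebinding:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_scannable(url: str) -> bool:
    """Refuse localhost, private ranges, link-local, and cloud metadata IPs."""
    host = urlparse(url).hostname or ""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hosts are refused too
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
        if str(ip) == "169.254.169.254":  # cloud metadata (also link-local)
            return False
    return True

print(is_scannable("http://127.0.0.1:8080/"))  # False
```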

See Terms for the full acceptable-use list.

