Robots.txt for AI crawlers: the complete config checklist (2026)


You open robots.txt and realise it hasn't been touched since 2022. Back then AI crawlers weren't a category. Now there are at least ten user-agent names you need an opinion on, and getting one of them wrong means either six weeks of being invisible to ChatGPT or six weeks of being scraped while you think you blocked the door.

The worst version of this story isn't the typo. It's the 3am Sunday when you discover the typo and realise nobody's been reading the file for two months — long enough that the missed citations are someone else's now.

This is the configuration checklist version of our long-form AI bots in robots.txt article. Same opinions, different format — a 22-item tiered list you can work through, save your progress in your browser, and re-open the next time you audit a site. Items are tagged critical (skipping costs you real traffic), important (skipping is a footgun), or nice to have (skipping is fine).

How to use this checklist

Tick items as you go; progress lives in your browser, no account needed. Click "How to do this" inside an item for the exact steps. Filter by tier if you only want to ship the critical fixes first. When you finish, hit "Share progress" to copy a one-line summary you can paste into your team's Slack ("22/22 done, robots.txt audit live").

If you only have 15 minutes, do every critical item in order. The rest can wait for the quarterly review.

Who this is for

Anyone touching robots.txt on a production site. Specifically:

  • SEO leads doing a quarterly audit of a client site.
  • SaaS founders configuring a new marketing site or migration.
  • Publishers deciding which AI crawlers to allow vs block.
  • Agencies standardising a robots.txt template across many clients.

If you have under a dozen pages and no plan to be cited by ChatGPT, you can probably skip most of this and ship the default User-agent: * group with a sensible Sitemap: line. For everyone else, the twenty-two items below.

One vocabulary note before we start

A user-agent group is one or more User-agent: lines plus every Disallow: and Allow: line beneath them, up to the next group's User-agent: line. Each crawler picks the single most-specific group that names it and follows only that group, wherever that group sits in the file. User-agent: * is the fallback for crawlers that don't have their own group, never a base layer that other groups inherit from. Carry that mental model through the rest of the checklist.
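To make the fallback-not-inheritance point concrete, here is a minimal sketch using Python's standard-library urllib.robotparser. The file contents, the paths, and the ExampleBot name are hypothetical. The standard-library parser is also simpler than Google's matcher, so treat it as an illustration of group selection rather than a model of any specific crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one named group plus the "*" fallback group.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Disallow: /admin/
"""


def can_fetch(agent: str, path: str) -> bool:
    """Parse ROBOTS_TXT and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


# GPTBot has its own group, so only that group applies to it.
print(can_fetch("GPTBot", "/private/report"))  # False
# The "*" rule for /admin/ is NOT inherited by the GPTBot group.
print(can_fetch("GPTBot", "/admin/"))          # True
# A crawler without its own group falls back to "*".
print(can_fetch("ExampleBot", "/admin/"))      # False
```

If you want GPTBot kept out of /admin/ as well, the rule has to be repeated inside the GPTBot group.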

What success looks like

Done right, robots.txt is a file you touch four times a year, sleep easy about the rest of the time, and forget exists. A month from now, when somebody in your buyer's Slack asks ChatGPT which tools to use, your name is in the answer — not part of the cohort that got the file wrong. The 22 items below are the cost of buying that quiet.


Three concept checks before you open the file. Skip these and you'll make confident decisions for the wrong reasons.

Training crawlers fetch your pages to fold into the next model. Blocking them costs nothing today; allowing them buys a non-zero chance of being named in future AI answers.

User-triggered fetchers only fire when a specific user pastes your URL into ChatGPT / Claude / Perplexity. Blocking them breaks a feature your readers might actually use: the user already chose to read your site, and you're just refusing the assistant they're using to read it.
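The two kinds of bot described above, those that fetch pages for model training and those that fire on a user's request, can get separate groups. The sketch below renders one possible policy and checks it with Python's urllib.robotparser. It is an illustration, not a recommendation. The crawler names are assumptions drawn from vendor documentation and may change, and build_policy and may_fetch are hypothetical helpers. Stacking several User-agent: lines over one rule set is standard syntax for sharing a group.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler names; check each vendor's current documentation before use.
TRAINING_BOTS = ["GPTBot", "ClaudeBot"]
USER_FETCHERS = ["ChatGPT-User", "Claude-User", "Perplexity-User"]


def build_policy(blocked: list[str], allowed: list[str]) -> str:
    """Render a blocked group, an allowed group, and the "*" fallback."""
    lines = [f"User-agent: {name}" for name in blocked]
    lines += ["Disallow: /", ""]
    lines += [f"User-agent: {name}" for name in allowed]
    lines += ["Allow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Evaluate one agent/path pair against the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


policy = build_policy(TRAINING_BOTS, USER_FETCHERS)
print(policy)
print(may_fetch(policy, "GPTBot", "/blog/post"))        # False
print(may_fetch(policy, "ChatGPT-User", "/blog/post"))  # True
```

Swapping the two lists, or passing the training bots as allowed, gives the opposite policy without touching the rendering code.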

The syntax mistakes that show up most often in robots.txt code review. Two of these can take a site out of Google.

Six checks between "saved the file" and "pushed to production". Skipping these is how a typo silently breaks a site for six weeks — Google takes 4-8 weeks of recrawl cycles to fully reverse a bad robots.txt.
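One way to automate a check like this is a small smoke test that runs before deploy and fails if the candidate file blocks a page you need crawled. This is a minimal sketch: find_blocked, the candidate file, and the paths are hypothetical. Python's urllib.robotparser does not implement Google's wildcard or longest-match rules, so treat a pass as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(
    robots_txt: str, agents: list[str], paths: list[str]
) -> list[tuple[str, str]]:
    """Return every (agent, path) pair the candidate file would block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [(a, p) for a in agents for p in paths if not parser.can_fetch(a, p)]


# Hypothetical candidate with a blanket Disallow left in the fallback group.
candidate = "User-agent: *\nDisallow: /\n"

blocked = find_blocked(candidate, ["Googlebot"], ["/", "/pricing"])
if blocked:
    print("Do not deploy:", blocked)
```

Run it in CI against the file you are about to ship, with the agents and paths that matter to your site, and fail the build when the returned list is not empty.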


About the Author

Andrei


SEO and digital marketing professional with 13+ years of experience. Started as a website administrator in 2011, transitioned to SEO, and achieved top-3 rankings for competitive keywords. Co-founded a consulting firm specializing in marketing audits for companies in Ukraine and internationally. Built LinkGuard to solve the problem he experienced firsthand: most SEO teams purchase links but never monitor their survival. Based in Kyiv, Ukraine.
