
How to Check If AI Crawlers Can Access Your Site

Your robots.txt file tells AI crawlers what they can read on your site. Here is how it works, which crawlers to know, and how to check your own file in seconds.

01 / What it is

robots.txt and AI crawlers.

02 / How it works

user-agent, disallow, allow.

03 / The crawlers

the AI crawlers to know.

04 / In your dashboard

monitor AI crawler access.

Free tool

check your site against all 21 crawlers.

Paste in your domain and Centium’s free AI Access Tester shows you, crawler by crawler, who can read your site. If one is blocked that should not be, the fix is usually a single line. No signup required.

Crawler access is just the first layer

see where you stand.

Open crawler access gets AI to your door. Centium shows what happens next: whether ChatGPT, Claude, Gemini, Perplexity, and Grok actually recommend you, how you stack up against your competitors, and what you can do about it.

FAQ

questions, answered.

An AI crawler is an automated bot that an AI company sends to read websites. Some collect pages to train models like ChatGPT, Claude, and Gemini. Others visit in real time when someone asks an AI a question, so the model can ground its answer in current information. Each crawler identifies itself with a user-agent name, like GPTBot or PerplexityBot, which is the name you use to allow or block it in robots.txt.

To check which AI crawlers can access your site, open yourdomain.com/robots.txt in a browser and read the rules, or paste your domain into Centium’s free AI Access Tester to see all 21 AI crawlers checked at once. Centium subscribers also get a live crawler check inside the dashboard that refreshes every time they open it.
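The manual check can also be scripted. The sketch below is a minimal illustration using Python's standard-library urllib.robotparser; it is not how Centium's tester works. The function names are hypothetical, the crawler list covers only the names mentioned on this page rather than the full 21, and Python's parser may resolve edge cases differently from a given crawler.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Partial, illustrative list: only the crawler names mentioned on this page.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Googlebot"]


def fetch_robots_txt(domain):
    """Download https://<domain>/robots.txt and return its text.

    Raises urllib.error.HTTPError if the file is missing (for example a 404).
    """
    with urlopen(f"https://{domain}/robots.txt", timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def check_crawlers(robots_txt, path="/"):
    """Map each crawler name to True if the rules let it fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_CRAWLERS}


# Offline demonstration with an inline file; for a live check, pass
# fetch_robots_txt("yourdomain.com") instead of the sample text.
sample = "User-agent: GPTBot\nDisallow: /\n"
for agent, allowed in check_crawlers(sample).items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, only GPTBot is reported as blocked; crawlers that no rule names are treated as allowed.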

To block a single crawler such as GPTBot, add a named block to your robots.txt: a line that reads User-agent: GPTBot, followed by a line that reads Disallow: /. That shuts GPTBot out of the entire site. To block only one section instead, point the Disallow at that path, like Disallow: /members/.
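The two rules described above can be written out and checked with Python's standard-library parser. This is a sketch under assumptions: the /pricing path and the helper name is_allowed are illustrative only, and a real crawler's matching may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Rule 1: shut GPTBot out of the entire site.
BLOCK_SITE = """\
User-agent: GPTBot
Disallow: /
"""

# Rule 2: keep GPTBot out of a single section only.
BLOCK_SECTION = """\
User-agent: GPTBot
Disallow: /members/
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if `user_agent` may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed(BLOCK_SITE, "GPTBot", "/pricing"))            # False
print(is_allowed(BLOCK_SECTION, "GPTBot", "/members/plans"))   # False
print(is_allowed(BLOCK_SECTION, "GPTBot", "/pricing"))         # True
print(is_allowed(BLOCK_SECTION, "ClaudeBot", "/members/plans"))  # True
```

The last line shows that a named block applies only to the crawler it names: the rule for GPTBot leaves every other crawler unaffected.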

robots.txt works on an honor system. The major AI companies, including OpenAI, Anthropic, Google, and Perplexity, read your robots.txt and follow it, so for them it is reliable. It is not a security control. It tells well-behaved crawlers what they may do rather than physically blocking anyone, so it will not stop a bad actor that ignores the rules.

Most brands should allow all AI crawlers. Training crawlers like GPTBot and ClaudeBot teach the next generation of models about you. Live browsing crawlers like ChatGPT-User and PerplexityBot let models cite you in real-time answers. Search indexing crawlers like Googlebot power AI search results. Blocking any of them quietly removes you from the places where buyers now discover brands.

GPTBot is OpenAI’s training crawler: it collects pages to teach future models, and what it reads stays in the model. ChatGPT-User is the live browsing agent: it visits your site the moment a user asks ChatGPT something, and what it reads is used only for that one answer. You can allow or block each one separately in robots.txt.
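As an illustration of handling the two agents separately, the sketch below gives each its own group in robots.txt and checks the result with Python's standard-library parser. The policy shown (no training crawl, live browsing allowed) is a hypothetical example, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow the live browsing agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each group is matched by its own user-agent name.
gptbot_allowed = parser.can_fetch("GPTBot", "/")
chatgpt_user_allowed = parser.can_fetch("ChatGPT-User", "/")

print(f"GPTBot allowed: {gptbot_allowed}")              # GPTBot allowed: False
print(f"ChatGPT-User allowed: {chatgpt_user_allowed}")  # ChatGPT-User allowed: True
```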