The Customize Windows

Technology Journal


By Abhishek Ghosh December 9, 2023 5:57 pm Updated on December 9, 2023

Let’s Block the AI Crawlers Using robots.txt File


AI web crawlers such as GPTBot, CCBot, and Google’s AI bots now crawl our websites and collect data for their own purposes. The question arises: should we block these AI bots in our robots.txt file to protect our content? The short answer is yes. If you examine the list at originality.ai/ai-bot-blocking, you will see that a lot of websites are already blocking them.

As we know, robots.txt is a plain text file that gives instructions to web crawlers (also called robots) about which pages or files they should or should not crawl. The file is placed in the root directory of a website, for example thecustomizewindows.com/robots.txt. Until now, we did not need to do much with this file, since our intention was to get indexed by the search engines.
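To make this concrete, here is a minimal sketch of how a well-behaved crawler might read such a file, using Python’s standard `urllib.robotparser` module. The sample rules, the `is_allowed` helper and the example.com URL are assumptions made for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-article/"  # placeholder URL
    for agent in ("GPTBot", "Googlebot"):
        print(agent, "allowed:", is_allowed(SAMPLE_ROBOTS_TXT, agent, page))
```

With these sample rules, GPTBot is refused everywhere, while a crawler that has no matching entry (Googlebot here) is still allowed. This only models a crawler that chooses to obey the file.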

 

Why Block the AI Crawlers Scraping Our Content?

 

Because they are scraping our content and we are not getting permanent dofollow backlinks in return. Essentially, your article gets spun (they call it AI) and possibly turned into SEO-optimized content. If you spend some time with Google Bard or any similar tool, you’ll realize that:


  • Part of your content cited somewhere else (such as StackOverflow or Reddit) became Bard’s content without any mention of your site. You can do nothing about it, because the wording has been changed by modern text spinning (call it Generative AI if you like).
  • Your code or snippet gets reproduced.

When you scrape 10 great webpages and spin them, the resulting content is obviously good and may outperform the original websites on the SERP; or the scraper simply gains engagement or makes money. Most importantly, some of these tools remove almost any chance of the source website receiving traffic, although Bing usually links to its information sources. None of these uses does you any good. Indeed, your SERP position can fall (maybe it is already falling).

Lets Block the AI Crawlers Using robots-txt File


Image credit: seosandwitch.com

 

Why Not to Block All the AI Crawlers?

 

If you block Google, Bing, or any similar search engine, they may indirectly penalize you in the future in one way or another. Yoast raised this point on its website, and it is logically sound. They can tweak their algorithms and use AI bots to show meta descriptions or summaries of your article.

 

What robots.txt Do We Suggest?

 

We suggest that you block certain bots, including those of OpenAI (ChatGPT-User, GPTBot), Common Crawl (CCBot), Anthropic (anthropic-ai), Omgili (Omgilibot, Omgili), and Facebook (FacebookBot, which they use for their speech recognition). This is the robots.txt (it is part of the file):

User-agent: CCBot
Disallow: /
 
User-agent: ChatGPT-User
Disallow: /
 
User-agent: GPTBot
Disallow: /
 
User-agent: anthropic-ai
Disallow: /
 
User-agent: Omgilibot
Disallow: /
 
User-agent: Omgili
Disallow: /
 
User-agent: FacebookBot
Disallow: /

Blocking them should not harm our site’s SEO. Please note that we have not blocked Google’s or Bing’s AI tools. You can add this directive to block Google Bard and similar tools:

User-agent: Google-Extended
Disallow: /
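If the list of bots keeps growing, it may be easier to generate these rules than to type them by hand. The following is only a sketch: the `build_rules` helper and the `AI_AGENTS` list are hypothetical names, and the list simply repeats the user agents mentioned in this article.

```python
# User agents named in this article; extend the list as new bots appear.
AI_AGENTS = [
    "CCBot",
    "ChatGPT-User",
    "GPTBot",
    "anthropic-ai",
    "Omgilibot",
    "Omgili",
    "FacebookBot",
]


def build_rules(agents, include_google_extended=False):
    """Return robots.txt text that disallows the whole site for each agent."""
    names = list(agents)
    if include_google_extended:
        # Optional: also opt out of Google's AI tools, as described above.
        names.append("Google-Extended")
    blocks = [f"User-agent: {name}\nDisallow: /" for name in names]
    # Blank lines between blocks keep each rule group separate.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(build_rules(AI_AGENTS, include_google_extended=True))
```

The output can be appended to an existing robots.txt; it does not replace the rules you already have for search engines.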

 

Will This Work?

 

God knows! robots.txt merely requests that bots not crawl; it cannot force them. There are other ways to block them, such as blocking their IP ranges at the server (or serving them confusing stuff):

https://openai.com/gptbot-ranges.txt

You can use .htaccess rules or a WAF that supports blocking these bots (such as Cloudflare, Sucuri, and so on). There is also an upcoming method named glazing (for images). Of course, we need some fail2ban filter for these odd scrapers.
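As a rough illustration of server-side blocking, the sketch below checks a request against a list of IP ranges and a list of user-agent names, using only Python’s standard `ipaddress` module. It assumes the published ranges arrive as plain text with one CIDR range per line, which may not match the real file format. The ranges shown are reserved documentation addresses used as placeholders, and all function names are hypothetical.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real crawler ranges.
SAMPLE_RANGES = """\
192.0.2.0/24
198.51.100.0/28
"""

# User agents named in this article.
AI_USER_AGENTS = ("CCBot", "ChatGPT-User", "GPTBot", "anthropic-ai",
                  "Omgilibot", "Omgili", "FacebookBot")


def parse_ranges(text):
    """Parse one CIDR range per line, skipping blank lines and # comments."""
    networks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def is_blocked_ip(ip, networks):
    """Return True if the address falls inside any listed network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


def is_ai_user_agent(header):
    """Case-insensitive substring match against the known bot names."""
    lowered = header.lower()
    return any(name.lower() in lowered for name in AI_USER_AGENTS)


if __name__ == "__main__":
    networks = parse_ranges(SAMPLE_RANGES)
    print(is_blocked_ip("192.0.2.55", networks))
    print(is_ai_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

A user-agent header is easy to fake, so the IP check is the stronger of the two signals; the same logic could sit behind a WAF rule or feed a fail2ban filter.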

Governments need to enforce rules.


Abhishek Ghosh

About Abhishek Ghosh

Abhishek Ghosh is a Businessman, Surgeon, Author and Blogger. You can keep touch with him on Twitter - @AbhishekCTRL.

Copyright © 2026 - The Customize Windows | dESIGNed by The Customize Windows
