We discussed robots.txt in a separate article. This time, we will look at the optimum settings of the robots.txt file for your WordPress website.
robots.txt is nothing but a simple text file, located in the root directory of your site, that contains a few instructions for search robots (as we said before). It can be accessed by adding /robots.txt after the URL of any WordPress-based website. For example, for WordPress's own official website, it is: http://wordpress.org/robots.txt.
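As a rough illustration of that address rule, the short Python sketch below derives the robots.txt location from any page URL on a site. The helper names robots_url and fetch_robots are made up for this example, and fetch_robots assumes the site is reachable over the network.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    # The leading slash makes urljoin replace the whole path, so the
    # result always points at the root of the host.
    return urljoin(page_url, "/robots.txt")


def fetch_robots(page_url: str) -> str:
    """Download the robots.txt text (hypothetical helper, needs network access)."""
    with urlopen(robots_url(page_url), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(robots_url("http://wordpress.org/about/"))
```

Only robots_url is pure; fetch_robots is included to show how the file could be read once the address is known.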
There are a few reasons why we need to modify this file.

Note that, by default, there may be no actual robots.txt in the root of your server. WordPress itself generates a virtual one from its own settings (the option to allow search engines to crawl the site: do you remember?). That generated file generally allows Google and other search engine bots to crawl the whole site and gives the location of the sitemap file.
---
You can create your own file from Google Webmaster Tools (Crawler access). However, we will not use it. Just use any simple text editor (do not use WordPad or MS Word; they will insert unwanted extra characters).
So, our workflow will be:
- Creating a custom robots.txt file
- Uploading it to the root of your server using either of our tutorials, any FTP software, or online FTP.
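The two steps above can be sketched in Python as follows. This is only a minimal sketch, not the method used on this site: the function names are invented for the example, the host, user name, and password are placeholders you would supply, and remote_root is an assumed web root that differs between hosts.

```python
from ftplib import FTP
from pathlib import Path


def build_robots_txt(sitemap_url, disallowed_paths):
    """Return robots.txt content as plain text, one directive per line."""
    lines = [f"Sitemap: {sitemap_url}", "User-agent: *"]
    lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    return "\n".join(lines) + "\n"


def save_robots_txt(content, folder="."):
    """Write the content to robots.txt in the given folder and return its path."""
    target = Path(folder) / "robots.txt"
    target.write_text(content, encoding="utf-8")
    return target


def upload_robots_txt(local_path, host, user, password, remote_root="/"):
    """Upload robots.txt over FTP; remote_root is an assumption about the host."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(remote_root)
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR robots.txt", handle)
```

Writing the file from a script has the same benefit as using a simple text editor: the result is plain text with no hidden formatting characters.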
Why do we need to modify the robots.txt file?
- Tell me, do you want all the files in your wp-content or wp-admin folder to be indexed? Why should you allow the search bots to crawl those folders?
- Technically, the crawling will be smooth and well directed. That is, Google and other search engine bots read which folders are marked as not to be crawled and spend their time on your useful posts (which is what you want, right?).
- In effect, you are adding rel="nofollow" to every folder you disallow, and this is more explicit and more robust than the simple, fragile rel="nofollow". It can logically keep the comments from getting indexed, so Disallow: /comments/ can be used to remove the duplicate content problem that shows up in Google Webmaster Tools (HTML suggestions), a typical WordPress problem. This type of duplicate content has very little effect on SERP or SEO, but still, we like clean things.
- As a corollary of the above logic, your spammer friends' hard efforts to get indexed through your site are ruled out.
- A second probable corollary is that you are providing relatively static text, which Google loves very much: you still get comments, and the Google bots are happy too.
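To see how Disallow lines steer a well-behaved bot, here is a small Python sketch using the standard library's urllib.robotparser. The rules, the example.com URLs, and the helper name can_crawl are assumptions made for this illustration only.

```python
from urllib.robotparser import RobotFileParser

# Sample rules for illustration; they are not this site's real file.
SAMPLE_RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /comments/
"""


def can_crawl(rules_text, url, agent="Googlebot"):
    """Return True if the rules let the given bot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/my-first-post/",
        "https://example.com/wp-admin/options.php",
        "https://example.com/comments/feed/",
    ):
        print(url, can_crawl(SAMPLE_RULES, url))
```

The post URL stays crawlable while the admin, content, and comment paths are refused, which is the division of crawl time described in the list above.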
So, you need to understand the basics in order to modify the file and apply it according to your website's needs. We are using this robots.txt instruction set:
Sitemap: https://thecustomizewindows.com/sitemap.xml.gz  # This is for Yahoo!
Sitemap: https://thecustomizewindows.com/sitemap.xml  # This is for other search engines, including Google
User-agent: *  # This applies to all bots; the exceptions are added below
Disallow: /wp-admin/  # Bots will not crawl this folder
Disallow: /wp-includes/  # Bots will not crawl this folder
Disallow: /feed/  # Bots will not crawl the feeds
Disallow: /trackback/  # Bots will not crawl trackbacks
Disallow: /cgi-bin/  # Bots will not crawl this folder
Disallow: /*.php$  # Bots will not crawl any PHP files
Disallow: /*.js$  # Bots will not crawl JavaScript
Disallow: /*.cgi$  # Bots will not crawl CGI files
Disallow: /*.xhtml$  # Bots will not crawl any XHTML document
Disallow: /*.php*  # We have ensured that the "php?p=123" format is not crawled either
Disallow: */trackback*  # As above
Disallow: /*?*  # As above
Disallow: /z/
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.txt$
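The wildcard rules above rely on two special characters that major search engines such as Google document: * stands for any run of characters and a trailing $ marks the end of the URL. The Python sketch below shows one way to read such patterns. It is a simplified assumption, not any search engine's actual matcher: it ignores Allow lines and rule precedence, and the function names are invented for the example.

```python
import re

# A few of the patterns from the instruction set above.
DISALLOW_PATTERNS = ["/wp-admin/", "/*.php$", "/*.php*", "*/trackback*", "/*?*"]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns=DISALLOW_PATTERNS):
    """Return True if any pattern matches the path starting at its first character."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ("/my-first-post/", "/index.php?p=123", "/my-first-post/trackback/"):
        print(path, is_disallowed(path))
```

Checking a handful of your own URLs this way before uploading can catch a pattern that blocks more than you intended.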
Save the file as robots.txt and upload it to your server; that's it. Newbies can use our robots file, just change the domain name to yours. Otherwise, Google will think you are cheating by providing the URL of our sitemap instead of your own.
You can add additional rules like Disallow: /comments/, Disallow: /tags/, etc. as needed. You can also read robotstxt.org to learn about disallowing harmful search bots.
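Shutting out one particular bot is done with its own User-agent group. The sketch below checks such a group with Python's standard urllib.robotparser; "BadBot" is a hypothetical bot name and the example.com URLs are placeholders, so treat it as an illustration only.

```python
from urllib.robotparser import RobotFileParser

# "BadBot" is a made-up name standing in for a bot you want to shut out.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /comments/
Disallow: /tags/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

if __name__ == "__main__":
    print(parser.can_fetch("BadBot", "https://example.com/my-first-post/"))
    print(parser.can_fetch("Googlebot", "https://example.com/my-first-post/"))
    print(parser.can_fetch("Googlebot", "https://example.com/tags/seo/"))
```

Keep in mind that robots.txt is only a request: a bot that ignores the file will not be stopped by these lines.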

Thanks Abhishek da, for this great info. Actually, I am facing a lot of duplicate content issues and some unwanted 302 redirections. I have done all the necessary SEO, but they keep coming. I think this will help me a lot now. Thanks for sharing.
Have a great day.
Manas Kabiraj.
Thanks and welcome, Manas. I have no duplicate issues now. It seems that "nofollow" has little value. I tried using Platinum SEO to nofollow: it simply does not work.
You have to monitor it every day. You'll feel like you are playing a game: every day Google will throw up a new duplicate content issue. Disallow it (I use online FTP for easy editing) and it will disappear the next day. After a month or two, there will be no duplicate content.
(Edited to make the comment shorter)
But what about the 302 redirection? I have edited my robots.txt file and disallowed comments and comments/feed; I will see what Google does with this. But please suggest something about removing the 302 redirection. I have checked the .htaccess file too, and it is clean so far. What is causing this redirection?
A 302 error in WordPress is generally caused by the host. It is a problem that arises from the Apache server configuration. Check whether your cPanel settings are OK. If you are using GoDaddy, it is probable that there is no way to solve it other than changing the host. Generally the .htaccess file is not at fault. Ask in your hosting company's forum whether there is any way to solve it.
You can ask Mr. Sajal Kayan on his blog (sajalkayan.com); he might pinpoint it more precisely.
Sorry that I missed your 302 error in the earlier comment. Please report back to me on what happens next.
What should I add to robots.txt to allow only the homepage, tags, and posts to appear in search results and disallow everything else, including categories? Each tag with a meta description is more important to me than a category because of the keywords. Also, would creating this robots.txt file be a good idea or not?