When it comes to the SEO of your website, robots.txt plays a major role. It tells search engine crawlers which parts of your website they should not crawl.
In short, robots.txt informs search engines which pages they may crawl and which they should leave alone. In this post, we’ll show you how to create a perfect robots.txt for your website.
Used properly, robots.txt supports the SEO of your website. It keeps crawlers away from parts of your site such as the admin panel, login page, forgot-password page, category pages, search results, and tag pages. You can restrict crawlers from any page that you don’t want them to crawl.
If you are not comfortable creating a robots.txt file, we are here to help you create one and place it in the root directory of your website.
What is a crawler?
A crawler is a search engine program that automatically visits websites and reads their pages, content, images, and videos in order to index them. Before crawling, the crawler looks for a robots.txt file at the root of your domain and follows the rules it finds there.
Please note that each subdomain needs its own separate robots.txt file.
Create a perfect robots.txt file for your website:
First of all, we will follow the standard guidelines for creating a robots.txt file.
Things you should understand:
The basic structure of a robots.txt file:
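As a minimal sketch of that structure, a robots.txt file is made of groups: a User-agent line followed by the rules for that crawler. The paths below are placeholders chosen for illustration, not paths from any real site.

```text
# Which crawler the rules below apply to (* = every crawler)
User-agent: *
# Hypothetical path that crawlers should not visit
Disallow: /private/
# Hypothetical exception inside the blocked path
Allow: /private/public-page.html
```

Lines starting with # are comments and are ignored by crawlers.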
In the robots.txt file, you can give separate instructions to each search engine bot. Every group of rules starts with a User-agent line that names the crawler the rules apply to. If you want to address one specific search engine bot, use its name, for example Googlebot, instead of *
Example: User-agent: Googlebot (It means the rules that follow apply only to Google’s crawler, not to other search engine crawlers)
If you want to address the Bing crawler, use Bingbot instead of *
Note: Here * means the rules apply to all search engine crawlers.
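For instance, a file with separate groups for Google, Bing, and everyone else might look like the sketch below. The blocked paths are made-up placeholders.

```text
# Rules only for Google's crawler
User-agent: Googlebot
Disallow: /example-google-only/

# Rules only for Bing's crawler
User-agent: Bingbot
Disallow: /example-bing-only/

# Rules for every other crawler
User-agent: *
Disallow: /example-everyone-else/
```

A crawler generally follows only the group that matches its name most specifically, so in this sketch Googlebot would ignore the * group.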
If you don’t want a search engine crawler to visit and crawl a specific page, use Disallow. It is easy to forget to hide your admin section or another private page from crawlers. Your admin section may then show up in search results, where anyone can find the login page and try to log in to your site.
For WordPress users, this means disallowing the admin area so that search engine crawlers do not crawl your admin or login pages.
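A sketch for a WordPress site, assuming the default /wp-admin/ location for the admin area:

```text
User-agent: *
# Keep crawlers out of the WordPress admin area (default path assumed)
Disallow: /wp-admin/
# Commonly left crawlable because some themes and plugins load content through it
Allow: /wp-admin/admin-ajax.php
```

If your site uses a custom admin or login path, replace the paths above with your own.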
How to use Disallow properly in a robots.txt file:
If you don’t want to give crawlers permission to visit any of your pages:
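The rule for that case is a single slash, which matches every path on the site. A minimal sketch:

```text
User-agent: *
# "/" matches every URL, so nothing on the site may be crawled
Disallow: /
```

Use this with care, because it asks every crawler to skip the whole site. A Disallow: line with nothing after it does the opposite and allows everything.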
When it comes to SEO, follow this part very carefully. What does a search engine crawler think about duplicate content?
Most sites use tags and categories to structure their content, and you should too.
However, tag and category archives can cause a major issue when search engine crawlers see them as duplicate content. Whenever you add a post to several categories or attach tags to it, the same content appears on several archive pages, and the crawler may treat those pages as duplicates.
In that case, you need to stop the crawler from crawling your category and tag pages.
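Assuming the default WordPress permalink bases /category/ and /tag/ (your site may use different ones), the rules could look like this:

```text
User-agent: *
# Category archive pages (default WordPress base assumed)
Disallow: /category/
# Tag archive pages (default WordPress base assumed)
Disallow: /tag/
```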
A question may come to mind: what happens if I don’t add any category to my blog?
Note: If you don’t assign a category, your blog platform may file all of your posts under Uncategorized.
The next part is the author archive. If your website has multiple authors and you don’t want search engine crawlers to crawl the author pages, you can use:
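A sketch, assuming author archives live under the default WordPress /author/ path:

```text
User-agent: *
# Author archive pages (default WordPress /author/ base assumed)
Disallow: /author/
```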
If you don’t want crawlers to crawl comments, you can use:
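One possible sketch, assuming WordPress-style comment URLs. Both patterns are assumptions about how your site exposes comments, so check your own URLs first.

```text
User-agent: *
# Comment feed (WordPress default path assumed)
Disallow: /comments/feed/
# Reply-to-comment links; * matches any characters before the query string
Disallow: /*?replytocom=
```

The * wildcard is understood by the major search engines, but smaller crawlers may not support it.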
Another major point is that search engine crawlers can treat search result pages as duplicate content. You will usually find the search bar at the top of your page. When a user searches for something, the site generates a results page that repeats existing content, and the crawler may take it as duplicate content.
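Assuming WordPress-style search URLs, where a search looks like /?s=keyword, a sketch of the rule would be:

```text
User-agent: *
# Internal search results such as /?s=keyword (WordPress default assumed)
Disallow: /?s=
# Pretty search URLs such as /search/keyword/ (only if your site uses them)
Disallow: /search/
```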
Suppose you don’t want crawlers to visit a specific URL, such as your thank-you page. Copy the path of that URL (the part after your domain name) and paste it into a Disallow line.
You can also write:
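Two ways to write such a rule are sketched below. The /thank-you/ path is a hypothetical example, so replace it with the path of your own page. Disallow takes the path part of the URL, not the full address with the domain.

```text
User-agent: *
# Block the thank-you page and everything under it (hypothetical path)
Disallow: /thank-you/
# Or block only that exact URL ($ marks the end of the URL)
Disallow: /thank-you/$
```

You only need one of the two Disallow lines. The $ form is understood by the major search engines.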
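To keep a page out of search results, you also need to know about 2 more directives: noindex and nofollow
Add noindex to stop search engines from indexing a page. It means your thank-you page won’t show up in the SERP (search engine results page). Note that Google no longer supports noindex inside robots.txt, so add it as a robots meta tag, for example <meta name="robots" content="noindex">, in the head of the page, and keep that page crawlable so the crawler can read the tag.
You don’t put nofollow in your robots.txt file either. It works a little differently: you add it to the head section of the page as a robots meta tag.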
Make sure you don’t add any wrong information to your robots.txt file. The final robots.txt file you can use:
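Putting the sections discussed above together, a complete file for a WordPress blog might look like the sketch below. Every path is an assumption based on WordPress defaults plus a hypothetical /thank-you/ page, so check each line against your own URLs before using it.

```text
User-agent: *
# Admin area (default WordPress path assumed)
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
# Category, tag, and author archives (default bases assumed)
Disallow: /category/
Disallow: /tag/
Disallow: /author/
# Comment feed and internal search results (WordPress defaults assumed)
Disallow: /comments/feed/
Disallow: /?s=
# Hypothetical thank-you page
Disallow: /thank-you/
```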
How to add robots.txt to your website?
There are many ways to add a robots.txt file to your website. To upload it yourself, create a plain text document, name it “robots.txt”, and upload it to the root folder of your website.
You can also create a file named “robots.txt” in the root folder through cPanel or your hosting account’s file manager and paste the above text into it.
For WordPress users:
If you are using WordPress, install an SEO plugin and add your robots.txt through the plugin.
Let me show you how to change the setting in an SEO plugin. I have installed the Yoast SEO plugin on my WordPress site.
Step 1: Go to Yoast SEO and click on Tools.
Step 2: In Tools, click on File Editor.
Step 3: Click on the robots.txt file editor.
Step 4: Copy the text of your robots.txt file, paste it into the text editor, and save it.
Congrats, you have added robots.txt to your website.
If you are not comfortable writing a proper robots.txt for your site, or you are confused at some point, an SEO plugin lets you turn off the same sections through its settings.
Let me show you how to turn off these settings:
Step 1: Go to Yoast SEO and click on Search Appearance.
Step 2: Click on Taxonomies.
You can disable your Categories, Tags, and Post Formats in the Taxonomies section. Disable the sections one by one.
Step 3: Go to the Archives section.
Here you can disable author archives and date archives. I have explained above how author archives affect the site.
If you are using the Blogger platform, you need to turn on a few settings and disable indexing for some sections.
Step 1: Go to Blogger Settings and click on Search preferences.
Step 2: Click on “Custom robots header tags” and enable it, then turn on the settings you need.
Step 3: Click on Custom robots.txt.
Step 4: Add your robots.txt code to the text editor and save it.
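You have completed all the settings and your robots.txt is active. Search engine crawlers pick up the file the next time they fetch it. Google, for example, generally refreshes its cached copy of robots.txt within about 24 hours, although changes in search results can take longer to appear.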
I hope you have enjoyed this post and that it helps you create a perfect robots.txt file for your website.
Creating a robots.txt file does not take much time. It’s a one-time setup that you only modify when needed. By keeping crawlers focused on the pages that matter, it can support your position in the search engine results page.