How to Configure Robots.txt and Meta Robots for Magento 2
If you are reading this post, you are probably a Magento 2 store owner or the person responsible for the store's SEO. Either way, you should know why the robots.txt file matters. It is a text file that tells web robots, i.e., search engine crawlers, which pages to crawl and which to skip.
In the words of Google,
“A robots.txt file tells search engine crawlers which pages or files the crawler can or can’t request from your site.”
This tiny file is a part of EVERY website, and Magento 2 stores are no exception. Fortunately, default Magento 2 lets you create a robots.txt file from the admin panel, and I'll show how to do so here.
Meta robots tags are a way for webmasters to give search engines page-level instructions about their stores. A meta robots tag is a piece of code in the <head> section of a webpage that tells search engines whether to index the page and whether to follow the links on it.
Create a Magento 2 robots.txt file to assist your SEO efforts because:
- It controls how search engine spiders see and interact with the pages of your Magento 2 store.
- Used improperly, it can hurt your store's rankings.
- Crawlers consult robots.txt before they crawl the store, so it sets the ground rules for everything that follows.
- It helps keep duplicate content pages out of SERPs.
- It helps to avoid overloading your store with requests from Google's crawler.
Now that we've covered robots.txt and its importance for a Magento 2 store, let's create one!
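Steps to create Magento 2 robots.txt:
- Log in to the Admin Panel.
- Navigate to Content > Design > Configuration.
- Click Edit for the website or store view you want to configure.
- Expand the Search Engine Robots section.
- Choose a Default Robots value and enter your rules in the custom instruction field for the robots.txt file.
- Save the configuration.
The robots exclusion protocol (or robots exclusion standard), better known as robots.txt, can be configured in default Magento 2 with the steps above.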
Why Should I Set NoIndex and NoFollow Tags in Magento 2:
For example, say you are launching a new product in your Magento 2 store, but your team is still working on it. For the time being, you can set that product page to NoIndex to tell search engines not to index it.
This way, you can test the changes in the live store while keeping search engines from indexing the page.
The NoFollow tag is also useful on links when you want to point readers to additional information at a particular web address without passing link equity to it.
Meta robots tags: NOINDEX, NOFOLLOW
Now that you have created robots.txt, pay attention to the meta robots tags. Use these tags to hide unnecessary pages from crawlers.
- NoIndex hides the page from indexation.
- NoFollow stops the page from passing link weight to an uncertified source. It can also be used for pages with a large number of external links.
To keep crawlers away from a page, add a Disallow rule to the robots.txt file; to apply NoIndex or NoFollow, use the meta name="robots" tag.
Possible Combinations:
- INDEX, FOLLOW: Instructs web crawlers to index the store's pages and follow the links on them.
- NOINDEX, FOLLOW: Instructs web crawlers not to index the store's pages but still follow the links on them.
- INDEX, NOFOLLOW: Instructs web crawlers to index the store's pages but not follow the links on them.
- NOINDEX, NOFOLLOW: Instructs web crawlers neither to index the store's pages nor to follow the links on them.
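To make the four combinations concrete, here is a minimal Python sketch that splits a meta robots value into an "index" flag and a "follow" flag. The function name `parse_robots_directive` is hypothetical, and the sketch assumes only the tokens listed above plus `none`, which search engines treat as shorthand for NOINDEX, NOFOLLOW.

```python
def parse_robots_directive(content: str) -> dict:
    """Split a meta robots value such as 'NOINDEX, FOLLOW' into two flags."""
    tokens = {token.strip().lower() for token in content.split(",")}
    return {
        # The page is indexable unless 'noindex' (or 'none') is present.
        "index": not ({"noindex", "none"} & tokens),
        # Links are followed unless 'nofollow' (or 'none') is present.
        "follow": not ({"nofollow", "none"} & tokens),
    }


for value in ("INDEX, FOLLOW", "NOINDEX, FOLLOW",
              "INDEX, NOFOLLOW", "NOINDEX, NOFOLLOW"):
    print(value, "->", parse_robots_directive(value))
```

A helper like this can be handy in a quick audit script that lists which pages a crawler is allowed to index.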
Add the following code to the robots.txt file in order to hide specific pages:
    User-agent: *
    Disallow: /myfile.html
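If you want to confirm what a rule like this blocks before saving it, Python's standard `urllib.robotparser` module can evaluate it locally. This is only a sketch: `example.com` and `other.html` are placeholder values, not part of any real store.

```python
from urllib.robotparser import RobotFileParser

# The same two-line rule set as above.
rules = """\
User-agent: *
Disallow: /myfile.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch() answers: may this user agent request this URL?
myfile_allowed = parser.can_fetch("*", "https://example.com/myfile.html")
other_allowed = parser.can_fetch("*", "https://example.com/other.html")

print("myfile.html allowed:", myfile_allowed)
print("other.html allowed:", other_allowed)
```

Only the disallowed path is reported as blocked; every other URL stays crawlable.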
Alternatively, you can restrict indexation with the following code:
    <html>
    <head>
    <meta name="robots" content="noindex, follow"/>
    <title>page title</title>
    </head>
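To check that a page really carries the tag, you can parse its HTML and read the meta robots value. The sketch below uses Python's built-in `html.parser`; the class name `MetaRobotsParser` and the inline sample page are assumptions for illustration, and in practice you would feed it the HTML of a real store page.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.extend(
                part.strip().lower() for part in content.split(",")
            )


# Sample page for illustration only.
page = """<html><head>
<meta name="robots" content="noindex, follow"/>
<title>page title</title>
</head><body></body></html>"""

finder = MetaRobotsParser()
finder.feed(page)
print(finder.directives)
```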
The recommended default settings for Magento 2:
    User-agent: *

    # Directories
    Disallow: /app/
    Disallow: /bin/
    Disallow: /dev/
    Disallow: /lib/
    Disallow: /phpserver/
    Disallow: /pkginfo/
    Disallow: /report/
    Disallow: /setup/
    Disallow: /update/
    Disallow: /var/
    Disallow: /vendor/

    # Paths (clean URLs)
    Disallow: /index.php/
    Disallow: /catalog/product_compare/
    Disallow: /catalog/category/view/
    Disallow: /catalog/product/view/
    Disallow: /catalogsearch/
    Disallow: /checkout/
    Disallow: /control/
    Disallow: /contacts/
    Disallow: /customer/
    Disallow: /customize/
    Disallow: /newsletter/
    Disallow: /review/
    Disallow: /sendfriend/
    Disallow: /wishlist/

    # Files
    Disallow: /composer.json
    Disallow: /composer.lock
    Disallow: /CONTRIBUTING.md
    Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
    Disallow: /COPYING.txt
    Disallow: /Gruntfile.js
    Disallow: /LICENSE.txt
    Disallow: /LICENSE_AFL.txt
    Disallow: /nginx.conf.sample
    Disallow: /package.json
    Disallow: /php.ini.sample
    Disallow: /RELEASE_NOTES.txt

    # Do not index pages that are sorted or filtered.
    Disallow: /*?*product_list_mode=
    Disallow: /*?*product_list_order=
    Disallow: /*?*product_list_limit=
    Disallow: /*?*product_list_dir=

    # Do not index session ID
    Disallow: /*?SID=
    Disallow: /*?
    Disallow: /*.php$

    # CVS, SVN directory and dump files
    Disallow: /*.CVS
    Disallow: /*.Zip$
    Disallow: /*.Svn$
    Disallow: /*.Idea$
    Disallow: /*.Sql$
    Disallow: /*.Tgz$
Check these frequently used Magento 2 robots.txt file examples:
- Allow full access to all directories and pages:
    User-agent: *
    Disallow:
- Don't allow access for any user-agent to any directory and page:
    User-agent: *
    Disallow: /
- Default instructions:
    Disallow: /lib/
    Disallow: /*.php$
    Disallow: /pkginfo/
    Disallow: /report/
    Disallow: /var/
    Disallow: /catalog/
    Disallow: /customer/
    Disallow: /sendfriend/
    Disallow: /review/
    Disallow: /*SID=
- Restrict user accounts and checkout pages:
    Disallow: /checkout/
    Disallow: /onestepcheckout/
    Disallow: /customer/
    Disallow: /customer/account/
    Disallow: /customer/account/login/
- To disallow duplicate content:
    Disallow: /tag/
    Disallow: /review
- To disallow CMS directories:
    Disallow: /app/
    Disallow: /bin/
    Disallow: /dev/
    Disallow: /lib/
    Disallow: /phpserver/
    Disallow: /pub/
- To disallow catalog and search pages:
    Disallow: /catalogsearch/
    Disallow: /catalog/product_compare/
    Disallow: /catalog/category/view/
    Disallow: /catalog/product/view/
- To disallow URL filter searches:
    Disallow: /*?dir*
    Disallow: /*?dir=desc
    Disallow: /*?dir=asc
    Disallow: /*?limit=all
    Disallow: /*?mode*
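Several of the rules above rely on wildcards: `*` stands for any run of characters and a trailing `$` anchors the end of the URL. The Python sketch below is one way to see which paths such a pattern would catch. The helper `rule_matches` and the sample paths are hypothetical, and the matching logic is a simplified reading of the wildcard behaviour major search engines document, not any crawler's actual implementation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the URL path.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


print(rule_matches("/*?dir=desc", "/shoes.html?dir=desc"))  # True
print(rule_matches("/*?mode*", "/shoes.html"))              # False
print(rule_matches("/*.php$", "/index.php"))                # True
print(rule_matches("/*.php$", "/index.php?x=1"))            # False
```

Running a handful of real store URLs through a helper like this before saving the rules is a cheap way to catch a pattern that blocks more than intended.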
Common web crawlers:
Here are some common bots on the internet:
    User-agent: Googlebot
    User-agent: Googlebot-Image/1.0
    User-agent: Googlebot-Video/1.0
    User-agent: Bingbot
    User-agent: Slurp # Yahoo
    User-agent: DuckDuckBot
    User-agent: Baiduspider
    User-agent: YandexBot
    User-agent: facebot # Facebook
    User-agent: ia_archiver # Alexa
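These user-agent names let you give individual bots their own rules. As a rough illustration, the Python sketch below feeds a made-up rule set to the standard `urllib.robotparser` module and asks what each bot may fetch. The rules themselves are an assumption for the example (Googlebot is kept out of checkout only, every other bot is blocked entirely) and are not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: one group for Googlebot, one for everyone else.
rules = """\
User-agent: Googlebot
Disallow: /checkout/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

googlebot_catalog = parser.can_fetch("Googlebot", "/catalog/")
googlebot_checkout = parser.can_fetch("Googlebot", "/checkout/cart/")
bingbot_catalog = parser.can_fetch("Bingbot", "/catalog/")

print("Googlebot /catalog/:", googlebot_catalog)
print("Googlebot /checkout/cart/:", googlebot_checkout)
print("Bingbot /catalog/:", bingbot_catalog)
```

A bot that has its own group follows that group; any bot without one falls back to the `User-agent: *` group.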
Communicate with search engines the correct way! Once you have created robots.txt, you can validate it with Google's robots.txt Tester.
Create Magento 2 Robots.txt file and improve your SEO today!
Feel free to ask for any help with the topic in the Comments section below.
Do rate the post with 5 stars if it was helpful.
Thanks 😊