Updated: 22nd September 2022
Reading time: 6 min
Anyone who has ever had anything to do with website positioning knows that when Google announces an algorithm update, you need to adjust even before it goes live. In July 2019, the giant from Mountain View shared news about an upcoming change in website indexing.
We found out about it from the official Googleblog - a site regularly updated by Google's developers, where you will find the most relevant information about important changes.
Is the end of Google's support for noindex in robots.txt files a new positioning revolution, or just a small, irrelevant update?
“In the interest of maintaining a healthy ecosystem and preparing for potential future open source releases, we're retiring all code that handles unsupported and unpublished rules (such as noindex) on September 1, 2019.”
In other words, you will no longer be able to use the robots.txt file to decide which of your content appears in search results and which doesn't.
Before this update, you could exclude entire pages or even single elements that way.
For some people, this decision might seem hasty, but Google had been warning us for some time not to rely on robots.txt when it comes to noindex.
Some websites or directories contain confidential data that you'd rather hide from robots. The same goes for testing new services that contain, for example, duplicate content - there is no point in letting them appear in search results.
When you run an online store, you might have plenty of empty subpages for products that are no longer available. Most people don't enjoy clicking a link to a product only to find out they can't purchase it.
Regardless of the type of website you run, chances are you have terms of use that don't need to be visible to everybody - after all, only customers who use your services need to see them.
Robots.txt is simply a text file responsible for communicating with indexing robots.
It is placed on the server and is used to allow or deny access to certain files in particular directories of the website - in short, the robots.txt file is used to control indexing. When you marked a certain folder as noindex there, robots would leave it out. Or at least they would before.
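To make this concrete, here is a minimal sketch of a robots.txt file; the paths are hypothetical, and the last rule is the unofficial one that Google stopped honoring on September 1, 2019:

```
User-agent: *
# Keep crawlers out of a directory with confidential data
Disallow: /private/
# Keep crawlers away from a single file, e.g. a terms-of-use page
Disallow: /terms.html

# The unofficial, now-retired rule - Googlebot no longer honors it:
Noindex: /drafts/
```

Note that Disallow only stops crawling; it does not guarantee that a URL stays out of the index, which is exactly why the unofficial noindex rule was popular before it was retired.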
The official Google blog lists five ways of controlling indexing:
Noindex in robots meta tags - using the noindex tag means that a search engine can crawl a page but is not allowed to display it in search results (see the example after this list).
404 and 410 HTTP status codes - both status codes mean that the page does not exist, which will drop such URLs from Google's index once they're crawled and processed.
Password protection - hiding a page behind a login will generally remove it from Google's index.
Disallow in robots.txt - search engines can only index pages that they know about, so blocking a page from being crawled usually means its content won't be indexed. While a search engine may still index a URL based on links from other pages without seeing the content itself, Google aims to make such pages less visible in the future.
Search Console Remove URL tool - a quick and easy way to temporarily remove a URL from Google's search results.
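The first option on that list is the most direct replacement for noindex in robots.txt. Here is a minimal sketch of a page using the robots meta tag (the page itself is hypothetical):

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Allows crawling, but tells search engines
       not to show this page in search results -->
  <meta name="robots" content="noindex">
  <title>Terms of use</title>
</head>
<body>
  <!-- page content -->
</body>
</html>
```

One caveat worth remembering: a crawler has to be able to fetch the page to see this tag, so don't block the same page with Disallow in robots.txt and expect its noindex meta tag to work at the same time.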
If you want to stay up to date, make sure to regularly check the Googleblog.
The noindex update is important, but it is only relevant for well-built websites.
In WebWave, you get easy access to the robots.txt file - without coding. All you need to do is copy and paste content into the file.
That's why, if you are still wondering how to build a website and which tool to use, take a look at WebWave, the most user-friendly website builder.
Authors: Julia Madraszewska & Weronika Wawrzyniak