I have previously written about the power (and danger) of the meta robots tag. It’s one line of code that can keep lower-quality pages from being indexed, and it can also tell the engines not to follow any links on the page (i.e., don’t pass any link signals through to the destination pages).
That’s helpful when needed, but the meta robots tag can also destroy your SEO if used improperly. For example, you might mistakenly add the meta robots tag using noindex to pages that should be indexed. If that happens, and if it’s widespread, your pages can start dropping from Google’s index. And when that happens, you can lose rankings for those pages, along with the traffic those rankings were driving. In a worst-case scenario, your organic search traffic can plummet in an almost Panda-like fashion. In other words, it can drop off a cliff.
And before you laugh off that scenario, I can tell you that I’ve seen that happen to companies a number of times during my career. It could be human error, CMS problems, reverting to an older version of the site, etc. That’s why it’s extremely important to check for the presence of the meta robots tag to ensure the right directives are being used.
But here’s the rub. That’s not the only way to issue noindex, nofollow directives. In addition to the meta robots tag, you can also use the x-robots-tag in the header response. By using this approach, you don’t need a meta tag added to each url, and instead, you can supply directives via the server response.
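To make the idea of supplying directives via the server response concrete, here is a minimal sketch using Python's built-in wsgiref module. It is only an illustration under stated assumptions: the "/private/" path prefix and the names app, serve, and NOINDEX_PREFIXES are made up for this example, and a real site would more likely set the header in its web server or CMS configuration.

```python
from wsgiref.simple_server import make_server

# Hypothetical path prefixes that should stay out of the index.
NOINDEX_PREFIXES = ("/private/",)


def app(environ, start_response):
    """Tiny WSGI app that adds X-Robots-Tag for selected paths."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path.startswith(NOINDEX_PREFIXES):
        # The directive travels in the HTTP header, not in the html code.
        headers.append(("X-Robots-Tag", "noindex, nofollow"))
    start_response("200 OK", headers)
    return [b"<html><body>Hello</body></html>"]


def serve(port=8000):
    """Run the app locally for experimenting; not meant for production."""
    with make_server("127.0.0.1", port, app) as httpd:
        httpd.serve_forever()
```

Notice that the html returned is identical for every path. Only the header response changes, which is exactly why this is easy to miss when you only view the page source.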
In the header response, the x-robots-tag shows up as a line such as “X-Robots-Tag: noindex, nofollow”.
Again, those directives are not contained in the html code. They are in the header response, which is invisible to the naked eye. You need to specifically check the header response to see if the x-robots-tag is being used, and which directives are being used.
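If you are comfortable with a little scripting, checking the header response for a single url can be sketched in a few lines of Python using only the standard library. The function names parse_x_robots and fetch_x_robots are my own for this example, and the parsing is deliberately simple: it assumes plain comma-separated directives and does not handle bot-specific prefixes (like "googlebot: noindex") or date values that contain commas.

```python
from urllib.request import Request, urlopen


def parse_x_robots(header_values):
    """Split raw X-Robots-Tag values into a set of lowercase directives.

    Simplification: assumes plain comma-separated directives only.
    """
    directives = set()
    for value in header_values:
        for part in value.split(","):
            part = part.strip().lower()
            if part:
                directives.add(part)
    return directives


def fetch_x_robots(url, timeout=10):
    """Send a HEAD request and return the x-robots-tag directives, if any.

    Note: urlopen raises an exception for 4xx/5xx responses and for
    network failures, so callers may want to wrap this in try/except.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        # get_all returns None when the header is absent.
        values = response.headers.get_all("X-Robots-Tag") or []
    return parse_x_robots(values)
```

Calling fetch_x_robots on a url returns an empty set when the header is not present, and something like a set containing "noindex" and "nofollow" when it is.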
As you can guess, this can easily slip through the cracks unless you are specifically looking for it. Imagine checking a site for the meta robots tag, thinking all is ok when you can’t see it, but the x-robots-tag is being used with “noindex, nofollow” on every url. Not good, to say the least.
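That scenario can be sketched as a quick check that looks in both places at once. The code below is a rough illustration, not a full audit tool: it takes the html and the response headers you have already collected, and reports what it finds in the meta robots tag versus the x-robots-tag. The names MetaRobotsParser, split_directives, and robots_report are hypothetical, and the directive parsing is a simple comma split.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect the content of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.contents.append(attr_map.get("content", ""))


def split_directives(values):
    """Turn comma-separated directive strings into a lowercase set."""
    return {p.strip().lower() for v in values for p in v.split(",") if p.strip()}


def robots_report(html, headers):
    """Compare directives in the html with those in the header response.

    `headers` is a list of (name, value) pairs from the response.
    """
    parser = MetaRobotsParser()
    parser.feed(html)
    header_values = [v for k, v in headers if k.lower() == "x-robots-tag"]
    meta = split_directives(parser.contents)
    header = split_directives(header_values)
    combined = meta | header
    # "none" is shorthand for "noindex, nofollow".
    noindex = "noindex" in combined or "none" in combined
    return {"meta": meta, "header": header, "noindex": noindex}
```

A page with clean html but "noindex, nofollow" in the header would come back with an empty "meta" set and "noindex" set to True, which is the hidden problem described above.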
How To Check The X-Robots-Tag in the Header Response
Based on what I explained above, I decided to write this post to explain three different ways to check for the x-robots-tag. By adding this to your checklist, you can ensure that important directives are correct and that you are noindexing and nofollowing the right pages on your site (and not important ones that drive a lot of traffic from Google and/or Bing). The list below contains browser plugins and online tools for checking single urls, as well as crawling tools for checking urls in bulk. Let’s jump in.
1. Browser Plugins
Web Developer Plugin
The Web Developer plugin is one of my favorite plugins for checking a number of important items, and it’s available for both Firefox and Chrome. By simply clicking the plugin in your browser, then “Information”, and then selecting “Response Headers”, you can view the http header values for the url at hand. And if the x-robots-tag is being used, you will see the values listed.
SEO Site Tools
I use the SEO Site Tools Chrome extension often for checking specific SEO elements for a given page. The x-robots directives are somewhat hidden in this plugin, but you can still find them pretty easily. Just click the plugin in Chrome, then select the “Page Elements” tab, and then scroll all the way down to the bottom of the window. You’ll see the header response there, including the x-robots-tag directives if the tag is being used for the page at hand.
LiveHTTPHeaders
If you want to check headers on the fly, then there’s no better plugin than LiveHTTPHeaders. It’s available for both Chrome and Firefox and it easily enables you to view the header response for each page as you browse the web. For example, you can check headers and track down problems as you traverse a specific website.
Since it provides the header response for each page, you will also see the x-robots-tag directives for each url. Just click the url you want in the window to view the header response. The x-robots-tag will be listed if it’s used for the url at hand.
2. Online Tools For Checking The Header Response
In addition to plugins, you can use a number of online tools that take a url (or several urls) and return the header response. Like plugins, this is a good option when you are checking single urls or just testing a sample of urls.
SEO Tools Server Header Checker
There are two options when using the SEO Tools Server Header Checker. You can check a single url or you can use the bulk url option to check several urls at one time. For the single url option, just enter a url to test and click “Check Headers”. The tool will return the header response for the url at hand, including the x-robots-tag directives.
For the bulk header check, enter a series of urls (one on each line) and click “Check Headers”. You will see each response for each of the urls listed, along with the x-robots-tag if it’s being used.
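The same one-url-per-line idea can be sketched in Python if you want to run a small bulk check yourself. This is a rough approximation of the workflow, not how the online tool works internally. The names head_x_robots and check_urls are my own, and the fetch function is passed in as a parameter so it can be swapped out (for example, when testing without hitting the network).

```python
from urllib.request import Request, urlopen


def head_x_robots(url, timeout=10):
    """Return the raw X-Robots-Tag value for a url, or "" if absent."""
    with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
        return ", ".join(response.headers.get_all("X-Robots-Tag") or [])


def check_urls(text, fetch=head_x_robots):
    """Check a block of urls (one per line) and return (url, result) rows."""
    rows = []
    for line in text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        try:
            rows.append((url, fetch(url) or "(no x-robots-tag)"))
        except (OSError, ValueError) as exc:
            # OSError covers urllib's URLError/HTTPError and timeouts;
            # ValueError covers malformed urls.
            rows.append((url, f"error: {exc}"))
    return rows
```

For a handful of urls this is plenty. For thousands of urls, a proper crawler is the better choice.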
URI Valet
URI Valet is a versatile online tool that returns a number of important pieces of information for the url at hand. For example, it returns the header response, performance information, internal links, external links, validation information, etc. You can also select a user agent for checking the response based on various browsers, devices, and search engine bots. There’s quite a bit of functionality built into this online tool, but I won’t go into detail about all the reports here. That’s because we are focused on the header response (to find the x-robots-tag).
Simply enter the url, select a user-agent (or just keep the default selected), click the “I’m not a robot” button, and then click submit. The header response will be listed below, along with the x-robots-tag directives (if used).
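Selecting a user-agent matters because a server can return different headers to different clients. Here is a small sketch of the same idea in Python. The user-agent strings below are examples that should be confirmed against each search engine's current documentation, and the names USER_AGENTS, build_request, and x_robots_for_agent are hypothetical. Keep in mind this only changes the request header; a server that verifies bots by IP address may still treat the request differently than it would treat the real crawler.

```python
from urllib.request import Request, urlopen

# Example user-agent strings; confirm current values before relying on them.
USER_AGENTS = {
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
}


def build_request(url, agent="googlebot"):
    """Build a HEAD request that identifies itself with the chosen user-agent."""
    return Request(url, method="HEAD", headers={"User-Agent": USER_AGENTS[agent]})


def x_robots_for_agent(url, agent="googlebot", timeout=10):
    """Return the raw X-Robots-Tag values served to the chosen user-agent."""
    with urlopen(build_request(url, agent), timeout=timeout) as response:
        return response.headers.get_all("X-Robots-Tag") or []
```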
3. Crawling Tools
Now that I’ve covered some plugins and online tools that can help you check the x-robots-tag, let’s check out some robust crawling tools. For example, if you want to crawl many urls in bulk (like 10K, 100K, or 1M+ pages) to check for the presence of the x-robots-tag, then the following tools can be extremely helpful.
DeepCrawl
If you want a robust, enterprise-level crawling engine, then DeepCrawl is for you. Note, I’ve been such a big proponent of DeepCrawl that I’m now on the customer advisory board. So yes, I’m a fan. :)
After crawling a site, you can easily check the “Noindex Pages” report to view all pages that are noindexed via the meta robots tag, the x-robots-tag header response, or by using noindex in robots.txt. You can export the list and then filter in Excel to isolate pages noindexed via the x-robots-tag.
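If you would rather script the filtering step than do it in Excel, it can be sketched in Python with the csv module. Important caveat: the column names "url" and "noindex_source" are placeholders I made up for this example. They are not the actual column names in a DeepCrawl export, so check your exported file and adjust them accordingly.

```python
import csv
import io


def noindexed_via_header(csv_text, url_col="url", source_col="noindex_source"):
    """Return urls whose noindex source mentions the x-robots-tag.

    The column names are hypothetical; match them to your real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row[url_col]
        for row in reader
        if "x-robots" in (row.get(source_col) or "").lower()
    ]
```

You would read the exported file into a string (or adapt the function to take a file object) and pass it in. The result is the list of urls noindexed via the header response.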
Screaming Frog
I’ve also been a big fan of Screaming Frog for a long time. It’s an essential tool in my SEO arsenal and I often use Screaming Frog in combination with DeepCrawl. For example, I might crawl a large-scale site using DeepCrawl and then isolate certain areas for surgical crawls using Screaming Frog.
Once you crawl a site using Screaming Frog, you can simply click the Directives tab and then look for the x-robots column. If any pages are using the x-robots-tag, then you will see which directives are being used per url.
Summary – There’s more than one way to noindex a page…
OK, now there’s no excuse for missing the x-robots-tag during an SEO audit. :) If you notice certain pages are not being indexed, yet the meta robots tag isn’t present in the html code, then you should definitely check for the presence of the x-robots-tag. You just might find important pages being noindexed via the header response. And again, it could be a hidden problem that’s causing serious SEO issues.
Moving forward, I recommend checking out the various plugins, online tools, and crawlers I listed in this post. All of them can help you surface important directives that may be impacting your SEO efforts.