The Internet Marketing Driver


Google’s Algorithms Can Ignore Rel Canonical When URLs Contain Different Content. Here’s Proof.

January 26, 2017 By Glenn Gabe


Can Google ignore rel canonical? Yes, that can happen if Google believes the urls are not equivalent. Google’s John Mueller explained this in a recent webmaster hangout video and I have seen this happen a number of times while helping clients. My post below contains John’s comments along with a case study.

Google Can Ignore Rel Canonical

Last year I wrote a post explaining how Google can treat redirects to non-relevant urls as soft 404s. Google’s John Mueller explained that during a webmaster hangout video and it was great to hear confirmation from Google that it can happen. I’ve seen that first-hand many times and provided a case study of that happening in my post. You should check it out if you haven’t already.

The reason I bring up that post is that we have another statement from Google, and again, I’ve seen it happen first-hand a number of times. In Tuesday’s webmaster hangout video (at 33:31), Google’s John Mueller explained that if you are canonicalizing one url to another, but the content is different on both pages, then Google’s algorithms might think you made a mistake by using rel canonical. And if that happens, then Google may simply index the url that’s being canonicalized anyway. Yes, read that again. You might be surprised to learn that.

I find a lot of people don’t know that rel canonical is a hint and not a directive. There’s an important distinction between the two: Google follows a directive, but it is free to ignore a hint. You can watch the video below of John explaining that rel canonical can be ignored (at 33:31 in the video):

When that happens, the pages you believe are not being indexed actually are being indexed. It’s important to understand this, since you can end up with many more urls indexed than you intended. And if those pages are low quality, then they can impact your site quality-wise (as Google will evaluate all of your content for quality purposes). In addition, those pages can now be surfaced in the search results, and rank for queries that you didn’t intend them to rank for.

Here’s a quick graphic explaining what can happen indexation-wise:

Google Can Ignore Rel Canonical

Quick Case Study – Proof This Happens
It’s interesting timing for me personally, since I’m working on a crawl analysis and audit right now where this exact situation surfaced. It was clear during the audit that an excessive number of pages are being canonicalized to other urls, and not to equivalent content.
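
To make "equivalent content" a little more concrete, here is a minimal Python sketch that compares the visible text of a page with the visible text of its canonical target. To be clear, this is not how Google decides whether two urls are equivalent (that isn't public). The helper names and the 0.8 threshold are my own assumptions for illustration, and you would need to tune the threshold for your own site.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the page's visible text as one whitespace-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def content_similarity(html_a, html_b):
    """Return a ratio between 0 and 1 based on the word sequences of both pages."""
    words_a = visible_text(html_a).lower().split()
    words_b = visible_text(html_b).lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


def looks_equivalent(html_a, html_b, threshold=0.8):
    """Hypothetical check: the threshold is an arbitrary assumption, not a Google rule."""
    return content_similarity(html_a, html_b) >= threshold
```

Running `looks_equivalent()` over each canonicalized url and its target during a crawl gives you a rough list of canonical tags that point to clearly different content, which are the ones most at risk of being ignored.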

And quickly checking indexation via a site command for those pages revealed over one thousand urls still being indexed (even though they are being canonicalized to other urls on the site). In addition, I decided to filter landing pages in GSC and found the site command wasn’t accurate. There are actually MORE urls indexed that shouldn’t be.

I exported the filtered list of landing pages via Analytics Edge and found 2,867 urls indexed that shouldn’t be (that are gaining impressions and clicks in the Google search results). Note, to learn how to use Analytics Edge to export all of your GSC data, follow my tutorial published on Search Engine Land.

Again, Google’s algorithms simply believe the canonical setup was a mistake and are choosing to ignore the canonical url tag on those urls.

Here are some screenshots of what I found:

Site command revealing 1,360 pages indexed that are canonicalized to other urls:
Canonicalized Pages That Are Indexed

GSC shows that even more urls are receiving impressions and clicks (2,867):
GSC reveals canonicalized urls receiving impressions and clicks.

And a crawl of those urls from GSC confirms they are being canonicalized to other urls:
GSC shows 2,867 canonicalized urls yielding impressions and clicks.
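
The cross-check described above (urls gaining impressions in GSC that are also canonicalized to other urls) can be sketched in Python. This is only a rough illustration under my own assumptions: `pages` is a hypothetical dictionary of already-crawled HTML keyed by url, `urls_with_impressions` is the landing page list exported from GSC, and the function names are made up for this example. It only looks at the rel canonical link element in the HTML and ignores canonicals sent via HTTP headers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element found."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"].strip()


def find_canonical(html, page_url):
    """Return the absolute canonical url declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if finder.canonical is None:
        return None
    # Resolve relative hrefs against the url of the page itself.
    return urljoin(page_url, finder.canonical)


def canonicalized_elsewhere(pages, urls_with_impressions):
    """List (url, canonical_target) pairs for urls that point to a different url."""
    flagged = []
    for url in urls_with_impressions:
        html = pages.get(url)
        if html is None:
            continue  # url was not crawled, so nothing to check
        target = find_canonical(html, url)
        if target is not None and target.rstrip("/") != url.rstrip("/"):
            flagged.append((url, target))
    return flagged
```

Any url returned by `canonicalized_elsewhere()` is receiving impressions even though it is canonicalized to another url, which is the situation shown in the screenshots.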

Therefore, the company I’m helping has thousands of pages indexed that they didn’t intend to have indexed. I’m working with the company now on a new canonicalization strategy that will be much different from what they are doing now. It’s a great example of Google simply believing the canonical setup was a mistake and then indexing the urls anyway. Beware.

Options for SEOs
Some of you might be reading this post and feeling your blood pressure rise as you wonder how many pages on your site have been indexed that shouldn’t be. Well, that’s the point of my post! You should be wondering if your canonical setup is correct, how you can improve it, and how that can benefit your site SEO-wise.

When developing a canonicalization strategy, you have several options at your disposal. The exact setup depends on your own specific situation, but I’ll provide a quick set of bullets below (based on urls that have significantly different content). I highly recommend reviewing these options with your team and coming up with the optimal setup for your specific site.

Indexation Strategy Options When Content Differs Greatly:

  • Noindex:
    If the pages shouldn’t be indexed, and you definitely don’t want them indexed, then adding the meta robots tag using “noindex, follow” can work well. When using “noindex, follow”, you are telling Google to specifically not index the pages at hand. Noindex is a directive, so Google will follow the directive and not index those urls. And using “follow” enables Google to crawl the links on the page to discover content being linked to from the noindexed pages (while also passing signals to the destination urls).
  • Improve your canonicalization setup:
    Rel canonical can work well, but only when it’s used properly. If you have content that’s duplicative, then by all means, use rel canonical. Google’s algorithms can then consolidate indexing properties from the canonicalized pages and fold them together with the canonical url. But if you have pages with unique content, then ask yourself if those pages should be indexed. Do you want them ranking for targeted queries? Should users be able to find them from the search results?
    Don’t simply canonicalize mass amounts of urls to other urls with greatly different content. That’s not the intent of rel canonical. It was introduced to help cut down on duplicate content and help webmasters point Google in the right direction about which urls to index and surface in the search results. As demonstrated above, Google’s algorithms can think the canonical setup is a mistake and simply ignore the canonical url tag.
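
If you go the noindex route, it's worth verifying that the tag is actually present on the pages you want kept out of the index. Here is a small Python sketch for that check. The class and function names are my own for illustration, and it only inspects the meta robots tag in the HTML, so it won't see directives delivered via the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = (attr.get("content") or "").lower()
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                token.strip() for token in content.split(",") if token.strip()
            )


def robots_directives(html):
    """Return the set of meta robots directives found in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives


def is_noindexed(html):
    """True if the page asks not to be indexed ("none" implies noindex, nofollow)."""
    directives = robots_directives(html)
    return "noindex" in directives or "none" in directives


# The "noindex, follow" setup described in the first bullet looks like this:
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(sample))
```

Running `is_noindexed()` across a crawl export is a quick way to confirm that the pages you moved from rel canonical to noindex are sending the directive you intended.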


Summary – Improving Canonicalization
Canonicalization can be a complex subject, especially for larger-scale sites with many moving parts. When I perform a crawl analysis and audit for clients, I’m always keeping a keen eye on the canonical setup and whether Google is following that setup. If Google ignores a flawed setup, then that can impact indexation. And that means Google will use those additional urls when evaluating “quality”. In addition, Google might also surface the wrong urls in the search results without the webmaster even knowing that’s happening (if the problem sits unnoticed).

Therefore, it’s always important to review the signals you are sending Google, and then determine how Google is responding. If Google ends up ignoring your hints, then maybe those aren’t the greatest hints to be sending in the first place.

GG

 


Filed Under: google, seo

Copyright © 2022 G-Squared Interactive LLC. All Rights Reserved. | Privacy Policy
