The Internet Marketing Driver

Google’s Algorithms Can Ignore Rel Canonical When URLs Contain Different Content. Here’s Proof.

January 26, 2017 By Glenn Gabe

Can Google ignore rel canonical? Yes, that can happen if Google believes the urls are not equivalent. Google’s John Mueller explained this in a recent webmaster hangout video and I have seen this happen a number of times while helping clients. My post below contains John’s comments along with a case study.

Google Can Ignore Rel Canonical

Last year I wrote a post explaining how Google can treat redirects to non-relevant urls as soft 404s. Google’s John Mueller explained that during a webmaster hangout video and it was great to hear confirmation from Google that it can happen. I’ve seen that first-hand many times and provided a case study of that happening in my post. You should check it out if you haven’t already.

The reason I bring up that post is that we have another statement from Google, and again, I’ve seen it happen first-hand a number of times. In Tuesday’s webmaster hangout video (at 33:31), Google’s John Mueller explained that if you are canonicalizing one url to another, but the content is different on the two pages, then Google’s algorithms might think you made a mistake by using rel canonical. And if that happens, Google may simply index the url that’s being canonicalized anyway. Yes, read that again. You might be surprised to learn that.

I find a lot of people don’t know that rel canonical is a hint and not a directive. There’s an important distinction between the two: Google must obey a directive, but it can choose to disregard a hint. You can watch the video below of John explaining that rel canonical can be ignored (at 33:31 in the video):

When that happens, the pages you believe are not being indexed actually are being indexed. It’s important to understand this, since you can end up with many more urls indexed than you intended. And if those pages are lower quality, they can impact your site quality-wise (as Google will evaluate all of your content for quality purposes). In addition, those pages can now be surfaced in the search results and rank for queries that you didn’t intend them to rank for.

Here’s a quick graphic explaining what can happen indexation-wise:

Google Can Ignore Rel Canonical
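
To get a rough feel for whether two urls are "equivalent" before canonicalizing one to the other, you can compare their visible text. The sketch below is only a heuristic under stated assumptions: Google has not published how its algorithms judge equivalence, the function names are hypothetical, and the 0.8 threshold is an arbitrary starting point rather than a known cutoff. It uses Python's standard library only and expects you to supply the HTML of both pages yourself.

```python
import re
from difflib import SequenceMatcher


def visible_text(html: str) -> str:
    """Very rough text extraction: drop script/style blocks, then all tags."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    # Collapse whitespace and lowercase so formatting differences are ignored.
    return re.sub(r"\s+", " ", text).strip().lower()


def content_similarity(html_a: str, html_b: str) -> float:
    """Return a ratio between 0.0 and 1.0 describing how alike the text is."""
    matcher = SequenceMatcher(
        None, visible_text(html_a), visible_text(html_b), autojunk=False
    )
    return matcher.ratio()


def looks_equivalent(html_a: str, html_b: str, threshold: float = 0.8) -> bool:
    """Assumption: pages above the (arbitrary) threshold are near-duplicates."""
    return content_similarity(html_a, html_b) >= threshold
```

Pairs that score low are the ones worth reviewing by hand, since those are the canonical tags most at risk of being ignored. Note that `SequenceMatcher` can be slow on very long pages, so this is better suited to spot checks than to a full crawl.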

Quick Case Study – Proof This Happens
It’s interesting timing for me personally, since I’m working on a crawl analysis and audit right now where this exact situation surfaced. It was clear during the audit that an excessive number of pages are being canonicalized to other urls, and not to equivalent content.

And quickly checking indexation via a site command for those pages revealed over one thousand urls still being indexed (even though they are being canonicalized to other urls on the site). In addition, I decided to filter landing pages in GSC and found the site command wasn’t accurate. There are actually MORE urls indexed that shouldn’t be.

I exported the filtered list of landing pages via Analytics Edge and found 2,867 urls indexed that shouldn’t be (that are gaining impressions and clicks in the Google search results). Note, to learn how to use Analytics Edge to export all of your GSC data, follow my tutorial published on Search Engine Land.

Again, Google’s algorithms are simply treating the canonical setup as a mistake and choosing to ignore the canonical url tag on those urls.

Here are some screenshots of what I found:

Site command revealing 1,360 pages indexed that are canonicalized to other urls:
Canonicalized Pages That Are Indexed

GSC shows that even more urls are receiving impressions and clicks (2,867):
GSC reveals canonicalized urls receiving impressions and clicks.

And a crawl of those urls from GSC confirms they are being canonicalized to other urls:
GSC shows 2,867 canonicalized urls yielding impressions and clicks.

Therefore, the company I’m helping has thousands of pages indexed that it didn’t intend to have indexed. I’m working with the company now on a new canonicalization strategy that will be much different from what it is doing now. It’s a great example of Google simply believing the canonical setup was a mistake and then indexing the urls anyway. Beware.
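
The cross-check from the case study (landing pages exported from GSC, matched against canonical targets found in a crawl) can be sketched in a few lines of Python. This is a minimal illustration, not the exact process I ran: the `LandingPage` type, the function names, and the shape of the inputs are all assumptions, and you would need to load your own GSC export and crawl data into them.

```python
from typing import Dict, List, NamedTuple
from urllib.parse import urlsplit, urlunsplit


class LandingPage(NamedTuple):
    """One row from a (hypothetical) GSC landing page export."""
    url: str
    impressions: int
    clicks: int


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, default the path to '/'."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def indexed_despite_canonical(
    landing_pages: List[LandingPage], canonical_of: Dict[str, str]
) -> List[LandingPage]:
    """Return pages that gain impressions while canonicalized to another url."""
    flagged = []
    for page in landing_pages:
        target = canonical_of.get(page.url)
        if target is None:
            continue  # url was not crawled, or it has no canonical tag
        if page.impressions > 0 and normalize(target) != normalize(page.url):
            flagged.append(page)
    # Highest-impression urls first, since those matter most.
    return sorted(flagged, key=lambda p: p.impressions, reverse=True)
```

Here `canonical_of` maps each crawled url to the canonical url declared on that page. Any url returned is one where the canonical hint does not appear to be honored, at least based on the impressions GSC reports.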

Options for SEOs
Some of you might be reading this post and feeling your blood pressure rise as you wonder how many pages on your site have been indexed that shouldn’t be. Well, that’s the point of my post! You should be wondering if your canonical setup is correct, how you can improve it, and how that can benefit your site SEO-wise.

When developing a canonicalization strategy, you have several options at your disposal. The exact setup depends on your own specific situation, but I’ll provide a quick set of bullets below (based on urls that have significantly different content). I highly recommend reviewing these options with your team and coming up with the optimal setup for your specific site.

Indexation Strategy Options When Content Differs Greatly:

  • Noindex:
    If the pages shouldn’t be indexed, and you definitely don’t want them indexed, then adding the meta robots tag using “noindex, follow” can work well. When using “noindex, follow”, you are telling Google to specifically not index the pages at hand. Noindex is a directive, so Google will follow the directive and not index those urls. And using “follow” enables Google to crawl the links on the page to discover content being linked to from the noindexed pages (while also passing signals to the destination urls).
  • Improve your canonicalization setup:
    Rel canonical can work well, but only when it’s used properly. If you have content that’s duplicative, then by all means, use rel canonical. Then Google’s algorithms can consolidate indexing properties from the canonicalized pages and fold them together with the canonical url. But if you have pages with unique content, then ask yourself if those pages should be indexed. Do you want them ranking for targeted queries? Should users be able to find them from the search results? So on and so forth.
    Don’t simply canonicalize mass amounts of urls to other urls with greatly different content. That’s not really the intent of rel canonical anyway. It was introduced to help cut down on duplicate content and help webmasters point Google in the right direction about which urls to index and surface in the search results. As demonstrated above, Google’s algorithms can think the canonical setup is a mistake and simply ignore the canonical url tag.
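
When reviewing either option across many urls, it helps to read what each page actually declares. The following is a minimal sketch using Python's built-in `html.parser`; the class and function names are hypothetical. It only inspects the HTML itself (the `<link rel="canonical">` tag and the `<meta name="robots">` tag), so it will not see canonical or robots signals sent via HTTP headers.

```python
from html.parser import HTMLParser
from typing import List, Optional


class IndexingHints(HTMLParser):
    """Collects the canonical url and meta robots directives from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None
        self.robots: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            if self.canonical is None:  # keep the first canonical found
                self.canonical = attr.get("href") or None
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]


def read_hints(html: str) -> dict:
    """Return the declared canonical url and whether the page is noindexed."""
    parser = IndexingHints()
    parser.feed(html)
    return {"canonical": parser.canonical, "noindex": "noindex" in parser.robots}
```

For example, feeding it a page whose head contains `<meta name="robots" content="noindex, follow">` reports `noindex` as true, while a page with only a canonical tag reports the target url so you can compare it against the page's own url.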


Summary – Improving Canonicalization
Canonicalization can be a complex subject, especially for larger-scale sites with many moving parts. When I perform a crawl analysis and audit for clients, I’m always keeping a keen eye on the canonical setup and whether Google is following that setup. If Google ignores a flawed setup, then that can impact indexation. And that means Google will use those additional urls when evaluating “quality”. In addition, Google might also surface the wrong urls in the search results without the webmaster even knowing that’s happening (if the problem sits unnoticed).

Therefore, it’s always important to review the signals you are sending Google, and then determine how Google is responding. If Google ends up ignoring your hints, then maybe those aren’t the greatest hints to be sending in the first place.

GG

 


Filed Under: google, seo
