The Internet Marketing Driver


Google Search Console (GSC) reporting for Soft 404s is now more accurate. But where did those Soft 404s go?

February 19, 2021 By Glenn Gabe

On January 11, 2021, Google announced improvements to GSC’s Index Coverage data. With the update, which is clearly apparent in the reporting, Search Console removed the generic “crawl anomaly” issue (mapping those urls to more granular issues), changed how urls blocked by robots.txt are reported, added a new warning called “indexed without content”, and included one final note about soft 404s. That last bullet said, “Soft 404 reporting is now more accurate.” And based on my work with large-scale sites, I was particularly interested in that final bullet.

A number of the sites I help had tens of thousands, or even hundreds of thousands, of soft 404s being reported based on the nature of their content. Starting on January 5, 2021, the number of soft 404s reported in GSC fell off a cliff. I’m talking about a severe drop, like these examples:

In order to understand why alarms were going off in my office, and why I’m particularly interested in those drops, let me quickly cover more about soft 404s. Then I’ll explain what I found once I dug into each situation.

A Deeper Look at Soft 404s:
Google explains that a soft 404 is a url that tells the user the page doesn’t exist, but it still returns a 200 header response code (instead of a hard 404). In practice, the soft 404s that Google has surfaced historically have been urls that span different page types. For example, the most obvious is a product that’s no longer available that still returns a 200 response code. It might have very little content and a message telling users the product is not available on the site anymore.
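To make that distinction concrete, here is a minimal sketch (in Python, using the third-party requests library) of the kind of quick check you can run on candidate soft 404s: does the url return a 200 even though the body looks like a dead page? The url, phrases, and length threshold below are placeholders, not Google's actual detection logic.

```python
# A minimal sketch (not Google's detection logic) for flagging likely soft 404
# candidates: urls that return a 200 status code but contain "not found" style
# messaging or very little content. Requires the third-party requests library.
import requests

NOT_FOUND_PHRASES = [
    "page not found",
    "no longer available",
    "this product is not available",
]

def looks_like_soft_404(url: str) -> bool:
    """Return True if the url responds 200 but the body suggests a dead page."""
    response = requests.get(url, timeout=10, allow_redirects=True)
    if response.status_code != 200:
        return False  # a hard 404/410 (or non-200 response) is not a soft 404
    body = response.text.lower()
    too_thin = len(body) < 2000  # arbitrary threshold for "very little content"
    has_dead_message = any(phrase in body for phrase in NOT_FOUND_PHRASES)
    return has_dead_message or too_thin

if __name__ == "__main__":
    # example.com/discontinued-product is a placeholder url
    print(looks_like_soft_404("https://example.com/discontinued-product"))
```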

But other types of pages have shown up in the soft 404s reporting over time. For example, I’ve seen super-thin pages show up there, like articles, slideshows, job listings, pages with images or video but little textual content, and more. And on larger-scale sites, those pages can be problematic quality-wise depending on how Google is handling them.

Remember, a soft 404 is treated like a hard 404: Google will not index urls identified as soft 404s. But if a site had a big soft 404 problem, I often found that Google wasn’t always handling all of those page types as soft 404s… And that led to many thinner urls getting indexed.

So, if you had 50K super-thin pages and Google saw them as soft 404s, then technically that shouldn’t be a huge problem quality-wise. Since they are seen as hard 404s, Google will obviously not take them into account when evaluating quality.

But, what if suddenly those urls aren’t seen as soft 404s anymore? Well, those urls could be counted when Google evaluates quality overall for the site. That’s a big reason my “quality antennae” went up when I saw tens of thousands, or even hundreds of thousands, of soft 404s suddenly disappear. Are they indexed now, categorized as another issue in GSC and not being indexed, or maybe not even known to Google anymore? I NEEDED TO KNOW!

In addition, soft 404s can impact crawl budget, which can be a concern for larger-scale sites. Note, most sites do not need to worry about crawl budget. It’s typically only a concern for very large sites (often those with over one million urls). For example, this site needs to worry about crawl budget:

Google’s own documentation explains that soft 404s can eat up crawl budget. So it’s another concern when it comes to soft 404s (and understanding where they magically went on 1/5/21). In the past, if you saw them flagged in GSC, you could identify patterns and handle those urls appropriately. But if they suddenly disappear, it could be harder to surface and handle them.

In Search Of: Lost Soft 404s
This all leads me back to one question… where did all of those soft 404s go?? After seeing the number of soft 404s drop off a cliff on 1/5/21, I decided to dig into the data across sites and find out what happened. So, I grabbed my virtual pickaxe and headed into the GSC mine. I was determined to find out how those urls were being treated now.

Now, if you choose to dig into Search Console after a change is rolled out by the GSC product team, you’ll quickly realize that you cannot access historical data. That sucks, and it means you will have a hard time finding urls previously categorized as soft 404s unless you documented them somehow. I had this covered since I always dig heavily into GSC data when auditing client sites. I was able to open my vault of recent audit documents and resurface many soft 404s that were reported over the past six to twelve months.

Based on digging into the data, I was sometimes relieved with what I discovered for certain situations, but also alarmed by what I found in other situations. I’ll explain more below.

Where Did Soft 404s Go and How Is Google Currently Handling Those URLs?
Below, I’ll cover the various categories of findings based on digging into a number of large-scale sites that previously reported many soft 404s. On 1/5/21, the number of soft 404s dropped off a cliff and I was determined to find out how they were currently being treated.

Super-interesting: “The URL is unknown to Google”
This was incredibly interesting to me. I noticed many urls across sites that were once reported as soft 404s now returning “The URL is unknown to Google” when inspected in GSC. It’s almost like Google completely wiped out the history for these urls.

This isn’t a category that shows up in the Coverage reporting for obvious reasons… since it’s NOT known to Google. But, it was known to Google at some point recently. I know that for sure. Google’s documentation does explain more about this categorization, but I’m not sure it answers why the soft 404s are now “unknown to Google”. Here is what Google’s documentation explains:

“If the index coverage status “URL is unknown to Google” appears, it means that Google hasn’t indexed the URL either because it hasn’t seen the URL before, or because it has found it as a properly marked alternate page, but it can’t be crawled.”

I’m going to reach out to Google to learn more about this. I would imagine that categorizing many urls this way helps Google improve the efficiency of its own systems, but again, I’ll see if Google can comment on this. But if a url is unknown to Google, it won’t hurt your site quality-wise. That’s good.

“Crawled, not indexed” and “Discovered, not indexed”:
I have covered this category a number of times in my other posts about technical SEO and broad core updates. Both “Crawled, not indexed” and “Discovered, not indexed” can signal quality problems and/or crawl budget issues. With “Crawled, not indexed”, Google discovered and crawled the url, but has chosen to not index it. I have often found low-quality or thin content when analyzing this category across sites (or just parts of a site Google isn’t digging for some reason).

So, if your soft 404s end up as “Crawled, not indexed”, then that’s ok in the short-term (since they aren’t being indexed and can’t be held against the site quality-wise), but I would still heavily dig into the urls and handle them appropriately. This is also one of the reasons I believe we need larger exports from GSC’s Coverage reporting! Exporting this category in bulk would help site owners identify more patterns across the site that could be causing problems.

And for “Discovered, not indexed”, Google knows about the url, but has decided to not even crawl it (and clearly not index it). I have seen quality problems and problematic url patterns while analyzing this category of urls. For example, with some large and complex sites with deep quality issues, it’s like Google knows ahead of time what it’s going to find on many of those urls (based on url structure) and doesn’t even want to crawl them.

So, if you see many soft 404s now being categorized as “Discovered, not indexed”, then that’s also ok in the short-term since they can’t be held against the site quality-wise if they aren’t being indexed. But just like with “Crawled, not indexed”, I would dig in and make sure the urls are being handled properly. If they should 404, then make sure they return hard 404s. If they are thin and low-quality, then boost the content there. And if it’s a problematic area of the site, then work on improving that area overall. Remember, “Discovered, not indexed” means Google knows about the urls, but doesn’t believe it’s worth it to even crawl them (let alone index them).

URLs Now Indexed: Danger Will Robinson! Danger!
Yes, this is a super important category. When digging into urls that were previously categorized as soft 404s, I noticed some of them were now being indexed! That can be a dangerous situation for some sites. For example, if they were thin and low-quality urls that Google is now indexing for some reason, those urls will now be taken into account while Google evaluates quality for the site overall. “Quality indexing” is very important (especially for larger-scale sites), so having an influx of low-quality content get indexed is not a good thing.

If you notice many urls that were once flagged as soft 404s now being indexed, then it’s important to handle those urls appropriately now. For example, you can noindex low-quality content, boost the content there (improve it so it meets or exceeds user expectations), 301 it to a newer version of the content, or 404 the pages if they shouldn’t be on the site anymore. But I would not leave them as-is if they are low-quality or thin.

True soft 404s still seen as soft 404s:
Even though many soft 404s disappeared from the reporting in GSC, there were still urls categorized as true soft 404s. For example, pages that were literally blank yet returned 200 header response codes, pages that displayed “not found” messages but returned 200 response codes, etc. Again, true soft 404s. I guess this is the “handling soft 404s more accurately” part from Google. For these urls, I would make sure to handle them appropriately. If they should 404, then have them return hard 404s and not 200 header response codes. And if they shouldn’t be seen as soft 404s, then fix or boost the content there.
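For pages that should return hard 404s, here is a minimal sketch of what that looks like server-side, using Flask purely as an example framework (the route and list of removed products are hypothetical, not any specific site's code):

```python
# A minimal Flask sketch showing the fix for a true soft 404: return a real
# 404 status code for removed pages instead of a "not found" message with a 200.
from flask import Flask, abort

app = Flask(__name__)
REMOVED_PRODUCT_IDS = {"12345", "67890"}  # hypothetical list of retired products

@app.route("/products/<product_id>")
def product_page(product_id):
    if product_id in REMOVED_PRODUCT_IDS:
        # Hard 404 (or 410 Gone) so users and Google get a consistent signal.
        abort(404)
    return f"Product {product_id} page"
```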

Moving Forward: What Site Owners Can Do
If you have seen a big drop in soft 404s starting in early January 2021, then I highly recommend digging into the situation. If you don’t have historical data containing urls that were once categorized as soft 404s, there’s not really much you can do to see how those soft 404s are being reported now. But, if you did document soft 404s for your site, then you can inspect those urls to see how Google is handling them now.
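As a first pass before manually inspecting each url in GSC, you can re-check how those documented urls respond today. Here is a minimal sketch, assuming you saved them to a CSV with a url column (the file names and column name are assumptions):

```python
# A minimal sketch that re-checks previously documented soft 404 urls: it
# records the live status code, final url after redirects, and response size
# so you can prioritize which urls to inspect manually in GSC.
import csv
import requests

def triage(input_csv: str = "old_soft_404s.csv", output_csv: str = "soft_404_triage.csv") -> None:
    with open(input_csv, newline="") as infile, open(output_csv, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["url", "status_code", "final_url", "content_length"])
        for row in csv.DictReader(infile):
            url = row["url"]
            try:
                resp = requests.get(url, timeout=10, allow_redirects=True)
                writer.writerow([url, resp.status_code, resp.url, len(resp.content)])
            except requests.RequestException as exc:
                writer.writerow([url, f"error: {exc}", "", ""])

if __name__ == "__main__":
    triage()
```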

Are they “Unknown to Google” now, are they being categorized as “Crawled, not indexed”, “Discovered, not indexed”, are they still soft 404s, or are they being indexed? And if they are being indexed, then they can be counted when Google evaluates quality for the site overall. That’s why it’s important to handle the urls appropriately.

Summary – Know Your Soft 404s
With the latest data improvements in GSC, Google started better handling soft 404s. That’s great, but many sites saw a big drop in urls reported as soft 404s at that time. Since that category in GSC often yielded thin or low-quality content on a site, it’s important to understand how Google is currently handling the urls. Are they still not being indexed, or are they now being indexed and counting towards quality? I recommend digging into the situation for your site and then handling those urls appropriately.

GG

Filed Under: google, seo, tools

Google’s December 2020 Broad Core Algorithm Update Part 2: Three Case Studies That Underscore The Complexity and Nuance of Broad Core Updates

January 13, 2021 By Glenn Gabe

In part one of my series on the December 2020 broad core update, I covered a number of important items, including the rollout, timing, tremors and reversals, the impact I was seeing across verticals, and a number of important points for site owners to understand about broad core updates. If you haven’t read that post yet, I recommend doing that and then coming back to this post.

Those items spanned quality indexing, the importance of providing a strong user experience, avoiding hammering users with ads, technical SEO problems that cause quality problems, the A in E-A-T, machine learning in Search, and more. My first post will provide a strong foundation before jumping into the case studies I’ll cover below.

Three Case Studies, Three Interesting Lessons About Broad Core Updates
In this post, I’ll cover three case studies that underscore the complexity and nuance of broad core updates. Each is a unique situation and site owners (and SEOs) can learn a lot from what transpired with the sites I’ll be covering. I was planning on covering four cases, but this post was getting too long. I might cover that other case in another post in the future, though.

Remember, Google is evaluating many factors over time with broad core updates. It’s never about one or two things. Instead, there could be many things working together that cause significant impact. Google’s Gary Illyes once explained that Google’s core ranking algorithm is comprised of “millions of baby algorithms working together to output a score”. That’s why I recommend using a “kitchen sink” approach to remediation where you surface all potential problems impacting a site and fix them all (or as many as you can). The core point, pun intended, is to significantly improve the site overall.

It’s also important to understand that Google wants to see significant improvement over the long-term. Google’s John Mueller has explained this many times over the years, and it’s what I have seen while helping many companies deal with broad core updates. And the case studies below support that point as well.

In the case studies below, I’ll cover each site’s background leading up to the December broad core update, some of the important issues I surfaced while helping those companies, how those issues were addressed (or not), and what the outcome was when the December update rolled out. Again, my hope is that these cases will help site owners understand more about broad core updates, how to deal with negative impact, and understand various factors that could be leading to negative impact. Let’s jump in.

Below, I have provided a table of contents in case you want to jump to a specific part of this post:

  • Case Study 1: Niche News Publisher – A Double Core Update Hit, But E-A-T Galore
  • Case Study 2: Crushed By The December Update. It’s Like Google Doesn’t Even Know You Anymore
  • Case Study 3: Overcoming “The Creep”
  • Tips and Recommendations for Site Owners

Case Study 1: Niche News Publisher – A Double Core Update Hit, But E-A-T Galore
The first case I’m going to cover is a super interesting one (and I felt especially connected to it since I loved the site even before the site owner reached out to me!). It’s a news publisher focused on a very specific niche, and it reached out to me after getting hit hard by the January 2020 core update. Again, I knew the site right away (since I had been reading it for years) and I was initially extremely surprised to hear the site had been hit hard.

Note, they contacted me in March of 2020, so months after the January update rolled out. Waiting to improve a site is a big risk site owners take and I’ll touch on that later in this case.

The site had lost 45% of its Google organic traffic overnight with the January 2020 broad core update:

I wish that was the only drop the site had to deal with in 2020, but it wasn’t. More about that soon.

Google News, Top Stories, and Discover:
Based on how Google was viewing the site, it rarely showed up in Top Stories, almost never ranked highly in Google News, and had no Discover traffic at all. And I mean none… the Discover reporting didn’t even show up in GSC. That means it had no impressions or clicks over time. And remember, it’s arguably the top publisher in its niche.

E-A-T Galore:
The site, including its core writers, had E-A-T galore. That was definitely not the problem, not even close. It’s a well-known, well-respected, leader in its niche. I would argue it’s the top news site in its niche and has broken some of the biggest stories in its category.

With regard to the A in E-A-T, which is authority, Google has explained that one of the best-known signals it uses is PageRank, or links from across the web. Google explained this in a whitepaper it released about fighting disinformation.

Also, Gary Illyes once explained that E-A-T was largely based on links and mentions from well-known sites. Note, I covered the A in E-A-T in my first post in this series (if you are interested in learning more about that topic). For this site, this was covered big-time. The site is absolutely an authority in its niche. The site has over 2M inbound links from over 26K referring sites. And a number of the sites linking to this news publisher are some of the most powerful and authoritative websites in the world.

So what was the problem? Why did the site take such a big hit? If you read part one of this series on the December 2020 broad core update, then you know that core updates are never about one thing. Google is evaluating many factors across a site and for an extended period of time. I always say there’s never one smoking gun, there’s typically a battery of them.

Therefore, I dug in the way I normally would in order to surface all potential issues that could be causing problems. That included auditing content quality, user experience, the advertising situation, technical SEO, and more. I clearly explained this process to my client and they were on board. They said they would clear a path, implement as many changes as possible, and as quickly as possible, in order to see recovery in the future.

Well, Hello May 2020 Broad Core Update, How Nice To Meet You…
It’s also worth noting that as I was auditing the site, and my client was implementing changes, the May 2020 core update rolled out AND THEY GOT HIT AGAIN. This can absolutely happen if a site hasn’t improved significantly (or if a site chooses to not improve at all). In addition, even if you are implementing changes, it’s important to understand that recent changes would not be reflected in the next broad core update. I covered that several times in my previous posts, including both the May 2020 core update and part one about the December 2020 broad core update.

With the May 2020 core update, this news publisher dropped by an additional 51%. That’s on top of dropping 45% with the January core update. Compounding the two drops, they were down roughly 73% from where they once were:

Needless to say, my client felt defeated. They couldn’t believe they dropped even more. They were now receiving a trickle of traffic compared to what they used to receive. I explained that they should continue to make changes, since not enough had been implemented yet, and not over a long enough period of time.

I basically felt like Mickey in Rocky pushing his fighter to keep driving forward:

Needless to say, my client was eager to keep driving forward, implement big changes, improve the site significantly, etc. But, it’s important to know that you can get hit by multiple core updates in a row if Google isn’t digging your site. I have covered that before in blog posts and in my presentations on broad core updates.

Below, I’ll cover several of the core items I surfaced during the audit that could have been contributing to the drops the site experienced during two broad core updates. I can’t cover everything that was addressed during the engagement, but I will provide a number of important findings. And keep in mind, there are millions of baby algorithms working together to output a score. It’s not about one of the following items, but more about improving all of them (including other items not covered below).

Lower-hanging fruit: Thin content, cruft:
With many news publishers, there’s a tendency for low-quality content to build up over time. They are typically larger, more complex sites and cruft can definitely build. And after years go by, some sites might have a lot of lower-quality content in pockets across the site.

Although this site produces some of the best content in its niche, I did surface a lot of cruft that had built up over time. I explained what “quality indexing” was to my client, which is making sure only high-quality content is indexed, while ensuring low-quality or thin content is not indexed. They understood the concept quickly and had no problem nuking thin and low-quality content from the site. Remember, Google is on record explaining that every page indexed is taken into account when evaluating quality.

Here is Google’s John Mueller explaining this:

Content quality (and strategy):
As I was analyzing content that dropped heavily during each update, I noticed something interesting. There were a number of round-up-like posts that were thin and somewhat misleading. So instead of creating one strong page that could be updated for the content at hand (which I can’t reveal), the site was publishing many thinner pages over time. This led to many pages on the site that were ranking and receiving traffic, but had no way to meet or exceed user expectations. This was a huge find in my opinion.

My client tackled this head-on and used the approach I recommended for creating one piece of content that can live over time and always contain the latest information, while archiving older pages when needed. This led to higher quality content that can meet or exceed user expectations, while also improving the “quality indexation” levels on the site overall. It was a win-win.

User Frustration and “Hell hath no fury like a user scorned”:
While continuing to analyze content that dropped on the site, I noticed some content that could easily confuse users (and it looked like it was confusing Google too). I can’t go too in-depth here, but let’s just say that users searching Google to accomplish a task were ending up on the site and on these pages. But the pages did not provide that functionality at all.

So users were visiting that page, then digging further into the site believing they could ultimately find what they were looking for. But they couldn’t find that information no matter where they turned, which led them back to the original landing page. And then using behavior flow in Google Analytics, I could see they were then trying other pages on the site in a never-ending journey. This is similar to the case study I wrote about the engagement trap and user frustration.

After explaining this to my client, and showing them behavior flow screenshots of what was happening, they nuked those pages. I also believe this was an important finding. Remember, “hell hath no fury like a user scorned”.

UX, Unavailable Videos, and Broken pages:
There are also times when news publishers push an article live and don’t revisit those pages (since they are working on the next big story). Unfortunately, those older pages can break for one reason or another. For example, I surfaced pages with missing images or videos that weren’t available anymore. And when the videos were core to the article, that made the pages useless. There is no way a page can meet or exceed user expectations when the core element doesn’t even load.

So in an effort to enhance the user experience across the site, my client tackled these issues page by page. And if a page couldn’t meet or exceed user expectations anymore based on query, then the page was nuked.

Here was a third-party video used to support an article, but it was no longer available (and was actually removed due to a copyright claim against whoever uploaded the video to YouTube):

Sponsored links: Handle with care:
I’ll keep this one quick. The site has sponsors that are prominently displayed on the site with sponsored links. There were two problems: first, it wasn’t clear they were sponsors, so a clear disclaimer was needed. Second, the links were followed. My client nofollowed all of those links and added rel="sponsored" as well. This was an easy fix, but the right one on several levels.
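Here is a minimal sketch (using the third-party requests and BeautifulSoup libraries) of the kind of spot check I mean: crawl a page and flag outbound links to known sponsor domains that are missing rel="sponsored" or rel="nofollow". The sponsor domains below are placeholders.

```python
# A minimal sketch that flags outbound sponsor links which are still followed
# (missing rel="sponsored" or rel="nofollow"). Requires requests and bs4.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

SPONSOR_DOMAINS = {"sponsor-one.example", "sponsor-two.example"}  # hypothetical

def find_followed_sponsor_links(page_url: str) -> list[str]:
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    flagged = []
    for anchor in soup.find_all("a", href=True):
        host = urlparse(anchor["href"]).netloc.lower()
        rel_values = {value.lower() for value in anchor.get("rel", [])}
        if host in SPONSOR_DOMAINS and not rel_values & {"sponsored", "nofollow"}:
            flagged.append(anchor["href"])
    return flagged

if __name__ == "__main__":
    print(find_followed_sponsor_links("https://example.com/some-article/"))
```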

Google canonicalizing pages to third-party sites:
There was also a canonicalization issue where the publisher had uploaded a number of pdfs from other sites to support content from specific articles. In Google Search Console (GSC), you could clearly see that Google was canonicalizing all of those pdfs to pages on other sites (knowing where the original pdfs originated).

Having a few here and there isn’t a big deal, but you have to wonder what type of signal it sends to Google when many are seen as duplicate (and Google is choosing pages from third-party sites as the canonical urls). My client determined the pdfs were not essential to the articles and nuked many of them from the site.

In addition, there were a number of pages that had canonical tags pointing at urls that didn’t resolve with 200 codes. Some were 404ing, others were redirecting, etc. Rel canonical is a strong signal to Google about which pages should be indexed and ranking, so I always recommend having accurate canonical tags.
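Here is a minimal sketch of that canonical check: pull rel=canonical from a page and verify the target resolves with a 200 rather than a 404 or a redirect. It uses requests and BeautifulSoup, and the example url is a placeholder.

```python
# A minimal sketch that extracts rel=canonical from a page and checks whether
# the canonical target resolves with a 200 (rather than a 404 or a redirect).
import requests
from bs4 import BeautifulSoup

def check_canonical(page_url: str) -> str:
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("link", rel="canonical")
    if link is None or not link.get("href"):
        return "no canonical tag found"
    target = link["href"]
    # allow_redirects=False so a redirecting canonical target is surfaced as such
    resp = requests.get(target, timeout=10, allow_redirects=False)
    if resp.status_code == 200:
        return f"ok: canonical {target} returns 200"
    return f"problem: canonical {target} returns {resp.status_code}"

if __name__ == "__main__":
    print(check_canonical("https://example.com/some-article/"))
```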

Here is an example of Google canonicalizing pdfs to urls on other sites:

Downstream Sites and Maintaining User Happiness (and Security):
For larger-scale publishers, it’s easy to not revisit older articles over time, including checking the outbound links on those pages. And over time, you could be driving users to very strange (and risky) sites. I’ve always recommended auditing outbound links to make sure you aren’t putting your users, and essentially Google’s users, in a tough situation.

Based on the niche, this news publisher had a number of links that led to sketchy websites, and I would say even risky websites. It’s hard to say if those domains changed hands at some point, or if something else was going on, but the outbound link situation wasn’t great. My client worked on making sure outbound links were accurate, helpful, and didn’t put users at risk.

Chrome was even showing a safe browsing warning when visiting some of the downstream sites:

Search URLs and playing hide and seek with robots.txt:
When I began auditing the site, I quickly noticed that many internal search urls were showing up in the crawl data, in GSC, etc. Some were indexed, some were not, and the site’s robots.txt was supposed to be blocking them (or so I thought anyway). It turns out there were many internal search urls triggered from outside the site that didn’t return any content. I saw thousands of these urls in the reporting.

But… the directory holding internal search urls was blocked by robots.txt. So what was going on?? Well, I wrote a post last year that covered how robots.txt files are handled per protocol and subdomain, and that’s exactly what was going on here. Basically, you can have multiple robots.txt files active at one time, and the directives in each file can conflict. For this site, there were two robots.txt files for the same subdomain (one per protocol), and one blocked internal search while the other didn’t. My client cleaned this up pretty quickly.
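If you want to surface this type of conflict yourself, here is a minimal sketch using Python's built-in robotparser: fetch the robots.txt for each protocol/host variation and check whether a sample url is allowed under each. The hosts and test path below are placeholders.

```python
# A minimal sketch that compares robots.txt directives across protocol/host
# variations, which can surface conflicting rules like the internal search
# issue described above. Hosts and the test path are placeholders.
from urllib.robotparser import RobotFileParser

VARIATIONS = [
    "http://example.com",
    "https://example.com",
    "https://www.example.com",
]
TEST_PATH = "/search?q=test"  # hypothetical internal search url

for base in VARIATIONS:
    parser = RobotFileParser()
    parser.set_url(f"{base}/robots.txt")
    try:
        parser.read()
        allowed = parser.can_fetch("Googlebot", f"{base}{TEST_PATH}")
        print(f"{base}/robots.txt -> Googlebot allowed to fetch {TEST_PATH}: {allowed}")
    except OSError as exc:
        print(f"{base}/robots.txt -> could not fetch ({exc})")
```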

Beyond what I covered above, there were a number of other findings surfaced during the audit. I can’t cover everything here or the post would be huge. But just understand that they tackled many different problems across the site. Some major, some minor, but all with the goal in mind of improving the site significantly over the long-term.

Liftoff in December: 149% Surge
Months went by since the last broad core update in May and my client would email me periodically to see if I had any idea when the next update was coming. I explained there’s no set timeframe for broad core updates and that we would hopefully see one soon. I told them to keep driving forward.

And then nearly seven months after the May update, the December 2020 broad core update rolled out. And it was a very, very good one for this news publisher. They surged almost immediately, and it was hockey stick growth. The site surged 149% with the update:

But that doesn’t even do the surge justice. Remember, they had almost no rankings or traffic from Google News, Top Stories, or Discover. After the update, they started ranking more prominently in Top Stories and in Google News. It was awesome to see. Here is the news tab in GSC showing the growth there:

Regarding Discover, that is still a mystery to me. They still have no Discover traffic at all. I believe it could be due to the niche they cover and not necessarily the site, but it’s hard to say for sure. So, this would be the one area that didn’t surge with the update.

Case Study 2: Crushed By The December Update: It’s Like Google Doesn’t Even Know You Anymore
The next case study I’ll cover is a site that has seen its fair share of movement over the years (dating back to medieval Panda). It has been doing well recently, and even saw a nice surge during the May 2020 core update. But there was a growing, even insidious, problem that could have heavily contributed to the drop the site experienced with the December broad core update.

I can’t go into too much detail about the niche, but it’s an affiliate site with several types of content (each targeting a different user intent). The December update was not a good one for this site… It lost 57% of its Google organic traffic overnight starting on 12/3/20, and Google represents a majority of overall traffic. Needless to say, this is extremely problematic for the business.

I know the CEO of the company and he reached out quickly after the December update started rolling out. The drop, as you can see above, was dramatic. I had a call with the CEO, and as I learned more about their situation and setup, my “broad core update antenna” went up. I knew the site pretty well, and specific points he was making about what they had been doing recently had me concerned. I’ll cover what I can below. My hope is that they will be making big changes moving forward.

Swerving Out Of Your Lane: Major injection of thin content (and content not targeting their core competency):
I’ve mentioned the importance of “staying in your lane” several times over the years after seeing companies expand beyond their core competency and publish content that they either didn’t have the expertise to write about, or content that could confuse Google topicality-wise. For this site, the latter was my concern.

And in addition to potentially confusing Google’s algorithms about what the site’s core competency was, a lot of the new content was thin and lower quality. This is exactly the type of content that sites should avoid having indexed. I’ve covered “quality indexing” many times in the past, and this is exactly the type of content I would recommend keeping out of Google’s index. Unfortunately for this site, thousands of pages were added over time and Google was actually ranking some of them well. That’s a dangerous situation and could end very badly SEO-wise.

Before they knew it, a big shift had happened percentage-wise for the site from an indexing standpoint. Now there was more low-quality and thin content on the site than core content. And that core content is what got them to where they were SEO-wise (performing well). And in my opinion, Google is clearly not digging that shift. Again, the site dropped by 57% since the December broad core update rolled out.

To add insult to injury, many of those thin pages contained content that expired, yet the pages remained on the site and indexable. Google was seeing some of those pages as soft 404s, while others remained in the index. So, if users were searching Google and finding those pages, the content had no chance of meeting or exceeding user expectations. Again, this was an insidious problem in my opinion. And those thin pages were causing multiple problems from an SEO standpoint.

Here was a quick site query showing close to 1,500 urls indexed that held expired content:

Beyond those findings, there were also a number of other issues that I surfaced which should be fixed. Those included the use of popups on some versions of the pages (including on mobile), more thin content across other page types, urls categorized as “crawled, not indexed” that should be handled properly (which spanned several page types), and more.

I’m still in the middle of analyzing the site, so there will be more findings. But the core point with this case is to always maintain strong quality indexing levels, make sure you don’t tip the scales with lower quality content, always make sure you can meet or exceed user expectations, and make sure technical SEO problems don’t end up causing quality problems. I hope to report back about this site after subsequent broad core updates (based on the changes the site owner decides to implement).

Case Study 3: Overcoming “The Creep”
A large-scale site in a tough niche roars back with the December update:
I can’t tell you how happy I am to be covering this recovery. This was actually one of the case studies in my post about the May 2020 core update, but as a big drop! As I mentioned in that post, many problems had crept back into the site over time. They had surged with previous major algorithm updates only to get hammered by the May 2020 core update. They dropped 41% overnight in early May.

Revisiting Core Problems: Bombarding Users With Ads
Leading up to the May broad core update, the ad situation had gotten much, much worse over time. It was extremely aggressive, disruptive, and even deceptive in some cases. This was not a new problem for the site since they had tackled aggressive ads previously (years ago). But like I said earlier, some bad problems crept back in over time.

For some pages, there were 15-20 ads per page, organized in a way that was extremely disruptive to the user experience. After sending my findings through (including many screenshots of what I was seeing), the site owner moved quickly to tone down the ad situation. That resulted in a 50% reduction in ads per page in some cases.

Needless to say, this improved the user experience greatly on the site. And those aggressive ads were impacting performance as you can see in the Core Web Vitals reporting in GSC. CLS scores were not good. The number of urls with CLS problems is now down to 0 post-update (after the site owners worked on toning down the ad situation and improving performance overall). They also moved to AWS after the May update to improve performance for the site overall.
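As a side note, the field CLS data behind the Core Web Vitals reporting can also be pulled programmatically from the Chrome UX Report (CrUX) API. Here is a minimal sketch, assuming you have a Google API key (the key and origin below are placeholders):

```python
# A minimal sketch querying the Chrome UX Report (CrUX) API for field CLS data.
# Requires a Google API key (API_KEY is a placeholder) and the requests library.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

payload = {
    "origin": "https://example.com",        # placeholder origin
    "formFactor": "PHONE",
    "metrics": ["cumulative_layout_shift"],
}
response = requests.post(ENDPOINT, json=payload, timeout=10)
response.raise_for_status()
cls_metric = response.json()["record"]["metrics"]["cumulative_layout_shift"]
# p75 is the value used to assess the CLS threshold (0.1 for "good")
print("CLS p75:", cls_metric["percentiles"]["p75"])
```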

Large and Complex Site With Pockets of Thin Content:
In addition, there was also a thin content problem (which is common in this niche). This is also something that has been brought up in the past for this site. Once the pockets of thin content were uncovered, the site owner moved quickly to noindex or 404 as much of that content as they could (depending on the page type). This led to an increase in “quality indexing”, which is making sure only high-quality pages are indexed, while ensuring low-quality or thin content is not indexed. This is something I have written about many times and it can rear its ugly head easily on larger-scale and complex sites.

Here are some screenshots of thin content I was surfacing. Also, notice the ads + thin content in the second screenshot. That’s a lethal combination in my opinion:

The Power of Providing Unique Content For Sites With The Same, Or Very Similar Content:
In my post about the December 2020 core update, I mentioned how sites with the same, or very similar content, need to differentiate their sites as much as possible. John Mueller has covered this several times in webmaster hangout videos and I have helped a number of companies deal with this situation.

For this site, I recommended adding some type of unique and valuable content beyond what every other site was providing. Once I ran through the reasons why they should consider this, along with sending John’s video covering the issue, the site owner brainstormed a few ideas and came up with a strong solution. I can’t explain what they did specifically, but they were able to add unique content in a prominent place on core pages that was helpful for users.

This didn’t impact a large percentage of pages on the site, but it did impact some of the most popular pages. I’m hoping they roll this out to more pages as time goes on.

Here is Google’s John Mueller explaining the importance of differentiating your content as much as you can:

Technical Problems Causing Quality Problems: Mobile and Desktop Parity Issues
When digging into the site last May, I started noticing a problem with the mobile setup. This site still runs an m-dot subdomain holding the mobile pages, and I started checking for parity across mobile and desktop. What I found was alarming across some urls.

When switching to mobile, the urls were redirecting to the desktop version, but to a different url! This wasn’t site-wide and was just impacting certain urls, but needless to say, this was extremely problematic. Also, Google still hasn’t moved the site to mobile-first indexing based on the m-dot setup and the parity problems I discussed in my May 2020 case study post. For example, the pages lacked parity from a content standpoint, links standpoint, and more. So, Google’s systems are clearly having issues with moving the site to mobile-first indexing. And to clarify, there’s no ranking benefit to being moved to mobile-first indexing, but not being moved underscores the parity problems I just covered.

Since the desktop content is what’s indexed for this site, it’s hard to say how the weird redirect problem was impacting the site SEO-wise, but it’s clear this was impacting users. Once I communicated the problem to the site owner, they tackled the strange redirect problem quickly. But who knows how long that problem was in place, how many users that impacted, etc.
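Here is a minimal sketch of the kind of spot check that surfaces this type of redirect problem: request the same url with a desktop and a smartphone user agent and compare where you end up. The user-agent strings and example url are placeholders, and a full parity audit would also compare rendered content, internal links, and structured data.

```python
# A minimal sketch for spot-checking mobile/desktop parity on an m-dot setup:
# it requests a url with a desktop and a smartphone user agent, then compares
# the final url after redirects and a rough content size.
import requests

DESKTOP_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
MOBILE_UA = "Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 Mobile"

def compare_parity(url: str) -> None:
    for label, user_agent in [("desktop", DESKTOP_UA), ("mobile", MOBILE_UA)]:
        resp = requests.get(url, headers={"User-Agent": user_agent}, timeout=10, allow_redirects=True)
        print(f"{label}: status={resp.status_code} final_url={resp.url} bytes={len(resp.content)}")

if __name__ == "__main__":
    compare_parity("https://www.example.com/some-page/")
```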

Beyond what I covered above, there were a number of other problems riddling the site (across several categories of issues). There were more quality problems, UX issues, technical problems causing quality problems, and more. Again, it’s a large-scale and complex site with many moving parts (and it’s in a tough niche). The site owner moved to make as many changes as possible.

And by the way, they still have a huge change in the works. As I mentioned earlier, they run an m-dot now (which is not optimal), but they have a responsive version of the site ready to launch soon! So theoretically, things will get even better over time usability-wise, site maintenance-wise, etc. Needless to say, I’m eager to see that go live.

Happy Holidays! The December Update Comes Roaring Through…
Since you can’t (typically) see a recovery in between broad core updates, the site owner fixed as much as they could and kept driving forward. The site owner was fully aware that they probably needed to wait for the next broad core update to see significant movement in the right direction and hoped for the best.

Well, the December update didn’t disappoint!

Within 24 hours of the rollout, the site absolutely started to surge. The site is up 40% since the December update. It’s been amazing to watch and it underscores how a “kitchen sink” approach to remediation worked well for this client.

Also, they are moving to a responsive design soon, which will alleviate parity problems across mobile and desktop and hopefully get them moved to mobile-first indexing soon. Again, there’s no ranking benefit to getting moved to mobile-first indexing, but it decreases the number of issues that can arise based on maintaining two pages for every one on the site. Also, mobile-first indexing will be enabled for all sites in March of 2021, so I’d love to see them with a solid setup and possibly moved before that date.

Moving Forward: Another List of Tips and Recommendations for Site Owners
As you can see via the case studies I covered, broad core updates are complex and nuanced. Google is evaluating many factors across a site and over the long-term. When a site is negatively impacted by a broad core update, there’s never going to be one thing a site can surface that’s causing the drop. Instead, it’s often a combination of issues that is dragging the site down.

Below, I’ll cover some final bullets for site owners that are dealing with negative impact from broad core updates.

  • Relevancy adjustment? – First understand whether or not there was a major relevancy adjustment, which could be correct. For example, was your site ranking for queries it had no right ranking for? There’s not much you can do about that… But if it’s not a relevancy adjustment, then dig into your site to surface all potential issues that could be causing problems.
  • The power of user studies – To understand how real users feel about your site, content, expertise, etc., run a user study through the lens of broad core updates. I’ve written a case study about doing this that you can review, Google recommends running user studies for this purpose, and it makes sense on multiple levels. I can’t tell you how powerful it can be to receive objective feedback from real users.
  • Never one smoking gun – Avoid looking for one smoking gun. You won’t find it. Instead, objectively analyze your site and surface all potential problems that could be causing issues. Then form a plan of attack for fixing them all (or as many as you can).
  • Recovery – Understand you (typically) cannot recover until another broad core update rolls out. I have covered this in previous posts and in my SMX Virtual presentation about broad core updates. So, don’t roll out the right changes only to roll them back after a few weeks. That’s not how this works. You will end up spinning your wheels.
  • Read the Quality Rater Guidelines (QRG) – There’s a boatload of amazing information directly from Google about what they deem high versus low quality. The document is 175 pages of SEO gold.
  • Recovery Without Remediation – Understand that some sites recover without performing any remediation. That can definitely happen since there are millions of baby algorithms working together to output a score. In my opinion, if you don’t significantly improve your site over the long-term, you leave your site susceptible to seeing negative impact down the line. That’s why you can see the yo-yo effect that some sites experience (as they sit in the gray area of Google’s algorithms). That’s a maddening place to live for site owners. I would try and improve as much as possible.

Summary – Google’s Broad Core Updates Are Complex and Nuanced
I hope you found the case studies I covered interesting. One thing is for sure, they underscore the complexity and nuance of Google’s broad core updates. If you have been negatively impacted by a broad core update, then I recommend reading several of my posts about core updates, forming a plan of attack for thoroughly analyzing your site through the lens of those updates, and then working to significantly improve your site over time. You can absolutely recover, but it takes time, hard work, and patience. Good luck.

GG


Read part one of this series on the December 2020 broad core update.

Filed Under: algorithm-updates, google, seo

Google’s December 2020 Broad Core Algorithm Update: Analysis, Observations, Tremors and Reversals, and More Key Points for Site Owners [Part 1 of 2]

December 20, 2020 By Glenn Gabe

Well, we waited nearly seven months, but it finally arrived (and right in the middle of the holiday shopping season). Google rolled out another broad core update on December 3, 2020, and as we expected, it was a huge update. The previous broad core update was May 4, 2020. I’ll explain more about why the time in between updates is important soon.

A Big Update Warrants A Two-Part Series:
I was going to write one post about the update, including the case studies, but as I was writing it I thought a two-part series would fit much better. In this first post I’ll cover a number of important topics regarding the December update and about broad core updates overall. For example, I’ll cover the rollout, timing, and early movement. I’ll also cover tremors, which yielded some reversals for certain websites. Then I’ll move into (more) important points for site owners to understand about broad core updates, including information from my SMX Virtual presentation which focused on broad core updates.

Then in part two of the series, I’ll cover three case studies that underscore the complexity and nuance of Google’s broad core updates. That’s in addition to the four I already covered in my post about the May 2020 broad core update. Each case is unique and provides an interesting view of how certain sites are impacted by these updates and how site owners responded. I’ll also end each post with tips and recommendations for site owners that have been impacted by the December update (or any broad core update for that matter).

Here is a quick table of contents if you want to jump around to a specific part of this post. I recommend reading it from start to finish to understand all of the context about broad core updates, but I understand that everyone is short on time.

The December Broad Core Update: Table of Contents

  • Rollout and Timing.
  • Google Goldilocks and the Holiday Shopping Season.
  • Tremors and Reversals.
  • Alternative Medicine Impact.
  • e-Commerce Movement.
  • Online Games, Lyrics, Coupons and other sites with the same or very similar content.
  • (More) important points for site owners to understand about core updates.
  • “Hell hath no fury like a user scorned” and Aggressive Ads.
  • Yo-Yo Trending and The Gray Area of Google’s Algorithms.
  • Technical SEO problems causing quality problems.
  • Discover and Top Stories Visibility.
  • Content is King.
  • The A in E-A-T.
  • Site Issues and “The Usual Suspects”.
  • BERT and Broad Core Updates.
  • Machine Learning and “It Depends”.
  • Tips and Recommendations.

Rollout and Timing:
Danny Sullivan announced the broad core update on December 3, 2020 and linked to Google’s own blog post about core updates for reference. Soon after, Danny announced the update began rolling out at 1PM ET on 12/3 and that the update could take a week or two to fully roll out.

Later today, we are releasing a broad core algorithm update, as we do several times per year. It is called the December 2020 Core Update. Our guidance about such updates remains as we’ve covered before. Please see this blog post for more about that: https://t.co/e5ZQUAlt0G

— Google SearchLiaison (@searchliaison) December 3, 2020

And then we finally heard from Danny again that the rollout was complete on 12/16/20. So that’s about two weeks to fully roll out, which made sense.

The December 2020 Core Update rollout is complete.

— Google SearchLiaison (@searchliaison) December 16, 2020

Google Goldilocks & The Holiday Shopping Season – Not Too Early, Not Too Late, Just Right For Most Site Owners
The timing of this update was a bit controversial. Some thought that with the pandemic surging and the holiday shopping season upon us, Google should not roll out such a big update. I get that, but there were also many site owners that have been waiting since the May broad core update to see recovery. Heck, there were some that had been hit by the January core update, didn’t recover in May, and were waiting for this update. So although I see both sides, I’m in the camp that the timing was fair.

In my SMX Virtual presentation about broad core updates from 12/8 (which was crazy timing since it was literally right after the December update rolled out), I explained that Thanksgiving, Black Friday, and Cyber Monday had passed already. So, sites that might be negatively impacted by the December core update could have benefited from those big shopping days even if they were going to drop. And for those waiting for recovery, they missed those important shopping days since they were down, but could potentially recover for the remaining holiday shopping season. Needless to say, it was tricky for Google, but I believe the timing was fair.

John Mueller About The Timing and Passage-based Ranking
It’s also worth noting that Google’s John Mueller explained in a Search Central Hangout (at 5:43 in the video) that although he wasn’t involved in the decision about the timing, he thought it seemed fair. He also answered a question about passage-based ranking and if it rolled out with the December update. John explained that a change like passage-based ranking would typically not be bundled with something like a broad core update. He also didn’t believe it rolled out yet, although Google had explained it could roll out before the end of the year. So stay tuned about passage-based ranking. Here is my slide from SMX about this:

The December 2020 Core Update: Interesting Observations From The Front Lines
Google is looking at many factors with broad core updates, and over an extended period of time. They have explained this many times and I have shared this often on Twitter, in presentations, in my blog posts about core updates, etc. For example, Google’s Paul Haahr, a lead ranking engineer at Google, explained at the webmaster conference in Mt. View that they complete an extraordinary amount of evaluation in between broad core updates. Actually, they do so much evaluation in between that it can be a bottleneck to rolling them out more often. And check the last bullet point about decoupling algorithms and running them separately. Yep, welcome to SEO. :)

Google is trying to understand a site overall, and across many factors. This is why it’s very hard to identify specific things that changed with broad core updates. And it’s also why it’s hard to know exactly what’s going on with a certain website unless you work on it and understand the full history of the site, the problems it had over time, the improvements that have been implemented, etc.

In my post about the May core update, I covered how Google’s Gary Illyes explained that Google’s core ranking algorithm is made up of “millions of baby algorithms working together to output a score”. That’s super important to understand and it’s why you cannot focus on just one or two things when analyzing and improving your site. It’s much broader than that, pun intended. My slide from SMX Virtual underscored this point:

This is also why I provided four case studies in my post about the May 2020 core update, which underscored the complexity and nuance involved with broad core updates. On that note, part two of this series covers three more cases based on the December update.

I just wanted to bring up the complexity of broad core updates so you don’t focus too narrowly when reviewing your site or other sites that were impacted.

Before we move forward, a quick disclaimer:
I do not work for Google. I do not have access to Google’s core ranking algorithm. I did not dress up as a Fedex employee and try to force my way into Gary Illyes’ home to commandeer his laptop holding all of Google’s secrets. Broad core updates are very complex and it’s hard to write a post that covers all of the things I’m seeing, the complexity of the updates, and the nuance involved. With that out of the way, let’s get started.

Quick movement, very quick, almost too quick:
Once a broad core update rolls out, it’s typically a few days before we see a lot of movement across sites. But with the December update, we saw movement very quickly (within 24 hours). A number of people (myself included) were sharing hourly trending from Google Analytics of sites beginning to see impact from the update. For example:

Many sites that were impacted saw a ton of movement (either surging or dropping) right after the update on 12/4 and 12/5. Then it calmed down a bit. It was almost too calm after that first wave of volatility. But then 12/9 came around. I’ll cover that volatility in the next section.

Here are two examples of surges and drops after the update rolled out. Some were dramatic and I’ll be covering more about this in the case studies in part two of the series. Both examples below are very interesting cases by the way.

Tremors and Reversals:
Again, I kept thinking to myself that this was quick… almost too quick, which also reminded me of an important point that John Mueller confirmed years ago about major algorithm updates. After medieval Panda updates rolled out, I would sometimes see strange fluctuations after the rollout. For example, distinct and large changes in rankings several days after the rollout (sometimes even reversing course).

When I asked John about this, he explained that Google can implement smaller tweaks after major algorithm updates roll out based on what they were seeing. That made complete sense and I called them “Panda tremors” at the time. I’m sure the same approach applies with broad core updates and I believe we actually saw that with the December 2020 update. Starting on 12/9, we saw a lot of additional volatility, with many sites seeing more movement in the same direction. For example:

And here is John explaining more about what I call “tremors”:

But, we also saw some sites reverse course. And some completely reverse course. It was wild to see. Once I shared what I was seeing, I had a number of companies reach out to me explaining that was happening with their own sites! And several were sending screenshots from Google Analytics and Google Search Console (GSC) showing the reversals.

As you can imagine, this was incredibly disappointing for those site owners (and tough for them to experience). Imagine thinking you were surging with the update, only to drop back down to where you were before the update (and some dropped even further!) Here are some screenshots of the reversals from GSC and Google Analytics:

And here is what search visibility looks like for one of those sites. Insane:

Major Affiliate Impact with Reversals:
It’s worth noting that a number of affiliate sites saw massive volatility with the December update, and even more than usual (in my opinion). And some were impacted by the reversals I just covered. And then within affiliate marketing, those focused on health and medical saw a ton of volatility.

On that note, I mentioned an important point in my SMX Virtual slides about affiliate marketing and health/medical sites. I explained that YMYL (Your Money or Your Life) content is held to a higher standard. And when you mix commerce or affiliate marketing with health and medical, there’s a fine line between educating and selling. I saw a number of sites in this situation get absolutely smoked by the December 2020 core update. Beware.

Some Alternative Medicine Sites See Improvement, But Still Down Big-time From Previous Levels:
There were some alternative medicine sites that saw very nice increases with the December core update. If you remember, many of these sites have seen significant drops in the past, especially starting with the Medic Update in August of 2018. That was interesting to see, but you must zoom out to see how far they have come back (or not). Although some did see nice increases during this update, they are still way below where they were previously. Just an interesting side note.

For example, here’s a nice bump in visibility during the December update, but it pales in comparison to where the site once was.

A home remedy sub-niche as a microcosm of alt medicine volatility:
I also surfaced a great example that demonstrates the insane volatility a specific niche can see with broad core updates. Within alternative medicine, there is a home remedy sub-niche which many sites focus on. This space saw an incredible amount of volatility, with many sites dropping off a cliff. As with many home remedies, there are some claims being made that aren’t backed by science and can be dangerous to follow. Remember, YMYL content is held to a higher standard.

Here are some examples of the volatility in that niche:

Some Major e-Commerce Players Surge and Drop:
Remember I mentioned the timing of this update earlier? Well, some major e-commerce sites did not fare well during the update. The only good thing for them is that at least it wasn’t right before Thanksgiving week with Black Friday and Cyber Monday approaching. Although they dropped, other e-commerce sites waiting to recover were able to gain visibility during the holiday season. Again, I think this was fair. Some sites were waiting for nearly seven months to recover.

Here are two big players in e-commerce with very different outcomes based on the December update:

Online Games, Lyrics, and Coupons With Major Volatility (and a warning to sites with the same, or very similar content to other sites):
I wanted to mention the online games niche for a minute (and other categories like it). It’s a tough area since many of the sites contain the same or very similar content. I also covered this situation in my SMX Virtual presentation.

Google’s John Mueller has explained that if you provide the same content, or very similar content, to many other sites on the web, then it’s hard for Google’s algorithms to determine which site should rank. And that can lead to a lot of volatility over time. It’s super important to differentiate your site as much as you can.

Below is my tweet linking to a video of John Mueller explaining this. I have also embedded the video (which starts at 39:32).

Run a site that has the same exact content as other sites (like ringtones)? Via @johnmu: It's tricky b/c our algos look for unique, compelling, & high-quality sites. G's algos might not believe it's an important site that needs to be highlighted in Search: https://t.co/rP4PHLc9fH pic.twitter.com/ErQ9bnMUnG

— Glenn Gabe (@glenngabe) April 13, 2020

Well, the online games niche saw some crazy movement with the December update. If your site is in this situation (having the same or very similar content as many other sites), then I highly recommend trying to differentiate your site as much as possible, provide some type of value-add for users, etc. If not, you can end up with visibility trending similar to the examples below. Unfortunately, the margins between sites will be razor-thin (scoring-wise and rankings-wise). Beware.

News Publisher Volatility:
There are always many moving parts with large-scale news sites: millions of pages indexed (and tens of millions when you take each site’s crawlable footprint into account), advertising setups that can be aggressive and disruptive, and a constant balancing act between information, UX, and monetization. I’ve helped many news publishers over the years and there are typically many things I surface during those audits to improve.

Here is some volatility from the space (and this is before the Page Experience Signal rolls out in May 2021). I bring that up since news sites often have many issues based on the factors and sub-factors involved with the Page Experience Signal (so it should be interesting to see how news publishers are impacted once it rolls out). I also covered the sub-signals in a Web Story earlier this year if you want to learn more about those.

More Key Points About Broad Core Updates & My SMX Virtual Presentation:
In my post about the May 2020 broad core update, I included several important points for site owners to understand about Google’s broad core updates. Those points seemed to resonate with site owners, since the topic is extremely nuanced and many end up confused about how these updates roll out, what Google is evaluating, and when you can see recovery.

I included those in my SMX Virtual presentation as well (presented on 12/8), but I also added more items to the list. I’ll include those additional points below. I definitely recommend reading my post about the May broad core update so you can understand all of the key points about these updates.

First, the points I covered in my May post include:

  • There’s Never One Smoking Gun – When reviewing a site that’s been heavily impacted by a broad core update, there’s never one smoking gun, there’s typically a battery of them.
  • Relevancy Adjustments – Google can implement relevancy adjustments during broad core updates, which aren’t necessarily a bad thing for your site. Dig into the drop and objectively figure out if there were major relevancy adjustments (like your site ranking for queries it had no right ranking for), or if there are deeper issues at play.
  • “Millions of baby algorithms…” – Google’s Gary Illyes explained at Pubcon in 2019 that Google’s core ranking algorithm is comprised of “millions of baby algorithms working together to output a score”. I love that quote and it’s why you can never pinpoint one issue that’s riddling a site from a broad core update standpoint.
  • Recovery – You typically cannot recover until another broad core update rolls out. This is super important to understand. If your site has been heavily impacted by a broad core update, you will need to significantly improve the site over the long-term. That’s what Google is looking for and you (typically) can only see that reflected during subsequent broad core updates.
  • Recent Changes Not Reflected – Recent changes will not be reflected in broad core updates. Google’s John Mueller has explained this in the past, and even again in a recent Search Central Hangout. For example, changes you implement 2-3 weeks before a broad core update will typically not be reflected in that update. Google is evaluating many factors over a long period of time with broad core updates. It’s not about that recent tweak or change you made.

And now based on my SMX Virtual presentation, I’m including some additional important points that site owners should understand about broad core updates. Again, these are foundational points that you should understand before tackling remediation:

Aggressive Ads and “Hell hath no fury like a user scorned”:
I continue to see terrible user experiences across many sites heavily impacted by broad core updates. Aggressive, disruptive, and deceptive ads yield a terrible UX for many people. Don’t do this. You can pay a heavy price. Google has even mentioned aggressive ads in its own blog post about broad core updates. Always remember, and respect, your users.

Below is a slide from my SMX presentation where I cover a common pitfall I see with sites impacted heavily by broad core updates. This was about one specific site, but I see this often.

The Gray Area of Google’s Algorithms (and Yo-Yo Trending):
For sites surfing the gray area of Google’s algorithms, it’s easy to continue to surge and drop during subsequent broad core updates. For example, dropping in January, surging in May, only to drop again in December (or vice versa). That’s why it’s super important to significantly improve your site overall over the long-term. You want to clearly get out of the gray area so you can limit volatility down the line. There are many examples of sites that continue to see this type of movement because they either haven’t improved significantly or have injected more problems into their sites.

And on the flip side, there were definitely sites that did nothing to improve and surged. But in my opinion, unless they get out of the gray area, they could very well see drops again. I’ve covered the gray area heavily in the past and it’s a maddening place to live for site owners. I will also cover a very special case study in part two of this series that fell into this category.

Beware Technical SEO Problems Causing Quality Problems:
I picked up several examples of this happening with the December broad core update, and it can be sinister, since it can sit below the surface without site owners realizing it. I actually covered this in my SMX Virtual presentation as well, and I received several questions in the Q&A asking me to clarify what I’m referring to.

To clarify, I’m NOT talking about basic technical SEO problems. I’m referring to technical SEO problems that cause quality issues. Google is on record explaining that every page indexed is taken into account when evaluating quality. So, if you have pockets of pages that get published due to technical SEO problems, and those pages are low quality and/or thin, and they get indexed, then that’s what I’m referring to.

For example, stripping noindex out of thousands of low-quality pages by accident, having major canonical problems that lead Google to index many additional pages that shouldn’t be indexed, mistakenly using parameters or session IDs that cause Google to find many additional low-quality urls, and so on. This is why it’s incredibly important to thoroughly analyze your site through the lens of broad core updates. If you miss those underlying problems, you can spin your wheels having no idea what’s going on.
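To make that concrete, here is a minimal sketch of the two tags involved in the first two scenarios (the example.com url is just a placeholder). If a template change accidentally strips the robots meta tag, or the canonical tag starts echoing parameterized urls, Google can suddenly index many thin or duplicate pages:

<!-- Keeps low-quality or thin pages out of the index. If this tag is
     accidentally stripped during a template change, those pages can get indexed. -->
<meta name="robots" content="noindex, follow">

<!-- Points parameterized or session-ID urls at the clean version of the page.
     If the canonical mistakenly reflects the parameters, Google can index
     many near-duplicate urls. -->
<link rel="canonical" href="https://www.example.com/category/product-name/">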

It’s also worth noting that Google’s John Mueller presented at SMX Virtual and gave some tips for 2021. In those tips, he explained that in 2021 and beyond, sites that are “technically better” have an advantage. Sometimes that’s a small advantage, but it can be bigger depending on the niche. He said it’s good to get that advantage. Remember, content is king, but strive for strong technical SEO. Here’s a tweet I shared when covering John’s presentation from SMX Virtual:

Sites that are "technically better" (I'm assuming technical SEO here) have an advantage. Sometimes that's a small advantage, but can be bigger depending on the niche. It's good to get that advantage. Remember, content is king, but strive for strong technical SEO. pic.twitter.com/bg3M4Vuc9y

— Glenn Gabe (@glenngabe) December 8, 2020

Google Discover and Top Stories Visibility Can Be Impacted:
I’ve covered this before and it’s important to understand that both Discover and Top Stories visibility can be impacted by broad core updates. So if Google’s Discover feed or Top Stories feature are important for your business, then you should take a hard look at your visibility after a broad core update rolls out.

For example, here is some Discover traffic impacted after a broad core update. Notice the distinct upticks in clicks and impressions right after the update rolls out:

And here is Top Stories visibility for a site after the December update. Note, they were never in Top Stories before. They ranked pretty well in Google News, but had struggled to break into Top Stories. This was a great sign for the company, which has been working hard to improve (after getting hit in 2019 by a broad core update, recovering during a subsequent update, and continuing to improve since then). It’s not a lot of visibility, but again, this was a first for them:

Content is King: Extremely Relevant Content Can Rank Despite Other Problems
Google is always looking to surface the highest quality and most relevant content for each query. Google has reiterated that site owners should focus on building the best content based on what users are searching for. Don’t overlook this point.

Great content can still win despite the site having other major issues. This is why you can see some sites with many issues still ranking and not being impacted heavily by core updates. There are many factors being evaluated and if a site produces the most relevant content for users, Google can still end up ranking that site highly across queries. You can also see this with sites that have a ton of authority. They can rank despite having other issues.

This is also why it’s important to not blindly follow other sites that seem to be doing well. You might follow them right off a cliff. Your site is different than theirs. You might not get away with what they get away with. And if you don’t, you could end up dropping heavily when a broad core update rolls out.

The A in E-A-T and the Power of the Right Links:
When analyzing specific drops and gains in a vertical, it’s sometimes apparent that certain sites have an enormous amount of authority (and other less authoritative sites have trouble competing against them). For example, Google has explained that the A in E-A-T is heavily influenced by PageRank (or links from around the web). They explained this in a whitepaper on fighting disinformation that was published in 2019 (screenshots are below).

Also, Google’s Gary Illyes once explained at Pubcon that E-A-T was largely based on links and mentions from well-known sites. So when Google is trying to understand how authoritative a site is, having the right links matters. It’s not about quantity, it’s about quality. And that’s why you can’t easily fix this situation if you currently don’t have a lot of authority…

For example, you can’t just go out and get links and mentions from powerful sites across the web (like CNN, The New York Times, or even smaller sites within a niche that have a lot of authority). You have to earn that naturally over time by doing the right things content-wise, promotion-wise, etc.

It’s worth noting I saw this in action with the December broad core update. I compared a site that had been hit hard with the sites that now rank in the top spots and it was very clear there was an authority difference there. One site had less than 1K links total and not many from extremely authoritative sites, while the others had millions of links, including many from some of the most authoritative sites on the web.

Below is a screenshot from Majestic’s Solo Links where you can compare the top links from each site. It’s never about one thing… but it can be hard to compete against sites with massive amounts of authority.

Below you can see some of the top domains that are linking to another site that is ranking well across many of the top queries, but not linking to the site that dropped. Again, it’s never about one thing, but authority matters:

The Usual Suspects:
I have covered this many times in my posts about core updates, but it’s worth mentioning here again. If you are looking for ways to improve your site, then it’s important to keep a look out for what I call “The Usual Suspects”. It’s a great movie, but it’s not so great for core updates. :)

Google is on record that it wants to see significant improvement in quality over the long-term before you can see gains during subsequent core updates. That’s why it’s important to surface all potential issues and address as many as you can. This ties into the “Kitchen Sink” approach to remediation, which I have covered in many of my posts about broad core updates.

For example, I would:

  • Hunt down all low-quality or thin content on the site and address that.
  • Rectify any user experience barriers on the site.
  • Make sure you don’t have an aggressive, disruptive, or deceptive advertising experience (I mentioned this earlier).
  • Review the site from an E-A-T perspective (expertise, authoritativeness, and trustworthiness). i.e. The site may lack E-A-T, which is a nuanced topic that confuses many site owners. Also, E-A-T is weighted more heavily for YMYL queries, so it’s super important to understand this if you focus on a YMYL topic.
  • Hunt down technical SEO problems that cause quality problems. I covered this earlier.
  • And again, understand there could be relevance adjustments, which might be correct. For example, maybe your site was ranking for queries it had no right ranking for. If Google pushes a relevancy adjustment impacting those, then that’s fine. There’s nothing for you to do there.

Are Broad Core Updates Related To BERT?
There has been some confusion in the industry about whether broad core updates were related to BERT. If you’re not familiar with BERT, it’s an AI natural language processing (NLP) algorithm that helps Google understand queries and content better. Google announced the rollout in October of 2019 and called it one of the biggest leaps forward in the history of Search. It is now used for nearly 100% of English queries conducted on Google.

Regarding broad core updates and how they are related to BERT, Google’s Danny Sullivan replied to Barry Schwartz on Twitter that the two are unrelated. So to be clear, broad core updates have nothing to do with BERT directly.

No, it's not.

— Danny Sullivan (@dannysullivan) December 22, 2020

This makes complete sense. Google is evaluating many factors with broad core updates, and over an extended period of time. Remember, there’s never one smoking gun. There’s typically a battery of them. And I have covered a lot in this post related to that statement! So yes, BERT is important, but it’s not related to broad core updates.

Machine Learning and “It Depends”:
And there’s one more topic I wanted to cover before wrapping up part one of this series. Both Google and Bing have explained recently that they are using machine learning in Search in various capacities. That’s where they identify the signals, identify the desired outcomes, and then let machine learning figure out the weighting of those signals. Yes, they let machine learning determine the weighting of those signals.

That’s extremely important to understand and it’s why you might hear Google and Bing representatives say “it depends” when answering a question about how important something is. They literally don’t know the weighting, and the weighting can actually change over time. You can listen to Bing’s Fabrice Canel explain this in the 302 of a Kind podcast with Marcus Tandler and Izzi Smith (at 35:02 in the video).

This is also why it’s important to simply improve your site overall. Don’t focus on one or two things. Significantly improve your site over the long-term. That’s what both Google and Bing want to see.

Coming Soon: Part 2 of the Series with 3 (More) Case Studies That Emphasize The Complexity of Broad Core Updates
As I explained earlier in the post, part two of this series covers three interesting case studies based on the December 2020 broad core update. My hope is that between parts one and two, and my post about the May core update, site owners can gain a strong understanding of what Google is doing with these core updates, as well as how certain sites tackled being impacted by previous updates.

Moving forward: Tips and recommendations for site owners impacted by broad core updates
If you have been impacted by the December broad core update, or a previous core update, then definitely read the following bullets containing some tips and recommendations. I know it can be frustrating to see a big drop in visibility and I hope you find these bullets helpful.

  • Improve your site overall, don’t cherry-pick changes. Google wants to see significant improvement in quality over the long-term.
  • Remember there are “millions of baby algorithms” working together. Don’t miss the forest for the trees. Objectively surface all potential issues that could be impacting the site.
  • Use a “kitchen sink” approach to remediation. Fix it all (or as much as you can).
  • Conduct a user-study through the lens of broad core updates. I can’t emphasize enough how powerful this can be. Read my case study and form a plan of attack. Hearing from real users can help you identify issues you could easily miss. You might be too close to your own site.
  • Read and internalize information about E-A-T. I mentioned in my SMX Virtual presentation that both Lily Ray and Marie Haynes have published some excellent information about E-A-T. I recommend going through those articles and presentations.
  • Read the Quality Rater Guidelines (QRG). Then read it again. It contains 175 pages of SEO Gold. It’s a document published by Google that explains what it considers high versus low quality, what raters should look at while evaluating sites, and much more. If you haven’t read it, I think you will find it enlightening. :)
  • If you are impacted by a broad core update, you will typically need to wait for another broad core update to see recovery. So don’t roll out changes for a few weeks and then roll them back (after not seeing recovery). That’s not how it works. Google is on record explaining you (typically) will not see recovery until another broad core update rolls out, and only if you have significantly improved the site.
  • And most importantly… DON’T GIVE UP. You can absolutely recover from these updates. It just takes a lot of work and time.

Part 2 Coming Soon With Three Case Studies:
And remember, part two of this series contains three case studies based on sites impacted by the December broad core update. Each case provides a unique view of how broad core updates can impact a site, how site owners responded, how those situations ended up, and more. You can subscribe to my RSS feed or follow me on Twitter to be notified about the next post.

GG

Back to Table of Contents

Filed Under: algorithm-updates, google, seo

Exit The Black Hole Of Web Story Tracking – How To Track User Progress In Web Stories Via Event Tracking In Google Analytics

November 2, 2020 By Glenn Gabe Leave a Comment

How to track user progress in Web Stories via event tracking in Google Analytics.

Google’s Web Stories, previously called AMP Stories, can provide an immersive AMP experience across both desktop and mobile. Google has been pushing them hard recently and stories can rank in Search, Google Images, and in Google Discover. On that front, Google recently rolled out a Web Story carousel in Discover, which can definitely attract a lot of eyeballs in the Discover feed. And those eyeballs can translate into a lot of traffic for publishers.

I’ve covered Web Stories heavily over the past year or so and I’ve written a blog post covering a number of tips for building your own stories. I have also developed several of my own stories covering Google’s Disqus indexing bug and the upcoming Page Experience Signal.

Building those stories by hand was a great way to learn the ins and outs of developing a story, understanding the functionality available to creators, the limitations of stories, and how to best go through the life cycle of developing a story. As I explained in my post covering various tips, it’s definitely a process. Planning, creativity, and some technical know-how go a long way in developing an engaging and powerful Web Story.

From a feedback perspective, analytics can help creators understand how well their story is being received, if users are engaged, and how far they are progressing through a story. Unfortunately, that has been challenging to understand and accomplish for many publishers starting off with Web Stories. And that situation has led me to research a better way to track stories via Google Analytics. That’s what I’ll be covering in this post. By the end, you’ll be tracking Web Stories in a more powerful and granular way. I think you’ll dig it.

Analytics for Web Stories – Confusing For Many Creators
From the start, it seemed like analytics took a back seat for stories. There wasn’t great documentation about how to add analytics tracking and the WordPress plugin originally didn’t even have the option for including tracking codes. That changed recently, which was great to see, but questions still remained about how to best track Web Stories. For example, can you use Google Tag Manager, can you add advanced tracking to understand more about how users are engaging with your story, can you track specific elements in your story, etc.?

Basic page-level tracking in Web Stories.

After looking at basic metrics for my stories in Google Analytics (yawn), I went on a mission to enhance my story tracking setup. Unfortunately, there’s still not a one-stop resource from Google for tracking Web Stories (hint-hint Paul Bakaus), but I was able to dig into various documents and articles and figure out a pretty cool solution that’s easy to set up. I’ll provide that setup below so you can start tracking your own stories in a more powerful and granular way.

Tracking User Progress Through A Web Story: A Simple Goal
If you just add a basic tracking code to your story, you will at least know how many people are viewing the story and gain basic metrics for the page (just like any other page in Google Analytics). But that doesn’t really do Web Stories justice…

Web Stories are a unique format that strings together multiple pages, which make up the larger story. Users can click back and forth to view each page within the story. You can also automatically advance the user to the next page after a specific amount of time. And once a user completes a story, they are presented with a “bookend”, which is a final page containing information selected by the creator.

With a basic tracking setup, Web Stories are like a black hole. People enter, and you have no idea what’s going on within the story. For example, how many pages have they viewed, how far are users progressing through the story, did they reach the bookend, how long did it take to get to the end, etc.?

Wouldn’t it be awesome to be able to track that information??

The good news is that you can, and it’s pretty easy to set up. Below, I’ll cover how to add the necessary tracking to your Web Stories so you can gain more information about how users are engaging with your stories. And beyond just setting up this level of tracking, I wanted to provide more information about how events and triggers work in stories so you can start testing your own advanced tracking setup. Let’s jump in.

Web Story Tracking: A Top-level View of What We Are Trying To Accomplish
Before I cover the tracking framework you can utilize today to better track your Web Stories, let’s cover the basic bullet points of what we are trying to achieve:

  • Track user progress through each Web Story you have published. i.e. Track each page within the story to understand how far users are progressing.
  • Document the Web Story title and organize each page within the story so they can be tracked properly.
  • Track when users reach the final page in your Web Story so you can identify how many users actually reach the end.
  • Track when users enter your bookend, which is a special page at the end of your Web Story that contains social sharing and related links. It’s just another way to understand when users have reached the final part of your story.

For example, wouldn’t it be incredible to see the following? That’s a sample Web Story and data for each page within the story. Yep, this is what we want… let’s go get it:

Event reporting in Google Analytics for Web Stories.

The Inner Workings: Events, Triggers, and Variables
Every Web Story fires events as a user progresses through a story. For example, when a user moves from one page to another within a story, the “story-page-visible” trigger fires for each new page that loads. You can capture an event like that and report it in Google Analytics using event tracking.

When sending those events to Google Analytics from within your story, you can provide the typical event parameters like event_action, event_category, and event_label so you can track your Web Story data in your GA reporting.
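To show the shape of what we’ll build before walking through each trigger, here is a minimal sketch containing just a single trigger (the full snippet, covering all three triggers, appears later in the post). UA-XXXXXX-X is a placeholder for your own Google Analytics property ID:

<amp-analytics type="gtag" data-credentials="include">
  <script type="application/json">
  {
    "vars": {
      "gtag_id": "UA-XXXXXX-X",
      "config": { "UA-XXXXXX-X": { "groups": "default" } }
    },
    "triggers": {
      "storyProgress": {
        "on": "story-page-visible",
        "vars": {
          "event_name": "custom",
          "event_action": "story_progress",
          "event_category": "${title}",
          "event_label": "Page: ${storyPageIndex} ID: ${storyPageId}",
          "send_to": ["UA-XXXXXX-X"]
        }
      }
    }
  }
  </script>
</amp-analytics>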

Event tracking in Google Analytics.

List of Web Story Triggers and Variables:
There are several triggers you can capture, and a list of them can be found on GitHub in the AMP Project’s amphtml repository. In addition, you can view the variables available to you via AMP by checking out the AMP HTML Variable Substitutions page. Between the two documents, you can learn how to combine triggers and variables to set up advanced tracking.

Web Story triggers.
AMP variables.

A Template From The AMP Project!
During my research, I was excited to see that the AMP blog published a post about tracking Web Stories and it contained a skeleton structure for advanced tracking! For example, the post listed a code snippet for firing an event every time a user progresses from one page to another within a web story (to track user progress). We could use this snippet, and expand on it, to customize our tracking.

Here is an image from the AMP project’s blog post about tracking user progress. Again, this is exactly what we are looking to do.

Analytics setup for web stories.

By using the right triggers and variables and then firing events from our Web Story, we can get a much stronger picture of user engagement. Below, I’ll provide the triggers and events we’ll use and then I’ll provide the final code later in the post.

Note, there are three triggers we’ll be capturing in our Web Story, and we’ll fire an event each time one of those triggers fires.

  • Trigger: story-page-visible. When each new page in the story loads, story-page-visible fires. When that fires, you will send an event to Google Analytics with the following variables.
  • event_name: You can name this whatever you want. Event tracking in Google Analytics focuses on the following three fields.
  • event_action: Name this something descriptive. For this example, we’ll use “story-progress” which is what the original blog post used covering this approach.
  • event_category: For this field, I’m going to use a variable for Web Stories, which is the title of the Web Story. The variable is ${title}, which is what’s present in your title tag. I linked to the variables available to you earlier in this post.
  • event_label: For the final field, we’ll use both the page index value (page number) and the ID for the page within the Web Story (which is the descriptive name for the page you provide in your story). This will enable us to see how many times a specific page within the Web Story is loaded by users. The variables are ${storyPageIndex} and ${storyPageId}, and you can combine the two in your code. I added “Page: ${storyPageIndex} ID: ${storyPageId}” to combine both in your event reporting. It makes it easier to see the page number and then the ID associated with that page. BTW, thank you to Bernie Torras who pinged me on Twitter about storyPageIndex, which is a great way to capture the page number within your story.

Next, we want to know when users visit the final page of each Web Story. That can help us understand how many people are actually reaching the end. To accomplish that, we can add another trigger:

  • Trigger: story-last-page-visible. Note, this is not the bookend. Instead, this is the last page in your story before the bookend is displayed. Story-last-page-visible fires when a user reaches that final page in your story before the bookend.
  • event_name: You can name this whatever you want. Just like earlier, the reporting in Google Analytics focuses on the following fields.
  • event_action: Name this something descriptive. For this example, we’ll use “story-complete” since the original blog post covering this tracking framework used that action name.
  • event_category: Make sure to use the same event_category for this trigger as you did earlier to keep the various triggers organized by Web Story. The variable is ${title}. Then you can drill into a specific story in Google Analytics and view the actions and labels associated with that one story.
  • event_label: In the final code below, I pass the ${totalEngagedTime} variable as the label for this trigger, which reports how long the user actively engaged with the story before reaching the final page.

And finally, let’s add one more trigger to understand when users reach the bookend in your Web Story, which is a special page at the end that contains social sharing and related links. It’s just another way to understand that users made it to the very end of your Web Story. You’ll need to add one more section of code to your tracking script:

  • Trigger: story-bookend-enter
  • event_name: You can name this whatever you want. As I mentioned earlier, the reporting in Google Analytics focuses on the following fields.
  • event_action: You can also name this whatever you want. For this example, let’s use story-bookend-enter.
  • event_category: Like earlier, I’m going to use a variable for Web Stories, which is the title of the Web Story. The variable is ${title}. Remember to keep the category consistent for each trigger so you can view all events within a single Web Story in your reporting.

By adding this setup, the event tracking reporting in Google Analytics will enable you to drill into specific stories, see the number of “pageviews” for each page within a story, know how many users are reaching the final page in a story, and then how many are viewing the story bookend. It’s a much stronger setup than just seeing a single pageview for your Web Story (AKA, the black hole of Web Stories).

Here is the final code based on what I mapped out above. Make sure you replace the placeholder GA account ID with your own:

<amp-analytics type="gtag" data-credentials="include">
  <script type="application/json">
  {
    "vars": {
      "gtag_id": "UA-XXXXXX-X",
      "config": {
        "UA-XXXXXX-X": {
          "groups": "default"
        }
      }
    },
    "triggers": {
      "storyProgress": {
        "on": "story-page-visible",
        "vars": {
          "event_name": "custom",
          "event_action": "story_progress",
          "event_category": "${title}",
          "event_label": "Page: ${storyPageIndex} ID: ${storyPageId}",
          "send_to": ["UA-XXXXXX-X"]
        }
      },
      "storyEnd": {
        "on": "story-last-page-visible",
        "vars": {
          "event_name": "custom",
          "event_action": "story_complete",
          "event_category": "${title}",
          "event_label": "${totalEngagedTime}",
          "send_to": ["UA-XXXXXX-X"]
        }
      },
      "storyBookendStart": {
        "on": "story-bookend-enter",
        "vars": {
          "event_name": "custom",
          "event_action": "story_bookend_enter",
          "event_category": "${title}",
          "send_to": ["UA-XXXXXX-X"]
        }
      }
    }
  }
  </script>
</amp-analytics>

How To Add Your GA Tracking Script To A Web Story:
Once you have your tracking script ready, you need to add it to your Web Story code. I’ve been hand-coding my stories so it’s easy to control where the amp-analytics tag is placed. In my stories, I place the amp-analytics tag after the final page in the story, but before the bookend tag and the closing <amp-story> tag. If you place the amp-analytics tag outside of your <amp-story> tag, the Web Story will not be valid. You can see the placement of my amp-analytics tag in the screenshots below, and I’ve included a simplified sketch of the structure right after them.

Amp analytics placement in Web Story code.

Make sure your amp-analytics tag is placed before the closing amp-story tag.

Placing amp analytics tag before bookend and closing amp story tag.
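And here is a stripped-down sketch of that structure (the page IDs and story metadata are placeholders, and a real story also needs the AMP runtime scripts and grid layers on each page). The key point is that the amp-analytics tag sits after the last page, before the bookend and the closing amp-story tag:

<amp-story standalone
           title="My Web Story"
           publisher="Example Publisher"
           publisher-logo-src="/images/logo.png"
           poster-portrait-src="/images/poster.jpg">

  <amp-story-page id="cover">
    <!-- Grid layers with your cover content go here. -->
  </amp-story-page>

  <amp-story-page id="final-page">
    <!-- The last content page: story-last-page-visible fires here. -->
  </amp-story-page>

  <!-- The amp-analytics tag goes after the final page... -->
  <amp-analytics type="gtag" data-credentials="include">
    <script type="application/json">
      { "vars": { "gtag_id": "UA-XXXXXX-X", "config": { "UA-XXXXXX-X": { "groups": "default" } } } }
    </script>
  </amp-analytics>

  <!-- ...but before the bookend and the closing amp-story tag. -->
  <amp-story-bookend src="bookend.json" layout="nodisplay"></amp-story-bookend>

</amp-story>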

A Note About The Web Stories WordPress Plugin:
Again, I have been hand-coding my stories and haven’t dug too much into the WordPress plugin. I’ve heard good things about it, but I really wanted to learn the ins and outs of building a story, so I stuck with hand-coding.

The WordPress plugin finally added the ability to easily include your Google Analytics tracking ID, but it doesn’t look like you can add advanced-level tracking (like what I’m mapping out in this post). I’ll reach out to the Web Story team to see if they will add the ability to accomplish this in the future, but for now I think you’ll be limited to the basic tracking I mentioned earlier.

{Update: WordPress Plugin Automatically Firing Events}
I have very good news for you if you are using the Web Stories WordPress plugin. Brodie Clark pinged me today after going through my post. He is using the Web Stories plugin, checked the Events reporting in Google Analytics, and noticed the plugin is automatically firing those events! That’s amazing news for any plugin users!

Again, I’ve been hand-coding my stories so I haven’t played around too much with the plugin. But that’s outstanding news, since plugin users can view user progress and a host of other events being fired within their stories.

Once you add your GA tracking ID, it seems the plugin is automatically capturing the data and firing events:

Adding a Google Analytics tracking ID to the WordPress Web Story plugin.

Here are the triggers being captured based on what Brodie sent me:

WordPress Web Story Plugin automatically firing events.

And here is what it looks like once you click into story_progress. The plugin is using storyPageIndex versus storyPageId, so you can see the page number in the reporting. I’m actually thinking about combining the two.

Tracking user progress through Web Stories via the WordPress plugin.

How To Test Your Tracking Via Google Analytics Real-time Reporting
The easiest way to test your new tracking setup is to upload your story to your site and view real-time reporting in Google Analytics. There’s a tab for Events, where you can see all of the events being triggered. Just visit your story and look for the various events, actions, and labels.

Viewing real-time reporting in Google Analytics for Web Story events.

Viewing Web Story Tracking In Google Analytics:
Once your story containing the new tracking setup is live, and users are going through it, you can check the Events reporting within the Behavior tab in Google Analytics to view all of the events that have been captured. This is where the naming convention we used comes in handy. When you click “Top Events”, you will see the event categories listed. We used the story title as the category, so you will see a list of all stories where events were captured.

When you click into a story, you will see each action that was captured (each trigger that was captured as an event).

Viewing Web Story triggers in the Events reporting in Google Analytics.

And by clicking an action, you can see the labels associated with that action. For example, story_progress enables you to see the list of pages that users viewed within your story and how many events were triggered for each page (helping you understand how far users are progressing through each story).

Viewing user progress through a Web Story in Google Analytics.

And there you have it! You can now track your Web Stories in a more powerful and granular way. It’s not perfect, but much stronger than the “black hole” approach of just basic page-level metrics. And remember, you can totally expand on this setup by adding more triggers and using the variables available to you.

Summary – Creep out of the black hole of Web Story tracking.
I hope you are excited to add stronger tracking to your Web Stories. As I documented in this post, you can creep out of the black hole of story tracking and analyze user progress through your story. By doing so, you can better understand user engagement and then refine your stories to increase engagement and user happiness.

So don’t settle for black hole reporting. With a basic level of understanding of event tracking, triggers, and variables, you can set up some very interesting tracking scenarios. Good luck.

GG

Filed Under: google, google-analytics, seo, tools, web-analytics

Image Packs in Google Web Search – A reason you might be seeing high impressions and rankings in GSC but insanely low click-through rate (CTR)

October 14, 2020 By Glenn Gabe Leave a Comment

Image pack rankings in Google web search.

While analyzing performance in Google Search Console (GSC), some site owners have reported a strange situation with impressions and position (and some even believe it’s a bug). I’ve seen this many times and I often get questions about it from site owners left scratching their heads after witnessing those metrics in their own stats. Although it’s easy to chalk it up to a GSC bug or anomaly, the answer can be hiding in plain sight.

I’m referring to seeing a site ranking very highly in the search results (according to GSC), but when you check the search results, you can’t find the page ranking. And even stranger, you might see that #1 or #2 ranking with insanely low click-through rates compared to other top rankings (which is often a sign of the situation I’m covering today).

So, what can cause this strange situation in the Google search results with high rankings but insanely low click-through rate? Well, it’s sometimes just an image pack ranking in the SERPs or an image ranking in a knowledge panel. Most site owners scan right past those SERP features and look for high rankings in the 10-blue links, when those very features are often the culprit. Let’s explore this situation in greater detail.

A quick reminder about how Google calculates impressions and position in GSC:
Last year I wrote a post demystifying clicks, impressions, and position in Search Console. In my post, I covered how image blocks are tracked and calculated. For example, if Google ranks an image block #1 in the SERPs and that block contains 10 images, then each of those images takes on the rank of the entire block. Therefore, your image, even if ranking seventh within the image block, would rank #1 in GSC.

If you are surprised by that, which is totally understandable since it’s confusing, then you should definitely read my entire post on Search Engine Land. There are several other surprises in the article that can help you better understand tracking performance metrics in GSC.

Image block ranking in Google search.

Google only counts clicks outside of the search results and not query refinement links. But is that really correct??
To register an impression, Google explains that a link must lead out of the Google search results and to your site. If it’s just a query refinement link (which generates a new search result), then that link will not register an impression.

What is a click in Google Search Console (GSC)?

This is where image packs stand out (and maybe not in a good way).

Image packs contain a series of images from various websites. When clicking a thumbnail, you are not taken directly to the site and outside of the search results. Instead, you are taken to Google Images with that image highlighted and expanded. Then, if you click the image again, you are taken to the third-party site. So the click leading outside of Google doesn’t happen from the web search results. It happens from Google Images.

So, if that’s the case, then images within image packs in the web search results should NOT register an impression. But they do. And that’s super confusing for many site owners, and even experienced SEOs.

For example, here is an image pack ranking in Google for “new york photos”. Notice the image pack ranking at the top of the search results (web search).

Image block for New York Photos in the Google web search results that lead to Google Images.

When clicking an image in the image pack, you are taken to Google Images and not the third-party site that contains the image. But… that image will register an impression and position in GSC. More about that soon (with evidence). After clicking the second image in the image pack, I was taken to Google Images with that image expanded:

Image search results for New York Photos.

The reason that urls from publisher sites do gain an impression is that the image tag’s title attribute contains the url for the site. And that url is the one gaining the impression and position in your GSC reporting. Here is the code showing the title attribute containing the site’s url:

Image pack links actually lead to Google Images but the title attribute contains the url from third-party websites.

To clarify, the image doesn’t link to third-party sites. The link leads to Google Images and not to your site. But the image tag’s title attribute does contain your url, so that is what gains the impression and click in GSC.
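In simplified form, the pattern looks something like this (an illustrative sketch, not Google’s exact markup, and the urls are placeholders):

<!-- A thumbnail in an image pack within the web search results (simplified). -->
<!-- The anchor leads to Google Images, not to the publisher's site, but the  -->
<!-- img title attribute carries the publisher's url, which is what registers -->
<!-- the impression and position in GSC.                                      -->
<a href="https://www.google.com/search?q=new+york+photos&tbm=isch">
  <img src="data:image/jpeg;base64,..."
       title="https://www.example.com/new-york-photos/"
       alt="New York photos">
</a>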

Confusing? Yep… And can that throw off your reporting? Affirmative.

But it’s not just image packs. It’s knowledge panels as well:
Just a quick note that this also impacts knowledge panels that contain a block of images. So, if you search for “tesla model 3” and see the knowledge panel containing a number of images, those will behave the same way from a tracking standpoint. They will take on the rank of the knowledge panel (the parent search feature). And to make things even more confusing, a knowledge panel will often rank on the right side of the SERP on desktop, which means it can be in position #11, #12, #13, etc. depending on the query and the resulting search results. The knowledge panel below ranks #11 for this query on desktop:

Knowledge panel in Google search that contains images from publisher sites.

And that same knowledge panel on mobile is at the top of the search results (which would show up as position of #1 in GSC). So you really need to filter your GSC reporting by desktop and mobile to see an accurate position for queries that are impacted.

Knowledge panel on mobile that also contains images from publisher sites.

Now, if you have a lot of images ranking in the SERPs, you might have squeezed your coffee mug so hard it splintered into 35 pieces while reading this post. That’s ok, just clean up the coffee, brew another cup, and get ready for some evidence. Below, I’ll cover several different examples of image packs and knowledge panels registering impressions and clicks in GSC based on simply an image ranking in those features (that actually lead to Google Images and not to the third-party website).

Examples of image pack rankings with GSC metrics:
Below, I’ll provide three examples of image packs or knowledge panels yielding high rankings and low CTR in GSC. I’ll provide the SERP, GSC metrics for web search, and then GSC metrics for image search. I think you’ll get the picture, so to speak. :)

Here’s an image pack ranking for the query “rainbow”. The pack ranks #1 in the SERPs, which means the images take on the rank of the pack (so they also rank #1):

Image pack ranking number one for the query "rainbow".

For image search in GSC, you can see the image has yielded many clicks… but that’s image search. Web search is a different story.

Image search metrics yield stronger numbers.

For web search, there are almost 3M impressions, but only 430 clicks (for a CTR of basically 0%). This is exactly what can happen when an image ranks in an image pack in the web search results:

Web search results for image packs in GSC.

For the second example, here’s an image pack ranking #2 for the query “jaguar suv”.

Image pack ranking in Google for "jaguar suv".

In image search, you can see 562 clicks based on 55K impressions for a CTR of 1%:

Image search statistics for "jaguar suv".

But in web search, there have only been 70 clicks based on 43K impressions. The CTR is only .2%. So although GSC shows strong rankings and 43.8K impressions, the reality is that the ranking leads to almost no traffic at all from Google. Again, this can cause a lot of confusion while site owners are analyzing their performance in Search:

Web search stats in GSC for "jaguar suv".

The last example I’ll provide is from a knowledge panel. There are images ranking in the knowledge panel that make for an interesting case. On desktop, the knowledge panel ranks on the right side (in position 11 for this query):

Knowledge panel with images ranking in the web search results.

But on mobile, the knowledge panel ranks #1 at the top of the SERP:

Knowledge panel ranking in the mobile search results.

The images in the knowledge panel for desktop rank #11 and the images in the knowledge panel on mobile rank #1. The following screenshots from GSC show this is the case. Here are the desktop stats. Yep, just 9 clicks with a position of #11 since the knowledge panel ranks on the right side:

GSC metrics for a knowledge panel ranking in the web search results.

And here are the mobile search stats. 210 clicks with a position of #1 (since the knowledge panel ranks at the top of the SERP on mobile). There were 242K impressions and just 210 clicks:

GSC metrics for a mobile knowledge panel ranking in Google.

And compare both of those to the image search stats… 3.3K clicks for the query. That’s a big difference, and it shows the gap between images ranking in web search (via image packs or knowledge panels) versus ranking in image search:

Image search results versus knowledge panel results.

Image Packs, Knowledge Panel Images, and GSC: Key Takeaways
As you can see from the examples above, images ranking in image packs and knowledge panels can definitely yield some strange metrics in Google Search Console. It’s important to understand this, since it can lead to a lot of confusion while analyzing a site’s performance in Search.

Below, I’ll cover some key takeaways based on my own research.

  • First, if you think GSC is wrong, check again, but keep your eyes peeled. There may be an image ranking in an image pack or a knowledge panel that’s throwing off your reporting.
  • Make sure you fully understand how GSC calculates metrics. There are some very interesting and confusing situations based on various SERP features (especially on mobile). I would read my post about calculating clicks, impressions, and position in GSC and Google’s help document covering the subject. Between the two, you can gain a stronger understanding of how everything is calculated.
  • High rankings with extremely low click-through rate could be a sign you are experiencing this issue. If you see high rankings, but very low CTR, that could mean an image is ranking well in web search either within an image pack or a knowledge panel.
  • Although Google explains that only links outside of the search results register an impression, that’s not the case for images in the web search results (within image packs and knowledge panels). The image tag’s title attribute contains the url of the page containing your image, and that’s what is gaining an impression and click.
  • This is a great example of why it’s tough to rely on aggregated statistics in GSC, since those metrics take many different situations into account. The devil is in the details (like everything else in SEO). For example, those images ranking well in web search could absolutely be dragging down CTR overall. Make sure you know that while reviewing your stats.

Summary – Look for image packs and knowledge panels throwing off your reporting:
If you see high rankings with crazy-low click-through rates in GSC and you’re confused about why that’s showing up, you might want to check for images packs or knowledge panels for those queries. You might have images ranking in one of those two search features. It’s a great example of when ranking number one doesn’t really mean very much. Users often scan right by those images in the SERPs, and when that happens, you can end up with a false sense of accomplishment and little traffic to your site. So again, keep your eyes peeled. The answer might be hiding in plain sight.

GG

Filed Under: google, seo, tools
