The Internet Marketing Driver


The September 2023 Google Helpful Content Update – Did Google’s Announcement in April About Page Experience Foreshadow What We’re Seeing With The Current HCU(X)?

September 27, 2023 By Glenn Gabe

Based on what I’m seeing with sites heavily impacted by the HCU, and what Google explained in its April documentation update, the combination of unhelpful content and poor UX can be an extremely problematic combination for site owners.

The September 2023 Helpful Content Update


On September 14, 2023, Google rolled out the September helpful content update. This was the third helpful content update, and it has been the biggest of the three HCUs so far. I have over 500 domains documented so far that have been hammered by the update, and my list continues to grow each day.

Here are some examples of the extreme drops based on the September HCU. These are medieval Panda-like drops:

Example of a site dropping heavily based on the 2023 September helpful content update
Another example of a site's search visibility dropping heavily based on the 2023 September helpful content update
Semrush graph showing a site heavily impacted by the September 2023 helpful content update.


I might write a post covering more about the update once the dust settles, but I wanted to cover a specific theory I had based on analyzing many sites impacted by the update. First, after going through those sites, most just don’t have great content… I’m seeing AI-generated content, programmatic content used to churn out many unhelpful pages, sites that cover every imaginable variation of a topic without providing truly insightful or valuable information, and more. Also, I have analyzed sites that have dropped heavily across a number of categories, including lyrics, calculations, travel, translations, downloads, gaming, and many more.

Clearly the system is targeting unhelpful content, but that’s not the only thing I’m seeing. Some other sites I analyzed have definitely caught my attention. I’m not saying they have the best content in the world, but they don’t seem to have the worst content either… For example, there are sites that got crushed in the recipes, how-to, and reviews verticals that don’t fall into the same category of unhelpful content as what I explained earlier.

And a common thread on many impacted sites, including ones with content that may not be horrible, is a terrible user experience. In other words, the ad situation is severe on many of these sites: users are bombarded with ads all over the page, auto-playing video follows you as you scroll down the page, and then you might be hit with popups as well.

This is exactly the type of aggressive, disruptive, and sometimes even deceptive, advertising situation I have spoken about for years. And when I say years, I mean from the medieval Panda days back in 2011. I saw this on many sites I was helping recover from Panda. And now here we are, with what looks like Panda on steroids. I have always said, “hell hath no fury like a user scorned”, and I feel like that might be the motto for the latest HCU.

First, here’s a slide from one of my presentations about broad core updates where I cover aggressive ads:

Glenn Gabe's slide about aggressive and disruptive ads causing problems during broad core updates.


And then just one of many tweets I’ve shared about aggressive ads and core updates (this one links to a video from John Mueller explaining more about taking a step back and reviewing the site overall):

Still think aggressive ads can't hurt you? Via @johnmu: Impacted algorithmically? Take a step back & improve your site overall. E.g. Are there a ton of ads on your pages making it hard to find the content? That can be reflected in our algos over time: https://t.co/R4HSYx4pnY pic.twitter.com/c7Sv1NO8T4

— Glenn Gabe (@glenngabe) June 14, 2019

And I even discussed this on a podcast with Dre last Friday while covering what I’ve seen so far (at 55:10 in the video):


The HCU evolves to the HCU(X)?
With the original HCU in August of 2022, the sites that got hit were the most egregious from a content quality standpoint. But Google’s Danny Sullivan explained at the time that Google was serious about the helpful content system and that they would continue to improve it, and the system would evolve over time.

And evolve it did… the September 2023 helpful content update has been the most powerful of the HCUs so far (by a mile). Again, I have hundreds of domains documented that have been crushed by the update; I’ve also had many site owners reach out to me for help, and I’ve heard from people across verticals about the extreme volatility the HCU is causing across sites. It’s a beast.

So what changed?

First, Google explained the HCU has an improved classifier, and that is clearly obvious. The sheer number of sites being impacted is far larger than with previous HCUs. Second, and this is the heart of my post today, Google updated its page experience documentation in April of 2023, and specifically included that information in its documentation about creating helpful content.

Yes, this might have been the warning about what the September HCU was going to target. With that update to the documentation, Google took the emphasis away from core web vitals and wanted site owners to focus on the user experience overall. You can clearly see in the questions that Google covers a wider range of REAL user experience issues, versus a core web vital score that might be a few milliseconds off.

For example, Google covers excessive ads that distract from or interfere with the main content, intrusive interstitials, how easily visitors can navigate to (or locate) the main content of the page, whether users can easily distinguish the main content from other content on the page, and more.

New bullets added to Google's documentation about page experience and the impact in Search.


Again, I am NOT saying UX alone is going to get a site hammered. Google is using machine learning with the HCU, which means many signals are being sent to the machine learning system; that system dictates the weighting of those signals, which ultimately dictates rankings. But the addition of this information in the helpful content documentation, and seeing those horrible UX situations across many sites impacted, has me thinking the two are connected.

Google also provided a specific FAQ in that blog post from April about the helpful content system. As you can see below, content is the focus, but UX does play a role:

FAQ about the role of page experience in the helpful content update.


So, I believe the focus of the HCU is removing unhelpful content from the SERPs, but the user experience plays a part. And the combination of lower-quality content with a terrible UX is the kiss of death with the HCU. I explained in my SMX Advanced presentation a few months ago that the helpful content system was Google’s secret weapon for fighting various types of low-quality content, but it doesn’t seem like content is the only thing they are looking at algorithmically. In my opinion, UX crept its way in when they added that documentation change in April. I just don’t think it hit home with many site owners until this update rolled out.

The helpful content update is Google's secret weapon in fighting various forms of low-quality content.


Improving The User Experience – And I’m not referring to core web vitals…
For site owners heavily impacted by the September 2023 helpful content update, the focus should clearly be on providing high-quality and insightful content that can help users. But, I would make sure you don’t drive those users insane while they are trying to consume your content. And if you’re in a situation where you think your content is ok, but you still got hammered, then definitely take a hard look at your UX, ad situation, popups, and more. I would not bombard users with ads, I would not trigger popups like crazy, I wouldn’t follow users down the page with auto-playing video, and I would make sure your main content can be easily and quickly identified by users.
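
If you want a quick, objective gut-check before a full audit, you can roughly measure the ad situation on a page with a script. Below is a minimal sketch (assuming Python with requests and BeautifulSoup installed); the ad-domain list and URL are just placeholders, and it only catches ad tags present in the raw HTML (not ads injected later by JavaScript), so treat the output as a starting point for manual review rather than a verdict:

```python
# Rough ad-density heuristic (hypothetical helper, not an official tool).
import re
import requests
from bs4 import BeautifulSoup

# Placeholder list of well-known ad-serving domains.
AD_DOMAINS = ("doubleclick.net", "googlesyndication.com", "adnxs.com",
              "taboola.com", "outbrain.com")

def rough_ad_density(url: str) -> None:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Count iframes/scripts that point at known ad networks.
    ad_tags = [tag for tag in soup.find_all(["iframe", "script"], src=True)
               if any(domain in tag["src"] for domain in AD_DOMAINS)]

    # Compare against how much actual text the page contains.
    words = len(re.findall(r"\w+", soup.get_text(" ", strip=True)))
    print(f"{url}: {len(ad_tags)} ad tags vs ~{words} words of main text")

rough_ad_density("https://www.example.com/article/")  # hypothetical URL
```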

Too close to your own site? Run a user study!
If you feel you are too close to your own site, then run a user study. I wrote a post about running user studies through the lens of broad core updates a few years ago and Google ended up linking to that post from their own article about broad core updates. You can learn so much when objective third-party users go through your site and evaluate their experience through the lens of specific algorithm updates. You can read my post for more details about running a study.

The power of user studies for sites impacted by Google's broad core updates.


Summary: Dealing with a big September 2023 HCU hit.
Again, I might cover more about this update in a future post, but I wanted to provide this information about UX after going through many sites impacted by the September HCU, or should I say HCU(X)? Also, just a reminder that with the helpful content update, you typically cannot recover quickly. Google will need to see significant changes over the long term in order for the HCU classifier to be dropped (which can take months). So if you have seen heavy impact, then you will need to work hard on improving both your content and the user experience. Good luck.

GG

Filed Under: algorithm-updates, google, seo

How To Find Lower-Quality Content Being Excluded From Indexing Using Bing’s XML Sitemap Coverage Report (and Its “Content Quality” Flag)

September 25, 2023 By Glenn Gabe

Finding lower-quality content via Bing's Sitemap Index Coverage Report in Bing Webmaster Tools


Bing finally rolled out its XML Sitemap Coverage Report in Bing Webmaster Tools, which is a great addition for site owners. Using the report, you can check indexing levels based on the urls being submitted via XML sitemaps. This is similar to what Google offers in its Coverage reporting, but it’s great to have another major search engine provide this data.

Hello “Content Quality” flag:
When I first dug into the reporting, I quickly started checking urls excluded from indexing across sites. Like Google, Bing provides a number of categories for urls being excluded, including noindexed, redirected, 404s, and more. But one of those categories really struck me – “Content Quality”. With “Quality” being the most important thing that site owners should focus on, understanding when a major search engine believes you have quality problems, and surfacing those specific urls, is pretty awesome.

Bing's "Content Quality" flag in the sitemap index coverage report


And once you click the “Content quality” category, you can view all of the urls from that sitemap that were flagged as having content quality issues:

Viewing urls flagged with "content quality" issues in Bing Webmaster Tools


Bing is not Google, but Bing is a major search engine: And will Google follow?
With major algorithm updates evaluating quality on several levels, having this information from Bing could potentially help site owners surface and improve lower-quality content. And with Google’s broad core updates, reviews updates, and now helpful content updates, digging into urls flagged as lower quality could help jumpstart a site owner’s analysis. Sure, Bing is not Google, but the content that Bing is surfacing in its Sitemap Index Coverage reporting could be a proxy for what Google also believes is lower-quality content. You don’t want to take that at face value, but it’s definitely worth investigating…

And maybe a bigger question is… will Google follow Bing here and provide a “Content Quality” category in its own Coverage reporting? I know Google has toyed with this idea in the past, but never officially rolled out a content quality category in Search Console. To be honest, I’m not sure that would ever happen, since it could reveal a bit too much of the secret sauce. I know they don’t want to provide too much link data either, for similar reasons.

I mean, imagine waking up one day and seeing this in Google Search Console. :)


Finding the Index Coverage reporting in Bing Webmaster Tools:
If you have at least 10K urls indexed in Bing, then you should be able to see the index coverage reporting for your site in the Sitemaps reporting. But, based on what I’m seeing, a number of sites do not have that option. If you don’t see the option, then I would make sure you are submitting XML sitemaps in BWT or including a reference to them in your robots.txt file (e.g., a line like “Sitemap: https://www.example.com/sitemap.xml”).

For example, here is a large-scale site with sitemaps in BWT, but the index coverage option isn’t available.


Maybe the Index Coverage reporting is still rolling out to more sites… I’ll reach out to Bing’s Fabrice Canel to see why those sites don’t have index coverage reporting and then update this post with more information.

Reviewing content quality problems across sites: Were the urls actually low quality?
I was eager to investigate the “Content Quality” category across sites to see what types of content were surfaced there. So I dug in across several sites, and across several verticals. I’ll quickly cover what I found below.

First, although many of the urls were ones that I would consider lower-quality or thin, not all were. Do not take Bing’s word blindly… you definitely need to review the urls yourself. Some were exactly what I would surface as lower-quality, while others seemed ok for users (they were not great, but not terrible either)…

For example, I found the following types of lower-quality urls in the reporting across sites:

  • Short and unhelpful Q&A posts.
  • Thin press releases.
  • Thin and dated news articles.
  • Spider traps (little content leading to even more thin pages).
  • Ultra-thin business or organization listing pages.
  • Lower-quality content focused on sensitive categories (YMYL).
  • Thin video pages covered in ads.
  • Low-quality “reference” content.
  • Thin user profile pages.
  • Thin tutorials.

And more…

More Ways To Find Content Quality Problems in BWT:
After tweeting about this the other day, and thanking Fabrice Canel from Bing, he replied with an interesting note. Fabrice explained that the Index Coverage reporting wasn’t the only place you can surface content quality problems in Bing Webmaster Tools. He explained you can also see this when inspecting specific urls and via Site Explorer.

You are welcome @glenngabe. The same classifications are used in URL inspection, and in my favorite tool SEO Explorer. Here is a link to SEO Explorer filtered on content quality issues. https://t.co/NNgSX5U6Gn. Note: Data can be +/- 1 to 2 days not in sync between these tools.

— Fabrice Canel (@facan) September 22, 2023

When checking the link he provided, I noticed that Site Explorer was filtered by “URLs with other issues”. So it seems that category means the same thing as “Content Quality” in the Index Coverage reporting for sitemaps. In other words, it won’t say “Content Quality” in Site Explorer, but it means the same thing.

Finding quality problems in the Site Explorer feature in Bing Webmaster Tools


And when inspecting specific urls that were flagged as lower quality in the Sitemap Index Coverage reporting, I typically saw other categories appear for why the urls weren’t indexed. It did not say “Content Quality”. Fabrice did say the data might not be in sync and there could be a 1-2 day lag there between the tools, but it’s worth noting.

For example, a url that was flagged as “Content quality” in the Sitemap Index Coverage reporting actually yielded “Discovered but not crawled” when inspecting that url. That category can signal quality problems too, but it doesn’t say “Content quality”.

Cross-referencing the url inspection tool in Bing Webmaster Tools for urls that are flagged as low quality.


Summary – “Content Quality” is being flagged by a major search engine. Dig in there…
Again, I was pretty excited to see Bing Webmaster Tools provide a flag for content quality. With so much emphasis on “quality” from the major search engines, it’s great to dig in and analyze urls being surfaced as having quality issues. The reporting will never be perfect, and I would not blindly act on what’s being surfaced there, but it’s a nice starting point for site owners trying to understand content quality issues across their sites. I highly recommend digging in there. :)

GG

Filed Under: bing, google, seo, tools

Analyzing the removal of FAQ and HowTo snippets from the Google search results [Data]

August 23, 2023 By Glenn Gabe


Update: September 14, 2023
Google just announced that HowTo snippets will now be removed from the desktop results as well. The original announcement explained desktop HowTo snippets would remain, but Google reversed course and has now removed them.

Update: September 13, 2023
Google finally removed FAQ snippets from the desktop results. Learn more about the removal.

—–
On August 8, 2023, Google announced a big change that would impact both FAQ and HowTo snippets. First, Google explained it was removing FAQ snippets in the SERPs for most sites other than “well-known and authoritative government and health websites”. The change would impact both the desktop and mobile search results, although I have more to share about that in my analysis.

And from a HowTo standpoint, Google announced those incredibly visual snippets that take up precious SERP real estate would be removed on mobile. Google originally said that they would still remain on desktop (usually in a list format), but HowTo snippets have now been removed from both the mobile and desktop search results.

For sites that received FAQ and/or HowTo snippets, the change was not well received by most site owners. Rich snippets can help listings stand out in the SERPs, can help boost click through rate, and can provide more information to entice users to click through. And as someone who spent the time adding HowTo structured data for my SEO tutorials, I loved having HowTo snippets in the SERPs. Seeing them disappear was tough… and had me wondering how CTR would be impacted by their removal. And many other site owners who had FAQ or HowTo snippets also wondered how the removal would impact clicks and CTR. Therefore, I decided to dig in and analyze data across sites to find out.

Tracking The Changes To Clicks and Click Through Rate (CTR):
After quickly scanning GSC’s performance reporting after the changes went live, I was eager to analyze the data across sites, verticals, countries, etc. I have access to a number of sites that were receiving both FAQ snippets and HowTo snippets and I was very interested to see the impact of the removal on clicks and CTR. Below, I’ll quickly cover the methodology I used when analyzing the impact and then I’ll cover each site’s data.

Methodology:

  • I analyzed six sites in total, three that heavily received FAQ snippets and three that heavily received HowTo snippets.
  • I analyzed the change in clicks for both mobile and desktop through the removal of FAQ and HowTo snippets starting on August 8, 2023. Note, click through rate is a better indicator since clicks can jump or drop based on demand and impressions. But I included clicks for the six sites below.
  • I also reviewed the change in click through rate by filtering pages that received rich snippets prior to the change, i.e., how did the removal of rich snippets impact the visibility of those listings, and potentially how many people clicked through?
  • And to clarify, I isolated pages receiving FAQ and HowTo snippets in the SERPs and filtered in GSC via regular expressions. This enabled me to gain a closer view of specific sets of pages that received FAQ or HowTo snippets to see the change in clicks and CTR (see the sketch after this list for one way to build that type of filter).
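
To illustrate that step, here’s a minimal sketch of how you could build that type of GSC page filter from an exported list of urls (the urls below are hypothetical). GSC’s regex filters use RE2 syntax and cap the pattern length, so a long url list may need to be chunked into several filters:

```python
# Build a GSC "matches regex" page filter from urls that had rich snippets.
import re

urls_with_snippets = [  # hypothetical export of pages that received snippets
    "https://www.example.com/how-to/fix-crawl-errors/",
    "https://www.example.com/how-to/set-up-gsc/",
    "https://www.example.com/faq/structured-data/",
]

# Escape each url and join them into one alternation pattern.
pattern = "|".join(re.escape(url) for url in urls_with_snippets)
print(pattern)  # paste into GSC: Page filter > Custom (regex)

# Sanity-check the pattern locally before using it in GSC.
assert all(re.search(pattern, url) for url in urls_with_snippets)
```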

First up, I analyzed the impact of FAQ snippets being removed:

Case 1 – Stable.
First, you can see the removal from the mobile search results as expected on August 8, 2023. Clicks drop off a cliff for listings that contained FAQ snippets. Note, this doesn’t mean clicks overall dropped for the site, just that Google stopped providing FAQ snippets in the mobile SERPs.

In the announcement, Google explained that FAQ snippets would also be removed from the desktop search results, but that’s not what I’m seeing across sites. For example, checking desktop, you can see FAQ snippets are alive and well (at least for now). Clicks actually increased after the changes were implemented (but that can be influenced by impressions increasing too). Note, Google finally removed FAQ snippets in the desktop search results on 9/13/23.

When checking the impact to clicks and CTR when filtering by urls via regex that received FAQ snippets on mobile, I didn’t see much impact at all. Clicks are stable and CTR looks normal based on the history of the site. I’ll cover more about why I think this is happening after providing the data for the sites.

Note, you can first see an increase in CTR after the change went live and then a drop back down. But again, there was nothing that stood out based on the historical trending of the set of pages.


Case 2 – Also stable.
When analyzing the data (isolating pages that used to have FAQ snippets via regex), I noticed the same situation. FAQ snippets were removed from mobile, but not from desktop (yet). Again, Google did finally remove FAQ snippets from the desktop results on 9/13/23.

And I’m also not seeing a change in mobile clicks or CTR overall for the set of pages when filtering by urls yielding FAQ snippets before the removal (via regex). Clicks are a bit volatile, which is more of a demand situation, but CTR is stable for the set of pages.


Case 3 – Click Through Rate Drops…
Based on what I saw for the first two sites, I was fully expecting to see the others follow (with no change in clicks and CTR). But the third site I dug into was an interesting one. Mobile CTR clearly dropped when FAQ snippets were removed from the mobile search results… For this site, it could be the niche that’s the reason… It’s a YMYL (Your Money or Your Life) category and those FAQ snippets could have helped the listings stand out, provide helpful information to users, and then get more people to click through. It’s hard to say exactly why this site was an outlier, but you can clearly see the drop below.

First, desktop FAQ snippets were not impacted yet (like the others):

But for this site, CTR was impacted when filtering by urls that used to receive FAQ snippets:

Clicks do drop, but not by much. Again, clicks can be propped up by an increase in impressions (which did happen here). First, here are clicks:

And here are impressions increasing, which yields more clicks. That’s why CTR is a better indicator of impact based on the removal of FAQ snippets. And CTR did drop.


Next up, HowTo snippets:

Case 1 – Totally stable after removal.
As I mentioned earlier, HowTo snippets provided an amazing SERP treatment (especially on mobile). For example, you would often see a carousel of thumbnails for each step, the snippet would take up a large amount of screen real estate in the viewport, they were extremely visual and engaging, etc. They were hard to overlook, that’s for sure.

So how would the removal of those HowTo snippets impact clicks and CTR? Not much at all from what I’m seeing, which was surprising. As you can see below, clicks and CTR are stable through the removal on mobile.


Case 2 – Also stable.
The next site I checked revealed the same situation. I did not see any changes to clicks or CTR based on HowTo snippets being removed from the mobile SERPs.


Case 3 – Yep, also stable.
And the third site revealed the same result. Clicks and CTR were stable through the removal of HowTo snippets in the mobile SERPs. CTR drops slightly recently, but it’s not too far off based on historical trending for the set of pages.


Key points and insights:

  • For most sites I’m checking, the removal of FAQ and HowTo snippets is not having a big impact on click through rate and clicks from the search results. That was surprising to me, but it was the case for most sites I’m checking (even beyond the six I covered in this blog post).
  • That said, there are outliers with some sites seeing a drop in CTR after the snippets were removed. Note, I haven’t seen that for the removal of HowTo snippets, but I provided an example earlier for FAQ. And that change I documented could be based on niche… For example, the case I covered where CTR dropped was a site focused on a YMYL category.
  • Although FAQ were supposed to be removed on both mobile and desktop, I was only seeing them removed on mobile until recently. Google finally removed FAQ snippets from the desktop search results on 9/13/23. You can read more about that below.
  • Google also announced on 9/14/23 that HowTo snippets would be removed on desktop as well as mobile. Originally, only the mobile SERPs would be impacted, but Google seems to have reversed course and removed HowTo snippets on desktop too.


Desktop FAQ snippets finally drop out of the SERPs on September 13, 2023:
We knew that FAQ snippets on desktop would eventually be removed, and now the time has come. As of 9/13/23 I am now seeing FAQ snippets removed from the desktop search results. I checked across sites and the snippets are gone. You can see a before and after photo below.

Prior to 9/13/23:

After 9/13/23 when FAQ snippets were removed:


Summary – The Removal of FAQ and HowTo Snippets Is Not Having A Big Impact on CTR
For site owners, it’s always tough losing a special SERP treatment like FAQ and HowTo snippets, but there’s not much you can do about it. They are gone and probably not coming back. The good news is that I’m not seeing much impact to clicks and click through rate for sites that lost FAQ and HowTo snippets. I’ll keep tracking this over time and will update this post if anything changes performance-wise.

GG

Filed Under: google, mobile, seo

Why Noindexing Syndicated Content Is The Way – Tracking 3K syndicated news articles to determine the impact on indexing, ranking, and traffic across Google surfaces [Case Study]

August 4, 2023 By Glenn Gabe

Syndicated Content SEO Case Study

Last month John Shehata from NewzDash published a blog post documenting a study covering the impact of syndication on news publishers. For example, when a publisher syndicates articles to syndication partners, which site ranks, and what does that look like across Google surfaces (Search, Google News, etc.)?

The results confirmed what many have seen in the SERPs over time while working at, or helping, news publishers. Google can often rank the syndication partner versus the original source, even when the syndicated content on partner sites is correctly canonicalized to the original source.

And as a reminder, Google updated its documentation about canonicalization in May of 2023 and revised its recommendation for syndicated content. Google now fully recommends that syndication partners noindex news publisher content if the publisher doesn’t want to compete with that syndication partner in Search. Google explained that rel canonical isn’t sufficient since the original page and the page located on the syndication partner website can often be different (when you take the entire page into account, including the boilerplate, other supplementary content, etc.). Therefore, Google’s systems can presumably have a hard time determining that it’s the same article being syndicated and then rank the wrong version, or even both. More on that situation soon when I cover the case study…
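
For publishers who do get partners to agree, it’s worth verifying the noindex is actually in place. Here’s a small, hypothetical audit sketch in Python (requests and BeautifulSoup assumed, and the partner url is a placeholder) that checks a partner’s copy for noindex via the X-Robots-Tag header or a meta robots tag:

```python
# Hypothetical audit: is the partner's copy of an article actually noindexed?
import requests
from bs4 import BeautifulSoup

def is_noindexed(url: str) -> bool:
    resp = requests.get(url, timeout=10)
    # Check the HTTP header first...
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        return True
    # ...then fall back to the meta robots tag in the HTML.
    soup = BeautifulSoup(resp.text, "html.parser")
    meta = soup.find("meta", attrs={"name": "robots"})
    return bool(meta and "noindex" in meta.get("content", "").lower())

partner_copies = ["https://partner.example.com/syndicated-article/"]  # placeholder
for url in partner_copies:
    print(url, "->", "noindexed" if is_noindexed(url) else "STILL INDEXABLE")
```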

Google canonicalization help document with syndicated content recommendations.

And here is information from Google’s documentation for news publishers about avoiding duplication problems in Google News with syndicated content:

Previously, Google said you could use rel canonical pointing to the original source, while also providing a link back to the original source, which should have helped their systems determine the canonical url (and original source). And to be fair to Google, they also explained in the past that you could noindex the content to avoid problems. But as anyone working with news publishers understands, asking syndication partners to noindex that content is a tough situation to get approved. I won’t bog down this post by covering that topic, but most syndication partners actually want to rank for the content (so they are unlikely to noindex the syndicated content they are consuming).

Your conversations with them might look like this:

Syndication partners ignoring site owners.

The Case Study: A clear example of news publisher syndication problems.
OK, so we know Google recommends noindexing content on the syndication partner website and to avoid using rel canonical as a solution. But what does all of this actually look like in the SERPs? How bad is the situation when the content isn’t noindexed? And does it impact all Google surfaces like Search, Top Stories, the News tab in Search, Google News, and Discover?

Well, I decided to dig in for a client that heavily syndicates content to partner websites. They have for a long time, but never really understood the true impact. After I sent along the study from NewzDash, we had a call with several people from across the organization. It was clear everyone wanted to know how much visibility they were losing by syndicating content, where they were losing that visibility, if that’s also impacting indexing of content, and more. So as a first step, I decided to craft a system to start capturing data that could help identify potential syndication problems. I’ll cover that next.

The Test: Checking 3K recently published urls that are also being syndicated to partners.
I took a step back and began mapping out a system for tracking the syndication situation the best I could based on Google’s APIs (including the Search Console API and the URL Inspection API). My goal was to understand how Google was handling the latest three thousand urls published from a visibility standpoint, indexing standpoint, and performance standpoint across Google surfaces (Search, Top Stories, the News tab in Search, and Discover).

Here is the system I mapped out (with a rough scripted sketch after the list):

  • Export the latest three thousand urls based on the Google News sitemap.
  • Run the urls through the URL Inspection API to check indexing in bulk (to identify any type of indexing issue, like Google choosing the syndication partner as the canonical versus the original source). If the pages weren’t indexed, then they clearly wouldn’t rank…
  • Then check performance data for each URL in bulk via the Search Console API. That included data for Search, the News tab in Search, Google News, and Discover.
  • Based on that data, identify indexed urls with no performance data (or very little) as candidates for syndication problems. If the urls had no impressions or clicks, then maybe a syndication partner was ranking versus my client.
  • Spot-check the SERPs to see how Google was handling the urls from a ranking perspective across surfaces.
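
To make that concrete, here is a rough sketch of what the indexing and performance checks could look like in Python using the official google-api-python-client library. The property url, article urls, and dates are placeholders, auth and quota handling are stripped down (the URL Inspection API allows roughly 2,000 calls per property per day), and the News/Discover pulls would repeat the same query with a different type value. Treat this as a starting point, not a finished tool:

```python
# Rough sketch of the indexing + performance checks (placeholders throughout).
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
SITE_URL = "https://www.example-publisher.com/"  # hypothetical GSC property

creds = service_account.Credentials.from_service_account_file("creds.json", scopes=SCOPES)
gsc = build("searchconsole", "v1", credentials=creds)

news_urls = ["https://www.example-publisher.com/article-1/"]  # from the news sitemap

# Step 2: bulk-check indexing via the URL Inspection API.
for url in news_urls:
    info = gsc.urlInspection().index().inspect(
        body={"inspectionUrl": url, "siteUrl": SITE_URL}
    ).execute()
    result = info["inspectionResult"]["indexStatusResult"]

    # Flag urls where Google chose a different canonical (e.g., a partner's copy).
    google_canonical = result.get("googleCanonical")
    if result.get("verdict") == "PASS" and google_canonical and google_canonical != url:
        print(f"Canonical mismatch: {url} -> {google_canonical}")

# Step 3: pull performance data per url via the Search Analytics API.
for url in news_urls:
    report = gsc.searchanalytics().query(
        siteUrl=SITE_URL,
        body={
            "startDate": "2023-07-01",
            "endDate": "2023-07-31",
            "dimensions": ["page"],
            "dimensionFilterGroups": [{"filters": [
                {"dimension": "page", "operator": "equals", "expression": url}
            ]}],
        },
    ).execute()
    rows = report.get("rows", [])

    # Step 4: indexed urls with no clicks/impressions are candidates for
    # syndication problems worth spot-checking in the SERPs.
    if not rows or sum(row["clicks"] for row in rows) == 0:
        print(f"Candidate for syndication loss: {url}")
```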

No Rhyme or Reason: What I found disturbed me even more than I thought it would.
First was the indexing check across three thousand urls, which went very well. Almost all of the urls were indexed by Google. And there were no examples of Google incorrectly choosing syndication partners as the canonical. That was great and surprised me a bit. I thought I would see that for at least some of the urls.

Indexing check across recent news articles.

Next, I exported performance data in bulk for the latest three thousand urls. Once exported, I was able to isolate urls with very little, or no, performance data across surfaces. These were great candidates for potential syndication problems. i.e. If the content yielded no impressions or clicks, then maybe a syndication partner was ranking versus my client.

GSC performance data across recent news articles.

And then I started spot-checking the SERPs. After checking a number of queries based on the list of urls that were flagged, there was no rhyme or reason why Google was surfacing my client’s urls versus the syndication partners (or vice versa). And to complicate things even more, sometimes both urls ranked in Top Stories, Search, etc. And then there were times one ranked in Top Stories while the other ranked in Search. And the same went for the News tab in Search and Google News. It was a mess…

I’ll provide a quick example below so you can see the syndication mess. Note, I had to blur the SERPs heavily in the following screenshots, but I wanted to provide an example of what I found. Again, there was no rhyme or reason why this was happening. Based on this example, and what I saw across other examples I checked, I can understand why Google is saying to noindex the urls downstream on syndication partners. If not, any of this could happen.

First, here is an example of Yahoo Finance ranking in Top Stories while the original ranks in Search right below it:

Syndication partner ranking in Top Stories while the original source ranks in Search.

Next, Yahoo News ranks twice in the News tab in Search (which is an important surface for my client), while the original source is nowhere to be found. And my client’s logo is shown for the syndicated content. How nice…

Syndication partner ranking twice in the News tab of Search over the original source.

And then in Google News, the original source ranks and syndication partners are nowhere to be found:

The original source ranking in Google News over syndication partners.

As you can see, the situation is a mess… and good luck trying to track this on a regular basis. And the lost visibility across thousands of pages per month could really add up… It’s hard to determine the exact number of lost impressions and clicks, but it can be huge for large news publishers.

Discover: The Personalized Black Hole
And regarding Discover, it’s tough to track lost visibility there since the feed is personalized and you can’t possibly see what every other person is seeing in their own feed. But you might find examples in the wild of syndication partners ranking there versus your own content. Below is an example I found recently of Yahoo Finance ranking in Discover for an Insider Monkey article. Note, Insider Monkey is not a client and not the site I’m covering in the case study, but it’s a good example of what can happen in Discover. And if this is happening a lot, the site could be losing a ton of traffic…

Here is Yahoo Finance ranking in Discover:

Syndicated content ranking over the original source in Google Discover.

And here is the original article on Insider Monkey (but it’s in a slideshow format). This example shows how Google can view the two pages as different, which can cause problems with determining that they are the same article:

Original article that is being syndicated to Yahoo Finance.

And here is Yahoo Finance ranking #2 for the target keyword in the core SERPs. So the syndication partner is ranking above the original in the search results:

Syndication partner outranking the original source in Search.


Key points and recommendations for news publishers dealing with syndication problems:

  • First, try to understand indexing and visibility problems the best you can. Use an approach like I mapped out to at least get a feel for how bad the problem is. Google’s APIs are your friends here and you can bulk process many urls in a short period of time.
  • Weigh the risks and benefits of syndicating content to partners. Is the additional visibility across partners worth losing visibility in Search, Top Stories, the News tab in Search, Google News and Discover? Remember, this could also mean a loss of powerful links… For example, if the syndication partner ranks, and other sites link to those articles, you are losing those links.
  • If needed, talk with syndication partners about potentially noindexing the syndicated content. This will probably NOT go well… Again, they often want to rank to get that traffic. But you never know… some might be ok with noindexing the urls.
  • Understand Discover is tough to track, so you might be losing more traffic there than you think (and maybe a lot). You might catch some syndication problems there in the wild, but you cannot simply go there and find syndication issues easily (like you can with Search, Top Stories, the News tab, and Google News).
  • Tools like Semrush and NewzDash can help fill the gaps from a rank tracking perspective. And NewzDash focuses on news publishers, so that could be a valuable tool in your tracking arsenal. Semrush could help with Search and Top Stories. Again, try to get a solid feel for visibility problems due to syndicating content.

Summary – Syndication problems for news publishers might be worse than you think.
If you are syndicating content, then I recommend trying to get an understanding of what’s going on in the SERPs (and across Google surfaces). And then form a plan of attack for dealing with the situation. That might include keeping things as-is, or it might drive changes to your syndication strategy. But the first step is gaining some visibility of the situation (pun intended). Good luck.

GG

Filed Under: google, seo, tools

Jarvis Rising – How Google could generate a machine learning model “on the fly” to predict answers when Search can’t, and how it could index those models to predict answers for future queries [Patent]

July 13, 2023 By Glenn Gabe

Google machine learning models for predicting answers when search can't

After analyzing a Google patent related to PAA and PASF, I started reviewing other recently granted patents. And it wasn’t long before I surfaced another very interesting one regarding the use of machine learning models. The patent I just analyzed focuses on using and/or generating a machine learning model in response to a query (when Google needs to predict an answer because the standard search results could not provide an adequate one). After reading the patent multiple times, it underscored how sophisticated Google’s systems could be when needing to provide a quality answer (or prediction) for users.

Like with any patent, we never know if Google actually implemented what the patent covers, but it’s always possible. And if it was implemented, not only could Google be utilizing a trained machine learning model to help predict an answer to a query, but it could also index those machine learning models, associate them with various entities, webpages, etc., and then retrieve and use those models for subsequent related searches. Think about how powerful and scalable that could be for Google.

In addition, the patent explains that Google can return an interactive interface to the machine learning model in the search results, which enables users to add parameters which can be used to generate a prediction for queries when the search results aren’t sufficient. That part of the patent had me thinking about the message Google rolled out in the SERPs in April of 2020 when there aren’t quality search results being returned for a query. The current implementation doesn’t provide a form for users to interact with, but it sure could at some point. And maybe that interface could be used for more queries in the future versus just the more obscure ones it surfaces for now. I’ll cover more about this in the bullets below.

Google's prompt that there aren't great matches for your search

Key points from the patent:
Similar to my last post covering another recent Google patent, I think the best way to cover the details is to provide bullets containing key points.

Generating and/or Utilizing a machine learning model in response to a search request
US 11645277 B2
Date Granted: May 9, 2023
Date Filed: December 12, 2017
Assignee Name: Google LLC

Diagrams from Google's patent about using machine learning systems to generate predictions

1. Google’s patent explains that if an answer cannot be located with certainty, and the user submits a request that is predictive in nature, a trained machine learning model can be used to generate a prediction.

2. For example, Google could first generate search results based on a query, but if the results aren’t of sufficient quality, then a machine learning model can be used to provide a stronger predicted answer. So, the system can provide predicted answers based on a machine learning model when an answer cannot be validated by Google.

Google's patent explaining that machine learning models can be used when there isn't a quality answer via search

3. Also, the machine learning model can be generated “on the fly”, and Google might store trained machine learning models in a search index. Yes, Google could index machine learning models that were just trained to provide predictions based on specific types of queries. I’ll cover more about this soon.

Training machine learning models on the fly and then indexing those models for future use

4. The patent provided an example based on the query, “How many doctors will there be in China in 2050?” If an authoritative answer cannot be provided via the standard search results, then the query can be passed to a trained machine learning model to generate a prediction.

An example of utilizing a machine learning model to generate a prediction

5. The patent goes on to explain that the system might take other years like 2010, 2015, 2020, etc. and use those to generate a prediction (via a machine learning model trained on those parameters).
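
Just to make the concept tangible, here’s a toy illustration (absolutely not Google’s actual system) of that kind of extrapolation: fit a simple trend on the known years and predict a future one. The doctor counts below are made up:

```python
# Toy extrapolation example -- not Google's system, just the general idea.
import numpy as np

years = np.array([2010, 2015, 2020])
doctors_millions = np.array([2.4, 3.0, 4.1])  # made-up training data

# Fit a simple linear trend to the known years...
coeffs = np.polyfit(years, doctors_millions, deg=1)

# ...and predict the answer for a year search can't answer directly.
prediction = np.polyval(coeffs, 2050)
print(f"Predicted doctors in China in 2050: ~{prediction:.1f} million")
```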

6. The patent explains that trained machine learning models can be indexed by one or more content items from “resources utilized to train the model”. And for future queries, when the system identifies parameters that are related to a machine learning model (e.g. if a subsequent user asks a related question like, “How many doctors will there be in China in 2040?”), the machine learning model could be used to generate a prediction.

Machine learning models using parameters from a query to help generate a prediction

7. The patent goes on to explain that the machine learning models could be stored with one or more content items, like entities in a knowledge graph, table names, column names, webpage names, and more. In addition, words associated with the query like “China” and “doctors” could be used by the machine learning model to generate a prediction.
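
And here’s a minimal, entirely hypothetical sketch of that “indexed models” idea from points 6 and 7: store a trained model keyed by the content items (entities) it was trained on, then retrieve it when a related query arrives. The model coefficients and lookup logic are invented for illustration:

```python
# Hypothetical sketch: index trained models by the entities they were built on.
import numpy as np

# Pretend this linear model was trained earlier on doctors-in-China data.
china_doctors_model = np.array([0.17, -339.38])  # made-up slope/intercept

model_index = {
    frozenset({"china", "doctors"}): china_doctors_model,
}

def predict_for_query(query: str, year: int):
    terms = set(query.lower().split())
    # Retrieve a stored model whose key entities all appear in the query.
    for entities, model in model_index.items():
        if entities <= terms:
            return float(np.polyval(model, year))
    return None  # no related model -> fall back to the standard search results

print(predict_for_query("how many doctors in china", 2040))  # reuses the model
```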

8. The patent goes on to explain that the system might provide an interactive interface for users to select parameters that can be passed to the machine learning model. That can be a text field, a dropdown menu, etc. Also, the response could include a message presented to the user that the response is a prediction based on a trained machine learning model. So Google wants to make sure users understand it’s a prediction based on a machine learning model versus answers provided based on data it has indexed.

Google providing an interactive interface enabling users to add parameters that can help generate an answer

9. The trained model can then be validated to ensure the predictions are of at least a “threshold quality”. Anything below a certain threshold can be suppressed and not provided to the user. In that case, the standard search results can be displayed instead.

Validating a response from a machine learning model trying to generate a prediction

10. Beyond public search results, the patent explains that the system could be used on a private database to help companies predict certain outcomes. The patent explains, “private to a group of users, a corporation, and/or other restricted sets.” For example, an employee of an amusement park might ask, “how many snow cones will we sell tomorrow?” The system could then query a private database to understand sales of previous days, weather information, attendance data, etc., to predict an answer for the employee.

11. The patent explains that the system could provide push notifications from an “automated assistant” at some point. And just thinking out loud, I’m wondering if that could be from a Jarvis-like assistant like I explained in my post about Google’s Code Red that triggered thousands of Code Reds at publishers. 

Push notifications from a machine learning model after it generates a prediction

12. From a latency standpoint, the patent explains that there could be a delay after a user submits a query. When that happens, the standard search results could be initially displayed along with a message that “good” results are not available for the query and that a machine learning model is being used to generate a prediction. In those situations, the system could push that prediction to the user at a later time or provide a hyperlink for users to click to view the machine learning output.

13. Also, the patent says for some situations that the user would have to affirm the prompt in order for the process to continue. For example, the system might provide a message stating, “A good answer is not available. Do you want me to predict an answer for you?” Then the machine learning model would be trained only if affirmative user input is received in response to the prompt. Like I explained earlier, I see a connection with the “There aren’t great matches for your search” message that rolled out in April of 2020. I’m wondering if that could expand to utilize this model in the future…

Prompting users to generate a prediction when search can't provide a quality answer

Summary: Google could be predicting quality answers in a powerful and super-efficient way via (indexed) machine learning models.
Although we don’t know if any specific patent is being used, the power and efficiency of this process makes a lot of sense for Google. From generating machine learning models “on the fly” to indexing those models for future use to utilizing an interactive interface with push notifications, Google seems to be setting the stage for an assistant like Jarvis. So, the next time you ask Google to predict an answer, think about this patent. And you might just be prompted for more information at some point (until Jarvis can do all of this in a nanosecond). :)

GG

Filed Under: google, patents, seo
