A recent appearance by Mark Zuckerberg on the Joe Rogan podcast provides a great explanation of how classifiers, confidence levels, and precision work in machine learning systems. He also covers how adjusting confidence levels can impact the amount of collateral damage when updates roll out. And if you swap “Social” for “Search”, Zuckerberg could be talking about any major Google algorithm update.

I recently listened to the Joe Rogan podcast with Meta’s CEO Mark Zuckerberg to hear Mark’s thoughts about AI, augmented reality, the future of social media, and more. Although Mark heavily discussed topics related to social media, one part stuck out to me from an SEO standpoint. Mark talked about building systems to identify harmful content, creating classifiers to flag that content, and setting confidence and precision levels (and how those levels can impact collateral damage).
He clearly and concisely explained how that works, and I immediately thought it could help site owners better understand how machine learning classifiers work for Search. And from a Google standpoint, it could help site owners and SEOs understand why some sites impacted by major algorithm updates probably shouldn’t be.
Over time, there have been some famous punitive Google algorithm updates that have used classifiers to identify sites that were problematic from a quality standpoint. And yes, there has been collateral damage along the way. Understanding how classifiers work in machine learning underscores the challenge that Google has in reeling in bad actors algorithmically, while also highlighting the scary reality of collateral damage. And that collateral damage can be devastating for site owners impacted. I’ll cover more about that later in this post.
First, I’ll provide the video clip of Mark below. Then I’ll cover how this translates to SEO, Google’s algorithm updates, and some of the most famous punitive algorithm updates Google has released over time. I’ll also cover how this relates to Google’s broad core updates, which typically roll out 4-5 times per year and can have a huge impact on search visibility across sites.
The Video Clip: Classifiers, Confidence Levels, and Collateral Damage.
Below at 30:45 in the video, Mark discusses the challenges of building systems and classifiers to target dangerous content. For site owners and SEOs out there, you can swap out “dangerous content” for any type of quality problem that Google wants to tackle from a machine learning standpoint. For example, “unhelpful content” or “spammy content”.
How this translates to SEO and Google algorithm updates.
As you can see from Mark’s comments, there is always a struggle between confidence levels and collateral damage. For example, if you require a 99% confidence level from a classifier, then you might only take down a small percentage of low-quality content (or whatever problem you are targeting). But if you lower the confidence level to, say, 90%, then you might pick up 60% of the low-quality content, but with much more collateral damage.
And even at 90% confidence, that means 10% of the sites flagged (and potentially impacted by an algorithm update) could be collateral damage. And when you multiply that across all of the sites on the web, or within a category being targeted, it could translate to a lot of sites that aren’t actually problematic being impacted by an algorithm update.
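To make that trade-off concrete, here is a minimal sketch in Python. The scores, labels, and thresholds are completely made up for illustration (this is not how Google actually scores sites), but it shows how raising or lowering a classifier’s confidence threshold changes both how much of the problem gets caught and how much collateral damage occurs.

```python
# Minimal sketch: how a confidence threshold trades coverage against
# collateral damage for a hypothetical "low-quality content" classifier.
# The scores and labels below are invented, purely for illustration.

sites = [
    # (classifier score, is the site actually low quality?)
    (0.99, True), (0.97, True), (0.95, False),  # a high-scoring false positive
    (0.93, True), (0.91, True), (0.88, False),
    (0.85, True), (0.80, True), (0.72, False),
    (0.65, True), (0.55, True), (0.40, False),
]

def evaluate(threshold):
    """Flag every site scoring at or above the threshold and measure the results."""
    flagged = [(score, bad) for score, bad in sites if score >= threshold]
    true_positives = sum(1 for _, bad in flagged if bad)
    false_positives = len(flagged) - true_positives      # collateral damage
    total_bad = sum(1 for _, bad in sites if bad)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / total_bad                   # share of the problem caught
    return precision, recall, false_positives

for threshold in (0.99, 0.90, 0.60):
    precision, recall, collateral = evaluate(threshold)
    print(f"threshold {threshold:.2f}: precision {precision:.0%}, "
          f"recall {recall:.0%}, collateral damage: {collateral} site(s)")
```

In this toy example, demanding very high confidence flags almost no innocent sites but catches only a sliver of the real problem, while lowering the bar catches much more of the problem at the cost of more collateral damage. That’s the exact tension Mark described.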
When you think about major algorithm updates of the past like Panda, Penguin, the product reviews updates (PRU), and the helpful content update (HCU), they all used classifiers. Google’s Pirate algorithm also uses a classifier, and from what we know, that update still runs on a regular basis. Now, we don’t know the confidence levels Google has employed over time with those systems and updates, but there clearly have been cases of collateral damage.
The most recent example of that is the September 2023 helpful content update, or what I call the HCU(X). Google has even explained to some creators who were heavily impacted by that update that there was nothing wrong with their sites. Note, Google isn’t saying ALL sites impacted by the September 2023 helpful content update were fine, just that some were collateral damage.
Then with the March 2024 broad core update, Google baked the HCU into its core ranking systems, got rid of the standalone classifier, and explained that several core systems now assess the helpfulness of content. Unfortunately, many of the sites impacted by the September 2023 HCU(X) still haven’t significantly bounced back (or if they have, they haven’t sustained that recovery for an extended period of time). Some have surged during broad core updates, only to drop back down during subsequent algorithm updates.
For example, here is a site that was heavily impacted by the September HCU(X) and has surged during subsequent core updates, only to drop back down to an extent.
The first screenshot shows a drop with the March 2024 core update, a surge with the August core update, then a drop back down. Then it surges again with the November core update only to drop with the December core update. But this doesn’t even give you the full picture. Check the next screenshot for that…

The next screenshot shows the true effect of the September helpful content update. There was a massive drop with the September HCU(X) and then more in March 2024, and then the surges and drops I covered above. Needless to say, it’s not pretty…

Google also explains in its guide to Search ranking systems that “site-wide signals and classifiers” are used and contribute to its understanding of pages. So there could be any number of classifiers used in Google’s core ranking systems that are trying to target some type of issue or condition. That means adjusting confidence levels for classifiers could yield collateral damage during broad core updates, as well as during specific major algorithm updates like reviews updates, Pirate, the HCU of the past, Penguin of the past, etc. For core updates, there are many systems working together, so adjusting the confidence level of one classifier might not be as acute as with one of the specific algorithm updates mentioned above. That said, you never know how powerful one of those classifiers is…
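To illustrate that last point, here is a small hypothetical sketch in Python. The classifier names, scores, and weights are all invented (they are not Google’s), but it shows why retuning one classifier inside a blended system of many signals tends to move the overall result far less than when that classifier drives an update on its own.

```python
# Minimal sketch: when several hypothetical classifiers feed one blended
# quality assessment, retuning a single classifier moves the overall result
# less than when that classifier drives an update on its own. All names,
# scores, and weights are invented, purely for illustration.

classifier_scores = {
    "helpfulness": 0.55,
    "spamminess": 0.20,
    "review_quality": 0.40,
    "experience_signals": 0.35,
}
weights = {name: 0.25 for name in classifier_scores}  # equal weights, for simplicity

def aggregate(scores):
    """Blend the individual classifier scores into one site-level score."""
    return sum(weights[name] * score for name, score in scores.items())

baseline = aggregate(classifier_scores)

# A standalone update driven by one classifier: its retuning hits full force.
standalone_before = classifier_scores["helpfulness"]
standalone_after = 0.75

# The same retuning inside the blended system: the impact is diluted.
blended_after = aggregate({**classifier_scores, "helpfulness": standalone_after})

print(f"Standalone classifier: {standalone_before:.2f} -> {standalone_after:.2f} "
      f"(change {standalone_after - standalone_before:+.2f})")
print(f"Blended core score:    {baseline:.2f} -> {blended_after:.2f} "
      f"(change {blended_after - baseline:+.2f})")
```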

Yo-Yo Trending and Confidence Levels:
There are many examples of sites surging and dropping over time during broad core updates, and some of that could absolutely be due to Google tinkering with classifiers and confidence levels. Regarding Google’s broad core updates, I’ve always explained that site owners should get clearly out of the gray area when it comes to “quality”, and this is one reason why.
If you are on the border quality-wise (in the gray area) and Google tinkers with the confidence levels of several classifiers used in its core ranking system, then you could surge with one update and then drop heavily with the next. And as long as a site remains in the gray area quality-wise, yo-yo trending can continue to happen over time as Google adjusts the confidence levels of those classifiers.
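As a toy example of that dynamic, here is a short sketch (same hypothetical classifier convention as the earlier snippet, with invented numbers) showing how a borderline site can flip between flagged and not flagged as the threshold is adjusted from update to update.

```python
# Minimal sketch: why a "gray area" site can yo-yo as confidence thresholds
# are tweaked between updates. The score and thresholds are hypothetical,
# purely for illustration (higher score = the classifier is more confident
# the site is low quality).

site_score = 0.62  # a borderline site, hovering near the flagging cutoff

# Hypothetical thresholds used by one classifier across successive updates
updates = [
    ("Update 1", 0.65),  # site clears the bar -> holds or surges
    ("Update 2", 0.60),  # threshold lowered   -> site gets flagged and drops
    ("Update 3", 0.64),  # threshold raised    -> site recovers
    ("Update 4", 0.55),  # lowered again       -> site drops again
]

for name, threshold in updates:
    flagged = site_score >= threshold
    print(f"{name}: threshold {threshold:.2f} -> "
          f"{'flagged (drops)' if flagged else 'clears (holds or recovers)'}")
```

Nothing about the hypothetical site changes between those updates; only the threshold moves, yet its visibility bounces up and down.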
Here is a good example of a site surging and dropping like mad with various updates, including product reviews updates and broad core updates (and then even unconfirmed updates). Talk about yo-yo trending…

There are obviously other reasons this type of trending could happen, but it makes complete sense that altering the confidence levels of classifiers could cause a lot of volatility for sites that are on the border of getting flagged by the classifier. And that can definitely cause yo-yo trending over time. Again, it’s just another reason to get clearly out of the gray area of Google’s algorithms from a quality standpoint. That’s why my recommendation has always been to significantly improve quality over time using the “kitchen sink” approach to remediation. Don’t cherry pick changes. Instead, address as many issues as you can that might be impacting Google’s assessment of quality. That’s the best path forward based on my experience helping many companies that have been negatively impacted by major algorithm updates.

The future of broad core updates is… no broad core updates.
On the topic of broad core updates, yo-yo trending, and collateral damage, I’ve been saying for a while that the future of broad core updates is… no broad core updates. I believe we’ll see a time when Google’s core ranking system can be updated and refreshed on a regular, ongoing basis without a specific date when updates roll out. When that happens, sites will probably not see a massive drop on one date (or during a relatively short rollout). Instead, a site might drop gradually over time as various systems are frequently updated.
On the one hand, site owners should technically be able to reverse that trend more quickly than waiting for subsequent broad core updates to roll out. On the other hand, it might not be immediately apparent what’s happening when they are dropping. A continuous approach would also keep site owners from screaming bloody murder when a broad core update rolls out, and Google would like nothing more than to stop that from happening. It’s a bad look for Google: those drops always end up in the digital marketing news and sometimes even reach the mainstream news.
Oh, and Google recently explained at the Search Central Live event in Zurich that it would like to release more core updates, more frequently. And Danny Sullivan explained, “…the goal is to make updates routine and continuous, so they’re no longer seen as major events”.
So basically, what I just covered above. :)
Here is a tweet from Jonathan Jones, who attended Search Central Live in Zurich:

Summary: Confidence levels dictate the amount of collateral damage.
I found Mark Zuckerberg’s comments about building systems to identify harmful content super interesting, especially since that process relates to how Google’s search systems work. As Google tries to tackle quality problems in Search, it can build systems to target those problems. Then Google needs to figure out the optimal confidence level for the classifier to determine how much of the problem it catches, and then how much collateral damage there is. It’s a tough situation and clearly causes big problems for site owners who are wrongly caught in the crosshairs.
I hope you found the video, and this post, helpful. Again, Mark Zuckerberg provided a concise explanation of how classifiers work. Now just swap “Social” for “Search” and he could be talking about any major Google algorithm update.
GG