Penguin 2.1 Analysis and Findings (Based on the October 4, 2013 Update)

Penguin 2.1 Released on October 4, 2013

On Friday, October 4th at 4:50PM, Matt Cutts announced that Penguin 2.1 was rolling out. It was shortly after reading that update that I tweeted we could be in for an icy weekend (as the latest update would surely take out more websites). But on the flip side, a new update also meant that more companies could recover from previous Penguin updates. Needless to say, I was eager to begin analyzing the impact of our cute, black and white friend.

If you have followed my blog over the past few years, then you know I do a lot of Penguin work. So it should be no surprise that I’ve dug into Penguin 2.1 victims to determine new findings, insights, etc. I fired up my computer at 5:30AM on Saturday morning and started digging in. Since then, I have analyzed eleven websites (update: now 26 sites) hit by Penguin 2.1 and have provided my findings below.

Note that the minor version number (2.1) signals that the core Penguin algorithm hasn’t been updated since 2.0, but that the data has been refreshed. Let’s cover what I found during my research.

Fresh Dates, Fresh Spam
While analyzing sites hit by Penguin 2.1, I wanted to check the creation dates for the unnatural links I was coming across. For most of the sites, the links were first found in late spring 2013, and many were found during the summer. That makes complete sense, since the sites in question weren’t hit by Penguin 2.0 on May 22. Instead, they were hit with this latest Penguin update on October 4, 2013. So, it does seem like fresher unnatural link data is being used.
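A date check like this one can be sketched in a few lines of Python. This is only an illustration: the `first_seen` field and the sample links are assumptions, standing in for whatever "first found" date your link tool exports.

```python
from datetime import date

# Rollout dates mentioned in this post.
PENGUIN_2_0 = date(2013, 5, 22)
PENGUIN_2_1 = date(2013, 10, 4)


def links_first_seen_between(links, start=PENGUIN_2_0, end=PENGUIN_2_1):
    """Return links first seen after `start` and on or before `end`."""
    return [link for link in links if start < link["first_seen"] <= end]


# Hypothetical sample data; real exports will have different columns.
sample_links = [
    {"url": "http://forum.example/thread/1", "first_seen": date(2013, 3, 10)},
    {"url": "http://directory.example/listing", "first_seen": date(2013, 6, 18)},
    {"url": "http://blog.example/post#comment-9", "first_seen": date(2013, 8, 2)},
]

fresh = links_first_seen_between(sample_links)
for link in fresh:
    print(link["first_seen"], link["url"])
```

If most of the flagged links land in that window, it supports the idea that fresher link data drove the hit.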

Categories of Unnatural Links Targeted
Since Penguin 2.1 launched, many people have been asking me if the unnatural link footprint has changed since previous Penguin updates. Specifically, they want to know what types of unnatural links are showing up for the sites that have been hit by Penguin 2.1. I have listed my latest findings below (and yes, some familiar unnatural link types showed up):

Forum Spam
I have seen forum spam targeted before, but I saw it heavily targeted with this update. For example, comments in forum threads that used exact match anchor text links pointing to websites trying to game the system. During my latest analyses, I saw a lot of forum spam mixed with forum bio spam, which is covered next.
Forum Bio Spam
During my research, I saw forum bios filled with exact match anchor text pointing at sites hit by Penguin 2.1. This is where the “linkbuilder” set up a profile on a forum, only to use that profile to gain exact match anchor text links to the target website. I also saw bios targeting multiple websites with exact match anchor text. This was obviously an attempt to maximize the forum bio to help multiple sites rank. More about multiple sites soon.
Do-Follow Blogs
I have seen links from various resources that identify do-follow blogs. A do-follow blog is one that doesn’t add nofollow to links posted (even blog comment signatures in some cases). The do-follow blog resources are problematic on several levels. First, they act like a directory using exact match anchor text linking to do-follow blogs. Second, they absolutely flag certain blogs as being a resource for rich anchor text links (which can send Google down an icy path). Let’s face it, being listed on do-follow resource sites can absolutely send Google a signal that you are trying to game links. Also, simply finding do-follow blogs and dropping links is not linkbuilding. If you are doing this, and you got hit by Penguin 2.1, then remove those links as soon as possible.
Blogroll Spam
OK, an old friend shows up in the list… Just like with Penguin 1.0 and 2.0, spammy blogroll links showed up on sites hit by Penguin 2.1. This shouldn’t be a shock to anyone involved in SEO, but should reinforce that blogrolls can be extremely problematic when they are on the wrong sites. I believe John Mueller from Google is on record saying that blogrolls overall aren’t bad, but it’s how they are used that can trigger a problem. I’ve always believed the same thing. If you have many blogroll links from questionable sites, then I highly recommend attacking them (nuking them, nofollowing them, or disavowing them). But again, some may be fine. If you are unsure which ones are bad versus good, ask for help from a seasoned SEO.
Spammy Directories
Another old Penguin friend showed up during my research. Similar to what I explained above with blogroll spam, certain directories are absolutely Penguin food. If you have used this tactic in the past, and still have links out there in spammy directories, then nuke them, have them nofollowed, or disavow them. I’ve seen this category of links show up so many times during my research across Penguin updates, it’s not even funny. Beware. In addition, I found several sites with millions of inbound links, and many of those were across spammy directories. Let me tell you… if you want to flag your own site, go ahead and build over 2M inbound links from spammy directories. You’ll get a knock on the door from Mr. Penguin. That’s for sure.
Blog Comment Signature Spam
I came across numerous instances of blog signatures using exact match or rich anchor text. What’s interesting is that Google seems to target these links, even when they aren’t followed links (most blogs nofollow signature links, and I saw this was the case during my research of sites that were hit by Penguin 2.1). So, it seems if you were using exact match anchor text as your blog comment signature, then it could be targeted by Penguin (even when those links are nofollowed).
(Update) Classified Ads Spam
As I analyzed more sites hit by Penguin 2.1, I saw low-quality classified websites show up with links pointing at destination sites. Classified ad listings were used to drop exact match anchor text links, and sometimes in great volume. For some sites I was analyzing, there were hundreds of pages showing from each domain with links to their websites (from the classified ad websites). I’ve analyzed many sites hit by Penguin (historically), and haven’t come across many classified websites showing up in the various link profiles. But with 2.1, I saw this a number of times.
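Across all of these categories, the common thread is exact match anchor text, often repeated many times from the same domain. As a rough sketch of how to surface that pattern in a link export, the Python below counts exact match anchors per linking domain. The field names (`source_url`, `anchor`) and the keyword list are assumptions for illustration, and this is not a description of how Penguin itself evaluates links.

```python
from collections import Counter
from urllib.parse import urlparse


def exact_match_by_domain(links, target_keywords):
    """Count links per source domain whose anchor exactly matches a keyword."""
    keywords = {k.strip().lower() for k in target_keywords}
    counts = Counter()
    for link in links:
        if link["anchor"].strip().lower() in keywords:
            domain = urlparse(link["source_url"]).netloc.lower()
            counts[domain] += 1
    return counts


# Hypothetical sample data.
links = [
    {"source_url": "http://classifieds.example/ad/1", "anchor": "cheap blue widgets"},
    {"source_url": "http://classifieds.example/ad/2", "anchor": "Cheap Blue Widgets"},
    {"source_url": "http://news.example/story", "anchor": "Example Widgets Inc."},
    {"source_url": "http://forum.example/profile/77", "anchor": "cheap blue widgets"},
]

counts = exact_match_by_domain(links, ["cheap blue widgets"])
for domain, n in counts.most_common():
    print(domain, n)
```

Domains at the top of that list are good candidates for a closer manual look. A count alone doesn't prove a link is unnatural.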

“Linking” Victims Together Via Shared Tactics
One interesting finding I picked up during my analyses was the lumping together of victims. I noticed forum comment spam and forum bio spam that contained multiple sets of exact match anchor text links (to two different sites). That even helped me find more Penguin 2.1 victims… as I didn’t know about the second site being hit by Penguin until I found the first one during my research.

So, I’m wondering if Google was able to identify additional targets since they were associated with initial targets. This would be a brilliant approach for situations where multiple sites were targeted via unnatural links. It would be a solid example of Google targeting like-minded linkbuilders via evidence it picks up during its crawls. I can’t say for sure if the other sites would have been targeted anyway by Penguin 2.1, but I saw this numerous times during my latest research of Penguin 2.1.

Deeper Pages Targeted, P2.0-style
Content-wise, deeper pages were targeted by Penguin 2.1, just like Penguin 2.0. And since this is a minor update of 2.0, that makes complete sense. I’m referring to the target pages of unnatural links on sites hit by Penguin 2.1. In case you aren’t familiar with what I’m referring to, Penguin 1.0 targeted links to the homepage of a website, whereas Penguin 2.0 targeted links to any page on the website.

When Matt Cutts first explained this after Penguin 2.0 launched on May 22, it made complete sense to me. That’s because I had Penguin 1.0 victims ask me why their competitors weren’t also targeted initially by Penguin. It turns out their competitors had targeted many pages within their own sites versus just driving unnatural links to their homepages. But in true Google algo fashion, those additional, deeper pages were eventually targeted. And many sites got hit by Penguin 2.0 (and now 2.1).
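To see how your own unnatural links split between the homepage and deeper pages, a quick sketch like the following can help. It assumes you have a list of target URLs (the pages on your site that the flagged links point to), and it treats an empty or "/" path with no query string as the homepage, which is a simplification.

```python
from urllib.parse import urlparse


def is_homepage(target_url):
    """Treat an empty or root path with no query string as the homepage."""
    parsed = urlparse(target_url)
    return parsed.path in ("", "/") and not parsed.query


def split_targets(target_urls):
    """Split target URLs into (homepage_links, deep_page_links)."""
    home = [u for u in target_urls if is_homepage(u)]
    deep = [u for u in target_urls if not is_homepage(u)]
    return home, deep


# Hypothetical target URLs for flagged links.
targets = [
    "http://www.example.com/",
    "http://www.example.com",
    "http://www.example.com/products/blue-widgets",
    "http://www.example.com/category?id=7",
]

home, deep = split_targets(targets)
print(len(home), "homepage links;", len(deep), "deep page links")
```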

How To Recover:
My advice has not changed since Penguin 1.0. If you have been hit by Penguin, then you need to take the following steps. And you need to be aggressive with your approach. If you put band-aids on your Penguin situation, then you might not remove enough links to recover. And if that’s the case, then you can sit in the gray area of Penguin, never knowing how close you are to recovery.

Download Your Links – Download all of your links from multiple sources, including Google Webmaster Tools, Majestic, Open Site Explorer, etc.
Analyze Your Links – Analyze and organize your links. Identify which ones are unnatural and flag them in your spreadsheets for removal.
Attack the Links – Form a plan for removing unnatural links. I always recommend removing as many as you can manually, and then disavowing the remaining links.
Removal of Pages is OK – You can also remove pages from your site that are the target pages of spammy links. John Mueller from Google confirmed that removing pages is the same as removing the links. Of course, if those are important pages (like your homepage!), then you can’t remove them.
Head Down, Keep Driving Forward – Once you have completed your link cleanup, then you’ll need to wait for another Penguin update. Note, I have seen several Penguin recoveries during Panda updates, so that’s always a possibility.
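For the disavow step, the file Google accepts is plain text with one entry per line: a full URL, a `domain:` entry to cover an entire domain, or a comment line starting with `#`. Here is a minimal sketch that builds such a file from links you have already flagged by hand. The function name and the sample URLs are hypothetical, and deciding which links are unnatural remains a manual job.

```python
from urllib.parse import urlparse


def build_disavow(flagged_urls, whole_domains=()):
    """Build disavow file text from flagged URLs and fully disavowed domains."""
    domains = sorted({d.lower() for d in whole_domains})
    lines = ["# Links flagged as unnatural during manual review"]
    lines += [f"domain:{d}" for d in domains]
    seen = set()
    for url in flagged_urls:
        host = urlparse(url).netloc.lower()
        # Skip URLs already covered by a domain: entry, and skip duplicates.
        if host in domains or url in seen:
            continue
        seen.add(url)
        lines.append(url)
    return "\n".join(lines) + "\n"


# Hypothetical flagged links merged from several link sources.
text = build_disavow(
    [
        "http://spamdir.example/listing/9",
        "http://blog.example/post-1",
        "http://blog.example/post-1",
    ],
    whole_domains=["spamdir.example"],
)
print(text)
```

Using `domain:` entries for sites that are spammy across the board keeps the file short and catches pages you haven't found yet.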

Summary – Dealing With Penguin 2.1
That’s what I have for now. I will continue to analyze more sites hit by Penguin 2.1 and will try to write follow-up posts based on my findings. If you have been hit, you need to move aggressively to fix the problem. Be thorough, be aggressive, and move quickly. That’s the best way to recover from Penguin. Good luck.
