6 Time-Saving Link Analysis Tips As We Approach Penguin 4.0 [SEO Plugins, Tools, and Resources]

Glenn Gabe

google, seo, tools

Link Analysis Tips for Penguin 4.0

The mad dash is on to clean up link profiles. Penguin 4.0 is quickly approaching and many companies are heavily analyzing their link profiles in order to flag and handle unnatural links. Penguin 4.0 is supposed to be the real-time Penguin, which is great news for SEOs and webmasters that have dealt with Penguin issues in the past. That said, you never know how Google’s algorithms are going to behave until they are actually released in the wild (as we have seen before). We’ll hopefully learn more about Penguin 4 very soon.

I’ve been helping several companies prepare for the arrival of our cute, black and white, allegedly real-time, Antarctic friend. Yes, go ahead and say that five times fast. :) And while helping those clients, I’ve received a lot of questions about how to save time while analyzing links. For example, which tools, tips, and resources can help as you go through the process of analyzing a link profile? So, based on those questions, I’ve decided to write this post to cover six time-saving tips. My hope is they can make your SEO life a little easier as you fight unnatural links.

Below, I’ll cover plugins, resources, SEO tools, and Excel tips. Here we go… the icy waters of Penguin await. :)

SpamFlag Chrome Extension
I was part of the beta when SpamFlag first launched, and I’ve loved the plugin ever since. SpamFlag is a Chrome extension that enables you to quickly identify a link on a webpage that’s pointing to a specific domain. You can enter the domain in the plugin settings, along with additional project domains (so you can list several to surface for while analyzing a webpage). Then when you visit a url, SpamFlag will highlight links to the domain(s) in question. It’s a huge timesaver while analyzing unnatural links.

SpamFlag Chrome Settings

SpamFlag provides a main notification menu at the top, which shows you if there is a link to the domain you are analyzing, how many of the links are nofollowed, how many are followed, and then a total number of links found (based on the domain you are analyzing). You can also click the link-type in the menu to hop down to the first link on the page that was flagged. For example, by clicking the “followed links” category in the top menu, you will be taken to the first followed link on the page that’s pointing to the domain you are analyzing.

SpamFlag Menu

Each link is highlighted in yellow, so you can easily spot them on the page. That’s very helpful, especially when you are digging through thousands of links. When you hover over the highlighted link, you can view the html code for the link (which enables you to truly see if it’s nofollowed, the destination url on the domain you are analyzing, etc.) Again, it’s a big time saver versus choosing “view source” and then digging through the source code.

SpamFlag Link Information on Hover

And last, but not least, you can view small highlighted bars on the right side of the page (near the scrollbar in the browser) that show you where the links are located on the page. And you can click those highlighted bars to jump to that specific link. Awesome.

SpamFlag Browser Bars

So, if you are working on many link analysis projects, then I highly recommend checking out SpamFlag. You might love it as much as I do.

Marie Haynes Disavow Blacklist
If you’re neck deep in SEO, and you have read about unnatural links, then you are probabaly familiar with Marie Haynes. Marie has been heavily involved with both manual actions and Penguin work and has completed a boatload of link audits.

Given what I just explained, it was great of her to build a tool that enables anyone to check a domain against her own blacklist. Just enter a domain and click “Check it!” Marie’s tool will return a message explaining more about the domain, what her recommendation is, etc. There are several classifications of links that Marie covers in her responses.

Marie Haynes' Blacklist

When performing link audits, you’ll come across both highly spammy domains and clean domains. But in SEO, there are often many shades of gray. For example, you might come across a domain that seems a little spammy, but you’re not sure if you should use the domain operator or just disavow the url. For times like that, checking Marie’s blacklist can be very helpful. Again, Marie has come across many domains during her Penguin travels, so there’s a good chance she has flagged many you are coming across too.  And I know she continually updates the database, which is great.

Which Sources Of Links Should I Use?
Whenever link analysis comes up, I’m always asked which sources of links to use. Is it ok to just use Google Search Console (GSC)? Should you include link analysis tools like Majestic? How much is too much? These are all great questions and it’s important to make sure you cover your bases.

First, you definitely don’t want to miss any unnatural links. That can happen if you simply use one or two tools for collecting links. Also, Google Search Console (GSC) only provides up to 100K of your links per “site” verified in GSC, which isn’t sufficient when you are analyzing a site with hundreds of thousands, or millions, of links. Sure, there are ways to squeeze more out of GSC, but it’s a little tedious. For example, verifying directories in GSC and then exporting links based on directory. That’s totally doable, but will require some additional work. And it’s still limited to 100K per “site” in GSC.

So, I usually recommend exporting links from the following sources:

  • Google Search Console (GSC)
    Hey, it’s Google, definitely start here! Just make sure to export both sampled links and latest links. Then dedupe those. I’ll explain how to do that soon.
  • Bing Webmaster Tools (BWT)
    Yes, Bing is a search engine too, so don’t overlook it.
  • Majestic
    To me, it’s the best link analysis tool on the market and provides a wealth of link data.
  • ahrefs
    A close second to Majestic. It’s a great link analysis tool that also provides a wealth of link data.
  • Open Site Explorer (OSE)
    Typically, it doesn’t contain the most in-depth view of your link profile, but it’s a good supplemental source of links. Remember, you’re trying to cover as many links to your site as possible. Adding OSE is a good idea.

Note, I recommend keeping a spreadsheet that contains all sources of links, but keep each source in its own worksheet. So you should have six different worksheets in your spreadsheet (one per source, including one for GSC sampled and one for GSC latest). Then you’ll want to create a master list in another worksheet (more about that shortly).

Link Sources for Analysis of Unnatural Links

By gathering links from all of the sources listed above, you can rest assured you are covering your bases. There shouldn’t be many link surprises like there can be if you simply choose one or two of those sources. Now, once you download your links, you’ll likely be staring at a boatload of link data. Don’t worry, you can cut that list down pretty quickly. I’ll cover a required link tactic next – deduplication.

Dedupe In Excel
Once you gather all of your links from across sources (as documented above), you’ll definitely want to create a master list to analyze. For example, there will probably be many overlapping links across sources, so you don’t want to waste your time analyzing the same links over and over.

Enter Excel, the Swiss Army knife for SEOs, less the bottle opener. Once you create a worksheet that contains all links from across sources, simply click the Data tab, and then Remove Duplicates. Once you do, Excel will prompt you to choose the column to base the deduplication on. Once you choose the column containing your links, and click OK, Excel will quickly remove any duplicates and inform you how man it found. Voila. :)

Dedupe Functionality in Excel

URL Profiler
URL Profiler is of my favorite SEO tools, especially when performing link analyses. You can do many things with URL Profiler, but I’ll focus on how it helps with analyzing and sorting links.

After gathering links from across data sources, you’ll notice that the data from Google Search Console does not contain any information beyond the actual link. For example, you don’t have anchor text, target url, or any other piece of valuable data that comes from link analysis tools like Majestic, ahrefs, and Open Site Explorer. That makes it tougher to analyze links from GSC.

Enter URL Profiler. Once I dedupe all of the link data, I run that deduped list through URL Profiler. By doing so, I receive a boatload of data back, including anchor text, target url, classification of links, whether it’s nofollowed, header response codes, the root domain extracted for me, etc. The additional fields of data you receive from URL Profiler are extremely helpful while analyzing links.

Also, before crawling your links, you can add domains to a whitelist, blacklist, and you can import your disavow file (to focus the crawl on what you need). Yes, more time-saving features.

URL Profiler for Link Analysis

But it gets even better. URL Profiler breaks your links into specific worksheets by category (based on what it finds during the crawl). For example, you will find worksheets titled unnatural, review, ignore, none (no link found), etc. Note, I still believe you should manually analyze all links, since no system is perfect. But, the breakdown enables you to start with the riskiest links and move your way through the spreadsheet.

URL Profiler Exported Tabs

Bonus: Concatenation in Excel And The Last Mile For Disavow Files
Last, but not least, let’s cover one of my favorite functions in Excel – concatenation. The concatenation function enables you to combine strings. For our purposes, you will inevitably want to add the domain operator (domain:) to the final list of domains you flag during your analysis. Then, the final list can be copied to your disavow file.

So instead of copying and pasting “domain:” before every domain you want to include, you could simply create a new column that will automatically do this for you. Let’s say you have a final list of domains to disavow in a worksheet (and they are in column A). In column B, add the following formula:

=CONCATENATE(“domain:”, A2)

That will prepend “domain:” before the domain name listed in column A. Then simply hover over the bottom-right corner of the cell containing your formula and double click. Now that formula will be copied to every row in that column (and you will have a list of domains to disavow). All you will have to do is copy and paste that column into your disavow file.

Concatenation in Excel for SEO

Summary – Save Time While Preparing For Penguin 4.0
Performing a deep link analysis is extremely time consuming. Therefore, it’s important to save time any way you can. In this post, I covered important tools, resources, and tips that can save SEOs time, from working in Excel to choosing link sources to using third party tools like URL Profiler. The more time you save, the more links you can get through. And the more links you can flag, the greater chance you have at keeping Penguin at bay. And that’s always a good thing.