The Internet Marketing Driver


Archives for February 2016

How To Test Articles Using The Google News Troubleshooting Tool (via the Google News Publisher Center)

February 19, 2016 By Glenn Gabe 2 Comments

How To Use The Google News Troubleshooting Tool

For news publishers, being included in Google News is a big deal. You can drive a lot of traffic by being listed in Google News, which means you can drive a lot of incremental ad revenue. So, it’s tough when certain urls are not accepted in Google News and publishers lose out on traffic, revenue, and prospective subscribers.

There are a number of potential Google News errors that can cause problems for publishers. And based on helping several large-scale publishers, I find that many of the people involved don’t know the various types of errors or how to track them down for a given site. Until last week, publishers would typically only find these errors via Google Search Console (GSC) under the Crawl Errors reporting. There is a tab for News errors that lists each error by category. See below.

Google News Errors in Google Search Console (GSC)

That’s great, but the problem with relying only on that approach is that it’s reactive versus proactive. In other words, if you simply check the News tab in the Crawl Errors reporting, you are dealing with errors after the fact, instead of testing your urls early on to ensure they will be accepted in Google News.

If only Google News provided some type of testing tool… Oh wait, it does! And it was released last week. It’s called the Google News Troubleshooting Tool and I’m going to cover what it is and how news publishers can use it in this post.

Introducing the Google News Troubleshooting Tool
If you are included in Google News, chances are you already know about the Google News Publisher Center. It’s a tool that enables you to provide Google with more information about your website, including which sections to include in Google News and which labels to associate with your content. The sites added in the publisher center are tied to verified properties in Google Search Console (GSC), so that’s obviously a prerequisite.

When you log into the publisher center, you will see all of the sites you have access to, including whether or not they are included in Google News.

The Google News Publisher Center


The Troubleshooting Tool, A Valuable Debugging Tool In Your Newsroom Arsenal
Once in the Google News Publisher Center, you’ll see a new section labeled “Troubleshooting”. Under that heading, you will see two links, one for “Sections” and the other for “Articles”.

The Troubleshooting Tool in the Google News Publisher Center

A section is an area of your site where you consistently publish news content. For example, maybe you publish news content in the /news/ directory of your site. That would be a section. Note, you can add multiple sections in the publisher center. I’ll explain how to use the “Sections” link for troubleshooting soon.

Under the “Sections” link in the troubleshooting tool, you’ll see “Articles”. An article is what you think it is… a specific piece of news content on your site (the url). You would use the “Articles” link to test a specific url for Google News extraction problems.

Now let’s troubleshoot some Google News errors. :)

How To Use The Google News Troubleshooting Tool To Test Sections and Articles:
OK, let’s say you just published some important news content and you want to make sure it ends up being included in Google News. In addition, you are using a Google News sitemap and you have added sections to the Google News Publisher Center.

The first thing you would want to do is to click the “Sections” link in the Troubleshooting tool. Since we want to test multiple articles, the most efficient way to tackle this is to enter the Google News sitemap url or add the section to the form. The tool will retrieve the first 100 articles based on the section or sitemap you entered.

Troubleshooting Sections in Google News
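
If you want to see exactly which urls the tool will pull from your news sitemap (or spot-check them yourself before testing), here is a minimal Python sketch that parses a Google News sitemap and lists the article urls and titles. The sitemap url is a hypothetical placeholder, so swap in your own.

```python
# Minimal sketch: list the article urls (and titles) from a Google News sitemap.
# The sitemap url below is a hypothetical placeholder -- use your own.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.example.com/news-sitemap.xml"

NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}

with urllib.request.urlopen(SITEMAP_URL) as response:
    root = ET.parse(response).getroot()

# The troubleshooting tool retrieves the first 100 articles, so mirror that here.
for url in root.findall("sm:url", NAMESPACES)[:100]:
    loc = url.findtext("sm:loc", namespaces=NAMESPACES)
    title = url.findtext("news:news/news:title", namespaces=NAMESPACES)
    print(loc, "-", title)
```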

Next, you can click the “Test” button under the “Article Extraction” column. This will enable you to test specific urls from your sitemap or section. Again, this is a proactive approach that can help you identify and fix problems before you realize you aren’t getting any Google News traffic (for a url or a set of urls).

Once you click the test button, you will see the status, title, and body that was extracted from the article (if successful). If the extraction failed, the status will list the failure along with the reason. And the reason will link you to the Google News errors page to learn more about what caused the extraction problem. For example, Title not found, Article too long, Article fragmented, etc.

Example of a failure due to “Title not found”:
Google News Error - Title Not Found

Failure and Success: Identifying Google News Crawl Errors
If a url fails for some reason, click through to the support document that lists all Google News errors and determine the root cause of the problem at hand. For example, there could be a coding glitch that’s impacting the extraction process. You might see “empty article”, “title not found” or “article too short”, and wonder what’s going on… But when digging into the url in question, you might find a technical problem acting as the Google News gremlin. Also, analyzing problematic urls can sometimes yield huge wins that can help with many more urls than just the one you are troubleshooting.

Example of a failure due to “No sentences found”:

Google News Error - No Sentences Found

And here’s an example of success:
Google News Troubleshooting Tool - Success

Important: Work Fast, Win Traffic
Google News is time-sensitive, so it’s extremely important to fix problems quickly. The troubleshooting documentation in the publisher center explains that after two days, Google News will crawl the content less often. So Google recommends fixing extraction problems within two days to increase the likelihood that the content will be recrawled (and hopefully extracted successfully this time).

Google News Troubleshooting - Two Days To Fix Errors

Bonus: Three Errors Specific To The Troubleshooting Tool
Google explains that most errors will pertain to the crawl errors for Google News, but there are three errors specific to the testing tool that you should know about. You can read about those three errors here, and they relate to access and permission.

For example, if the article returns a 404 or if it’s blocked by robots.txt, you will see a “Failure: Could not crawl article URL” error. This also applies to a section url (the subdirectory you enter as a section in the publisher center). If it 404s or if it’s blocked by robots.txt, you will see a “Failure: Could not crawl section URL”.

And last, there’s a “Failure: Permission denied” error that will flag several situations. For example, not having permission to view the news source in the publisher center, the article is not on the same domain as your source url, the section or sitemap isn’t located on the same domain, or other setup problems.
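
For the crawl-related failures above (a 404 or a robots.txt block), you can pre-check a url yourself before even opening the tool. Here is a minimal Python sketch, assuming a hypothetical article url, that checks both the HTTP status code and whether Googlebot-News is blocked by robots.txt.

```python
# Minimal sketch: pre-check an article url for the two common crawl failures,
# a non-200 status code and a robots.txt block for Googlebot-News.
# The article url below is a hypothetical placeholder -- use your own.
import urllib.error
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

article_url = "https://www.example.com/news/big-story.html"

# 1. Check robots.txt for the Googlebot-News user agent.
parsed = urlparse(article_url)
robots = urllib.robotparser.RobotFileParser()
robots.set_url(f"{parsed.scheme}://{parsed.netloc}/robots.txt")
robots.read()
blocked = not robots.can_fetch("Googlebot-News", article_url)

# 2. Check the HTTP status code (a 404 maps to "Could not crawl article URL").
try:
    status = urllib.request.urlopen(urllib.request.Request(article_url, method="HEAD")).status
except urllib.error.HTTPError as error:
    status = error.code

print(f"Blocked by robots.txt: {blocked}, HTTP status: {status}")
```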

Again, you have less than two days to fix any problem you encounter. So move quickly.

Summary – A Great Addition For News Publishers
If you are included in Google News, then you should definitely use the new troubleshooting tool in the publisher center. It enables you to be proactive versus reactive and can help you hunt down Google News errors before they cause big problems (like keeping content out of Google News). By following the steps I provided in this post, you can hopefully nip news-related errors in the bud, while possibly surfacing more widespread problems with your site. And that’s always a good thing. Personally, I’m digging the new tool. I think you will too.

GG

Filed Under: google, seo, tools

6 Time-Saving Link Analysis Tips As We Approach Penguin 4.0 [SEO Plugins, Tools, and Resources]

February 8, 2016 By Glenn Gabe 4 Comments

Link Analysis Tips for Penguin 4.0

The mad dash is on to clean up link profiles. Penguin 4.0 is quickly approaching and many companies are heavily analyzing their link profiles in order to flag and handle unnatural links. Penguin 4.0 is supposed to be the real-time Penguin, which is great news for SEOs and webmasters that have dealt with Penguin issues in the past. That said, you never know how Google’s algorithms are going to behave until they are actually released in the wild (as we have seen before). We’ll hopefully learn more about Penguin 4 very soon.

I’ve been helping several companies prepare for the arrival of our cute, black and white, allegedly real-time, Antarctic friend. Yes, go ahead and say that five times fast. :) And while helping those clients, I’ve received a lot of questions about how to save time while analyzing links. For example, which tools, tips, and resources can help as you go through the process of analyzing a link profile? So, based on those questions, I’ve decided to write this post to cover six time-saving tips. My hope is they can make your SEO life a little easier as you fight unnatural links.

Below, I’ll cover plugins, resources, SEO tools, and Excel tips. Here we go… the icy waters of Penguin await. :)

SpamFlag Chrome Extension
I was part of the beta when SpamFlag first launched, and I’ve loved the plugin ever since. SpamFlag is a Chrome extension that enables you to quickly identify links on a webpage that point to a specific domain. You can enter the domain in the plugin settings, along with additional project domains (so you can list several to check for while analyzing a webpage). Then when you visit a url, SpamFlag will highlight links to the domain(s) in question. It’s a huge timesaver while analyzing unnatural links.

SpamFlag Chrome Settings

SpamFlag provides a main notification menu at the top, which shows you if there is a link to the domain you are analyzing, how many of the links are nofollowed, how many are followed, and then a total number of links found (based on the domain you are analyzing). You can also click the link-type in the menu to hop down to the first link on the page that was flagged. For example, by clicking the “followed links” category in the top menu, you will be taken to the first followed link on the page that’s pointing to the domain you are analyzing.

SpamFlag Menu

Each link is highlighted in yellow, so you can easily spot them on the page. That’s very helpful, especially when you are digging through thousands of links. When you hover over the highlighted link, you can view the html code for the link (which enables you to truly see if it’s nofollowed, the destination url on the domain you are analyzing, etc.) Again, it’s a big time saver versus choosing “view source” and then digging through the source code.

SpamFlag Link Information on Hover

And last, but not least, you can view small highlighted bars on the right side of the page (near the scrollbar in the browser) that show you where the links are located on the page. And you can click those highlighted bars to jump to that specific link. Awesome.

SpamFlag Browser Bars

So, if you are working on many link analysis projects, then I highly recommend checking out SpamFlag. You might love it as much as I do.

Marie Haynes Disavow Blacklist
If you’re neck deep in SEO, and you have read about unnatural links, then you are probably familiar with Marie Haynes. Marie has been heavily involved with both manual actions and Penguin work and has completed a boatload of link audits.

Given what I just explained, it was great of her to build a tool that enables anyone to check a domain against her own blacklist. Just enter a domain and click “Check it!” Marie’s tool will return a message explaining more about the domain, what her recommendation is, etc. There are several classifications of links that Marie covers in her responses.

Marie Haynes' Blacklist

When performing link audits, you’ll come across both highly spammy domains and clean domains. But in SEO, there are often many shades of gray. For example, you might come across a domain that seems a little spammy, but you’re not sure if you should use the domain operator or just disavow the url. For times like that, checking Marie’s blacklist can be very helpful. Again, Marie has come across many domains during her Penguin travels, so there’s a good chance she has flagged many you are coming across too.  And I know she continually updates the database, which is great.

Which Sources Of Links Should I Use?
Whenever link analysis comes up, I’m always asked which sources of links to use. Is it ok to just use Google Search Console (GSC)? Should you include link analysis tools like Majestic? How much is too much? These are all great questions and it’s important to make sure you cover your bases.

First, you definitely don’t want to miss any unnatural links. That can happen if you simply use one or two tools for collecting links. Also, Google Search Console (GSC) only provides up to 100K of your links per “site” verified in GSC, which isn’t sufficient when you are analyzing a site with hundreds of thousands, or millions, of links. Sure, there are ways to squeeze more out of GSC, but it’s a little tedious. For example, verifying directories in GSC and then exporting links based on directory. That’s totally doable, but will require some additional work. And it’s still limited to 100K per “site” in GSC.

So, I usually recommend exporting links from the following sources:

  • Google Search Console (GSC)
    Hey, it’s Google, definitely start here! Just make sure to export both sampled links and latest links. Then dedupe those. I’ll explain how to do that soon.
  • Bing Webmaster Tools (BWT)
    Yes, Bing is a search engine too, so don’t overlook it.
  • Majestic
    To me, it’s the best link analysis tool on the market and provides a wealth of link data.
  • ahrefs
    A close second to Majestic. It’s a great link analysis tool that also provides a wealth of link data.
  • Open Site Explorer (OSE)
    Typically, it doesn’t contain the most in-depth view of your link profile, but it’s a good supplemental source of links. Remember, you’re trying to cover as many links to your site as possible. Adding OSE is a good idea.

Note, I recommend keeping a spreadsheet that contains all sources of links, but keep each source in its own worksheet. So you should have six different worksheets in your spreadsheet (one per source, including one for GSC sampled and one for GSC latest). Then you’ll want to create a master list in another worksheet (more about that shortly).

Link Sources for Analysis of Unnatural Links

By gathering links from all of the sources listed above, you can rest assured you are covering your bases. There shouldn’t be many link surprises like there can be if you simply choose one or two of those sources. Now, once you download your links, you’ll likely be staring at a boatload of link data. Don’t worry, you can cut that list down pretty quickly. I’ll cover a required link tactic next – deduplication.

Dedupe In Excel
Once you gather all of your links from across sources (as documented above), you’ll definitely want to create a master list to analyze. For example, there will probably be many overlapping links across sources, so you don’t want to waste your time analyzing the same links over and over.

Enter Excel, the Swiss Army knife for SEOs, less the bottle opener. Once you create a worksheet that contains all links from across sources, simply click the Data tab, and then Remove Duplicates. Excel will prompt you to choose the column to base the deduplication on. Choose the column containing your links, click OK, and Excel will quickly remove any duplicates and tell you how many it found. Voila. :)

Dedupe Functionality in Excel
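
If you prefer scripting over Excel, the same merge-and-dedupe step can be handled in a few lines of Python with pandas. This is just a sketch, and the export file names (and the “link” column) are hypothetical placeholders for your own exports.

```python
# Minimal sketch: build a deduped master list from all of the link exports.
# File names and the "link" column are hypothetical placeholders -- each export
# should contain (or be renamed to include) one column holding the linking url.
import pandas as pd

exports = [
    "gsc_sampled_links.csv",
    "gsc_latest_links.csv",
    "bing_links.csv",
    "majestic_links.csv",
    "ahrefs_links.csv",
    "ose_links.csv",
]

frames = [pd.read_csv(path)[["link"]] for path in exports]
master = pd.concat(frames, ignore_index=True).drop_duplicates(subset="link")

master.to_csv("master_links_deduped.csv", index=False)
print(f"{len(master)} unique links in the master list")
```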

URL Profiler
URL Profiler is one of my favorite SEO tools, especially when performing link analyses. You can do many things with URL Profiler, but I’ll focus on how it helps with analyzing and sorting links.

After gathering links from across data sources, you’ll notice that the data from Google Search Console does not contain any information beyond the actual link. For example, you don’t have anchor text, target url, or any other piece of valuable data that comes from link analysis tools like Majestic, ahrefs, and Open Site Explorer. That makes it tougher to analyze links from GSC.

Enter URL Profiler. Once I dedupe all of the link data, I run that deduped list through URL Profiler. By doing so, I receive a boatload of data back, including anchor text, target url, classification of links, whether it’s nofollowed, header response codes, the root domain extracted for me, etc. The additional fields of data you receive from URL Profiler are extremely helpful while analyzing links.

Also, before crawling your links, you can add domains to a whitelist or blacklist, and you can import your disavow file (to focus the crawl on what you need). Yes, more time-saving features.

URL Profiler for Link Analysis

But it gets even better. URL Profiler breaks your links into specific worksheets by category (based on what it finds during the crawl). For example, you will find worksheets titled unnatural, review, ignore, none (no link found), etc. Note, I still believe you should manually analyze all links, since no system is perfect. But, the breakdown enables you to start with the riskiest links and move your way through the spreadsheet.

URL Profiler Exported Tabs

Bonus: Concatenation in Excel And The Last Mile For Disavow Files
Last, but not least, let’s cover one of my favorite functions in Excel – concatenation. The concatenation function enables you to combine strings. For our purposes, you will inevitably want to add the domain operator (domain:) to the final list of domains you flag during your analysis. Then, the final list can be copied to your disavow file.

So instead of copying and pasting “domain:” before every domain you want to include, you could simply create a new column that will automatically do this for you. Let’s say you have a final list of domains to disavow in a worksheet (and they are in column A). In column B, add the following formula:

=CONCATENATE("domain:", A2)

That will prepend “domain:” before the domain name listed in column A. Then simply hover over the bottom-right corner of the cell containing your formula and double click. Now that formula will be copied to every row in that column (and you will have a list of domains to disavow). All you will have to do is copy and paste that column into your disavow file.

Concatenation in Excel for SEO
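
And if you’d rather skip Excel entirely for this last step, here is a minimal Python sketch that prepends “domain:” to each flagged domain and writes the disavow file directly. The input file name is a hypothetical placeholder, assuming one root domain per line.

```python
# Minimal sketch: prepend "domain:" to each flagged domain and write the disavow file.
# "domains_to_disavow.txt" is a hypothetical placeholder with one root domain per line.
with open("domains_to_disavow.txt") as source, open("disavow.txt", "w") as disavow:
    for line in source:
        domain = line.strip()
        if domain:
            disavow.write(f"domain:{domain}\n")
```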

Summary – Save Time While Preparing For Penguin 4.0
Performing a deep link analysis is extremely time consuming. Therefore, it’s important to save time any way you can. In this post, I covered important tools, resources, and tips that can save SEOs time, from working in Excel to choosing link sources to using third party tools like URL Profiler. The more time you save, the more links you can get through. And the more links you can flag, the greater chance you have at keeping Penguin at bay. And that’s always a good thing.

GG

 

Filed Under: google, seo, tools
