The Internet Marketing Driver

Archives for November 2019

Fixing A Google Images Indexing Problem Caused By Redirect Chains and Robots.txt Directives – Case Study

November 8, 2019 By Glenn Gabe

Google has done a lot of work over the past two years with Google Images. And all signs point to Google driving forward with even more new features and functionality. Although sometimes an afterthought for many companies when it comes to Search, Google Images can drive meaningful traffic. In addition, having your images properly indexed can lead to stronger listings in Google’s Top Stories (if you’re a news publisher).

Therefore, if you have strong visual assets, and you know people are searching for those images, then it’s important to give your site the best shot possible to rank for relevant queries.

By the way, I attended the Google Webmaster Conference in Mountain View on Monday, November 4, and the presentations from the various Googlers were awesome. One “lightning round” covered Image Search and the various projects Google has been working on recently.

An important point covered in that presentation related to “Google moving from just showing images to providing much more context about the great content behind those images”.

Google Images lightning round from the Google Webmaster Conference on Nov 4, 2019
Source: Holly Miller Anderson on Twitter

That has manifested itself with a new design that includes titles and descriptions directly in image search, swipe up with AMP to visit a site, and more. In the past, only images would be displayed, so Google is definitely trying to help users find the content behind the images, which can translate to higher-quality image traffic to your site. So again, it’s important to rank for the images you want to rank for.

That’s a good segue to the case study I’m presenting in this post. :)

Case Study: When One “Top Stories” Listing Is Not Like The Others – The Image Advantage
I received an email one morning from a company with an interesting question. The news publisher was experiencing a weird problem with Google Images, including images in Top Stories. The publisher performed well in Google News, including Top Stories, but images would not show up in the Top Stories listing a good percentage of the time. Sometimes they did, but often they didn’t. And they were ultra-confused about why that was happening.

They explained that there were images in the articles and you could clearly visit those images and see them perfectly when visiting the CDN urls (like many companies, they were using a CDN to store and deliver their images). But when their articles ended up ranking in Top Stories (an extremely visible search feature that can drive a ton of traffic), images often did not accompany the article.

And to make matters worse, the other sites listed in the same Top Stories module did have images, which put this company at a disadvantage from a click-through rate standpoint. Here’s a mockup of what it looked like:

A mockup of what was happening in Top Stories. Images were not showing up.

So, I dug in to hopefully provide some answers. And like many other things in SEO, what’s invisible to the naked eye was causing serious (but logical) problems. Read on.

So what the heck was wrong? The land mine of extra CDN hops…
As I mentioned earlier, the company was using a CDN to house and deliver images. That’s totally fine for handling images and many sites employ CDNs for that very purpose. Even John Mueller has fielded that question a number of times and explained it’s totally fine to use CDNs for that.

But the devil is in the details. Sure, a CDN is fine, but you need to follow the path that Googlebot must follow in order to see if it’s truly fine. So I crawled part of the site and began checking the situation manually (via Chrome and several plugins). It wasn’t long before I could see the problem.

Tracking the path from the site to the CDN revealed an extra hop along the way (via a redirect). And that extra hop was extremely problematic. It was through another CDN subdomain… which had me immediately checking the robots.txt file on that subdomain. And lo and behold, the urls were blocked by robots.txt.

So the image urls sent Googlebot to a CDN url, which then sent it to another CDN url (which was blocked by robots.txt), and then to the final CDN url where the image was housed. Since Googlebot couldn’t crawl the extra hop, it was never making its way to the image file. And that’s why images weren’t being indexed, ranking in Google Images, or showing in Top Stories. Not good, to say the least.

The path to the image files revealed an extra hop, which was blocked by robots.txt

To the naked eye (especially for non-SEOs), everything looked fine: you could enter the image url and the image displayed. But the path to that image was problematic. Again, it didn’t take long to see the 301 to 301 to 200 redirect chain. Then, checking the second 301 more closely, I cross-referenced the robots.txt file for that subdomain, and boom, it was blocked.
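If you want to script that same cross-check, here’s a minimal sketch in Python. It takes the hops of a redirect chain plus the robots.txt rules for each host, and returns the first hop that a crawler would be blocked from fetching. The hostnames, paths, and the first_blocked_hop helper are all hypothetical (they are not the publisher’s real urls). Also note that Python’s built-in robots.txt parser does not match rules exactly the way Googlebot does, so treat any result as a reason to check manually, not as a verdict.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def first_blocked_hop(chain, robots_by_host, user_agent="Googlebot"):
    """Return the first URL in the chain that robots.txt disallows, or None."""
    for url in chain:
        host = urlsplit(url).netloc
        parser = RobotFileParser()
        # Assumption: a host with no entry is treated as having an empty
        # robots.txt, which allows everything.
        parser.parse(robots_by_host.get(host, "").splitlines())
        if not parser.can_fetch(user_agent, url):
            return url
    return None


# Hypothetical hosts standing in for the 301 -> 301 -> 200 chain.
chain = [
    "https://images.example.com/photos/story.jpg",  # 301
    "https://cdn-a.example.com/photos/story.jpg",   # 301 (the extra hop)
    "https://cdn-b.example.com/photos/story.jpg",   # 200 (the image file)
]
robots_by_host = {
    # Only the middle hop has a blocking robots.txt in this example.
    "cdn-a.example.com": "User-agent: *\nDisallow: /\n",
}

blocked = first_blocked_hop(chain, robots_by_host)
print(blocked)
```

In this made-up example the middle hop is the one that gets flagged, even though the final image url is perfectly crawlable on its own. That mirrors what was happening here: the destination was fine, but the path to it wasn’t.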

The good news is that this was a relatively straightforward fix (although many of the image urls across a large and complex site needed to be refined). So, the company’s dev team dug in and worked on a better solution. Again, it’s a large and complex site, so no change is easy… But they clearly wanted to move quickly to get this done.  

Side note: It turns out that not all images were set up this way across the site. That’s one of the reasons the company was confused about why images weren’t showing up in Top Stories. Some of the images were being indexed fine in Google Images. That’s when the extra hop wasn’t present. This made it harder for the company to debug. For example, they saw some images being indexed fine, and showing up in Top Stories, but many weren’t.

The issue was sort of sinister that way… almost cloaking the bad results with some good.  That said, many images in the latest news articles were having problems in image search (and Top Stories). So that led them to start asking the right questions, which led to the root problem that I covered above.

A fix was implemented and… image traffic surged!
The dev team worked on a better solution and removed that extra hop. Now Googlebot is able to crawl and index the images efficiently and associate those images with the pages at hand. I was eager to see how Google would handle the change. And I was pretty excited with the results.

It’s like Google was just waiting for the fix… Google image search traffic surged and has continued to increase over time. The trend lines for image search clearly jump after the fix was put into place in April, and those metrics continue to move up and to the right. Here are some screenshots of the surge over time:

Since the change, an extra 334K clicks have been driven from image search alone. Unfortunately, it’s hard to gauge the changes in traffic from Top Stories, since there’s no way to specifically report on that in GSC across desktop and mobile, but articles are showing up with images now versus just text (which should be helping with click-through rate).  

For Top Stories, this change levels the playing field for my client. The other articles showing up from other news organizations often had images associated with them, which can greatly help with click-through rate. Now the company’s Top Stories listings also have visuals. It’s great to see.

Driving Google Image Performance – Closing Tips
Google Images can definitely drive meaningful traffic, and images can be an important part of Top Stories (which can also drive a lot of traffic for news publishers). As I mentioned earlier, Google has made a serious effort to enhance image search over the past two years or so, and I’m sure more is coming on that front. So, it’s important to make sure you are providing a clear path for crawling and indexing your image content (so that content can end up ranking in Search).

Here are some final bullets containing important image search tips:

  • In order to rank in Google Images, Google needs an image and landing page combination. Images alone will not suffice. Google’s John Mueller has covered this several times.
  • Don’t block Googlebot from crawling your images, whether they are hosted on your own site, on a subdomain, or housed on a CDN. And make sure you follow the full path to those images, which could include extra hops that might be blocked by robots.txt.
  • Crawling your site on a regular basis using tools like DeepCrawl, Screaming Frog, and Sitebulb can help you surface redirect chains and various files blocked by robots.txt. And they do this in bulk. When you combine your toolset with your brainset, good things can happen. :)
  • Check the actual search results and use tools to view screenshots of SERP history to ensure your listings and images look ok. For example, perform actual queries that are yielding Top Stories and view your listings. You can also use tools like SEMrush to view historical screenshots of the SERPs, which can also help you understand how your listings looked over time. All of this can help you understand what’s going on, but can also provide data to stakeholders when a problem exists. As they say, a good image is worth a thousand words. Well, a good SEO screenshot is worth a thousand votes when you need changes implemented.
  • Unfortunately, sitemap reporting in GSC does not contain the number of images indexed anymore. John Mueller once said that’s probably a good thing to have back, so maybe we’ll see that at some point in the future. For now, you really don’t have a solid idea of how many images are being indexed or the problems that could be causing indexing issues. Maybe we’ll see a new sub-report in GSC’s coverage reporting for this in the future… who knows? (Yes, that’s a hint for GSC product managers and engineers!) :)
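For spot checks between full crawls, the “follow the full path” tip can be sketched in a few lines of Python using only the standard library. This is a rough sketch, not a replacement for a real crawler: trace_chain follows redirects one hop at a time and records each url and status code, so an extra hop like the one in this case study shows up right away. The function names, the user agent string, and the example.com urls are all made up for illustration, and the demo runs against a fake redirect map instead of live requests.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_once(url, user_agent="redirect-audit-sketch/0.1"):
    """Send one HEAD request without following redirects.

    Returns (status, location); location is None if no Location header.
    The user agent string is a made-up placeholder.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_chain(url, fetch=head_once, max_hops=10):
    """Follow redirects hop by hop; return a list of (url, status) tuples."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location headers may be relative
    return hops


# Offline demo: a hypothetical redirect map instead of live requests.
fake_responses = {
    "https://images.example.com/a.jpg": (301, "https://cdn-a.example.com/a.jpg"),
    "https://cdn-a.example.com/a.jpg": (301, "https://cdn-b.example.com/a.jpg"),
    "https://cdn-b.example.com/a.jpg": (200, None),
}
hops = trace_chain("https://images.example.com/a.jpg",
                   fetch=lambda u: fake_responses[u])
for hop_url, status in hops:
    print(status, hop_url)
```

Any image url that comes back with more than one redirect before the 200 is worth a closer look, including checking the robots.txt file on each host along the way.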

Summary – Let your images be seen (and crawled and indexed properly)
In closing, a redirect chain combined with a robots.txt directive was causing major problems with image search (and Top Stories) for the news publisher that contacted me. With Google’s focus on improving image search, you don’t want to be left out in the cold because you put barriers in Googlebot’s way when it crawls and indexes your image content.

Sometimes the answer lies in the path to those images… so make sure you double check redirects, redirect chains, and whether those urls are accessible to Googlebot. There just might be some obstacles in Google’s way. And that’s typically not a good thing.

GG

Filed Under: google, seo, tools

Copyright © 2023 G-Squared Interactive LLC. All Rights Reserved.