{"id":2980,"date":"2021-03-02T09:43:52","date_gmt":"2021-03-02T13:43:52","guid":{"rendered":"https:\/\/www.gsqi.com\/marketing-blog\/?p=2980"},"modified":"2021-03-14T07:48:22","modified_gmt":"2021-03-14T11:48:22","slug":"how-to-use-gsc-crawl-stats-report-site-migrations","status":"publish","type":"post","link":"https:\/\/www.gsqi.com\/marketing-blog\/how-to-use-gsc-crawl-stats-report-site-migrations\/","title":{"rendered":"How To Use GSC&#8217;s Crawl Stats Reporting To Analyze and Troubleshoot Site Moves (Domain Name Changes and URL Migrations)"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-google-search-console.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>For site migrations, I\u2019ve always said that Murphy\u2019s Law is real. \u201cAnything that can go wrong, will go wrong.\u201d You can prepare like crazy, think you have everything nailed down, only to see a migration go sideways once it launches.<br><br>That\u2019s also why I believe that when something does go wrong (which it will), it\u2019s super-important to address those problems quickly and efficiently. If you can nip migration problems in the bud, you can keep them from becoming major issues that impact SEO. 
That\u2019s why it\u2019s important to prepare as much as you can, have all of the necessary intelligence in front of you while the migration goes live, and then move quickly to attack any problems that arise.<br><br>By the way, if you think you\u2019re immune to site migration problems, then <a href=\"https:\/\/podcasts.google.com\/feed\/aHR0cHM6Ly9zZWFyY2gtb2ZmLXRoZS1yZWNvcmQubGlic3luLmNvbS9yc3M\/episode\/Mzc3YjcxNjAtMTZkZi00MDE4LTk4OTQtZDBjMjI1MDJjOTg0?sa=X&amp;ved=0CAUQkfYCahcKEwjojvqC-o7vAhUAAAAAHQAAAAAQHg&amp;hl=en\">listen to the episode<\/a> of Google\u2019s Search Off The Record podcast where John Mueller, Gary Illyes, Cherry Prommawin, and Martin Splitt talk about the migration of the webmaster central site to the new search central site. It ends up they ran into several problems just like any other site owner could and had to move quickly to rectify those issues. So, if it can happen to Google, it sure can happen to you. :)<\/p>\n\n\n\n<p><strong>Adding Google\u2019s Crawl Stats Reporting To Your Site Migration Checklist:<br><\/strong>There are plenty of checklists and tools out there to help with site migrations. For example, Google\u2019s testing tools in GSC, third-party crawlers like Screaming Frog, DeepCrawl, and Sitebulb, site monitoring tools, log file analysis tools, and more.<br><br>And on the topic of log files, they provide the quickest way to understand how Google is crawling your site post-migration. You don\u2019t need to wait for data to populate in a tool, you don\u2019t have to <strong>guess<\/strong> how Google is treating urls, redirects, etc., and there are several log file analysis tools hungry to consume your logs.<br><br>But there\u2019s a catch\u2026 trying to get log files is like attempting to complete a mission as Tom Cruise in one of his great Mission Impossible movies. 
If getting log files were a scene in Mission Impossible, I could hear Tom now:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/mission-impossible-hack.gif\" alt=\"\"\/><\/figure>\n\n\n\n<p>\u201cWait, so we have to scuba dive under a bridge heavily guarded by troops, climb a 200-story building in our underwear (in 20-degree weather), use elaborate yoga moves to dodge a scattered laser security system, steal the ancient lamp of Mueller which is protected by special forces, hack into a computer system protected by six layers of ciphers, download the log files, and then parachute off the building back into the water, only to scuba dive back under the bridge to safety? No problem\u2026 hold my coffee.\u201d<br><br>OK, it\u2019s not that bad, but any SEO who has attempted to get log files from a client knows how frustrating that situation can be. The files are huge, they are seemingly not owned by any one group or person in a company, and some companies don\u2019t even keep logs for more than a few days (if that). So, it\u2019s no easy feat to get a hold of them.<br><br>What\u2019s an SEO to do?<\/p>\n\n\n\n<p><strong>Meet The New Crawl Stats Report in GSC: A (Pretty) Good Proxy For Log Files<br><\/strong>In November of 2020 Google launched the <a href=\"https:\/\/developers.google.com\/search\/blog\/2020\/11\/search-console-crawl-stats-report\">new Crawl Stats reporting<\/a> in GSC. The reporting is outstanding, and it was a huge improvement over the previous version. The new reporting provides a boatload of data based on Google crawling your site. 
I won\u2019t go through all of the reports and data in this post, but you can <a href=\"https:\/\/support.google.com\/webmasters\/answer\/9679690?hl=en\">check out the documentation<\/a> to learn more about each of the report sections.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>I\u2019m going to cover what Google considers \u201csite moves with url changes\u201d, which covers domain name changes and url migrations. I\u2019ll focus on domain name changes, but you can absolutely use the new Crawl Stats reporting to troubleshoot url migrations as well.<\/p>\n\n\n\n<p>For domain name changes, you can view crawl stats reporting for <strong>both<\/strong> the domain you are moving to and the domain you are moving from. So, using the crawl stats reporting can supplement your current migration checks and enable you to see how Google is handling the migration at the source (the old domain).<\/p>\n\n\n\n<p>And for url migrations, you can also surface problems that Google is experiencing post-migration. It\u2019s not as clear as a domain name change, since you can&#8217;t isolate the crawl stats reporting by domain, but it can still help you surface issues based on bulk-changing urls.<\/p>\n\n\n\n<p><strong>Note: There is a delay in the Crawl Stats reporting.<\/strong><br>The Crawl Stats reporting lags by a few days, so log files are still important if you want to see a real-time view of how Google is handling a site migration. The reporting updates daily, but lags by 3-4 days from what I have seen. 
Below, you can see the report was last updated on 2\/26\/21, but today is 3\/2\/21.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-delay.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>How to identify problems with domain name changes and url migrations using the Crawl Stats reporting in GSC:<br><\/strong>As mentioned above, for domain name changes, you can analyze the Crawl Stats reporting for the domain name you are moving <strong>from<\/strong>, and the domain name you are moving <strong>to<\/strong>. Below, I\u2019ll cover some of the ways you can use the reporting to surface potential issues.<br><br><strong>How To Find The New Crawl Stats Report in Google Search Console (GSC):<\/strong><br>First, I know there&#8217;s some confusion about where the new Crawl Stats report is located. You will <strong>not <\/strong>find the report in the left-side navigation in GSC. Instead, you first need to click &#8220;Settings&#8221;, find the Crawling section of the page which contains top-level crawl stats, and then click &#8220;Open Report&#8221; to view the full Crawl Stats reporting. <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-find-report-settings.jpg\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-find-report-gsc.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>Now that you&#8217;ve found the Crawl Stats reporting, here are some of the things you can find when analyzing and troubleshooting a site migration.<\/p>\n\n\n\n<p><strong>404s and Broken Redirects<\/strong>:<br>The Crawl Stats reporting for the domain you are moving from will list urls that Google is crawling that end up as 404s. 
All urls during a domain name change should map to their equivalent url on the new domain (via 301 redirects). By analyzing the source domain name that\u2019s part of the migration, you can view urls that Googlebot is coming across that end up as 404s. And that can help you find gaps in your 301 redirection plan.<\/p>\n\n\n\n<p>For example, you can see the reporting for a site that went through a domain name change below. 4% of the crawl requests were ending up as 404s when most of the urls should be redirecting to urls that return a 200 header response code on the new domain.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-404.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>If you click into that report, you can see a sample of the top 1,000 urls with that issue and you can inspect the urls as well:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-details.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>And here is what it should look like. 100% of the requests are 301 redirecting to the equivalent urls on the new domain:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-301-correct.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Important (and often confusing) note:<\/strong> It\u2019s worth noting that GSC reports on the <strong>destination url<\/strong>, so a 404 showing up for the old domain name could actually be showing you a redirect to the new domain name, but that new url 404s. In other words, the 404 is actually on the <strong>new domain<\/strong>, but shows up in the reporting for the old domain name. That\u2019s extremely important to understand overall with GSC, and it can cause confusion while analyzing the reporting. 
I tweeted about this in January with regard to the Coverage reporting:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Reminder: GSC reports on destination urls in the Coverage reporting. So if you see urls that are categorized as blocked by robots.txt or noindexed, but they aren&#39;t, they could be redirecting to urls that are. And that&#39;s what is reported. Can send you off on a wild goose chase: <a href=\"https:\/\/t.co\/QYSWUcVTc1\">pic.twitter.com\/QYSWUcVTc1<\/a><\/p>&mdash; Glenn Gabe (@glenngabe) <a href=\"https:\/\/twitter.com\/glenngabe\/status\/1346453594996940801?ref_src=twsrc%5Etfw\">January 5, 2021<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p><strong>Image Search: Googlebot Image<br><\/strong>If image search is important for your business, you will definitely want to review the \u201cBy Googlebot type\u201d section of the reporting. You will see a listing for \u201cImage\u201d. You can click into that reporting to see the urls Googlebot Image is crawling. If you see 404s, 5XX, etc., then make sure you jump on those issues quickly. You should see plenty of 301s if you <a href=\"https:\/\/www.gsqi.com\/marketing-blog\/how-to-redirect-images-during-website-redesign-or-migration\/\">redirected images<\/a> properly during the migration (which you should). I covered that in my Mythbusting video with Google&#8217;s Martin Splitt about site migrations. The video can be seen later in this post.<br><br>As you can see below, Googlebot Image is coming across 404s as well. This is from the site that went through a domain name change. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-image.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>This is what you should see. Notice how the Googlebot Image requests all properly 301 redirect to the images on the new domain:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-image-301s.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Robots.txt issues:<br><\/strong>In the host issues section, you can see if Googlebot is having problems accessing the robots.txt file for the domain(s) involved. If Google cannot successfully fetch the robots.txt file (a successful fetch is one that returns a 200 or a 403\/404\/410), then it will <strong>not<\/strong> crawl the site at that time. Google will check back later to see if it can fetch the robots.txt file. If it can, then crawling will resume. You can read more details about how this is handled on <a href=\"https:\/\/support.google.com\/webmasters\/answer\/9679690?hl=en#zippy=%2Cmore-robotstxt-availability-details\">Google\u2019s support page<\/a> (or in the screenshot below). Note, you can 404 a robots.txt file and that\u2019s absolutely fine. This is about Google having problems fetching the file (e.g., Google seeing a 429 or a 5XX).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-new-robots-txt-google.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>And here is what it can look like in GSC\u2019s Crawl Stats reporting. 
Although this falls under an &#8220;acceptable fail rate&#8221;, I would sure check why the robots.txt fetch is failing at all:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-robots-fail.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Other host issues: DNS Resolution and Server Connectivity:<\/strong><br>Along the same lines, you can see if there are other host-level issues going on. The host reporting also contains DNS resolution errors and server connectivity problems. You obviously want to make sure Google can successfully recognize your hostname and that it can connect to your site.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-host-connectivity-problems.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Performance Problems:<br><\/strong>The reporting also will show pages that are timing out for some reason, so keep an eye on that report. You will find that in the \u201cBy response\u201d section.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-time-out.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Subdomain Issues:<\/strong><br>Hopefully you picked up all subdomains that were in use before you pulled the trigger on the domain name change. But if you didn\u2019t, you can see Crawl Stats reporting <strong>per subdomain<\/strong> that Google is crawling. The catch is that you need a <a href=\"https:\/\/developers.google.com\/search\/blog\/2019\/02\/announcing-domain-wide-data-in-search\">domain property<\/a> set up in GSC for the domain you are moving from (unless you had those subdomains verified and set up already in GSC). <br><br>If you did, you could view the crawl stats reporting for those subdomains separately. 
Domain properties make this easier since all subdomains being crawled by Google (the top 20 over the past 90 days) will be shown in the Hosts report in the Crawl Stats reporting.<br><br>Below, you can see that the crawl stats reporting shows 17 different subdomains with crawl requests over the past 90 days.  <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-hosts.jpg\" alt=\"\"\/><\/figure>\n\n\n\n<p>Note, I always recommend having a domain property set up. It\u2019s amazing how many companies have not done this yet\u2026 If you haven\u2019t, I would do that today. It doesn\u2019t take long to set up and it covers all protocols and subdomains.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/www.gsqi.com\/images\/crawl-stats-gsc-domain-property.jpg\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p><strong>Crawl Stats For Site Migrations: Final tips And Recommendations<br><\/strong>The Crawl Stats reporting can help site owners and SEOs get closer to log file analysis, when gaining those logs might be tough. Although there\u2019s a lag in the data populating (3-4 days), the Crawl Stats reporting can sure help surface problems during domain name changes and url migrations. And the quicker you can nip those problems in the bud, the less chance they become bigger issues SEO-wise.<br><br><strong>Here are some final tips and recommendations:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Set up domain properties for each of the domains involved in the migration (if changing domain names). This will give you access to all subdomains in the Crawl Stats reporting.<\/li><li>Once data starts populating in the Crawl Stats reporting post-migration, dig into the domain you are moving <strong>from<\/strong>. You might see a number of issues there based on what I explained earlier. 
For example, 404s, performance issues, robots.txt problems, and more.<\/li><li>Nail the redirection plan. If you see gaps and problems with your 301 redirects, move quickly to rectify those problems. Nip those problems in the bud.<\/li><li>Check for host-level problems (like robots.txt fetch issues, DNS resolution issues, and server connectivity problems). Your redirection plan doesn\u2019t matter if Google can\u2019t successfully connect to your site.<\/li><li>Look for pages that are timing out. This would show up in the \u201cBy response\u201d section of the reporting. If you see that, dig into those problems to see why the pages are timing out. Again, move quickly to address performance issues.<\/li><li>Don&#8217;t forget your images! Make sure to 301 redirect your images and then check the section labeled &#8220;By Googlebot type&#8221;. Then check the &#8220;Image&#8221; reporting to see how Googlebot Image is crawling your content. <\/li><\/ul>\n\n\n\n<p><strong>More About Site Migrations: Mythbusting Video<br><\/strong>If you are interested in site migrations, then you should check out the Mythbusting video I shot with Google\u2019s Martin Splitt. 
In the video, we cover a number of important topics including domain name changes, url migrations, redirecting images, when a site should revert a migration, site merges, the Change of Address Tool in GSC, and more.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Site Migrations: SEO Mythbusting\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/bGPB-rtxt-I?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><strong>Summary \u2013 GSC\u2019s Crawl Stats as a proxy for log files.<br><\/strong>After reading this post, I hope you see the power in adding Google\u2019s Crawl Stats reporting to your site migration checklist. The reporting provides a boatload of great information based on Google crawling your site post-migration. I\u2019ve found it extremely helpful while helping companies monitor and troubleshoot domain name changes, url migrations, and more. And remember, Murphy\u2019s Law is real for site migrations. Things will go wrong\u2026 which is ok. The important part is how quickly you handle and rectify those problems. <br><br>GG<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For site migrations, I\u2019ve always said that Murphy\u2019s Law is real. \u201cAnything that can go wrong, will go wrong.\u201d You can prepare like crazy, think you have everything nailed down, only to see a migration go sideways once it launches. 
That\u2019s also why I believe that when something does go wrong (which it will), it\u2019s &#8230; <a title=\"How To Use GSC&#8217;s Crawl Stats Reporting To Analyze and Troubleshoot Site Moves (Domain Name Changes and URL Migrations)\" class=\"read-more\" href=\"https:\/\/www.gsqi.com\/marketing-blog\/how-to-use-gsc-crawl-stats-report-site-migrations\/\" aria-label=\"Read more about How To Use GSC&#8217;s Crawl Stats Reporting To Analyze and Troubleshoot Site Moves (Domain Name Changes and URL Migrations)\">Read more<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,3,5],"tags":[],"class_list":["post-2980","post","type-post","status-publish","format-standard","hentry","category-google","category-seo","category-tools","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/posts\/2980","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/comments?post=2980"}],"version-history":[{"count":74,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/posts\/2980\/revisions"}],"predecessor-version":[{"id":3064,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/posts\/2980\/revisions\/3064"}],"wp:attachment":[{"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/media?parent=2980"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/categories?post=2
980"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gsqi.com\/marketing-blog\/wp-json\/wp\/v2\/tags?post=2980"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}