Site Crawl, Day 1: Where Do You Start?
Posted by Dr-Pete
When you’re faced with the many thousands of potential issues a large site can have, where do you start? This is the question we tried to tackle when we rebuilt Site Crawl. The answer depends almost entirely on your site and can require deep knowledge of its history and goals, but I’d like to outline a process that can help you cut through the noise and get started.
Simplistic can be dangerous
Previously, we at Moz tried to label every issue as either high, medium, or low priority. This simplistic approach can be appealing, even comforting, and you may be wondering why we moved away from it. This was a very conscious decision, and it boils down to a couple of problems.
First, prioritization depends a lot on your intent. Misinterpreting your intent can lead to bad advice that ranges from confusing to outright catastrophic. Let’s say, for example, that we hired a brand-new SEO at Moz and they saw the following issue count pop up:
Almost 35,000 NOINDEX tags?! WHAT ABOUT THE CHILDREN?!!
If that new SEO then rushed to remove those tags, they’d be doing a lot of damage, not realizing that the vast majority of those directives are intentional. We can make our systems smarter, but they can’t read your mind, so we want to be cautious about false alarms.
Second, bucketing issues by priority doesn’t do much to help you understand the nature of those problems or how to go about fixing them. We now categorize Site Crawl issues into one of five descriptive types:
- Critical Crawler Issues
- Crawler Warnings
- Redirect Issues
- Metadata Issues
- Content Issues
Categorizing by type allows you to be more tactical. The issues in our new “Redirect” category, for example, have much more in common with each other, which means they may share common fixes. Ultimately, helping you find problems is just step one. We want to do a better job of helping you fix them.
1. Start with Critical Crawler Issues
That’s not to say everything is subjective. Some problems block crawlers (not just ours, but search engines’ crawlers too) from getting to your pages at all. We’ve grouped these “Critical Crawler Issues” into our first category, and they currently include 5XX errors, 4XX errors, and redirects to 4XX. If you have a sudden uptick in 5XX errors, you need to know, and almost no one intentionally redirects to a 404.
You’ll see Critical Crawler Issues highlighted throughout the Site Crawl interface:
Look for the red alert icon to spot critical issues quickly. Address these problems first. If a page can’t be crawled, then every other crawler issue is moot.
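If you want to reproduce this triage on your own crawl data, the three critical types above map cleanly onto HTTP status codes. Here is a minimal Python sketch, not how Site Crawl itself does it. The function name, the tuple layout, and the sample URLs are all made up for illustration.

```python
from typing import Optional

def critical_issue(status: int, redirect_target_status: Optional[int] = None) -> Optional[str]:
    """Return a critical-issue label for one crawled URL, or None if it has none.

    `redirect_target_status` is the status of the page a redirect points to
    (a hypothetical field; use whatever your crawl data provides).
    """
    if 500 <= status <= 599:
        return "5XX error"
    if 400 <= status <= 499:
        return "4XX error"
    if (300 <= status <= 399
            and redirect_target_status is not None
            and 400 <= redirect_target_status <= 499):
        return "redirect to 4XX"
    return None

# Made-up sample rows: (path, status, status of redirect target if any)
crawl = [
    ("/", 200, None),
    ("/old-page", 301, 404),
    ("/missing", 404, None),
    ("/api", 503, None),
    ("/moved", 301, 200),
]

# Keep only the URLs that have a critical issue, with their label.
critical = [
    (url, critical_issue(status, target))
    for url, status, target in crawl
    if critical_issue(status, target) is not None
]
```

In this sample, `/moved` is left out because it redirects to a working page, which is usually intentional.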
2. Balance issues with prevalence
When it comes to solving your technical SEO issues, we also have to balance severity with quantity. Knowing nothing else about your site, I would say that a 404 error is probably worth addressing before duplicate content — but what if you have eleven 404s and 17,843 duplicate pages? Your priorities suddenly look very different.
At the bottom of the Site Crawl home, check out “Moz Recommends Fixing”:
We’ve already done some of the math for you, weighting urgency by how prevalent the issue is. This does require some assumptions about prioritization, but if your time is limited, we hope it at least gives you a quick starting point to solve a couple of critical issues.
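To make the idea of weighting urgency by prevalence concrete, here is a rough Python sketch. This is not the formula behind “Moz Recommends Fixing”. The severity weights are invented for illustration, and the counts are the ones from the 404 vs. duplicate content example above.

```python
# Hypothetical severity weights: higher means more urgent per affected page.
SEVERITY = {
    "404 error": 10.0,
    "duplicate content": 1.0,
}

def rank_issues(counts, severity):
    """Score each issue type as severity weight x page count, highest first."""
    scored = [
        (issue, severity.get(issue, 1.0) * count)
        for issue, count in counts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

counts = {"404 error": 11, "duplicate content": 17843}
ranking = rank_issues(counts, SEVERITY)
```

Even with 404s weighted ten times heavier per page, the sheer number of duplicate pages pushes that issue to the top of the list. With different weights the order could flip, which is exactly why the assumptions matter.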
3. Solve multi-page issues
There’s another advantage to tackling issues with high counts. In many cases, you might be able to solve issues on hundreds (or even thousands) of pages with a single fix. This is where a more tactical approach can save you a lot of time and money.
Let’s say, for example, that I want to dig into my 916 pages on Moz.com missing meta descriptions. I immediately notice that some of these pages are blog post categories. So, I filter by URL:
I can quickly see that these pages account for 392 of my missing descriptions, a whopping 43% of them. If I’m concerned about this problem, then it’s likely that I could solve it with a fairly simple change to a CMS template, wiping out hundreds of issues with a few lines of code.
In the near future, we hope to do some of this analysis for you, but if filtering isn’t doing the job, you can also export any list of issues to CSV. Then, pivot and filter to your heart’s content.
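If you go the CSV route, grouping URLs by their leading path segments is a quick way to surface patterns like the blog category pages. The Python sketch below assumes the export has a column named `URL`, which may not match the real file, so adjust `url_column` to whatever header you see. The sample rows are made up.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit

# Made-up stand-in for an exported issue list; the column names are assumptions.
SAMPLE_CSV = """\
URL,Issue
https://example.com/blog/category/seo,Missing Description
https://example.com/blog/category/links,Missing Description
https://example.com/blog/category/local,Missing Description
https://example.com/products/pro,Missing Description
https://example.com/about,Missing Description
"""

def section_of(url, depth=2):
    """Return the first `depth` path segments of a URL, e.g. '/blog/category'."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    return "/" + "/".join(parts[:depth])

def count_by_section(csv_text, url_column="URL", depth=2):
    """Count affected URLs per site section."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(section_of(row[url_column], depth) for row in reader)

counts = count_by_section(SAMPLE_CSV)
top_section, top_count = counts.most_common(1)[0]
share = top_count / sum(counts.values())
print(f"{top_section}: {top_count} pages ({share:.0%} of this issue)")
```

For a real export, read the file's text with `open(path, newline="")` and pass it in. A section that accounts for a large share of one issue is a good candidate for a single template-level fix.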
4. Dive into pages by PA & crawl depth
If you can’t easily spot clear patterns, or if you’ve solved some of those big issues, what next? Fixing thousands of problems one URL at a time is only worthwhile if you know those URLs are important.
Fortunately, you can now sort by Page Authority (PA) and Crawl Depth in Site Crawl. PA is our own metric of ranking ability (primarily powered by link equity), and Crawl Depth is the distance of a page from the home page:
Here, I can see that there’s a redirect chain in one of our MozBar URLs, which is a very high-authority page. That’s probably one worth fixing, even if it isn’t part of an obvious, larger group.
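If you are working from an exported list instead of the interface, the same ordering is easy to approximate: highest PA first, with shallower pages breaking ties. A small Python sketch follows. The field names and sample rows are hypothetical, and the tie-break rule is my own assumption, not something Site Crawl prescribes.

```python
from dataclasses import dataclass

@dataclass
class IssueRow:
    url: str
    issue: str
    page_authority: int  # PA, higher means more ranking ability
    crawl_depth: int     # distance from the home page

def prioritize(rows):
    """Sort by PA descending, then by crawl depth ascending."""
    return sorted(rows, key=lambda r: (-r.page_authority, r.crawl_depth))

# Made-up sample rows for illustration.
rows = [
    IssueRow("/blog/old-post", "Redirect Chain", 35, 4),
    IssueRow("/tools/toolbar", "Redirect Chain", 72, 2),
    IssueRow("/about/jobs", "Missing Description", 72, 1),
    IssueRow("/blog/category/misc", "Missing Description", 20, 3),
]

ordered = prioritize(rows)
for row in ordered:
    print(row.page_authority, row.crawl_depth, row.url, row.issue)
```

Working down a list like this keeps one-at-a-time fixes focused on the URLs most likely to matter.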
5. Watch for spikes in new issues
Finally, as time goes on, you’ll also want to be alert to new issues, especially if they appear in large numbers. This could indicate a sudden and potentially damaging change. Site Crawl now makes tracking new issues easy, including alert icons, graphs, and a quick summary of new issues by category:
Any crawl is going to uncover some new pages (the content machine never rests), but if you’re suddenly seeing hundreds of new issues of a single type, it’s important to dig in quickly and make sure nothing’s wrong. In a perfect world, the SEO team would always know what changes other people and teams made to the site, but we all know it’s not a perfect world.
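If you track issue counts yourself between crawls, a simple threshold check can flag this kind of spike. The Python sketch below is only an illustration: the threshold of 100 is an arbitrary stand-in for “hundreds of new issues,” and the counts are made up.

```python
def find_spikes(previous, current, threshold=100):
    """Return issue types whose count grew by at least `threshold` since the last crawl."""
    spikes = {}
    for issue_type, count in current.items():
        increase = count - previous.get(issue_type, 0)
        if increase >= threshold:
            spikes[issue_type] = increase
    return spikes

# Made-up issue counts from two consecutive crawls.
last_crawl = {"Missing Description": 916, "Redirect Chain": 40, "4XX Error": 11}
this_crawl = {"Missing Description": 930, "Redirect Chain": 41, "4XX Error": 412}

spikes = find_spikes(last_crawl, this_crawl)
```

Here only the jump in 4XX errors is flagged, while the small drift in the other categories is treated as normal growth. Tune the threshold to the size of your site.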
I hope this gives you at least a few ideas for how to quickly dive into your site’s technical SEO issues. If you’re an existing customer, you already have access to Moz’s new Site Crawl and all of the features discussed in this post.