SEO Audit Checklist

Authors: Bill Slawski, Chris Countey, Michael Stricker, Jeremy Niedt and Matt Haran

In addition to providing useful information, we’re also testing how Google will attribute authorship when multiple authors are listed with correct markup. For fun, one author is using his Twitter profile as the hook.

Table of Contents

  1. Site Architecture
  2. Technical/Server Issues
  3. HTML Use/Analysis
  4. Content Review
  5. Negative Practices
  6. Keywords
  7. Webmaster Tools
  8. Social Media

The following is a high-level checklist of issues that should be explored on a site. In the weeks to come, we will drill deeper into each part of this checklist, providing examples, tips, and suggestions for checking these things more quickly, and expanding the checklist itself as we go along. We will also reorder these issues to prioritize them and to indicate which should be checked first.

We will also try to include sources and documents about these different issues as well.

If you have any suggestions, questions, or advice to add, please either let me know directly @chriscountey, or use the comments.

Site Architecture

Canonical URLs (Best Page Addresses)

- Access to pages on domain (www vs. non-www)
- Home Page linking consistency
- Capitalization/Lower Case (capitals in domain name ok, in folders and files a potential problem)
- Print Versions (CSS rather than crawlable duplicate PDFs/Docs)
- Canonical Link Elements – do they match up right?
- Rel Prev/Next link elements for paginated pages?
- Internal Redirects (internal 301 redirects avoided)
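
As a rough illustration of the www vs. non-www, capitalization, and trailing-slash checks above, the sketch below groups crawled URLs that collapse to the same address once those differences are ignored. The function names and the example.com URLs are hypothetical, and the normalization rules are assumptions that a real audit would adjust to the site (for instance, query strings are ignored here).

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Collapse common duplicate-URL variants into one comparison key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Lower-case the path and drop a trailing slash; "/" stands for the home page.
    path = parts.path.lower().rstrip("/") or "/"
    # Scheme, query string and fragment are ignored in this sketch.
    return host + path


def find_variants(urls):
    """Return only the keys that were reached through more than one URL."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}
```

Feeding it the URL list exported from a crawler gives a quick list of pages that answer at more than one address and may need a redirect or a canonical link element.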

Robots.txt File

- Correctly formatted
- Includes all it should (including cart pages, email referral pages, login pages)
- Includes link to XML sitemap or XML Sitemap Index
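
One way to spot-check these points is with Python's standard `urllib.robotparser` module. This is only a sketch: the robots.txt content, the paths, and the `audit_robots` helper are made up for illustration, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary shop.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /login/
Disallow: /email-a-friend/

Sitemap: https://www.example.com/sitemap.xml
"""


def audit_robots(robots_txt, must_block, base="https://www.example.com"):
    """Report paths that should be blocked but are not, plus listed sitemaps."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    not_blocked = [path for path in must_block if parser.can_fetch("*", base + path)]
    # site_maps() returns None when no Sitemap line is present.
    return {"not_blocked": not_blocked, "sitemaps": parser.site_maps() or []}
```

Calling `audit_robots(ROBOTS_TXT, ["/cart/checkout", "/login/", "/search/"])` would report `/search/` as not blocked, which is the kind of gap this part of the checklist is meant to catch.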

Meta robots noindex/nofollow

- Used Appropriately
- Used on pages that a deep crawler might try to index (like form and search results pages)
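
To see which directives a page actually carries, a small parser built on the standard `html.parser` module can pull them out of the page source. The class and function names here are hypothetical, and the sketch only looks at the generic `robots` meta tag, not crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of meta robots directives in a page's HTML."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives
```

Running this over form and search results pages shows whether `noindex` is present where it is expected, and running it over key landing pages shows whether it has been applied by mistake.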

Category/Site Structure (URLS and Information Architecture)

- Unique and User Friendly
- Use of appropriate category and sub-category link structures
- Customer orientated rather than feature orientated
- Provides tasks/Options for different personas

Choosing File Names

- Uses hyphens as word separators
- Unique
- Avoids Keyword Stuffing
- If file names are to be changed, links on the site are updated and 301s are set up for external visitors
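
A minimal sketch of how file names from a crawl might be screened against the points above. The `file_name_issues` helper is hypothetical, and treating any repeated word as possible keyword stuffing is a crude assumption that will need a human look.

```python
import re
from collections import Counter


def file_name_issues(name):
    """Return a list of naming problems for one file name, e.g. 'blue-widgets.html'."""
    issues = []
    stem = name.rsplit(".", 1)[0]  # drop the extension if there is one
    if "_" in stem or " " in stem:
        issues.append("use hyphens as word separators")
    if stem != stem.lower():
        issues.append("use lower case")
    words = [word for word in re.split(r"[-_\s]+", stem.lower()) if word]
    if any(count > 1 for count in Counter(words).values()):
        issues.append("repeated word (possible keyword stuffing)")
    return issues
```

Uniqueness is easier to check in a spreadsheet or with a counter over the full list of file names from the crawl.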

Custom Error Page

- Sends proper 404 code status
- No soft 404s
- Helpful to visitor (navigation, directories, search)

HTML Sitemap

- Organized into user friendly and user oriented categories
- Provides links to most important pages
- Avoids using too many links
- Doesn’t include 404s or links that redirect internally

XML Sitemap

- Properly formatted (XML proper encoding)
- Uses only canonicals
- No 404s and no internally redirected pages
- Submitted to GWT and Bing Tools
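
The "no 404s and no internally redirected pages" check can be approximated by comparing the sitemap against status codes gathered in a separate crawl. In this sketch the function names are hypothetical, `crawl_status` is assumed to be a plain dictionary of URL to HTTP status exported from whatever crawler is in use, and sitemap index files are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def sitemap_problems(xml_text, crawl_status):
    """Map each sitemap URL that did not return 200 to the status it returned.

    URLs missing from crawl_status map to None, meaning they were never crawled.
    """
    return {
        url: crawl_status.get(url)
        for url in sitemap_urls(xml_text)
        if crawl_status.get(url) != 200
    }
```

A malformed file will make `ET.fromstring` raise a `ParseError`, which doubles as a basic "properly formatted" test.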


Technical/Server Issues

OS/Server/CMS/Catalog Considerations

Server Status: Messages 200, 300, 400, 500

Secure Server | HTTPS Protocol

- No error messages
- No https bleed-over to pages that aren’t supposed to be https
- No certificate authority errors

Search Friendly Links

- All pages to be indexed are reachable through text-based links or “href” and “src” attributes.

Broken and Redirected Links

- Broken links identified, then removed or replaced
- All 301 redirected links replaced with direct links
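
Once a crawl has produced a list of redirects, working out what each redirected link should point to directly is mostly bookkeeping. The sketch below assumes the redirects are available as a simple source-to-target dictionary; the function names and the `max_hops` limit are illustrative choices, not anything prescribed by a crawler or a search engine.

```python
def resolve(url, redirects, max_hops=10):
    """Follow url through a {source: target} redirect map.

    Returns (final_url, hops). Stops early on a loop or after max_hops.
    """
    seen = {url}
    hops = 0
    while url in redirects and hops < max_hops:
        url = redirects[url]
        hops += 1
        if url in seen:  # redirect loop
            break
        seen.add(url)
    return url, hops


def links_to_update(links, redirects):
    """Map each redirected link to the URL it should point at directly."""
    return {link: resolve(link, redirects)[0] for link in links if link in redirects}
```

The hop count is also a quick way to find chains of redirects that have built up over time.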

External Links

- Checked for broken links and redirects and replaced where appropriate
- Pages linked to checked for repurposed content

Duplicated Content

- Internally (see canonical section above)
- Mirrors identified and disallowed/noindexed as appropriate
- Substantially duplicated content on self-owned other sites removed/changed/blocked
- Substantially duplicated content on other sites removed (friendly email, AUP letter to host, DMCA)

JavaScript

- Can pages be navigated with javascript disabled? If not, are URLs for pages accessible in HTML code with “href” and “src”?
- If Ajax is necessary, is Google’s hashbang approach used?

Dynamic Pages

- Avoid session IDs in URLs
- Avoid excessive multiple data parameters in URLs
- Avoid excessive processor calls
- Avoid calls to multiple servers as much as possible
- Avoid keyword insertion pages (pages where the content is substantially the same except for keywords that are inserted into the pages).
- Keep boilerplate (disclaimers, copyright notices, other text that appears on most pages) that exists on templates light.
- Label page segments semantically well (the div class for those could be things such as header, footer, sidebar, advertisement, or whichever is most appropriate.)
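
The first two items in this list can be screened automatically from a list of crawled URLs. This is a sketch under stated assumptions: the session parameter names and the limit of three query parameters are guesses to be replaced with whatever fits the site's platform, and `dynamic_url_issues` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed parameter names; adjust to whatever the site's platform really uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def dynamic_url_issues(url, max_params=3):
    """Flag session IDs and an excessive number of query parameters in a URL."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    names = {name.lower() for name, _ in params}
    issues = []
    # Some platforms append the session ID to the path as ";jsessionid=...".
    if names & SESSION_PARAMS or ";jsessionid=" in parts.path.lower():
        issues.append("session ID in URL")
    if len(params) > max_params:
        issues.append(f"{len(params)} query parameters")
    return issues
```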

Page Load Times

- Images compressed for right dimensions and for file sizes?
- GZIP or Deflate used?
- Base 64 encoding for images avoided?
- External CSS and Javascript used and minimized?
- Long browser caching dates?
- CDN in use where appropriate?
- Other Page Speed considerations
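
The compression and browser caching questions can be answered from response headers alone. The sketch below takes a dictionary of headers already captured by a crawler or browser tool; the `header_issues` name is hypothetical and the one-week minimum cache lifetime is an arbitrary assumption, not a published threshold.

```python
import re


def header_issues(headers, min_max_age=7 * 24 * 3600):
    """Check one response's headers for compression and cache lifetime."""
    lowered = {name.lower(): value for name, value in headers.items()}
    issues = []
    if lowered.get("content-encoding", "").lower() not in ("gzip", "deflate"):
        issues.append("no compression")
    # Only Cache-Control max-age is inspected; Expires headers are ignored here.
    match = re.search(r"max-age=(\d+)", lowered.get("cache-control", ""))
    if not match or int(match.group(1)) < min_max_age:
        issues.append("short or missing cache lifetime")
    return issues
```

Long cache lifetimes make sense for static assets such as images, CSS, and script files, so this check is best run against those rather than against HTML pages.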

Cookies

- Navigation of indexable pages possible without accepting them?


HTML Use/Analysis

Deprecated HTML/HTML Validation

- If invalid, are errors the type that will harm SEO?

Cascading Style Sheets (CSS)

- If invalid, are errors the type that will harm SEO?

Title Elements

- Relevant to the content of the page and keyword-rich.
- Meaningful and able to stand on its own as a description of the page it titles.
- Persuasive and Engaging to those who see it out of context
- As unique as possible compared to other titles on the site
- If the name of the site appears in the title, it should be at the end of the title, and not at the beginning, unless it is the home page.
- No more than ten words or roughly 60-70 characters in length.
- Unique if possible compared to titles from other sites.
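
The measurable parts of this list (length, site name placement, and uniqueness within the site) lend themselves to a quick script. This is a sketch only: the function names are hypothetical, the limits default to the ten words and 70 characters mentioned above, and relevance and persuasiveness still need a human reader.

```python
from collections import defaultdict


def title_issues(title, site_name=None, is_home=False, max_words=10, max_chars=70):
    """Return a list of problems found in one title element's text."""
    text = " ".join(title.split())  # collapse stray whitespace
    issues = []
    if len(text.split()) > max_words:
        issues.append(f"more than {max_words} words")
    if len(text) > max_chars:
        issues.append(f"more than {max_chars} characters")
    if site_name and not is_home and text.lower().startswith(site_name.lower()):
        issues.append("site name at start of title")
    return issues


def duplicate_titles(titles_by_url):
    """Group URLs that share the same title, ignoring case and extra spaces."""
    by_title = defaultdict(list)
    for url, title in titles_by_url.items():
        by_title[" ".join(title.lower().split())].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```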

Meta Description Elements

- Descriptive of the content of the page
- Includes the main keyword phrase the page is optimized for
- Engaging and persuasive to viewers who see it out of context (search snippets or social shares)
- Around 25 words or 150 characters in length
- Well written sentences, using good punctuation
- One sentence preferable, but two alright if keywords are in the longer sentence
- Preferable to have keywords as close to the start as appropriate

Heading Elements

- Top level heading should describe the content of the page
- Lower level headings should effectively describe the content they head
- One top level heading preferable per page
- Headings should be used like headings in an outline, in proper order
- Main and subheadings can, and should contain targeted keywords if possible and appropriate.
- A heading element should not be used for the page logo
- Headings for lists and sections in page navigation should use CSS to style them rather than heading elements.
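
Whether headings follow outline order can be checked mechanically. The sketch below uses the standard `html.parser` module to collect heading levels and flags pages without exactly one top level heading or with skipped levels. The names are hypothetical, and "skipped level" is interpreted here as jumping down more than one level at a time.

```python
import re
from html.parser import HTMLParser


class HeadingParser(HTMLParser):
    """Records the level of every h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return outline problems: top level heading count and skipped levels."""
    parser = HeadingParser()
    parser.feed(html)
    levels = parser.levels
    issues = []
    if levels.count(1) != 1:
        issues.append(f"{levels.count(1)} top level headings")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"h{previous} followed by h{current}")
    return issues
```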

Strong/Em Elements

- For bold text, use the “strong” HTML element.
- For Italics text, use the “em” HTML element
- Use Strong and Em to highlight the use of keywords and related words
- When bolding or italicizing other text on a page, use CSS to style how it looks
- Don’t over use bold or italics – emphasizing too much means emphasizing nothing.

Image Optimization

- Use alt text for images on a page that are meaningful
- Use captions for images on a page that are meaningful
- A caption for an image should be contained within the same HTML element as the image (like a div)
- Select images that are meaningful that are related to the keywords optimized for
- Use the chosen optimized keywords in the alt text and captions where appropriate
- Use file names that reflect those keywords where appropriate.
- Use hyphens to separate words in image file names.
- Use alt="" for images that aren’t meaningful like decorations or bullet points
- Use alt text for logos that are descriptive of the business or organization
- Larger images with better resolution might be ranked a little better than smaller and lower resolution images.
- Alt text should not be a list of keywords, but can contain a keyword phrase.
- Alt text shouldn’t be more than 10 words or so.
- Avoid keyword stuffing alt text, captions, and image file names.
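
A few of these image checks can be automated from the page source: missing alt attributes, overly long alt text, and underscores in file names. The sketch below is illustrative, with hypothetical names and the ten-word limit taken from the list above. An empty `alt=""` is accepted because it is the recommended value for decorative images.

```python
from html.parser import HTMLParser


class ImageParser(HTMLParser):
    """Collects (src, alt) pairs; alt is None when the attribute is absent."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            self.images.append((attrs.get("src") or "", attrs.get("alt")))


def alt_text_issues(html, max_words=10):
    """Return (src, problem) pairs for the images found in a page."""
    parser = ImageParser()
    parser.feed(html)
    issues = []
    for src, alt in parser.images:
        if alt is None:
            issues.append((src, "missing alt attribute"))
        elif len(alt.split()) > max_words:
            issues.append((src, f"alt text longer than {max_words} words"))
        if "_" in src.rsplit("/", 1)[-1]:
            issues.append((src, "underscore in file name"))
    return issues
```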

Anchor Text

- Keywords should be used in anchor text
- If the keywords for a page being pointed to aren’t used, related terms should be
- Anchor text used in navigation should be descriptive of what is on the page linked to
- Anchor text should not use generic terms such as “click here.”
- Anchor text shouldn’t be longer than 10 words or so if possible
- Anchor text shouldn’t be stuffed with multiple keywords
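
Generic and overly long anchor text is easy to find with a script. In this sketch the list of generic phrases is an assumption to be extended as needed, the class and function names are hypothetical, and nested or image-only links are not treated specially.

```python
from html.parser import HTMLParser

# Assumed list of generic phrases; extend it to suit the site.
GENERIC_ANCHORS = {"click here", "here", "read more", "more", "link"}


class AnchorParser(HTMLParser):
    """Collects (href, text) pairs for links that have an href attribute."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None


def anchor_text_issues(html, max_words=10):
    """Return (href, problem) pairs for generic or overly long anchor text."""
    parser = AnchorParser()
    parser.feed(html)
    issues = []
    for href, text in parser.anchors:
        if text.lower().strip(" .!") in GENERIC_ANCHORS:
            issues.append((href, "generic anchor text"))
        elif len(text.split()) > max_words:
            issues.append((href, f"anchor text longer than {max_words} words"))
    return issues
```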

Meta Data optimization

- Search engines do not use Dublin core meta tags
- Search engines do not use the revisit meta tag
- A robots index, follow tag is unnecessary and redundant
- A NOODP meta tag will keep Google and Bing from using Open Directory Project titles instead of title element titles, if the site is even listed in DMOZ


Content Review

Amount of Text

- Having some minimum amount of text on a page (200 words?) gives search spiders something to index.

Spelling Errors

- Possible quality signal
- Important to credibility

Keyword Use in Copy

- Are keywords chosen for a page being used in page titles, meta descriptions, headings, and content?

Keyword Prominence/Visual Segmentation

- How well does the HTML code of a page show how it’s broken down into different blocks (heading, main content, sidebars, footers, etc.)?
- Are keywords used in the different sections, and especially in the main content area of pages?

Use of Related Words/Phrases

- Some words tend to co-occur on pages ranked highly for a certain query (or categories of results for queries), and it can help in the rankings for a page to use some of those phrases.

Penguin/Panda Analysis

Is there a loss in traffic that corresponds to one of the Panda or Penguin updates?

Resource: http://www.seomoz.org/google-algorithm-change


Negative Practices

Hidden Text

- Is there text on pages in the same font color as the background?
- Is there text on pages hidden through an offset div?
- Is there a large amount of text on pages in small iframes or CSS scrolling overflows?
- Is there text in a font color so close to the page background color that it might be mistaken for hidden text?
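
For the font color questions, one possible yardstick is the contrast ratio defined in the WCAG accessibility guidelines, where 1 means identical colors and 21 is black on white. Using it for hidden-text detection is our own adaptation, and the 1.5 cut-off in `looks_hidden` is an arbitrary assumption rather than anything a search engine has published.

```python
def hex_to_rgb(color):
    """Convert '#fff' or '#ffffff' to an (r, g, b) tuple of 0-255 integers."""
    color = color.lstrip("#")
    if len(color) == 3:
        color = "".join(ch * 2 for ch in color)
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def relative_luminance(color):
    """Relative luminance as defined by WCAG."""
    def channel(value):
        value = value / 255
        return value / 12.92 if value <= 0.03928 else ((value + 0.055) / 1.055) ** 2.4

    r, g, b = (channel(value) for value in hex_to_rgb(color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1 (same) to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def looks_hidden(foreground, background, threshold=1.5):
    """Flag text whose color is nearly indistinguishable from its background."""
    return contrast_ratio(foreground, background) < threshold
```

Text flagged this way is not necessarily deceptive, but it is worth a manual look before a search engine draws its own conclusion.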

Cloaking

- Does the site use cloaking to show search engines one thing and visitors something else?

Meta Refresh

- Are meta refreshes used instead of redirects, and if so might they be used in a way which might deceive search engines?

JavaScript Redirection

- Is javascript redirection being used so that search engines see one thing, and visitors see something else?

Outward Links/Link Exchanges

- Is the site using link directory pages that promise being listed in exchange for a link?

Keywords

Keyword Research, Selection and Implementation

- Are relevant, competitive, appropriate and popular keywords being used on the pages of the site?
- Are those keywords being used effectively on those pages?

Keyword Focusing | Mid- to Long-Tail Key Phrases

- Do the main pages of the site focus upon more competitive keyword phrases?
- Do deeper pages with less pagerank focus upon long-tail phrases?

Webmaster Tools

Google Webmaster Tools/Errors Analysis*

- Has the site been verified with GWT?
- Has a choice of “www” setting been made? (Doesn’t have to be if domain access issues are addressed)
- Has a targeted country/location been selected? (Doesn’t have to be)
- Have any errors listed been checked upon?


Social Media

Social Media Audit | Status

- Does the site integrate appropriate social sharing buttons?
- Do the pages of the site provide links to social profiles for the site?

On-Site Social Engagement

- Does the site provide ways to give feedback to the site owners?
- Does the site provide a way to leave comments?
- Is there user generated content on the site, such as reviews and ratings, and does it use rich snippets if so?
- Are there public user/member profile pages, and if so how rich are they in terms of features?
- Is there a forum on the site, and if so, some guidelines for its use?

Analytics

Have analytics been set up for the site?
- Code on every page

Want more? Check out this awesome technical SEO checklist: http://www.seomoz.org/blog/how-to-do-a-site-audit

  • http://www.seounique.com/blog Matt Ridout

    Great checklist, lots of detail there – enough to last months!

    • Chris Countey

      Thanks Matt!


  • James

    Curious, why should internal 301s be avoided?

    • Chris Countey

      Can you give me a scenario? Servers should return the most accurate response for each situation. For example, if you retire a page, I would not use a 301 unless you are really just changing URLs.

      • http://www.jamesthrasher.com James

        Eh, that’s probably the scenario. I have URL’s that need to change from time to time, so I was wondering if there was a specific reason I shouldn’t be 301 redirecting internal pages.

        • https://twitter.com/bill_slawski Bill Slawski

          Hi James,

          There are at least a couple of reasons why you want to avoid or limit the number of 301 redirects that you use on a site, especially when they are internal redirects:

          1. Each redirect has a processing cost with a call to your server’s processor (an additional HTTP request-response cycle). If your site has a lot of internal redirects, it can slow down your pages. See: https://developers.google.com/speed/docs/best-practices/rtt#AvoidRedirects

          2. Using a 301 redirect instead of a direct link can reduce the amount of PageRank that your pages pass along to each other (per an email sent by Matt Cutts to Eric Enge as a note in an interview of Matt by Eric – see: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml)

          3. If you let redirects build up over time so that you have chains of redirects, Googlebot may stop following the chain after 3-4 redirects.

          • http://www.workwithclintbutler.com Clint Butler

            What if you delete a lot of content and you’re redirecting to your home page, and the 301 only exists on those URLs that are no longer there?

  • http://www.williamalvarez.com William Alvarez

    Hi Chris,
    This looks like a cookie-cutter SEO audit. Keep in mind that every website suffers from different issues and they should be analyzed in a different way. If you limit your audit to this list, you may be ignoring other problems not visible with this check list alone. Every CMS has its own challenges, not to mention CDNs, etc.

    • Chris Countey

      Hi William,

      Thanks for your comment. I think this looks cookie-cutter only because these are best practices. Implementing these items from a technical and a strategic standpoint will vary with each site and each site’s goals. This is more of a “what to look for” not so much a “how to do it”.

    • https://twitter.com/bill_slawski Bill Slawski

      Hi William,

      I’m going to disagree with your assessment of this audit checklist as a cookie cutter audit. :)

      Every website is different, but the issues identified in this audit are common problems that are often found in many or most websites, and are things that should ideally be checked upon.

      No one said not to look at other issues as well and no one said to limit an audit only to these issues. But failing to check for the kinds of things identified here can mean failing to perform due diligence. Of course, when you do an audit on a specific CMS or certain types of sites, and there are known issues (not included here) and other potential problems, those things should be checked as well. Have a WordPress blog? Make sure that you don’t create image pages wrapped in the template, but without the text for those pages. Creating a page you want included in Google News? Make sure that you fulfill the technical requirements necessary to be included.

      This audit doesn’t replace common sense and thoughtful analysis, but hopefully people reading this post will find it useful. Of course you should go through Google Webmaster Tools and Bing tools, and look for issues and errors that need identifying that might be outside the parameters of this checklist, like IP canonicalization. You should also look through “site” search results to see if other things stand out as well.

      This audit shouldn’t replace the kind of detailed analysis that anyone should do when they do SEO. But using it or something like it is going to help make sure that you don’t miss many things that should be checked anyway.

      Thanks.

  • http://www.whitehat-seo.co.uk/ Michael Talburt

    Great list! This is very helpful to me and for my website!

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Michael,

      Good to hear that you find it useful. Thanks.

  • http://www.webtechservicesinc.com Kevin

    I love two things about this post:

    1. The fact that you don’t list W3C validation as one of your checklist items. It has its place, but not necessarily in an SEO audit doc.

    2. That you list a lot of common-sense language. Like the Validation and Keyword research sections. And you also list items that do not necessarily need to be done (like choosing www versus non-www version in GWT).

    Overall I think this checklist is great because it contains flexible language, as non-essential items are commented as such & common-sense language is used (like in the keyword research section).

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Kevin,

      Thanks. W3C validation is worth doing to check things like whether or not you might end up in quirks mode on different browsers. It can also help with issues that might speed things up, or reveal a mistake that you didn’t realize you had made. Some HTML errors could have a negative impact on SEO, and it’s worth checking just to make sure that none do. Plus, checking gives you a chance to see issues that you might want to address when you have the time to focus upon them.

      I tried to keep the language simple in those sections, so it’s great to get some validation that it is. Appreciate that you did. Setting a specific subdomain is something that you can do in GWT, but that only really works with Google, and I’d rather just fix the issue itself. :)

  • http://www.mediawhiz.com Marjory Meechan

    Hi Bill,

    I agree with you. This is a great checklist and not cookie-cutter at all. It’s so easy to get pulled down the rabbit hole of issues with a complicated website and miss something that should have been checked almost as a matter of rote. This checklist is a great way to get the standard stuff out of the way so you can focus on the more complicated issues.

    I especially like the way this checklist is phrased. The focus on relevance through related items and word placement rather than keyword repetitions or “H” tags makes it so clear that the quality of the site isn’t about HTML tricks. It’s about making a quality site for users.

    Marjory
    p.s. This is in my bookmarks now.

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Marjory

      I’ve been using a checklist that’s very close to this for years as a starting point for SEO audits on many sites, and you’re right about having a useful focus and methodology. I like starting out with a look at these issues because it does get them out of the way, and enables you to look at things that might not be garden variety problems.

      It’s definitely about making it a quality site for users, and I tend to think of the things that I’ve listed as building the foundation for a site so that it can become successful.

      Thank you!

  • http://andymarchant.com Andy Marchant

    Good article, describes the very basic fundamental issues that need to be covered in an audit process. But it’s in the fine details and experience of the auditor where the real value comes from.

    I am more interested as to how the multiple Google Author markup is going to work…

    I wonder if it would make a difference if you was to stick a “+” in front of one of the names… Currently, (and without checking) It must be the first authors face, Bill in this case, that will come up in the SERP’s… Interesting test!

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Andy,

      Good questions. I definitely wanted to make sure that the audit covers the fundamentals, and I agree that knowing where you need to dig more deeply, and where things are happening that aren’t covered by addressing the fundamentals alone is what can make a tremendous difference.

      When we first posted this, we listed a number of people as authors, with Chris Countey as the main author after the byline. Not too long after publishing, we did find the post ranking in Google SERPs with Chris listed as the author, and none of the other authors listed. We changed it to me as the main author (on the byline as opposed to below it). This morning, Google was showing me as the author of the post, with my picture showing as the profile next to the snippet in Google search results.

      This does verify what Google’s John Mueller said would probably happen over at Google Plus when an author is changed for a post.

      It doesn’t tell us though whether or not Google might give the other authors credit for the post, but I suspect that Google Plus doesn’t have that capability yet.

      It is worth experimenting with more. Did notice a WordPress plugin that can list multiple authors in a byline, and trying that out might be next.

  • http://www.linkedin.com/in/socoastal Justin Urich

    Yet another homerun article Bill. Did I learn anything new? No. BUT, you’ve organized everything I already know into a neat and tidy all-in-one resource which is super helpful.

    Thanks for taking the time to write it and keep ‘em coming!

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Justin,

      Thanks. That was the point of putting this together – to make it as convenient as possible to make sure all the different issues listed are looked at.

      Glad to hear that you appreciated it.

      Bill

      • http://twitter.com/RagilPembayun Ragil Pembayun

        Hi Bill,

        This is great stuff and I seconded Justin’s comment :)

        It would also be nice if there was a pdf or excel version of this so that it’s downloadable, not that I don’t appreciate it as it is.

        What’s happened with SEO by the sea by the way? Has it ceased to exist? I certainly hope not as I always enjoy reading and learning stuff from there! :)

        Cheers,
        Ragil

        • https://twitter.com/bill_slawski Bill Slawski

          Hi Ragil,

          Thanks. No PDF or Excel version in the works at this point. But something we might think about.

          SEO by the Sea is fine. My posting has dropped off a little due to being pretty busy, and to not finding patents or papers that I’ve been very excited to write about. Thanks for your kind words about it.

  • http://www.seriocomic.com seriocomic

    “Base 64 encoding for images avoided?” – why? Base 64 encoding of boilerplate images is a great way to reduce the number/latency of the HTTP aspects – even better if you can Base64 encode your sprites. All modern browsers accept them…

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Seriocomic,

      Internet Explorer is the main reason why that’s recommended. IE7 and below don’t support it, and IE8 only has limited support. IE9 removed most of the limitations: http://en.wikipedia.org/wiki/Data_URI_scheme

  • http://www.globalsearchtrends.com/ jelly andrews

    This is such an informative post. Impressive! Thanks for sharing something so meaningful.

  • http://www.seolondonsurrey.co.uk c byrne

    Hi Bill,

    Thanks for sharing your deep knowledge of SEO. Would you be so kind as to tell us the tools you use for an audit pls?

    Thanks,

    Chris

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Chris,

      You’re welcome. For the onsite/onpage parts of an SEO audit, I primarily use Xenu Link Sleuth, Screaming Frog Crawler, Excel (to sort things, and to do things like highlight duplicates such as URLs and canonicals), Notepad, Google Webmaster Tools, Google/Bing Search Results, and W3C validators, browsers where I can turn on and off things like cookies, javascript, images, CSS, etc.

  • http://www.blakestrategiesgroup.com Jonathon

    I don’t think this is entirely comprehensive, but it’s a great place to start if you want a quick overview. I agree that every site is different, so looking ONLY at the items here might not be a good idea, but still a worthwhile list.

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Jonathon,

      Thanks.

      Of course this audit doesn’t contain much in the way of off site SEO, detailed social sharing and use of social elements in a site such as user-generated content and user profiles, IP canonicalization, using Google Webmaster Tools to identify and refine titles and meta descriptions that might need work, reducing http requests, and more. I’ve run across many different issues that are unique to individual sites that also didn’t make it into this checklist, but that doesn’t make it worth any less.

      I did set out to create a checklist, and not a book. :)

      Looking at the things listed here is a VERY GOOD idea, but failing to look at other things too isn’t.


  • http://twitter.com/aschottmuller Angie Schottmuller (@aschottmuller)

    Thanks for sharing, Bill! I agree there’s no cookie-cutter approach. I’m doing an SEO Audit now, and it’s always helpful to see how other SEOs have structured their reports.

    • https://twitter.com/bill_slawski Bill Slawski

      Hi Angie,

      You’re welcome. I really enjoy seeing audits from other people as well, and what they do to call attention to different factors within the audit, and how they illustrate them. That’s something that it’s always interesting to work upon, and try different things with.

  • http://parvezweblog.blogspot.com/ Sohel Parvez

    Thanks Bill, yup your list covers all important criteria depends on SEO.
