SEO Audit Checklist

Authors: Bill Slawski, Chris Countey, Michael Stricker, Jeremy Niedt and Matt Haran

In addition to providing useful information, we’re also testing how Google will attribute authorship when multiple authors are listed with correct markup. For fun, one author is using his Twitter profile as the hook.

Table of Contents

  1. Site Architecture
  2. Technical/Server Issues
  3. HTML Use/Analysis
  4. Content Review
  5. Negative Practices
  6. Keywords
  7. Webmaster Tools
  8. Social Media

The following is a high-level checklist of issues that should be explored on a site. In the weeks to come we will drill deeper into each part of this checklist, providing examples, tips, and suggestions for speeding up these checks, and expanding the checklist itself as we go along. We will also reorder these issues to prioritize them and indicate which should be checked first.

We will also try to include sources and documents about these issues.

If you have any suggestions, questions, or advice to add, please either let me know directly @chriscountey, or use the comments.

Site Architecture

Canonical URLs (Best Page Addresses)

- Access to pages on domain (www vs. non-www)
- Home Page linking consistency
- Capitalization/Lower Case (capitals in domain name ok, in folders and files a potential problem)
- Print Versions (CSS rather than crawlable duplicate PDFs/Docs)
- Canonical Link Elements – do they match up right?
- Rel Prev/Next link elements for paginated pages?
- Internal Redirects (internal 301 redirects avoided)
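
Several of the canonical URL checks above can be run mechanically over a list of internal links. The following is a minimal sketch, not a complete canonicalization audit: the function name, the preferred host argument, and the short list of home page file names are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical list of file names that often duplicate the home page URL "/".
HOME_PAGE_FILES = ("/index.html", "/index.php", "/default.aspx")

def canonical_url_issues(url, preferred_host):
    """Return a list of canonical-URL problems found in one internal link."""
    parts = urlsplit(url)
    issues = []
    host = (parts.hostname or "").lower()
    # www vs. non-www: every internal link should use one host consistently.
    if host != preferred_host.lower():
        issues.append(f"host {host!r} differs from preferred {preferred_host!r}")
    # Capitals in folders and files are a potential duplicate-content problem.
    if parts.path != parts.path.lower():
        issues.append("capital letters in path")
    # Home page linking consistency: link to "/" rather than a file name.
    if parts.path.lower() in HOME_PAGE_FILES:
        issues.append("home page linked by file name instead of '/'")
    return issues
```

Calling `canonical_url_issues("http://example.com/Index.html", "www.example.com")` would flag all three problems, while a lower-case link on the preferred host returns an empty list.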

Robots.txt File

- Correctly formatted
- Includes all it should (including cart pages, email referral pages, login pages)
- Includes link to XML sitemap or XML Sitemap Index
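
As a rough illustration of the robots.txt checks above, Python's standard `urllib.robotparser` can parse a file and report which paths are still crawlable and which sitemaps are listed. The sample file, the example.com host, and the `audit_robots` helper are assumptions for this sketch; `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /login/
Sitemap: https://www.example.com/sitemap.xml
"""

def audit_robots(robots_txt, must_block, site="https://www.example.com"):
    """Return (paths that are still crawlable, sitemap URLs listed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Any path in must_block that a generic crawler may fetch is a gap.
    not_blocked = [path for path in must_block
                   if parser.can_fetch("*", site + path)]
    # site_maps() returns None when no Sitemap line is present.
    return not_blocked, parser.site_maps() or []

not_blocked, sitemaps = audit_robots(
    ROBOTS_TXT, ["/cart/", "/login/", "/email-referral/"])
```

Here the email referral path would be reported as not blocked, which is the kind of omission the second bullet is meant to catch.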

Meta robots noindex/nofollow

- Used Appropriately
- Used on pages that a deep crawler might try to index (like form and search results pages)
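
To see which meta robots directives a page actually carries, the HTML can be parsed with the standard library. This is a small sketch under the assumption that the page source is already available as a string; the class and function names are ours.

```python
from html.parser import HTMLParser

class MetaRobotsParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower()
                for part in content.split(",") if part.strip())

def robots_directives(html):
    """Return the set of meta robots directives found in an HTML string."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives
```

Running this over form pages and internal search result pages makes it easy to confirm that `noindex` is present where intended and absent from pages that should rank.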

Category/Site Structure (URLS and Information Architecture)

- Unique and User Friendly
- Use of appropriate category and sub-category link structures
- Customer orientated rather than feature orientated
- Provides tasks/Options for different personas

Choosing File Names

- Uses hyphens as word separators
- Unique
- Avoids Keyword Stuffing
- If file names are to be changed, links on the site are updated and 301s are set up for external visitors
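
A quick way to screen file names against the points above is sketched below. The threshold for "keyword stuffing" (a word repeated more than twice) is an arbitrary assumption chosen for the example, not a known search engine rule.

```python
import re

def file_name_issues(file_name, max_repeats=2):
    """Flag separator, case, and repetition problems in one file name."""
    issues = []
    stem = file_name.rsplit(".", 1)[0]  # drop the extension
    if "_" in stem or " " in stem or "%20" in stem:
        issues.append("use hyphens as word separators")
    if stem != stem.lower():
        issues.append("use lower case")
    words = [w for w in re.split(r"[-_\s]+", stem.lower()) if w]
    # Assumed heuristic: the same word more than max_repeats times.
    if any(words.count(w) > max_repeats for w in set(words)):
        issues.append("possible keyword stuffing")
    return issues
```

Uniqueness is easier to check across the whole site, for example by counting how often each file name appears in a crawl export.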

Custom Error Page

- Sends proper 404 code status
- No soft 404s
- Helpful to visitor (navigation, directories, search)
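
One common way to test for soft 404s is to request an address that cannot exist and look at the status code that comes back. The sketch below assumes that approach; the helper names are hypothetical, and because `urlopen` follows redirects, the status reported is the one for the final page.

```python
import urllib.error
import urllib.request
import uuid

def status_for_missing_url(site):
    """Request a random, almost certainly nonexistent path and return its status."""
    url = f"{site.rstrip('/')}/{uuid.uuid4().hex}"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is what we want.
        return err.code

def classify_missing_page(status_code):
    """Interpret the status returned for a URL that should not exist."""
    if status_code in (404, 410):
        return "ok: proper error status"
    if status_code == 200:
        return "soft 404: missing page answered with 200"
    if 500 <= status_code < 600:
        return "server error instead of 404"
    return "unexpected status"
```

For example, `classify_missing_page(status_for_missing_url("https://www.example.com"))` would report a soft 404 if the site redirects unknown addresses to the home page. Whether the error page is helpful to visitors still needs a manual look.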

HTML Sitemap

- Organized into user friendly and user oriented categories
- Provides links to most important pages
- Avoids using too many links
- Doesn’t include 404s or links that redirect internally

XML Sitemap

- Properly formatted (XML proper encoding)
- Uses only canonicals
- No 404s and no internally redirected pages
- Submitted to GWT and Bing Tools
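
The first two XML sitemap checks can be approximated with the standard library: parsing fails loudly if the XML is malformed, and the listed URLs can then be compared against the preferred canonical prefix. The sample sitemap and the helper names below are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap used only for this example.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""

def sitemap_urls(xml_bytes):
    """Return every <loc> value; raises ET.ParseError on malformed XML."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip()
            for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]

def non_canonical(urls, preferred_prefix):
    """Return the URLs that do not start with the canonical scheme and host."""
    return [url for url in urls if not url.startswith(preferred_prefix)]

urls = sitemap_urls(SAMPLE)
suspect = non_canonical(urls, "https://www.example.com/")
```

The resulting URL list can also be fed to a status checker to confirm that no entry returns a 404 or a redirect.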

Technical/Server Issues

OS/Server/CMS/Catalog Considerations

Server Status: Messages 200, 300, 400, 500

Secure Server | HTTPS Protocol

- No error messages
- No https bleed-over to pages that aren’t supposed to be https
- No certificate authority errors

Search Friendly Links

- All pages to be indexed are reachable through text-based links or “href” and “src” attributes.

Broken and Redirected Links

- Broken links identified, then removed or replaced
- All 301 redirected links replaced with direct links
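
The link checks above can be sketched as a small collector that pulls every "href" and "src" out of a page and sorts the results by status. How the status of each link is obtained is left open: `status_of` is a hypothetical callable supplied by the caller, so a crawl export or a live HTTP check could both be plugged in.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect absolute URLs from every href and src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))

def links_to_fix(html, base_url, status_of):
    """Map each problem link to the action the checklist suggests."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    report = {}
    for link in collector.links:
        status = status_of(link)  # assumed to return an int or None
        if status is None:
            continue
        if status in (301, 302):
            report[link] = "redirected: link directly to the target"
        elif status >= 400:
            report[link] = "broken: remove or replace"
    return report
```

Links that only exist inside scripts will not show up in `collector.links`, which is itself a useful signal for the search friendly links check.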

External Links

- Checked for broken links and redirects and replaced where appropriate
- Pages linked to checked for repurposed content

Duplicated Content

- Internally (see canonical section above)
- Mirrors identified and disallowed/noindexed as appropriate
- Substantially duplicated content on self-owned other sites removed/changed/blocked
- Substantially duplicated content on other sites removed (friendly email, AUP letter to host, DMCA)

JavaScript

- Can pages be navigated with JavaScript disabled? If not, are URLs for pages accessible in HTML code with “href” and “src”?
- If Ajax is necessary, is Google’s hashbang approach used?

Dynamic Pages

- Avoid session IDs in URLs
- Avoid excessive multiple data parameters in URLs
- Avoid excessive processor calls
- Avoid calls to multiple servers as much as possible
- Avoid keyword insertion pages (pages where the content is substantially the same except for keywords that are inserted into the pages).
- Keep boilerplate (disclaimers, copyright notices, other text that appears on most pages) that exists on templates light.
- Label page segments semantically well (the div class for those could be things such as header, footer, sidebar, advertisement, or whichever is most appropriate.)
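
The first two dynamic page checks lend themselves to a simple URL screen. In the sketch below, the list of session parameter names and the limit of three query parameters are assumptions chosen for the example; adjust both to the platform being audited.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed names of common session ID parameters.
SESSION_PARAMS = {"sid", "sessionid", "session_id", "phpsessid", "jsessionid"}

def dynamic_url_issues(url, max_params=3):
    """Flag session IDs and excessive query parameters in one URL."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    names = {name.lower() for name, _ in params}
    issues = []
    # Session IDs may sit in the query string or in the path (;jsessionid=...).
    if names & SESSION_PARAMS or ";jsessionid=" in parts.path.lower():
        issues.append("session ID in URL")
    if len(params) > max_params:
        issues.append(f"{len(params)} query parameters")
    return issues
```

Run over a crawl export, this gives a quick list of URL patterns worth raising with the developers.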

Page Load Times

- Images compressed for right dimensions and for file sizes?
- GZIP or Deflate used?
- Base 64 encoding for images avoided?
- External CSS and JavaScript used and minified?
- Long browser caching dates?
- CDN in use where appropriate?
- Other Page Speed considerations
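
The compression and caching questions above can be answered from response headers alone. This sketch assumes the headers have already been captured as a dictionary; the one-week minimum for "long" caching is our own placeholder, and Brotli ("br") is accepted even though the checklist only names GZIP and Deflate.

```python
def speed_header_issues(headers, min_cache_seconds=7 * 24 * 3600):
    """Check response headers for compression and browser caching."""
    normalized = {key.lower(): value for key, value in headers.items()}
    issues = []
    encoding = normalized.get("content-encoding", "").lower()
    # "br" (Brotli) is an addition to the checklist's GZIP/Deflate.
    if encoding not in ("gzip", "deflate", "br"):
        issues.append("response is not compressed")
    max_age = 0
    for directive in normalized.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            max_age = int(value)
    if max_age < min_cache_seconds:
        issues.append(f"browser cache lifetime is only {max_age} seconds")
    return issues
```

Static assets such as images, CSS, and scripts are the natural candidates for this check; HTML pages often have short cache lifetimes on purpose.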

Cookies

- Navigation of indexable pages possible without accepting them?

HTML Use/Analysis

Deprecated HTML/HTML Validation

- If invalid, are errors the type that will harm SEO?

Cascading Style Sheets (CSS)

- If invalid, are errors the type that will harm SEO?

Title Elements

- Relevant to the content of the page and keyword-rich.
- Meaningful and able to stand on its own as a description of the page it titles.
- Persuasive and Engaging to those who see it out of context
- As unique as possible compared to other titles on the site
- If the name of the site appears in the title, it should be at the end of the title, and not at the beginning, unless it is the home page.
- No more than ten words or roughly 60-70 characters in length.
- Unique if possible compared to titles from other sites.
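
The measurable parts of the title guidelines (length, site name position, uniqueness within the site) can be checked in a few lines. This is a sketch with hypothetical function names; the limits are the upper bounds from the list above, and relevance and persuasiveness still need a human reader.

```python
def title_issues(title, site_name=None, is_home_page=False,
                 max_chars=70, max_words=10):
    """Check one title against the length and site-name guidelines."""
    title = title.strip()
    if not title:
        return ["missing title"]
    issues = []
    if len(title) > max_chars:
        issues.append(f"{len(title)} characters (limit {max_chars})")
    if len(title.split()) > max_words:
        issues.append(f"{len(title.split())} words (limit {max_words})")
    # The site name belongs at the end, except on the home page.
    if (site_name and not is_home_page
            and title.lower().startswith(site_name.lower())):
        issues.append("site name should come at the end of the title")
    return issues

def duplicate_titles(titles_by_url):
    """Group URLs that share the same title, ignoring case and padding."""
    seen = {}
    for url, title in titles_by_url.items():
        seen.setdefault(title.strip().lower(), []).append(url)
    return {title: urls for title, urls in seen.items() if len(urls) > 1}
```

Feeding `duplicate_titles` a mapping of URL to title from a crawl gives the list of pages that need more distinctive titles.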

Meta Description Elements

- Descriptive of the content of the page
- Includes the main keyword phrase the page is optimized for
- Engaging and persuasive to viewers who see it out of context (search snippets or social shares)
- Around 25 words or 150 characters in length
- Well written sentences, using good punctuation
- One sentence preferable, but two alright if keywords are in the longer sentence
- Preferable to have keywords as close to the start as appropriate
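
A similar screen works for meta descriptions. The sketch below is only a rough heuristic: the 20-character tolerance around the 150-character target and the rule that the keyword should appear in the first half of the text are assumptions we picked to make "around" and "close to the start" testable.

```python
import re

def description_issues(description, keyword, target_chars=150, tolerance=20):
    """Check length, keyword placement, and sentence count of a description."""
    text = description.strip()
    issues = []
    if len(text) > target_chars + tolerance:
        issues.append("too long")
    position = text.lower().find(keyword.lower())
    if position == -1:
        issues.append("keyword missing")
    elif position > len(text) // 2:
        # Assumed cut-off for "as close to the start as appropriate".
        issues.append("keyword appears late")
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    if len(sentences) > 2:
        issues.append("more than two sentences")
    return issues
```

Whether a description is engaging and well written remains a judgment call that no length check can replace.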

Heading Elements

- Top level heading should describe the content of the page
- Lower level headings should effectively describe the content they head
- One top level heading preferable per page
- Headings should be used like headings in an outline, in proper order
- Main and subheadings can, and should contain targeted keywords if possible and appropriate.
- A heading element should not be used for the page logo
- Headings for lists and sections in page navigation should use CSS to style them rather than heading elements.
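
The structural heading rules (one top level heading, no skipped levels) can be verified automatically. This is a minimal sketch that only looks at heading order in the HTML source; it does not judge whether the headings describe their content.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Record the level of every h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))

def heading_issues(html):
    """Flag a wrong number of h1 elements and skipped outline levels."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    issues = []
    if levels.count(1) != 1:
        issues.append(f"{levels.count(1)} top level headings")
    # Going deeper by more than one level breaks the outline order.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"h{previous} followed by h{current}")
    return issues
```

Moving back up the outline (for example from an h3 to an h2) is fine and is not reported.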

Strong/Em Elements

- For bold text, use the “strong” HTML element.
- For italic text, use the “em” HTML element.
- Use Strong and Em to highlight the use of keywords and related words
- When bolding or italicizing other text on a page, use CSS to style how it looks
- Don’t overuse bold or italics; emphasizing too much means emphasizing nothing.

Image Optimization

- Use alt text for images on a page that are meaningful
- Use captions for images on a page that are meaningful
- A caption for an image should be contained within the same HTML element as the image (like a div)
- Select meaningful images that are related to the keywords optimized for
- Use the chosen optimized keywords in the alt text and captions where appropriate
- Use file names that reflect those keywords where appropriate.
- Use hyphens to separate words in image file names.
- Use alt="" for images that aren’t meaningful like decorations or bullet points
- Use alt text for logos that are descriptive of the business or organization
- Larger images with better resolution might be ranked a little better than smaller and lower resolution images.
- Alt text should not be a list of keywords, but can contain a keyword phrase.
- Alt text shouldn’t be more than 10 words or so.
- Avoid keyword stuffing alt text, captions, and image file names.
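
Some of the image points above are easy to screen for in the page source: a missing alt attribute, overly long alt text, and underscores in file names. The sketch below covers only those three; the helper names are ours, and an empty alt="" is treated as a deliberate choice for decorative images.

```python
from html.parser import HTMLParser

class ImageCollector(HTMLParser):
    """Collect the attributes of every <img> element."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))

def image_issues(html, max_alt_words=10):
    """Return (src, problem) pairs for the images in an HTML string."""
    collector = ImageCollector()
    collector.feed(html)
    issues = []
    for image in collector.images:
        src = image.get("src") or "(no src)"
        alt = image.get("alt")
        if alt is None:
            issues.append((src, "missing alt attribute"))
        elif len(alt.split()) > max_alt_words:
            issues.append((src, "alt text longer than about 10 words"))
        if "_" in src.rsplit("/", 1)[-1]:
            issues.append((src, "underscore in file name"))
    return issues
```

Whether the alt text and captions are meaningful, and whether keywords are used naturally, still has to be reviewed by eye.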

Anchor Text

- Keywords should be used in anchor text
- If the keywords for a page being pointed to aren’t used, related terms should be
- Anchor text used in navigation should be descriptive of what is on the page linked to
- Anchor text should not use generic terms such as “click here.”
- Anchor text shouldn’t be longer than 10 words or so if possible
- Anchor text shouldn’t be stuffed with multiple keywords
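
Generic and overlong anchor text can be found by collecting the text of every link on a page. In this sketch the set of "generic" phrases is a short, assumed list that should be extended for the site being audited.

```python
from html.parser import HTMLParser

# Assumed examples of generic anchor text; extend as needed.
GENERIC_ANCHORS = {"click here", "here", "read more", "more", "link"}

class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs for every <a href> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None

def anchor_text_issues(html, max_words=10):
    """Return (href, problem) pairs for generic or overlong anchor text."""
    collector = AnchorCollector()
    collector.feed(html)
    issues = []
    for href, text in collector.anchors:
        if text.lower().strip(" .!") in GENERIC_ANCHORS:
            issues.append((href, "generic anchor text"))
        elif len(text.split()) > max_words:
            issues.append((href, "anchor text longer than about 10 words"))
    return issues
```

The collected `(href, text)` pairs are also a convenient starting point for checking whether keywords or related terms are used for each target page.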

Meta Data optimization

- Search engines do not use Dublin core meta tags
- Search engines do not use the revisit meta tag
- A robots index, follow tag is unnecessary and redundant
- A NOODP meta tag will keep Google and Bing from using Open Directory Project titles instead of title element titles, if the site is even listed in DMOZ

Content Review

Amount of Text

- Having some minimum amount of text on a page (200 words?) gives search spiders something to index.
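
A rough word count of the visible text makes thin pages easy to spot. The sketch below counts words outside of script and style elements; it does not account for text hidden with CSS, and the 200-word figure above is a guideline rather than a known threshold.

```python
from html.parser import HTMLParser

class TextCounter(HTMLParser):
    """Count words in text nodes, ignoring script and style content."""

    def __init__(self):
        super().__init__()
        self.words = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())

def visible_word_count(html):
    """Return the approximate number of words a visitor can read."""
    counter = TextCounter()
    counter.feed(html)
    return counter.words
```

Pages that come in well under the chosen minimum can then be reviewed for whether they deserve more copy or should be consolidated.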

Spelling Errors

- Possible quality signal
- Important to credibility

Keyword Use in Copy

- Are keywords chosen for a page being used in page titles, meta descriptions, headings, and content?

Keyword Prominence/Visual Segmentation

- How well does the HTML code of a page show how it’s broken down into different blocks (heading, main content, sidebars, footers, etc.)?
- Are keywords used in the different sections, and especially in the main content area of pages?

Use of Related Words/Phrases

- Some words tend to co-occur on pages ranked highly for a certain query (or categories of results for queries), and it can help in the rankings for a page to use some of those phrases.

Penguin/Panda Analysis

Is there a loss in traffic that corresponds to one of the Panda or Penguin updates?

Resource: http://www.seomoz.org/google-algorithm-change

Negative Practices

Hidden Text

- Is there text on pages in the same font color as the background?
- Is there text on pages hidden through an offset div?
- Is there a large amount of text on pages in small iframes or CSS scrolling overflows?
- Is there text in a font color that matches the page background color and so might be mistaken for hidden text?

Cloaking

- Does the site use cloaking to show search engines one thing and visitors something else?

Meta Refresh

- Are meta refreshes used instead of redirects, and if so might they be used in a way which might deceive search engines?

JavaScript Redirection

- Is JavaScript redirection being used so that search engines see one thing, and visitors see something else?

Outward Links/Link Exchanges

- Is the site using link directory pages that promise being listed in exchange for a link?

Keywords

Keyword Research, Selection and Implementation

- Are relevant, competitive, appropriate and popular keywords being used on the pages of the site?
- Are those keywords being used effectively on those pages?

Keyword Focusing | Mid- to Long-Tail Key Phrases

- Do the main pages of the site focus upon more competitive keyword phrases?
- Do deeper pages with less pagerank focus upon long-tail phrases?

Webmaster Tools

Google Webmaster Tools/Errors Analysis

- Has the site been verified with GWT?
- Has a choice of “www” setting been made? (Doesn’t have to be if domain access issues are addressed)
- Has a targeted country/location been selected? (Doesn’t have to be)
- Have any errors listed been checked upon?

Social Media

Social Media Audit | Status

- Does the site integrate appropriate social sharing buttons?
- Do the pages of the site provide links to social profiles for the site?

On-Site Social Engagement

- Does the site provide ways to give feedback to the site owners?
- Does the site provide a way to leave comments?
- Is there user generated content on the site, such as reviews and ratings, and does it use rich snippets if so?
- Are there public user/member profile pages, and if so how rich are they in terms of features?
- Is there a forum on the site, and if so, some guidelines for its use?

Analytics

Have analytics been set up for the site?
- Code on every page

Want more? Check out this awesome technical SEO checklist: http://www.seomoz.org/blog/how-to-do-a-site-audit