Atlantic BT: Our Thoughts

June 16, 2009

How to Diagnose Website Indexing Issues

by Mark Thompson

It is amazing how often I get asked questions like, “Why can’t I find my website when I search for my company name?” or “Why is Google not finding all of the pages on my site?” So I thought I would write a post that explains why your website may not be getting properly indexed by search engines.

Check your Robots.txt file: The problem may be as simple as your robots.txt file disallowing search engines from crawling your site.  Type websiteurl.com/robots.txt into your browser to see if you are currently disallowing search engines.  If you see something like this …

[Screenshot: a robots.txt file containing a "Disallow: /" rule]

you are actually telling search engines not to crawl your site.  By simply removing the “/” after Disallow you will allow search engines to start crawling and indexing your site. The “/” represents your site’s root path. So that essentially means that you are telling search engines, when they visit http://www.yourcompany.com, NOT to crawl anything on your site.
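
If you would rather test this than eyeball it, here is a minimal sketch using Python's standard urllib.robotparser module. The two sample robots.txt strings and the is_crawl_allowed helper are my own illustrations, not taken from any real site; in practice you would paste in the contents of your own robots.txt file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # blocks the whole site
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # "/" removed: nothing is blocked


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(BLOCK_ALL, "http://www.yourcompany.com/"))  # False
print(is_crawl_allowed(ALLOW_ALL, "http://www.yourcompany.com/"))  # True
```

The only difference between the two samples is the “/” after Disallow, which is exactly the change described above.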

You may also be restricting search engines in your source code.  Go to ‘View’ in your web browser and choose ‘Page Source’, or press CTRL+U.  Do a “Find” for <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">.  If you see "NOINDEX, NOFOLLOW", it tells search engines not to index the page or follow its links.  Simply remove that piece of code to allow the page to be indexed.  Be sure to check all pages in question, since this code may be on some or all of your pages.
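
Checking every page by hand gets tedious, so here is a rough sketch of how the same check could be scripted with Python's built-in html.parser module. The RobotsMetaFinder class and robots_directives function are names I made up for this example, and the sketch assumes you already have each page's HTML source as a string.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.append(directive.strip().lower())


def robots_directives(html: str) -> list:
    """Return the robots meta directives found in a page's source."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
print(robots_directives(page))  # ['noindex', 'nofollow']
```

If "noindex" shows up in the result for a page you want found in search, that page is the one to fix.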

Check your Site Structure: Large websites may have a lot of directories.  If your pages sit 3, 4, or even 5 directories deep off the root level, it may be hard for search engines to crawl those inner pages.  Try keeping your pages as close to the root directory as possible.
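
To get a feel for how deep your pages sit, you could count the directories in each URL. This is only a sketch: the directory_depth function is a hypothetical helper, the example URLs are invented, and it assumes that a path ending in “/” is a directory while anything else ends in a page name.

```python
from urllib.parse import urlparse


def directory_depth(url: str) -> int:
    """Count how many directories below the root a URL's page lives in."""
    path = urlparse(url).path
    segments = [part for part in path.split("/") if part]
    if not path.endswith("/"):
        # The last segment is the page itself, not a directory.
        segments = segments[:-1]
    return len(segments)


print(directory_depth("http://www.websiteurl.com/about.html"))                      # 0
print(directory_depth("http://www.websiteurl.com/products/widgets/blue/large/index.html"))  # 4
```

Running this over a list of your URLs and sorting by depth is a quick way to spot the pages buried furthest from the root.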

Check your Site Navigation: Your website navigation may not be search engine friendly, meaning search engines are unable to reach the interior pages through your navigation.  If your navigation is built in Flash or programmed in JavaScript, it may prevent search engines from accessing your interior pages.  To fix this issue, have your navigation programmed using standard HTML <a href> links.  If you do not want to re-program your navigation, try adding footer text links at the bottom of each page.  This will give the search engines a second way of accessing your interior pages.
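
A simple way to see your page roughly the way a basic crawler does is to list only the plain links in its HTML. The sketch below uses Python's built-in html.parser; the LinkCollector class, the crawlable_links function, and the sample page are all made up for illustration, and real search engine crawlers are more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, skipping javascript: links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html: str) -> list:
    """Return the links a simple HTML-only crawler could follow."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical page: script-driven navigation plus footer text links.
PAGE = """
<div id="nav">
  <span onclick="loadPage('services')">Services</span>
  <a href="javascript:loadPage('contact')">Contact</a>
</div>
<div id="footer">
  <a href="/services.html">Services</a>
  <a href="/contact.html">Contact</a>
</div>
"""

print(crawlable_links(PAGE))  # ['/services.html', '/contact.html']
```

In this sample, the script-driven navigation contributes nothing to the list; only the footer text links give a crawler something to follow.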

Check your CMS: If you are pulling in content dynamically through a CMS or some sort of database, search engines may have a problem indexing your content. Here is an example of the difference between dynamic and static URLs.

Dynamic URL

http://www.websiteurl.com/forums/thread.php?threadid=12345&sort=date

Search Friendly Static URL

http://www.websiteurl.com/forums/indexing-issues.html

The problem with dynamic URLs is that the pages behind them do not exist as files on the server.  The content is pulled in on the fly, based on what the user requests.  Because there is no fixed page behind each URL, search engines may not be able to find these pages and index them.  To fix this URL issue, use a URL re-writer to turn dynamic URLs into static-looking ones.  This will give search engines the impression that these are static pages.  It will also help with SEO if you use keyword-rich URLs.  For some CMSs, it may be hard to find a URL re-writer off the shelf, but almost all of the mainstream CMSs, like Drupal and Wildflower, have URL re-writers you can use.
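
To show the idea behind a URL re-writer, here is a small sketch of the mapping it performs: a static-looking URL comes in, and the dynamic URL that actually serves the content comes out. This is not how any particular CMS or web server implements it. The rewrite function, the SLUG_TO_THREAD_ID lookup table, and the fixed sort=date parameter are assumptions I made so the example matches the two URLs above.

```python
import re
from urllib.parse import urlencode

# Hypothetical lookup table; a real CMS would read this from its database.
SLUG_TO_THREAD_ID = {"indexing-issues": 12345}

# Matches static-looking forum URLs such as /forums/indexing-issues.html
STATIC_PATTERN = re.compile(r"^/forums/(?P<slug>[a-z0-9-]+)\.html$")


def rewrite(path: str):
    """Map a static-looking path to the dynamic URL behind it, or None."""
    match = STATIC_PATTERN.match(path)
    if not match:
        return None
    thread_id = SLUG_TO_THREAD_ID.get(match.group("slug"))
    if thread_id is None:
        return None
    query = urlencode({"threadid": thread_id, "sort": "date"})
    return "/forums/thread.php?" + query


print(rewrite("/forums/indexing-issues.html"))
# /forums/thread.php?threadid=12345&sort=date
```

Visitors and search engines only ever see the keyword-rich address, while the site keeps serving the content dynamically behind the scenes.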

Google Website Penalty: I recently wrote a post on how to detect when Google penalizes your website.  If you notice that your site is not being crawled by search engines, it could be that they have banned or penalized your website and are not adding it to their index.

Mark Thompson

About the Author

Mark earned his BA in Marketing from Le Moyne College located in Syracuse, NY. Since he graduated from college in 2005, he has been working in the internet marketing industry. His expertise consists of Search Engine Optimization, Search Engine Marketing, Pay Per Click Advertising, Blog Marketing, Content Development, Web Analytics, and Social Media Marketing. Mark has also met the requirements to become a Google Adwords Qualified Professional and Yahoo! Ambassador.
