Natural Search Blog


Matt Cutts reveals underscores now treated as word separators in Google

After the recent WordCamp conference, Stephan Spencer reports here and here that Matt Cutts stated that Google now treats underscores as white-space characters or word separators when interpreting URLs. Read on for more details and my take on it…

Previously, if a developer created a page name containing underscores like “Coca_Cola.html”, it wouldn’t have been considered as close a match to keyword searches for “Coca Cola” as pages named with more classically-accepted word separators like periods, commas, dashes, colons, and semicolons. “Coca-Cola.html” would’ve matched the keyword search much more closely. With this change, Google now likely treats both “Coca_Cola.html” and “Coca-Cola.html” as equally relevant to web searches for “Coca Cola”. This is important to rankings because exact matches of terms are more likely to rank higher than fuzzy-logic matches.
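To make the difference concrete, here is a rough Python sketch of how a word breaker might split a page name under the old and new behavior. It is only an illustration: the separator sets and the function names (url_words, matches_query) are my own assumptions, not Google's actual rules or code.

```python
import re

# Hypothetical separator sets, assumed for illustration only.
OLD_SEPARATORS = ".,-:;"                 # underscore is NOT a word separator
NEW_SEPARATORS = OLD_SEPARATORS + "_"    # underscore IS a word separator


def url_words(page_name, separators):
    """Split a page name into lowercase words on the given separator characters."""
    pattern = "[" + re.escape(separators) + r"\s]+"
    return [word.lower() for word in re.split(pattern, page_name) if word]


def matches_query(page_name, query, separators):
    """Return True if the query terms appear as consecutive words in the page name."""
    words = url_words(page_name, separators)
    terms = query.lower().split()
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


print(url_words("Coca_Cola.html", OLD_SEPARATORS))   # ['coca_cola', 'html']
print(url_words("Coca_Cola.html", NEW_SEPARATORS))   # ['coca', 'cola', 'html']
print(matches_query("Coca_Cola.html", "Coca Cola", OLD_SEPARATORS))  # False
print(matches_query("Coca-Cola.html", "Coca Cola", OLD_SEPARATORS))  # True
```

Under the assumed old set, “Coca_Cola” stays glued together as one token and can't exactly match the two-word query; under the assumed new set, both spellings break into the same words.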

Of course, the other search engines may not be changing their interpretation of underscores, so it may still be important to avoid using underscores as word separators for the sake of cross-engine optimization.

Even more importantly from my point of view, this announcement reveals that Google is indeed paying attention to keywords in URLs — something that was previously somewhat open for speculation in SEO circles.

Matt further states that the file extension of your page doesn’t matter to Google — .php, .html, .htm, .asp, .aspx, .jsp etc. The one exception is .exe — Google doesn’t want to link directly to executables from their page results.

The major takeaway is: your pagename URLs shouldn’t be esoteric ID numbers or just generic names like “index.jsp”, “file.php”, or anything like that — they should be named meaningfully after the primary contents of the pages. This would provide an additional bit of signal weight for your primary keywords, perhaps giving the page a little more chance to rank higher for searches for those words.
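As a sketch of that advice, a site could derive each page name from the page's title rather than from an ID number. The helper below is hypothetical (the name page_name_from_title and its defaults are my own), and it joins words with dashes on the assumption that dashes are the safest separator across engines.

```python
import re
import unicodedata


def page_name_from_title(title, extension="html"):
    """Build a keyword-bearing page name such as 'coca-cola-history.html' from a title."""
    # Fold accented characters down to plain ASCII where possible.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep only runs of letters and digits; everything else acts as a word break.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    if not words:
        raise ValueError("title contains no usable words")
    return "-".join(words) + "." + extension


print(page_name_from_title("Coca Cola: Classic Recipes"))
# coca-cola-classic-recipes.html
print(page_name_from_title("Café Menu", extension="php"))
# cafe-menu.php
```

The extension is a parameter because, per Matt's comment above, the choice between .php, .html and the rest shouldn't matter to Google.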

2 comments for Matt Cutts reveals underscores now treated as word separators in Google »

  1.

    There is a common technology in information retrieval called a wordbreaker: http://www.google.nl/search?hl=nl&q=word+breaker+information+retrieval

    Good news!

    Comment by AndyEd — 7/26/2007 @ 12:39 pm


  2.

    Oops, Stop the presses:

    Matt Cutts corrects Stephan:
    http://www.mattcutts.com/blog/whitehat-seo-tips-for-bloggers/

    Comment by John Ellis — 8/10/2007 @ 4:24 pm

