- Natural Search Blog - http://www.naturalsearchblog.com -

Matt Cutts reveals underscores now treated as word separators in Google

Posted By Chris On 7/26/2007 @ 9:52 am In Google,Search Engine Optimization,SEO | 2 Comments

After the recent WordCamp conference, Stephan Spencer reports here [1] and here [2] that Matt Cutts [3] stated that Google now treats underscores as white-space characters or word separators when interpreting URLs. Read on for more details and my take on it…

Previously, if a developer created a page name containing underscores like “Coca_Cola.html”, Google wouldn’t have considered it as close a match to keyword searches for “Coca Cola” as page names built with more conventionally accepted word separators like periods, commas, dashes, colons, and semicolons. “Coca-Cola.html” would’ve matched the keyword search much more closely. With this change, Google now likely treats both “Coca_Cola.html” and “Coca-Cola.html” as equally relevant to web searches for “Coca Cola”. This matters for rankings because exact matches of terms are more likely to rank higher than partial or approximate matches.
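To make the difference concrete, here is a minimal sketch in Python of how a word breaker might split a page name into terms. It is only an illustration of the idea described above, not Google's actual algorithm; the function name `url_terms` and the default separator set are assumptions made for this example.

```python
import re

def url_terms(page_name, separators="-.,:;"):
    """Split a page name into lowercase terms on the given separator characters.

    Hypothetical helper for illustration only; real search engines are
    far more sophisticated than this.
    """
    # Drop a trailing file extension such as ".html" before splitting.
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", page_name)
    # Build a character class from the separator set and split on runs of it.
    pattern = "[" + re.escape(separators) + "]+"
    return [term.lower() for term in re.split(pattern, stem) if term]

# Underscore NOT in the separator set: the name stays one combined term.
old = url_terms("Coca_Cola.html")

# Underscore added to the separator set: two terms, same as the dashed name.
new = url_terms("Coca_Cola.html", separators="-_.,:;")
dashed = url_terms("Coca-Cola.html")

print(old, new, dashed)
```

Under the first behavior the page name yields a single term that matches neither “coca” nor “cola” exactly; under the second, the underscored and dashed names produce identical terms.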

Of course, the other search engines may not be changing their interpretation of underscores, so it may still be important to avoid underscores as word separators for the sake of cross-engine optimization.

Even more importantly from my point of view, this announcement reveals that Google is indeed paying attention to keywords in URLs — something that was previously somewhat open for speculation in SEO circles.

Matt further states that the file extension of your page doesn’t matter to Google — .php, .html, .htm, .asp, .aspx, .jsp etc. The one exception is .exe — Google doesn’t want to link directly to executables from their page results.

The major takeaway is: the page names in your URLs shouldn’t be esoteric ID numbers or generic names like “index.jsp” or “file.php”. They should be named meaningfully after the primary contents of the pages. This would provide an additional bit of signal weight for your primary keywords, perhaps giving the page a slightly better chance of ranking higher for searches on those words.
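As a rough sketch of that advice, a page name can be derived from the page's title rather than from an ID number. The helper below, `page_name_from_title`, is a hypothetical name invented for this example, and the sample title is made up. It joins words with dashes because, as noted above, dashes are a safer separator across search engines.

```python
import re

def page_name_from_title(title, extension=".html"):
    """Build a keyword-bearing page name from a page title (illustrative only)."""
    # Keep runs of letters and digits; punctuation and spaces act as breaks.
    words = re.findall(r"[A-Za-z0-9]+", title.lower())
    # Join the words with dashes and append the file extension.
    return "-".join(words) + extension

example = page_name_from_title("Coca Cola: History & Recipes")
print(example)  # coca-cola-history-recipes.html
```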


2 Comments To "Matt Cutts reveals underscores now treated as word separators in Google"

#1 Comment By AndyEd On 7/26/2007 @ 12:39 pm

There is a common technology in information retrieval called a wordbreaker: [4]

Good news!

#2 Comment By John Ellis On 8/10/2007 @ 4:24 pm

Oops, stop the presses:

Matt Cutts corrects Stephan: [5]

Article printed from Natural Search Blog: http://www.naturalsearchblog.com

URL to article: http://www.naturalsearchblog.com/archives/2007/07/26/matt-cutts-reveals-underscores-now-treated-as-word-separators-in-google/

URLs in this post:

[1] here: http://blogs.cnet.com/8301-13530_1-9748779-28.html

[2] here: http://searchengineland.com/070726-081306.php

[3] Matt Cutts: http://www.mattcutts.com/blog/

[4] : http://www.google.nl/search?hl=nl&q=word+breaker+information+retrieval

[5] : http://www.mattcutts.com/blog/whitehat-seo-tips-for-bloggers/