Google has caused a bit of an upset, apparently: the new update has sent shockwaves
through the search results and webmasters are up in arms. My sites have done pretty
well out of it, thankfully. The changes are being attributed to “Latent Semantic
Indexing”.
Latent Semantic Indexing is where Google compares the words used in two documents:
if it finds similar words used in both, it considers them semantically close; if not,
it considers them different. Where it impacts search results is that Google will be
using LSI to determine whether your site is “about” what it says it is about. Put
another way: if you take out the main term or phrase the page has been optimised for,
does it still contain words semantically linked to your theme? For example, the
famous GoogleBomb “miserable failure” will probably work less and less, as the
content will not mention that phrase (Bush is still the top result for it, though).
A page for “coke” will be semantically close to pages about “pepsi” but probably not
“magic eight ball” (have I just diluted my semantic linking right there?)
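To make the "similar words in both documents" idea concrete, here is a minimal sketch in Python. It is only a plain word-overlap (cosine) comparison, not real LSI, which works on a term-document matrix and also picks up words that merely tend to co-occur. The sample texts and function names are made up for illustration, and nothing here is claimed to be what Google actually runs.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Lowercase the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity of two word-count vectors, from 0.0 to 1.0."""
    a, b = term_counts(doc_a), term_counts(doc_b)
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical page texts, invented for this example.
coke = "coke is a fizzy cola soft drink sold in cans and bottles"
pepsi = "pepsi is a sweet cola soft drink sold in bottles"
eight_ball = "the magic eight ball is a toy that answers yes or no questions"

print(cosine_similarity(coke, pepsi))       # higher: lots of shared words
print(cosine_similarity(coke, eight_ball))  # lower: only filler words shared
```

The coke and pepsi texts score well above the coke and eight ball texts, even though neither brand name appears in the other page.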
Practical SEO steps you can take: make sure you have varied anchor
text linking to your site, preferably from semantically close pages, and use
semantically close phrases other than your main terms in your content.
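If you want a rough feel for how varied your anchor text is, something like this would do. It is a sketch under assumptions: the anchor list is hypothetical (you would pull the real one from whatever backlink report you use), and how big a share counts as "too samey" is your call, not a known Google threshold.

```python
from collections import Counter

def anchor_text_share(anchors):
    """Return (most common anchor text, its share of all links)."""
    if not anchors:
        return None, 0.0
    # Treat "Cheap Widgets" and "cheap widgets " as the same anchor.
    counts = Counter(a.strip().lower() for a in anchors)
    text, n = counts.most_common(1)[0]
    return text, n / len(anchors)

# Hypothetical anchor texts pointing at one site.
anchors = [
    "cheap widgets",
    "Cheap Widgets",
    "widget shop",
    "buy widgets online",
    "cheap widgets",
]

print(anchor_text_share(anchors))
```

Here one phrase accounts for three of the five links, which is the kind of lopsided profile the advice above is trying to avoid.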
Search Google using the tilde operator (~word) to find words Google thinks are
semantically linked.
E.g. http://www.google.com/search?q=%7Eseo
The words in bold are the words Google thinks are on theme.
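The %7E in that URL is just the percent-encoded form of the tilde. If you want to build these query URLs for a list of words, a small helper like the one below is enough. The function name is made up for illustration; depending on your Python version the tilde may come out as "~" or "%7E", and both decode to the same query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def tilde_query_url(word):
    """Build a Google search URL for the tilde (related words) query."""
    return "http://www.google.com/search?" + urlencode({"q": "~" + word})

url = tilde_query_url("seo")
print(url)

# Decoding the query string gives back the tilde query either way.
print(parse_qs(urlsplit(url).query))
```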
Aaron over at SEOBook does a better job of explaining LSI, so go over and read his
post. Also check out the Threadwatch thread.
[Added:]
Links courtesy of Marcia/WMW
http://javelina.cet.middlebury.edu/lsa/out/lsa_definition.htm
http://lsi.research.telcordia.com/
http://www-psych.nmsu.edu/~pfoltz/cois/filtering-cois.html
Latent Semantic Analysis (again Telcordia)
http://lsi.research.telcordia.com/lsi/papers/PSYCHREV96.html
Microsoft also
http://research.microsoft.com/users/marycz/ht98.htm
http://www.cs.cornell.edu/home/llee/papers.html
This looks really good. Here’s the HTML in the cache, but it’s a PDF
Higher Precision for Two Word Queries
Constructing and Examining Personalized Cooccurrence-based Thesauri on Web Pages
http://www2003.org/cdrom/papers/poster/p074/p74-yoshida.html












