Site Search Optimization 2.0
Posted by Michael Martinez on October 18, 2007 in SEO Theory
There is no such thing as a perfect site search when you’re relying on the major search engines. You make your choice and you live with it. But there are things you can do to ameliorate the choices made by the major search engines.
First, you can create an XML Sitemap and update it when necessary. Make sure you verify it with Google, authenticate with Yahoo!, and do whatever else you have to do to get Ask and Microsoft to grab the thing regularly.
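As a rough illustration of that first step, here is a minimal Python sketch that builds an XML Sitemap from a list of URLs. The function name `build_sitemap` and the example.com URLs are placeholders of mine, not anything Xenite actually runs; the namespace is the one published at sitemaps.org.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a 'YYYY-MM-DD' string, or None to leave the tag out.
    This helper is hypothetical; it only shows the shape of the file.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url  # special characters are escaped on output
        if lastmod:
            ET.SubElement(node, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.example.com/", "2007-10-18"),
        ("http://www.example.com/news/", None),
    ]))
```

Regenerating the file from your own list of pages whenever content changes is what "update it when necessary" amounts to in practice.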
Second, you put the “Sitemap: ” entry into your robots.txt file because you want ALL legitimate search engines that read the robots.txt file to see your XML Sitemap. You never know who your next white knight of search will prove to be.
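The robots.txt step can be sketched the same way. The snippet below appends a "Sitemap:" line to existing robots.txt text if it is not already there, then reads it back with Python's standard `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). The helper names and the example.com URL are my own assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def listed_sitemaps(robots_txt):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when nothing is listed


def add_sitemap_line(robots_txt, sitemap_url):
    """Append a 'Sitemap:' line unless that URL is already declared."""
    if sitemap_url in listed_sitemaps(robots_txt):
        return robots_txt
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"
    return robots_txt + f"Sitemap: {sitemap_url}\n"


if __name__ == "__main__":
    original = "User-agent: *\nDisallow: /private/\n"
    print(add_sitemap_line(original, "http://www.example.com/sitemap.xml"))
```

The Sitemap line is independent of any User-agent group, so it can sit anywhere in the file; appending it at the end is just the simplest choice.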
Third, you create an HTML sitemap. Now, a lot of people have long since learned about HTML sitemaps and they are considered to be standard tools for improving usability and accessibility. But HTML sitemaps often leave a great deal to be desired. For example, Xenite.Org has over 2,000 pages of content. There is no way I’m going to put all those URLs on one page or even a handful of pages. It’s too much work.
But you have to consider that the lack of an HTML sitemap hurts your crawling and indexing, so you need to look at step four: supplementing your HTML sitemap with other on-site crawl pages.
Keeping in mind that crawling is driven by links, you cannot afford to get sneaky. And keeping in mind that you cannot afford to get sneaky, you need to make your crawl pages presentable to and usable for people.
For lack of a better name, I’ll call these on-site crawl pages HTML Section Maps. I have an experimental page that I’ll share here (something I rarely do). http://www.xenite.org/news/articles_0000/ is an index page I inserted into a directory to link to old site news articles. While there may not be a lot of very interesting content there now, the links are still followed and the pages are still indexed and people find them. And because people find those pages I decided to revise their appearance, and because I did that I made sure there was an index page.
Now, I could go without an index page and block the directory but there are, actually, old site news articles announcing new sections and changes in Xenite from several years ago that are current. I also occasionally update the articles to remove or change URLs.
Unfortunately, some of those pages are in Google’s Supplemental Index. Even I have occasional need to find those pages so a site search that can get to them is vital. Hence, I couldn’t use Google any more because Google won’t level the playing field for unique, original content.
The HTML Section Map serves two purposes. First, through its indexable anchor text it provides an opportunity for intentionally underperforming search engines to at least point toward something relevant if they don’t want to point to deep content. Second, the anchor text (programmatically extracted from the actual title strings on the pages being linked to) reinforces the page titles.
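To make the "programmatically extracted" part concrete, here is one way it could be done in Python with the standard library's `html.parser`. This is a sketch under my own assumptions, not the script used on Xenite: the names `TitleParser`, `extract_title`, and `build_section_map` are hypothetical, and the pages are passed in as a plain dictionary of URL to HTML text.

```python
from html import escape
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside a page's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html_text):
    """Return the page title with runs of whitespace collapsed, or '' if absent."""
    parser = TitleParser()
    parser.feed(html_text)
    return " ".join("".join(parser.parts).split())


def build_section_map(pages):
    """Build an HTML list of links whose anchor text is each page's own title.

    pages maps URL -> raw HTML of that page. Pages with no title fall back
    to the URL itself as anchor text.
    """
    items = []
    for url, html_text in pages.items():
        anchor = extract_title(html_text) or url
        items.append(f'<li><a href="{escape(url)}">{escape(anchor)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Because the anchor text comes straight from the title strings, the Section Map and the pages it links to always agree, which is the reinforcement effect described above.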
One can only hope that my page titles in years past were well-written enough to help people find anything relevant to today. I can only hope these index/crawl/Section Map pages do the job I need them to do and help search engines find those pages so that they will actually be found. Of course, only Live needs to know about those older pages since Live now powers Xenite’s site search. (NOTE: I liked the fact that Google would let me design a custom search engine that kept the results on-site, but I’m willing to sacrifice that convenience for a site search that actually works.)
However, in case you are tempted to count URLs on that page, there are around 500 of them. That’s too many even by my own gargantuan standards but I haven’t had time to break them up into smaller chunks. And when I do break them up into smaller chunks I’m going to have to do something with the links on the individual pages, which are right now all pointing back to the two HTML Section Maps in the Xenite News directory.
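Breaking a 500-link page into smaller chunks, and working out which chunk each article should link back to, might look something like the sketch below. The page size of 100 and the `section_map_N.html` file names are arbitrary assumptions for illustration; neither reflects how the Xenite News directory is actually laid out.

```python
def split_section_map(links, per_page=100):
    """Split a list of (url, title) pairs into pages of at most per_page links."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [links[i:i + per_page] for i in range(0, len(links), per_page)]


def back_link_targets(pages, page_name="section_map_{n}.html"):
    """Map each article URL to the (hypothetical) Section Map page that lists it.

    This is the lookup needed to repoint the links on the individual
    article pages after the big map has been broken up.
    """
    targets = {}
    for number, page in enumerate(pages, start=1):
        for url, _title in page:
            targets[url] = page_name.format(n=number)
    return targets
```

With a lookup like that in hand, each article can link back to the one smaller page that lists it rather than to a single oversized index.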
By offloading some of the URL mapping to secondary pages I make it easier for both search engines and visitors to crawl Xenite at a slower pace. That is, I have intentionally extended the lag time between crawling Xenite’s root URL and crawling those deeper pages so that I can tweak the pages at a slower pace. These pages, being less important than other pages, still need to be optimized for site search (because people, including me, still go looking for them).
I just don’t have enough time to update all the content. Nor do I keep all that content in a CMS, which many people have advised me to do (but I don’t yet have a solution for placing Xenite into a CMS that would meet my standards for functionality). Page appearance is secondary on the Web. People first and foremost want to be able to find what they are looking for. So site search is all about helping people find what they are looking for. They came to your site looking for something but they didn’t find it. They turn to site search as a last resort before abandoning your site.
Finally, the fifth step you can take to help search engines is to place cross-promotional links on your pages. A lot of people refuse to do this for a variety of reasons. I’ve heard them all and none of them are good. But you have to decide for yourselves what you are comfortable with.
Telling people what else you have on your site — especially when you have 2,000 pages of content — is pretty reasonable. People accept that information graciously. And that doesn’t mean embedding links to all 2,000 pages on every page. You just want to design what you could call “virtual sections” of content.
A virtual content section is a way of showing people (and search engines) how your content pages relate to each other through a structure other than standard on-site navigation. In fact, there are always three levels of on-site navigation and most sites only use two at most: classic “site-wide navigation” (usually in the form of menubars, margin links, etc.), HTML site map pages, and cross-promotional margin or footer links.
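One simple way to model a virtual content section is as a named group of URLs that cuts across the physical directory structure. The Python sketch below picks cross-promotional links for a page from the virtual sections it belongs to. The section names, URLs, and the cap of five links are all hypothetical choices of mine.

```python
def cross_promo_links(page_url, virtual_sections, limit=5):
    """Return up to `limit` (section_name, url) pairs to promote on page_url.

    virtual_sections maps a section name to the list of URLs in that section.
    A page may belong to several sections; it never links to itself, and no
    URL is suggested twice.
    """
    seen = {page_url}
    links = []
    for name, members in virtual_sections.items():
        if page_url not in members:
            continue  # this section does not include the current page
        for url in members:
            if url not in seen:
                seen.add(url)
                links.append((name, url))
    return links[:limit]


if __name__ == "__main__":
    sections = {
        "Site history": ["/news/2001.html", "/about/history.html", "/faq.html"],
        "Getting started": ["/faq.html", "/guide/intro.html"],
    }
    for section, url in cross_promo_links("/faq.html", sections):
        print(section, url)
```

The cap matters: the point is to show a handful of related pages in the margin or footer, not to embed every URL on every page.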
Site search and external search also provide visitors with navigational avenues but they are relatively crude functions right now. It would be better if you could sectionalize site search by topic (not really through semantic rules, but rather through a combination of keyword-sensitive and location-sensitive rules).
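As a very rough sketch of what a combination of keyword-sensitive and location-sensitive rules could mean, the Python function below assigns a page to a topic if either its URL path starts with a given prefix (location) or its title contains one of a set of keywords. The rule format, the topic names, and the first-match-wins policy are assumptions made up for this example; no search engine is known to expose anything like it.

```python
def assign_topic(url_path, title, rules, default="general"):
    """Pick a topic for a page from location and keyword rules.

    rules is a list of (topic, path_prefix, keywords) tuples. path_prefix may
    be None for a keyword-only rule. The first rule that matches wins.
    """
    words = set(title.lower().split())
    for topic, path_prefix, keywords in rules:
        in_location = path_prefix is not None and url_path.startswith(path_prefix)
        has_keyword = bool(words & {k.lower() for k in keywords})
        if in_location or has_keyword:
            return topic
    return default


if __name__ == "__main__":
    rules = [
        ("news", "/news/", {"announcement"}),
        ("reviews", None, {"review", "reviewed"}),
    ]
    print(assign_topic("/news/articles_0000/x.html", "Old update", rules))
```

A site search could then filter or group its results by the assigned topic instead of treating the whole site as one flat pool of pages.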
Cross-promotional links, when organized into virtual content sections, help you organize your Web sites with alternative architectural structures. Right now, the CMS products I am aware of just don’t support this level of functionality. You need not only multiple templates (which some CMS packages support), but very strong connectivity rules that help you create new content quickly so that it slips into the physical architecture with the right navigational links and into the virtual architecture with the right cross-promotional links and everything appears to be seamless.
So, I’ll come back to virtual site architecture in some future discussions.