Hakia introduces QDex - Proves Semantic Search Is A Failure

Posted by Michael Martinez on April 15, 2008 in SEO Theory

Hakia proudly announced its QDex-based search on its blog. What is QDex? It appears to be a human-edited crawl-seed directory.

In fact, it appears to be a publicly visible human-edited crawl-seed directory (like Yahoo! and DMOZ). Hakia claims: “hakia’s pioneering vision is to bring quality search results from vertical domains by ensuring the credibility of the sources and the freshness of information. This is possible via semantic search technology which is not depending on any statistical (popularity) measurement.

“hakia is acquiring credible content in health, medicine, law, finance, and in other knowledge-intensive topics. The following table outlines the current status of content acquisition.”

Hm. Scanning over the list of vetted sources, I find numerous references to Wikipedia — which I know is being eschewed by academia on many fronts because its information is simply unreliable.

So, basically, what they are saying is that they are going to build a large elite directory on the basis of professional association recommendations (I have no problem with that) and Wikipedia (I have a BIG problem with that).

Why mix gutter sewage in with authoritative information? This is not what semantic search promises. Semantic search is really not about “quality” or “authority”, it’s about relevance. What Hakia is telling us (and I note that I capitalize their name where they do not) is that it’s better to index Wikipedia than to actually find truly useful user-generated content (you know — all those specialty sites that engineers, academicians, graduate students, and other specialists in their fields put up on WordPress, Blogger, Geocities, Tripod, Bravehost, and other free hosting sites).

Given that kind of alternative to Ask, Google, Live, and Yahoo!, I have to say I’m no more likely to want to use Hakia as a search resource than I am to use the phone book as a medical dictionary. I’ll leave the link to Hakia in SEO Theory’s blogroll for now, but this kind of ridiculous, smarmy, smoke-and-mirrors search technology is beneath what an increasingly frustrated searcher community wants.

If you’re going to create a vertical directory or search engine, Wikipedia is NOT a trustworthy source of information. It’s not a vetted dot-GOV, dot-EDU, or NPO Web site crafted by qualified experts in the field.

Maybe when they get around to doing the mathematics vertical they’ll include Wolfram MathWorld, a resource I’ve used on occasion. But will they include David Shay’s interesting explanation of Fermat’s Last Theorem?

Hakia can niche itself into all sorts of rational search verticals but it won’t be the Web’s answer to today’s search technology limitations because you cannot rely solely upon professional organizations to index all the truly useful and informative Web sites for any given resource. Professional organizations have limits to what they can provide — they have to use today’s search technology, too.

Semantic search is only a realistic technology for the World Wide Web if it can be scaled to provide access to everything that is truly relevant to a topic. Relevance and authority are not interchangeable. Authority may be used to order relevant results but even that should be optional because sometimes the searcher doesn’t want the authoritative site.
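To make that distinction concrete, here is a minimal sketch in Python. It is purely illustrative and is not how Hakia or any real engine works: the `Result` type, the `relevance` and `authority` scores, the `min_relevance` threshold, and the example URLs are all assumptions invented for this example. The point is only that relevance decides what gets in, while authority is an optional way of ordering what is already in.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float  # hypothetical topical-match score in [0, 1]
    authority: float  # hypothetical trust/popularity score in [0, 1]


def rank(results, min_relevance=0.5, use_authority=True):
    """Filter on relevance; authority may reorder results but never adds or removes them."""
    relevant = [r for r in results if r.relevance >= min_relevance]
    if use_authority:
        # Order by authority first, breaking ties on relevance.
        key = lambda r: (r.authority, r.relevance)
    else:
        # The searcher has switched authority off: order by relevance alone.
        key = lambda r: r.relevance
    return sorted(relevant, key=key, reverse=True)


# Made-up data: one vetted source, one small specialist site, one off-topic page.
results = [
    Result("https://example.org/professional-body", 0.70, 0.95),
    Result("https://example.com/small-specialist-site", 0.90, 0.10),
    Result("https://example.net/off-topic", 0.20, 0.99),
]

with_authority = [r.url for r in rank(results)]
without_authority = [r.url for r in rank(results, use_authority=False)]
```

In this toy data the off-topic page never appears, however authoritative it is, and turning `use_authority` off moves the small specialist site ahead of the professional body.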

Now, searching the Web semantically is a huge task and anyone who has accomplished a large task would be quick to point out that you do better to break the large task down into smaller, more easily achievable steps. I’m good with that. Semantically searching verticals makes sense.

Including Wikipedia in supposedly authoritative verticals does NOT make sense. That’s like saying, “We’re going to include sugar-laden, simple-carbohydrate cereals deep-fried in hydrogenated oils on our list of healthy foods.”

I, for one, am sick of having to click past Ickipedia in search results. I can only imagine what teachers must be telling their students. Kids are probably coming out of high school and college today with “-site:wikipedia.org” permanently tattooed on the backs of their hands. But here we have another search engine wanting to play the “we cannot afford to do it right so we’ll just show people Wikipedia” game.

Semantic search is an evolving technology. In fact, it’s just being born and in the long run we may conclude that it doesn’t offer us any rational benefits for the effort. People use search engines as memory aids as well as tools of discovery. Personalized Search won’t always help us find the sites we think we want to find. Search history can be confusing, unrevealing, and sometimes just blatantly confounding.

We need to build a repertoire of search technologies that complement each other. We need to focus on a diversity of search tools that are covered by a pleasing, intuitive veneer of usability. In the final analysis, we may resolve our search needs with meta search tools that don’t yet exist simply because companies like Google and Hakia commit themselves emotionally to inflexible principles and implementations.

3 Comments on Hakia introduces QDex - Proves Semantic Search Is A Failure

By wibbler on April 15, 2008 at 2:42 pm

“So, basically, what they are saying is that they are going to build a large elite directory on the basis of professional association recommendations (I have no problem with that)”

Hear, hear - I vote Ron Jeremy and Hugh Hefner in for the professional association recommendations regarding 60+% of web content ;) - Those boys are going to be busy the next few decades :)

(And not on topic - but I just read this in the link building threads - lovely stuff - I love this blog.
“Google became big only because other people linked to it transparently, unfettered, and without any knowledge of the fact that Googlers can be a bumbling bunch of buffoons”)

By tinpig on April 16, 2008 at 6:19 am

This is an interesting post and I agree with much of what you’ve written - especially the consideration that relevance does not equal authority. It’s a curious point and the initial impulse is to say that authority should inform relevance. I’m not sure how I feel about your claim that in some cases the searcher does not want the most authoritative information source. Determining the user’s intent is difficult when all you have to work with is a few words entered into a search box, but I’m having trouble thinking of a practical example of when a searcher would prefer a non-authoritative source - at least an example that represents a non-trivial portion of searches. The tricky part of relevance is that it’s a subjective concept.

The other question I have, and I’m probably missing something here, is that you seem to be stating that Hakia is placing undue importance on Wikipedia. The only example I could find of their vetted authoritative information sources is here - http://company.hakia.com/verticals.html - and Wikipedia is way down on the list under the heading of User Generated Reference. Where else were you seeing weight given to Wikipedia results?

Anyway, I’m a big fan of your blog - you always have a thoughtful position on interesting subject matter.

By Michael Martinez on April 16, 2008 at 8:01 am

tinpig: “I’m not sure how I feel about your claim that in some cases the searcher does not want the most authoritative information source.”

Michael: When Aunt Penny comes to visit and she tells you she put up a Web site for her math students and it’s “all about fundamental principles of math” but she can’t remember the URL, how do you expect to find it?

She doesn’t know enough about the Web and search to include obviously identifying information so you’ll end up scrolling through tons of irrelevant content, subtracting out URL information from “authoritative” sites because all she can remember is that “it was about basic math stuff”.

I’ve been down the dark alleyways of memory lane on the Web many times. Link popularity, authority, and PageRank are all stupid methods for measuring relevance. The search engines would be doing us HUGE favors if they would allow us to turn off that crap.

Microsoft’s Live Search does give you some of that kind of capability, and believe me when I say I have used it on occasion. It can be a life saver.

tinpig: “you seem to be stating that Hakia is placing undue importance on Wikipedia. The only example I could find of their vetted authoritative information sources is here - http://company.hakia.com/verticals.html - and Wikipedia is way down on the list under the heading of User Generated Reference. Where else were you seeing weight given to Wikipedia results?”

Michael: In my opinion, since Wikipedia is one of the worst possible sources of information on the Web (and since teachers are increasingly advising students not to use it as an authoritative reference), it has no place being mentioned on a page that lists authoritative sources of information.


About the Author

Michael Martinez is the Director of Search Strategies for 1st Query, an Internet Marketing firm offering organic SEO and PPC services.
