The only constant in search optimization
Posted by Michael Martinez on January 22, 2008 in SEO Theory
Search engine algorithms are not perfect. Their flaws help distinguish them from each other, but they share similar limitations. As searchers we want to find the most relevant, complete, and accurate information available on a topic. Search engine technology today is not really equipped to provide us with those kinds of results. All the search engines, in fact, do a pretty bad job of presenting the most relevant, complete, and accurate information available on topics.
The first challenge any search engine faces is finding the information, and the search engine is not responsible for making that information findable. Neither is the searcher. The Web content providers have to ensure that their content is reachable, indexable, and comprehensible to the search engines. Unfortunately, we are all empowered to create content without knowing the first thing about search.
The second challenge any search engine faces is organizing the information it finds. Web content providers really cannot help the search engines with this. Search engines have to figure out how much information they will manage and how to manage it. Nonetheless, Web content providers who have an idea of what the search engines value can influence the search engines’ choices by structuring their own content to match the search engines’ expectations.
The third challenge a search engine faces is determining which data is more relevant, complete, and accurate with respect to a user’s query. The problem here is that search engines are not equipped to determine either accuracy or completeness. So even if a search engine had the perfect relevance algorithm it would be unable to provide a fully acceptable set of results to any user query.
But even the relevance algorithms leave a lot to be desired. In estimating relevance search engines look at on-page content, inbound link anchor text, and user behavior. Each major search engine brings a slightly different mix to the recipe.
One search engine places more emphasis on the first link from any host. The fallacy with this approach is that a Web site may be relevant to multiple topics, and it may in fact be more fair and accurate to allow a diverse selection of relevant links from one host to count equally for another host.
Some search engines look at which results people click on. The fallacy with using click-through data to judge accuracy, relevance, and completeness is that click-throughs tell you nothing useful in this respect. People are influenced to click through to results by position and presentation.
And people click through in many different ways. Some people only right-click on search results so they can open new browser windows (a habit that helps keep your browser functional if you try to open a site that times out, locks up, or is loaded down with so much crap it takes several minutes to render the page). Some people will click through in the same browser window, close the window if they don’t like what they see, and go right back to the search engine to perform the same or a similar query.
So one of the major problems with click-counting is that it depends heavily on user behavior, and not all users use the BACK button to tell the click-counting search engine that the page they visited wasn’t appropriate (or appropriate enough). Another major problem with click-counting is the search engine’s inability to tell the user how accurate, complete, or relevant any particular page is in its search results.
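To make the position problem concrete, here is a minimal sketch. It assumes a simple position-based click model: a searcher only considers a result if they look at its rank, and looking gets less likely further down the page. The look-rates, the appeal scores, and the page names are all made up for illustration; no search engine has published numbers like these.

```python
# Hypothetical chance that a searcher even looks at ranks 1 through 5.
POSITION_BIAS = [1.0, 0.6, 0.4, 0.3, 0.2]


def expected_click_rates(results):
    """Return (name, expected click rate) pairs for results in displayed order.

    Each result is a (name, appeal) pair, where appeal is the assumed chance
    that a searcher who actually looks at the listing will click it.
    """
    return [
        (name, bias * appeal)
        for bias, (name, appeal) in zip(POSITION_BIAS, results)
    ]


def rerank_by_clicks(results):
    """Reorder results purely by expected click rate, highest first."""
    rates = expected_click_rates(results)
    ordered = sorted(rates, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ordered]


# The weakest page is shown first; the best page is buried at rank 5.
shown = [
    ("popular-but-wrong", 0.3),
    ("ok-a", 0.4),
    ("ok-b", 0.4),
    ("ok-c", 0.4),
    ("accurate-and-complete", 0.9),
]

for name, rate in expected_click_rates(shown):
    print(f"{name}: {rate:.2f}")
print(rerank_by_clicks(shown))
```

Under these assumed numbers the page at rank 1 collects the most clicks even though it is the least appealing of the five, so a click-counting ranker would leave it on top. The clicks mostly measure where the page was shown.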
Just because a document is listed first in a search engine’s results doesn’t mean it is the most relevant document to the query. Some search engines place a great deal of emphasis on value or importance, allowing links to channel value to Web documents the way river systems channel water to the ocean. The problem with that approach is that you never get a larger body of water than the ocean, but sea water does little to quench a person’s thirst.
Importance — as conveyed by an arbitrary and artificial misinterpretation of the intent of linkage — is not a substitute for relevance, accuracy, or completeness. In fact, importance is misleading because a popular site may be loaded with misinformation but given a great deal of false credibility by a Web community that was unable to distinguish between nonsense and fact.
For example, if you were to read articles on search engine optimization today, you might find one article that says “by relying so much on factors exclusively within a webmaster’s control, early search engines suffered from abuse and ranking manipulation.” While technically accurate this statement is misleading because it implies that current search engines don’t suffer from abuse and ranking manipulation. All the major search engines are equally vulnerable to abuse and ranking manipulation.
That same article, a little further on, says “although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence the Inktomi search engine….” This statement is half false — in its claim that PageRank was more difficult to game than meta tag-based search indexing. PageRank — as presented in the Backrub paper — was easy to game. All you needed to do was create more pages that linked to your most important page. Many high PR sites obtained their high PR with a hefty boost from hundreds of thousands of internal links.
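The internal-link boost is easy to see in a toy calculation. The sketch below uses the formula as given in the original paper, PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), with d = 0.85. The page names and the tiny made-up site are assumptions for illustration, and this is not a claim about how any search engine computes scores today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate the original (unnormalized) PageRank formula.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the (1 - d) baseline.
        new_rank = {page: 1.0 - damping for page in pages}
        for source, targets in links.items():
            for target in targets:
                # A page splits its damped score evenly across its outlinks.
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


def site_with_filler(filler_count):
    """Hypothetical site: filler pages that each link only to one target page."""
    links = {"money-page": []}
    for i in range(filler_count):
        links[f"filler-{i}"] = ["money-page"]
    return links


for count in (0, 10, 100, 1000):
    score = pagerank(site_with_filler(count))["money-page"]
    print(f"{count} filler pages -> score {score:.3f}")
```

Because every page gets the (1 - d) baseline just for existing, each new filler page hands a little more score to the target. In this toy setup the target's score grows in step with the number of pages you are willing to create, with no outside links involved at all.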
Today’s search technology simply cannot measure the accuracy and completeness of Web content. Furthermore, the idea that the quality of content can be assessed through the weight of community opinion disables search technology. The lack of adequate resources, standards, and rigid testing criteria hurts many Web communities that form consensus on various topics. But even academic resources are at a disadvantage when it comes to assessing information on the Web.
For example, numerous academics have struggled to define search spam in their papers, most of which propose PageRank-derived methodologies for assessing quality. The fundamental, uncorrectable flaw in PageRank is the underlying assumption that citation represents informed opinion. In fact, Web citation doesn’t represent opinion at all, much less informed opinion. So any attempt to improve upon the PageRank methodology is doomed to failure because the underlying false assumptions are not challenged.
Informed opinion is not easy to come by, even in the academic community, when it comes to Web spam and the quality of Web sites. Spammers are usually 2-3 years ahead of the documented techniques discussed by academics. For example, Google Scholar finds no association between “script kiddie” and “web search”.
Most people are familiar with the term “script kiddie” through recent virus hacking community news coverage. But there are plenty of script kiddies in the Web spam community. These are the gullible customers for black hat spam software packages. For as little as $100 you can buy a program that will autogenerate crap content for you, bombard forums and blogs with bogus registrations and link-dropping comments, and make yourself generally annoying.
The Gold Rush Principle applies in black hat SEO software as much as it does in multi-level marketing and other get rich quick schemes: those who make the money in any gold rush tend to be the merchants selling picks, pans, and shovels to the miners. A few miners make a fortune. Most miners break their backs and go bust.
Look at it this way: if Carleton Sheets makes so much money by flipping stressed real estate, why is he selling his principles on television? Is it perhaps that he doesn’t have to do as much work to persuade people to sign up for his program as he would were he to scour newspaper listings for stressed properties every day?
Would Don Lapre really have needed to sell all those porn line subscriptions over television if he was truly making millions of dollars by placing “one tiny little ad” in newspapers across the country? Why give up the goose that lays the golden egg? Why take your cash cow to market and destroy its value? Why give away your competitive advantage to potentially millions of competitors?
If someone has created some really good black hat software, why should they sell it? If all they need to do is create a few thousand AdSense sites a day to rake in millions of dollars, why are they suddenly selling the tools that made them rich?
Generally speaking, secret weapons and patented ideas become available to the public close to the end of the premium life cycle. If your $100-per-pill drug is about to become eligible for generic production, it’s time to reformulate and take out a new patent so you can start charging $110-per-pill.
Economics don’t change for spam, search, or pharmacology. People don’t look behind the great deal to find out why the price is so low, why the Great Gizmo is suddenly available. They just take what they are offered and run with it. And that is what the academic community does when it comes to researching search spam. You won’t find the latest and greatest spam techniques by reading academic papers proposing ways to fix PageRank (it cannot be fixed).
In fact, the academic community’s inability to focus on accuracy, completeness, and relevance is a clear example of just how difficult it is for any community to develop a knowledgeable consensus about anything. Once a community develops a key influencer the community’s momentum goes charging down a very narrow, relatively inflexible path. It may take years or decades to deflect a key influencer’s bad opinion, even in the scientific community. People familiar with the controversies surrounding Stephen Hawking’s black hole theories know that astrophysicists have struggled with some really bizarre concepts for decades. They were worried for many years about “loss of information”, but now they are wrangling over who is right about why there is no “loss of information”.
Consensus is a poor guide to truth and relevance. It hinders our ability to judge matters objectively, and objectivity is what we need most in our search tools. Objective search tools cannot be easily gamed. They don’t obsess over links and meta tags. They look at what holds up under closer scrutiny. There is no emotional baggage to contend with as we have with click counting, PageRank, and link anchor text. Someone proposed all of these ideas, championed them, won the battle to have them used, and is heavily invested in an emotional struggle to make these ideas work.
In search engine optimization we have to work within the limitations of the search engines. Sometimes we can leverage those limitations to favor our own sites. Sometimes we have to overcome those limitations because they favor other people’s sites. But the fact that search engine limitations can be incorporated into SEO strategies hurts all of us because the quick kill syndrome incentivizes the creation of poor content.
If it’s easier to create cheap, lazy content and just point a lot of links to it, most people will prefer to do that rather than create content that can withstand rigid, objective rules for recognition. But search technology doesn’t stand still and it’s not nearly so simplistic as I have made it out to be in this article. At some point objectivity begins to creep into the search process, little by little. It has to if only because the power of market competition compels the search companies to improve their technology.
Although it’s true they are holding themselves back because they are emotionally tied to imperfect techniques, it’s equally true that search engine optimizers are holding themselves back because they are essentially script kiddies buying the latest soon-to-be-obsolete Black Hat SEO Software. Regardless of who your source of information is, if someone is sharing techniques and tips with you they are placing those techniques and tips at risk of being compromised.
Every optimization technique outlives its usefulness. Every optimization tip page becomes less relevant, less accurate, and less complete with each passing day. And yet, if you search for SEO tips, you’ll find some pretty bad, outdated advice. Why? In part because links are boosting pretty bad content to the top of the search results. So while links still work, they are only working against us all: search engines serve poor quality results, searchers find poor quality results, and the Web content providers offering bad advice will eventually be ridiculed for promoting bad advice.
Web search works as a triangle of forces: Web content providers, search engines, and searchers. Each force has the ability to change the results. Unfortunately, too many Web content providers wait for search engines to make the right changes. As a search engine optimizer you are empowered by your knowledge (great or small) to start changing the results today.
Even a script kiddie can do that for a while. The trick is to figure out how to make change the only constant in your SEO strategy. Once you do that, you are no longer restrained by the limitations of today’s search technology as either a searcher or a content provider.
4 Comments on The only constant in search optimization
By Tyler on January 22, 2008 at 11:22 am
Are there any other blogs or writers that you recommend that give quality advice on SEO like SEO Theory?
By eserrano on January 22, 2008 at 7:51 pm
Michael,
Great post. If I had a nickel for each time someone mentions that they know how to SEO because they’ve taken a college course or read a book I’d be spending a ton of time pulling slot machine handles in a Vegas casino. It’s absolutely laughable to hear such comments and they are reminiscent of the get rich schemes you refer to in this post.
You mention in this post (and others you’ve written) that sharing SEO tips serves to dilute its effectiveness thus shouldn’t be done, no argument there. But this blog indeed shares advice and in fact SEO advice is its topic. I for one enjoy reading your posts and appreciate the advice (which I’ve used, thanks) but why violate your own rules as in the end it hurts you?
By Michael Martinez on January 22, 2008 at 8:37 pm
Tyler: “Are there any other blogs or writers that you recommend that give quality advice on SEO like SEO Theory?”
Michael: There are no other theoretical blogs I am aware of. I’ve looked at several hundred SEO Blogs over the years and have found them all wanting on the theoretical front (which is why I started SEO Theory).
Bill Slawski’s SEO By the Sea looks at patent applications and related topics regularly. If you want to stay on top of technical details and do a lot of second-guessing, you cannot miss Bill’s posts.
SearchEngineWatch’s blog has been linking to occasional theoretical posts since Danny Sullivan left, I think in part to establish a new direction so that people don’t feel like SEW is just replicating SEL’s imprint.
I don’t agree with all his conclusions, but I think Halfdeck’s SEO Notebook is a great resource for people who want to follow the musings of a risk-taker.
Matt Cutts’ Google/SEO category is a pretty good intermediate SEO resource but he’s naturally singing Google’s song. I would put Matt’s blog on a preferred reading list any day of the week.
Anything by Mike Grehan or Shari Thurow is worth reading, although Shari likes to shake trees.
If you feel up to the challenge of following an academic, you will want to read Dr. Garcia’s blog (Mi Islita — people know him as Orion). He looks at optimization from the perspective of a trained academic (Shari Thurow has also studied IR science).
If you’re into PPC and social media, I like Ms. Danielle (and don’t let the fact she has reviewed SEO Theory and interviewed me mislead you — she knows how to build up a strong readership and she shares tons of advice).
SE Roundtable is indispensable for tracking popular SEO myths, rumors, and trends. They also provide good conference coverage (not to take anything away from other good conference blogs, but SE Roundtable is a more comprehensive resource, in my opinion).
There are other blog sites I visit and like, but relatively few are productive enough or unique enough to be included in this short list, in my opinion.
Hope that helps.
By Michael Martinez on January 22, 2008 at 8:55 pm
eserrano: “You mention in this post (and others you’ve written) that sharing SEO tips serves to dilute its effectiveness thus shouldn’t be done, no argument there. But this blog indeed shares advice and in fact SEO advice is its topic. I for one enjoy reading your posts and appreciate the advice (which I’ve used, thanks) but why violate your own rules as in the end it hurts you?”
Michael: Yes, this blog shares advice and occasional techniques, but I strive to provide guidance in principles rather than focus on “linking method of the week”. I don’t think there is anything wrong with advocating good science, good optimization, and good Web site architecture. I do feel that endlessly spewing cheap link tricks doesn’t really help the community.
But look at it this way: If you only collect methods and techniques from SEO pundits and don’t learn the fundamental principles for yourself, you’ll always be dependent upon other people to help you get ahead of the competition. Is that a formula for competitive success? I don’t believe so.
Here at 1st Query not a week goes by where we don’t try out new ideas, evaluate a lot of search results, and examine Web sites. We look for trends and do our best to understand what the search engine representatives don’t say when they open up and talk to the SEO community.
If you want to browse an archive of past tips and techniques I’ve shared through the years, check out Spider-Food’s SEO forums. I’ve been too busy to post there much lately but I’ve been participating in their community for about 7 years.
The older ideas may no longer be useful for a variety of reasons, not simply because everyone jumped on them and abused them. Sometimes the technology or the available resources just evolve beyond a specific idea’s needs.