SEO Theory and SMX Advanced 2007

Posted by Michael Martinez on June 5, 2007 in General, Supplemental Pages

Well, I’ve now attended my first SEO conference and it wasn’t so bad. The company wanted me to go for various corporate reasons. I have a lot to do at work so I won’t be able to attend tomorrow.

During the morning session I was able to ask Matt Cutts the first question from the floor. I tried not to put him in too difficult a position but I could not refrain from mentioning I haven’t received an answer to my question the few times I’ve shared it on various Web sites. So after the audience stopped laughing Matt got me back by asking, “What, you don’t have the right answer already?”

Okay, I can be a little argumentative. That’s life.

Anyway, my question was about Supplemental Results. I pointed out that a number of people have noticed that you can search for unique expressions on Supplemental pages and come up empty. I wanted to know if Google plans to change this behavior or not. But Matt also saw that I was looking for confirmation of two other points. He adroitly answered those two points and left the third one unanswered. I didn’t really expect a full disclosure on Google’s plans.

The two points that Matt confirmed? First, you move pages into the main Web index by getting more links. Matt took the opportunity during the session to discuss PageRank in a way that I’m not sure people have really considered. I’ll recap below.

Secondly, Matt confirmed that Google is NOT parsing the Supplemental pages the way they parse main Web index pages. Where he corrected me was in pointing out that Google does index “important words” and “popular words” that are found on Supplemental pages. So, what is an “important word”? I have no idea.

But I’ll come up with a few. I’m sure everyone will come up with a few ideas.

They apparently don’t handle phrases in the Supplemental Results. Now, while Google has been returning phrase-based results for years (if you do an Exact Find search), many people have recently noticed that Google has applied for a patent (or two or three) on phrase-based indexing. Try not to put phrase-based indexing blinders on when evaluating Matt’s reply. Google’s phrase handling is most likely very complicated. They have been constructing phrases from search expressions for years.

All I am saying is that Matt did not elaborate on what Google does with phrases, and too many SEOs would take his one remark and write an essay about Google’s phrase-based indexing. We have too little information to know for sure what Google is doing with phrase-based indexing, but we do know that the Supplemental Results index doesn’t get as much attention for phrase-based searching (which is different from phrase-based indexing) as the main Web index receives.

Toward the end of the day I subjected Matt to a little déjà vu: during the Personalized Search session, I asked him if it would be possible to use Personalized Search to find Supplemental Results. He struggled to provide a useful answer that didn’t give away too much, but I think he said it could be done.

I pointed out that I have trouble finding content on Matt’s blog when I am searching his Supplemental Pages. He asked for examples. I’ll have to provide him some. But the problem lies in finding pages that I want to refer to in a post somewhere and those pages have slipped into the Supplemental Results index.

There is good, valuable content to be found on Supplemental Pages. You just cannot (easily) get to that content through Google. Tim Mayer suggested we could use Yahoo! to find content on Matt’s blog. That’s an important counter-point. I think that Supplemental Results are here to stay and that we’ll have to develop tools and methods for optimizing Supplemental Results pages. You just won’t be able to get them all into the main Web index.

Why is that?

Because of the way Google is controlling PageRank. Google’s (Internal) PageRank, as the U.S. courts have pointed out, is an expression of opinion. That means Google is under no obligation whatsoever to award calculated PageRank to any pages. So any page that does receive (Internal) PageRank is somehow trusted by Google.

Matt pointed out that you only get so much PageRank. In responding to another question from someone who was struggling to get a large content site indexed, Matt suggested that the way to go would be to optimize your PageRank so that it is shifted toward your most important pages. I have made similar suggestions in the past, although not because of PageRank.

Google uses PageRank to determine where it will crawl, what it will store in the main Web index, and to weight search results. You’ll get only so much PageRank to spread across your pages. This is not a situation where “hoarding” PageRank makes any sense. Rather, this is a situation where you want to allocate your internal links in such a fashion that you push your PageRank toward your most important pages.

But linking out to other sites is not going to hurt you. Your PageRank is going to be used to calculate PageRank for other sites regardless of whether you “hoard” it or not. Think of PageRank as a piece of rope that passes through your hands for a while and then slips free of your grasp. You cannot stop it from leaving your grip. All you can do is make some use of it while it’s yours to use. But it will come back eventually so you can use it again.
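To make the allocation idea concrete, here is a minimal sketch using the textbook PageRank power iteration from the original published formula. It is not Google's internal calculation, which nobody outside Google can see. The five-page site, the page names, the damping factor of 0.85, and the function name `pagerank` are all assumptions chosen for illustration. The sketch compares a "flat" internal link structure with a "focused" one in which the secondary pages also link to the page you care about most.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank evenly over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: every secondary page links only back to the home page.
flat_site = {
    "home": ["product", "about", "terms", "archive"],
    "product": ["home"],
    "about": ["home"],
    "terms": ["home"],
    "archive": ["home"],
}

# Same pages, but the secondary pages also link to the important page.
focused_site = {
    "home": ["product", "about", "terms", "archive"],
    "product": ["home"],
    "about": ["home", "product"],
    "terms": ["home", "product"],
    "archive": ["home", "product"],
}

flat_ranks = pagerank(flat_site)
focused_ranks = pagerank(focused_site)

for page in flat_site:
    print(f"{page:8s} flat={flat_ranks[page]:.3f} focused={focused_ranks[page]:.3f}")
```

In this toy model the total across all pages is the same in both layouts; only the distribution changes, and the "product" page ends up with a larger share in the focused layout. That is the sense in which you get only so much PageRank and can choose where your internal links push it.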

Some of the other interesting points I took away from the conference sessions include:

  1. Google is okay with indexing search results from pages that add value (unique content) to the search results. Many ecommerce sites do indeed add value to their search results.
  2. Google cannot confirm in public whether click-through data is (or is not) used to determine search results
  3. Google seems to distinguish (at some level) between “expert” queries and “inexpert” queries
  4. Wikipedia may indeed be getting some sort of favorable treatment for “inexpert” queries (which seems a grave disservice to the majority of Google users)
  5. Owning multiple domains does not hurt you. Using Domains by Proxy is not a signal.
  6. If you have tripped algorithmic filters with 200 domains, your 201st domain may then be more closely scrutinized
  7. The Googlebomb filter is not run continuously. It is applied to the database once every 2-4 months. They appear to be analyzing links and need to let links accumulate before a linkbomb can be identified
  8. There is a live aspect to the Googlebomb filter that Matt did not elaborate on
  9. Google actively develops semantic matching functions (this is NOT an admission to use of Latent Semantic Indexing)
  10. Some Googlers advocate or favor LSI (this is not an admission that Google uses Latent Semantic Indexing)
  11. Webmasters can boost their semantic matching in their copy or let search engines try to figure it out
  12. Microsoft says that “duplicate content fragments your page and anchor text” (that is, the unique aspects of your page become less valuable because of duplicate content)
  13. Microsoft recommends use of absolute links for all HTTPS pages (some SEOs have been suggesting the same for years)
  14. All the search engines agreed that additional value justifies some replication of content
  15. Microsoft does NOT apply a site-wide penalty for duplicate content. They evaluate each page on its own merits
  16. Ask will not crawl known duplicate content sources, but they won’t treat templates as duplicate content
  17. Ask favors the “most (link) popular candidate” among duplicate results (I got the impression other search engines do too)
  18. Yahoo! is “less likely” to extract links from duplicate content and is less likely to crawl known duplicate content
  19. Yahoo! will look at “approximate page-level duplication” — the pages don’t have to be exact matches
  20. Yahoo! ignores boilerplate content (but the robots-nocontent microformat can be used to mark boilerplate content to help them)
  21. Mashup pages are considered suspect
  22. Weaving and stitching (“rewriting” copy by moving sentences and words around) is considered to be abusive duplication
  23. Microsoft treats an http-equiv meta refresh as a 301 redirect. The other search reps seemed to indicate they do too
  24. Most members of the audience USE XML sitemaps but very few update them weekly or use autodiscovery
  25. Yahoo! said that 30-35% of all queries are looking for opinion (subjective content) like product and service reviews
  26. Matt Cutts said that Personalized Search, when applied to news, improves click-throughs SIGNIFICANTLY (he asked that we not attribute specific numbers to him because he didn’t have the exact references)
  27. Matt said that 10% of people mis-spell queries (does that mean that 10% of queries are mis-spelled?)
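Items 19 and 22 in the list above both turn on the idea that pages can count as duplicates without being exact matches. None of the search reps described how they measure that, so the following is only a sketch of one common textbook approach: break each page into overlapping word "shingles" and compare the sets with Jaccard similarity. The function names, the shingle size of three words, and the sample sentences are all assumptions for illustration, not anything Yahoo!, Microsoft, Ask, or Google confirmed.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (same)."""
    set_a = shingles(text_a, size)
    set_b = shingles(text_b, size)
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical sample copy.
original = "the quick brown fox jumps over the lazy dog"
one_word_changed = "the quick brown fox leaps over the lazy dog"
unrelated = "search engines reward unique and useful content"

same_score = similarity(original, original)
near_score = similarity(original, one_word_changed)
far_score = similarity(original, unrelated)

print(same_score, near_score, far_score)
```

A page-level duplicate filter built this way would flag pairs whose score exceeds some threshold. Where a real search engine sets that threshold, and whether it uses shingles at all, is not something anyone said at the conference.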

All-in-all it was an interesting day.

Michael Wolf suggested that social media optimization will have a great impact on Personalized Search. Maybe. Maybe not.

We’ve actually gone through a form of social media networking on the Web once already. It happened in the 1990s. The most popular example of what you could call “Social Media 1.0” was the old Hit Counter site. For a few years, everyone wanted to be in the top 100 (or even the top 1000) list on Hit Counter. Those sites received a ton of traffic from Hit Counter (another case of “the rich keep getting richer”).

Hit-based traffic actually produced one of the few non-advertising driven revenue methods in the 1990s. But the Internet became too big; the Web outgrew the service’s ability to document the most interesting sites. You see, when your population reaches a certain point, it begins to fragment and then you need a lot of little “hit counter” type sites to help people find popular stuff they haven’t seen before.

Social media won’t go away. It never really has. But I think it will continue to evolve.

There is a large emphasis on social media at this conference. It doesn’t interest me that much because, among other things, a lot of people who count on social media optimization admit that most of the traffic is non-converting and most of the links are not very helpful. I took hardly any notes for the social media session, but someone pointed out that links from social media hubs today are equivalent to links from popular forums a few years ago.

Yes, that’s true. After all, Web forums were part of Social Media 1.0. If you feel like you’re missing out on the Social Media 2.0 opportunity, don’t worry. You’ll get another chance when Social Media 3.0 comes along.

Just be sure you’re ready to handle really technical stuff, because I think Social Media will always be considered “cutting edge” stuff. In its day, Hit Counter was considered a pretty cool gizmo. After all, it counted traffic to tens of thousands of Web sites….

About the Author

Michael Martinez is the Director of Search Strategies for 1st Query, an Internet Marketing firm offering organic SEO and PPC services.