The Inevitable SEO Terminology Guide: Advanced SEO Dictionary
Posted by admin on December 24, 2006 in SEO Theory
Just so my loyal readers know, this will be my last post until after Christmas.
And I’m not interested in compiling another rehash of commonly defined (or misdefined) SEO terms. There are plenty of such “SEO dictionaries” out there (for example, see The Martinez Dictionary of SEO and Spam Terminology, 2006 Edition at SEOmoz). No, the point of this post is to give me a reference I can come back to for hard-to-remember, seldom-used terminology. I hate spending hours and hours and hours looking for definitions I cannot remember.
Of course, there are no universally accepted definitions in search engine optimization, but this list will help you understand me when I use these terms.
Advanced SEO Dictionary
Clustered Results - noun phrase. You see this most often with Ask, but it happens in Google quite a bit and I think Yahoo! also does it sometimes. You’ll see 2 pages from the same site, the 2nd one indented. Now, Ask likes to put little folders in the margin to show how smart they are about clustering search results from multiple sites under a single topic. But did you know that Google clusters sites and hides them from you? If you change your Google Preferences to show more than ten listings per page, you’ll see the clustered listings. That is why so many data center tools show you different rankings from what you think you’re seeing.
Collapsed Results or Collapsed Listings - noun phrases. Usually what I call Clustered Results when I cannot think of the word “cluster” (which is more often than not). Technically, these expressions should really only refer to the hidden clusters described above.
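To make the clustering idea concrete, here is a minimal Python sketch. It assumes a search engine shows at most two listings per host, indents the second, and collapses the rest. The function name cluster_results, the per_host limit, and the sample URLs are all made up for illustration; no search engine has published how it actually does this.

```python
from urllib.parse import urlsplit

def cluster_results(ranked_urls, per_host=2):
    """Group a ranked list of URLs by host, keeping at most per_host per host.

    Returns (visible, hidden). visible is a list of (url, indented) tuples in
    display order; hidden holds the collapsed listings.
    Hypothetical illustration only, not any search engine's real logic.
    """
    clusters = {}  # host -> URLs in rank order (dicts preserve insertion order)
    for url in ranked_urls:
        host = urlsplit(url).hostname
        clusters.setdefault(host, []).append(url)

    visible, hidden = [], []
    for urls in clusters.values():
        for position, url in enumerate(urls):
            if position < per_host:
                # Every listing after the first one for a host is indented.
                visible.append((url, position > 0))
            else:
                hidden.append(url)
    return visible, hidden

ranked = [
    "http://example.com/a",
    "http://other.example/x",
    "http://example.com/b",
    "http://example.com/c",
]
visible, hidden = cluster_results(ranked)
for url, indented in visible:
    print(("    " if indented else "") + url)
print("collapsed:", hidden)
```

In this sketch the page at example.com/b was ranked third but is displayed second, pulled up under its sibling, and example.com/c is not displayed at all. That kind of reshuffling is one way the position you see can differ from the position a tool reports.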
Filter - noun. A process whereby a Web document is evaluated and flagged as “spam”, “potential spam”, “adult-oriented content”, “illegal content”, or something else. Each search engine employs multiple filters. Some filters were designed into the algorithms from the start. Some filters have been added as afterthoughts as search engines have had to react to manipulative or otherwise previously undetected inappropriate content.
Host - noun. A much-used term in academic search engineering literature to distinguish between “Web document collections” on a systemic level. A host is not necessarily the same as a site. Hosts are generally defined to be either entire domains (example.com) or sub-domains (sub1.example.com). A domain to which one or more sub-domains belong would be treated as multiple individual hosts, distinct from one another. A host is easier to identify than a Web site, which may be only a part of a host’s content.
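As a rough illustration of the host idea, the Python sketch below pulls the host out of each URL with the standard library and groups URLs by it. The helper names host_of and group_by_host are invented for this example. It treats every distinct hostname as its own host, so example.com, www.example.com, and sub1.example.com all count separately; a real search engine may well fold some of those together.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def host_of(url):
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return urlsplit(url).hostname or ""

def group_by_host(urls):
    """Group URLs into per-host collections (hypothetical helper)."""
    groups = defaultdict(list)
    for url in urls:
        groups[host_of(url)].append(url)
    return dict(groups)

urls = [
    "http://example.com/index.html",
    "http://sub1.example.com/index.html",
    "http://example.com/about.html",
]
for host, pages in group_by_host(urls).items():
    print(host, pages)
```

Note that nothing here tries to decide where one "site" ends and another begins. Splitting on the hostname is purely mechanical, which is why a host is the easier unit to work with.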
Index - noun. The database(s) against which queries are resolved. All of the major search engines maintain multiple indexes. Each is a separate, distinct database, either physically (kept in separate files) or virtually (logically segmented portions of a master database). The expression database is probably inappropriate for describing what the search engines maintain. When you see me refer to Main Index, think of that as the “static Web page index”. Other indexes may include Image Indexes, News Indexes, and Blog Indexes. I have some ideas on how these various indexes are built, but I don’t expect to share them on this blog.
Index - verb. The process of adding information about Web content to a search engine’s database about the Web. The indexing process may entail considerable effort depending upon the complexity and applicability of the document.
Indexer - noun. A type of program that search engines use to update their databases with information about retrieved and parsed Web documents. You rarely see even knowledgeable SEO forum moderators and admins speak of indexers and parsers, perhaps out of a misguided concern that they will confuse people who are new to search engine optimization. Unfortunately, those new people visit the forums to learn about SEO, so teaching them the wrong terminology does them a great disservice.
Internal Links or Internal Linkage - noun phrase. These are the links within your own site that point to other pages in your site. Search engines may use a different, host-level definition for internal links. It is possible that all the major search engines now distinguish between host-internal and host-external links. See host for more information.
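Here is a small Python sketch of what a host-level split between internal and external links might look like. It is an assumption about how such a test could work, not a description of any search engine's code, and the names LinkCollector and classify_links are made up. A link counts as host-internal only if it resolves to exactly the same hostname as the page it appears on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def classify_links(page_url, html):
    """Split a page's links into (host_internal, host_external) lists."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).hostname
    internal, external = [], []
    for href in collector.hrefs:
        target = urljoin(page_url, href)  # resolve relative links
        if urlsplit(target).hostname == page_host:
            internal.append(target)
        else:
            external.append(target)
    return internal, external

html = (
    '<a href="/about">About</a> '
    '<a href="http://sub1.example.com/">Sub-domain</a> '
    '<a href="http://example.com/contact">Contact</a>'
)
internal, external = classify_links("http://example.com/index.html", html)
print("host-internal:", internal)
print("host-external:", external)
```

Under this definition the link to sub1.example.com lands in the external list even though most site owners would call it an internal link. That is exactly the gap between the site-level and host-level definitions.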
Internal PageRank - noun phrase. This is the actual static value that Google computes and adds to dynamic (run-time, query-time) relevance scores to determine search results rankings. Matt Cutts distinguished between Internal PageRank and Toolbar PageRank on his blog. He also confirmed that he was talking about Internal PageRank where I cited him in my PageRank: Where it helps, where it doesn’t help, and other facts post at Spider-Food in July 2006. Most SEO forum moderators and admins appear to be speaking about Internal PageRank when they discuss PageRank at all, except where they qualify their remarks to address the Toolbar PR value (that nearly all moderators and admins now tell people to ignore). The Toolbar PR value is a proxy value and it is only published 3-4 times a year, making it a virtually worthless indicator of quality or value.
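The static-plus-dynamic idea can be shown with a toy Python sketch. It assumes, as described above, that a precomputed static value is simply added to a query-time relevance score. The function name rank_results and every number in it are invented for illustration; Google's real scoring is not public.

```python
def rank_results(static_scores, query_scores):
    """Order the documents that matched a query by static + dynamic score.

    static_scores: {url: precomputed, query-independent value}
    query_scores:  {url: relevance computed at query time}
    Both inputs are hypothetical values, not real PageRank data.
    """
    combined = {
        url: dynamic + static_scores.get(url, 0.0)
        for url, dynamic in query_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)

static_scores = {"http://example.com/a": 0.2, "http://example.com/b": 0.6}
query_scores = {"http://example.com/a": 0.7, "http://example.com/b": 0.2}
print(rank_results(static_scores, query_scores))
```

In this made-up case page a outranks page b despite the lower static value, because its query-time relevance more than makes up the difference.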
Parser - noun. A type of program used by search engines to break down your HTML pages into components for indexing. The parser strips your indexable content and passes it to one or more indexers. Many SEO forum moderators and admins who should know better continue to speak of “spiders” doing the parsing and indexing. Spiders basically retrieve files and place them into (search engine internal) queuing areas for the parsers to munch on.
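The division of labor between spider, parser, and indexer can be sketched in a few lines of Python. This is a toy model under loud assumptions: the spider is replaced by a hard-coded list of already-fetched pages, the parser keeps only visible text, and the indexer builds a simple word-to-URL lookup. The names TextExtractor, parse, index_document, and run_pipeline are all hypothetical.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Parser stage: keep visible text, skip <script> and <style> bodies."""
    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.lower().split())

def parse(html):
    """Strip the indexable words out of one HTML document."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.words

def index_document(index, url, words):
    """Indexer stage: record which URLs contain each word."""
    for word in words:
        index[word].add(url)

def run_pipeline(fetched_pages):
    """Drain a queue of (url, html) pairs through the parser and indexer."""
    queue = deque(fetched_pages)  # stands in for the spider's output queue
    index = defaultdict(set)
    while queue:
        url, html = queue.popleft()
        index_document(index, url, parse(html))
    return index

pages = [
    ("http://example.com/a", "<html><script>var x = 1;</script><p>Blue widgets</p></html>"),
    ("http://example.com/b", "<html><p>Red widgets</p></html>"),
]
index = run_pipeline(pages)
print(sorted(index["widgets"]))
```

The spider never appears in the code as anything more than a queue of files, which is the point: fetching, parsing, and indexing are separate jobs.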
Partially Indexed Listings - See URL Listings below.
Quality Links - noun phrase. A nonsense expression with no real value or purpose other than to act as a catchall for the types of links people think are better than “those other links”. Googlers use “quality links” as a subtle way of telling people to stop getting cheap spammy links. Many SEO forum moderators and admins use “quality links” in a somewhat broader but similar fashion, if only because they don’t know exactly what criteria make links good for any particular search engine but they recognize that people who are asking about linkage have a problem. Nearly everyone else seems to use the expression to refer to their (usually non-performing) backlinks. I wrote about high quality links at SEOmoz (in a post designed to rank for “high quality links” on the basis of content — but the lesson passed over everyone’s head, except for Aaron Pratt who saw what I was doing right away).
SERP - acronym for Search Engine Results Page. Everyone seems to know this acronym by now. I have always hated it even though I now reluctantly use it. SRP (search results page) would be better, since it’s all-inclusive. You can have a DRP (Directory Results Page) which some people might argue should be called a DSRP (Directory Search Results Page). I still get click-throughs from Yahoo! and DMOZ directory page listings (or a DLP, Directory Listings Page).
Sitelinks - noun. Google invented this term, which is better than my classic “little clustered links under the main listing”. Sitelinks are those “little clustered links under the main listing” that deep link into the site by category or topic. Many people wonder how these Sitelinks appear. Googlers always say, “That’s algorithmically determined and we have no control over them” — meaning, “We wrote special commands into our software to create those things and we’re not going to tell you what criteria are used to decide which sites get them.” My best guess is that sites that have more than 1,000 pages of content, clear content categorization in their non-breadcrumb internal links, and lots of deep links from other domains are good candidates for Sitelinks. Other criteria are probably taken into consideration. Sitelinks are only shown for the top listing in a popular query result.
Sitemap - noun. A page on your Web site that links to all the other pages, or at least to all the important section top-level pages. Google has usurped this expression for their “Google Sitemaps” feature (now incorporated into the XML Sitemaps standard supported by several major search engines), where you can upload a file listing all of your pages for their crawlers. I have noticed that Googlers are now speaking of HTML Sitemaps to distinguish those Web site pages from the XML Sitemaps. I think it would be best if everyone adopted the convention of saying “XML Sitemap” or “HTML Sitemap” so we are all on the same page.
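For anyone who has not seen one, an XML Sitemap is just a list of URLs wrapped in a small amount of markup. The Python sketch below builds a minimal one with the standard library. The function name build_xml_sitemap and the example.com URLs are placeholders; the namespace is the one defined by the sitemaps.org protocol. A production file would normally also begin with an XML declaration and might carry optional per-URL fields, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_xml_sitemap(urls):
    """Return a minimal XML Sitemap listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

print(build_xml_sitemap([
    "http://example.com/",
    "http://example.com/about",
]))
```

An HTML Sitemap, by contrast, is an ordinary page of links meant for visitors as well as crawlers, so the two really are different things that happen to share a name.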
Trust - noun. Currently the latest SEO buzzword. Generally speaking, the SEO community picks up on a concept about six months to two years after it’s been worked through by the search engineers. Hardcore spammers (the ultimate “Black Hat” SEOs) are usually pretty good at detecting trends before everyone else. Trust has now officially been done to death. It is incorporated into every algorithm (including Windows Live even though we all agree that Microsoft still has a way to go) and the search engines are already looking at other issues. Trust is being placed in the hands of the Webmasters, but most Webmasters don’t seem to want the responsibility.
Update - noun. From the SEO side, an update is any noticeable change to the way a search engine behaves. From the search engines’ side, an update is any intended change in a search engine’s makeup or data. Matt Cutts offers an incomplete explanation of a Google update in his December 2006 Explaining Algorithm Updates and Data Refreshes post. He wrote a similar post in September 2005 with What’s An Update?. I don’t expect Matt to confirm every algorithmic change. That would pretty much defeat the purpose of many of them. Yahoo! and Windows Live occasionally issue “weather reports”. Matt has informally issued some on Google’s behalf.
URL Listings or URL-only Listings - noun phrase. These are the site listings that appear in Google with nothing more than a URL. Matt Cutts explained that they are uncrawled links that Google knows something about from inbound linkage. Google will (or used to) occasionally pull a description from the Open Directory Project for uncrawled links, but you often see them without any description at all. Uncrawled links are not shown in Google’s SafeSearch mode. Matt also discussed them here.
Validate - verb. Every time I use this word people reach for their W3C manuals. When I speak of search engines validating Web sites, I don’t mean they are looking to see if the HTML code meets some arbitrary standard. I mean they pass each URL through a process whereby they establish, according to their own criteria, that the site is “not spam”. Many spam sites appear to validate. The search engines are not perfect. Nonetheless, many spam sites don’t last long because they don’t validate or their validation is revoked. Maybe I could have used a better expression, but I can’t think of one.
And that’s all I can think of for now.