Intro
There are two types of search engines: spider-driven engines and directories. Each search engine uses its own criteria for "ranking" sites, and this ranking determines a site's placement on the search engine results page (SERP). Spider-driven engines use robots to spider sites on the Internet. Robots "crawl" each site and "score" pages based on relevancy. Directories rely on humans, rather than robots, to check the sites. Some engines score only the index page, while others score individual web pages.
The goal of search engine optimization (SEO) is to achieve higher rankings. This is done in two steps. The first is to use industry information that allows professional optimizers to understand what these robots are looking for. This is a complex process because a Web site's placement within a spider-driven search engine is derived from hundreds of variables, such as link popularity, click popularity, keyword density and Web site themes. The second is to eliminate or reduce on-site techniques that can impede the search engine spidering process.
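Of the variables listed above, keyword density is the easiest to illustrate. The following is a minimal sketch, assuming density is defined as the share of a page's words taken up by a given phrase; the function name and the tokenizing rule are illustrative assumptions, and each engine measures relevancy according to its own algorithm.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Fraction of the words in `text` that belong to occurrences of `phrase`.

    Hypothetical helper for illustration only; real engines do not
    publish how (or whether) they compute this value.
    """
    # Crude tokenizer: lowercase runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the whole phrase appears in order.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


page = "Used cars for sale. Our used cars are inspected."
print(f"{keyword_density(page, 'used cars'):.2%}")
```

A very high value on a page usually means the phrase has been repeated unnaturally, which ties in with the spamming topics covered later.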
The topics mentioned below could either block search engine spiders from crawling your site accurately or ultimately get your site penalized in certain engines and/or directories. Take these into serious consideration when designing and/or optimizing your website.
SEO Topics Covered:
Duplicate Content
Creating duplicate content, mirrors or redirects might be one of the worst things you could possibly do if you want to succeed in the search engines. When search engines were first getting popular, you could simply point 10 domain names to the same Web site and they would all stack up on the same page of results for the same keywords. If you ranked well for one phrase, all 10 of those sites would do the same. This was a burden to the search engines, so now they use very sophisticated algorithms to filter out duplicate content.
These algorithms examine all aspects of site structure, image names, and matching text. When too many of these areas match another web site, it triggers a red flag and the site is penalized. Beware of mirror sites, affiliate sites, or any other "cookie-cutter" web marketing service that promises big profits with little effort. Today's engines will remove or reject duplicate content, so this usually leads to failure.
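To give a feel for how matching text can be measured, here is a minimal sketch that compares two pages by the word sequences they share. It uses word shingles and Jaccard similarity, a common textbook technique chosen here for illustration; it is not a description of any search engine's actual filter, and the function names and any threshold you apply are assumptions.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word sequences ("shingles") found in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by all shingles: 0.0 = no overlap, 1.0 = identical."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # both texts too short to compare
    return len(sa & sb) / len(sa | sb)


original = "We sell quality used cars in Ohio"
mirror = "We sell quality used cars in Texas"
print(jaccard_similarity(original, mirror))
```

A mirror page that changes only a word or two still shares most of its shingles with the original, which is why lightly edited copies are easy to spot.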
If you want to thrive on the web make sure your site has original and unique content. The safest way to get top search engine placement is to produce real content.
Following are three examples of the use of duplicate content:
Doorways
Many believe doorway pages are an essential aspect of effective search engine optimization. In an effort to improve rankings, however, some marketers have spammed the search engines with doorway pages, generating multiple pages with little information, which has made doorway pages a topic of much controversy. Search engines have responded to this practice and are now much stricter in their rules and requirements. Filters have been created to block the "spammer's rendition" of doorway pages.
A doorway page, or gateway page, is an alternate entryway to a web site created in the interest of obtaining a top ranking on particular keyword phrases in a major search engine. Doorway pages are often hosted at a different location than the original site. In other words, a new domain name is registered (usually one that includes keyword phrases) and the doorway page is created on that domain name, with links to a destination page on another web site. Typically, these pages match the look and feel of the original site. You should avoid registering a large number of domains with this tactic because it could be considered spam by the search engines and could get your site penalized.
Frames
Frames present some great possibilities from a web design standpoint, but they should be avoided if at all possible when it comes to search engine optimization and getting your site listed in the search engines. Many spider-based engines cannot crawl through them, and specific coding is necessary to make them readable by the engines. This coding is viewed as spam by most of the search engines, because spiders want to be able to read and view everything that a visitor to the site can. However, if your web site does use frames, make sure that you take advantage of the content area on your web site that doesn't utilize frames (the No Frames area). It's a very powerful section of the site, and if used properly it can result in some excellent rankings. The good news is that, despite the limitations frames pose, many frameset 'issues' can be turned into frameset 'positives.'
So, if you are going to use frames for search engine optimization make sure that you use them wisely. You can still create a pleasing interface on a two-frame set by specifying the dimensions of your top or bottom frame as 5 to 8 pixels or 5 to 8%. That should help you avoid the spam filters.
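One way to check whether a framed page gives spiders anything to read is to look at the text inside its No Frames area. The sketch below is a rough illustration that assumes the page marks that area with the standard HTML noframes element; the function names and the five-word minimum are arbitrary assumptions, and the regular expressions are a crude stand-in for a real HTML parser.

```python
import re

# Crude patterns, adequate for a sketch but not for arbitrary real-world HTML.
NOFRAMES_RE = re.compile(r"<noframes\b[^>]*>(.*?)</noframes\s*>", re.I | re.S)
TAG_RE = re.compile(r"<[^>]+>")


def noframes_text(html: str) -> str:
    """Return the visible text inside the first noframes element, or ''."""
    match = NOFRAMES_RE.search(html)
    if match is None:
        return ""
    # Drop inner tags and collapse runs of whitespace.
    return " ".join(TAG_RE.sub(" ", match.group(1)).split())


def frameset_is_crawlable(html: str, min_words: int = 5) -> bool:
    """True if the No Frames area holds at least `min_words` words (assumed threshold)."""
    return len(noframes_text(html).split()) >= min_words


page = """
<frameset rows="8,*">
  <frame src="top.html">
  <frame src="main.html">
  <noframes>
    <body>
      <p>Used cars for sale in Ohio.</p>
      <a href="inventory.html">Browse inventory</a>
    </body>
  </noframes>
</frameset>
"""
print(noframes_text(page))
print(frameset_is_crawlable(page))
```

The text placed in the No Frames area should match what visitors see in the frames themselves; stuffing it with unrelated keywords would fall under the spamming techniques discussed later.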
Cloaking
Cloaking, also known as spoofing, is a method of web page delivery where different pages are served from the same address, depending on whether the visitor is a human or a spider. In other words, browsers such as Internet Explorer are served one page, and spiders visiting the same address are served a different page, usually an optimized page. There are two methods of delivering cloaked pages: by IP address and by agent name.
There are two reasons people use cloaking techniques.
With cloaking, nobody sees the optimized page except the spider. That gives cloaked pages an extremely powerful advantage over web pages whose optimization had to accommodate a professionally appealing design.
But, cloaking may be one of the most frowned upon techniques among all engines. Filters will pick up pages like the following in no time:
IP cloaking is abusive in how it attempts to manipulate a search engine's index. Since IP cloaking is deceptive, search engines routinely purge IP cloaked pages and in some cases ban these web sites permanently.
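The agent-name method described above can be illustrated with a simple check: request the same address twice, once identifying as a browser and once as a spider, and compare the results. This is a minimal sketch under stated assumptions. The `fetch` callable, the two agent strings and the function name are all hypothetical, and the check cannot see IP-based cloaking, because both requests come from the same address. Pages with dynamic content can also differ between requests for legitimate reasons.

```python
from typing import Callable

# (url, user_agent) -> page body; supplied by the caller, hypothetical interface.
Fetch = Callable[[str, str], str]

BROWSER_UA = "Mozilla/5.0 (hypothetical browser)"
SPIDER_UA = "ExampleSpider/1.0 (hypothetical crawler)"


def serves_different_pages(url: str, fetch: Fetch) -> bool:
    """True if the page body differs between the browser and spider agent names."""
    # Collapse whitespace so trivial formatting differences are ignored.
    browser_page = " ".join(fetch(url, BROWSER_UA).split())
    spider_page = " ".join(fetch(url, SPIDER_UA).split())
    return browser_page != spider_page


def fake_cloaking_server(url: str, user_agent: str) -> str:
    """Stand-in for a real HTTP request, used only to demonstrate the check."""
    if "Spider" in user_agent:
        return "used cars used cars cheap used cars"
    return "<h1>Welcome to our showroom</h1>"


print(serves_different_pages("http://www.example.com/", fake_cloaking_server))
```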
Link Farms
Since so many engines use link popularity as an integral part of their ranking algorithms, many webmasters responded by joining link farms and stuffing their sites and others with as many links as possible. But not all links are good links. In fact, bad linking strategies may get you banned from some engines.
A link farm is a network of web pages that are heavily cross-linked with each other for the sole purpose of increasing link popularity. The web pages are usually on more than one domain or on more than one server. When a web site joins a link farm, it gets a link from each of these pages, and in turn it has to link back to each of those pages. This then affects the link popularity of the site. But search engines do detect link farms, as well as the web sites participating in them. Google®, for one, disapproves of link farms and labels the links they generate as spam. In fact, some sites get removed from the index altogether if they are affiliated with link farms or link stuffing.
Because of this, some webmasters have chosen to remove all links going out to other sites. That is an overreaction that decreases the site's value to visitors and hurts the Web in general, because cross-linking is a basic tenet of the Internet. Links are fine, even encouraged, if they are related to your topic, but link farms rarely provide useful content to visitors. If your site is selling cars, linking to car parts sites, car forums and other car-related sites is very safe and encouraged. You are only providing access to other sites that are of interest to your visitors. But if you signed up with a service that promises to generate five hundred inbound links to your site only if you agree to add two hundred outbound links in return, then you are likely participating in a link farm.
Instead of linking to related information of value to your visitors, you are sending them to sites with non-relevant and useless information. Search engines will not penalize you for good, relevant links, but are quick to punish sites that try to spam them with unrelated links.
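The give-a-link, get-a-link pattern described above leaves a visible footprint: almost every outbound link is matched by a link coming back. The sketch below measures that footprint on a small, made-up link map. The function name, the data layout and the example domains are assumptions for illustration, and a high ratio is only a warning sign, not proof, since related sites often link to each other for good reasons.

```python
def reciprocal_link_ratio(site: str, links: dict[str, set[str]]) -> float:
    """Share of a site's outbound links that are answered by a link back.

    `links` maps each site to the set of sites it links to (hypothetical data).
    """
    outbound = links.get(site, set()) - {site}  # ignore links to itself
    if not outbound:
        return 0.0
    reciprocal = {target for target in outbound if site in links.get(target, set())}
    return len(reciprocal) / len(outbound)


# A car site linking to related resources; only the forum links back.
natural = {
    "cars.example": {"parts.example", "forum.example"},
    "parts.example": set(),
    "forum.example": {"cars.example"},
}

# Three sites that all link to each other and to nothing else.
farm = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"a.example", "c.example"},
    "c.example": {"a.example", "b.example"},
}

print(reciprocal_link_ratio("cars.example", natural))
print(reciprocal_link_ratio("a.example", farm))
```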
Spider Design Blocks
Despite the best efforts to make your site look unique and attractive, some of the web's most prized web design technology can be a major stumbling block for a search engine spider.
Flash sites (or Flash introductions), while beautiful, cannot be read by a spider. Your options are to use an entrance page that is rich in keyword phrases, to create a two-frame frameset where one frame is only one pixel high and use the No Frames area, or to alternate the use of Flash and static HTML. Following are design attributes that block spiders:
Search Engine Spamming
Search engine spamming is the use of unethical techniques to improve the position of a Web site in a search engine. Web site owners who use these techniques are trying to fool the search engines into ranking their sites higher than they deserve.
Each search engine's objective is to produce the most relevant results to its visitors. Producing the most relevant results for any particular search query is the determining factor of being a popular search engine. Every search engine measures relevancy according to its own algorithm, thereby producing a different set of results. Search engine spam occurs if anybody tries to artificially influence a search engine's basis of calculating relevancy.
The following techniques can be considered spamming:
Search Engine Optimization – Network Solutions® Overview
Search engines strive to provide the most relevant results to their users, but spam swamps their indexes with irrelevant and misleading information. Therefore, it is advisable to stay clear of anything that could be seen as spam by the engines. Instead, focus on an ethical approach to SEO. Search engines will always react to spam techniques once they become a big enough issue to affect searchers. Banning is a last resort, but it has definitely been known to happen.
The following list will give you an idea of the basic "DON'Ts" for the search engines:
As you can see, there are a lot of ways to fool the search engines, but just about all of them are detectable - and that makes them very dangerous.
If you are serious about custom delivery to the engines, there is really only one way to go, and that is with professional search engine optimization.