Showing posts with label internet search engines. Show all posts

Keyword Research is by far the most important aspect in any Search Engine Optimization initiative.

Keyword Phrase Research is the process of selecting the best-performing keyword phrases that can help visitors find your site. You may have spent days and months fine-tuning your web pages for a better ranking with the major search engines, yet it will all amount to a big waste if the right keyword phrases are not targeted. It's like not being able to reach your destination even after running your best race because you started out on the wrong road. Even if you achieve high search engine rankings, you may not get relevant traffic if you select the wrong keywords. Therefore, the foremost step in any SEO campaign is identifying your target audience and researching which keyword phrases they are likely to type into search engines to locate a site like yours.

For any marketing strategy to succeed, it is critical to know your audience and the means to reach them. Since we are talking about a specific audience, a certain focus is required: it could be specific to a location, region or country, or to a business, trade, service or product. For instance, a dentist practicing in a particular town would most likely target people living in the same region, instead of targeting the entire country, just as a patient searching for a dentist would search for one in their own area. Focusing on the region would help the dentist get targeted visitors, not just wasted traffic.

Common Pitfalls

A common pitfall is to start the website optimization exercise with a set of "gut-feel" keyword phrases. Site owners often come up with key phrases that look like common sense but may not match the search terms your buyers actually use. Very often, being from within the trade narrows the vision, and you tend to assume that trade-specific terms are easily understood and popularly used. Not so. You need to think outside the box.

Doing Keyword Research invariably means departing from one's gut-feel and going by the facts. 'Facts are sacred' in website optimization as they provide the exact data of what people are actually searching for, thus saving you from starting on a wild goose chase. As mentioned earlier, targeting the wrong key phrases might get you a good ranking for keywords that have few or no search requests or just get you irrelevant junk traffic. So, how does one get the facts and the data regarding a particular search term? There are several online keyword research tools like Wordtracker and Overture, which offer data pertaining to your search term. Relying on search tools to analyze keyword phrase data helps you to get a grip on your target audience.

Keyword Research Process

The Keyword Research process involves the following steps:

  • Discovering Keywords
  • Analyzing Keywords
  • Selecting Keywords
  • Deploying Keywords

The Discovering phase should focus on identifying as many keywords as possible that are related to your website and target audience.

The Analysis phase involves adding information about existing competition, PageRank-based limitations, and the potential for ranking.

The Selection phase involves shortlisting keywords by objective measurement, keeping the site focus and target audience in mind and staying within the limitations found during analysis.

The Deploying phase is about making optimum use of your selected keywords on your website copy, HTML code and tags.

What is Google?

“Googol” is the mathematical term for a 1 followed by 100 zeros. The term was coined by Milton Sirotta, nephew of American mathematician Edward Kasner, and was popularized in the book, “Mathematics and the Imagination” by Kasner and James Newman. Google's play on the term reflects the company's mission to organize the immense amount of information available on the web.

Google Technology

Google.com began as an academic search engine. In the paper that describes how the system was built, Sergey Brin and Lawrence Page give an example of how quickly their spiders can work. They built their initial system to use multiple spiders, usually three at one time. Each spider could keep about 300 connections to Web pages open at a time. At its peak performance, using four spiders, their system could crawl over 100 pages per second, generating around 600 kilobytes of data each second.

Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:

* Googlebot, a web crawler that finds and fetches web pages.
* The indexer that sorts every word on every page and stores the resulting index of words in a huge database.
* The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.

Let's take a closer look at each part.

Googlebot, Google's Web Crawler

Googlebot is Google's web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It's easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn't traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, and then handing it off to Google's indexer.

Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it's capable of doing.
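The idea of deliberately slowing requests to each individual server can be sketched as a small scheduler. This is only an illustration, not Google's actual mechanism: the function name `schedule_fetches`, the two-second delay and the example URLs are all assumptions made for the sketch.

```python
from collections import defaultdict
from urllib.parse import urlparse


def schedule_fetches(urls, per_host_delay=2.0):
    """Give each URL an earliest fetch time so that requests to the
    same host are at least per_host_delay seconds apart.
    (Hypothetical helper; the delay value is an assumption.)"""
    next_free = defaultdict(float)  # host -> earliest time of its next request
    schedule = []
    for url in urls:
        host = urlparse(url).netloc
        start = next_free[host]
        schedule.append((start, url))
        next_free[host] = start + per_host_delay
    # Different hosts can be fetched at the same moment; same host must wait.
    return sorted(schedule)


urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/x",
    "http://example.com/c",
]
plan = schedule_fetches(urls, per_host_delay=2.0)
for start, url in plan:
    print(f"t={start:.1f}s  {url}")
```

Pages on different hosts are scheduled in parallel, while the three pages on example.com are spread out over time.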

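Googlebot finds pages in two ways: through an add URL form, www.google.com/addurl.html, and through finding links by crawling the web.

Google's Indexer

The indexer sorts every word on every page and stores the resulting index in a huge database. This structure allows rapid access to documents that contain user query terms.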

To improve search performance, Google ignores (doesn't index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, as well as converting all letters to lowercase, to improve Google's performance.
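The indexing steps described above (lowercasing, ignoring punctuation and extra spaces, skipping stop words) can be sketched as a tiny inverted index. This is a minimal illustration, not Google's implementation: the stop-word set is just the words named above, and `tokenize`, `build_index` and the sample pages are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: only the example stop words listed in the text.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def tokenize(text):
    """Lowercase the text and split it into words, ignoring
    punctuation and repeated spaces."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each non-stop word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            if word in STOP_WORDS:
                continue  # stop words are not indexed
            index[word].setdefault(doc_id, []).append(pos)
    return index


docs = {
    "page1": "The history of the Web.",
    "page2": "How  the WEB is indexed",
}
index = build_index(docs)
print(sorted(index))   # ['history', 'indexed', 'web']
print(index["web"])    # {'page1': [4], 'page2': [2]}
```

Looking up a query term is then a single dictionary access, which is why such an index allows rapid access to matching documents.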

Google's Query Processor

The query processor has several parts, including the user interface (search box); the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.

Google considers over a hundred factors in determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. PageRank is Google's system for ranking web pages.

Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling-correcting system uses such techniques to figure out likely alternative spellings.

Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page, options offered by the Advanced-Search page and search operators.
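To make the idea of favoring terms that are near each other and in query order more concrete, here is a rough sketch of one possible proximity score. The scoring rule (1 divided by the gap between adjacent query terms) is an assumption chosen for illustration; Google's real formula is not public.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def proximity_score(query, text):
    """Sum, over adjacent pairs of query terms, of 1 / (smallest forward
    gap between them in the text). Returns 0.0 if any term is missing
    or the terms never appear in query order. (Hypothetical rule.)"""
    words = tokenize(text)
    terms = tokenize(query)
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    if any(not p for p in positions):
        return 0.0
    score = 0.0
    for left, right in zip(positions, positions[1:]):
        # Only count occurrences where the right term follows the left term.
        gaps = [r - l for l in left for r in right if r > l]
        if gaps:
            score += 1.0 / min(gaps)
    return score


adjacent = proximity_score("cheap flights", "cheap flights to paris")
spread = proximity_score("cheap flights", "cheap hotels and flights")
reversed_order = proximity_score("cheap flights", "flights that are not cheap")
print(adjacent > spread > reversed_order)  # True
```

Under this rule a page containing the exact phrase scores highest, a page with the words spread apart scores lower, and a page with the words in the opposite order scores lowest.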

Let's see how Google processes a query.

History of Site Ranking

In the early 1990s, when the web was emerging, several sites with industry-specific content were being added to the web each day. Web surfers, on the other hand, had very few tools to locate such sites, which they believed were out there but whose domain names or web addresses they did not know. With the birth of Yahoo in 1994, surfers were offered some relief. Yahoo classified each site it discovered in a neatly organized directory list and also embedded a search engine in its site to search for sites based on 'keywords' existing in its database. Several other search engines, such as AltaVista, Excite and Lycos, followed the trend, offering site search facilities to users. Most of these search engines relied heavily on Meta Tags to classify the relevance of websites based on the keywords they found in the tags.

Things seemed to work out fine until site owners and webmasters realized that they could 'embed' industry-specific keyword phrases in their Meta Tags and other site code, thus manipulating their way higher in search results. Over time, search engine results became cluttered with sites that stuffed their code with relevant keywords but offered poor content to the visitor. The credibility and importance of search engines was now being challenged, and they had to find a way to offer more refined search results to their users.

What is PageRank?

PageRank is an algorithm developed by Google founders Larry Page and Sergey Brin at Stanford University. It measures the importance of a web page on a scale from 0 to 10, where 10 is the highest. The main factor behind the PageRank algorithm is link popularity. If one site links to another site, Google interprets this link as a vote; the more votes cast, the more important the page must be. ...

From here on in, we'll occasionally refer to PageRank as “PR”.

Note:

Not all links are counted by Google. For instance, they filter out links from known link farms. Some links can cause a site to be penalized by Google. They rightly figure that webmasters cannot control which sites link to their sites, but they can control which sites they link out to. For this reason, links into a site cannot harm the site, but links from a site can be harmful if they link to penalized sites. So be careful which sites you link to. If a site has PR0, it is usually a penalty, and it would be unwise to link to it.

Emergence of Google PageRank

Google realized the problem conventional search engines faced in dealing with this situation. If the control of relevance remained with the webmasters, the ranking results would remain contaminated with sites artificially inflating their keyword relevance.

The Web, by its very nature, is based on hyperlinks, where sites link to other prominent sites. If you accept that you would tend to link to sites that you consider important, then in essence you are casting a vote in favor of the sites that you link to. When hundreds or thousands of sites link to a site, it is logical to assume that such a site is good and important.

Taking this logic further, the Google founders, Sergey Brin and Larry Page, formulated a search engine algorithm that shifted the ranking weight to off-page factors. They devised a formula called PageRank (named after Larry Page) in which the algorithm counts the sites that link to a page and assigns the page an importance score on a scale of 0-10. The more sites that link to a page, the higher its PageRank.

The Google Toolbar

You can download Google Toolbar (free) and install it in your Internet Explorer within minutes. Amongst other useful features, it displays the PageRank of each web page you visit.

The Google Toolbar appears in your Internet Explorer window and can be used to search the web from any page. It displays the PageRank of each web page on a scale of 0-10. If you have the Google Toolbar installed in your browser, you will be used to seeing each page's PageRank as you browse the web. Google does not display the PageRank of web pages that it has not indexed. Please note that the Toolbar displays the PageRank of individual pages and not of the site as a whole.

PageRank in Google's own Words

Google explains PageRank as follows:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But Google looks at more than the sheer volume of votes, or links, a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don't match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content (and the content of the pages linking to it) to determine if it's a good match for your query.

Relationship between Search Engine Ranking and PageRank

While the exact algorithm of each search engine is a closely guarded secret, search engine analysts believe that a page's position in the search results (its ranking) is some form of product of ‘Page Relevance’ and ‘PageRank’. PageRank itself is calculated with a formula that looks like this:

PR(A) = (1 - d) + d (PR(t1)/C(t1) + ... + PR(tn)/C(tn))

That's the equation that calculates a page's PageRank. It's the original one that was published when PageRank was being developed, and it is probable that Google uses a variation of it but they aren't telling us what it is. It doesn't matter though, as this equation is good enough.

In the equation, t1 to tn are the pages linking to page A, C(t) is the number of outbound links on page t, and d is a damping factor, usually set to 0.85.

We can think of it in a simpler way:

A page's PageRank = 0.15 + 0.85 * (a “share” of the PageRank of every page that links to it)

“share” = the linking page's PageRank divided by the number of outbound links on the page.

A page “votes” an amount of PageRank onto each page that it links to. The amount of PageRank that it has to vote with is a little less than its own PageRank value (its own value * 0.85). This value is shared equally between all the pages that it links to.
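The equation can be applied repeatedly until the values settle. The sketch below does this for a made-up three-page web; the function name `pagerank`, the starting value of 1.0, the iteration count and the example link graph are assumptions for illustration, and the code assumes each page lists a given target at most once.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(t) / C(t)).

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page passes on its PageRank divided by
            # its number of outbound links (its "share").
            share = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * share
        pr = new_pr
    return pr


# Hypothetical web: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page C ends up with the highest value because it receives votes from both A and B, while B, which gets only half of A's vote, ends up lowest.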

From this, we could conclude that a link from a page with PR4 and 5 outbound links is worth more than a link from a page with PR8 and 100 outbound links. The PageRank of a page that links to yours is important, but the number of links on that page is also important. The more links there are on a page, the less PageRank value your page will receive from it.

If the PageRank value differences between PR1, PR2 ...PR10 were equal then that conclusion would hold up, but many people believe that the values between PR1 and PR10 (the maximum) are set on a logarithmic scale, and there is very good reason for believing it. Nobody outside Google knows for sure one way or the other, but the chances are high that the scale is logarithmic, or similar.
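A quick calculation shows why the scale matters for the PR4 versus PR8 comparison. The sketch below computes the value a page passes to each page it links to, first treating the displayed number as the real value and then treating it as a logarithm. The base of 5 is purely an assumption for illustration; nobody outside Google knows the real base, or whether the scale is logarithmic at all.

```python
def vote_value(displayed_pr, outbound_links, d=0.85, log_base=None):
    """PageRank passed to each linked page.

    With log_base=None the displayed number is used as the real value.
    Otherwise the real value is assumed to be log_base ** displayed_pr.
    """
    real_pr = displayed_pr if log_base is None else log_base ** displayed_pr
    return d * real_pr / outbound_links


# Reading the scale as linear: the PR4 page with 5 links passes on more.
linear_pr4 = vote_value(4, 5)
linear_pr8 = vote_value(8, 100)
print(linear_pr4 > linear_pr8)  # True

# Reading the scale as logarithmic with a hypothetical base of 5:
# the PR8 page passes on more, even with 100 outbound links.
log_pr4 = vote_value(4, 5, log_base=5)
log_pr8 = vote_value(8, 100, log_base=5)
print(log_pr8 > log_pr4)  # True
```

So the conclusion about PR4 and PR8 links depends entirely on which kind of scale the displayed numbers use.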

Whichever scale Google uses, we can be sure of one thing. A link from another site increases our site's PageRank. Just remember to avoid links from link farms.

Source: Google.com

What is Search Engine?

Internet search engines (e.g. Google, AltaVista) help users find web pages on a given subject. The search engines maintain databases of web sites and use programs (often referred to as “spiders” or “robots”) to collect information, which is then indexed by the search engine. Similar services are provided by “directories”, which maintain ordered lists of websites, e.g. Yahoo!

How Internet Search Engines Work

The good news about the Internet and its most visible component, the World Wide Web, is that there are hundreds of millions of pages available, waiting to present information on an amazing variety of topics.

When you need to know about a particular subject, how do you know which pages to read? If you're like most people, you visit an Internet search engine.

Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks:

  • They search the Internet -- or select pieces of the Internet -- based on important words.
  • They keep an index of the words they find, and where they find them.
  • They allow users to look for words or combinations of words found in that index.

Early search engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand inquiries each day. Today, a top search engine will index hundreds of millions of pages, and respond to tens of millions of queries per day.

Before a search engine can tell you where a file or document is, it must be found. To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling. (There are some disadvantages to calling part of the Internet the World Wide Web -- a large set of arachnid-centric names for tools is one of them.) In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.

How does any spider start its travels over the Web? The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.
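The way a spider spreads out from a popular starting page can be sketched as a breadth-first walk over links. To keep the example self-contained it uses a made-up in-memory "web" instead of real HTTP requests; the function name `crawl`, the page names and the `max_pages` limit are all assumptions for illustration.

```python
from collections import deque


def crawl(seed_pages, get_links, max_pages=100):
    """Visit the seed pages first, then every page they link to,
    then the pages those link to, and so on (breadth-first)."""
    queue = deque(seed_pages)
    seen = set(seed_pages)
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)  # a real spider would index the page's words here
        for link in get_links(page):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical tiny web: page name -> pages it links to.
WEB = {
    "portal": ["news", "shop"],
    "news": ["portal", "blog"],
    "shop": ["blog"],
    "blog": [],
}
visited = crawl(["portal"], lambda page: WEB.get(page, []))
print(visited)  # ['portal', 'news', 'shop', 'blog']
```

Starting from a heavily linked page means the most widely used parts of the web are reached within the first few steps.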

Types of Search Engines

The term “search engine” is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They “crawl” or “spider” the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

Human-Powered Directories

A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

“Hybrid Search Engines” Or Mixed Results

In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over the other. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it also presents crawler-based results (as provided by Inktomi), especially for more obscure queries.