
Web Search Engines

Introduction

The World Wide Web ('web' for short) is an information system found on the Internet, and a search engine is a software system that searches information available on the web. The web is based upon a client-server model, where a user accesses a search engine (server) by using a client (browser). Web users have a variety of options available to query a search engine: 1) use a search box installed within the user interface of the browser; 2) visit the website of a search engine by inputting its URL into the address bar of the browser; 3) set the search engine as the homepage of the browser; 4) click on a browser bookmark for the search engine.

Search engines collect web information by using a piece of software named a crawler (also called a robot or spider), which follows the hyperlinks that link websites together in the form of a 'web'. Early search engines tended to index only the plain text of websites, and only this portion of a website could be searched, but modern search engines are more sophisticated and index far more web content (images, videos, etc.). The search results index of search engines was originally updated on a predetermined (monthly) basis (1998-2005), but modern search engines, like Bing and Google, can update their search results index 'on the fly' (hourly).
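The link-following behaviour described above can be illustrated with a minimal sketch. It assumes a tiny in-memory 'web' (the PAGES dictionary and the example.com-style URLs are hypothetical) rather than real network requests, and it is not how any particular search engine implements its crawler.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "web": URL -> HTML source of that page.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://c.example/">C</a>',
    "http://c.example/": '<a href="http://a.example/">A</a>',
    "http://d.example/": "No page links here, so the crawler never finds it.",
}


def crawl(seed, pages):
    """Breadth-first crawl from seed; returns URLs in the order visited."""
    visited = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # dead link: nothing to index or follow
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # avoid revisiting pages (the web has cycles)
                seen.add(link)
                queue.append(link)
    return visited


order = crawl("http://a.example/", PAGES)
print(order)
```

In this sketch the page at http://d.example/ is never reached, because no other page links to it.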

Users query a search engine by inputting one or more keywords into a search box located on the homepage of a search engine, and the search engine will then attempt to match the keywords with the most relevant information it has collected in its search results index (database). A search engine results page (SERP) is then returned for the query, which typically has ten results per page. Search engines use an algorithm to determine the accuracy and relevancy of the search results they serve in a SERP: this process does not involve human editors and is conducted in real time. Search engines typically find new content by following hyperlinks included on webpages that are already listed within their search results index, but they usually also include a submission tool where webmasters can submit new URLs or a sitemap for a website, for content that does not have a hyperlink from an established website.
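As a rough illustration of keyword matching against an index, the sketch below builds a small inverted index and returns one page of results. The documents, the scoring rule (a simple count of keyword occurrences) and the function names are assumptions made for the example; real search engines use far more elaborate ranking algorithms.

```python
from collections import Counter, defaultdict

# Hypothetical crawled documents: URL -> extracted plain text.
DOCS = {
    "http://a.example/": "web search engine history",
    "http://b.example/": "search engine crawler follows links search",
    "http://c.example/": "gopher search veronica",
}


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each word to a Counter of {url: occurrences of the word}."""
    index = defaultdict(Counter)
    for url, text in docs.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query, page=1, per_page=10):
    """Return one page of URLs that contain every keyword, best score first."""
    words = tokenize(query)
    if not words:
        return []
    # Keep only the URLs that contain all of the keywords.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))
    # Score = total keyword occurrences; ties are broken by URL.
    ranked = sorted(
        candidates,
        key=lambda url: (-sum(index[w][url] for w in words), url),
    )
    start = (page - 1) * per_page
    return ranked[start:start + per_page]


index = build_index(DOCS)
results = search(index, "search engine")
print(results)
```

The per_page default of 10 mirrors the typical ten results per SERP mentioned above; requesting page=2 would return the next ten matches, if any.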

Further reading: Basics of using a search engine and how search engines differ from directories.

History

When the World Wide Web was launched in 1991, Tim Berners-Lee (inventor) manually maintained a list of new web servers. As time passed, the number of new servers grew exponentially and it became impossible to manually edit a list of new servers. This is when it became apparent that an automated software system was needed to locate and search the information available at web servers. The creation of web search engines was probably influenced by another Internet information system named Gopher: in 1992 Steven Foster and Fred Barrie developed a search engine for Gopher named Veronica, and in 1993 Rhett Jones launched a Gopher search engine named Jughead. Veronica predated web search engines, and it would be logical to surmise that early web developers created their search engines due to the success of Veronica.

One of the earliest software applications used to 'crawl' the World Wide Web was the World Wide Web Wanderer: simply referred to as "the Wanderer", it was designed to survey the size of the web. The first web search engines were launched in 1993, and included: ALIWEB, JumpStation and W3Catalog. These early search engines, and their crawlers, were basic in their scope: typically indexing the title and metadata of webpages. In 1994, WebCrawler was the pinnacle of search engine technology: WebCrawler was the first crawler that was able to crawl and index every word available on a webpage. The robots exclusion standard, initiated by Martijn Koster, led to the creation of robots.txt files, which enable websites to limit the access that web crawlers have to their content.
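To show how a robots.txt file limits crawler access, here is a short sketch using Python's standard urllib.robotparser module. The robots.txt rules, the crawler names (ExampleBot, BadBot) and the URLs are invented for illustration; the rules are parsed from a string so that no network request is needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is kept out of /private/,
# and a crawler calling itself "BadBot" is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching each URL.
public_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
private_ok = parser.can_fetch("ExampleBot", "http://www.example.com/private/data.html")
badbot_ok = parser.can_fetch("BadBot", "http://www.example.com/index.html")

print(public_ok, private_ok, badbot_ok)
```

Note that the standard is advisory: it only works when a crawler chooses to check the rules before fetching a page.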

WebCrawler was soon rivaled by the crawlers developed by Lycos, Yahoo!, SAPO, Yandex, Hotbot, Ask Jeeves and Dogpile; companies that have managed to continue to exert a presence on the web. Some early search engines whose prominence was short-lived, and which are now inactive, are: Magellan, Teoma, AlltheWeb, Inktomi, Infoseek, Northern Light, and AltaVista. The problem that hampered early search engines was: how to make money? An early solution to this issue was 'paid' inclusion - which promised a higher ranking or excluded free listings altogether - but the problem with paid inclusion was its negative effect on the user experience, with commercial rather than educational resources dominating the search results. Search engines that switched from being free engines to paid engines soon found their popularity in jeopardy. The answer lay in a new search engine named Google: Google's 'organic' search results were free for inclusion - thus satisfying users and web developers - but Google also included paid listings alongside its organic results to enable the company to generate revenue from its service.

Google was one of the first search engines to focus solely on its search results, with a minimalist and fast-loading homepage that only contained a search box and a logo. Competitors, like AltaVista and Yahoo!, had homepages crammed full of links to various services, and were bloated and slow to load. When Google was launched, the majority of worldwide users were using a dialup connection: therefore, a fast-loading homepage gave Google a performance edge and a unique selling point (due to AltaVista's decision to become a web portal). Google has continued, to the present day (2017), to be the most popular search engine used on the World Wide Web: simply because, for most users, it provides the most relevant results. This has historically been due to Google having the largest index of information and a superior algorithm (mathematical programming system used to determine which web pages are displayed in search results).

Google has dominated the online search market since its launch in 1998, but it has not been without competitors, most notably Yahoo! and Microsoft (Bing). Since 2004, Yahoo! and Microsoft have launched a number of new search engines, none of which were successful enough to 'best' Google, but which were good enough to remain relevant and competitive. Rather than invent a new paradigm, Yahoo! and Microsoft attempted to beat Google at its own game, with a search results page that was virtually identical to Google's. A problem all search engines have had is indexing all the new content that is being created, and making it available to users: there is only a set number of positions available for any given search term, and when billions of additional webpages are being created for any given subject, it is difficult for search engines to satisfy webmasters' need to be visible in organic search results. The crawlers that modern search engines use are called: Googlebot, Bingbot, WebCrawler, ExaBot, Yahoo! Slurp, AskJeeves, Baidu Spider, Yandex Bot, Scooter, Mercator, Facebook External Hit, Atomz, ArchitectSpider, and Lycos_Spider_T-Rex.

So which search engine is currently (2017) the most popular? A range of companies analyse and measure digital data; while the figures they collate may differ, they show a similar trend when it comes to search queries. ComScore is considered a global leader in measuring search engines and the digital world. They released the following data in 2012: 175.9 billion total search queries; Google: 114.7 billion search queries; Baidu: 14.5 billion search queries; Yahoo!: 8.6 billion search queries; Yandex: 4.8 billion search queries; MSN (Bing): 4.5 billion search queries. ComScore's data, from 2007-2016, has shown that the number of yearly search queries has grown year-on-year; while Google's total search queries have grown each year, Yahoo!'s have stagnated. The other notable trend is the success of Baidu and Yandex: Chinese and Russian language search engines that dominate their domestic markets. While eBay, Time Warner Network, Alibaba.com Corporation and the Ask.com Network make it into the top 10 worldwide search properties, the dominant search engine continues to be Google, followed by: Baidu, Yahoo!, Yandex and Bing.
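The 2012 figures quoted above can be converted into approximate shares of the total with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and the 'other' figure is simply whatever remains of the total after the five named engines are subtracted.

```python
# Search queries in billions, as quoted from ComScore's 2012 data.
TOTAL = 175.9
QUERIES = {
    "Google": 114.7,
    "Baidu": 14.5,
    "Yahoo!": 8.6,
    "Yandex": 4.8,
    "MSN (Bing)": 4.5,
}

# Share of all queries handled by each engine, as a percentage.
shares = {name: count / TOTAL * 100 for name, count in QUERIES.items()}

# Queries not attributed to the five named engines.
other = TOTAL - sum(QUERIES.values())
other_share = other / TOTAL * 100

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"Other: {other_share:.1f}%")
```

On these figures Google handled roughly 65% of the queries counted, which is more than the other four named engines combined.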

Search Engines: Timeline

The following list includes search engines that provide/provided traditional search results, 'pay per click' search results and 'meta' search results.

1993

1994

1995

1996

1997

1998

1999

2000

2004

2001