These days, search engines are an essential part of our daily lives.
We use them all the time, through browsers or apps on smartphones, tablets, laptops, and desktops.
A search engine is a software system designed to search for information on the web, such as URLs, documents, files, and multimedia. The search results are displayed on result pages.
Archie (closed) was created in 1990 by Peter Deutsch, Alan Emtage, and Bill Heelan and was the first search engine in the world. It hosted an index of downloadable directory listings. Because of limited space, only the listings were available, not the contents of each site.
ALIWEB (closed) was created in 1993 by Martijn Koster. It allowed people to submit pages along with a description. However, people at that time did not know how to submit their websites.
YAHOO! SEARCH was created by David Filo and Jerry Yang in 1994.
Google was created in 1998 by Larry Page and Sergey Brin. They began working on it in 1996, when it was called BackRub.
Types:
Crawler-based: consists of three parts:
Spider (a.k.a. crawler): Visits a webpage, reads its content, and follows links to other pages within the site (the website/webpage is then said to be “spidered”). The spider returns to the site on a regular basis to look for changes.
Index: The web pages/sites the spider finds are stored in an index. The index contains a copy of every web page that the spider finds. If a web page changes, the index is updated with the new version. Sometimes it takes time for new pages that the spider finds to be added to the index, so a web page may have been spidered but not yet indexed (added to the index). Until it is indexed, it is not available for search.
Software: Algorithms that filter the millions of pages recorded in the index to find matches to a search query and rank them in order of what they consider most relevant.
e.g. Google, Bing
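The three parts above can be sketched in a few lines of Python. This is only a toy illustration, not how Google or Bing actually work: the WEB dictionary is a made-up in-memory stand-in for real websites, and the names crawl, build_index, and search are hypothetical. Ranking here simply counts how many query words a page contains.

```python
from collections import defaultdict
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "site/home": ("welcome to the search engine tutorial", ["site/spider", "site/index"]),
    "site/spider": ("a spider visits pages and follows links", ["site/home"]),
    "site/index": ("the index stores a copy of every page the spider finds", []),
}

def crawl(start):
    """Spider: visit a page, read it, and follow its links to other pages."""
    seen, queue, pages = set(), [start], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in WEB:
            continue
        seen.add(url)
        text, links = WEB[url]
        pages[url] = text          # keep a copy of the page
        queue.extend(links)        # follow links later
    return pages

def build_index(pages):
    """Index: map each word to the set of URLs whose stored copy contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Software: rank pages by how many of the query words they contain."""
    scores = defaultdict(int)
    for word in re.findall(r"\w+", query.lower()):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

pages = crawl("site/home")
index = build_index(pages)
print(search(index, "spider index"))
```

A page that the spider has stored in pages but that has not yet been passed through build_index cannot be returned by search, which mirrors the "spidered but not indexed" situation described above.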
Human-Powered: depends on humans for its listings. You submit a short description of your entire site, or editors write one for the sites they review. A search looks for matches only in the submitted descriptions.
e.g. YAHOO! SEARCH
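To contrast with the crawler-based approach, here is a minimal sketch of a human-powered search. The DIRECTORY entries and the directory_search function are invented for illustration; the point is only that the query is matched against the short descriptions, never against the full text of the sites.

```python
# Hypothetical directory listings: site -> short description written by
# the site owner or by an editor.
DIRECTORY = {
    "example-recipes.test": "Vegetarian recipes and cooking tips",
    "example-travel.test": "Budget travel guides for Europe",
}

def directory_search(query):
    """Return the sites whose description contains every query word."""
    words = query.lower().split()
    return sorted(
        site
        for site, description in DIRECTORY.items()
        if all(word in description.lower().split() for word in words)
    )

print(directory_search("travel guides"))
print(directory_search("hotel"))
```

The second query finds nothing even if the travel site itself mentions hotels on many pages, because only the description is searched.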
Advantages of crawler-based search engines
Offer larger searchable databases of websites.
The full text of individual web pages is often searchable.
Good for searching ambiguous terms or phrases
Disadvantages of crawler-based search engines
No human quality control to remove duplicates and junk
The size of the database can produce high numbers of search results
Advantages of human-powered search engines/directories
Users can browse when they are not sure what they are looking for
Helpful when the user is unsure which keywords to use to find information
Disadvantages of human-powered search engines/directories
Directories are smaller than search engine databases, and only index top-level pages of a site.
The content of a site or page can change without the directory being updated.
Spam: Some web admins try to manipulate their placement in the rankings of various search engines. As search engine techniques have developed, new spam techniques have developed in response. Search engines do not publish their anti-spam techniques, to avoid helping spammers evade them.
Cloaking: involves serving different content to a search engine crawler/spider than to other users.
As a result, the search engine is misled about the content of the page and ranks it in ways that look random to humans.
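Cloaking can be illustrated with a small sketch. Everything here is hypothetical: cloaked_page stands in for a web server that inspects the User-Agent string, the agent names are made up, and looks_cloaked is a naive check that compares the two views. Real anti-spam techniques are not published, so this is not a description of what any search engine does.

```python
def cloaked_page(user_agent):
    """Hypothetical cloaking handler: crawlers and people get different content."""
    if "bot" in user_agent.lower():
        # Keyword-stuffed text shown only to the crawler.
        return "cheap flights hotels deals best prices"
    # What a human visitor actually sees.
    return "Welcome! Click here to install our toolbar."

def honest_page(user_agent):
    """A page that serves the same content to everyone."""
    return "Same content for everyone."

def looks_cloaked(fetch, crawler_agent="ExampleBot/1.0", browser_agent="ExampleBrowser/1.0"):
    """Naive check: flag a page when the crawler view and the browser view differ."""
    return fetch(crawler_agent) != fetch(browser_agent)

print(looks_cloaked(cloaked_page))
print(looks_cloaked(honest_page))
```

A plain comparison like this would also flag legitimate pages whose content varies between requests, so it should be read as an illustration of the idea only.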
Content Quality: The web is full of text that, intentionally or not, misleads readers. While there has been plenty of research on determining the relevance of documents, the issue of document quality or accuracy has not received much attention.
Proximity Searches: Search results are tailored to the device’s location; this is most common on mobile devices.
Forcing Mobile-Friendly Sites: Search engine giants (especially Google) are forcing websites to have a mobile-friendly version. Websites that do not comply are ranked lower in search results.
Focus on keywords and search rankings is decreasing, because the search user experience is significantly affected by the devices people use: tablets, PCs, smartphones, etc.
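The proximity searches mentioned above can be sketched as sorting results by their distance from the device. This is a minimal example under stated assumptions: the RESULTS list and the cafe names are invented, the device coordinates are assumed to be known, and distance is computed with the haversine formula on a spherical Earth of radius 6371 km. A real engine would combine distance with relevance rather than sort on distance alone.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Hypothetical search results: (name, latitude, longitude).
RESULTS = [
    ("Cafe A", 51.5074, -0.1278),  # London
    ("Cafe B", 48.8566, 2.3522),   # Paris
    ("Cafe C", 52.5200, 13.4050),  # Berlin
]

def nearest_first(results, device_lat, device_lon):
    """Order results by distance from the device, closest first."""
    return sorted(
        results,
        key=lambda r: haversine_km(device_lat, device_lon, r[1], r[2]),
    )

# Device assumed to be in Brussels.
ordered = nearest_first(RESULTS, 50.8503, 4.3517)
print([name for name, _lat, _lon in ordered])
```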