How Google Works
What Exactly Is Google?
A distributed network of thousands of low-cost computers
Parallel processing
Googlebot: a web crawler that finds and fetches web pages
The indexer
The query processor
Googlebot
Google's web crawler
Finds and retrieves pages on the web and hands them
off to the Google indexer
Googlebot consists of many computers requesting and
fetching pages much more quickly than you can with
your web browser: thousands of pages in a single
second
It functions much like your web browser: it sends a
request to a web server for a web page, downloads
the entire page, then hands it off to Google's indexer.
How Does It Find Pages?
Through an add URL form:
www.google.com/addurl.html
Through links: when Googlebot fetches a page, it culls all the links
appearing on the page and adds them to a queue for
subsequent crawling.
This technique is known as deep crawling.
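
The link-culling step can be sketched roughly as follows. This is a minimal illustration, not Googlebot's actual code: the class and function names, the sample HTML, and the example.com URLs are all assumptions made for the example, and the page is supplied as a string rather than downloaded.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def enqueue_links(page_url, html, queue, seen):
    """Cull the links from one fetched page and queue the ones not seen before."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    for link in parser.links:
        if link not in seen:
            seen.add(link)
            queue.append(link)


# Hypothetical page content; a real crawler would download it over HTTP.
html = (
    '<a href="/about.html">About</a> '
    '<a href="http://example.org/">Other</a> '
    '<a href="/about.html">Again</a>'
)
queue, seen = deque(), {"http://example.com/"}
enqueue_links("http://example.com/", html, queue, seen)
print(list(queue))
```

Each newly queued URL would later be fetched and passed through the same function, which is how the crawl works its way deeper into a site.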
Challenges
Because Googlebot sends out simultaneous requests for
thousands of pages, the queue of "visit soon" URLs must
be constantly examined and compared with URLs
already in Google's index.
Googlebot must determine how often to revisit a page.
Google continuously recrawls popular, frequently
changing web pages at a rate roughly proportional to how
often the pages change. Such crawls keep the index
current and are known as fresh crawls.
Example: newspaper pages
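
Both challenges can be illustrated with a small sketch. The function names, the one-hour and 30-day bounds, and the "changes per day" measure are assumptions chosen for the example; the slides do not say how Google actually schedules recrawls.

```python
def filter_frontier(visit_soon, indexed_urls):
    """Drop queued URLs that are duplicates or are already in the index."""
    fresh, seen = [], set(indexed_urls)
    for url in visit_soon:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def revisit_interval_hours(changes_per_day, min_hours=1.0, max_hours=720.0):
    """Recrawl rate proportional to change rate, so the interval is its inverse.

    The bounds are hypothetical: never more often than hourly, and at
    least once every 30 days even for pages that never seem to change.
    """
    if changes_per_day <= 0:
        return max_hours
    return max(min_hours, min(max_hours, 24.0 / changes_per_day))


print(filter_frontier(["a", "b", "a", "c"], {"b"}))
print(revisit_interval_hours(24))    # a page that changes hourly, e.g. a newspaper front page
print(revisit_interval_hours(0.5))   # a page that changes every other day
```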
Google's Indexer
The index is sorted alphabetically by search term, with
each index entry storing a list of documents in
which the term appears and the location within the
text where it occurs.
To improve search performance, Google ignores
(doesn't index) common words called stop words
(such as the, is, on, or, of, how, why), as well as
certain single digits and single letters.
The indexer also ignores some punctuation and
multiple spaces, and converts all letters to
lowercase, to improve Google's performance.
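
An index of this kind is usually called an inverted index. The sketch below is a simplified illustration under stated assumptions: the stop list contains only the examples named above, every single letter and digit is skipped (the slides say only "certain" ones), and the function names and sample documents are invented.

```python
import re
from collections import defaultdict

# Illustrative stop list taken from the examples above; the real list is not given.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def tokenize(text):
    """Lowercase the text, drop punctuation and extra spaces, split into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [word positions]}, skipping stop words."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            # Simplification: skip stop words and all single letters or digits.
            if word in STOP_WORDS or len(word) == 1:
                continue
            index[word].setdefault(doc_id, []).append(position)
    return dict(sorted(index.items()))  # sorted alphabetically by term


docs = {1: "How the  Web Crawler works!", 2: "The crawler of Google"}
index = build_index(docs)
for term, postings in index.items():
    print(term, postings)
```

Positions are counted before stop words are dropped, so the stored locations still match each word's place in the original text.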
Google's Query Processor
Has several parts, including the user interface (search box), the
"engine" that evaluates queries and matches them to
relevant documents, and the results formatter.
Google gives higher priority to pages that have the search terms
near each other and in the same order as the query.
Because Google indexes HTML code in addition to the text on the
page, users can restrict searches on the basis of where
query words appear, e.g., in the title, in the URL, in the
body, or in links to the page. These options are offered by the
advanced-search page and search operators.
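
The preference for terms that are close together and in query order can be illustrated with a toy scoring function. This is not Google's ranking formula, which the slides do not describe; the function name, the span-based score, and the doubling bonus for matching order are all assumptions.

```python
def proximity_score(query_terms, doc_tokens):
    """Score is higher when query terms occur close together and in query order."""
    positions = []
    for term in query_terms:
        if term not in doc_tokens:
            return 0.0  # in this sketch, every term must appear
        positions.append(doc_tokens.index(term))  # first occurrence only
    span = max(positions) - min(positions) + 1
    score = len(query_terms) / span  # 1.0 when the terms are adjacent
    if positions == sorted(positions):
        score *= 2  # hypothetical bonus for matching the query order
    return score


query = ["web", "crawler"]
print(proximity_score(query, "a web crawler fetches pages".split()))
print(proximity_score(query, "crawler for the web".split()))
```

The first document scores higher because the two terms are adjacent and appear in the same order as the query.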
1. The web server sends the query to
the index servers.
2. The content inside the index
servers is similar to the index
in the back of a book: it tells
which pages contain the
words that match any
particular query term.
3. The query travels to the doc
servers, which actually
retrieve the stored documents.
4. Snippets are generated to
describe each search result.
5. The search results are
returned to the user in a
fraction of a second.
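
The five steps above can be sketched as a single function. Here two in-memory dictionaries stand in for the index servers and doc servers; the names INDEX, DOCS, lookup, make_snippet, and search, and the sample documents, are all hypothetical.

```python
# term -> ids of documents containing it (stands in for the index servers)
INDEX = {"google": {1, 2}, "crawler": {2, 3}}

# document id -> stored text (stands in for the doc servers)
DOCS = {
    1: "Google is a search engine.",
    2: "The Google crawler fetches pages.",
    3: "A crawler follows links.",
}


def lookup(terms):
    """Step 2: find the documents that contain every query term."""
    postings = [INDEX.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


def make_snippet(text, terms, width=30):
    """Step 4: cut a short excerpt starting at the first matching term."""
    lowered = text.lower()
    hits = [lowered.find(term) for term in terms if term in lowered]
    start = min(hits) if hits else 0
    return text[start:start + width]


def search(query):
    terms = query.lower().split()      # step 1: the query arrives
    results = []
    for doc_id in lookup(terms):       # step 2: index lookup
        text = DOCS[doc_id]            # step 3: retrieve the stored document
        results.append((doc_id, make_snippet(text, terms)))  # step 4
    return results                     # step 5: return results to the user


print(search("Google crawler"))
```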

Dipika arora ppts
