Search Engines
and Online Research

   February 27, 2008
   IST 523
   Denise A. Garofalo
What are search engines?
 designed to make surfing the web
 simple, fast and rewarding for
 Internet users
 designed to search out Web pages
 one at a time and collect the
 results
What do search engines do?
 gather together information
 store it in a database
 allow access to a list of individual
 pages based on:
    a word, or,
    set of words that you submit in the
   form of a query
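The gather, store, and look-up steps above can be sketched as a tiny inverted index. This is a minimal illustration, not how any particular search engine is built. The page addresses and text are made up, `build_index` and `search` are hypothetical helper names, and the sketch assumes that every word in the query must appear on a page.

```python
from collections import defaultdict

# Hypothetical pages standing in for text gathered from the web.
PAGES = {
    "example.org/cats": "cats are small furry pets",
    "example.org/dogs": "dogs are loyal pets",
    "example.org/fish": "fish live in water",
}

def build_index(pages):
    """Map each word to the set of pages whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    # .get avoids adding empty entries to the index for unknown words.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

INDEX = build_index(PAGES)
print(search(INDEX, "pets"))        # ['example.org/cats', 'example.org/dogs']
print(search(INDEX, "loyal pets"))  # ['example.org/dogs']
```

The index is built once and stored, so each query only looks up words instead of rereading every page.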
How do search engines work?
 they send out computer programs
 known as “spiders” or “robots” to
 search the web
 interested in reading and storing
 the actual text that is shown on a
 web page, not graphics, etc.
How, continued….
 spider begins by visiting a single Web
 page
   it saves the text that it finds there
   after it has collected the information on that
   page, it looks for a link that will take it to
   another page
   when it reaches the next page, it starts the
   process all over again
 by following these steps over and over
 again, search engines are able to find
 and index far more web pages than a
 human being
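The visit, save, and follow-a-link loop described above can be sketched as follows. This is a simplified illustration that runs on a made-up in-memory "web" rather than the real network. The page names are hypothetical, and visiting pages breadth-first and following every link on a page are assumptions made for the example.

```python
from collections import deque

# Hypothetical web: each page has some text and links to other pages.
WEB = {
    "a.example": {"text": "home page", "links": ["b.example", "c.example"]},
    "b.example": {"text": "about us", "links": ["c.example"]},
    "c.example": {"text": "contact", "links": ["a.example"]},
}

def crawl(web, start):
    """Visit pages breadth-first from start, saving the text of each one."""
    saved = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in saved or url not in web:
            continue  # already visited, or a link to a page we cannot fetch
        page = web[url]
        saved[url] = page["text"]    # save the text found on the page
        queue.extend(page["links"])  # then follow its links to other pages
    return saved

SAVED = crawl(WEB, "a.example")
print(list(SAVED))  # ['a.example', 'b.example', 'c.example']
```

Keeping track of pages already saved stops the spider from looping forever when pages link back to each other.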
More how…..
 search engines set up spiders to begin
 their searches at web sites known as
 directories
   large web sites that contain lists of links
   that have been collected by human beings
 no way for spiders to find every page
 listed on the World Wide Web
   millions of web pages do not have any links
   to them from other sites
   without these links, spiders can’t find and
   index those pages
How do search engines show
the results?
 Sites are ranked based on the textual
 content of a web page
 A special set of criteria, or algorithm, is
 used to decide which pages to display
 Algorithms consider things like the title
 of the page, the text of the page, how
 many other web sites link to the page,
 and even what text web sites that link
 to a page use to describe it
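A toy scoring function can make the idea of a ranking algorithm concrete. The sketch below is not any real search engine's algorithm. The weights are arbitrary, the pages are made up, and the number of inbound links is assumed to equal the number of link texts ("anchors") recorded for a page.

```python
def score(page, query_word):
    """Toy relevance score; the weights are arbitrary and only illustrative."""
    word = query_word.lower()
    title_hits = page["title"].lower().split().count(word)
    text_hits = page["text"].lower().split().count(word)
    # Count the linking sites whose link text mentions the word.
    anchor_hits = sum(word in anchor.lower().split() for anchor in page["anchors"])
    inbound_links = len(page["anchors"])
    return 3 * title_hits + 1 * text_hits + 2 * anchor_hits + 0.5 * inbound_links

# Hypothetical pages; "anchors" holds the text other sites use to link here.
PAGES = [
    {"url": "b.example", "title": "Campus map",
     "text": "the library is next to the gym",
     "anchors": ["campus map"]},
    {"url": "a.example", "title": "Library hours",
     "text": "the library opens at nine",
     "anchors": ["library hours", "opening times"]},
]

ranked = sorted(PAGES, key=lambda p: score(p, "library"), reverse=True)
RANKED_URLS = [p["url"] for p in ranked]
print(RANKED_URLS)  # ['a.example', 'b.example']
```

Both pages mention the word in their text, but the page that also has it in its title and in the text of an inbound link scores higher.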
Search engines--review
 a series of computer programs that find
 and save files at a very fast rate
 when combined with algorithms
 designed to sort content based on text
 queries search engines become a useful
 tool to find a little bit of information in
 that vast collection of files known as the
 World Wide Web
Which search engine is best?
  Need to understand how each search
  engine works
  Check out the Bruce Clay, Inc. search
  engine relationship chart:
  http://www.bruceclay.com/searchenginerelationshipchart.htm
Invisible Web (or Deep Web)
 Some pages and links are excluded
 from most search engines by policy
 Others are excluded because search
 engine spiders cannot access them.
 Pages that are excluded are referred to
 as the “Invisible Web” (or “Deep Web”)
   you don't see these pages in search engine
   results
   estimated to be two to three or more times
   larger than the “visible web”
Why invisible pages?
 If a search engine doesn’t locate a
 Web page it’s because:
   Technical barriers prohibit access
   Choices or decisions made by the
   search engine (policy) exclude the page
Technical barriers
 Typing or judgment is required
   Searchable specialized databases
   Logins and/or passwords required
Policy issues
 Page format
 Non-HTML pages
 Script-based programs (those URLs
 with a “?”)
Research issues
 Different search tools give
 different results
 Failure to retrieve does not mean
 that there is nothing available
 Develop a search strategy
 Learn the search engine’s search
 tips
 Evaluation
A selection of engines
 •Google           www.google.com
 •Vivisimo         www.vivisimo.com
 •Ask.com          www.ask.com
 •Yahoo!           www.yahoo.com
 •Open Directory   www.dmoz.org
 •Ixquick          www.ixquick.com/
 •Mamma            www.mamma.com/
 •Gigablast        www.gigablast.com/
 •Search.com       www.search.com/
Failure to retrieve
 crawling Web pages and locating sites for
 search engines is based on using links from
 one page to reach other pages to crawl
    documents with few links tend to be overlooked
   if pages are never discovered, they are not
   available to researchers
 Failure to retrieve can also be linked to the
 search query used, or search strategy
Search strategy
 three main considerations in the search
 process
   Relevance
   Precision
   Recall
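Precision and recall can be computed once you know which retrieved documents are relevant. The sketch below uses the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document names are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of results and the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: four results returned, three relevant documents exist.
RETRIEVED = ["doc1", "doc2", "doc3", "doc4"]
RELEVANT = ["doc2", "doc4", "doc5"]

P, R = precision_recall(RETRIEVED, RELEVANT)
print(P)            # 0.5
print(round(R, 2))  # 0.67
```

Here half of the results are relevant (precision), and two of the three relevant documents were found (recall). A narrower query tends to raise precision and lower recall; a broader one tends to do the opposite.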
Successful search strategy
 ability to create an exact match
 between search statement and
 documents sought
 size and content of the search engine
 selected
 search engine’s search tools
Process
 involves consultation of definition tools
   subject dictionaries
   thesauri, etc.
 subject familiarization
   e.g. if searching on medical topics, become
   familiar with basic terminology
   same goes for research in any other
   subject area
Formulating a strategy
 be logical
 spend time on search term selection and
 combining to reduce the time spent
 eliminating irrelevant search results
 search engines are good for searching on
 unusual or unique keywords, and for
 combining keywords
 be creative and flexible
 look for subtle connections
 be prepared to make intuitive leaps
Simplified search strategy
 Formulation of the research question and its
 scope
 Identification of concepts within the question
 Identification of search terms to describe
 those concepts
 Consideration of synonyms and variations of
 those terms
 Preparation of the search logic
 Readiness to revise and redo a search
Boolean logic
 describes certain logical operations that
 are used to combine search terms
 basic Boolean operators are AND, OR
 and NOT
AND
limits results to those items that contain
both, or all, of the search terms in the
query
a search query with the AND operator will
retrieve only those items containing
both, or all, of the search terms
OR
helpful in the first phases of a search
  especially if the searcher is unsure of what
  information is available on the topic or
  what words are used to categorize it
when used between two words, it
instructs the search tools to retrieve any
record containing either of the words
NOT
The third of the most common Boolean
operators
used to eliminate records containing a
particular word or combination of words
from the search results
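The three operators behave like set operations on the records that contain each term. A minimal sketch, assuming a made-up index in which each term maps to the numbers of the records containing it:

```python
# Hypothetical index: each search term maps to the records that contain it.
TERMS = {
    "cats": {1, 2, 3},
    "dogs": {2, 3, 4},
}

both = TERMS["cats"] & TERMS["dogs"]     # cats AND dogs: records with both terms
either = TERMS["cats"] | TERMS["dogs"]   # cats OR dogs: records with either term
without = TERMS["cats"] - TERMS["dogs"]  # cats NOT dogs: cats records minus dogs

print(sorted(both))     # [2, 3]
print(sorted(either))   # [1, 2, 3, 4]
print(sorted(without))  # [1]
```

AND narrows the results, OR widens them, and NOT removes records from them.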
Search engine search tips
 Check the Help files of a search engine
 Some search engines allow you to apply date
 restrictions to a search
 Word order in natural language searching can
 greatly influence the search
   A question phrased in different ways can
   produce different results
   An added influence is the weight some search
   engines place on words located earlier in the
   search query
+ sign
 ensures that a search engine finds
 pages that have all the words you
 enter, not just some of them
- sign
 a search engine will find pages
 that have one word on them but
 not another word
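The + and - signs can be sketched as a simple filter on a page's words. This is an illustration only: `matches` is a hypothetical helper, real search engines differ in how they treat these signs, and words without a sign are treated here as optional.

```python
def matches(text, query):
    """Check a page's text against a query using + (required) and - (excluded)."""
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("+") and term[1:] not in words:
            return False  # a required word is missing
        if term.startswith("-") and term[1:] in words:
            return False  # an excluded word is present
    return True

print(matches("python programming tutorial", "+python +tutorial"))  # True
print(matches("python programming tutorial", "+python -snake"))     # True
print(matches("python snake care", "+python -snake"))               # False
```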
Phrase searching
 ensures that terms appear in the order
 they are entered
 placing the phrase within quotation
 marks tells the search engine to retrieve
 pages where the terms appear exactly
 in the order specified
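Phrase matching can be sketched as looking for the words of the phrase side by side and in the same order. `contains_phrase` is a hypothetical helper, and the phrase is passed to it without the quotation marks.

```python
def contains_phrase(text, phrase):
    """True if the words of phrase appear in text consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the text and compare each window.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

print(contains_phrase("the new york public library", "new york"))  # True
print(contains_phrase("york is new to me", "new york"))            # False
```

The second page contains both words, so an AND search would retrieve it, but the phrase search does not because the words are not adjacent and in order.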
Web page evaluation
   Before you leave the list of search
  results -- before you click and get
  interested in anything written on the
  page -- glean all you can from the
  URLs of each page.
  choose pages most likely to be
  reliable and authentic
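Some of the clues in a URL can be pulled apart with Python's standard `urllib.parse` module. The sketch below only separates the pieces; deciding what they say about reliability is still a judgment call. `url_clues` is a hypothetical helper, and the second URL is made up.

```python
from urllib.parse import urlparse

def url_clues(url):
    """Split a URL into parts worth a glance before clicking on a result."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return {
        "host": host,
        # The last label of the host, such as edu, gov, org, or com.
        "top_level_domain": host.rsplit(".", 1)[-1] if "." in host else "",
        "path": parts.path,
        # A "?" followed by parameters suggests a script-generated page.
        "has_query": bool(parts.query),
    }

CLUES = url_clues(
    "http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html"
)
print(CLUES["host"])              # www.lib.berkeley.edu
print(CLUES["top_level_domain"])  # edu
print(url_clues("http://example.com/search?q=cats")["has_query"])  # True
```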
Main evaluation points
 Accuracy
 Authority
 Objectivity
 Currency
 Coverage
Terminology
 Concept search
 Full-text index
 Fuzzy search
 Index
 Keyword search
 Precision
Terminology, cont.
 Proximity search
 Query-by-example
 Recall
 Relevancy
 Stemming
 Stop words
 Thesaurus
Resources and sources
 Final tip—links on Web pages may lead
 to other relevant sites, but be careful of
 going off on tangents
Resources
 20 great Google secrets
 http://www.pcmag.com/article2/0,4149,1306756,00.asp
 WWW Virtual Library
 http://vlib.org
 SearchEngineShowdown
  http://www.searchengineshowdown.com/
Resources
  Best Search Tools Chart
  http://www.infopeople.org/search/chart.html
  Searching the Internet Effectively
  http://www2.vuw.ac.nz/staff/alastair_smith/searching/
  Finding Images Online
  http://www.tasi.ac.uk/resources/searchingresources.html
  FindSounds
  http://www.findsounds.com/
Resources
  SwitchBoard
  http://www.switchboard.com/
  AnyWho
  http://www.anywho.com/
  Yahoo! PeopleSearch
  http://people.yahoo.com/
  Web 2.0
  http://www.go2web20.net/
  Library 2.0
  http://instructionwiki.org/Library_2.0_in_15_minutes_a_day
Overall search engine info
  Best General Search Engines
  http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html
Questions?

MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 

Session5

  • 1. Search Engines and Online Research February 27, 2008 IST 523 Denise A. Garofalo
  • 2. What are search engines? designed to make surfing the web simple, fast and rewarding for Internet users designed to search out Web pages one at a time and collect the results
  • 3. What do search engines do? gather together information store it in a database allow access to a list of individual pages based on: a word, or, set of words that you submit in the form of a query
  • 4. How do search engines work? they send out computer programs known as “spiders” or “robots” to search the web interested in reading and storing the actual text that is shown on a web page, not graphics, etc.
  • 5. How, continued…. spider begins by visiting a single Web page it saves the text that it finds there after it has collected the information on that page, it looks for a link that will take it to another page when it reaches the next page, it starts the process all over again by following these steps over and over again, search engines are able to find and index far more web pages than a human being
  • 6. More how….. search engines set up spiders to begin their searches at web sites known as directories large web sites that contain lists of links that have been collected by human beings no way for spiders to find every page listed on the World Wide Web millions of web pages do not have any links to them from other sites without these links, spiders can’t find and index those pages
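The spider process described in slides 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine is built: the `TOY_WEB` dictionary and its `.example` addresses are made up and stand in for real pages, and the `crawl` function is a hypothetical name. It starts at a directory page, saves the text it finds, follows links, and never reaches the page that nothing links to.

```python
from collections import deque

# Hypothetical toy "web": address -> (page text, links on that page).
# These are made-up pages, not real sites.
TOY_WEB = {
    "directory.example": ("list of links", ["a.example", "b.example"]),
    "a.example": ("page about spiders", ["b.example"]),
    "b.example": ("page about robots", ["a.example"]),
    "orphan.example": ("page nobody links to", []),
}

def crawl(start):
    """Follow links outward from `start`; return {address: saved text}."""
    saved = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in saved or url not in TOY_WEB:
            continue  # already visited, or the link leads nowhere
        text, links = TOY_WEB[url]
        saved[url] = text      # save the text found on the page
        queue.extend(links)    # then follow its links to other pages
    return saved

pages = crawl("directory.example")
print(sorted(pages))
```

Because no page links to `orphan.example`, the spider never finds it, which is the situation slide 6 describes.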
  • 7. How do search engines show the results? Sites are ranked based on the textual content of a web page A special set of criteria, or algorithm, is used to decide which pages to display Algorithms consider things like the title of the page, the text of the page, how many other web sites link to the page, and even what text web sites that link to a page use to describe it
  • 8. Search engines--review a series of computer programs that find and save files at a very fast rate when combined with algorithms designed to sort content based on text queries search engines become a useful tool to find a little bit of information in that vast collection of files known as the World Wide Web
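As a rough illustration of how an algorithm might weigh the title, the text, and the number of incoming links, here is a small Python sketch. Everything in it is an assumption made for the example: the `PAGES` data, the `score` and `search` functions, and especially the weights (3 for a title match, 0.1 per link). Real search engines use their own, far more involved criteria.

```python
# Hypothetical pages: title, body text, and number of sites linking in.
PAGES = {
    "a.example": {"title": "Web spiders", "text": "spiders crawl the web", "inlinks": 5},
    "b.example": {"title": "Garden pests", "text": "spiders and other insects", "inlinks": 1},
    "c.example": {"title": "Cooking", "text": "recipes for dinner", "inlinks": 9},
}

def score(page, query_words):
    """Toy score: a title match counts 3, a body match 1, each link 0.1."""
    title = page["title"].lower().split()
    text = page["text"].lower().split()
    matches = sum(3 * title.count(w) + text.count(w) for w in query_words)
    if matches == 0:
        return 0.0  # pages without any query word are not listed
    return matches + 0.1 * page["inlinks"]

def search(query):
    """Return addresses of matching pages, best score first."""
    words = query.lower().split()
    scored = [(score(page, words), url) for url, page in PAGES.items()]
    return [url for s, url in sorted(scored, reverse=True) if s > 0]

results = search("spiders")
print(results)
```

The heavily linked cooking page is left out because it never mentions the query word; links only help pages that already match.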
  • 9. Which search engine is best? Need to understand how each search engine works Check out the Bruce Clay, Inc. search engine relationship chart: http://www.bruceclay.com/searchenginerelationshipchart.htm
  • 10. Invisible Web (or Deep Web) Some pages and links are excluded from most search engines by policy Others are excluded because search engine spiders cannot access them. Pages that are excluded are referred to as the “Invisible Web” (or “Deep Web”) you don't see these pages in search engine results estimated to be two to three or more times larger than the “visible web”
  • 11. Why invisible pages? If a search engine doesn’t locate a Web page it’s because: Technical barriers prohibit access Choices or decisions made by the search engine (policy) exclude the page
  • 12. Technical barriers Typing or judgment is required Searchable specialized databases Logins and/or passwords required
  • 13. Policy issues Page format Non-HTML pages Script-based programs (those URLs with a “?”)
  • 14. Research issues Different search tools give different results Failure to retrieve does not mean that there is nothing available Develop a search strategy Learn the search engine’s search tips Evaluation
  • 15. A selection of engines •Google www.google.com •Vivisimo www.vivisimo.com •Ask.com www.ask.com •Yahoo! www.yahoo.com •Open Directory www.dmoz.org •Ixquick www.ixquick.com/ •Mamma www.mamma.com/ •Gigablast www.gigablast.com/ •Search.com www.search.com/
  • 16. Failure to retrieve crawling Web pages and locating sites for search engines is based on using links from one page to reach other pages to crawl documents with few links tend to be overlooked if pages are never discovered, they are not available to researchers Failure to retrieve can also be linked to the search query used, or search strategy
  • 17. Search strategy three main considerations in the search process Relevance Precision Recall
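Precision and recall are usually given numeric definitions: precision is the share of retrieved items that are relevant, and recall is the share of all relevant items that were retrieved. The sketch below applies those standard definitions to made-up document IDs; the function name `precision_recall` is a hypothetical one chosen for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, using the usual definitions."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items the search actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Made-up example: four results came back, three documents were truly relevant.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
print(p, r)
```

Here two of the four results are relevant (precision 0.5), and two of the three relevant documents were found (recall of two thirds).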
  • 18. Successful search strategy ability to create an exact match between search statement and documents sought size and content of the search engine selected search engine’s search tools
  • 19. Process involves consultation of definition tools subject dictionaries thesauri, etc. subject familiarization i.e. if searching on medical topics, become familiar with basic terminology same goes for research in any other subject area
  • 20. Formulating a strategy be logical spend time on search term selection and combining to reduce the time spent eliminating irrelevant search results search engines are good for searching on unusual or unique keywords, and for combining keywords be creative and flexible look for subtle connections be prepared to make intuitive leaps
  • 21. Simplified search strategy Formulation of the research question and its scope Identification of concepts within the question Identification of search terms to describe those concepts Consideration of synonyms and variations of those terms Preparation of the search logic Readiness to revise and redo a search
  • 22. Boolean logic describes certain logical operations that are used to combine search terms basic Boolean operators are AND, OR and NOT
  • 23. AND limits results to those items that contain both, or all, of the search terms in the query search query with the AND operator will retrieve only those items containing all of the search terms
  • 24. OR helpful in the first phases of a search especially if the searcher is unsure of what information is available on the topic or what words are used to categorize it when used between two words, it instructs the search tools to retrieve any record containing either of the words
  • 25. NOT The third of the most common Boolean operators used to eliminate records containing a particular word or combination of words from the search results
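One way to picture the three Boolean operators is as operations on sets of matching records. The Python sketch below assumes a tiny made-up collection (`DOCS`) and builds a simple word index over it; the names `INDEX` and `docs_with` are hypothetical and chosen only for this example.

```python
# Hypothetical records, keyed by ID.
DOCS = {
    1: "cats and dogs as pets",
    2: "dogs in the park",
    3: "cats on the internet",
    4: "birds in the park",
}

# Index: word -> set of IDs of the records that contain it.
INDEX = {}
for doc_id, text in DOCS.items():
    for word in text.split():
        INDEX.setdefault(word, set()).add(doc_id)

def docs_with(word):
    """Return the IDs of all records containing `word`."""
    return INDEX.get(word, set())

both = docs_with("cats") & docs_with("dogs")     # cats AND dogs
either = docs_with("cats") | docs_with("dogs")   # cats OR dogs
without = docs_with("cats") - docs_with("dogs")  # cats NOT dogs
print(both, either, without)
```

AND narrows the result to record 1, OR widens it to records 1, 2 and 3, and NOT leaves only record 3.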
  • 26. Search engine search tips Check the Help files of a search engine Some search engines allow you to apply date restrictions to a search Word order in natural language searching can greatly influence the search A question phrased in different ways can produce different results An added influence is the weight some search engines place on words located earlier in the search query
  • 27. + sign ensures that a search engine finds pages that have all the words you enter, not just some of them
  • 28. - sign a search engine will find pages that have one word on them but not another word
  • 29. Phrase searching ensures that terms appear in the order they are entered placing the phrase within quotation marks tells the search engine to retrieve pages where the terms appear exactly in the order specified
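The + sign, the - sign, and quoted phrases can be sketched together as a small page-matching function. This is a simplified illustration under stated assumptions, not any engine's actual query syntax: the `matches` and `contains_phrase` functions are hypothetical, plain words are treated the same as + words, punctuation in the page text is ignored, and the query must not contain stray apostrophes because `shlex.split` treats them as quotes.

```python
import shlex

def contains_phrase(words, phrase_words):
    """True if `phrase_words` appear in `words` consecutively, in order."""
    n = len(phrase_words)
    return any(words[i:i + n] == phrase_words for i in range(len(words) - n + 1))

def matches(query, text):
    """True if `text` satisfies a query using +word, -word and "a phrase"."""
    words = text.lower().split()
    # shlex.split keeps a quoted phrase together as a single token.
    for token in shlex.split(query.lower()):
        if token.startswith("-"):
            if token[1:] in words:
                return False  # an excluded word is present
        else:
            # Assumption: plain words are required, just like +words.
            term_words = token.lstrip("+").split()
            if not contains_phrase(words, term_words):
                return False  # a required word or phrase is missing
    return True

print(matches('+jaguar -car "big cat"', "The jaguar is a big cat"))
print(matches('+jaguar -car', "jaguar car dealer"))
print(matches('"big cat"', "cat big"))
```

The first page matches; the second is rejected because it contains the excluded word, and the third because the phrase words appear in the wrong order.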
  • 30. Web page evaluation Before you leave the list of search results -- before you click and get interested in anything written on the page -- glean all you can from the URLs of each page. choose pages most likely to be reliable and authentic
  • 31. Main evaluation points Accuracy Authority Objectivity Currency Coverage
  • 32. Terminology Concept search Full-text index Fuzzy search Index Keyword search Precision
  • 33. Terminology, cont. Proximity search Query-by-example Recall Relevancy Stemming Stop words Thesaurus
  • 34. Resources and sources Final tip—links on Web pages may lead to other relevant sites, but be careful of going off on tangents
  • 35. Resources 20 great Google secrets http://www.pcmag.com/article2/0,4149,1306756,00.asp WWW Virtual Library http://vlib.org SearchEngineShowdown http://www.searchengineshowdown.com/
  • 36. Resources Best Search Tools Chart http://www.infopeople.org/search/chart.html Searching the Internet Effectively http://www2.vuw.ac.nz/staff/alastair_smith/searching/ Finding Images Online http://www.tasi.ac.uk/resources/searchingresources.html FindSounds http://www.findsounds.com/
  • 37. Resources SwitchBoard http://www.switchboard.com/ AnyWho http://www.anywho.com/ Yahoo! PeopleSearch http://people.yahoo.com/ Web 2.0 http://www.go2web20.net/ Library 2.0 http://instructionwiki.org/Library_2.0_in_15_minutes_a_day
  • 38. Overall search engine info Best General Search Engines http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html