Discover the invisible web 2011 presentation


Elizabeth Holmes

CLE presentation June 2011

Education Technology Design

Discover the Invisible Web Elizabeth Geesey Holmes, M.A., M.S.L.I.S. Law Librarian Partridge Snow & Hahn LLP

What is the Invisible Web? How does it differ from the visible Web, and why should we care about it?

Synonyms ©Elizabeth Geesey Holmes ■ EGHresearch@gmail.com

What is the Surface Web? Tip of the Iceberg

What is the Invisible Web? Tip of the Iceberg The majority of information on the Web

What Types of Content are Invisible? Search Engine does not know about the page

What Types of Content are Invisible? Search Engine does not know about the page; Search Engine decided not to index the content

Based on Sherman & Price (2001). Chart created by EGH using http://simplemapper.org/

Finding Invisible Web content: finding the content that is “hidden” away within invisible websites

Tip #1: Remember that the Invisible Web exists

The “Google-Gap”: “The object is not to replace general-purpose search engines but to show how Invisible Web resources complement search engine results.” (Devine & Egger-Sider, 2009) “Google-Gap – the difference between the growing perception that the site is omniscient and the fact that it isn’t” (Salkever, 2003)

Tip #2: Locate a database or website that contains the information for which you are looking

Tip #2: “The point is that often the key to the answer is not locating the answer itself as the first step, but locating the right database in which to search for it.” Diana Botluk, Mining Deeper into the Invisible Web, http://www.llrx.com/features/mining.htm

Example #1: Find a Database. Westlaw for-fee results; pages no longer there; not what we are looking for; this site looks good

Example #2: Find a Website. Find the number of Income taxes filed in the state of Rhode Island

Real 1st hit Tax forms or tax form instructions

Real hit No. 2 Tax forms or information on tax filing

Google’s cached version (captured on May 12)

Tip #3: Consult a Web Directory or Invisible Web Portal

http://www.nyls.edu/library/research_tools_and_sources/dragnet/

Tip #4: Ask a Web research specialist such as a Law Librarian

Tip #5: Use tools in a general search engine to sort your results and bring relevant ones to the top

Example #1: Google’s Sidebar

You can limit further by size, color and more

Tip #6: Don’t forget information from the social web

Example: Twitter http://search.twitter.com/

Tip #7: Make note of websites recommended for legal research

Where do old Web pages go? And can they be located again?

The Wayback Machine http://www.archive.org/

Conclusion: When & Why to use Invisible Web resources

Use Invisible Web Resources When: you are looking for a precise answer; you need authoritative results; timeliness of the content is important; you are searching for information in a specific subject area

Contact Information: EGHresearch@gmail.com 706-224-1732 http://ElizabethGeeseyHolmes.com http://www.linkedin.com/in/elizabethgeeseyholmes Twitter: bethgholmes

1. Discover the Invisible Web Elizabeth Geesey Holmes, M.A., M.S.L.I.S. Law Librarian Partridge Snow & Hahn LLP

2. What is the Invisible Web? How does it differ from the visible Web, and why should we care about it?

3. Invisible Web Literature

5. What is the Surface Web? Tip of the Iceberg

6. What is the Invisible Web? Tip of the Iceberg The majority of information on the Web

7. What Types of Content are Invisible? Search Engine does not know about the page

8. What Types of Content are Invisible? Search Engine does not know about the page; Search Engine decided not to index the content

9. What Types of Content are Invisible? Search Engine does not know about the page; Search Engine decided not to index the content; Search Engine has been asked not to index the content

10. What Types of Content are Invisible? Search Engine does not know about the page; Search Engine decided not to index the content; Search Engine has been asked not to index the content; Search Engine does not have the technology required to index non-HTML content

11. What Types of Content are Invisible? Search Engine does not know about the page; Search Engine decided not to index the content; Search Engine has been asked not to index the content; Search Engine does not have the technology required to index non-HTML content; Search Engine cannot get to the pages to index them

12. Based on Sherman & Price (2001). Chart created by EGH using http://simplemapper.org/

13. Finding Invisible Web content: finding the content that is “hidden” away within invisible websites

15. The “Google-Gap”: “The object is not to replace general-purpose search engines but to show how Invisible Web resources complement search engine results.” (Devine & Egger-Sider, 2009) “Google-Gap – the difference between the growing perception that the site is omniscient and the fact that it isn’t” (Salkever, 2003)

16. Tip #2: Locate a database or website that contains the information for which you are looking

17. Tip #2: “The point is that often the key to the answer is not locating the answer itself as the first step, but locating the right database in which to search for it.” Diana Botluk, Mining Deeper into the Invisible Web, http://www.llrx.com/features/mining.htm

18. Example #1: Find a Database. Westlaw for-fee results; pages no longer there; not what we are looking for; this site looks good

20. Example #2: Find a Website. Find the number of Income taxes filed in the state of Rhode Island

21. Real 1st hit Tax forms or tax form instructions

22. Real hit No. 2 Tax forms or information on tax filing

24. Google’s cached version (captured on May 12)

31. http://www.nyls.edu/library/research_tools_and_sources/dragnet/

33. http://www.completeplanet.com/

35. http://www.plol.org/

40. Tip #5: Use tools in a general search engine to sort your results and bring relevant ones to the top

42. You can limit further by size, color and more

45. Example: Twitter http://search.twitter.com/

47. One Standout Resource

48. Where do old Web pages go? And can they be located again?

49. The Wayback Machine http://www.archive.org/

53. Cached Pages on Google

54. Conclusion: When & Why to use Invisible Web resources

55. Use Invisible Web Resources When: you are looking for a precise answer; you need authoritative results; timeliness of the content is important; you are searching for information in a specific subject area

56. Contact Information: EGHresearch@gmail.com 706-224-1732 http://ElizabethGeeseyHolmes.com http://www.linkedin.com/in/elizabethgeeseyholmes Twitter: bethgholmes

57. Tips for finding Invisible Web content: Remember that the Invisible Web exists; Locate a database that contains the information you are looking for; Consult a Web Directory or Invisible Web Portal; Do a vertical search of a deep content web site; Ask a Web research specialist (Librarian); Use tools in a general search engine to sort your results and bring relevant ones to the top; Do not forget information from the social web; Make note of websites recommended for legal research

58. Example No. 1 Finding Statistics: No. of Income Taxes Filed in RI. Show example of general Google search. Then use Google to search for a database or other site that would have this information. Go to RI Division of Taxation. Look under Reports.

59. Source: Based on Sherman and Price (2001)

60. Parts of the Invisible Web

Editor's Notes

From Bonnie Shucha at U Wisc. Show of hands: Of all the time you spend on legal research, how much of it is spent on the Web? Less than 25%, 25-50%, 50-75%, more than 75%? What search tools do you prefer? Google, Westlaw, Lexis, others? Generally, how satisfied are you with your search experiences? Very, Fairly, Somewhat, Not Very, Not at All?
NEED AN INTRO HERE OR ON THE SLIDE BEFORE THIS ONE. There has been, and continues to be, a lot written on the Invisible Web. Here are just some samples I used when writing this presentation.
Dr. Jill Ellsworth coined the term “Invisible Web” in 1996. It is also called the “Deep Web,” “Hidden Web,” and “Dark Web”. The Invisible Web consists of the data you cannot retrieve using a keyword search in a general search engine. I and others also include any results from a general search engine that are not on the first couple of pages. If one isn’t going to look past those first few pages, then the rest of the results are effectively “invisible”. Due to the dynamic nature of the Web, what is “invisible” today may be “visible” tomorrow. The Visible Web is what you see in the results pages generated by general web search engines such as Yahoo!, Google or Bing. It is also called the Surface Web.
In order to understand the concept of the “invisible Web”, it may be helpful to first explore the nature of the “visible Web”. A visible Web page exists in “static” or unchanging form. It exists as a “physical” file on a computer. Most are in .htm or .html format, similar to a word-processed document in .doc or .wpd format. Static Web pages are considered “visible” because standard search engines can index them and display them as search results. The Surface Web is just the Tip of the Iceberg.
What is the Invisible Web? In late 2001 a search company called BrightPlanet made several points in a now famous white paper on the invisible Web (The Deep Web: Surfacing Hidden Value). It is as much as five hundred times larger than the visible Web. It is the largest growing category of new information on the Internet. It contains sites that tend to be narrower, with deeper content, than surface sites. Its content is highly relevant to every information need, market, and domain. It consists primarily of publicly accessible information, not subject to fees or subscriptions. Chris Sherman and Gary Price, Internet search gurus and authors of The Invisible Web (2001), disagree with BrightPlanet’s sizing of the invisible Web. They estimate it to be somewhere between 2 and 50 times larger than the visible Web. Search engines such as Google are making strides to index deep Web content, so this number is constantly changing. Nonetheless, the invisible Web still accounts for the majority of information available on the Internet.
Karen explained earlier this morning how search engines create their indexes. Understanding this process explains how so much information is not retrieved by a search engine search. I decided to use the word content here rather than sites. The main page of a Deep Web site is usually easy to find and is in the Search Engine’s index. It is the rest of the site (other webpages and other content within the site) that may be hidden. Search engines do not index certain web content (making it Invisible) mainly for the following reasons: 1. The Search Engine does not know about the page. No one has submitted the URL to the search engine, and/or no pages currently covered by the Search Engine have linked to it. The Search Engine is assuming in this case that hardly anyone cares about this page, so you probably don’t either.
2. The Search Engines have decided not to index the content. Because it’s too deep in the site (probably less useful). The page changes frequently and indexing the content would be somewhat meaningless (for example: News pages). The page is generated dynamically -- it only exists for a moment in time as a result of a query/search.
3. The Search Engine has been asked not to index the content. A robots.txt file is present on the site. This file asks the Search Engine not to index the site, or not to index specific pages or particular parts of the site. (The website creator has determined that this information is Nobody Else’s Business.)
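To make the robots.txt note concrete, here is a minimal sketch of how a well-behaved crawler might check a site's robots.txt rules before fetching a page. It uses Python's standard `urllib.robotparser` module. The robots.txt content, the crawler name "ExampleBot" and the example.com addresses are all invented for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/memo.html
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before fetching each page.
allowed_home = parser.can_fetch("ExampleBot", "https://www.example.com/index.html")
allowed_private = parser.can_fetch("ExampleBot", "https://www.example.com/private/clients.html")

print("Home page may be crawled:", allowed_home)
print("Private page may be crawled:", allowed_private)
```

Note that robots.txt is only a request. The pages under /private/ are still on the Web for anyone who knows the address; they simply do not show up in a compliant search engine's index.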
4. Web pages are created or coded using HTML. This is what the Search Engines’ spiders are crawling for content. Files such as images and videos are not HTML. This category used to include file types such as PDF, Excel, Word and others, but these are now able to be indexed. Also, because of the increasing amounts of HTML-readable data attached to them, these visual files are becoming more and more indexable and thus findable by Search Engines.
5. The Search Engine cannot get to the pages to index them because it encounters a request for a password, or the site has a search box that must be filled out in order to get to the content. The main example of this type of content is information in databases.
This chart is another way to look at the types of sites and content that are in the Invisible Web. It breaks the Invisible Web into 4 main sub-types based on the categories identified by Sherman and Price. These are: The Opaque Web, The Private Web, The Proprietary Web, and The Truly Invisible Web. Highlight some of these depending on time: Maximum # of viewable results – only first 10,000 or so are actually viewable (still way more than anyone would actually look at); Password protected; Information in Databases; Dynamically generated. Corresponds to paper bottom of pg 2 and top of pg 3.
Remember that the Invisible Web exists – AND then use the other tips in this presentation to find valuable information that is “hidden” there. If you don’t know it exists then you won’t go looking for it – and will miss out.
Devine & Egger-Sider noted in their book Going Beyond Google: The Invisible Web in Learning and Teaching that the object in teaching about and remembering the Invisible Web is NOT to replace general-purpose search engines but to show how Invisible Web resources complement search engine results. Knowing about the Invisible Web helps close what Alex Salkever called the “Google-Gap”, or the difference between what people think Google can retrieve and what it actually can. (From his article “The Web According to Google” in Business Week Online 10, no.23 June 2003)
One basic thing to do is use a Search Engine such as Google to find a database that includes the content you are looking for. Enter some keywords and then the word DATABASE. Once you find a database, then do another search using its search function to find the content you need. This is also referred to as “split-level searching”. “The point is that often the key to the answer is not locating the answer itself as the first step, but locating the right database in which to search for it.” Diana Botluk, Mining Deeper into the Invisible Web, http://www.llrx.com/features/mining.htm
One basic thing to do is use a Search Engine such as Google to find a database that includes the content you are looking for. Enter some keywords and then the word DATABASE. For example, if we want to find Rhode Island Dockets, enter the keywords: rhode island dockets AND then the word database.
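The first half of a split-level search can be sketched in a few lines of Python. This is only an illustration: the helper name `database_search_url` is made up, and the sketch assumes Google's familiar `https://www.google.com/search?q=...` address form. The second half of the search happens inside whatever database the first search turns up, so it is not shown.

```python
from urllib.parse import urlencode


def database_search_url(keywords: str) -> str:
    """Build a general search engine query that looks for a database on a topic.

    Step 1 of split-level searching: topic keywords plus the word "database".
    """
    query = f"{keywords} database"
    return "https://www.google.com/search?" + urlencode({"q": query})


url = database_search_url("rhode island dockets")
print(url)
```

Pasting the printed address into a browser runs the same search as typing rhode island dockets database into the search box.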
Clicking on the link leads us to Justia’s Dockets & Filings page. This is one of the resources covered in my paper and I will be talking more about Justia in my presentation on Government Resources. It is a legal portal whose mission, according to their website, is to advance the availability of legal resources for the benefit of society. They are especially focused on making primary legal materials and community resources free and easy to find on the Internet. This site is independently owned by two individuals with legal and research backgrounds and is not affiliated with any publisher. Justia is a very deep website, and in this section you can search dockets from different jurisdictions, so I’ve chosen Rhode Island from the drop-down menu and clicked on SEARCH to get these results: cases filed in Rhode Island. Justia also provides Internet users with free case law, codes, regulations, legal articles and legal blog and Twitterer databases, as well as additional community resources. Justia works with educational, public interest and other socially focused organizations to bring legal and consumer information to the online community. Justia provides premium Web site, blogging and online marketing solutions to help law firms optimize their marketing budget and provide their clients with an increased level of information and service.
Actual question from our CFO (my boss). Tip #2 Example: Finding Statistics: No. of Income Taxes Filed in RI.
Here’s the first page from a general keyword Google search for our question. The first hits are just for places with our keyword. So, our first real hit is at the site: rhodeislandwork.us. The next 4 results are all for tax forms or instructions for forms.
The second real hit is to an article in the Providence Journal. The rest of the results on this page are for more forms and instructions.
Screen capture of real hit No. 1 from Google general search for “number of tax returns filed in Rhode Island”. This leads to a page or section of the website that is no longer available.
Google’s cached version of real hit No. 2. This is why the page is included in our results. Our keywords are highlighted.
Screen capture of real hit No. 2 from Google general search for “number of tax returns filed in Rhode Island”. Still doesn’t have the information/data we are looking for. So, the first page of Google results does not lead us to an answer for our question.
Next, use Google to locate a site that may have this information. I thought that the page for the RI Division of Taxation might have this information.
Website for Division of Taxation. Didn’t see anything at first glance. Did a search for returns and it took me to the Reports page. Originally I just looked at the links on the left of the page and chose Reports, thinking that the data I am looking for might be in a report.
Reports page. I chose 2009 Resident because it was the most recent and I am looking for Personal and not corporate information.
Part of the report page with the answer I am looking for.
Use as examples: DRAGNET, Complete Planet, Library of Law.
Created by Mendik Library at New York Law School. Winner of the AALL 2011 Law Library Publications Award, Nonprint Division. The award honors achievement in creating in-house library materials that are outstanding in quality and significance. DRAGNET stands for Database Retrieval Access Using Google’s New Electronic Technology. It is a Google Custom Search engine that searches only 100 highly recommended FREE legal databases and websites. It includes a mix of governmental and organizational sites with an emphasis on New York.
In addition to this DRAGNET search of free legal databases there is also a DRAGNET search of 150 law journals with FREE online content. (And a third DRAGNET that searches constitutions and codes of the 50 states and federal government.)
One of the best Invisible Web portals is CompletePlanet. It is not included in my paper because during the few days I was writing it I could not get into the site and I didn’t want to include it without verifying that it was up and running. I’ve since had no problem accessing it, so I am including it in this presentation. It is a comprehensive listing of dynamic searchable databases. You can search by keyword or drill down by subjects. Here I am going to start by clicking on the LAW subject classification.
Now I am at a page displaying databases classified in the LAW category. I can drill down further by more specific subject subcategories. Here I’ve chosen Courts. Notice the “breadcrumb” trail near the top of the page, circled in yellow. You can use this to go back to earlier pages, or just to see where you are in the subject hierarchy. While the results are in the 1000s, that is far fewer, and much more relevant, than a general search engine search, which would retrieve results in the millions.
The Public Library of Law is another Legal Web Portal which I have recommended in my paper. They bill themselves as the best starting place to find law on the Web. It’s easy to use and contains links to case law, federal law, legislative information, and more.
Their Advanced search lets you search by keyword, date range and jurisdiction, underneath each of the tabs for Case Law, Statutes, Regulations, Court Rules, Constitutions, and Legal Forms.
There are also links to the for-fee information available with a Fastcase subscription because this site is affiliated with the publisher FASTCASE.
Ask a web research specialist such as a Law Librarian or other Information Professional.
If you have a Librarian at your firm you could consult him or her. Or consult the Librarians at your Public Library or my fellow presenter Karen at the RI State Law Library. You could also hire an Independent Information Professional to do a research project for you. (No, I didn’t color coordinate my presentation to these websites!)
You can limit your results using the categories in the left hand sidebar. For example, if you are looking just for images, click on Images to limit by that content type. If you click on More you can also limit by BLOGS,
Now we have only Images in our results list. We don’t have to scroll through pages of results to get to the images. You can further limit by size of image.
Karen mentioned the search engine Yippy earlier this morning. Using the left sidebar on Yippy you can limit your search results by subject areas – they call them Clouds – and you can drill down many layers. Here I limited by Law – Immovable, Common law – and under that Real Estate etc… You can also limit by type of Sites.
I saw something this morning about Roundup Weed Killer and wanted to see what people were saying about it, or what news sources they were linking to. To do this I used Twitter Search. This is obviously a hot topic, since 3 more results were found since I started searching – just seconds earlier.
Lastly, keep your eyes open for useful websites that are reviewed in the newspapers, your professional newsletters, or that are recommended at seminars such as this. I’ve listed many of these at the end of my paper and will be covering others in my next section on Government Resources. Our other speakers today have also mentioned, and will be mentioning, more useful sites today.
In February of this year Marcus Zillman, an expert on the deep web, published this paper on LLRX.com. It includes many sections – one of the most useful is the list of Deep Web Research Resources.
Related to information hidden on the Web is the issue of what happens to websites, or information contained in those sites, that used to be on the Internet. Perhaps some of you have tried to find a Web page again and found that it was no longer there or that it had changed its appearance. The Wayback Machine, part of the Internet Archive initiative, is one source for archived internet information. It is an archive of billions of Web pages, with new sites and new versions of sites added regularly. They’ve just released a new interface, which is pictured here. Notice it’s very clean and simple – much like Google.
For example, if I wanted to look at the website for Partridge Snow & Hahn from 2006, I would enter the URL for the website in the search box and click on SHOW ALL.
I selected the year 2006. The dates that this URL was crawled by the Wayback Machine are circled in blue. Note that the calendar maps the number of times the site was crawled and not how many times the site was actually updated.
Selecting the Feb 12, 2003 date leads to this version of the website. I can even click on a subpage and bring up this list of attorneys as of that date in 2003.
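The same lookup can also be done by typing an address directly instead of clicking through the calendar. The sketch below is hedged: it assumes the Wayback Machine's commonly seen address pattern, `https://web.archive.org/web/<timestamp>/<original URL>`, where the timestamp is written as YYYYMMDDhhmmss and the archive shows the capture nearest to it. The function names and the example.com address are hypothetical stand-ins, not the firm's actual URL.

```python
def wayback_snapshot_url(page_url: str, year: int, month: int = 1, day: int = 1) -> str:
    """Address of the archived capture nearest to the given date (assumed pattern)."""
    timestamp = f"{year:04d}{month:02d}{day:02d}000000"  # YYYYMMDDhhmmss
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def wayback_calendar_url(page_url: str, year: int) -> str:
    """Address of the calendar view listing the captures for one year (assumed pattern)."""
    return f"https://web.archive.org/web/{year}*/{page_url}"


# Hypothetical site address used only for illustration.
snapshot = wayback_snapshot_url("http://www.example.com/", 2006)
calendar = wayback_calendar_url("http://www.example.com/", 2006)
print(snapshot)
print(calendar)
```

As in the calendar view, a capture date only tells you when the page was crawled, not when the site itself was last updated.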
You can also find earlier versions of some Web pages on Google. When Google indexes a Web page it takes a “snapshot” of it at that time to create its index. Most of these are cached and can be accessed by clicking on the word Cached under each hit in the list. This isn’t a solution for very old information, though, since each time a site is re-indexed – usually at least once a month – a new snapshot is taken, overwriting the older one.
Let’s review when and why you need to use invisible Web resources. Use these resources when you are looking for a precise answer, when you want authoritative, exhaustive results, when timeliness of the content is important, and when you are looking for information in a specific subject area. Resources in the invisible Web are usually more authoritative, more comprehensive, and more limited in scope, so your search results are more precise and relevant. I encourage all of you to use the skills you learn today to search smart, keep up to date and know what is out there and how to find it. Knowing what the invisible Web is and how to find information in it is an essential step to becoming a savvy Web searcher.
I have posted these slides on my website. Feel free to call me or contact me by e-mail if you have any questions. I’ve also included my LinkedIn contact info and my Twitter name.
