The Deep Web
The surface Web is that portion of the World Wide Web that is indexable by conventional search engines.
It is also known as the Clearnet, the visible Web or indexable Web.
Eighty-five percent of Web users use search engines to find needed information, but nearly as high a
percentage cite the inability to find desired information as one of their biggest frustrations.
A traditional search engine sees only a small amount of the information that's available, a measly 0.03%.
Deep Web - Introduction
The Deep Web is World Wide Web content that is not part of the Surface Web, which is indexed
by standard search engines.
It is also called the Deepnet, Invisible Web or Hidden Web.
Fastest-growing category of new information on the Internet.
400-550X more public information than the Surface Web.
Total quality content 1,000-2,000X greater than that of the Surface Web.
Jill Ellsworth used the term invisible Web in 1994 to refer to websites that were not registered
with any search engine.
Mike Bergman cited a January 1996 article by Frank Garcia:
“It would be a site that's possibly reasonably designed, but they didn't bother to register it with
any of the search engines. So, no one can find them! You're hidden. I call that the invisible Web”.
Another early use of the term Invisible Web was by Bruce Mount and Matthew B. Koll of Personal
Library Software in 1996.
The first use of the specific term Deep Web, now generally accepted, occurred in
Bergman's 2001 study.
How search engines work
Search engines construct a database of the Web by using programs called spiders or Web crawlers
that begin with a list of known Web pages.
The spider gets a copy of each page and indexes it, storing useful information that will let the page
be quickly retrieved again later.
Any hyperlinks to new pages are added to the list of pages to be crawled.
Eventually all reachable pages are indexed, unless the spider runs out of time or disk space.
The collection of reachable pages defines the Surface Web.
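The crawl loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search engine works: the Web is replaced by a small in-memory link graph (`LINKS`, with made-up page names), "indexing" is reduced to recording the page name, and `max_pages` stands in for the spider running out of time or disk space.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page -> pages it links to.
LINKS = {
    "home": ["about", "catalog"],
    "about": ["home"],
    "catalog": ["item-1"],
    "item-1": [],
    "orphan": ["home"],  # no page links here, so the spider never finds it
}

def crawl(seeds, links, max_pages=1000):
    """Breadth-first crawl; returns pages in the order they were indexed."""
    queue = deque(seeds)      # list of known pages still to be crawled
    seen = set(seeds)         # pages already queued, to avoid revisiting
    indexed = []
    while queue and len(indexed) < max_pages:
        page = queue.popleft()
        indexed.append(page)  # stand-in for fetching and indexing the page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed

surface = crawl(["home"], LINKS)
```

In this toy graph `surface` contains every page reachable from `home`, while `orphan` is never indexed even though it exists, because nothing links to it.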
Content that search engines cannot reach falls into several categories:
• Dynamic pages which are returned in response to a submitted query or accessed only
through a form
   • especially if open-domain input elements (such as text fields) are used
   • such fields are hard to navigate without domain knowledge
• Pages which are not linked to by other pages
   • which may prevent web crawling programs from accessing the content
   • this content is referred to as pages without backlinks (or inlinks)
• Private Web: sites that require registration and login (password-protected resources).
• Contextual Web: pages with content varying for different access contexts (e.g., ranges
of client IP addresses or previous navigation sequence).
• Limited access content: sites that limit access to their pages in a technical way (e.g.,
using the Robots Exclusion Standard, CAPTCHAs, or no-cache Pragma HTTP headers, which
prohibit search engines from browsing them and creating cached copies).
• Scripted content: content dynamically downloaded from Web servers via Flash or Ajax solutions.
• Non-HTML/text content: textual content encoded in multimedia (image or video) files or specific
file formats not handled by search engines.
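As one concrete case, the Robots Exclusion Standard lets a site publish a robots.txt file that well-behaved crawlers consult before fetching a page. The sketch below uses Python's standard `urllib.robotparser` module; the robots.txt content, the `example.org` URLs and the `ExampleBot` user agent are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps every crawler out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the rules in ROBOTS_TXT allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())  # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)

public_ok = may_crawl("https://example.org/index.html")
private_ok = may_crawl("https://example.org/private/report.html")
```

Here `public_ok` is true and `private_ok` is false, so a compliant spider would skip the second page and it would stay out of the index.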
The deep Web is an endless repository for a mind-reeling amount of information.
It's powerful. It unleashes human nature in all its forms, both good and bad.
There are engineering databases, financial information of all kinds, medical papers, pictures, illustrations ... the list
goes on, basically, forever.
For example, construction engineers could potentially search research papers at multiple universities in order to
find the latest and greatest in bridge-building materials.
Doctors could swiftly locate the latest research on a specific disease.
The potential is unlimited. The technical challenges are daunting. That's the draw of the deep Web.
The deep Web may be a shadow land of untapped potential.
The bad stuff, as always, gets most of the headlines.
You can find illegal goods and activities of all kinds through the dark Web.
That includes illicit drugs, child pornography, stolen credit card numbers, human trafficking, weapons, exotic
animals, copyrighted media and anything else you can think of.
Theoretically, you could even, say, hire a hit man to kill someone you don't like.
But you won't find this information with a Google search.
These kinds of Web sites require you to use special software, such as The Onion Router, more commonly known
as Tor.
The Onion Router (TOR)
Tor is software that installs into your browser and sets up the specific connections you need to access dark
Web sites.
Critically, it is free software for enabling online anonymity and censorship resistance.
Onion routing refers to the process of removing encryption layers from Internet communications, similar to
peeling back the layers of an onion.
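The layering idea can be illustrated with a toy sketch. It is not Tor's actual protocol or cryptography: the XOR "cipher" built from SHA-256 is insecure and used only to stay within the standard library, and the relay names, keys and `|` separator are assumptions invented for the example. The sender wraps the message once per relay; each relay removes one layer and learns only the next hop.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` bytes from `key` (toy construction, NOT secure)."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:  # each SHA-256 digest is 32 bytes
        blocks.append(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return b"".join(blocks)[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))

def wrap(message: bytes, circuit: list[tuple[str, bytes]]) -> bytes:
    """Sender side: build the onion from the innermost layer outwards."""
    payload = message
    next_hop = "destination"
    for name, key in reversed(circuit):
        payload = xor_layer(next_hop.encode() + b"|" + payload, key)
        next_hop = name
    return payload

def peel(onion: bytes, key: bytes) -> tuple[str, bytes]:
    """Relay side: remove one layer, revealing the next hop and the inner onion."""
    plain = xor_layer(onion, key)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner

# Hypothetical three-relay circuit with made-up keys.
circuit = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
onion = wrap(b"hello", circuit)

hops = []
for _, relay_key in circuit:
    hop, onion = peel(onion, relay_key)
    hops.append(hop)
```

After the loop, `hops` lists the next hop each relay saw and `onion` holds the original message; no single relay sees both the sender and the plaintext destination together.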
Using Tor makes it more difficult to trace Internet activity, including "visits to Web sites, online posts, instant
messages, and other communication forms", back to the user.
It is intended to protect the personal privacy of users, as well as their freedom and ability to conduct
confidential business by keeping their internet activities from being monitored.
Instead of seeing domains that end in .com or .org, these hidden sites end in .onion.
The most infamous of these onion sites was the now-defunct Silk Road, an online marketplace where
users could buy drugs, guns and all sorts of other illegal items.
The FBI eventually captured Ross Ulbricht, who operated Silk Road, but copycat sites like Black Market
Reloaded are still readily available.
Tor is the result of research done by the U.S. Naval Research Laboratory, which created Tor for political
dissidents and whistleblowers, allowing them to communicate without fear of reprisal.
Tor was so effective in providing anonymity for these groups that it didn't take long for the
criminally minded to start using it as well.
Silk Road Website
U.S. authorities shut down Silk Road after the alleged owner of the site,
Ross William Ulbricht, was arrested.
You may wonder how any money-related transactions can happen when sellers and buyers can't
identify each other.
That's where Bitcoin comes in.
Bitcoin is basically an encrypted digital currency.
Like regular cash, Bitcoin is good for transactions of all kinds, and notably, it also allows for a
degree of anonymity: payments are tied to addresses rather than names, which makes a purchase,
illegal or otherwise, hard to trace.
When paired properly with Tor, it's perhaps the closest thing to a foolproof way to buy and sell on
the Web.
The Brighter Side of Darkness
The deep Web is home to alternate search engines, e-mail services, file storage, file sharing, social
media, chat sites, news outlets and whistleblowing sites, as well as sites that provide a safer meeting
ground for political dissidents and anyone else who may find themselves on the fringes of society.
In an age where NSA-type surveillance is omnipresent and privacy seems like a thing of the past, the
dark Web offers some relief to people who prize their anonymity.
Bitcoin may not be entirely stable, but it offers privacy, which is something your credit card company
most certainly does not.
For citizens living in countries with violent or oppressive leaders, the dark Web offers a more secure way
to communicate with like-minded individuals.
Invisible Web Search Tools
• A List of Deep Web Search Engines – Purdue OWL’s Resources to Search the Invisible Web
• Art – Musée du Louvre
• Books Online – The Online Books Page
• Economic and Job Data – FreeLunch.com
• Finance and Investing – Bankrate.com
• General Research – GPO’s Catalog of US Government Publications
• Government Data – Copyright Records (LOCIS)
• International – International Data Base (IDB)
• Law and Politics – THOMAS (Library of Congress)
• Library of Congress – Library of Congress
• Medical and Health – PubMed
• Transportation – FAA Flight Delay Information
The lines between search engine content and the deep Web have begun to blur, as search services
start to provide access to part or all of once-restricted content.
An increasing amount of deep Web content is opening up to free search as publishers and libraries
make agreements with large search engines.
In the future, deep Web content may be defined less by opportunity for search than by access fees or
other types of authentication.
The deep Web will continue to perplex and fascinate everyone who uses the Internet.
It contains an enthralling amount of knowledge that could help us evolve technologically and as a
species when connected to other bits of information.
And of course, its darker side will always be lurking, too, just as it always does in human nature.
The deep Web speaks to the fathomless, scattered potential of not only the Internet, but the human