TEXT MINING
Presented By:
Prakhyath Rai
Asst. Professor, Dept. of ISE,
SCEM, Mangaluru
Outline
• Introduction
• Data Mining vs. Text Mining
• Motivation for Text Mining
• I/O Model for Text Mining
• Steps for Text Mining
• Key Terms in Text Mining
• Text Mining Frameworks
• Merits of Text Mining
• Applications of Text Mining
• Demerits of Text Mining
• References
Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru-575007
Introduction
Text mining is a discovery process.
Text mining is also referred to as Text Data Mining (TDM) and Knowledge Discovery in Textual Databases (KDT).
Text mining is used to extract relevant information, knowledge, or patterns from different sources that are in unstructured or semi-structured form.
Introduction Cont.
Extract and discover knowledge hidden in text automatically
Aid domain experts by automatically:
• identifying concepts
• extracting facts/relations
• discovering implicit links
• generating hypotheses
Data Mining vs. Text Mining
Data Mining | Text Mining
Processes data directly | Requires linguistic processing or natural language processing (NLP)
Identifies causal relationships | Discovers heretofore unknown information
Structured data | Semi-structured and unstructured data (text)
Structured numeric transaction data residing in a relational data warehouse | Applications deal with much more diverse and eclectic collections of systems and formats
Motivation for Text Mining
Approximately 90% of the world’s data is held in
unstructured formats (source: Oracle Corporation)
Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
Input-Output Model for Text Mining
Input: Documents
Process: Text Mining Technique
Output: Patterns, Connections, Trends
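The input-output model can be illustrated with a minimal Python sketch. It assumes that "patterns" means terms that appear in many documents and that "connections" means pairs of terms that appear in the same document. Both readings, and the names tokenize and mine, are illustrative assumptions rather than part of the original slides. Trends would additionally need timestamps and are left out.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mine(documents, top_n=3):
    """Input: a list of documents. Output: (patterns, connections).

    patterns    -> terms ranked by the number of documents containing them
    connections -> term pairs ranked by the number of documents containing both
    """
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        # Count each term once per document; sort so pairs are order-independent.
        terms = sorted(set(tokenize(doc)))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return term_counts.most_common(top_n), pair_counts.most_common(top_n)


if __name__ == "__main__":
    docs = [
        "text mining finds patterns",
        "text mining finds trends",
        "data mining",
    ]
    patterns, connections = mine(docs)
    print("Patterns:", patterns)
    print("Connections:", connections)
```

Counting each term once per document keeps long documents from dominating the result; a real system would likely weight terms differently.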
Steps for Text Mining
1. Pre-Processing the Text
2. Applying Text Mining Techniques
• Summarization
• Classification
• Clustering
• Visualization
• Information Extraction
3. Analyzing the Text
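The three steps can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: pre-processing is reduced to lowercasing, tokenizing and stop-word removal; the technique applied is a simple frequency-based extractive summarization; and the analysis step just reports the top terms next to the summary. The stop-word list and the function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "from", "in", "is", "of", "or", "the", "to"}


def preprocess(text):
    """Step 1: lowercase, tokenize, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def summarize(text, n_sentences=1):
    """Step 2: pick the sentences whose words are most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(preprocess(text))
    scored = [
        (sum(freq[t] for t in preprocess(s)), i, s) for i, s in enumerate(sentences)
    ]
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:n_sentences]
    # Return the chosen sentences in their original order.
    return [s for _, _, s in sorted(top, key=lambda item: item[1])]


def analyze(text):
    """Step 3: report the most frequent terms alongside the summary."""
    return {
        "top_terms": Counter(preprocess(text)).most_common(3),
        "summary": summarize(text),
    }


if __name__ == "__main__":
    sample = (
        "Text mining extracts knowledge from text. Cats sleep. "
        "Mining text reveals patterns in text."
    )
    print(analyze(sample))
```

Any of the other listed techniques (classification, clustering, visualization, information extraction) could take the place of summarize in step 2 without changing steps 1 and 3.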
Key Terms in Text Mining
• Information Extraction (IE)
The science of searching for information in documents, for the documents themselves, for metadata that describe documents, or for text, sound, images or data within databases, whether relational stand-alone databases or hypertext networked databases such as the Internet or intranets.
• Artificial Intelligence (AI)
Artificial intelligence (AI) is a branch of computer science and engineering that deals with intelligent behavior, learning, and adaptation in machines.
Merits of Text Mining
• A database limits itself to storing a small amount of information, whereas text mining overcomes this limitation
• Extraction of relevant information and relationships from natural-language documents
• Extraction of information from unstructured or semi-structured documents
Applications of Text Mining
Analysis of Market Trends
• Classification Technique
• Information Extraction Technique
Analysis and Screening of Junk Emails
• Classification on the basis of pre-defined frequently occurring items
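Junk-email screening on the basis of pre-defined, frequently occurring items can be sketched in Python as follows. The term list, the 0.2 threshold and the function names are hypothetical choices for illustration; a practical filter would learn its terms and weights from labelled mail.

```python
import re

# Hypothetical pre-defined list of terms that occur frequently in junk mail.
JUNK_TERMS = {"free", "winner", "prize", "offer", "urgent", "click"}


def junk_score(message):
    """Return the fraction of tokens in the message that are junk terms."""
    tokens = re.findall(r"[a-z]+", message.lower())
    if not tokens:
        return 0.0
    return sum(t in JUNK_TERMS for t in tokens) / len(tokens)


def classify(message, threshold=0.2):
    """Label a message 'junk' when its junk-term fraction reaches the threshold."""
    return "junk" if junk_score(message) >= threshold else "ok"


if __name__ == "__main__":
    for mail in ("Click now to claim your free prize",
                 "Meeting notes attached for review"):
        print(classify(mail), "<-", mail)
```

Using a fraction rather than a raw count keeps long legitimate messages that mention one listed term from being flagged.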
Demerits of Text Mining
• Requires an initially trained information system for the initial extraction
• Suitable programs have not yet been defined to analyze text for mining knowledge or information
References
[1] R Baeza-Yates and B Ribeiro-Neto, “Modern Information Retrieval”, ACM Press, New York, 1999.
[2] Ning Zhong, Yuefeng Li and T. Grance, “Effective Pattern Discovery for Text Mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 1, January 2012.
[3] Raymond J Mooney and Un Yong Nahm, “Text Mining with Information Extraction”, Proceedings of the 4th International MIDP Colloquium, pages 141-160, Van Schaik Pub., South Africa, 2005.
[4] M E Califf and R J Mooney, “Relational Learning of Pattern-Match Rules for Information Extraction”, Proceedings of the 16th National Conference on Artificial Intelligence (AAAI-99), pages 328-334, Orlando, FL, July 1999.
[5] D Freitag and N Kushmerick, “Boosted Wrapper Induction”, Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-2000), pages 577-583, Austin, TX, July 2000.
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 

Text Mining Techniques Explained

  • 1. TEXT MINING. Presented by: Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru
  • 2. Outline: Introduction; Data Mining vs. Text Mining; Motivation for Text Mining; I/O Model for Text Mining; Steps for Text Mining; Key Terms in Text Mining; Text Mining Frameworks; Merits of Text Mining; Applications of Text Mining; Demerits of Text Mining; References
  • 3. Introduction Text Mining is a Discovery Text Mining is also referred as Text Data Mining (TDM) and Knowledge Discovery in Textual Database (KDT). Text Mining is used to extract relevant information or knowledge or pattern from different sources that are in unstructured or semi-structured form. Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru-575007
  • 4. Introduction (cont.)
      • Extract and discover knowledge hidden in text automatically.
      • Aid domain experts by automatically: identifying concepts; extracting facts/relations; discovering implicit links; generating hypotheses.
  • 5. Data Mining vs. Text Mining
      • Data Mining: processes data directly. Text Mining: requires linguistic processing or natural language processing (NLP).
      • Data Mining: identifies causal relationships. Text Mining: discovers heretofore unknown information.
      • Data Mining: structured data. Text Mining: semi-structured and unstructured data (text).
      • Data Mining: structured numeric transaction data residing in a relational data warehouse. Text Mining: applications deal with much more diverse and eclectic collections of systems and formats.
  • 6. Motivation for Text Mining
      • Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation).
      • Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
  • 7. Input-Output Model for Text Mining
      • Input: documents
      • Text mining technique
      • Output: patterns, connections, trends
  • 8. Steps for Text Mining
      • Pre-processing the text
      • Applying text mining techniques: summarization, classification, clustering, visualization, information extraction
      • Analyzing the text
  • 9. Key Terms in Text Mining
      • Information Extraction (IE): the science of searching for information in documents, for the documents themselves, for metadata that describe documents, or for text, sound, images, or data within databases, whether relational stand-alone databases or hypertext-networked databases such as the Internet or intranets.
      • Artificial Intelligence (AI): a branch of computer science and engineering that deals with intelligent behavior, learning, and adaptation in machines.
  • 10. Merits of Text Mining
      • Databases are limited to storing a restricted amount of information, whereas text mining overcomes this limitation.
      • Extraction of relevant information and relationships from natural-language documents.
      • Extraction of information from unstructured or semi-structured documents.
  • 11. Applications of Text Mining
      • Analysis of market trends: classification technique; information extraction technique.
      • Analysis and screening of junk emails: classification on the basis of pre-defined, frequently occurring items.
  • 12. Demerits of Text Mining
      • Requires an initial learned information system for the initial extraction.
      • Suitable programs have not yet been defined to analyze text for mining knowledge or information.
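
The pre-processing step from slide 8 and the junk-email screening application from slide 11 can be illustrated with a minimal Python sketch. It is not taken from the slides: the stop-word list, the junk-term set, the threshold, and the function names `preprocess`, `term_frequencies`, and `is_junk` are all assumptions made for illustration, and a real system would likely learn its terms from labeled data rather than hard-code them.

```python
import re
from collections import Counter

# Hypothetical stop words and junk terms, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "for", "on", "you", "your"}
JUNK_TERMS = {"free", "winner", "prize", "cash", "offer"}


def preprocess(text):
    """Pre-processing: lowercase, split into word tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def is_junk(text, threshold=2):
    """Classify a message as junk when pre-defined junk terms occur
    at least `threshold` times in total (an assumed, simplistic rule)."""
    counts = term_frequencies(preprocess(text))
    hits = sum(counts[term] for term in JUNK_TERMS)
    return hits >= threshold


if __name__ == "__main__":
    print(is_junk("Claim your free prize now, winner!"))
    print(is_junk("Minutes of the project meeting attached"))
```

The first message contains three of the assumed junk terms and is flagged; the second contains none and is not. Raising `threshold` trades fewer false alarms for more missed junk messages.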