Discovering Insights through Text Mining: how and why to analyze large amounts of text
Andrea Villanes - @andreagrr
Big Data Analytics Summit 2016
Lima - Perú
Big Data Analytics Summit Perú
About me
Education
Experience
Other

Every minute…
300,000 tweets
2.5 million posts
200 million messages
#ProcastinandoAndo
*The Data Explosion in 2014 Minute by Minute – Infographic

Text Mining
“…find interesting regularities in large textual datasets” (Fayyad)
…where interesting means non-trivial, hidden, unknown, and potentially useful.

Text Mining Process
10 0 8 9 0 3 3 0
15 5 6 0 9 11 0 1
0 2 25 12 0 9 10 0
1 11 0 5 5 0 5 21
0 6 12 2 0 2 5 3
19 8 2 13 0 0 10 14
15 12 5 3 8 9 5 0
5 0 11 0 10 0 5 8
Terms: Term 1, Term 2, Term 3, Term 4, ..., Term n-2, Term n-1, Term n

Transforming text into a matrix
Data Collection → Text Parsing → Term Vector Weighting

Data Collection
Web crawling: collecting data from the web
APIs: Twitter, TripAdvisor, Facebook
CSV files: surveys, emails, open-ended responses, etc.
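
As a small illustration of the CSV route, the sketch below loads open-ended survey answers with Python's standard `csv` module. The column names (`id`, `response`) and the sample rows are hypothetical and only stand in for a real export.

```python
import csv
import io

# Hypothetical CSV export of a survey. A real file would be opened with
# open(path, newline="", encoding="utf-8") instead of io.StringIO.
SAMPLE_CSV = """id,response
1,"The lid on the fruit cup is hard to open."
2,"The ATM screen is unreadable in the sun."
"""

def load_responses(handle, text_column="response"):
    """Return the non-empty free-text answers from a CSV file-like object."""
    reader = csv.DictReader(handle)
    return [row[text_column].strip() for row in reader if row[text_column].strip()]

documents = load_responses(io.StringIO(SAMPLE_CSV))
print(len(documents), "documents loaded")
```

Each returned string is one document for the parsing step.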

Text Parsing
Text cleaning: remove unnecessary words and eliminate redundancy.
Stop words: remove common words that do not help in discovering the context (in Spanish: el, la, de, los, y, etc.).
Stemming: reduces words to their root. Example: Abrir, Abrir-lo, Abrir-ias, Abrir-as all reduce to Abrir.
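
The two parsing steps can be sketched in a few lines of Python. The stop-word list and the suffix list below are tiny, hypothetical stand-ins chosen to match the slide's example. The suffix stripper is a toy, not a real stemming algorithm.

```python
import re

# Tiny illustrative Spanish stop-word list; real lists are much longer.
STOP_WORDS = {"el", "la", "de", "los", "y", "las", "un", "una", "en", "que"}

# Toy suffixes, longest first. This is NOT a real stemming algorithm.
SUFFIXES = ("ias", "lo", "as")

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-záéíóúñü]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def toy_stem(token):
    """Strip one known suffix, keeping a stem of at least four letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token

tokens = remove_stop_words(tokenize("Abrir la puerta y abrirlo de una vez"))
stems = [toy_stem(t) for t in tokens]
print(stems)
```

In practice a library stemmer, such as the Snowball stemmer shipped with NLTK, would replace `toy_stem`.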

Term Vector Weighting
Term Frequency–Inverse Document Frequency (TF-IDF)
Each word receives a weight based on its frequency in the document (term frequency) and on its frequency across all documents in the collection (document frequency). The weight grows with the term frequency and shrinks as the document frequency grows.
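
A minimal sketch of one common TF-IDF variant follows. The talk does not say which formula it uses, so the choice of tf = count / document length and idf = log(N / df) is an assumption, and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    tf  = count of the term in the document / document length
    idf = log(N / df), where df is the number of documents containing the term
    """
    n_docs = len(documents)
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already tokenized documents.
docs = [
    ["lid", "hard", "open"],
    ["screen", "hard", "read"],
    ["lid", "cut", "finger"],
]
for w in tf_idf(docs):
    print(w)
```

With this variant a term that appears in every document gets a weight of zero, which is the intended effect: it does not help tell documents apart.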

Transforming text into a matrix
Data Collection → Text Parsing → Term Vector Weighting → numeric matrix
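
A minimal sketch of building such a matrix from raw term counts. The layout (one row per document, one column per term) is an assumption, since the slide does not label the axes, and the documents are hypothetical.

```python
from collections import Counter

def build_count_matrix(documents):
    """Return (vocabulary, matrix): one row per document, one column per term."""
    vocabulary = sorted({term for tokens in documents for term in tokens})
    matrix = []
    for tokens in documents:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

# Hypothetical, already tokenized documents.
docs = [
    ["lid", "hard", "open", "lid"],
    ["screen", "hard", "read"],
]
vocab, matrix = build_count_matrix(docs)
print(vocab)
for row in matrix:
    print(row)
```

Replacing the raw counts with weights such as TF-IDF gives the weighted version of the same matrix.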

Final product
A numeric matrix indexed by terms (Term 1, Term 2, Term 3, Term 4, ..., Term n-2, Term n-1, Term n).
Which algorithms can we apply to this matrix?
• Clustering (segmentation)
• Classification
• Word association
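
Clustering and word association both need a way to compare vectors in the matrix. The sketch below uses cosine similarity, a common choice that the talk does not name, on the first four rows of the example matrix, assuming each row is one document.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(matrix, index):
    """Return the index of the row most similar to matrix[index]."""
    scores = [
        (cosine_similarity(matrix[index], row), i)
        for i, row in enumerate(matrix)
        if i != index
    ]
    return max(scores)[1]

# First four rows of the example matrix from the slides.
rows = [
    [10, 0, 8, 9, 0, 3, 3, 0],
    [15, 5, 6, 0, 9, 11, 0, 1],
    [0, 2, 25, 12, 0, 9, 10, 0],
    [1, 11, 0, 5, 5, 0, 5, 21],
]
print(most_similar(rows, 0))
```

A clustering algorithm would group rows whose pairwise similarity is high. Applying the same function to columns compares terms instead of documents.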

Tools
• SAS Enterprise Miner (Text Miner)
• Text parsing, term weighting, LSA, models
• R: Needed <- c("tm", "SnowballC", "RColorBrewer", "ggplot2", "wordcloud", "biclust", "cluster", "igraph", "fpc")
• Text parsing, term weighting, LSA, LDA, NMF, models
• Python: scikit-learn, nltk, numpy, pandas, Beautiful Soup
• Web crawling, text parsing, term weighting, LSA, LDA, NMF, models

Examples of insights using Text Mining

Text Mining Applications #1
1. Analyzing social media data: Facebook & Twitter

Analyzing Facebook data - Clustering
[Figure: clusters labeled Family lovers, Night lovers, Vacation lovers, Beach lovers; values shown: 549, 536, 210, 160; date shown: May 28th]

Analyzing Twitter data - Prediction
Tell me what you tweet, and I will tell you who you are.
Classes: Mom, Teenager, Geek

Class | Correctly predicted | Incorrectly predicted
Teenager | 86% | 14%
Geek | 28% | 72%
Mom | 78% | 22%
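
The talk does not say which classifier produced these numbers. As one possible approach, the sketch below trains a multinomial naive Bayes model on tokenized tweets and reports the share of correct predictions per class. The tweets, labels, and class name `NaiveBayesText` are hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            self.term_counts[label].update(tokens)
        self.vocabulary = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for t in tokens:
                if t in self.vocabulary:  # ignore terms never seen in training
                    count = self.term_counts[label][t]
                    score += math.log((count + 1) / (total + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def per_class_accuracy(model, documents, labels):
    """Return {label: share of documents with that label predicted correctly}."""
    correct, seen = Counter(), Counter()
    for tokens, label in zip(documents, labels):
        seen[label] += 1
        correct[label] += model.predict(tokens) == label
    return {label: correct[label] / seen[label] for label in seen}

# Hypothetical toy tweets, already tokenized.
train_docs = [
    ["homework", "exam", "school"],
    ["school", "party", "exam"],
    ["python", "linux", "code"],
    ["code", "gadget", "linux"],
    ["baby", "diapers", "school"],
    ["baby", "kids", "dinner"],
]
train_labels = ["teenager", "teenager", "geek", "geek", "mom", "mom"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict(["linux", "code"]))
print(per_class_accuracy(model, train_docs, train_labels))
```

Accuracy should be measured on tweets held out from training. The toy example scores the training data only to keep the sketch short.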

Text Mining Applications #2
2. Analyzing open-ended survey responses about complaints regarding everyday products

Dataset description
• Open-ended responses from a survey: “Describe everyday usability problems in any product”
• Number of observations = 384
• Average number of words per response = 182

Sample responses
“A poor design that I have experienced in my everyday life is
the safety lids on fruit cups. The lids on the fruit cups have
this clip on the top that you are supposed to be able to open
with ease. Well when I attempt to open the product either my
fruit spills out of the can from my hard tugging at the pin or I
get cut from the aluminum lid. I really hope parents don't let
their kids open these products by themselves because they
could possible get cut. I believe if the product is to be easy to
open let it be easily accessible to everyone not just grown
ups.”
“A bad design I have encountered is rooms
with light switches on the wall as you walk in
but they do not have a light fixture on the
ceiling. That is like having a door handle on
a wall. no point.”
“ATM machines have snazzy little computer screen printouts
but the problem is that if the sun is shining at your back while
using one the glare makes the screen unreadable. They should
position ATM machines North to South or give you shade.”
“An example of something that drives me crazy are
washers and dryers. I really think they should be
standardized. I get used to the way mine operate, then
when I have to use someone else's washer and dryer, I
have to stand there forever trying to figure out which
button starts the dryer. A good way to solve this would be
to standardize the layout of the controls, so that
manufacturers could still add fancy options, but
consumers would still know which control did what.”

Examples
Analyzing the data using Enterprise Miner

Text Mining Applications #3
3. Dengue detection through newspapers

Analyzing text over time
Known to transmit:
• Dengue
• Yellow fever
• Chikungunya
• Zika
Probability of dengue occurrence in 2010
Source: Bhatt, Samir et al. “The Global Distribution and Burden of Dengue.” Nature
496.7446 (2013): 504–507. PMC.
[Figure: Number of Articles by Month. x-axis: Month; y-axis: Number of Articles (0 to 700). Series: Politics, Prevention, Reported Cases, Total. Legend annotations: Prevention (36%), Reported Cases (33%), Politics (11%).]
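
A chart like this one can be derived from a list of articles tagged with a month and a topic. The sketch below counts articles per month and topic and computes each topic's share of the total. The records are hypothetical, and in practice the topic tag would come from a classification or clustering step.

```python
from collections import Counter, defaultdict

# Hypothetical (month, topic) pairs, one per newspaper article.
articles = [
    ("2016-01", "Prevention"),
    ("2016-01", "Reported Cases"),
    ("2016-02", "Prevention"),
    ("2016-02", "Politics"),
    ("2016-02", "Prevention"),
]

def count_by_month(records):
    """Return {month: Counter} with one count per topic plus a 'Total' entry."""
    table = defaultdict(Counter)
    for month, topic in records:
        table[month][topic] += 1
        table[month]["Total"] += 1
    return dict(table)

def topic_shares(records):
    """Return {topic: fraction of all articles tagged with that topic}."""
    totals = Counter(topic for _, topic in records)
    n = sum(totals.values())
    return {topic: count / n for topic, count in totals.items()}

monthly = count_by_month(articles)
for month in sorted(monthly):
    print(month, dict(monthly[month]))
print(topic_shares(articles))
```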

How to get started with Text Mining?
“Text Mining: Predictive Methods for
Analyzing Unstructured Information”
Sholom M. Weiss and Nitin Indurkhya
“Web Scraping with Python”
Ryan Mitchell
“Natural Language Processing with Python”
Bird and Klein

Skills for becoming a data scientist
Theory: algorithms, statistics, linear algebra
Tools
Visualization
Communication: written, oral

Thank you!
andrea.villanes@gmail.com
www.andreavillanes.com
@andreagrr
www.MentorMeInfo.com
https://www.facebook.com/MentorMeInfo

More Related Content

What's hot

Skylads - Big Data for Telcos
Skylads - Big Data for TelcosSkylads - Big Data for Telcos
Skylads - Big Data for TelcosXavier Litt
 
Xpanse Analytics Platform
Xpanse Analytics PlatformXpanse Analytics Platform
Xpanse Analytics PlatformMichael Keane
 
Intelie's Overview - How much could your company lose in a matter of minutes?
Intelie's Overview - How much could your company lose in a matter of minutes?Intelie's Overview - How much could your company lose in a matter of minutes?
Intelie's Overview - How much could your company lose in a matter of minutes?Intelie
 
Bigdata analysis in supply chain managment
Bigdata analysis in supply chain managmentBigdata analysis in supply chain managment
Bigdata analysis in supply chain managmentKushal Shah
 
Beyond analytics: Prescriptive analytics for the future of your business by Á...
Beyond analytics: Prescriptive analytics for the future of your business by Á...Beyond analytics: Prescriptive analytics for the future of your business by Á...
Beyond analytics: Prescriptive analytics for the future of your business by Á...Big Data Spain
 
Kde jsou limity zákaznické 360°?
 Kde jsou limity zákaznické 360°? Kde jsou limity zákaznické 360°?
Kde jsou limity zákaznické 360°?Taste Medio
 
Predictive Analytics World Berlin 2016 Call for Speakers
Predictive Analytics World Berlin 2016 Call for SpeakersPredictive Analytics World Berlin 2016 Call for Speakers
Predictive Analytics World Berlin 2016 Call for SpeakersDatentreiber
 
Industrial Analytics and Predictive Maintenance 2017 - 2022
Industrial Analytics and Predictive Maintenance 2017 - 2022Industrial Analytics and Predictive Maintenance 2017 - 2022
Industrial Analytics and Predictive Maintenance 2017 - 2022Rising Media Ltd.
 
BDAS-2017 | Deep Neural Networks Para la Detección de Phishing
BDAS-2017 | Deep Neural Networks Para la Detección de PhishingBDAS-2017 | Deep Neural Networks Para la Detección de Phishing
BDAS-2017 | Deep Neural Networks Para la Detección de PhishingBig-Data-Summit
 
Data Science for Finance
Data Science for FinanceData Science for Finance
Data Science for FinanceTheClickReader
 
Social media analytics using Azure Technologies
Social media analytics using Azure TechnologiesSocial media analytics using Azure Technologies
Social media analytics using Azure TechnologiesKoray Kocabas
 
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)Data Driven Innovation
 

What's hot (15)

Skylads - Big Data for Telcos
Skylads - Big Data for TelcosSkylads - Big Data for Telcos
Skylads - Big Data for Telcos
 
Xpanse Analytics Platform
Xpanse Analytics PlatformXpanse Analytics Platform
Xpanse Analytics Platform
 
Intelie's Overview - How much could your company lose in a matter of minutes?
Intelie's Overview - How much could your company lose in a matter of minutes?Intelie's Overview - How much could your company lose in a matter of minutes?
Intelie's Overview - How much could your company lose in a matter of minutes?
 
Bigdata analysis in supply chain managment
Bigdata analysis in supply chain managmentBigdata analysis in supply chain managment
Bigdata analysis in supply chain managment
 
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
 
Beyond analytics: Prescriptive analytics for the future of your business by Á...
Beyond analytics: Prescriptive analytics for the future of your business by Á...Beyond analytics: Prescriptive analytics for the future of your business by Á...
Beyond analytics: Prescriptive analytics for the future of your business by Á...
 
Kde jsou limity zákaznické 360°?
 Kde jsou limity zákaznické 360°? Kde jsou limity zákaznické 360°?
Kde jsou limity zákaznické 360°?
 
Building a Data Driven Business
Building a Data Driven BusinessBuilding a Data Driven Business
Building a Data Driven Business
 
Predictive Analytics World Berlin 2016 Call for Speakers
Predictive Analytics World Berlin 2016 Call for SpeakersPredictive Analytics World Berlin 2016 Call for Speakers
Predictive Analytics World Berlin 2016 Call for Speakers
 
Industrial Analytics and Predictive Maintenance 2017 - 2022
Industrial Analytics and Predictive Maintenance 2017 - 2022Industrial Analytics and Predictive Maintenance 2017 - 2022
Industrial Analytics and Predictive Maintenance 2017 - 2022
 
BDAS-2017 | Deep Neural Networks Para la Detección de Phishing
BDAS-2017 | Deep Neural Networks Para la Detección de PhishingBDAS-2017 | Deep Neural Networks Para la Detección de Phishing
BDAS-2017 | Deep Neural Networks Para la Detección de Phishing
 
Data Science for Finance
Data Science for FinanceData Science for Finance
Data Science for Finance
 
Social media analytics using Azure Technologies
Social media analytics using Azure TechnologiesSocial media analytics using Azure Technologies
Social media analytics using Azure Technologies
 
Smmart for partners
Smmart for partnersSmmart for partners
Smmart for partners
 
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)
How AI will impact Web and Social Media Intelligence - Uljan Sharka (Crystal.io)
 

Viewers also liked

Big y Open Data para las Smart Cities
Big y Open Data para las Smart CitiesBig y Open Data para las Smart Cities
Big y Open Data para las Smart CitiesBig-Data-Summit
 
Lost in Translation: Connecting Data Insights with Marketing Execution
Lost in Translation: Connecting Data Insights with Marketing ExecutionLost in Translation: Connecting Data Insights with Marketing Execution
Lost in Translation: Connecting Data Insights with Marketing ExecutionBig-Data-Summit
 
Estrategias omnicanal para la mejora de los procesos de comunicación y marke...
	Estrategias omnicanal para la mejora de los procesos de comunicación y marke...	Estrategias omnicanal para la mejora de los procesos de comunicación y marke...
Estrategias omnicanal para la mejora de los procesos de comunicación y marke...Big-Data-Summit
 
Data Visualization & Data Storytelling
Data Visualization & Data StorytellingData Visualization & Data Storytelling
Data Visualization & Data StorytellingBig-Data-Summit
 
Big Data en salud: Un mundo de posibilidades
Big Data en salud: Un mundo de posibilidadesBig Data en salud: Un mundo de posibilidades
Big Data en salud: Un mundo de posibilidadesBig-Data-Summit
 
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo Cambiante
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo CambianteBig Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo Cambiante
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo CambianteBig-Data-Summit
 
Convergencia de Analítica con la Experiencia Digital
Convergencia de Analítica con la Experiencia DigitalConvergencia de Analítica con la Experiencia Digital
Convergencia de Analítica con la Experiencia DigitalBig-Data-Summit
 
Explorando los Límites de la Predicción
Explorando los Límites de la PredicciónExplorando los Límites de la Predicción
Explorando los Límites de la PredicciónBig-Data-Summit
 
Data Science: De la Matemática a la Práctica
Data Science: De la Matemática a la PrácticaData Science: De la Matemática a la Práctica
Data Science: De la Matemática a la PrácticaBig-Data-Summit
 
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadas
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadasParadigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadas
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadasBig-Data-Summit
 
Diferencias entre scrum y xp
Diferencias entre scrum y xp Diferencias entre scrum y xp
Diferencias entre scrum y xp deborahgal
 
Automatización de Procesos de Negocios con BPMS de Código Abierto
Automatización de Procesos de Negocios con BPMS de Código AbiertoAutomatización de Procesos de Negocios con BPMS de Código Abierto
Automatización de Procesos de Negocios con BPMS de Código AbiertoJosé Luis Chiquete Valdivieso
 

Viewers also liked (18)

Big y Open Data para las Smart Cities
Big y Open Data para las Smart CitiesBig y Open Data para las Smart Cities
Big y Open Data para las Smart Cities
 
Lost in Translation: Connecting Data Insights with Marketing Execution
Lost in Translation: Connecting Data Insights with Marketing ExecutionLost in Translation: Connecting Data Insights with Marketing Execution
Lost in Translation: Connecting Data Insights with Marketing Execution
 
Estrategias omnicanal para la mejora de los procesos de comunicación y marke...
	Estrategias omnicanal para la mejora de los procesos de comunicación y marke...	Estrategias omnicanal para la mejora de los procesos de comunicación y marke...
Estrategias omnicanal para la mejora de los procesos de comunicación y marke...
 
Data Visualization & Data Storytelling
Data Visualization & Data StorytellingData Visualization & Data Storytelling
Data Visualization & Data Storytelling
 
Big Data en salud: Un mundo de posibilidades
Big Data en salud: Un mundo de posibilidadesBig Data en salud: Un mundo de posibilidades
Big Data en salud: Un mundo de posibilidades
 
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo Cambiante
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo CambianteBig Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo Cambiante
Big Data e Internet de las Cosas: Nuevas Tecnologías para un Mundo Cambiante
 
Convergencia de Analítica con la Experiencia Digital
Convergencia de Analítica con la Experiencia DigitalConvergencia de Analítica con la Experiencia Digital
Convergencia de Analítica con la Experiencia Digital
 
Explorando los Límites de la Predicción
Explorando los Límites de la PredicciónExplorando los Límites de la Predicción
Explorando los Límites de la Predicción
 
Data Science: De la Matemática a la Práctica
Data Science: De la Matemática a la PrácticaData Science: De la Matemática a la Práctica
Data Science: De la Matemática a la Práctica
 
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadas
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadasParadigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadas
Paradigmas de Procesamiento en Big Data: Arquitecturas y Tecnologías aplicadas
 
How to calculate provident fund
How to calculate provident fundHow to calculate provident fund
How to calculate provident fund
 
Manual del administrador de Google Apps
Manual del administrador de Google AppsManual del administrador de Google Apps
Manual del administrador de Google Apps
 
Tratamiento de datos
Tratamiento de datosTratamiento de datos
Tratamiento de datos
 
Diferencias entre scrum y xp
Diferencias entre scrum y xp Diferencias entre scrum y xp
Diferencias entre scrum y xp
 
Programación Orientada a Objetos para Python
Programación Orientada a Objetos para PythonProgramación Orientada a Objetos para Python
Programación Orientada a Objetos para Python
 
Programador Jr. para Python Primera Parte
Programador Jr. para Python Primera ParteProgramador Jr. para Python Primera Parte
Programador Jr. para Python Primera Parte
 
Automatización de Procesos de Negocios con BPMS de Código Abierto
Automatización de Procesos de Negocios con BPMS de Código AbiertoAutomatización de Procesos de Negocios con BPMS de Código Abierto
Automatización de Procesos de Negocios con BPMS de Código Abierto
 
Rup vs. xp
Rup vs. xpRup vs. xp
Rup vs. xp
 

Similar to Descubrimiento de Insights a través de Text Mining: cómo y para qué analizar grandes cantidades de textos

A Survey on Big Data Analytics
A Survey on Big Data AnalyticsA Survey on Big Data Analytics
A Survey on Big Data AnalyticsBHARATH KUMAR
 
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...Amazon Web Services Korea
 
Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online caniceconsulting
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...Kai Wähner
 
Small Investments, Big Returns: Three Successful Data Science Use Cases
Small Investments, Big Returns: Three Successful Data Science Use CasesSmall Investments, Big Returns: Three Successful Data Science Use Cases
Small Investments, Big Returns: Three Successful Data Science Use CasesSense Corp
 
applications and advantages of python
applications and advantages of pythonapplications and advantages of python
applications and advantages of pythonbhavesh lande
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data scienceVipul Kalamkar
 
Data Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkData Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkDavid Chiu
 
Career in Python and data science
Career in Python and data science Career in Python and data science
Career in Python and data science Sagar Hedau
 
K1 embedding big data & analytics into the business to deliver sustainable value
K1 embedding big data & analytics into the business to deliver sustainable valueK1 embedding big data & analytics into the business to deliver sustainable value
K1 embedding big data & analytics into the business to deliver sustainable valueDr. Wilfred Lin (Ph.D.)
 
Big Data Scotland
Big Data ScotlandBig Data Scotland
Big Data ScotlandRay Bugg
 
The Bigger They Are The Harder They Fall
The Bigger They Are The Harder They FallThe Bigger They Are The Harder They Fall
The Bigger They Are The Harder They FallTrillium Software
 
Big Data Tools PowerPoint Presentation Slides
Big Data Tools PowerPoint Presentation SlidesBig Data Tools PowerPoint Presentation Slides
Big Data Tools PowerPoint Presentation SlidesSlideTeam
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementationSandip Tipayle Patil
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data ScienceSanghamitra Deb
 
Optimizing Observability Spend: Metrics
Optimizing Observability Spend: MetricsOptimizing Observability Spend: Metrics
Optimizing Observability Spend: MetricsEric D. Schabell
 
Big data - The next best thing
Big data - The next best thingBig data - The next best thing
Big data - The next best thingBharath Rao
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analyticsCapgemini
 
A study on web analytics with reference to select sports websites
A study on web analytics with reference to select sports websitesA study on web analytics with reference to select sports websites
A study on web analytics with reference to select sports websitesBhanu Prakash
 
big data on science of analytics and innovativeness among udergraduate studen...
big data on science of analytics and innovativeness among udergraduate studen...big data on science of analytics and innovativeness among udergraduate studen...
big data on science of analytics and innovativeness among udergraduate studen...johnmutiso245
 

Similar to Descubrimiento de Insights a través de Text Mining: cómo y para qué analizar grandes cantidades de textos (20)

A Survey on Big Data Analytics
A Survey on Big Data AnalyticsA Survey on Big Data Analytics
A Survey on Big Data Analytics
 
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
 
Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
 
Small Investments, Big Returns: Three Successful Data Science Use Cases
Small Investments, Big Returns: Three Successful Data Science Use CasesSmall Investments, Big Returns: Three Successful Data Science Use Cases
Small Investments, Big Returns: Three Successful Data Science Use Cases
 
applications and advantages of python
applications and advantages of pythonapplications and advantages of python
applications and advantages of python
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 
Data Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkData Analysis - Making Big Data Work
Data Analysis - Making Big Data Work
 
Career in Python and data science
Career in Python and data science Career in Python and data science
Career in Python and data science
 
K1 embedding big data & analytics into the business to deliver sustainable value
K1 embedding big data & analytics into the business to deliver sustainable valueK1 embedding big data & analytics into the business to deliver sustainable value
K1 embedding big data & analytics into the business to deliver sustainable value
 
Big Data Scotland
Big Data ScotlandBig Data Scotland
Big Data Scotland
 
The Bigger They Are The Harder They Fall
The Bigger They Are The Harder They FallThe Bigger They Are The Harder They Fall
The Bigger They Are The Harder They Fall
 
Big Data Tools PowerPoint Presentation Slides
Big Data Tools PowerPoint Presentation SlidesBig Data Tools PowerPoint Presentation Slides
Big Data Tools PowerPoint Presentation Slides
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
 
Optimizing Observability Spend: Metrics
Optimizing Observability Spend: MetricsOptimizing Observability Spend: Metrics
Optimizing Observability Spend: Metrics
 
Big data - The next best thing
Big data - The next best thingBig data - The next best thing
Big data - The next best thing
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analytics
 
A study on web analytics with reference to select sports websites
A study on web analytics with reference to select sports websitesA study on web analytics with reference to select sports websites
A study on web analytics with reference to select sports websites
 
big data on science of analytics and innovativeness among udergraduate studen...
big data on science of analytics and innovativeness among udergraduate studen...big data on science of analytics and innovativeness among udergraduate studen...
big data on science of analytics and innovativeness among udergraduate studen...
 

More from Big-Data-Summit

SafeHomeFace - Sistema de reconocimiento facial.
SafeHomeFace - Sistema de reconocimiento facial.SafeHomeFace - Sistema de reconocimiento facial.
SafeHomeFace - Sistema de reconocimiento facial.Big-Data-Summit
 
Las 10 tendencias principales de BI para el 2018 - Carloz Díaz
Las 10 tendencias principales de BI para el 2018 - Carloz DíazLas 10 tendencias principales de BI para el 2018 - Carloz Díaz
Las 10 tendencias principales de BI para el 2018 - Carloz DíazBig-Data-Summit
 
El big data analytics donde menos te lo esperas - Alex Rayón
El big data analytics donde menos te lo esperas - Alex RayónEl big data analytics donde menos te lo esperas - Alex Rayón
El big data analytics donde menos te lo esperas - Alex RayónBig-Data-Summit
 
Big Data en el sector inmobiliario - Gonzalo Martín
Big Data en el sector inmobiliario - Gonzalo MartínBig Data en el sector inmobiliario - Gonzalo Martín
Big Data en el sector inmobiliario - Gonzalo MartínBig-Data-Summit
 
Modelo Operativo para grandes proyectos de AI - Ignacio Marrero
Modelo Operativo para grandes proyectos de AI - Ignacio MarreroModelo Operativo para grandes proyectos de AI - Ignacio Marrero
Modelo Operativo para grandes proyectos de AI - Ignacio MarreroBig-Data-Summit
 
La evolución de la analítica descriptiva - Diego Aguirre
La evolución de la analítica descriptiva - Diego AguirreLa evolución de la analítica descriptiva - Diego Aguirre
La evolución de la analítica descriptiva - Diego AguirreBig-Data-Summit
 
El dato tiene forma y la forma significado - Josep Curto
El dato tiene forma y la forma significado - Josep CurtoEl dato tiene forma y la forma significado - Josep Curto
El dato tiene forma y la forma significado - Josep CurtoBig-Data-Summit
 
BDAS-2017 | Evolución de Open Data en el desarrollo de las ciudades inteligentes
Discovering Insights through Text Mining: How and Why to Analyze Large Amounts of Text

  • 1. Discovering Insights through Text Mining: How and Why to Analyze Large Amounts of Text. Andrea Villanes - @andreagrr. Big Data Analytics Summit 2016, Lima, Peru
  • 2. Big Data Analytics Summit Perú. About me: Education, Experience, Other
  • 3. Every minute: 300,000 tweets, 2.5 million posts, 200 million messages. #ProcastinandoAndo. *The Data Explosion in 2014 Minute by Minute – Infographic
  • 4. Text Mining: “…find interesting regularities in large textual datasets” (Fayyad), where interesting means non-trivial, hidden, previously unknown, and potentially useful.
  • 5. Text mining process. Example matrix, with labels Term 1, Term 2, Term 3, Term 4, ..., Term n-2, Term n-1, Term n:
        10   0   8   9   0   3   3   0
        15   5   6   0   9  11   0   1
         0   2  25  12   0   9  10   0
         1  11   0   5   5   0   5  21
         0   6  12   2   0   2   5   3
        19   8   2  13   0   0  10  14
        15  12   5   3   8   9   5   0
         5   0  11   0  10   0   5   8
  • 6. Transforming text into a matrix. Steps: Data Collection, Text Parsing, Term Vector Weighting.
  • 7. Data Collection (steps: Data Collection, Text Parsing, Term Vector Weighting). Web crawling: collecting data from the web. APIs: Twitter, Trip Advisor, Facebook. CSV files: surveys, emails, open-ended responses, etc.
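As a minimal sketch of the CSV route, assuming the free text lives in a column named `response` (the column name, the sample rows, and the helper `load_responses` are all hypothetical, not from the talk), the standard library is enough to pull the text column into a list:

```python
import csv
import io

# Hypothetical survey export; the column names and rows are made up for illustration.
SAMPLE_CSV = """id,response
1,"The lid is hard to open, and it cuts"
2,The screen is unreadable in sunlight
"""

def load_responses(file_obj, text_column="response"):
    """Return the free-text column of a CSV file as a list of strings."""
    return [row[text_column] for row in csv.DictReader(file_obj)]

# io.StringIO stands in for a real file opened with open(path, newline="").
responses = load_responses(io.StringIO(SAMPLE_CSV))
```

Each string in `responses` is one document that the later parsing and weighting steps can work on.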
  • 8. Text Parsing (steps: Data Collection, Text Parsing, Term Vector Weighting). Text cleaning: remove unnecessary words and eliminate redundancy. Stop words: remove words that are common but do not help in discovering the context (el, la, de, los, y, etc.). Stemming: converts words to their root. Example: Abrir, Abrir+lo, Abrir+ias, Abrir+as all reduce to Abrir.
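The two parsing steps on this slide can be sketched in plain Python. This is an illustration only: the stop-word set holds just the five words listed on the slide, and `toy_stem` strips only the three suffixes from the "Abrir" example, so it is not a real stemming algorithm (a library stemmer such as a Snowball stemmer would normally be used instead). All function names are hypothetical.

```python
import re

# The five stop words listed on the slide; a real list would be much longer.
STOP_WORDS = {"el", "la", "de", "los", "y"}

# Suffixes from the slide's "Abrir" example only (spelled without accents, as on the slide).
SUFFIXES = ("ias", "as", "lo")

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]

def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def parse(text):
    """Tokenize, remove stop words, then stem."""
    return [toy_stem(t) for t in remove_stop_words(tokenize(text))]

stems = parse("Abrir abrirlo abririas abriras")
```

With this input every token reduces to `abrir`, which is the redundancy removal the slide describes.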
  • 9. Term Vector Weighting (steps: Data Collection, Text Parsing, Term Vector Weighting). Term Frequency–Inverse Document Frequency (TF-IDF): each word is weighted by its frequency in the document (term frequency) and by its frequency across all documents taken together (document frequency). A word that appears in many documents receives a lower weight.
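A small worked version of the weighting, assuming one common variant of the formula: tf is the term count divided by the document length, and idf is log(N / df), where N is the number of documents and df is the number of documents containing the term. Other variants exist, and the slide does not say which one was used. The sample documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up, already tokenized documents.
docs = [
    ["hotel", "playa", "playa"],
    ["hotel", "familia"],
    ["hotel", "noche"],
]
weights = tf_idf(docs)
```

Here `hotel` appears in every document, so its idf is log(3/3) = 0 and its weight is zero everywhere, while `playa` appears in only one document and gets the weight (2/3) · log(3) in the first one.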
  • 10. Transforming text into a matrix. Steps: Data Collection, Text Parsing, Term Vector Weighting. Resulting matrix:
        10   0   8   9   0   3   3   0
        15   5   6   0   9  11   0   1
         0   2  25  12   0   9  10   0
         1  11   0   5   5   0   5  21
         0   6  12   2   0   2   5   3
        19   8   2  13   0   0  10  14
        15  12   5   3   8   9   5   0
         5   0  11   0  10   0   5   8
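One way such a matrix can be built from parsed documents is sketched below. The orientation (rows are documents, columns are terms) is an assumption, since the transcript does not say which way the slide's matrix is laid out; the documents and the helper name are made up.

```python
from collections import Counter

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    vocabulary = sorted({term for doc in docs for term in doc})
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        # Counter returns 0 for terms the document does not contain.
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

# Made-up, already tokenized documents.
docs = [
    ["playa", "sol", "playa"],
    ["familia", "sol"],
    ["noche", "fiesta", "noche", "noche"],
]
vocabulary, matrix = document_term_matrix(docs)
```

The raw counts in `matrix` could then be replaced by TF-IDF weights before any modeling.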
  • 11. Final product, with labels Term 1, Term 2, Term 3, Term 4, ..., Term n-2, Term n-1, Term n:
        10   0   8   9   0   3   3   0
        15   5   6   0   9  11   0   1
         0   2  25  12   0   9  10   0
         1  11   0   5   5   0   5  21
         0   6  12   2   0   2   5   3
        19   8   2  13   0   0  10  14
        15  12   5   3   8   9   5   0
         5   0  11   0  10   0   5   8
    Which algorithms can we apply to this matrix? Clustering (segmentation), classification, word association.
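To make the clustering option concrete, here is a very small k-means sketch over the rows of a numeric matrix. The talk does not say which clustering algorithm was used, so k-means is chosen here purely for illustration; the data, the function names, and the choice of the first k rows as starting centroids are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, iterations=10):
    """Very small k-means: returns one cluster label per row.

    Centroids start at the first k rows (an arbitrary, deterministic choice).
    """
    centroids = [list(row) for row in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: attach each row to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        # Update step: move each centroid to the mean of its member rows.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up rows: two documents dominated by the first term, two by the second.
rows = [[5, 0], [4, 1], [0, 5], [1, 4]]
labels = k_means(rows, k=2)
```

On this toy input the first two rows end up in one cluster and the last two in the other. In practice a library implementation with better initialization would usually be preferred.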
  • 12. Tools. SAS Enterprise Miner (Text Miner): text parsing, term weighting, LSA, models. R, with packages Needed <- c("tm", "SnowballC", "RColorBrewer", "ggplot2", "wordcloud", "biclust", "cluster", "igraph", "fpc"): text parsing, term weighting, LSA, LDA, NMF, models. Python, with scikit-learn, nltk, numpy, pandas, beautiful soup: web crawling, text parsing, term weighting, LSA, LDA, NMF, models.
  • 13. Examples of insights using text mining
  • 14. Text mining applications #1. 1. Analyzing social media data: Facebook & Twitter
  • 15. Analyzing Facebook data - Clustering. Values shown: 549, 536, 210, 160. Cluster labels: Family lovers, Night lovers, Vacation lovers, Beach lovers. May 28th
  • 16. Analyzing Twitter data - Prediction. Tell me what you tweet, and I will tell you who you are
  • 17. Analyzing Twitter data - Prediction. Classes: Mom, Teenager, Geek
  • 18. Analyzing Twitter data - Prediction. Teenager: correctly predicted 86%, incorrectly predicted 14%
  • 19. Analyzing Twitter data - Prediction. Geek: correctly predicted 28%, incorrectly predicted 72%
  • 20. Analyzing Twitter data - Prediction. Mom: correctly predicted 78%, incorrectly predicted 22%
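Per-class figures like the ones on these slides can be computed as sketched below, assuming each percentage is the share of a class's true members that the model labeled correctly (the transcript does not state how the percentages were computed). The labels in the example are made up and do not reproduce the talk's results.

```python
from collections import Counter

def per_class_accuracy(actual, predicted):
    """Return {class: fraction of that class's items predicted correctly}."""
    totals = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: correct[label] / totals[label] for label in totals}

# Made-up labels for illustration only.
actual = ["Mama", "Mama", "Geek", "Geek", "Adolescente", "Adolescente"]
predicted = ["Mama", "Geek", "Mama", "Geek", "Adolescente", "Adolescente"]
rates = per_class_accuracy(actual, predicted)
```

Reporting the rate per class, rather than one overall accuracy, is what exposes a weak class such as Geek on slide 19.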
  • 21. Text mining applications #2. 2. Analyzing open-ended survey responses about complaints regarding everyday products
  • 22. Dataset description. Open-ended responses to a survey: “Describe everyday usability problems in any product”. Number of observations = 384. Average number of words per response = 182
  • 23. Examples of responses:
    “A poor design that I have experienced in my everyday life is the safety lids on fruit cups. The lids on the fruit cups have this clip on the top that you are supposed to be able to open with ease. Well when I attempt to open the product either my fruit spills out of the can from my hard tugging at the pin or I get cut from the aluminum lid. I really hope parents don't let their kids open these products by themselves because they could possible get cut. I believe if the product is to be easy to open let it be easily accessible to everyone not just grown ups.”
    “A bad design I have encountered is rooms with light switches on the wall as you walk in but they do not have a light fixture on the ceiling. That is like having a door handle on a wall. no point.”
    “ATM machines have snazzy little computer screen printouts but the problem is that if the sun is shining at your back while using one the glare makes the screen unreadable. They should position ATM machines North to South or give you shade.”
    “An example of something that drives me crazy are washers and dryers. I really think they should be standardized. I get used to the way mine operate, then when I have to use someone else's washer and dryer, I have to stand there forever trying to figure out which button starts the dryer. A good way to solve this would be to standardize the layout of the controls, so that manufacturers could still add fancy options, but consumers would still know which control did what.”
  • 24. Examples
  • 25. Analyzing data using Enterprise Miner
  • 26. Analyzing data using Enterprise Miner
  • 27. Analyzing data using Enterprise Miner
  • 28. Text mining applications #3. 3. Detecting dengue through newspapers
  • 29. Analyzing text over time. Known to transmit: Dengue, Yellow fever, Chikungunya, Zika. [Map: Probability of dengue occurrence in 2010.] Source: Bhatt, Samir et al. “The Global Distribution and Burden of Dengue.” Nature 496.7446 (2013): 504–507. PMC.
  • 30. Analyzing text over time. [Chart: Number of Articles by Month. Y axis: Number of Articles, 0 to 700. X axis: Month. Series: Prevention, Reported Cases, Total.] Share of total: Prevention (36%)
  • 31. Analyzing text over time. [Chart: Number of Articles by Month. Y axis: Number of Articles, 0 to 700. X axis: Month. Series: Prevention, Reported Cases, Total.] Share of total: Prevention (36%), Reported Cases (33%)
  • 32. Analyzing text over time. [Chart: Number of Articles by Month. Y axis: Number of Articles, 0 to 700. X axis: Month. Series: Politics, Prevention, Reported Cases, Total.] Share of total: Prevention (36%), Reported Cases (33%), Politics (11%)
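The counts behind a chart like this can be sketched as follows, assuming each article has already been assigned a month and a topic (how the talk assigned topics is not stated in the transcript). The article list and function names are made up for illustration.

```python
from collections import Counter

def counts_by_month(articles):
    """Count articles per (month, topic); each article is a (month, topic) pair."""
    return Counter(articles)

def topic_shares(articles):
    """Return {topic: share of all articles}."""
    topics = Counter(topic for _, topic in articles)
    total = sum(topics.values())
    return {topic: n / total for topic, n in topics.items()}

# Made-up (month, topic) pairs; not the talk's data.
articles = [
    ("2010-01", "Prevention"),
    ("2010-01", "Reported Cases"),
    ("2010-02", "Prevention"),
    ("2010-02", "Politics"),
]
monthly = counts_by_month(articles)
shares = topic_shares(articles)
```

`monthly` gives the per-month series that would be plotted, and `shares` gives the overall percentages listed next to the chart.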
  • 33. How to get started with text mining? “Text Mining: Predictive Methods for Analyzing Unstructured Information”, Sholom M. Weiss and Nitin Indurkhya. “Web Scraping with Python”, Ryan Mitchell. “Natural Language Processing with Python”, Bird and Klein
  • 34. Skills to be a data scientist. Theory (algorithms, statistics, linear algebra), tools, visualization, communication (written, oral)
  • 35. Thank you! andrea.villanes@gmail.com www.andreavillanes.com @andreagrr www.MentorMeInfo.com https://www.facebook.com/MentorMeInfo