Advanced Machine Learning for Business Professionals
Big Data, Big Disappointment
1. Big Data, Big Disappointment
A diagnosis and prescription (sort of)
for (somewhat) successful analytics
efforts in medium to large firms in
Mexico
(c) 2015 Jesus Ramos
2. “Big Data has arrived, but big
insights have not”
- “Big data: are we making a big mistake?” Tim Harford. Financial Times.
3. And with all the money Gartner says
we’re to fork over, the question is…
4. In mature businesses, mostly
because…
• False positives are ignored
• Correlation is assumed to imply causation
• We don’t care about sampling
• Machine Learning for all
From “8 Reasons why Big Data projects fail”. Matt Asay.
InformationWeek. 8/7/14
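To see why ignored false positives hurt, here is a minimal sketch of the base-rate arithmetic. The function name and all numbers are hypothetical illustrations, not figures from the cited article: with a rare event (1% of cases), a detector that catches 90% of real cases and wrongly flags 5% of normal ones still produces mostly false alerts.

```python
def share_of_true_alerts(base_rate, sensitivity, false_positive_rate):
    """Fraction of flagged cases that are real (precision), via Bayes' rule."""
    true_alerts = base_rate * sensitivity
    false_alerts = (1.0 - base_rate) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical inputs chosen only for illustration.
precision = share_of_true_alerts(
    base_rate=0.01,            # 1% of cases are real events
    sensitivity=0.90,          # 90% of real events get flagged
    false_positive_rate=0.05,  # 5% of normal cases get flagged too
)
print(f"{precision:.1%} of alerts are real")
```

Under these assumed numbers only about 15% of alerts are real, so a team that acts on every alert is wrong far more often than it is right.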
5. And in the rest of us, because…
We don’t understand what Big
Data is!
So…we need definitions:
6. BD is a 2-part deal
Big Data: Technology for storing and processing large amounts of data
Analytics: The insights gained from such large data
“Without ‘analytics’, Big Data is a sleeping giant!” - me
7. Don’t talk about ‘BD’ w/o the ‘A’
From this slide on, and for the rest of
your professional lives, I urge you to
please add the ‘Analytics’ suffix to the
buzzword ‘Big Data’.
8. Why does this distinction matter?
Big Data
Quality attributes to watch out for:
- Performance
- Fault-tolerance
- Replication
- High Availability
- Integration with current ecosystem
Analytics
Quality attributes to watch out for:
- Read Performance
- Insert Performance
- Integration with Analytical Tools
- In-DB Analytics
9. Why does this distinction matter?
We might end up buying/building
the wrong technology.
10. The purpose of BDA
1. Development of new products
2. Gain operational efficiencies
3. Support decision-making
If our BDA initiative doesn’t touch
these goals, we’re doing it wrong!
11. The right place for BDA within the firm…
In a startup:
[Org chart: CEO/COO, CFO, CTO, and CDO, with BDA present in every area]
Analytics is part of the org’s DNA
12. The right place for BDA within the
firm…
In a mature org:
[Org chart: CEO at the top; CTO, CFO, and COO below; the CDO, with BDA, as staff to the CEO; BD under the CTO]
CEO sponsorship needed to break cultural resistance!
13. The WORST place for BDA within
the firm…
[Org chart: CEO at the top; CTO/CIO, COO, and CFO below; BDA under the CTO/CIO]
Why?
14. Reasons why BDA should not be
born in IT (unless core biz is tech)
1. Asking the wrong questions
2. Lacking the right skills
3. Culture change happens
elsewhere
15. Asking the right questions
Even though IT enables the value chain
through technology, burning operational
questions may be out of our reach,
grasp, or jurisdiction.
16. Lack of the right skills
Forget Drew Conway’s Venn Diagram. The
problem is deeper:
1. IT is a labor of engineering.
2. The fundamental question of engineering is
‘How’.
3. To answer questions we need statistics.
4. The fundamental question of Stats is
‘Why’.
5. When we answer ‘Why’ we gain insight.
17. Lack of the right skills (2)
• Of course, our engineers could go through
training to become statisticians, and when
they do, they are sometimes better at it
than classically-trained statisticians.
• However, this training is long and often requires
a change of mindset to become true Data
Scientists.
18. Culture change happens elsewhere
If tech is neither the core business nor central to
strategy, IT will not have enough ‘gravitas’ to
pull the entire org from hunch-based decision-making
to data-driven decision-making.
19. A case for giving birth to
Analytics in IT
Survey of 200+ data professionals. Closeness to SW dev
correlated negatively with closeness to the business.
When the pale red dot turns into a tight, upward-facing,
dark blue oval, not only will we have software built with
a purpose, but also SW devs turned excellent data analysts.
Source: Entry survey for @TheDataPub meetup
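A negative correlation like the one reported in the survey can be measured with Pearson's r. The sketch below is only an illustration: the survey data is not reproduced in this deck, so the two lists are hypothetical 1-to-5 self-ratings, and `pearson_r` is a helper written for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings, NOT the actual survey responses.
dev_closeness = [5, 4, 4, 2, 1]  # how close each respondent feels to SW dev
biz_closeness = [1, 2, 3, 4, 5]  # how close each respondent feels to the business

r = pearson_r(dev_closeness, biz_closeness)
print(f"r = {r:.2f}")  # negative: the two scores move in opposite directions
```

An r near -1 corresponds to the downward-sloping cloud described on the slide; the hoped-for "upward-facing oval" would show up as an r close to +1.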
20. If you have no choice but to give birth
to BDA in IT…
1. Set up a DWH (if not present).
2. Federate data.
3. Establish data ingestion frequency (must match my
decision-making frequency) & pipeline.
4. Hire the right people with the right skill (and keep
the BI people at bay lest they spread an illness called Reportitis
Operativitis).
5. Seize IT’s presence all across the value chain and
acquire political capital.
6. Address the low-hanging fruit of analytics.
21. If you have no choice but to give birth to BDA in IT…
[Same six steps as slide 20, annotated on the slide with the labels ‘Big Data’ and ‘Analytics’]
22. Where do I get the right people (in
Mexico) ?
1. MSc Data Science – ITAM.
2. MSc Analytic Intelligence – U. Anahuac.
3. BS Applied Maths + MSc Economics/
Econometrics.
4. BS Industrial Engineering + MSc Computer
Science.
5. BS Actuarial Sciences + MSc Computer
Science
23. Where do I get the right people (in
Mexico) ?
• Note that they’re all master’s degrees, so
don’t expect to pay average developer
salaries.
• Industrial Engineering and Economics
appear a lot because those guys know how
to measure processes.
• Note that when we mention Computing, it’s
Computer Science, not Engineering.
24. Takeaways:
• Big Data does nothing without Analytics.
• BDA must deliver 1) new products, 2)
operational efficiency, 3) decision support.
• The right place for BDA is a position of influence.
• BDA living in IT has many drawbacks related to
skill + political capital.
• But IT is in a privileged position to deliver value
through BDA if it blends with the business.
25. Pending discussions:
• Big Data Ethics
• Beware Data Charlatanry!
• Analytics team-building
• Data Science + Software Engineering
• What the Mexican education system
needs to produce data professionals.
A fancy title for saying ‘this is what works for me, and I hope it works for you too’.
Of the 64% of companies that, according to Gartner, would invest in Big Data in 2013, only 30% actually did, and of that 30%, only 120 organizations have reaped the benefits.
If you are thinking that the problem is statistics (or the lack of it), you are on the right track.
…
…
Something worth highlighting: when we talk about analytically mature businesses, we mean the whole spectrum of capital: from startups to Telcel, Walmart, and Target.
We need to take two small steps back and see the whole picture.
They say second parts are never good, but Batman: The Dark Knight is there to prove that they can be.
Big Data is the great big database.
We have to watch quality attributes such as performance, HA, self-maintenance, etc.
…
Analytics is what we do with the data in that great big database.
I invite you, from now on, to always mention analytics when we talk about Big Data.
Here we watch quality attributes that we lose sight of if we only talk about Big Data: that it runs in-database analytics, that it integrates with our analytics platform, and that it is excellent at answering questions.
…
If we don’t talk about Analytics, Big Data will be nothing more than a sleeping giant.
Everything we do in BDA must fall into one or more of these 3 buckets.
In a startup, analytics emanates from the CDO and is practiced by ALL areas! Google and Facebook, although they are not startups, are structured this way. LinkedIn is organized differently, but we won’t get into that.
In a mature firm with large capital and N hierarchical levels, it is better for the CDO to be staff to the CEO, so that the insights generated have an impact across the whole org, overcome resistance to change, and draw on data from the whole org. Note that the CTO is a crucial part of this collaboration, because the great big database lives with him.
The worst place for this initiative to be born is sometimes IT. Why?
Asking the wrong questions
IT may not be close enough to value chain to know its problems.
Lacking the right skills
Analytics=Stats&Math + Domain Knowledge + Programming. IT only has, at best, the latter two. In IT we are engineers, and the fundamental question for engineers is HOW. Here a different profile is required, one whose fundamental question is WHY.
Culture change happens elsewhere
Going from hunch-driven decisions to data driven decisions requires culture change. IT can’t pull entire org.
Set up DWH: this also means making sure the WH meets the quality attributes needed for analytics.
Federate data: If we have data silos, we will fly through the initiative one-eyed and one-armed. It is crucial to federate the information from finance, HR, plant, operations, etc.
Establish data ingestion frequency: BMV makes decisions in milliseconds, while Walmart can make them daily.
Hire the right people: the BI people will want to take part in this; the trouble is that they suffer from Reportitis Operativitis and will hardly formulate questions of value to the business.
Seize IT’s position: IT is in an advantageous position because it touches every business area, so we can set up sensors and start taking metrics on EVERYTHING, and deliver value with a view of the whole org.
Address the low hanging fruit: small, cheap analytics projects with great impact, to earn trust.