Taewook Eom
Data Infrastructure Team
SK planet
2014-01-28
Taewook Eom
Data Programmer
Plaster(Planet Master)
of Big Data Infra
Pre-Assessor of Hiring Programmers
Mentor of 101 Startup Korea

Twitter: @taewooke
LinkedIn: http://kr.linkedin.com/in/taewookeom
http://www.flickr.com/photos/oreillyconf/10616622085/
Santa Clara
: Technical

New York
with Cloudera

: Financial, Business

Europe

: Privacy, Government

Boston
: Medical

http://strataconf.com/

by O’Reilly
Web 2.0

: Open, Sharing, Participation

Big Data

: Making Data Work
Change the World with Data.
Data
When hardware became commoditized,
software was valuable.
Now software is being commoditized,
data is valuable.
– Tim O’Reilly, 2011

Data is like the blood of the enterprise.
– Amr Awadallah, CTO at Cloudera, 2013
What is Big Data?
All data that is not a fit for a traditional RDBMS,
whether used for OLTP or Analytics purposes

Big Data Architectural Patterns
http://strataconf.com/stratany2013/public/schedule/detail/30397
Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data
- Gartner, 2011

http://blog.vitria.com/Portals/47881/images/3values-resized-600.png
http://image-store.slidesharecdn.com/ae63030a-3d9b-11e3-9cff-22000a970267-original.jpg
Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS
http://strataconf.com/stratany2013/public/schedule/detail/29968
Data Science

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

http://en.wikipedia.org/wiki/File:DataScienceDisciplines.png
Big Data

http://mappingignorance.org/fx/media/2013/07/Figura-11.jpg

Open Mind!
Big Data

Gartner's 2013 Hype Cycle for Emerging Technologies (2013-08-19)
More than half of
the technical sessions
are presented by
Chinese or Indian speakers

39 of 125 sessions are
sponsored sessions
Big Data: 4 Approaches
Hadoop-based

RDB-based

Search-based

NoSQL
Real-time Processing

Real-time Recommendations for Retail: Architecture, Algorithms, and Design
http://strataconf.com/stratany2013/public/schedule/detail/30217
Real-time Stream Processing
Gathering
: Apache Kafka

Processing
: Apache Storm
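As a rough illustration of the gathering/processing split on this slide, the sketch below uses an in-memory queue as a stand-in for a Kafka topic and a plain function as a stand-in for a Storm processing step. It does not use the Kafka or Storm APIs, and the names `gather`, `process`, and the sample events are hypothetical.

```python
from collections import Counter
from queue import Queue


def gather(events, topic):
    """Gathering step: push raw events onto the queue as they arrive."""
    for event in events:
        topic.put(event)
    topic.put(None)  # sentinel marking the end of the stream in this toy example


def process(topic):
    """Processing step: consume events one at a time and keep running counts."""
    counts = Counter()
    while True:
        event = topic.get()
        if event is None:
            break
        _user, action = event
        counts[action] += 1
    return counts


# Hypothetical sample events: (user, action) pairs.
topic = Queue()
gather([("u1", "click"), ("u2", "view"), ("u1", "click")], topic)
result = process(topic)
print(result["click"], result["view"])
```

The queue decouples the two steps: the gathering side only appends, and the processing side only reads, which is the role a message log plays between collectors and stream processors.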
Querying

Streaming
Search-based
NoSQL
SQL

Stringer/Tez

Shark
… not yet Graph Processing
Big Data Space
No one tool is the right fit for all Big Data problems
Do not be afraid to recommend the right solution
for the problem over the popular solution
To do this, you must be aware of the entire ecosystem

Big Data Architectural Patterns
http://strataconf.com/stratany2013/public/schedule/detail/30397
Practical Performance Analysis and Tuning for Cloudera Impala
http://strataconf.com/stratany2013/public/schedule/detail/30551
Big Data Architectural Patterns
http://strataconf.com/stratany2013/public/schedule/detail/30397
Hadoop and the Relational Data Warehouse – When to Use Which?
http://strataconf.com/stratany2013/public/schedule/detail/30964
Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS
http://strataconf.com/stratany2013/public/schedule/detail/29968
Each speaker is allocated five minutes of presentation time
and is accompanied by 20 presentation slides.
During presentations, each slide is displayed for 15 seconds
and then automatically advanced.
- http://en.wikipedia.org/wiki/Ignite_(event)
http://oreilly.com/pub/pr/2242
Ignite Talks
Hilary: The Most Poisoned Name In US History - Hilary Parker
sudo make me a visualization! - Jeroen Janssens
Design as a Fulcrum for Societal Change: the influence of Jimmyjane on female sexuality - Lisa Green

Spaces in Between: The Transdisciplinary Niche to Type 1 Diabetes Living - Jorge Luna

Why are women better data scientists than men? - Carolyn Martin
Memoirs of a Prolific Moonlighter: A Chronic Writing Disorder…or Insanity? - Matthew Russell
The Data Behind H1B Visas - Melissa Smolensky

Signal Detection Theory: Man vs Machine - Kyle Redinger
Algorithms of Pain - Heather Fenby
Hadoop Playlist - Adam Kawa
Why a Data Community is like a Music Scene - Harlan Harris

A Tale of two Kinds of Startups - Jen van der Meer

http://strataconf.com/stratany2013/public/schedule/detail/32182
Ignite
Signal Detection Theory: Man vs Machine
Co-Founder @VividCortex
Kyle Redinger
http://www.youtube.com/watch?v=Fg6mN-jevds
(5 minutes 6 seconds)
http://www.slideshare.net/realkyleredinger/man-vs-machine-signal-detection-theory-and-big-data
Signal Detection Theory: Man vs Machine

Remove the obvious and look at what is important
Remember: Less is more.
Ignite
A Tale of Two Kinds of Startups
CSO at Luminary Labs
Jen van der Meer
http://www.youtube.com/watch?v=0ooIs4cy5uM
(5 minutes 2 seconds)
http://www.slideshare.net/bettybluegreen/twokindsofstartups
Keynote
Towards Strata 2014
Director of market research at O’Reilly Media
Roger Magoulas
http://www.youtube.com/watch?v=Ytd5VkEgQf8
(5 minutes 26 seconds)
http://strataconf.com/stratany2013/public/schedule/detail/31935

http://www.oreilly.com/data/free/files/stratasurvey.pdf
Towards Strata 2014
Science is fundamentally about data,
but data is not fundamentally about science
Beyond R and Ph.D.s: The Mythology of Data Science Debunked
Douglas Merrill (ZestFinance)
http://www.youtube.com/watch?v=J2sgObXbIWY (8 minutes 9 seconds)
People

A data scientist is a data analyst who lives in California.

– George Roumeliotis, (Intuit)
http://www.anlytcs.com/2014/01/data-science-venn-diagram-v20.html
Data Businessperson: Business person, Leader, Entrepreneur
Data Creative: Artist, Jack-of-All-Trades, Hacker
Data Researcher: Scientist, Researcher, Statistician
Data Engineer: Engineer, Developer

http://datacommunitydc.org/blog/2012/08/data-scientists-survey-results-teaser/

http://cdn.oreillystatic.com/oreilly/radarreport/0636920029014/Analyzing_the_Analyzers.pdf
Scientists think they can code,
software engineers think they are scientists.
Team them up so they collaborate.

– Scott Sorenson (Ancestry.com)

Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop
How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce
http://strataconf.com/stratany2013/public/schedule/detail/30707
Data scientists spend their lives as data janitors
instead of leveraging their skills

– Wes McKinney (DataPad)

Building More Productive Data Science and Analytics Workflows
Keynote
Is Bigger Really Better?
Predictive Analytics
with Fine-grained Behavior Data
Professor at the NYU Stern School of Business
Foster Provost
http://www.youtube.com/watch?v=1jzMiAfLH2c
(10 minutes 16 seconds)
http://strataconf.com/stratany2013/public/schedule/detail/31685
Is Bigger Really Better?
Predictive Analytics with Fine-grained Behavior Data

Predictive does not mean actionable.

– Scott Sorenson (Ancestry.com)

Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop
More data gives you more precision, not more prediction.
Using multiple datasets to reduce errors when measuring values.
- Ravi Iyer (Ranker.com)

Using Graphs of Data to Understand your Customers, Users, and Employees
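The point about using multiple datasets to reduce measurement error can be illustrated with a small simulation. This is only a sketch under stated assumptions: each source reports the true value plus independent Gaussian noise, and the numbers (`true_value`, `noise`, the number of sources) are made up for the example rather than taken from the talk.

```python
import random
import statistics


def measurement_error(n_sources, trials=2000, true_value=10.0, noise=1.0, seed=0):
    """Mean absolute error when averaging n_sources noisy readings of true_value."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Each source is an independent, equally noisy reading (an assumption).
        readings = [rng.gauss(true_value, noise) for _ in range(n_sources)]
        errors.append(abs(statistics.fmean(readings) - true_value))
    return statistics.fmean(errors)


one = measurement_error(1)
sixteen = measurement_error(16)
print(f"1 source: {one:.3f}, 16 sources: {sixteen:.3f}")
```

Under these assumptions the error shrinks roughly with the square root of the number of sources. The estimate of the current value gets tighter, but nothing here forecasts a future value, which is the distinction the quote draws.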
Keynote
Big Impact from Big Data
Head of Analytics at Facebook
Ken Rudin
http://www.youtube.com/watch?v=RJFwsZwTBgg
(11 minutes 57 seconds)
http://strataconf.com/stratany2013/public/schedule/detail/31903
Big Impact from Big Data
Hadoop is a hammer,
but you need other tools along with it.

Designing Your Data-Centric Organization
Josh Klahr (Pivotal)

http://www.youtube.com/watch?v=D86udfrVzrI (12 minutes)
Big Impact from Big Data

The way you organize information
depends on the question
you intend to ask of it.

- Richard Saul Wurman
Building a Data Platform
HaDump

: Loading data into Hadoop
for no reason.

Data Science Without a Scientist
http://strataconf.com/stratany2013/public/schedule/detail/31801
Big Impact from Big Data

Technical people still don't understand the business needs of business people!
Business people don't know what's a table.

- Anurag Tandon (MicroStrategy)

Inject Big Data into your Corporate DNA: Enable Every Employee to Make Data Driven Decisions
Ask the Right Questions
Organizations already have people who know their own data
better than mystical data scientists.
Learning Hadoop is easier than learning the company’s business.
- Gartner, 2012

Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS
http://strataconf.com/stratany2013/public/schedule/detail/29968
Non-linear Storytelling: Towards New Methods and Aesthetics for Data Narrative
http://strataconf.com/stratany2013/public/schedule/detail/30207
Every Soldier is a Sensor: Countering Corruption in Afghanistan
http://strataconf.com/stratany2013/public/schedule/detail/30828
Big Impact from Big Data
Value of Data
Usable < Useful < Actionable
with Impact

If you can't answer "so what?",
you only have facts, not insight.
- Baron Schwartz (VividCortex Inc)
Making Big Data Small

Descriptive (Easy)
Predictive (Medium)
Prescriptive (Hard)

What happened?
What will happen?
What should we do about it?

Hadoop & Data Science for the Enterprise
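The three levels above can be made concrete with a toy example. This is a hedged sketch, not anything shown in the talk: the sales figures, the linear-trend forecast, and the reorder rule are all invented for illustration.

```python
from statistics import fmean

# Made-up weekly sales figures, for illustration only.
sales = [100, 104, 109, 115, 118]


def describe(values):
    """Descriptive: what happened? Summarise the observed values."""
    return {"mean": fmean(values), "last": values[-1]}


def predict_next(values):
    """Predictive: what will happen? Extend a least-squares linear trend one step."""
    n = len(values)
    xs = range(n)
    x_mean, y_mean = fmean(xs), fmean(values)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * n


def prescribe(forecast, stock):
    """Prescriptive: what should we do? Toy rule: reorder if the forecast exceeds stock."""
    return "reorder" if forecast > stock else "hold"


forecast = predict_next(sales)
print(describe(sales), round(forecast, 1), prescribe(forecast, stock=120))
```

Each step needs more than the one before it: the description needs only the data, the prediction needs a model, and the prescription also needs a business rule (here the hypothetical `stock` threshold).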
The Future of Hadoop
: What Happened
& What's Possible?
Co-Founder of Hadoop
Doug Cutting
http://www.youtube.com/watch?v=_WwuZI6AhN8
(14 minutes 41 seconds)
http://strataconf.com/stratany2013/public/schedule/detail/31591

Big Data is the first industry that was created
by open source.

- Jack Norris (MapR Technologies)
Separating Hadoop Myths from Reality

Hadoop is the kernel of the OS for data.
Hadoop's Impact on the Future of Data Management
Mike Olson (Cloudera)

http://www.youtube.com/watch?v=puHS2JNKgRM
http://strataconf.com/stratany2013/public/schedule/detail/31380
Single
: S/W & H/W system
: security model
: management model
: metadata model
: audit model
: resource management model

Common
: storage & schema
http://www.slideshare.net/cloudera/enterprise-data-hub-the-next-big-thing-in-big-data
Unifying Your Data Management Platform with Hadoop: Batch and Real-time Machine Data Ingest, Alerts, and Analytics
http://strataconf.com/stratany2013/public/schedule/detail/30282
Last generation of data management is not sufficient
More copies, representations, transformations increase risk
Index once and reuse across workloads, lifecycle
NoSQL: indexing and updates for interactive apps
Hadoop: staging, persistence, and analytics

Data Governance for Regulated Industries Using Hadoop
http://strataconf.com/stratany2013/public/schedule/detail/30738
Data Intelligence
Rethink How You See Data

Sharmila Shahani-Mulligan (ClearStory Data)

http://www.youtube.com/watch?v=07hGulTOZGk (9 minutes 6 seconds)
http://strataconf.com/stratany2013/public/schedule/detail/31742
The Data Availability Problem

[Diagram labels: Question, Access, Sampling, Modeling, Loading, Analysis & Discovery, Insight]

Data Prep – too slow!

Information Supply Chain
Introducing a New Way to Interact with Insight
http://strataconf.com/stratany2013/public/schedule/detail/31743

Presentation
Running Non-MapReduce Big Data applications on Apache Hadoop
http://strataconf.com/stratany2013/public/schedule/detail/30755
Apache HBase for Architects
http://strataconf.com/stratany2013/public/schedule/detail/30619
What’s Next for Apache HBase: Multi-tenancy, Predictability, and Extensions.
http://strataconf.com/stratany2013/public/schedule/detail/30857
Securing the Apache Hadoop Ecosystem
http://strataconf.com/stratany2013/public/schedule/detail/30302
An Introduction to the Berkeley Data Analytics Stack With Spark, Spark Streaming, Shark, Tachyon, and BlinkDB
http://strataconf.com/stratany2013/public/schedule/detail/30959
Schema
Information does not exist until a schema is defined
and data is stored in a relational database

- anonymous

Building a Data Platform
http://strataconf.com/stratany2013/public/schedule/detail/31400
Lessons Learned From A Decade’s Worth of Big Data At The U.S. National Security Agency (NSA)
http://strataconf.com/stratany2013/public/schedule/detail/30913
Managing a Rapidly Evolving Analytics Pipeline
http://strataconf.com/stratany2013/public/schedule/detail/30635
Stringer/Tez

Shark

SQL on/in Hadoop/HBase Solutions

Perception is Key: Telescopes, Microscopes and Data
http://strataconf.com/strataeu2013/public/schedule/detail/32351
All SQL on Hadoop Solutions are
Missing the Point of Hadoop
Every Solution makes you define a schema

- SQL(Structured Query Language) is expressed over an assumed schema

Major reasons why Hadoop has taken off include:

- Ability to load data without defining a schema
- Process data using schema-on-read instead of first defining a schema

Hadoop contains a lot of:

- Raw, granular data sets with potentially inconsistent schemas
- Data sets in JSON, key-value, and other self-describing (non-relational) models
designed for schema-on-read processing

SQL on Hadoop solutions that make you first define a schema are missing
a major part of Hadoop’s usage patterns

Flexible Schema and the End of ETL
http://strataconf.com/stratany2013/public/schedule/detail/31868
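A minimal sketch of the schema-on-read idea described above, assuming raw JSON lines with inconsistent fields. The sample records and the helper `read_with_schema` are hypothetical and only illustrate the pattern. They do not correspond to any particular SQL-on-Hadoop product.

```python
import json

# Raw records as they might land in storage: no schema declared up front,
# and the fields differ from record to record (hypothetical sample data).
raw_lines = [
    '{"user": "a", "event": "click", "ts": 1}',
    '{"user": "b", "event": "purchase", "amount": 9.5}',
    '{"uid": "c", "event": "click"}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep the requested fields, default missing ones to None."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


# Two different "schemas" over the same stored data, chosen per question.
clicks = [r for r in read_with_schema(raw_lines, ["user", "event"]) if r["event"] == "click"]
revenue = sum(r["amount"] or 0 for r in read_with_schema(raw_lines, ["amount"]))
print(clicks, revenue)
```

Loading never fails on the third record even though it uses `uid` instead of `user`. The inconsistency only shows up, as a `None`, in queries that ask for that field. With schema-on-write, the same record would have to be fixed or rejected before it could be stored.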
Lessons Learned
Hadoop Adventures At Spotify
http://strataconf.com/stratany2013/public/schedule/detail/30570
Quick prototyping is the fastest way to internal advocacy. Ship It!
Cloud == Speed
We don’t always need a complicated solution. KISS
Play to your differentiating strengths. Experience >> Data
Bias towards impact.
It Takes a Village
EASE!! (Emulate, Analyze, Scale, Evaluate)
How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce
http://strataconf.com/stratany2013/public/schedule/detail/30707

Prototyping is key to overcoming resistance to change
Technical architecture is heavily influenced by people organization
Developing a team of experienced Hadoop users can often be done
using internal employees
A culture of experimentation and innovation yields the best result
Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop
http://strataconf.com/stratany2013/public/schedule/detail/30499
Questions?
SELECT questions FROM audience;
References
Strata Conference + Hadoop World 2013 Keynotes & Interviews

http://www.youtube.com/playlist?list=PL055Epbe6d5ZtziVAooUC04i1hL_Z9Xvk

Slides & Video

http://strataconf.com/stratany2013/public/schedule/proceedings

Tweets

https://twitter.com/search?q=%23strataconf #strataconf
How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce
http://strataconf.com/stratany2013/public/schedule/detail/30707
http://nordstrom.github.io/stratanyc/
http://complexdiagrams.com/properties
Four Pillars of Visualization
http://strataconf.com/stratany2013/public/schedule/detail/31182
Building a production machine learning infrastructure
http://www.slideshare.net/joshwills/production-machine-learninginfrastructure
Text Analytics at Scale: Listening to 45 Million Customers
http://strataconf.com/stratany2013/public/schedule/detail/30757
Words + Numbers = Insights


More Related Content

What's hot

Informatics Transform : Re-engineering Libraries for the Data Decade
Informatics Transform : Re-engineering Libraries for the Data DecadeInformatics Transform : Re-engineering Libraries for the Data Decade
Informatics Transform : Re-engineering Libraries for the Data DecadeLiz Lyon
 
GIS and Asset Management Moving to the Future :
GIS and Asset Management  Moving to the Future : GIS and Asset Management  Moving to the Future :
GIS and Asset Management Moving to the Future : Symphony3
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Big Data Spain
 
State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012Alan Said
 
Ethics in Data Science and Machine Learning
Ethics in Data Science and Machine LearningEthics in Data Science and Machine Learning
Ethics in Data Science and Machine LearningHJ van Veen
 
Making the invisible visible. Managing the digital footprint of development p...
Making the invisible visible. Managing the digital footprint of development p...Making the invisible visible. Managing the digital footprint of development p...
Making the invisible visible. Managing the digital footprint of development p...UNDP Eurasia
 
An Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceAn Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceWesley Eldridge
 
Cloud Analytics - Using cloud based services to analyse big data
Cloud Analytics - Using cloud based services to analyse big dataCloud Analytics - Using cloud based services to analyse big data
Cloud Analytics - Using cloud based services to analyse big dataDavid Parsons
 
Big Data v. Small data - Rules to thumb for 2015
Big Data v. Small data - Rules to thumb for 2015Big Data v. Small data - Rules to thumb for 2015
Big Data v. Small data - Rules to thumb for 2015Visart
 
Data and Ethics: Why Data Science Needs One
Data and Ethics: Why Data Science Needs OneData and Ethics: Why Data Science Needs One
Data and Ethics: Why Data Science Needs OneTim Rich
 
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...J T "Tom" Johnson
 
Internet of Things: Luxury for the Rich or Sustainable Equity for All?
Internet of Things: Luxury for the Rich or Sustainable Equity for All?Internet of Things: Luxury for the Rich or Sustainable Equity for All?
Internet of Things: Luxury for the Rich or Sustainable Equity for All?The Transformation Society
 
Maps and data esri health care 2012
Maps and data   esri health care 2012Maps and data   esri health care 2012
Maps and data esri health care 2012J T "Tom" Johnson
 
The Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingThe Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingUniversity of Washington
 
EuroIA 2012 highlights
EuroIA 2012 highlightsEuroIA 2012 highlights
EuroIA 2012 highlightsDimiter Simov
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Geoffrey Fox
 
On Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebOn Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebJames Hendler
 
AI - Artificial Intelligence - Implications for Libraries
AI - Artificial Intelligence - Implications for LibrariesAI - Artificial Intelligence - Implications for Libraries
AI - Artificial Intelligence - Implications for LibrariesBrian Pichman
 

What's hot (20)

Informatics Transform : Re-engineering Libraries for the Data Decade
Informatics Transform : Re-engineering Libraries for the Data DecadeInformatics Transform : Re-engineering Libraries for the Data Decade
Informatics Transform : Re-engineering Libraries for the Data Decade
 
GIS and Asset Management Moving to the Future :
GIS and Asset Management  Moving to the Future : GIS and Asset Management  Moving to the Future :
GIS and Asset Management Moving to the Future :
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
 
State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012
 
Ethics in Data Science and Machine Learning
Ethics in Data Science and Machine LearningEthics in Data Science and Machine Learning
Ethics in Data Science and Machine Learning
 
Making the invisible visible. Managing the digital footprint of development p...
Making the invisible visible. Managing the digital footprint of development p...Making the invisible visible. Managing the digital footprint of development p...
Making the invisible visible. Managing the digital footprint of development p...
 
An Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceAn Obligatory Introduction to Data Science
An Obligatory Introduction to Data Science
 
Cloud Analytics - Using cloud based services to analyse big data
Cloud Analytics - Using cloud based services to analyse big dataCloud Analytics - Using cloud based services to analyse big data
Cloud Analytics - Using cloud based services to analyse big data
 
Big Data v. Small data - Rules to thumb for 2015
Big Data v. Small data - Rules to thumb for 2015Big Data v. Small data - Rules to thumb for 2015
Big Data v. Small data - Rules to thumb for 2015
 
Data and Ethics: Why Data Science Needs One
Data and Ethics: Why Data Science Needs OneData and Ethics: Why Data Science Needs One
Data and Ethics: Why Data Science Needs One
 
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...
IRE "Better Watchdog" workshop presentation "Data: Now I've got it, what do I...
 
Internet of Things: Luxury for the Rich or Sustainable Equity for All?
Internet of Things: Luxury for the Rich or Sustainable Equity for All?Internet of Things: Luxury for the Rich or Sustainable Equity for All?
Internet of Things: Luxury for the Rich or Sustainable Equity for All?
 
Rogers Aestheticizing Google Critique
Rogers Aestheticizing Google CritiqueRogers Aestheticizing Google Critique
Rogers Aestheticizing Google Critique
 
Maps and data esri health care 2012
Maps and data   esri health care 2012Maps and data   esri health care 2012
Maps and data esri health care 2012
 
The Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingThe Other HPC: High Productivity Computing
The Other HPC: High Productivity Computing
 
EuroIA 2012 highlights
EuroIA 2012 highlightsEuroIA 2012 highlights
EuroIA 2012 highlights
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
 
Digital reality
Digital realityDigital reality
Digital reality
 
On Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebOn Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the Web
 
AI - Artificial Intelligence - Implications for Libraries
AI - Artificial Intelligence - Implications for LibrariesAI - Artificial Intelligence - Implications for Libraries
AI - Artificial Intelligence - Implications for Libraries
 

Viewers also liked

TOC 2011: Content as Application, presented by Thane Kerner
TOC 2011: Content as Application, presented by Thane KernerTOC 2011: Content as Application, presented by Thane Kerner
TOC 2011: Content as Application, presented by Thane KernerSilverchair
 
TOC 2011: Content as Application, presented by Scott Grillo
TOC 2011: Content as Application, presented by Scott GrilloTOC 2011: Content as Application, presented by Scott Grillo
TOC 2011: Content as Application, presented by Scott GrilloSilverchair
 
Strata Conference NYC 2013
Strata Conference NYC 2013Strata Conference NYC 2013
Strata Conference NYC 2013Taewook Eom
 
Extreme Web Performance for Mobile Devices
Extreme Web Performance for Mobile Devices Extreme Web Performance for Mobile Devices
Extreme Web Performance for Mobile Devices Maximiliano Firtman
 
Costa Pacifica in Baler, Aurora
Costa Pacifica in Baler, AuroraCosta Pacifica in Baler, Aurora
Costa Pacifica in Baler, AuroraClaire Algarme
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheBlaze Software Inc.
 
Continuous Delivery in Financial Trading at IG
Continuous Delivery in Financial Trading at IGContinuous Delivery in Financial Trading at IG
Continuous Delivery in Financial Trading at IGDavid Genn
 
Velocity 2015-tim-prendergast-continuous-security-the-devops-way
Velocity 2015-tim-prendergast-continuous-security-the-devops-wayVelocity 2015-tim-prendergast-continuous-security-the-devops-way
Velocity 2015-tim-prendergast-continuous-security-the-devops-wayEvident.io
 
Is there such a thing as a good business model for publishing these days?
Is there such a thing as a good business model  for publishing these days?Is there such a thing as a good business model  for publishing these days?
Is there such a thing as a good business model for publishing these days?Louis Rosenfeld
 
Can you wireframe 'Delightful'?
Can you wireframe 'Delightful'?Can you wireframe 'Delightful'?
Can you wireframe 'Delightful'?Ben Tollady
 
We Are Killing Serendipity
We Are Killing SerendipityWe Are Killing Serendipity
We Are Killing SerendipitySchneider, Mike
 
Forensic Tools for In-Depth Performance Investigations
Forensic Tools for In-Depth Performance InvestigationsForensic Tools for In-Depth Performance Investigations
Forensic Tools for In-Depth Performance InvestigationsNicholas Jansma
 
Locked Out in London (and tweeting about it)
Locked Out in London (and tweeting about it)Locked Out in London (and tweeting about it)
Locked Out in London (and tweeting about it)Sylvain Carle
 
What You Need to Know About Email Authentication
What You Need to Know About Email AuthenticationWhat You Need to Know About Email Authentication
What You Need to Know About Email AuthenticationKurt Andersen
 
TOC 2011: Content as Application, presented by Reid Sherline
TOC 2011: Content as Application, presented by Reid SherlineTOC 2011: Content as Application, presented by Reid Sherline
TOC 2011: Content as Application, presented by Reid SherlineSilverchair
 
Case Studies: Harnessing Speed for Competitive Advantage
Case Studies: Harnessing Speed for Competitive AdvantageCase Studies: Harnessing Speed for Competitive Advantage
Case Studies: Harnessing Speed for Competitive AdvantageVMware Tanzu
 
Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop Guy Harrison
 
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]How slow load times hurt UX (and what you can do about it) [FluentConf 2016]
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]Tammy Everts
 

Viewers also liked (20)

TOC 2011: Content as Application, presented by Thane Kerner
TOC 2011: Content as Application, presented by Thane KernerTOC 2011: Content as Application, presented by Thane Kerner
TOC 2011: Content as Application, presented by Thane Kerner
 
TOC 2011: Content as Application, presented by Scott Grillo
TOC 2011: Content as Application, presented by Scott GrilloTOC 2011: Content as Application, presented by Scott Grillo
TOC 2011: Content as Application, presented by Scott Grillo
 
Strata Conference NYC 2013
Strata Conference NYC 2013Strata Conference NYC 2013
Strata Conference NYC 2013
 
Extreme Web Performance for Mobile Devices
Extreme Web Performance for Mobile Devices Extreme Web Performance for Mobile Devices
Extreme Web Performance for Mobile Devices
 
Costa Pacifica in Baler, Aurora
Costa Pacifica in Baler, AuroraCosta Pacifica in Baler, Aurora
Costa Pacifica in Baler, Aurora
 
Pacifica Affiliates Program
Pacifica Affiliates ProgramPacifica Affiliates Program
Pacifica Affiliates Program
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
 
Continuous Delivery in Financial Trading at IG
Continuous Delivery in Financial Trading at IGContinuous Delivery in Financial Trading at IG
Continuous Delivery in Financial Trading at IG
 
Velocity 2015-tim-prendergast-continuous-security-the-devops-way
Velocity 2015-tim-prendergast-continuous-security-the-devops-wayVelocity 2015-tim-prendergast-continuous-security-the-devops-way
Velocity 2015-tim-prendergast-continuous-security-the-devops-way
 
Is there such a thing as a good business model for publishing these days?
Is there such a thing as a good business model  for publishing these days?Is there such a thing as a good business model  for publishing these days?
Is there such a thing as a good business model for publishing these days?
 
Can you wireframe 'Delightful'?
Can you wireframe 'Delightful'?Can you wireframe 'Delightful'?
Can you wireframe 'Delightful'?
 
We Are Killing Serendipity
We Are Killing SerendipityWe Are Killing Serendipity
We Are Killing Serendipity
 
Forensic Tools for In-Depth Performance Investigations
Forensic Tools for In-Depth Performance InvestigationsForensic Tools for In-Depth Performance Investigations
Forensic Tools for In-Depth Performance Investigations
 
Locked Out in London (and tweeting about it)
Locked Out in London (and tweeting about it)Locked Out in London (and tweeting about it)
Locked Out in London (and tweeting about it)
 
What You Need to Know About Email Authentication
What You Need to Know About Email AuthenticationWhat You Need to Know About Email Authentication
What You Need to Know About Email Authentication
 
TOC 2011: Content as Application, presented by Reid Sherline
TOC 2011: Content as Application, presented by Reid SherlineTOC 2011: Content as Application, presented by Reid Sherline
TOC 2011: Content as Application, presented by Reid Sherline
 
Case Studies: Harnessing Speed for Competitive Advantage
Case Studies: Harnessing Speed for Competitive AdvantageCase Studies: Harnessing Speed for Competitive Advantage
Case Studies: Harnessing Speed for Competitive Advantage
 
Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop
 
Advanced Sqoop
Advanced Sqoop Advanced Sqoop
Advanced Sqoop
 
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]How slow load times hurt UX (and what you can do about it) [FluentConf 2016]
How slow load times hurt UX (and what you can do about it) [FluentConf 2016]
 

Strata Conference NYC 2013 Full Version

  • 1. Taewook Eom Data Infrastructure Team SK planet 2014-01-28
  • 2. Taewook Eom Data Programmer Plaster(Planet Master) of Big Data Infra Pre-Assessor of Hiring Programmers Mentor of 101 Startup Korea Twitter: @taewooke LinkedIn: http://kr.linkedin.com/in/taewookeom http://www.flickr.com/photos/oreillyconf/10616622085/
  • 3. Santa Clara: Technical; New York with Cloudera: Financial, Business; Europe: Privacy, Government; Boston: Medical. http://strataconf.com/ by O’Reilly. Web 2.0: Open, Sharing, Participation. Big Data: Making Data Work. Change the World with Data.
  • 4. Data When hardware became commoditized, software was valuable. Now software being commoditized, data is valuable. – Tim O’Reilly, 2011 Data is like the blood of the enterprise. – Amr Awadallah, CTO at Cloudera, 2013
  • 5. What is Big Data? All data that is not a fit for a traditional RDBMS, whether used for OLTP or Analytics purposes Big Data Architectural Patterns http://strataconf.com/stratany2013/public/schedule/detail/30397
  • 6. Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data - Gartner, 2011 http://blog.vitria.com/Portals/47881/images/3values-resized-600.png
  • 8. Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS http://strataconf.com/stratany2013/public/schedule/detail/29968
  • 11. Big Data Gartner's 2013 Hype Cycle for Emerging Technologies (2013-08-19)
  • 12. More than half of the technical sessions are presented by Chinese or Indian speakers. 39 of 125 sessions are sponsored sessions.
  • 13. Big Data: 4 Approaches Hadoop-based RDB-based Search-based NoSQL
  • 14. Real-time Processing Real-time Recommendations for Retail: Architecture, Algorithms, and Design http://strataconf.com/stratany2013/public/schedule/detail/30217
  • 16. … not yet Graph Processing
  • 17. Big Data Space: No one tool is the right fit for all Big Data problems. Do not be afraid to recommend the right solution for the problem over the popular solution. To do this, you must be aware of the entire ecosystem. Big Data Architectural Patterns http://strataconf.com/stratany2013/public/schedule/detail/30397
  • 18. Practical Performance Analysis and Tuning for Cloudera Impala http://strataconf.com/stratany2013/public/schedule/detail/30551
  • 19. Big Data Architectural Patterns http://strataconf.com/stratany2013/public/schedule/detail/30397
  • 20. Hadoop and the Relational Data Warehouse – When to Use Which? http://strataconf.com/stratany2013/public/schedule/detail/30964
  • 21. Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS http://strataconf.com/stratany2013/public/schedule/detail/29968
  • 22. Each speaker is allocated five minutes of presentation time and is accompanied by 20 presentation slides. During presentations, each slide is displayed for 15 seconds and then automatically advanced. - http://en.wikipedia.org/wiki/Ignite_(event) http://oreilly.com/pub/pr/2242
  • 23. Ignite Talks: Hilary: The Most Poisoned Name In US History - Hilary Parker; sudo make me a visualization! - Jeroen Janssens; Design as a Fulcrum for Societal Change: the influence of Jimmyjane on female sexuality - Lisa Green; Spaces in Between: The Transdisciplinary Niche to Type 1 Diabetes Living - Jorge Luna; Why are women better data scientists than men? - Carolyn Martin; Memoirs of a Prolific Moonlighter: A Chronic Writing Disorder…or Insanity? - Matthew Russell; The Data Behind H1B Visas - Melissa Smolensky; Signal Detection Theory: Man vs Machine - Kyle Redinger; Algorithms of Pain - Heather Fenby; Hadoop Playlist - Adam Kawa; Why a Data Community is like a Music Scene - Harlan Harris; A Tale of two Kinds of Startups - Jen van der Meer. http://strataconf.com/stratany2013/public/schedule/detail/32182
  • 24. Ignite Signal Detection Theory: Man vs Machine Co-Founder @VividCortex Kyle Redinger http://www.youtube.com/watch?v=Fg6mN-jevds (5 minutes 6 seconds) http://www.slideshare.net/realkyleredinger/man-vs-machine-signal-detection-theory-and-big-data
  • 25. Signal Detection Theory: Man vs Machine Remove the obvious and look at what is important Remember: Less is more.
  • 26. Ignite A Tale of Two Kinds of Startups CSO at Luminary Labs Jen van der Meer http://www.youtube.com/watch?v=0ooIs4cy5uM (5 minutes 2 seconds) http://www.slideshare.net/bettybluegreen/twokindsofstartups
  • 29. Keynote Towards Strata 2014 Director of market research at O’Reilly Media Roger Magoulas http://www.youtube.com/watch?v=Ytd5VkEgQf8 (5 minutes 26 seconds) http://strataconf.com/stratany2013/public/schedule/detail/31935 http://www.oreilly.com/data/free/files/stratasurvey.pdf
  • 34. Science is fundamentally about data, but data is not fundamentally about science Beyond R and Ph.D.s: The Mythology of Data Science Debunked Douglas Merrill (ZestFinance) http://www.youtube.com/watch?v=J2sgObXbIWY (8 minutes 9 seconds)
  • 35. People A data scientist is a data analyst who lives in California. – George Roumeliotis, (Intuit)
  • 37. Data Businessperson: Business person, Leader, Entrepreneur; Data Creative: Artist, Jack-of-All-Trades, Hacker; Data Researcher: Scientist, Researcher, Statistician; Data Engineer: Engineer, Developer. http://datacommunitydc.org/blog/2012/08/data-scientists-survey-results-teaser/ http://cdn.oreillystatic.com/oreilly/radarreport/0636920029014/Analyzing_the_Analyzers.pdf
  • 38. Scientists think they can code, software engineers think they are scientists. Team them up so they collaborate. – Scott Sorenson (Ancestry.com) Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop
  • 39. How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce http://strataconf.com/stratany2013/public/schedule/detail/30707
  • 40. Data scientists spend their lives as data janitors instead of leveraging their skills – Wes McKinney (DataPad) Building More Productive Data Science and Analytics Workflows
  • 41. Keynote Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data Professor at the NYU Stern School of Business Foster Provost http://www.youtube.com/watch?v=1jzMiAfLH2c (10 minutes 16 seconds) http://strataconf.com/stratany2013/public/schedule/detail/31685
  • 42. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data
  • 43. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data
  • 44. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data Predictive does not mean actionable. – Scott Sorenson (Ancestry.com) Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop
  • 45. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data More data gives you more precision, not more prediction. Using multiple datasets to reduce errors when measuring values. - Ravi Iyer (Ranker.com) Using Graphs of Data to Understand your Customers, Users, and Employees
  • 46. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data
  • 47. Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data
  • 48. Keynote Big Impact from Big Data Head of Analytics at Facebook Ken Rudin http://www.youtube.com/watch?v=RJFwsZwTBgg (11 minutes 57 seconds) http://strataconf.com/stratany2013/public/schedule/detail/31903
  • 49. Big Impact from Big Data
  • 50. Hadoop is a hammer, but you need other tools along with it. Designing Your Data-Centric Organization Josh Klahr (Pivotal) http://www.youtube.com/watch?v=D86udfrVzrI (12 minutes)
  • 51. Big Impact from Big Data The way you organize information depends on the question you intend to ask of it. - Richard Saul Wurman Building a Data Platform
  • 52. HaDump: Loading data into Hadoop for no reason. Data Science Without a Scientist http://strataconf.com/stratany2013/public/schedule/detail/31801
  • 53. Big Impact from Big Data Technical people still don't understand the business needs of business people! Business people don't know what's a table. - Anurag Tandon (MicroStrategy) Inject Big Data into your Corporate DNA: Enable Every Employee to Make Data Driven Decisions
  • 54. Ask the Right Questions Organizations already have people who know their own data better than mystical data scientists. Learning Hadoop is easier than learning the company’s business. - Gartner, 2012 Defining your Big Data Arsenal: NoSQL, Hadoop, and RDBMS http://strataconf.com/stratany2013/public/schedule/detail/29968
  • 55. Non-linear Storytelling: Towards New Methods and Aesthetics for Data Narrative http://strataconf.com/stratany2013/public/schedule/detail/30207
  • 56. Every Soldier is a Sensor: Countering Corruption in Afghanistan http://strataconf.com/stratany2013/public/schedule/detail/30828
  • 57. Big Impact from Big Data
  • 58. Big Impact from Big Data
  • 59. Big Impact from Big Data
  • 60. Value of Data: Usable < Useful < Actionable with Impact. If you can't answer for "so what?", you only have facts, not insight - Baron Schwartz (VividCortex Inc) Making Big Data Small. Descriptive (Easy): What happened? Predictive (Medium): What will happen? Prescriptive (Hard): What should we do about it? Hadoop & Data Science for the Enterprise
  • 61. The Future of Hadoop: What Happened & What's Possible? Co-Founder of Hadoop Doug Cutting http://www.youtube.com/watch?v=_WwuZI6AhN8 (14 minutes 41 seconds) http://strataconf.com/stratany2013/public/schedule/detail/31591 Big Data is first industry that was created by open source. - Jack Norris (MapR Technologies) Separating Hadoop Myths from Reality Hadoop the kernel of the OS for data.
  • 62. Hadoop's Impact on the Future of Data Management Mike Olson (Cloudera) http://www.youtube.com/watch?v=puHS2JNKgRM http://strataconf.com/stratany2013/public/schedule/detail/31380
  • 63. Single: S/W & H/W system, security model, management model, metadata model, audit model, resource management model. Common: storage & schema. http://www.slideshare.net/cloudera/enterprise-data-hub-the-next-big-thing-in-big-data
  • 64. Unifying Your Data Management Platform with Hadoop: Batch and Real-time Machine Data Ingest, Alerts, and Analytics http://strataconf.com/stratany2013/public/schedule/detail/30282
  • 65. Last generation of data management is not sufficient. More copies, representations, transformations increase risk. Index once and reuse across workloads, lifecycle. NoSQL: indexing and updates for interactive apps. Hadoop: staging, persistence, and analytics. Data Governance for Regulated Industries Using Hadoop http://strataconf.com/stratany2013/public/schedule/detail/30738
  • 66. Data Intelligence Rethink How You See Data Sharmila Shahani-Mulligan (ClearStory Data) http://www.youtube.com/watch?v=07hGulTOZGk (9 minutes 6 seconds) http://strataconf.com/stratany2013/public/schedule/detail/31742
  • 67. The Data Availability Problem [Figure: Information Supply Chain from Question to Insight; stage labels: Access, Sampling, Loading, Modeling, Analysis & Discovery, Presentation; callout: Data Prep – too slow!] Introducing a New Way to Interact with Insight http://strataconf.com/stratany2013/public/schedule/detail/31743
  • 68. Running Non-MapReduce Big Data applications on Apache Hadoop http://strataconf.com/stratany2013/public/schedule/detail/30755
  • 69. Apache HBase for Architects http://strataconf.com/stratany2013/public/schedule/detail/30619 What’s Next for Apache HBase: Multi-tenancy, Predictability, and Extensions. http://strataconf.com/stratany2013/public/schedule/detail/30857
  • 70. Securing the Apache Hadoop Ecosystem http://strataconf.com/stratany2013/public/schedule/detail/30302
  • 71. An Introduction to the Berkeley Data Analytics Stack With Spark, Spark Streaming, Shark, Tachyon, and BlinkDB http://strataconf.com/stratany2013/public/schedule/detail/30959
  • 72. Schema Information does not exist until a schema is defined and data is stored in a relational database - anonymous Building a Data Platform http://strataconf.com/stratany2013/public/schedule/detail/31400
  • 73. Lessons Learned From A Decade’s Worth of Big Data At The U.S. National Security Agency (NSA) http://strataconf.com/stratany2013/public/schedule/detail/30913
  • 74. Managing a Rapidly Evolving Analytics Pipeline http://strataconf.com/stratany2013/public/schedule/detail/30635
  • 75. Managing a Rapidly Evolving Analytics Pipeline http://strataconf.com/stratany2013/public/schedule/detail/30635
  • 76. Stringer/Tez Shark SQL on/in Hadoop/HBase Solutions Perception is Key: Telescopes, Microscopes and Data http://strataconf.com/strataeu2013/public/schedule/detail/32351
  • 77. All SQL on Hadoop Solutions are Missing the Point of Hadoop. Every solution makes you define a schema: SQL (Structured Query Language) is expressed over an assumed schema. Major reasons why Hadoop has taken off include: the ability to load data without defining a schema; processing data using schema-on-read instead of first defining a schema. Hadoop contains a lot of: raw, granular data sets with potentially inconsistent schemas; data sets in JSON, key-value, and other self-describing (non-relational) models designed for schema-on-read processing. SQL on Hadoop solutions that make you first define a schema are missing a major part of Hadoop’s usage patterns. Flexible Schema and the End of ETL http://strataconf.com/stratany2013/public/schedule/detail/31868
  • 79. Hadoop Adventures At Spotify http://strataconf.com/stratany2013/public/schedule/detail/30570
  • 80. Hadoop Adventures At Spotify http://strataconf.com/stratany2013/public/schedule/detail/30570
  • 81. Quick prototyping is the fastest way to internal advocacy. Ship It! Cloud == Speed. We don’t always need a complicated solution. KISS. Play to your differentiating strengths. Experience >> Data. Bias towards impact. It Takes a Village. EASE!! (Emulate, Analyze, Scale, Evaluate) How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce http://strataconf.com/stratany2013/public/schedule/detail/30707 Prototyping is key to overcoming resistance to change. Technical architecture is heavily influenced by people organization. Developing a team of experienced Hadoop users can often be done using internal employees. A culture of experimentation and innovation yields the best result. Ancestry.com: Managing Big Data Reaching Back to the 11th Century with Hadoop http://strataconf.com/stratany2013/public/schedule/detail/30499
  • 84. References Strata Conference + Hadoop World 2013 Keynotes & Interviews http://www.youtube.com/playlist?list=PL055Epbe6d5ZtziVAooUC04i1hL_Z9Xvk Slides & Video http://strataconf.com/stratany2013/public/schedule/proceedings Tweets https://twitter.com/search?q=%23strataconf #strataconf
  • 85. How Nordstrom Utilizes Human Intelligence to Blend Brick-and-Mortar with Online Commerce http://strataconf.com/stratany2013/public/schedule/detail/30707 http://nordstrom.github.io/stratanyc/
  • 86. http://complexdiagrams.com/properties Four Pillars of Visualization http://strataconf.com/stratany2013/public/schedule/detail/31182
  • 87. Building a production machine learning infrastructure http://www.slideshare.net/joshwills/production-machine-learninginfrastructure
  • 88. Text Analytics at Scale: Listening to 45 Million Customers http://strataconf.com/stratany2013/public/schedule/detail/30757
  • 89. Words + Numbers = Insights Text Analytics at Scale: Listening to 45 Million Customers http://strataconf.com/stratany2013/public/schedule/detail/30757
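
The schema-on-read idea from slide 77 may be easier to see in a small sketch. The Python below is only an illustration under stated assumptions, not code from the talk: the sample records, the field names (`user`, `event`, `ms`, `device`), and the helpers `read_with_schema` and `total_ms_by_user` are all hypothetical. It stores raw JSON lines with inconsistent fields and applies a schema only when a query reads them.

```python
import json

# Hypothetical raw records, stored as-is with no schema declared up front.
# Fields are inconsistent on purpose: "ms" is missing once and a string once.
RAW_LINES = [
    '{"user": "a", "event": "play", "ms": 1200}',
    '{"user": "b", "event": "skip"}',
    '{"user": "a", "event": "play", "ms": "800", "device": "phone"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: keep only the fields a query needs,
    fill in defaults for missing fields, and coerce types."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field, default)
            row[field] = cast(value) if value is not None else None
        yield row

# The schema belongs to the query, not to the stored data (an assumption
# made for this sketch): field name -> (type cast, default value).
PLAY_TIME_SCHEMA = {"user": (str, None), "ms": (int, 0)}

def total_ms_by_user(lines):
    """Sum the "ms" field per user, reading through the query's schema."""
    totals = {}
    for row in read_with_schema(lines, PLAY_TIME_SCHEMA):
        totals[row["user"]] = totals.get(row["user"], 0) + row["ms"]
    return totals

if __name__ == "__main__":
    print(total_ms_by_user(RAW_LINES))
```

A different query could read the same raw lines with a different schema (for example one that keeps `device`) without reloading or transforming the stored data, which is the usage pattern the slide says schema-first SQL layers miss.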