The Journey to Becoming a Data-Driven Enterprise
Pivotal Big Data Roadshow 2015
2© Copyright 2015 Pivotal. All rights reserved.
Where we’re going today…
3 Great Keynotes
•  Journey to a Data-driven Enterprise
•  Data Science Use Cases
•  Streaming Data and Internet of Things
Internet of Things Demo and Architecture Overview
Intensive hands-on training sessions
Today’s Agenda
8:00 AM - 9:00 AM - Check-In & Breakfast
9:00 AM - 10:15 AM - Journey to a Data-Driven Enterprise…
10:15 AM - 10:30 AM - Coffee Break
10:30 AM - 12:00 PM - Internet of Things Demo and Architecture Overview
12:00 PM - 1:00 PM - Lunch & Birds of a Feather Discussion
1:00 PM - 3:00 PM - Hands-On Technical Workshop
MASHING BIG DATA WITH BIG MACHINES
IS ‘BEAUTIFUL, DESIRABLE, INVESTABLE’
- IT COULD TRANSFORM GE'S BUSINESS -
AND THE ECONOMY.
“
”JEFF IMMELT, CEO, GE
ANALYZING INTERNET OF THINGS USING
BIG DATA SUITE
Internet of Things matter for...
•  Industrial Manufacturers
•  Transportation
•  Healthcare, Life Sciences
•  Financial Services
•  Retail
•  Telecom and Media
THE POWER OF 1
Driving Outcomes That Matter
One Percent Improvement Equals
•  Rail (Increasing Freight Utilization): $27B Industry Value by Reducing System Inefficiency
•  Healthcare (Predictive Maintenance): $63B Industry Value by Reducing Process Inefficiency
•  Power (Predictive Diagnostics): $66B Industry Value with Efficiency Improvements in Gas-fired Power Plant Fleets
Source: General Electric
THE INTERNET OF THINGS JOURNEY
STORE
•  Structured
•  Unstructured
•  High Volume
•  High Velocity
ANALYZE
•  Predictive Analytics
•  Machine Learning
•  Advanced Data Science
•  Realtime Analytics
DEVELOP
•  Advanced Analytic Pipelines
•  Realtime Analytical Applications
•  Global Scale Data-Driven
Applications
•  Enterprise, Consumer, IoT, and
Mobile
INNOVATE
•  Agile Dev Expertise
•  DevOps
•  Hybrid Cloud
•  Continuous Delivery
•  Closed Loop Applications
AGILE DEVELOPMENT
BIG DATA
PREDICTIVE ANALYTICS
ENTERPRISE PAAS
LARGE ENTERPRISE BIG DATA TROUBLE
•  80% of CEOs think data mining and analysis are strategically important (1)
But…
•  4% of companies use analytics effectively (2)
•  0% of CIOs think their IT infrastructure is fully prepared for big data (3)
•  30% of companies have deployed advanced analytics, 11% big data analysis (4)
•  44% of new applications failed to meet performance expectations (5)
•  90% of companies allocate at least 2X more cloud capacity than needed to ensure performance (6)
Sources: (1) 2015 PWC CEO Survey; (2) 2013 Bain & Company - The Value of Big Data; (3) 2014 IT Infrastructure Conversation - IBM; (4) Ernst & Young - 2014 Enterprise IT Trends and Investments; (5) 2014 Riverbed Technologies - The Transformers; (6) 2014 ElasticHosts CIO Study
BIG DATA
CHASM
70%
of data
generated by
customers
80%
of data stored
3%
prepared for
analysis
0.5%
being
analyzed
<0.5%
being
operationalized
THE DATA DIVIDE
Software Is Eating The World
Data Is Fueling Software
SOFTWARE IS EATING THE WORLD
WE CHOSE PIVOTAL BECAUSE WE BELIEVE IT
PROVIDES A 360-DEGREE VIEW OF THE PROCESS.
FROM A DATA SCIENCE AND DATA TECHNOLOGY
PERSPECTIVE, IT MEANS DELIVERING BEST-IN-CLASS
DATA TECHNOLOGIES AND ENABLING THEM
ON THEIR PLATFORM.
“
”
ACROSS INDUSTRIES
THE NEW DATA IMPERATIVES
Converged Data & Cloud
Open
Data-Driven Apps
THE BIG DATA PROBLEM
Fragmentation Constraints Complexity
•  Remove Lock-in
•  Leverage Ecosystem
•  Co-innovate
GUIDING PRINCIPLES IN THE NEW ERA
OPEN AGILE CLOUD-READY
•  Shorten innovation
cycles
•  Reduce TCO
•  Improve TTM
•  Solve business
problems
•  Avoid lock-in
•  Appropriate security
JOURNEY TO A DATA-DRIVEN ENTERPRISE
Deploy analytic apps and
automate at scale
Perform advanced analytics
Discover insights
Modernize data
infrastructure
Deploy analytic apps and
automate at scale
Perform advanced analytics
Discover insights
Modernize data
infrastructure
DATA-DRIVEN COMPANIES:
USE MODERN DATA INFRASTRUCTURE
MODERNIZE DATA INFRASTRUCTURE
Elastic, Scale-out
storage and processing
Flexible data types and
pipelining
ETL on demand: low operational cost
Expanded use cases
Higher quality analytics
Lowered storage/processing cost
Less fragmented ecosystem
Reduced vendor lock-in
REQUIREMENTS BENEFITS
Cloud friendly and
open-source based
Modernize data
infrastructure
Deploy analytic apps and
automate at scale
Perform advanced analytics
Discover insights
DATA-DRIVEN COMPANIES:
STRATEGICALLY USE ADVANCED ANALYTICS
ADVANCED ANALYTICS
Leverage existing skills and tools
Rapid time to insights
Internet of Things use cases
Rapid time to insights
Solve business problems
Predictive insights: proactive execution
REQUIREMENTS BENEFITS
Machine learning and
advanced analytics
SQL-compliant batch
and interactive queries
Massive stream
processing
Modernize data
infrastructure
Perform advanced analytics
Discover insights
DATA-DRIVEN COMPANIES:
INNOVATE AT SCALE
Deploy analytic apps and
automate at scale
ANALYTIC APPS AND AUTOMATION AT SCALE
Reduced time to action
Low ‘analytics ↔ app-dev’ integration cost
Reduced time to insights
Flexible ingestion: low operating cost
High performance: low operating cost
Transactional safety: business critical ops
REQUIREMENTS BENEFITS
Low-latency, distributed
in-memory transactions
Resilient, scale-out
messaging and object storage
Agile analytic app-dev
with enterprise PaaS
JOURNEY TO A DATA-DRIVEN ENTERPRISE
Deploy analytic apps and
automate at scale
Perform advanced analytics
Discover insights
Modernize data
infrastructure
Pivotal Data Science
helps you move from BI to
Data Science
Pivotal Labs
helps you move to an
agile development of apps
at scale
Pivotal Data Engineering
helps you move from data
administration to data engineering
•  Remove Lock-in
•  Leverage Ecosystem
•  Co-innovate
GUIDING PRINCIPLES IN THE NEW ERA
OPEN AGILE CLOUD-READY
•  Shorten innovation
cycles
•  Reduce TCO
•  Improve TTM
•  Solve business
problems
•  Avoid lock-in
•  Appropriate security
World’s First
Open Sourced,
Enterprise-Class Data
Portfolio
+
Open Data Platform
PIVOTAL BIG DATA SUITE
OPEN AGILE CLOUD-READY
Modern Data
Infrastructure
+
Advanced Analytics
+
Apps at Scale
Multiple Cloud
Deployment Models
+
Big Data Suite on Pivotal
Cloud Foundry
PIVOTAL BIG DATA SUITE
Open sourcing all Pivotal Big Data Suite components including:
WORLD’S FIRST OPEN SOURCED BIG DATA
PORTFOLIO
BUILDING ON SUCCESS OF CLOUD FOUNDRY FOUNDATION
BUILT FOR ENTERPRISES
Pivotal GemFire, Pivotal HAWQ, Pivotal Greenplum Database
BUILT FOR ENTERPRISES
Value added features: enterprise grade performance + robustness without lock-in
•  Advanced Query Optimization in analytics
•  WAN replication and continuous query in transactional processing
Flexible Deployment models: align to business objectives and needs
•  Balance cost objectives with policy and compliance requirements
•  Leverage Pivotal’s pre-integration + certification on supported configurations
Enterprise grade support: one throat to choke for the suite
•  Focus on business problems – not on lifecycle management
•  Expert support on Big Data Suite means reduced business risk
•  Common core for Hadoop ecosystem
•  Rapidly accelerated certifications, ecosystem
development and enterprise-grade quality
OpenDataPlatform.org
OPEN
AGILE
Deploy analytic apps and
automate at scale
Perform advanced analytics
Discover insights
Modernize data infrastructure
Spring XD Spark Pivotal HD & Open Data Platform
Pivotal Greenplum Database
Pivotal HAWQ Rabbit MQ
Redis
Pivotal GemFire Pivotal BDS on PCF
Pivotal Cloud Foundry
CLOUD-READY
COMMODITY
HARDWARE
APPLIANCE HYBRID CLOUD CLOUD
IaaS IaaS
PAAS
THE INTERNET OF THINGS JOURNEY WITH
PIVOTAL BIG DATA SUITE
STORE
•  Structured
•  Unstructured
•  High Volume
•  High Velocity
ANALYZE
•  Predictive Analytics
•  Machine Learning
•  Advanced Data Science
•  Realtime Analytics
DEVELOP
•  Advanced Analytic Pipelines
•  Realtime Analytical Applications
•  Global Scale Data-Driven
Applications
•  Enterprise, Consumer, IoT, and
Mobile
INNOVATE
•  Agile Dev Expertise
•  DevOps
•  Hybrid Cloud
•  Continuous Delivery
•  Closed Loop Applications
AGILE DEVELOPMENT
BIG DATA
PREDICTIVE ANALYTICS
ENTERPRISE PAAS
Spring XD
Spark
Pivotal HD &
Open Data Platform
Spring XD
Pivotal Greenplum
Database
Pivotal HAWQ
Spring XD
Pivotal GemFire
Redis
Rabbit MQ
Spring IO
Groovy
Pivotal BDS on PCF
Pivotal Cloud Foundry
Pivotal Labs, Data Science, Data Engineering
WHY PIVOTAL FOR BIG DATA ?
Complete
platform
SQL on Hadoop
leadership
Deployment
options
Open source
Flexible licensing
Advanced data
services
Pivotal Data Engineering, Pivotal Labs, Pivotal Data Science
FOR FURTHER INFO, CHECK OUT…
•  Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data
•  Pivotal Blog @ http://blog.pivotal.io
•  Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal
•  Pivotal Academy @ https://pivotal.biglms.com
Or reach out to your local Pivotal Account Executive…
Pivotal Data
Science Overview
and Use Cases
Pivotal Big Data Roadshow
DATA SCIENCE?
App Development
Analytics
Business Intelligence
Reporting
Visualization
Dashboards
Insights
Big Data
Machine Learning
Statistics
Mathematics
Time Series
Algorithms
Databases
Software
Modeling
Queries
Real-Time
Sensors
Predictive Models
ETL
Research
Hadoop
Distributed Computing
MapReduce
SQL
In-Memory
OLAP
Text Mining
Unstructured Data
Open Source
Decision Science
Ad Hoc Queries
Hacking
In-Database Analytics
Internet of Things
Data Cleansing
Sentiment
•  ETL
•  Unstructured
•  Data Cleansing
•  Sensors
Data Related
•  Algorithms
•  Mathematics
•  Time Series
•  Statistics
•  Predictive Modeling
•  Machine Learning
•  Text Mining
•  Sentiment
•  Map Reduce
Fields of Study &
Techniques
•  Dashboards
•  Insights
•  Visualization
•  Ad Hoc Queries
•  Reporting
Business
Intelligence
•  Software
•  In-Database
Analysis
•  Distributed
Computing
•  Hadoop
•  Open Source
Implementation
•  Big Data
•  Decision
Science
•  Internet of
Things
•  Real-Time
•  Hacking
•  In-Memory
Industry
Buzzwords
What is Data Science?
The use of statistical and machine learning techniques on
big multi-structured data in a distributed computing
environment to identify correlations and causal
relationships, classify and predict events, identify patterns
and anomalies, and infer probabilities, interest, and
sentiment.
DRIVE AUTOMATED, LOW-LATENCY ACTIONS IN RESPONSE TO EVENTS OF INTEREST
Gene Sequencing
Smart Grids
COST TO SEQUENCE
ONE GENOME
HAS FALLEN FROM
$100M IN 2001
TO $10K IN 2011
TO $1K IN 2014
READING SMART METERS
EVERY 15 MINUTES IS
3000X MORE
DATA INTENSIVE
Stock Market
Social Media
FACEBOOK UPLOADS
250 MILLION
PHOTOS EACH DAY
Billions of Data Points
Oil Exploration
Video Surveillance
OIL RIGS GENERATE
25000
DATA POINTS
PER SECOND
Medical Imaging
Mobile Sensors
What is Big Data Analytics?
Descriptive
Analytics
WHAT HAPPENED?
Diagnostic
Analytics
WHY DID IT HAPPEN?
Predictive
Analytics
WHAT WILL HAPPEN?
Prescriptive
Analytics
HOW CAN WE MAKE IT HAPPEN?
Complexity
Value of
Analytics
($)
Pivotal Big Data Suite
P L A T F O R M
Data Science Toolkit
KEY TOOLS KEY LANGUAGES
SQL
A single address for everything analytics
Analytics with Pivotal
Time-to-Insights
FORECASTING CLUSTERING
REGRESSION
CLASSIFICATION
OPTIMIZATION
Smart Systems = Sensors + Digital Brain + Actuators
Problem
Formulation
Modeling
Step
Data Step
Application
Step
Data Science for
Building Models
Sensors &
Actuators
Data Lake
Data Science
Use Cases
Smart Meter
Analytics
The Digital Brain: Making a Smart Grid Smarter!
Action:
Where (and when) to
send trucks, preventive
maintenance
The Digital Brain:
Uses Fourier transform,
extracts patterns, and
flags outliers/anomalies
Input:
Data from smart
meters
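As a rough illustration of the "digital brain" step, the sketch below computes a discrete Fourier transform of each meter's readings, extracts the dominant period, and flags meters whose dominant period differs from the most common one. The function names, the synthetic readings, and the flagging rule are assumptions made for illustration; the slides do not describe the actual implementation.

```python
import cmath
import math
from collections import Counter


def dft_magnitudes(samples):
    """Return magnitudes of the discrete Fourier transform for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def dominant_period(samples):
    """Period (in samples) of the strongest non-constant frequency component."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])  # skip bin 0 (the mean)
    return len(samples) / k


def flag_outliers(readings_by_meter):
    """Flag meters whose dominant period differs from the most common one."""
    periods = {m: round(dominant_period(r)) for m, r in readings_by_meter.items()}
    typical = Counter(periods.values()).most_common(1)[0][0]
    return sorted(m for m, p in periods.items() if p != typical)


# Hypothetical data: 96 hourly readings (4 days) per meter.
hours = range(96)
readings = {
    "meter_a": [1.0 + 0.5 * math.sin(2 * math.pi * h / 24) for h in hours],
    "meter_b": [1.2 + 0.4 * math.sin(2 * math.pi * h / 24 + 0.3) for h in hours],
    "meter_c": [1.0 + 0.5 * math.sin(2 * math.pi * h / 8) for h in hours],  # unusual 8-hour cycle
}
outliers = flag_outliers(readings)
```

Here the two meters with a daily cycle define the typical pattern, so only the meter with an 8-hour cycle is flagged. A production system would likely use an FFT library and a richer notion of "pattern" than a single dominant period.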
Smart Meter Analytics – Significant Use Cases
•  Load profiling
•  Theft prevention
•  Demand prediction
•  Load forecasting
•  Root cause of power failures
•  Black-out warning
•  Anomaly detection
•  Network topology error
detection
SOLUTION
•  Analyze smart meter power data using
unsupervised clustering techniques and detect
anomalies based on distance metric in clusters
•  Reduce time required to monitor and improve
grid efficiencies
•  Leveraged the MPP architecture of Pivotal
GPDB and MADlib in-database machine
learning library for fast computation at scale
Electricity Network Load Profiling and Outlier
Detection
CUSTOMER
A major smart grid infrastructure provider
BUSINESS PROBLEM
Profile power consumption patterns based on smart
meter data and flag anomalous usage
CHALLENGES
•  Large volume of smart meter data (several
months of data from 100s of thousands of
meters) could not be analyzed effectively by
legacy system
•  Timely business insights on large scale smart grid
infrastructure demand fast processing of data
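The clustering approach named in the solution (unsupervised clustering, then anomaly detection by distance to cluster centers) can be sketched in a few lines. This is a plain-Python k-means written for illustration only; the project itself used MADlib inside Greenplum Database, and the meter profiles, the value of `k`, and the distance threshold below are all hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic seeding (evenly spaced input points)."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep the old one if empty).
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def flag_anomalies(profiles, centroids, threshold):
    """Return meter ids whose profile is farther than `threshold` from every centroid."""
    flagged = []
    for meter_id, profile in profiles.items():
        distance = min(math.dist(profile, c) for c in centroids)
        if distance > threshold:
            flagged.append(meter_id)
    return sorted(flagged)


# Hypothetical load profiles: average kWh for (morning, midday, evening, night).
profiles = {
    "m1": (0.2, 0.3, 1.0, 0.2),
    "m2": (0.3, 0.3, 1.1, 0.2),
    "m3": (0.2, 0.4, 0.9, 0.3),
    "m4": (1.0, 1.2, 0.3, 0.1),
    "m5": (1.1, 1.1, 0.2, 0.1),
    "m6": (0.9, 1.3, 0.3, 0.2),
    "m7": (3.0, 3.0, 3.0, 3.0),  # consumption far from either usage pattern
}
centroids = kmeans(list(profiles.values()), k=2)
flagged = flag_anomalies(profiles, centroids, threshold=2.0)
```

The threshold is the main tuning choice: too low and ordinary variation is flagged, too high and real anomalies are absorbed into a cluster.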
Electricity Network Load Profiling and Outlier
Detection
Dashboards for navigating clusters and outliers
Network Topology Error Detection
CUSTOMER
A major utility
BUSINESS PROBLEM
Use load and voltage meter readings to determine
errors in transformer network topologies
CHALLENGES
•  Time consuming process to detect network
topology errors on entire network in legacy
system
•  Timely detection of network topology errors
requires big data infrastructure and analytical
capabilities
SOLUTION
•  For each transformer network in parallel, solve
an LP to determine scale of topology error,
which can be used to flag and rank anomalous
network topologies
•  Reduce time for topology error detection from
several days/weeks to a few minutes!
Security and
Fraud
Advanced Persistent Threat (APT)
APT Kill Chain
1. Phishing & Zero Day Attack: A handful of users are targeted by two phishing attacks; one user opens a zero-day payload (CVE-2011-0609)
2. Back Door: The user machine is accessed remotely by the Poison Ivy tool
3. Lateral Movement: Attacker elevates access to important user, service and admin accounts, and specific systems
4. Data Gathering: Data is acquired from target servers and staged for exfiltration
5. Exfiltrate: Data is exfiltrated via encrypted files over FTP to an external, compromised machine at a hosting provider
Anomalous User-to-Resource Access Detection
BUSINESS PROBLEM
Detect anomalous user behaviors in the global
enterprise computer network
SUMMARY
Given local-to-local communication data, identify
anomalous users within an enterprise.
•  Reduce malware-dwell time, typically 243 days
•  Signature-based approaches cannot detect
such behavior
CHALLENGES
10 Billion events in 6 months; 15K+ network
devices; No existing SIEM solutions can model
user behavioral resource access baseline and
enable anomaly detection in an adaptive and
scalable architecture.
SOLUTION
An innovative Graph Mining based algorithmic
framework with advanced Machine Learning.
Network topology and temporal behaviors are both
modeled. (Patent pending). Implemented in MPP
and PL/R, enabling parallel model training and
behavior risk scoring. Successfully identified DLP
violating anomalous users.
Financial
Services
Identifying and Pricing Cross-Sell Opportunities
CUSTOMER
A global financial services provider
BUSINESS PROBLEM
Identify cross-sell opportunities between two
business arms of a financial institution.
CHALLENGES
Integration of large-scale data originating from
multiple data warehouses. Developing predictive
models to identify novel cross-sell opportunities
within the financial institution. Evaluate the
identified cross-sell opportunities by their revenue
potential.
SOLUTIONS
•  Fast integration of data in Pivotal Greenplum
Database.
•  Predictive models and evaluation of profitability:
–  Association rule.
–  Logistic regression for each product
offered.
–  Estimation of revenue opportunity.
•  On-demand reporting and visualization via
custom dashboards connected to
in-database models.
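To make the association-rule step concrete, here is a minimal sketch that scores rules of the form "customers holding A also hold B" by support, confidence, and lift. It covers only rule scoring, not the per-product logistic regression or the revenue estimation listed above. The product names, the holdings data, and the `min_support` cutoff are hypothetical.

```python
from itertools import combinations


def rule_metrics(baskets, min_support=0.3):
    """Score single-product rules A -> B by support, confidence, and lift."""
    n = len(baskets)
    products = sorted(set().union(*baskets))
    count = {p: sum(p in b for b in baskets) for p in products}
    rules = []
    for a, b in combinations(products, 2):
        both = sum(a in bk and b in bk for bk in baskets)
        support = both / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / count[lhs]          # P(rhs | lhs)
            lift = confidence / (count[rhs] / n)    # > 1 means positive association
            rules.append((lhs, rhs, support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical product holdings, one set per customer.
holdings = [
    {"checking", "mortgage", "home_insurance"},
    {"checking", "mortgage", "home_insurance"},
    {"checking", "brokerage"},
    {"checking", "mortgage"},
    {"checking", "brokerage"},
    {"mortgage", "home_insurance"},
]
rules = rule_metrics(holdings)
top_rule = rules[0]
```

Rules with lift above 1 point to product pairs held together more often than chance would suggest, which makes them candidates for a cross-sell offer to customers holding only one of the two.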
Credit Risk Assessment and Stress Testing
CUSTOMER
A global financial services provider
BUSINESS PROBLEM
Speed up the process of compliance reporting and
stress testing for Basel III.
CHALLENGES
Running the calculation procedures on the
customer’s legacy database was time-consuming
and therefore had to be done in overnight batch mode.
SOLUTION
•  Implement risk asset calculation and stress
testing on Pivotal Greenplum Database.
•  Three years of data was processed in well
under 2 minutes, significantly faster than the
customer’s current procedures.
•  Connect an “in-database” visualization
tool to Pivotal Greenplum
Database via ODBC for
on-demand reporting and
visualization.
Financial Compliance
BUSINESS PROBLEM
•  Ensure compliance with Dodd-Frank and Basel
Committee regulations
•  Identify underlying risk and fraud while reducing the
compliance department’s overburdened
Emails Chats Trades
Transactions Policy Securities
Phone Calls Watch Lists …
Financial compliance
Data Lake
Data
integration
Data clean
up
Modeling
Classification
and ranking
Analyst user interfaces
Feedback
Analytics
Analyst feedback
Data integration: e.g., append
trade information with email and
chat communications
Data cleanup: e.g.,
identify newsletters and
spam emails
Modeling:
•  Predictive modeling to flag
messages and trades
•  Graph and cohort analysis
Analyst feedback
Reviewed fraud instances
included in periodic model
refreshes
SOLUTION
•  A data lake platform coupled with cutting edge data
science techniques
•  Flexible user interface to promote an adaptive,
continuously learning compliance framework
Pivotal Topic & Sentiment Analysis Engine
External
Tables
PXF
HDFS
Source: http Sink: hdfs
Parallel Parsing
of JSON
(PL/Python)
HAWQ
Nightly Cron Jobs
Topic Analysis
through MADlib pLDA
Unsupervised
Sentiment Analysis
(PL/Python)
D3.js
Spring XD
Twitter Decahose (~55 million tweets/day)
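The JSON-parsing and sentiment stages of this pipeline might look roughly like the sketch below. It parses simplified tweet records and scores them with a small word lexicon, which is one common unsupervised approach; the slides do not say which method the engine actually uses. The lexicon, the sample tweets, and the function name are hypothetical, and in the architecture above this logic would run as PL/Python inside HAWQ rather than as a standalone script.

```python
import json
import re

# Tiny hypothetical lexicon; a real one would hold thousands of scored terms.
LEXICON = {"love": 1, "great": 1, "fast": 1, "hate": -1, "slow": -1, "broken": -1}


def score_tweet(raw_json):
    """Parse one tweet JSON record and return (tweet id, lexicon sentiment score)."""
    tweet = json.loads(raw_json)
    words = re.findall(r"[a-z']+", tweet["text"].lower())
    return tweet["id"], sum(LEXICON.get(w, 0) for w in words)


raw_tweets = [
    '{"id": 1, "text": "Love the new release, great work!"}',
    '{"id": 2, "text": "Dashboard is slow and broken again"}',
    '{"id": 3, "text": "Meeting at noon"}',
]
scores = dict(score_tweet(t) for t in raw_tweets)
```

Positive scores indicate positive sentiment, negative scores negative sentiment, and zero means no lexicon words were found.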
http://blog.pivotal.io/data-science-pivotal
Check out the Pivotal Data Science Blog!
FOR FURTHER INFO…
•  Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data
•  Pivotal Blog @ http://blog.pivotal.io
•  Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal
•  Pivotal Academy @ https://pivotal.biglms.com
•  Or reach out to your local Pivotal Account Executive…
Data Streaming and IoT
Using Pivotal Big Data Suite
Converging
Trends
Innovation
New Data New Processes New Insights
The Journey to the Data-Driven Enterprise
Data Science
and Machine
Learning
Big Data
Internet
of Things
HDFS
Data Lake
Ingest Store Analytics
Hard to change
Labor intensive
Inefficient
Coding based
No real-time information
Based on expensive ETL
Migrating from a Reactive, Static and
Constrained Model…
HDFS Data Lake
Expert System /
Machine Learning
In-Memory
Real-Time Data
Continuous Learning
Continuous Improvement
Continuous Adapting
Data Stream Pipeline
Multiple Data Sources
Real-Time Processing
Store Everything
To Pro-Active, Self-Improving, Machine
Learning Systems
New York Times Research: http://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html
“
50-80% OF THE TIME ON DATA
SCIENCE PROJECTS IS SPENT ON
DATA WRANGLING
”
Data Feeds
Stream Processing
Expert Systems
Machine Learning
Historical Data
Business Value
Smart Decisions
Still…
HDFS
Data Lake
Ingest Transform Sink
SpringXD
GemFire
Data Stream Needs an Agile, Scalable and
Fast Solution
HAWQ GPDB
Data
Lake
Ingest Transform Sink
SpringXD
Distributed
Computing
In-Memory
Real-Time Data
Spring XD Orchestrates and Automates all the
Steps on Data Stream Pipelining
Expert System /
Machine Learning
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable HAWQ GPDB
Data
Lake
INGEST / SINK PROCESS ANALYZE
•  No coding required
•  Dozens of built-in
connectors
•  Seamless integration with
Kafka, Sqoop
•  Create new connectors
easily using Spring
•  Call Spark, Reactor or
RxJava
•  Built-in configurable filtering,
splitting and transformation
•  Out-of-box configurable jobs
for batch processing
•  Import and invoke PMML
jobs easily
•  Call Python, R, MADlib and
other tools
•  Built-in configurable
counters and gauges
Spring XD
State of the Art Data Pipeline Automation
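The ingest, process, and sink stages described above follow a common streaming pattern, which can be sketched with Python generators. This is not Spring XD's API or DSL; Spring XD composes these stages from configurable modules without coding. The stage functions, the device events, and the temperature rule are hypothetical.

```python
import json
from collections import Counter


def ingest(lines):
    """Source: parse each raw line as JSON, skipping malformed input."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def keep(events, predicate):
    """Filter: pass through only events matching the predicate."""
    return (e for e in events if predicate(e))


def transform(events, fn):
    """Transform: apply fn to every event."""
    return (fn(e) for e in events)


def sink(events, store, counters):
    """Sink: append to the store and update a per-device counter."""
    for e in events:
        store.append(e)
        counters[e["device"]] += 1


# Hypothetical raw feed, including one malformed record.
raw = [
    '{"device": "pump-1", "temp_c": 71.5}',
    "not json",
    '{"device": "pump-2", "temp_c": 20.0}',
    '{"device": "pump-1", "temp_c": 90.0}',
]
store, counters = [], Counter()
hot = keep(ingest(raw), lambda e: e["temp_c"] > 50)
with_f = transform(hot, lambda e: {**e, "temp_f": e["temp_c"] * 9 / 5 + 32})
sink(with_f, store, counters)
```

Because each stage is a generator, events flow through one at a time, and the counter plays the role of the built-in counters and gauges mentioned in the ANALYZE column.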
Ingest Transform Sink
SpringXD
Distributed
Computing
GemFire Provides Scalable, Low-Latency
Data Access, Storage and Event Processing
Expert System /
Machine Learning
GemFire
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable HAWQ GPDB
Data
Lake
GemFire
•  In-Memory Enterprise Data Grid
•  Horizontally Scalable, Consistent, Highly
Available
•  Event handling
•  Continuous Queries
•  Enterprise Data Geo Distribution
In-memory Real Time Data
Ingest Transform Sink
SpringXD
Distributed
Computing
Pivotal Provides SQL Based
Advanced Analytics
Expert System /
Machine Learning
GemFire
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable
Data
Lake
HAWQ GPDB
HAWQ
•  Massively Parallel Processing
RDBMS on HADOOP
•  ANSI SQL on Hadoop
•  Extremely high performance for
analytics (not like Hive)
•  Stores all data directly on
HDFS
•  Open-Source
Advanced SQL analytics in Hadoop
Combining SQL with Hadoop is key for analytics
SQL remains #1 choice for Data Science
Ingest Transform Sink
SpringXD
Developers and Data Scientists Can Focus on
the Business Value of Data
GemFire
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable
Data
Lake
HAWQ GPDB
Data Streaming Reference Architecture
Data Feeds Transactional Apps Analytic Apps
Data Stream Pipeline
Distributed
Computing
Real-Time Data
Expert Systems &
Machine Learning
Advanced
Analytics
HDFS Data Lake
Data Streaming Reference Architecture
Data Feeds Transactional Apps Analytic Apps
Data Stream Pipeline
HDFS Data Lake
GemFire HAWQ GPDB
SpringXD
“
SO WE ARE MOVING TO A WORLD WHERE THE
MACHINES WE WORK WITH ARE NOT JUST
INTELLIGENT; THEY ARE BRILLIANT. THEY ARE
SELF-AWARE, THEY ARE PREDICTIVE, REACTIVE
AND SOCIAL. IT'S A WORLD WHERE INFORMATION
ITSELF BECOMES INTELLIGENT AND COMES TO US
AUTOMATICALLY WHEN WE NEED IT WITHOUT
HAVING TO LOOK FOR IT.
”MARCO ANNUNZIATA, GE
The IoT Market Verticals
Diversified Industrial
Manufacturers
Agriculture, Security, Retail
Auto Manufacturers
Media (via Mobile devices)
Urban Infrastructure, Cities
Consumer, Connected Home
etc.
Healthcare, Life Sciences
“
THE MAGIC HAPPENS WHEN YOU MARRY THE
TRADITIONAL ENGINEERING APPROACH WITH THE
DATA SCIENCE ENABLED BY THE DATA LAKE. IT
OPENS UP A WHOLE NEW WORLD OF POSSIBLE
‘WHAT IF’ QUESTIONS.
”DAVE BARTLETT, GE AVIATION
GE Aviation – Big Data & IoT
•  Goal
•  Improve jet engine efficiency and increase service profitability
•  Unable to store & analyze massive amounts of data for analytics
•  Solution
•  LARGE DATA SETS ingested via batch
•  Store 100s TB of engine data in Hadoop (PHD)
•  Open doors for industrial engineers to poke at data (HAWQ)
•  FAST MACHINE LEARNING based algorithms
•  2000x faster, 10x cheaper
•  Customer portals for visibility
“
THE REAL OPPORTUNITY FOR
CHANGE...SURPASSING THE MAGNITUDE OF THE
CONSUMER INTERNET...IS THE INDUSTRIAL
INTERNET, AN OPEN, GLOBAL NETWORK THAT
CONNECTS PEOPLE, DATA AND MACHINES.
”JEFF IMMELT, CEO, GE
GE Energy – Fast Data & IoT
•  Goal
•  Failing gas turbines causing issues with power generation
•  Unable to store & process fire-hose of data
•  Solution
•  HIGH VELOCITY data ingestion from Gas Turbines
•  Store 10 TB of turbine data in memory (GemFire)
•  VERY LOW LATENCY and HIGH SPEED data access
•  “Predictive” Maintenance
“
… YOU USE THOSE DEVICES TO INSPECT CARS AND
KEEP TRACK OF INSPECTION RECORDS. THAT CAN
THEN FLOW INTO ASSET MANAGEMENT SOFTWARE
TO PROVIDE PREDICTIVE ANALYSIS OF WHEN THE
ASSET NEEDS TO BE MAINTAINED ... WHEN YOU
BOIL THAT DOWN TO BUSINESS, IT’S ABOUT
COMPETITIVE ADVANTAGE.
”BRAD HOWELL, LODESTAR LOGISTICS
GE Transportation – Big & Fast Data
•  Goal
•  Help rail companies manage locomotives better
•  Fast data from tracks & Big data from sensors in locomotives
•  Solutions
•  MACHINE LEARNING MODEL built from combined data set (MADlib)
•  REAL TIME SCORING of rail sensor data (Spring XD)
•  REAL-TIME ALERTING of critical events via email & a REAL TIME DASHBOARD (Spring)
“
WE EXPECT THE PRECISION AGRICULTURE SPACE
TO CONTINUE TO GROW QUICKLY AS DATA
BECOMES CHEAPER TO STORE AND EASIER TO
MOVE FROM PLATFORM TO PLATFORM. WE ARE
JUST BEGINNING TO EXPLORE ALL THE VALUE WE
CAN CREATE FOR FARMERS WITH THESE TOOLS.
”BRETT BEGEMANN, MONSANTO
Monsanto – Agriculture & IoT
•  Goal
•  Help farmers maximize crop yields
•  Solution
•  Use of Big Data to collect & store data from farm equipment
•  Combine with climate and other information
•  Build custom apps using an agile app dev platform (PCF)
IoT - Need for a Platform
A Better Customer Experience
Pivotal
Cloud Foundry
Pivotal
Big Data Suite
Using Innovative Data-Driven Apps
On an Integrated Platform
The Connected Car Architecture
INGESTION
JSON / HTTP
STREAM PROCESSING
Spring XD Transform Enrich
DATA LAKE
Pivotal HD Sink
ADVANCED ANALYTICS
HAWQ
REAL-TIME DATA INSIGHTS
GemFire
MOBILE SERVICES
MICROSERVICES
Pivotal CF Dashboard Analytics App Simulator
IoT APPS
Rabbit MQ
PUSH
Horizontally Scalable
Fault Tolerant
Extensible
Open-Source
STREAM PROCESSING
Spring XD
Rabbit MQ
DATA LAKE
Pivotal HD
ADVANCED ANALYTICS
HAWQ
ENRICHER PREDICTIVE ANALYTICS
+ Timestamp
& GUID
+ MPG, range
& route
MOBILE
APP
JSON
REAL-TIME DATA INSIGHTS
GemFire
CAR
SENSOR
Sink
Tap
DASHBOARD
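The enricher stage in this architecture adds a timestamp and GUID to each car sensor message before later stages add MPG, range, and route. A minimal sketch of that idea follows. The message fields (`vin`, `miles_driven`, `gallons_used`, `gallons_remaining`) are hypothetical, and the simple arithmetic in `add_range_estimate` only stands in for the predictive analytics step, which the slides do not detail.

```python
import json
import uuid
from datetime import datetime, timezone


def enrich(raw_json):
    """Enricher step: attach a GUID and a UTC timestamp to a sensor message."""
    message = json.loads(raw_json)
    message["guid"] = str(uuid.uuid4())
    message["timestamp"] = datetime.now(timezone.utc).isoformat()
    return message


def add_range_estimate(message):
    """Derive MPG and remaining range from hypothetical trip fields."""
    mpg = message["miles_driven"] / message["gallons_used"]
    return {**message, "mpg": mpg, "range_miles": mpg * message["gallons_remaining"]}


# Hypothetical sensor reading as it might arrive over JSON / HTTP.
reading = (
    '{"vin": "TEST123", "miles_driven": 120.0, '
    '"gallons_used": 4.0, "gallons_remaining": 6.0}'
)
result = add_range_estimate(enrich(reading))
```

Assigning the GUID and timestamp at ingestion gives every downstream consumer (data lake, in-memory store, dashboard) a shared key for the same event.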
FOR FURTHER INFO, CHECK OUT…
•  Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data
•  Pivotal Blog @ http://blog.pivotal.io
•  Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal
•  Pivotal Academy @ https://pivotal.biglms.com
•  Or reach out to your local Pivotal Account Executive…
BUILT FOR THE SPEED OF BUSINESS

Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data editionMark Kerzner
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Moving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcareMoving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcarePerficient, Inc.
 

What's hot (20)

Data and its Role in Your Digital Transformation
Data and its Role in Your Digital TransformationData and its Role in Your Digital Transformation
Data and its Role in Your Digital Transformation
 
Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworks
 
How Data Science is Preventing College Dropouts and Advancing Student Success
How Data Science is Preventing College Dropouts and Advancing Student SuccessHow Data Science is Preventing College Dropouts and Advancing Student Success
How Data Science is Preventing College Dropouts and Advancing Student Success
 
Pivotal Digital Transformation Forum: Data Science
Pivotal Digital Transformation Forum: Data Science Pivotal Digital Transformation Forum: Data Science
Pivotal Digital Transformation Forum: Data Science
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Data Science At Scale for IoT on the Pivotal Platform
Data Science At Scale for IoT on the Pivotal PlatformData Science At Scale for IoT on the Pivotal Platform
Data Science At Scale for IoT on the Pivotal Platform
 
Hilton's enterprise data journey
Hilton's enterprise data journeyHilton's enterprise data journey
Hilton's enterprise data journey
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Becoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural ChangeBecoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural Change
 
Webinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data IntegrationWebinar: Attaining Excellence in Big Data Integration
Webinar: Attaining Excellence in Big Data Integration
 
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data edition
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Moving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcareMoving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in Healthcare
 

Viewers also liked

Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...
Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...
Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...VMware Tanzu
 
Business Impact From IoT? Just Add Data Science
Business Impact From IoT? Just Add Data ScienceBusiness Impact From IoT? Just Add Data Science
Business Impact From IoT? Just Add Data ScienceVMware Tanzu
 
Cloud Foundry Technical Overview
Cloud Foundry Technical OverviewCloud Foundry Technical Overview
Cloud Foundry Technical Overviewcornelia davis
 
Internet Of Things: How Data Science Driven Software is Eating the Connected ...
Internet Of Things: How Data Science Driven Software is Eating the Connected ...Internet Of Things: How Data Science Driven Software is Eating the Connected ...
Internet Of Things: How Data Science Driven Software is Eating the Connected ...VMware Tanzu
 
Personal Healthcare IOT on PCF using Spring
Personal Healthcare IOT on PCF using SpringPersonal Healthcare IOT on PCF using Spring
Personal Healthcare IOT on PCF using SpringJim Shingler
 
CenturyLink and Their Journey to Cloud Foundry
CenturyLink and Their Journey to Cloud FoundryCenturyLink and Their Journey to Cloud Foundry
CenturyLink and Their Journey to Cloud FoundryVMware Tanzu
 
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭Hang Geng
 
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...VMware Tanzu
 
Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWAREFernando Lopez Aguilar
 
Analytics in the Cloud: Getting The Most Out Of Analytics Deployments
Analytics in the Cloud: Getting The Most Out Of Analytics DeploymentsAnalytics in the Cloud: Getting The Most Out Of Analytics Deployments
Analytics in the Cloud: Getting The Most Out Of Analytics DeploymentsVMware Tanzu
 
Introduction to the EDF Innovation Exchange
Introduction to the EDF Innovation ExchangeIntroduction to the EDF Innovation Exchange
Introduction to the EDF Innovation Exchangeedf_innovex
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewVMware Tanzu
 
Wall Street Derivative Risk Solutions Using Apache Geode
Wall Street Derivative Risk Solutions Using Apache GeodeWall Street Derivative Risk Solutions Using Apache Geode
Wall Street Derivative Risk Solutions Using Apache GeodeAndre Langevin
 
Cloud Foundry: Inside the Machine
Cloud Foundry: Inside the MachineCloud Foundry: Inside the Machine
Cloud Foundry: Inside the MachineDerek Collison
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewVMware Tanzu
 
Bootiful Code with Spring Boot
Bootiful Code with Spring BootBootiful Code with Spring Boot
Bootiful Code with Spring BootJoshua Long
 
Troubleshooting App Health and Performance with PCF Metrics 1.2
Troubleshooting App Health and Performance with PCF Metrics 1.2Troubleshooting App Health and Performance with PCF Metrics 1.2
Troubleshooting App Health and Performance with PCF Metrics 1.2VMware Tanzu
 
Présentation edf pulse 2017 (1)
Présentation edf pulse 2017 (1)Présentation edf pulse 2017 (1)
Présentation edf pulse 2017 (1)🚀Yan Thoinet
 

Viewers also liked (20)

Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...
Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...
Part 2: Architecture and the Operator Experience (Pivotal Cloud Platform Road...
 
Business Impact From IoT? Just Add Data Science
Business Impact From IoT? Just Add Data ScienceBusiness Impact From IoT? Just Add Data Science
Business Impact From IoT? Just Add Data Science
 
Cloud Foundry Technical Overview
Cloud Foundry Technical OverviewCloud Foundry Technical Overview
Cloud Foundry Technical Overview
 
Internet Of Things: How Data Science Driven Software is Eating the Connected ...
Internet Of Things: How Data Science Driven Software is Eating the Connected ...Internet Of Things: How Data Science Driven Software is Eating the Connected ...
Internet Of Things: How Data Science Driven Software is Eating the Connected ...
 
Personal Healthcare IOT on PCF using Spring
Personal Healthcare IOT on PCF using SpringPersonal Healthcare IOT on PCF using Spring
Personal Healthcare IOT on PCF using Spring
 
CenturyLink and Their Journey to Cloud Foundry
CenturyLink and Their Journey to Cloud FoundryCenturyLink and Their Journey to Cloud Foundry
CenturyLink and Their Journey to Cloud Foundry
 
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭
Ceph中国社区9.19 Ceph集群运维及案例分享04-武宇亭
 
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...
Cloud Foundry Summit 2015: Rocking the Lattice: A New Path for Cloud Foundry ...
 
Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARE
 
Analytics in the Cloud: Getting The Most Out Of Analytics Deployments
Analytics in the Cloud: Getting The Most Out Of Analytics DeploymentsAnalytics in the Cloud: Getting The Most Out Of Analytics Deployments
Analytics in the Cloud: Getting The Most Out Of Analytics Deployments
 
Introduction to the EDF Innovation Exchange
Introduction to the EDF Innovation ExchangeIntroduction to the EDF Innovation Exchange
Introduction to the EDF Innovation Exchange
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
Wall Street Derivative Risk Solutions Using Apache Geode
Wall Street Derivative Risk Solutions Using Apache GeodeWall Street Derivative Risk Solutions Using Apache Geode
Wall Street Derivative Risk Solutions Using Apache Geode
 
Cloud Foundry: Inside the Machine
Cloud Foundry: Inside the MachineCloud Foundry: Inside the Machine
Cloud Foundry: Inside the Machine
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 
Bootiful Code with Spring Boot
Bootiful Code with Spring BootBootiful Code with Spring Boot
Bootiful Code with Spring Boot
 
Troubleshooting App Health and Performance with PCF Metrics 1.2
Troubleshooting App Health and Performance with PCF Metrics 1.2Troubleshooting App Health and Performance with PCF Metrics 1.2
Troubleshooting App Health and Performance with PCF Metrics 1.2
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Présentation edf pulse 2017 (1)
Présentation edf pulse 2017 (1)Présentation edf pulse 2017 (1)
Présentation edf pulse 2017 (1)
 

Similar to Pivotal Big Data Roadshow

Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Data Con LA
 
A New Day for Oracle Analytics
A New Day for Oracle AnalyticsA New Day for Oracle Analytics
A New Day for Oracle AnalyticsRich Clayton
 
Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Cloudera, Inc.
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnectaDigital
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best PracticesCapgemini
 
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace Emtec Inc.
 
Informatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemInformatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemCapgemini
 
10 Reasons Why Smart Organizations are Moving to Cloud BI
10 Reasons Why Smart Organizations are Moving to Cloud BI10 Reasons Why Smart Organizations are Moving to Cloud BI
10 Reasons Why Smart Organizations are Moving to Cloud BIGoodData
 
Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI Holden Ackerman
 
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...NuoDB
 
Role of Data in Digital Transformation
Role of Data in Digital TransformationRole of Data in Digital Transformation
Role of Data in Digital TransformationVMware Tanzu
 
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...AppDynamics
 
150601 gartner cloud_summit_vfinal
150601 gartner cloud_summit_vfinal150601 gartner cloud_summit_vfinal
150601 gartner cloud_summit_vfinalMichael Burian
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?SnapLogic
 
Data Analytics in Digital Transformation
Data Analytics in Digital TransformationData Analytics in Digital Transformation
Data Analytics in Digital TransformationMukund Babbar
 

Similar to Pivotal Big Data Roadshow (20)

Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
 
A New Day for Oracle Analytics
A New Day for Oracle AnalyticsA New Day for Oracle Analytics
A New Day for Oracle Analytics
 
Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with Cloudera
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud Platform
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best Practices
 
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace
How Finance is Adopting Analytics, and Reacting to Changes in the Marketplace
 
Informatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemInformatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake Ecosystem
 
3.1 oracle salonika
3.1 oracle salonika3.1 oracle salonika
3.1 oracle salonika
 
10 Reasons Why Smart Organizations are Moving to Cloud BI
10 Reasons Why Smart Organizations are Moving to Cloud BI10 Reasons Why Smart Organizations are Moving to Cloud BI
10 Reasons Why Smart Organizations are Moving to Cloud BI
 
Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI
 
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...
The Enabling Power of Distributed SQL for Enterprise Digital Transformation I...
 
Agile EcoSystem
Agile EcoSystemAgile EcoSystem
Agile EcoSystem
 
Role of Data in Digital Transformation
Role of Data in Digital TransformationRole of Data in Digital Transformation
Role of Data in Digital Transformation
 
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...
Best Practices for Managing IaaS, PaaS, and Container-Based Deployments - App...
 
150601 gartner cloud_summit_vfinal
150601 gartner cloud_summit_vfinal150601 gartner cloud_summit_vfinal
150601 gartner cloud_summit_vfinal
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
 
Data Analytics in Digital Transformation
Data Analytics in Digital TransformationData Analytics in Digital Transformation
Data Analytics in Digital Transformation
 

More from VMware Tanzu

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItVMware Tanzu
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023VMware Tanzu
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleVMware Tanzu
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023VMware Tanzu
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductVMware Tanzu
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready AppsVMware Tanzu
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And BeyondVMware Tanzu
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfVMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023VMware Tanzu
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptxVMware Tanzu
 
Tanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchTanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchVMware Tanzu
 
Tanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishTanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishVMware Tanzu
 
Virtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVirtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVMware Tanzu
 
Tanzu Developer Connect - French
Tanzu Developer Connect - FrenchTanzu Developer Connect - French
Tanzu Developer Connect - FrenchVMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023VMware Tanzu
 
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootSpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootVMware Tanzu
 
SpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerSpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerVMware Tanzu
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeVMware Tanzu
 
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsSpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsVMware Tanzu
 

More from VMware Tanzu (20)

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About It
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at Scale
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a Product
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready Apps
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And Beyond
 
Pivotal Big Data Roadshow

  • 1. The Journey to Becoming a Data-Driven Enterprise. Pivotal Big Data Roadshow 2015
  • 2. Where we’re going today… 3 Great Keynotes: • Journey to a Data-driven Enterprise • Data Science Use Cases • Streaming Data and Internet of Things. Internet of Things Demo and Architecture Overview. Intensive hands-on training sessions.
  • 3. Today’s Agenda: 8:00 AM - 9:00 AM - Check-In & Breakfast; 9:00 AM - 10:15 AM - Journey to a Data-Driven Enterprise…; 10:15 AM - 10:30 AM - Coffee Break; 10:30 AM - 12:00 PM - Internet of Things Demo and Architecture Overview; 12:00 PM - 1:00 PM - Lunch & Birds of a Feather Discussion; 1:00 PM - 3:00 PM - Hands-On Technical Workshop
  • 4. “MASHING BIG DATA WITH BIG MACHINES IS ‘BEAUTIFUL, DESIRABLE, INVESTABLE’ - IT COULD TRANSFORM GE'S BUSINESS - AND THE ECONOMY.” JEFF IMMELT, CEO, GE
  • 5. ANALYZING INTERNET OF THINGS USING BIG DATA SUITE. Internet of Things matters for... • Industrial Manufacturers • Transportation • Healthcare, Life Sciences • Financial Services • Retail • Telecom and Media
  • 6. THE POWER OF 1: Driving Outcomes That Matter. Increasing Freight Utilization; Rail; Predictive Maintenance; Healthcare; Predictive Diagnostics; Power. One Percent Improvement Equals: $27B Industry Value by Reducing System Inefficiency; $63B Industry Value by Reducing Process Inefficiency; $66B Industry Value with Efficiency Improvements in Gas-fired Power Plant Fleets. Source: General Electric
  • 7. THE INTERNET OF THINGS JOURNEY. STORE: • Structured • Unstructured • High Volume • High Velocity. ANALYZE: • Predictive Analytics • Machine Learning • Advanced Data Science • Realtime Analytics. DEVELOP: • Advanced Analytic Pipelines • Realtime Analytical Applications • Global Scale Data-Driven Applications • Enterprise, Consumer, IoT, and Mobile. INNOVATE: • Agile Dev Expertise • DevOps • Hybrid Cloud • Continuous Delivery • Closed Loop Applications. AGILE DEVELOPMENT; BIG DATA; PREDICTIVE ANALYTICS; ENTERPRISE PAAS
  • 8. LARGE ENTERPRISE BIG DATA TROUBLE. 80% of CEOs think data mining and analysis are strategically important (1). But… 4% of companies use analytics effectively (2); 0% of CIOs think their IT infrastructure is fully prepared for big data (3); 30% of companies have deployed advanced analytics, 11% big data analysis (4); 44% of new applications failed to meet performance expectations (5); 90% of companies allocate at least 2X more cloud capacity than needed to ensure performance (6). Sources: (1) 2015 PwC CEO Survey; (2) 2013 Bain & Company - The Value of Big Data; (3) 2014 IT Infrastructure Conversation - IBM; (4) Ernst & Young - 2014 Enterprise IT Trends and Investments; (5) 2014 Riverbed Technologies - The Transformers; (6) 2014 ElasticHosts CIO Study
  • 9. THE DATA DIVIDE: BIG DATA CHASM. 70% of data generated by customers; 80% of data stored; 3% prepared for analysis; 0.5% being analyzed; <0.5% being operationalized
  • 10. SOFTWARE IS EATING THE WORLD. Software Is Eating The World; Data Is Fueling Software
  • 11. “WE CHOSE PIVOTAL BECAUSE WE BELIEVE IT PROVIDES A 360-DEGREE VIEW OF THE PROCESS. FROM A DATA SCIENCE AND DATA TECHNOLOGY PERSPECTIVE, IT MEANS DELIVERING BEST-IN-CLASS DATA TECHNOLOGIES AND ENABLING THEM ON THEIR PLATFORM.”
  • 12. ACROSS INDUSTRIES
  • 13. THE NEW DATA IMPERATIVES: Converged Data & Cloud; Open; Data-Driven Apps
  • 14. THE BIG DATA PROBLEM: Fragmentation; Constraints; Complexity
  • 15. GUIDING PRINCIPLES IN THE NEW ERA: OPEN, AGILE, CLOUD-READY. • Remove Lock-in • Leverage Ecosystem • Co-innovate • Shorten innovation cycles • Reduce TCO • Improve TTM • Solve business problems • Avoid lock-in • Appropriate security
  • 16. JOURNEY TO A DATA-DRIVEN ENTERPRISE: Deploy analytic apps and automate at scale; Perform advanced analytics; Discover insights; Modernize data infrastructure
  • 17. DATA-DRIVEN COMPANIES: USE MODERN DATA INFRASTRUCTURE. Deploy analytic apps and automate at scale; Perform advanced analytics; Discover insights; Modernize data infrastructure
  • 18. MODERNIZE DATA INFRASTRUCTURE (REQUIREMENTS and BENEFITS): Elastic, scale-out storage and processing; Flexible data types and pipelining; ETL on demand: low operational cost; Expanded use cases; Higher quality analytics; Lowered storage/processing cost; Less fragmented ecosystem; Reduced vendor lock-in; Cloud friendly and open-source based
  • 19. DATA-DRIVEN COMPANIES: STRATEGICALLY USE ADVANCED ANALYTICS. Modernize data infrastructure; Deploy analytic apps and automate at scale; Perform advanced analytics; Discover insights
  • 20. ADVANCED ANALYTICS (REQUIREMENTS and BENEFITS): Leverage existing skills and tools; Rapid time to insights; Internet of Things use cases; Rapid time to insights; Solve business problems; Predictive insights: proactive execution; Machine learning and advanced analytics; SQL-compliant batch and interactive queries; Massive stream processing
  • 21. DATA-DRIVEN COMPANIES: INNOVATE AT SCALE. Modernize data infrastructure; Perform advanced analytics; Discover insights; Deploy analytic apps and automate at scale
  • 22. ANALYTIC APPS AND AUTOMATION AT SCALE (REQUIREMENTS and BENEFITS): Reduced time to action; Low ‘analytics ↔ app-dev’ integration cost; Reduced time to insights; Flexible ingestion: low operating cost; High performance: low operating cost; Transactional safety: business critical ops; Low-latency, distributed in-memory transactions; Resilient, scale-out messaging and object storage; Agile analytic app-dev with enterprise PaaS
  • 23. JOURNEY TO A DATA-DRIVEN ENTERPRISE: Deploy analytic apps and automate at scale; Perform advanced analytics; Discover insights; Modernize data infrastructure. Pivotal Data Science helps you move from BI to Data Science. Pivotal Labs helps you move to an agile development of apps at scale. Pivotal Data Engineering helps you move from data administration to data engineering.
  • 24. GUIDING PRINCIPLES IN THE NEW ERA: OPEN, AGILE, CLOUD-READY. • Remove Lock-in • Leverage Ecosystem • Co-innovate • Shorten innovation cycles • Reduce TCO • Improve TTM • Solve business problems • Avoid lock-in • Appropriate security
  • 25. PIVOTAL BIG DATA SUITE. OPEN: World’s First Open Sourced, Enterprise-Class Data Portfolio + Open Data Platform. AGILE: Modern Data Infrastructure + Advanced Analytics + Apps at Scale. CLOUD-READY: Multiple Cloud Deployment Models + Big Data Suite on Pivotal Cloud Foundry
  • 26. PIVOTAL BIG DATA SUITE
  • 27. WORLD’S FIRST OPEN SOURCED BIG DATA PORTFOLIO. Open sourcing all Pivotal Big Data Suite components including: Pivotal GemFire, Pivotal HAWQ, Pivotal Greenplum Database. BUILDING ON SUCCESS OF CLOUD FOUNDRY FOUNDATION. BUILT FOR ENTERPRISES.
  • 28. BUILT FOR ENTERPRISES. Value added features: enterprise grade performance + robustness without lock-in • Advanced Query Optimization in analytics • WAN replication and continuous query in transactional processing. Flexible deployment models: align to business objectives and needs • Balance cost objectives with policy and compliance requirements • Leverage Pivotal’s pre-integration + certification on supported configurations. Enterprise grade support: one throat to choke for the suite • Focus on business problems, not on lifecycle management • Expert support on Big Data Suite means reduced business risk
  • 29. OPEN: OpenDataPlatform.org • Common core for Hadoop ecosystem • Rapidly accelerated certifications, ecosystem development and enterprise-grade quality
  • 30. AGILE. Deploy analytic apps and automate at scale; Perform advanced analytics; Discover insights; Modernize data infrastructure. Spring XD, Spark, Pivotal HD & Open Data Platform, Pivotal Greenplum Database, Pivotal HAWQ, RabbitMQ, Redis, Pivotal GemFire, Pivotal BDS on PCF, Pivotal Cloud Foundry
  • 31. CLOUD-READY: COMMODITY HARDWARE; APPLIANCE; HYBRID CLOUD; CLOUD; IaaS; IaaS; PaaS
  • 32. THE INTERNET OF THINGS JOURNEY WITH PIVOTAL BIG DATA SUITE. STORE: • Structured • Unstructured • High Volume • High Velocity. ANALYZE: • Predictive Analytics • Machine Learning • Advanced Data Science • Realtime Analytics. DEVELOP: • Advanced Analytic Pipelines • Realtime Analytical Applications • Global Scale Data-Driven Applications • Enterprise, Consumer, IoT, and Mobile. INNOVATE: • Agile Dev Expertise • DevOps • Hybrid Cloud • Continuous Delivery • Closed Loop Applications. AGILE DEVELOPMENT; BIG DATA; PREDICTIVE ANALYTICS; ENTERPRISE PAAS. Spring XD, Spark, Pivotal HD & Open Data Platform, Spring XD, Pivotal Greenplum Database, Pivotal HAWQ, Spring XD, Pivotal GemFire, Redis, RabbitMQ, Spring IO, Groovy, Pivotal BDS on PCF, Pivotal Cloud Foundry, Pivotal Labs, Data Science, Data Engineering
  • 33. WHY PIVOTAL FOR BIG DATA? Complete platform; SQL on Hadoop leadership; Deployment options; Open source; Flexible licensing; Advanced data services; Pivotal Data Engineering; Pivotal Labs; Pivotal Data Science
  • 34. WHY PIVOTAL FOR BIG DATA? Complete platform; SQL on Hadoop leadership; Deployment options; Open source; Flexible licensing; Advanced data services; Pivotal Data Engineering; Pivotal Labs; Pivotal Data Science
  • 35. FOR FURTHER INFO, CHECK OUT… • Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data • Pivotal Blog @ http://blog.pivotal.io • Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal • Pivotal Academy @ https://pivotal.biglms.com • Or reach out to your local Pivotal Account Executive…
  • 36. Pivotal Data Science Overview and Use Cases. Pivotal Big Data Roadshow
  • 37. DATA SCIENCE? App Development, Analytics, Business Intelligence, Reporting, Visualization, Dashboards, Insights, Big Data, Machine Learning, Statistics, Mathematics, Time Series, Algorithms, Databases, Software, Modeling, Queries, Real-Time, Sensors, Predictive Models, ETL, Research, Hadoop, Distributed Computing, MapReduce, SQL, In-Memory, OLAP, Text Mining, Unstructured Data, Open Source, Decision Science, Ad Hoc Queries, Hacking, In-Database Analytics, Internet of Things, Data Cleansing, Sentiment
  • 38. Data Related: • ETL • Unstructured • Data Cleansing • Sensors. Fields of Study & Techniques: • Algorithms • Mathematics • Time Series • Statistics • Predictive Modeling • Machine Learning • Text Mining • Sentiment • Map Reduce. Business Intelligence: • Dashboards • Insights • Visualization • Ad Hoc Queries • Reporting. Implementation: • Software • In-Database Analysis • Distributed Computing • Hadoop • Open Source. Industry Buzzwords: • Big Data • Decision Science • Internet of Things • Real-Time • Hacking • In-Memory
  • 39. What is Data Science? The use of statistical and machine learning techniques on big multi-structured data in a distributed computing environment to identify correlations and causal relationships, classify and predict events, identify patterns and anomalies, and infer probabilities, interest, and sentiment. DRIVE AUTOMATED, LOW-LATENCY ACTIONS IN RESPONSE TO EVENTS OF INTEREST
  • 40. Billions of Data Points. Gene Sequencing: COST TO SEQUENCE ONE GENOME HAS FALLEN FROM $100M IN 2001 TO $10K IN 2011 TO $1K IN 2014. Smart Grids: READING SMART METERS EVERY 15 MINUTES IS 3000X MORE DATA INTENSIVE. Stock Market. Social Media: FACEBOOK UPLOADS 250 MILLION PHOTOS EACH DAY. Oil Exploration: OIL RIGS GENERATE 25000 DATA POINTS PER SECOND. Video Surveillance. Medical Imaging. Mobile Sensors.
  • 41. What is Big Data Analytics? Descriptive Analytics: WHAT HAPPENED? Diagnostic Analytics: WHY DID IT HAPPEN? Predictive Analytics: WHAT WILL HAPPEN? Prescriptive Analytics: HOW CAN WE MAKE IT HAPPEN? (Axes: Complexity vs. Value of Analytics ($))
  • 42. Data Science Toolkit. PLATFORM: Pivotal Big Data Suite. KEY TOOLS. KEY LANGUAGES: SQL
  • 43. Analytics with Pivotal: A single address for everything analytics. Time-to-Insights: FORECASTING, CLUSTERING, REGRESSION, CLASSIFICATION, OPTIMIZATION
  • 44. Smart Systems = Sensors + Digital Brain + Actuators. Data Science for Building Models: Problem Formulation; Modeling Step; Data Step; Application Step. Sensors & Actuators; Data Lake
  • 45. Data Science Use Cases
  • 46. Smart Meter Analytics
  • 47. The Digital Brain: Making a Smart Grid Smarter! Input: Data from smart meters. The Digital Brain: Uses a Fourier transform to extract patterns and flag outliers/anomalies. Action: Where (and when) to send trucks, preventive maintenance.
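The slide does not say how the Fourier transform is applied, so the following is only a minimal sketch of one plausible reading: keep the strongest frequency components as the "pattern" and flag readings whose residual from that pattern is unusually large. The function name `flag_outliers`, the parameters `keep` and `threshold`, and the synthetic week of 15-minute readings are all assumptions made for illustration; NumPy is assumed to be available.

```python
import numpy as np

def flag_outliers(readings, keep=3, threshold=4.0):
    """Return indices of readings that deviate strongly from the periodic pattern.

    Hypothetical helper: `keep` and `threshold` are illustrative knobs,
    not values taken from the slides.
    """
    readings = np.asarray(readings, dtype=float)
    spectrum = np.fft.rfft(readings)

    # Keep the mean (bin 0) plus the `keep` strongest frequency components.
    magnitudes = np.abs(spectrum)
    magnitudes[0] = np.inf
    strongest = np.argsort(magnitudes)[-(keep + 1):]
    filtered = np.zeros_like(spectrum)
    filtered[strongest] = spectrum[strongest]

    # Reconstruct the dominant pattern and measure how far each reading is from it.
    pattern = np.fft.irfft(filtered, n=len(readings))
    residual = readings - pattern

    # Robust scale estimate (median absolute deviation) so a spike cannot hide itself.
    centered = np.abs(residual - np.median(residual))
    mad = np.median(centered)
    scale = 1.4826 * mad if mad > 0 else 1.0
    return np.flatnonzero(centered > threshold * scale)

# Synthetic example: one week of 15-minute readings with a daily cycle.
rng = np.random.default_rng(0)
t = np.arange(7 * 96)
readings = 2.0 + np.sin(2 * np.pi * t / 96) + rng.normal(0, 0.05, t.size)
readings[300] += 5.0  # injected anomaly
outliers = flag_outliers(readings)
```

Using the median absolute deviation rather than the standard deviation keeps a single large spike from inflating the threshold that is supposed to detect it.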
  • 48. Smart Meter Analytics – Significant Use Cases • Load profiling • Theft prevention • Demand prediction • Load forecasting • Root cause of power failures • Black-out warning • Anomaly detection • Network topology error detection
  • 49. Electricity Network Load Profiling and Outlier Detection. CUSTOMER: A major smart grid infrastructure provider. BUSINESS PROBLEM: Profile power consumption patterns based on smart meter data and flag anomalous usage. CHALLENGES: • Large volume of smart meter data (several months of data from hundreds of thousands of meters) could not be analyzed effectively by the legacy system • Timely business insights on large scale smart grid infrastructure demand fast processing of data. SOLUTION: • Analyze smart meter power data using unsupervised clustering techniques and detect anomalies based on a distance metric in clusters • Reduce time required to monitor and improve grid efficiencies • Leveraged the MPP architecture of Pivotal GPDB and the MADlib in-database machine learning library for fast computation at scale
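The clustering-plus-distance idea in this slide can be sketched as follows. This is not the Greenplum/MADlib implementation the slide refers to; it is a small NumPy stand-in, assuming k-means as the clustering technique and Euclidean distance to the nearest centroid as the anomaly score. The names `cluster_and_score` and `flag_anomalies`, the naive "first k rows" initialization, and the synthetic daily load profiles are all hypothetical.

```python
import numpy as np

def cluster_and_score(profiles, k, iterations=20):
    """Plain k-means; returns (cluster label, distance to nearest centroid) per profile.

    Assumption: centroids start from the first k rows, a deliberately naive
    initialization chosen to keep the sketch deterministic.
    """
    profiles = np.asarray(profiles, dtype=float)
    centroids = profiles[:k].copy()
    for _ in range(iterations):
        dist = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = profiles[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    dist = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1), dist.min(axis=1)

def flag_anomalies(scores, z=3.0):
    """Flag profiles whose distance score is more than z standard deviations above the mean."""
    return np.flatnonzero(scores > scores.mean() + z * scores.std())

# Synthetic 24-hour load profiles: a midday-peak group, an evening-peak group,
# and one meter with a flat, abnormally high load.
rng = np.random.default_rng(1)
hours = np.arange(24)
day = np.exp(-((hours - 13) ** 2) / 18.0) + rng.normal(0, 0.05, (50, 24))
evening = np.exp(-((hours - 20) ** 2) / 8.0) + rng.normal(0, 0.05, (50, 24))
flat = np.full((1, 24), 3.0)
profiles = np.vstack([day[:1], evening[:1], day[1:], evening[1:], flat])

labels, scores = cluster_and_score(profiles, k=2)
anomalies = flag_anomalies(scores)
```

At the scale described in the slide, the same two steps would run inside the database rather than in a single process; the sketch only shows the logic.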
  • 50. Electricity Network Load Profiling and Outlier Detection: Dashboards for navigating clusters and outliers
  • 51. Network Topology Error Detection. CUSTOMER: A major utility. BUSINESS PROBLEM: Use load and voltage meter readings to determine errors in transformer network topologies. CHALLENGES: • Detecting network topology errors across the entire network is time-consuming in the legacy system • Timely detection of network topology errors requires big data infrastructure and analytical capabilities. SOLUTION: • For each transformer network in parallel, solve a linear program (LP) to determine the scale of topology error, which can be used to flag and rank anomalous network topologies • Reduce time for topology error detection from several days/weeks to a few minutes!
  • 52. Security and Fraud
  • 53. Advanced Persistent Threat (APT): APT Kill Chain. (1) Phishing & Zero Day Attack: A handful of users are targeted by two phishing attacks; one user opens the zero-day payload (CVE-2011-0609). (2) Back Door: The user machine is accessed remotely by the Poison Ivy tool. (3) Lateral Movement: Attacker elevates access to important user, service and admin accounts, and specific systems. (4) Data Gathering: Data is acquired from target servers and staged for exfiltration. (5) Exfiltrate: Data is exfiltrated via encrypted files over FTP to an external, compromised machine at a hosting provider.
  • 54. Anomalous User-to-Resource Access Detection. BUSINESS PROBLEM: Detect anomalous user behaviors in the global enterprise computer network. SUMMARY: Given local-to-local communication data, identify anomalous users within an enterprise. • Reduce malware dwell time, typically 243 days • Signature-based approaches cannot detect such behavior. CHALLENGES: 10 billion events in 6 months; 15K+ network devices; no existing SIEM solution can model a baseline of user resource-access behavior and enable anomaly detection in an adaptive and scalable architecture. SOLUTION: An innovative graph-mining-based algorithmic framework with advanced machine learning. Network topology and temporal behaviors are both modeled (patent pending). Implemented in MPP and PL/R, enabling parallel model training and behavior risk scoring. Successfully identified DLP-violating anomalous users.
  • 55. Financial Services
  • 56. Identifying and Pricing Cross-Sell Opportunities. CUSTOMER: A global financial services provider. BUSINESS PROBLEM: Identify cross-sell opportunities between two business arms of a financial institution. CHALLENGES: Integration of large-scale data originating from multiple data warehouses. Developing predictive models to identify novel cross-sell opportunities within the financial institution. Evaluate the identified cross-sell opportunities by their revenue potential. SOLUTIONS: • Fast integration of data in Pivotal Greenplum Database • Predictive models and evaluation of profitability: association rule; logistic regression for each product offered; estimation of revenue opportunity • On-demand reporting and visualization via custom dashboards connected to in-database models
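As an illustration of the association-rule step named above, the standard support, confidence, and lift measures for a rule "customers holding A also hold B" can be computed as below. This is a sketch only: the function `rule_metrics`, the product names, and the five customer records are invented for the example and do not come from the engagement described in the slide.

```python
def rule_metrics(holdings, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.

    `holdings` is a list of sets, one set of product names per customer
    (hypothetical data layout for this sketch).
    """
    n = len(holdings)
    has_a = sum(1 for h in holdings if antecedent <= h)
    has_b = sum(1 for h in holdings if consequent <= h)
    has_both = sum(1 for h in holdings if (antecedent | consequent) <= h)

    support = has_both / n                           # share of customers holding both
    confidence = has_both / has_a if has_a else 0.0  # P(B | A)
    lift = confidence / (has_b / n) if has_b else 0.0  # P(B | A) / P(B)
    return support, confidence, lift

# Invented product holdings for five customers.
holdings = [
    {"checking", "brokerage"},
    {"checking", "brokerage"},
    {"checking"},
    {"mortgage"},
    {"checking", "mortgage", "brokerage"},
]
support, confidence, lift = rule_metrics(holdings, {"checking"}, {"brokerage"})
```

A lift above 1 suggests the two products are held together more often than independence would predict, which is the kind of signal a cross-sell model would then price by revenue potential.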
  • 57. Credit Risk Assessment and Stress Testing. CUSTOMER: A global financial services provider. BUSINESS PROBLEM: Speed up the process of compliance reporting and stress testing for Basel III. CHALLENGES: Running the calculation procedures on the customer’s legacy database was time-consuming and therefore had to be done in overnight batch mode. SOLUTION: • Implement risk asset calculation and stress testing on Pivotal Greenplum Database • Three years of data was processed in well under 2 minutes, significantly faster than the customer’s current procedures • Connect an “in-database” visualization tool to Pivotal Greenplum Database via ODBC for on-demand reporting and visualization
  • 58. Financial Compliance. BUSINESS PROBLEM: • Ensure compliance with Dodd-Frank and Basel Committee regulations • Identify underlying risk and fraud while reducing the compliance department’s overburdened. Data sources: Emails, Chats, Trades, Transactions, Policy, Securities, Phone Calls, Watch Lists, … Financial compliance Data Lake: Data integration; Data clean up; Modeling; Classification and ranking; Analyst user interfaces; Feedback; Analytics; Analyst feedback. Data integration: e.g., append trade information with email and chat communications. Data cleanup: e.g., identify newsletters and spam emails. Modeling: • Predictive modeling to flag messages and trades • Graph and cohort analysis. Analyst feedback: Reviewed fraud instances included in periodic model refreshes. SOLUTION: • A data lake platform coupled with cutting edge data science techniques • Flexible user interface to promote an adaptive, continuously learning compliance framework
  • 59. Pivotal Topic & Sentiment Analysis Engine. Twitter Decahose (~55 million tweets/day) → Spring XD (Source: http, Sink: hdfs) → HDFS → PXF External Tables → HAWQ: Parallel Parsing of JSON (PL/Python); Topic Analysis through MADlib pLDA; Unsupervised Sentiment Analysis (PL/Python); Nightly Cron Jobs → D3.js
  • 60. Check out the Pivotal Data Science Blog! http://blog.pivotal.io/data-science-pivotal
  • 61. FOR FURTHER INFO… • Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data • Pivotal Blog @ http://blog.pivotal.io • Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal • Pivotal Academy @ https://pivotal.biglms.com • Or reach out to your local Pivotal Account Executive…
  • 62. Data Streaming and IoT Using Pivotal Big Data Suite
  • 63. Converging Trends Innovation New Data New Processes New Insights The Journey to the Data-Driven Enterprise Data Science and Machine Learning Big Data Internet of Things
  • 64. HDFS Data Lake Ingest Store Analytics Hard to change Labor intensive Inefficient Coding based No real-time information Based on expensive ETL Migrating from a Reactive, Static and Constrained Model…
  • 65. HDFS Data Lake Expert System / Machine Learning In-Memory Real-Time Data Continuous Learning Continuous Improvement Continuous Adapting Data Stream Pipeline Multiple Data Sources Real-Time Processing Store Everything To Proactive, Self-Improving, Machine Learning Systems
  • 66. New York Times Research: http://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html “50-80% OF THE TIME ON DATA SCIENCE PROJECTS IS SPENT ON DATA WRANGLING”
  • 67. Data Feeds Stream Processing Expert Systems Machine Learning Historical Data Business Value Smart Decisions Still… HDFS Data Lake
  • 68. Ingest Transform Sink Spring XD GemFire Data Stream Needs an Agile, Scalable and Fast Solution HAWQ GPDB Data Lake
  • 69. Ingest Transform Sink Spring XD Distributed Computing In-Memory Real-Time Data Spring XD Orchestrates and Automates all the Steps on Data Stream Pipelining Expert System / Machine Learning Extensible Open-Source Fault-Tolerant Horizontally Scalable HAWQ GPDB Data Lake
  • 70. Spring XD: State of the Art Data Pipeline Automation •  No coding required INGEST / SINK: •  Dozens of built-in connectors •  Seamless integration with Kafka, Sqoop •  Create new connectors easily using Spring PROCESS: •  Call Spark, Reactor or RxJava •  Built-in configurable filtering, splitting and transformation •  Out-of-box configurable jobs for batch processing ANALYZE: •  Import and invoke PMML jobs easily •  Call Python, R, MADlib and other tools •  Built-in configurable counters and gauges
  • 71. Ingest Transform Sink Spring XD Distributed Computing GemFire Provides Scalable, Low-Latency Data Access, Storage and Event Processing Expert System / Machine Learning GemFire Extensible Open-Source Fault-Tolerant Horizontally Scalable HAWQ GPDB Data Lake
  • 72. GemFire •  In-Memory Enterprise Data Grid •  Horizontally Scalable, Consistent, Highly Available •  Event handling •  Continuous Queries •  Enterprise Data Geo Distribution In-memory Real Time Data
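The "Continuous Queries" idea listed above can be sketched conceptually in Python: a client registers a predicate once, and a listener fires whenever newly written data matches it, instead of the client polling. This is a toy, single-process illustration only. The `Region` class and its `register_cq`, `put`, and `get` methods are hypothetical names invented here and are not the GemFire API.

```python
class Region:
    """A toy in-memory key-value store with continuous-query callbacks."""

    def __init__(self):
        self._data = {}
        self._queries = []  # list of (predicate, listener) pairs

    def register_cq(self, predicate, listener):
        """Register a predicate; the listener runs on every matching write."""
        self._queries.append((predicate, listener))

    def put(self, key, value):
        """Store the value, then notify every query whose predicate matches."""
        self._data[key] = value
        for predicate, listener in self._queries:
            if predicate(value):
                listener(key, value)

    def get(self, key):
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)


# Usage: collect the keys of turbines reporting a temperature above 600.
alerts = []
region = Region()
region.register_cq(lambda v: v["temp_c"] > 600, lambda k, v: alerts.append(k))
region.put("turbine-1", {"temp_c": 550})
region.put("turbine-2", {"temp_c": 640})
```

After the two writes, only `turbine-2` has triggered the listener. The 600-degree threshold and the `temp_c` field are made-up values for the example.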
  • 73. Ingest Transform Sink Spring XD Distributed Computing Pivotal Provides SQL Based Advanced Analytics Expert System / Machine Learning GemFire Extensible Open-Source Fault-Tolerant Horizontally Scalable Data Lake HAWQ GPDB
  • 74. HAWQ: Advanced SQL analytics in Hadoop •  Massively Parallel Processing RDBMS on Hadoop •  ANSI SQL on Hadoop •  Extremely high performance for analytics (not like Hive) •  Stores all data directly on HDFS •  Open-Source. Combining SQL with Hadoop is key for analytics. SQL remains #1 choice for Data Science.
  • 75. Ingest Transform Sink Spring XD Developers and Data Scientists Can Focus on the Business Value of Data GemFire Extensible Open-Source Fault-Tolerant Horizontally Scalable Data Lake HAWQ GPDB
  • 76. Data Streaming Reference Architecture Data Feeds Transactional Apps Analytic Apps Data Stream Pipeline Distributed Computing Real-Time Data Expert Systems & Machine Learning Advanced Analytics HDFS Data Lake
  • 77. Data Streaming Reference Architecture Data Feeds Transactional Apps Analytic Apps Data Stream Pipeline HDFS Data Lake GemFire HAWQ GPDB Spring XD
  • 78. “SO WE ARE MOVING TO A WORLD WHERE THE MACHINES WE WORK WITH ARE NOT JUST INTELLIGENT; THEY ARE BRILLIANT. THEY ARE SELF-AWARE, THEY ARE PREDICTIVE, REACTIVE AND SOCIAL. IT'S A WORLD WHERE INFORMATION ITSELF BECOMES INTELLIGENT AND COMES TO US AUTOMATICALLY WHEN WE NEED IT WITHOUT HAVING TO LOOK FOR IT.” MARCO ANNUNZIATA, GE
  • 79. The IoT Market Verticals Diversified Industrial Manufacturers Agriculture, Security, Retail Auto Manufacturers Media (via Mobile devices) Urban Infrastructure, Cities Consumer, Connected Home etc. Healthcare, Life Sciences
  • 80. “THE MAGIC HAPPENS WHEN YOU MARRY THE TRADITIONAL ENGINEERING APPROACH WITH THE DATA SCIENCE ENABLED BY THE DATA LAKE. IT OPENS UP A WHOLE NEW WORLD OF POSSIBLE ‘WHAT IF’ QUESTIONS.” DAVE BARTLETT, GE AVIATION
  • 81. GE Aviation – Big Data & IoT •  Goal •  Improve jet engine efficiency and increase service profitability •  Unable to store & analyze massive amounts of data for analytics •  Solution •  LARGE DATA SETS ingested via batch •  Store 100s TB of engine data in Hadoop (PHD) •  Open doors for industrial engineers to poke at data (HAWQ) •  FAST MACHINE LEARNING based algorithms •  2000x faster, 10x cheaper •  Customer portals for visibility
  • 82. “THE REAL OPPORTUNITY FOR CHANGE...SURPASSING THE MAGNITUDE OF THE CONSUMER INTERNET...IS THE INDUSTRIAL INTERNET, AN OPEN, GLOBAL NETWORK THAT CONNECTS PEOPLE, DATA AND MACHINES.” JEFF IMMELT, CEO, GE
  • 83. GE Energy – Fast Data & IoT •  Goal •  Failing gas turbines causing issues with power generation •  Unable to store & process fire-hose of data •  Solution •  HIGH VELOCITY data ingestion from Gas Turbines •  Store 10 TB of turbine data in memory (GemFire) •  VERY LOW LATENCY and HIGH SPEED data access •  “Predictive” Maintenance
  • 84. “… YOU USE THOSE DEVICES TO INSPECT CARS AND KEEP TRACK OF INSPECTION RECORDS. THAT CAN THEN FLOW INTO ASSET MANAGEMENT SOFTWARE TO PROVIDE PREDICTIVE ANALYSIS OF WHEN THE ASSET NEEDS TO BE MAINTAINED ... WHEN YOU BOIL THAT DOWN TO BUSINESS, IT’S ABOUT COMPETITIVE ADVANTAGE.” BRAD HOWELL, LODESTAR LOGISTICS
  • 85. GE Transportation – Big & Fast Data •  Goal •  Help rail companies manage locomotives better •  Fast data from tracks & Big data from sensors in locomotives •  Solutions •  MACHINE LEARNING MODEL built from combined data set (MADlib) •  REAL-TIME SCORING of rail sensor data (Spring XD) •  REAL-TIME ALERTING of critical events via email & a REAL-TIME DASHBOARD (Spring)
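The pattern on the slide above (train a model offline, then score sensor readings as they arrive and alert on critical ones) can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not GE's or Pivotal's implementation: the logistic-regression form, the coefficients, the feature names `vibration` and `temperature`, and the 0.5 alert threshold are all made up for the example.

```python
import math

# Hypothetical coefficients standing in for a model trained offline.
WEIGHTS = {"vibration": 0.8, "temperature": 0.05}
INTERCEPT = -6.0
ALERT_THRESHOLD = 0.5  # assumed cut-off for raising an alert


def failure_probability(reading):
    """Apply the logistic model to one sensor reading (a dict of features)."""
    z = INTERCEPT + sum(WEIGHTS[name] * reading[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def score_stream(readings):
    """Lazily yield (reading, probability, alert) for each incoming reading."""
    for reading in readings:
        probability = failure_probability(reading)
        yield reading, probability, probability >= ALERT_THRESHOLD


# Usage: two made-up readings; only the second crosses the threshold.
sample = [
    {"vibration": 1.0, "temperature": 40.0},
    {"vibration": 5.0, "temperature": 90.0},
]
alerts = [alert for _, _, alert in score_stream(sample)]
```

Because `score_stream` is a generator, it scores each reading as it is consumed rather than waiting for a full batch, which is the property that distinguishes stream scoring from the nightly batch jobs described earlier in the deck.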
  • 86. “WE EXPECT THE PRECISION AGRICULTURE SPACE TO CONTINUE TO GROW QUICKLY AS DATA BECOMES CHEAPER TO STORE AND EASIER TO MOVE FROM PLATFORM TO PLATFORM. WE ARE JUST BEGINNING TO EXPLORE ALL THE VALUE WE CAN CREATE FOR FARMERS WITH THESE TOOLS.” BRETT BEGEMANN, MONSANTO
  • 87. Monsanto – Agriculture & IoT •  Goal •  Help farmers maximize crop yields •  Solution •  Use of Big Data to collect & store data from farm equipment •  Combine with climate and other information •  Build custom apps using an agile app dev platform (PCF)
  • 88. IoT - Need for a Platform A Better Customer Experience Pivotal Cloud Foundry Pivotal Big Data Suite Using Innovative Data-Driven Apps On an Integrated Platform
  • 89. The Connected Car Architecture INGESTION JSON / HTTP STREAM PROCESSING Spring XD Transform Enrich DATA LAKE Pivotal HD Sink ADVANCED ANALYTICS HAWQ REAL-TIME DATA INSIGHTS GemFire MOBILE SERVICES MICROSERVICES Pivotal CF Dashboard Analytics App Simulator IoT APPS RabbitMQ PUSH
  • 90. Horizontally Scalable Fault Tolerant Extensible Open-Source STREAM PROCESSING Spring XD RabbitMQ DATA LAKE Pivotal HD ADVANCED ANALYTICS HAWQ ENRICHER PREDICTIVE ANALYTICS + Timestamp & GUID + MPG, Range & Route MOBILE APP JSON REAL-TIME DATA INSIGHTS GemFire CAR SENSOR Sink Tap DASHBOARD
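The enrichment stage on the slide above (add a timestamp and GUID, then attach MPG and range figures) can be sketched in Python as follows. This is an illustrative stand-in rather than the demo's actual code: the field names `miles_driven`, `fuel_used_gal`, and `fuel_remaining_gal` are assumptions, and the MPG and range values here are plain arithmetic, where the slide indicates the real system derives them from predictive analytics. Route prediction is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def enrich(message):
    """Parse a JSON sensor message and attach a GUID and a UTC timestamp."""
    event = json.loads(message)
    event["guid"] = str(uuid.uuid4())
    event["timestamp"] = datetime.now(timezone.utc).isoformat()
    return event


def add_estimates(event):
    """Return a copy of the event with simple MPG and range estimates.

    Assumed input fields: miles_driven, fuel_used_gal, fuel_remaining_gal.
    """
    enriched = dict(event)
    fuel_used = event["fuel_used_gal"]
    # Guard against division by zero when no fuel has been used yet.
    mpg = event["miles_driven"] / fuel_used if fuel_used > 0 else None
    enriched["mpg"] = mpg
    enriched["range_miles"] = (
        mpg * event["fuel_remaining_gal"] if mpg is not None else None
    )
    return enriched


# Usage with a made-up sensor message.
raw = '{"vin": "TEST", "miles_driven": 120.0, "fuel_used_gal": 4.0, "fuel_remaining_gal": 6.0}'
result = add_estimates(enrich(raw))
```

Keeping the two steps as separate functions mirrors the slide's split between the enricher and the analytics stage, so either one could be swapped out independently.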
  • 91. FOR FURTHER INFO, CHECK OUT… •  Pivotal Data Product Info, Docs and Downloads @ http://pivotal.io/big-data •  Pivotal Blog @ http://blog.pivotal.io •  Pivotal Data Science Blog @ http://blog.pivotal.io/data-science-pivotal •  Pivotal Academy @ https://pivotal.biglms.com •  Or reach out to your local Pivotal Account Executive…
  • 92. BUILT FOR THE SPEED OF BUSINESS