Agile Deployment of Predictive Analytics on Hadoop

Faster Insights through Open Standards
Hadoop Summit 2012



     © 2012 Datameer, Inc. All rights reserved.

Today's Session

    Ulrich Rueckert, Data Scientist, Datameer
    Michael Zeller, CEO, Zementis



    After this session, you will be able to…

    1.  Effectively deliver predictive solutions combining:
             a.  R, KNIME & Others               [Model Development]
             b.  Zementis Universal PMML Plug-in [Model Deployment & Execution]
             c.  Datameer                        [Scalable Hadoop Infrastructure]

    2.  Identify PMML as a vendor-neutral & open standard to:
             a.  Incorporate predictive models from virtually any commercial vendor or open source tool
             b.  Apply such models on Big Data

    3.  Leverage a lightweight, agile deployment process for predictive analytics to:
             a.  Accelerate time-to-market
             b.  Lower cost and complexity
             c.  Reuse existing predictive assets

Who is Datameer?

     •  "Business Intelligence on top of Hadoop"
     •  Established 2009 by Hadoop and enterprise software veterans
     •  Offices in Silicon Valley, New York and Germany
     •  Some customers: (logos shown on the slide)




Who is Zementis?

     •  Focus on Operational Predictive Analytics
     •  Offices in San Diego and Hong Kong
     •  Predictive Analytics Software Technology:
          -  ADAPA® Decision Engine (Predictive Models and Rules)
          -  ADAPA Add-in for Excel
          -  PMML Converter
          -  Universal PMML Plug-in (UPPI)
     •  Global Partner Network




Big Data and Analytics


        •  People and Sensor Data
             -  Transaction records
             -  Social media
             -  Climate information
             -  Mobile GPS signals
             -  Healthcare
             -  Smart Grid

        Callout: 90% of the data today was created in the last 2 years

        •  Benefits from Analytics
             -  Descriptive Analytics answers "What happened?"
             -  Predictive Analytics answers "What will happen next?"


Operational Predictive Analytics

    [Figure: Score Distribution, 1st Lien Stand-Alone Loans. x-axis: Score (50 to 1000); y-axis: % Within Class (0% to 14%); series: Goods and Bads, each with a polynomial trend line.]

    [Figure: % of Delinquent Loans per Month. x-axis: Months (Jan to Nov); y-axis: % of Delinquent Loans (0 to 90); legend: 700, 750, 800, 850, 900, 950.]

From Model Building to Deployment

    [Diagram: Model Building (left) exports PMML to Model Deployment: Integration / Execution (right), where a Datameer Server hosts the UPPI together with several PMML (models).]



                                                          Simple Deployment & Execution
                                                          1.  Upload PMML file(s) in DAS
                                                          2.  PMML turns into custom function
                                                          3.  Seamlessly score data in Datameer
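
What "PMML turns into custom function" could look like in miniature is sketched below in Python. This is not the UPPI implementation or any Datameer API. It assumes a single-table PMML RegressionModel, and the field names (age, visits, score), the coefficients, and the helper name pmml_to_function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.1 regression model; fields and coefficients are invented.
PMML_TEXT = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="toy response score"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="visits"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="2.0"/>
      <NumericPredictor name="visits" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def pmml_to_function(pmml_text):
    """Turn a single-table PMML RegressionModel into a Python scoring function."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("this sketch only handles a single-table RegressionModel")
    intercept = float(table.get("intercept"))
    # Each NumericPredictor contributes coefficient * value ** exponent
    # (the exponent attribute defaults to 1 in PMML).
    terms = [
        (p.get("name"), float(p.get("coefficient")), int(p.get("exponent", "1")))
        for p in table.findall("pmml:NumericPredictor", NS)
    ]

    def score(record):
        return intercept + sum(coef * record[name] ** exp for name, coef, exp in terms)

    return score


score = pmml_to_function(PMML_TEXT)
result = score({"age": 30, "visits": 100})  # 10.0 + 2.0*30 + 0.5*100
```

Once the model is a plain function, scoring a data set is a matter of applying it row by row, which is the property that lets a platform expose it like any other spreadsheet-style function.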

PMML
Predictive Model Markup Language



     •  PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.

     •  Mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.

     •  Supported by all leading data mining tools, commercial and open-source.

     •  Allows for the clear separation of tasks: Model development vs. model deployment.

     •  Eliminates the need for custom code and proprietary model deployment solutions.

     •  Uniform deployment platform ensures scalability and reliability of model execution.

     PMML book available on Amazon.com
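
To make "XML-based" and "share between compliant applications" concrete, here is a small Python sketch that reads a PMML document without knowing which tool produced it. The document itself is a hypothetical, hand-written PMML 4.1 tree model (the fields age and responded are invented), and describe is an illustrative helper, not part of any PMML product.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML 4.1 document; not exported from a real tool.
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="hypothetical campaign response tree"/>
  <DataDictionary numberOfFields="2">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="responded" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="responded" usageType="predicted"/>
    </MiningSchema>
    <Node score="no">
      <True/>
      <Node score="yes">
        <SimplePredicate field="age" operator="lessThan" value="40"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
"""

# Top-level PMML children that are not models themselves.
NON_MODEL = {"Header", "MiningBuildTask", "DataDictionary",
             "TransformationDictionary", "Extension"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def describe(pmml_text):
    """Summarize version, data fields, and model types of a PMML document."""
    root = ET.fromstring(pmml_text)
    fields = {}
    models = []
    for child in root:
        name = local_name(child.tag)
        if name == "DataDictionary":
            for field in child:
                if local_name(field.tag) == "DataField":
                    fields[field.get("name")] = field.get("dataType")
        elif name not in NON_MODEL:
            models.append((name, child.get("functionName")))
    return {"version": root.get("version"), "fields": fields, "models": models}


summary = describe(PMML_DOC)
```

Because the data dictionary and model type are declared in a standard vocabulary, the consuming side needs no knowledge of the modeling tool, which is the separation of development and deployment the slide describes.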
PMML: Predictive Model Management
  Integrating across all systems and processes

  [Diagram: PMML at the center, connected to Business Process; Applications (CRM, ERP, Excel, etc.); and cloud platforms (IBM SmartCloud, Amazon EC2).]


PMML: One Standard, One Process


  [Diagram: PMML at the center, connected to Divisions, Service Providers, External Vendors, and Applications.]
Demo Setup

    •  End-to-end Model Development Lifecycle
    •  PMML Standard as the Glue

    [Diagram: lifecycle arranged around the Universal PMML Plug-In:
       Data Analysis: Understand Client's Data
       Model Design: Build Model(s) to Unlock Hidden Value
       Development and Test: Demonstrate Model Performance
       Model Deployment: Real-time Process Improvement and ROI]


Demo: Annual Marketing Campaign

   •  Which customers should we target?
   •  Split the 2011 campaign results into a training set and a test set
   •  Learn a model on the training set
   •  Apply the model to the test set
   •  Fine-tune the model until evaluation shows success
   •  Apply the final model to the 2012 customer list

   [Diagram: the 2011 Campaign Results are split into a Subset for Training and a Subset for Testing. The training subset yields a Prediction Model, Model Evaluation uses the testing subset, and the resulting Fine-Tuned Prediction Model is applied to the 2012 Customer List to produce Campaign Candidates.]
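
The demo workflow can be sketched end to end in a few lines of Python. Everything here is an assumption made for illustration: the data is synthetic, the single feature spend is invented, and the "model" is a deliberately trivial threshold rule rather than whatever was built in R or KNIME for the talk. The fine-tuning loop is left out.

```python
import random


def split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (training, test)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict(threshold, record):
    """Predict a response when spend reaches the threshold."""
    return record["spend"] >= threshold


def accuracy(threshold, records):
    """Fraction of records whose prediction matches the known outcome."""
    hits = sum(predict(threshold, r) == r["responded"] for r in records)
    return hits / len(records)


def learn_threshold(training):
    """Pick the spend threshold with the best accuracy on the training set."""
    candidates = sorted({r["spend"] for r in training})
    return max(candidates, key=lambda t: accuracy(t, training))


# Synthetic stand-in for the 2011 campaign results (hypothetical data):
# customers with spend above 50 tend to respond, with 10% label noise.
rng = random.Random(42)
results_2011 = []
for i in range(200):
    spend = rng.uniform(0, 100)
    responded = (spend > 50) != (rng.random() < 0.1)
    results_2011.append({"id": i, "spend": spend, "responded": responded})

# Synthetic stand-in for the 2012 customer list (no outcomes known yet).
customers_2012 = [{"id": 1000 + i, "spend": rng.uniform(0, 100)} for i in range(50)]

train, test = split(results_2011)
threshold = learn_threshold(train)          # learn model on training set
test_accuracy = accuracy(threshold, test)   # apply model on test set
campaign_candidates = [c for c in customers_2012 if predict(threshold, c)]
```

Evaluating on the held-out test set rather than the training set is what makes the accuracy figure a fair estimate of how the model will behave on the 2012 list.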




Summary


•  Avoid Vendor Lock-in
     -  Open Standards vs. Proprietary Code
     -  Best-of-Breed Tool Set

•  Hadoop-based Scoring Paradigm
     -  Minimize Data Movement
     -  Massively Parallel Execution
     -  Scale with Business Demand

•  Ease of Use, Fast ROI
     -  Leverage Datameer UI
     -  Deploy in Minutes vs. Months
     -  No Coding Skills Required
Online Resources




 •  Learn More About PMML
      -  Data Mining Group website: http://www.dmg.org
      -  Join LinkedIn PMML Discussion Group: http://www.linkedin.com/groupRegistration?gid=2328634
      -  Articles, on-line videos, blogs: http://www.zementis.com/community.htm

 •  Product Info
      -  On Demand Webinar: http://data.datameer.com/power-of-big-data-insights-of-predictive-analytics/
      -  UPPI for Datameer: http://www.zementis.com/DAS-plugin.htm

© 2012 Datameer, Inc. All rights reserved.                  Page 14

More Related Content

Viewers also liked

Pattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and HadoopPattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and HadoopPaco Nathan
 
Deploying Data Science with Docker and AWS
Deploying Data Science with Docker and AWSDeploying Data Science with Docker and AWS
Deploying Data Science with Docker and AWSMatt McDonnell
 
A Short PMML Tutorial by LatentView
A Short PMML Tutorial by LatentViewA Short PMML Tutorial by LatentView
A Short PMML Tutorial by LatentViewramesh.latentview
 
PMML - Predictive Model Markup Language
PMML - Predictive Model Markup LanguagePMML - Predictive Model Markup Language
PMML - Predictive Model Markup Languageaguazzel
 
PMML Execution of R Built Predictive Solutions
PMML Execution of R Built Predictive SolutionsPMML Execution of R Built Predictive Solutions
PMML Execution of R Built Predictive Solutionsaguazzel
 
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0lisanl
 
InfoSphere Streams toolkits :Real-Time Analytics on Data in Motion
InfoSphere Streams toolkits :Real-Time Analytics on Data in MotionInfoSphere Streams toolkits :Real-Time Analytics on Data in Motion
InfoSphere Streams toolkits :Real-Time Analytics on Data in MotionAvadhoot Patwardhan
 
Using Docker Containers to Improve Reproducibility in Software and Web Engine...
Using Docker Containers to Improve Reproducibility in Software and Web Engine...Using Docker Containers to Improve Reproducibility in Software and Web Engine...
Using Docker Containers to Improve Reproducibility in Software and Web Engine...Vincenzo Ferme
 
Docker @ Data Science Meetup
Docker @ Data Science MeetupDocker @ Data Science Meetup
Docker @ Data Science MeetupDaniel Nüst
 
Using python and docker for data science
Using python and docker for data scienceUsing python and docker for data science
Using python and docker for data scienceCalvin Giles
 
Docker for data science
Docker for data scienceDocker for data science
Docker for data scienceCalvin Giles
 
Reproducible bioinformatics pipelines with Docker and Anduril
Reproducible bioinformatics pipelines with Docker and AndurilReproducible bioinformatics pipelines with Docker and Anduril
Reproducible bioinformatics pipelines with Docker and AndurilChristian Frech
 
Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Martin Hiesboeck
 
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...Alex Zeltov
 
Константин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopКонстантин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopMedia Gorod
 
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)SironaHealth
 
Healthcare Analytics Maturity Model
Healthcare Analytics Maturity ModelHealthcare Analytics Maturity Model
Healthcare Analytics Maturity ModelFrank Wang
 
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...Spark Summit
 
Big Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life SciencesBig Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life SciencesAli Sanousi, MD, MBA, PhD
 
Predicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using CascadingPredicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using CascadingCascading
 

Viewers also liked (20)

Pattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and HadoopPattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and Hadoop
 
Deploying Data Science with Docker and AWS
Deploying Data Science with Docker and AWSDeploying Data Science with Docker and AWS
Deploying Data Science with Docker and AWS
 
A Short PMML Tutorial by LatentView
A Short PMML Tutorial by LatentViewA Short PMML Tutorial by LatentView
A Short PMML Tutorial by LatentView
 
PMML - Predictive Model Markup Language
PMML - Predictive Model Markup LanguagePMML - Predictive Model Markup Language
PMML - Predictive Model Markup Language
 
PMML Execution of R Built Predictive Solutions
PMML Execution of R Built Predictive SolutionsPMML Execution of R Built Predictive Solutions
PMML Execution of R Built Predictive Solutions
 
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0
Geospatial Toolkit Enhancements for IBM InfoSphere Streams V4.0
 
InfoSphere Streams toolkits :Real-Time Analytics on Data in Motion
InfoSphere Streams toolkits :Real-Time Analytics on Data in MotionInfoSphere Streams toolkits :Real-Time Analytics on Data in Motion
InfoSphere Streams toolkits :Real-Time Analytics on Data in Motion
 
Using Docker Containers to Improve Reproducibility in Software and Web Engine...
Using Docker Containers to Improve Reproducibility in Software and Web Engine...Using Docker Containers to Improve Reproducibility in Software and Web Engine...
Using Docker Containers to Improve Reproducibility in Software and Web Engine...
 
Docker @ Data Science Meetup
Docker @ Data Science MeetupDocker @ Data Science Meetup
Docker @ Data Science Meetup
 
Using python and docker for data science
Using python and docker for data scienceUsing python and docker for data science
Using python and docker for data science
 
Docker for data science
Docker for data scienceDocker for data science
Docker for data science
 
Reproducible bioinformatics pipelines with Docker and Anduril
Reproducible bioinformatics pipelines with Docker and AndurilReproducible bioinformatics pipelines with Docker and Anduril
Reproducible bioinformatics pipelines with Docker and Anduril
 
Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare
 
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
 
Константин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopКонстантин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
 
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
 
Healthcare Analytics Maturity Model
Healthcare Analytics Maturity ModelHealthcare Analytics Maturity Model
Healthcare Analytics Maturity Model
 
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
 
Big Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life SciencesBig Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life Sciences
 
Predicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using CascadingPredicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using Cascading
 

Similar to Agile deployment predictive analytics on hadoop

Managing a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectManaging a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectYottaa
 
Bango WiFi Market Data 1 Q09
Bango WiFi Market Data 1 Q09Bango WiFi Market Data 1 Q09
Bango WiFi Market Data 1 Q09Bango
 
E bootcamp right track manufacturing solutions 3 min pitch 04172013
E bootcamp right track manufacturing solutions 3 min pitch 04172013E bootcamp right track manufacturing solutions 3 min pitch 04172013
E bootcamp right track manufacturing solutions 3 min pitch 04172013Sola Lawal
 
"So – are we getting better?”
"So – are we getting better?”"So – are we getting better?”
"So – are we getting better?”AgileSparks
 
Business Services as a Resource to Business - Kristina Harrell
Business Services as a Resource to Business - Kristina HarrellBusiness Services as a Resource to Business - Kristina Harrell
Business Services as a Resource to Business - Kristina HarrellPAPartners
 
Embraer day 2011_ny_ds(1)
Embraer day 2011_ny_ds(1)Embraer day 2011_ny_ds(1)
Embraer day 2011_ny_ds(1)Embraer RI
 
Embraer Day NY 2011 - Defense and Security
Embraer Day NY 2011 - Defense and SecurityEmbraer Day NY 2011 - Defense and Security
Embraer Day NY 2011 - Defense and SecurityEmbraer RI
 
Smaato - NOAH12 San Francisco
Smaato - NOAH12 San FranciscoSmaato - NOAH12 San Francisco
Smaato - NOAH12 San FranciscoNOAH Advisors
 
3 things to start this afternoon to improve your paid search
3 things to start this afternoon to improve your paid search3 things to start this afternoon to improve your paid search
3 things to start this afternoon to improve your paid searchJonathan Beeston
 
The DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetupThe DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetupNorm Leitman
 
Cloudify summit2012 pub
Cloudify summit2012 pubCloudify summit2012 pub
Cloudify summit2012 pubGary Berger
 
Driving a High Performance Culture
Driving a High Performance CultureDriving a High Performance Culture
Driving a High Performance CultureBadr Al Badr
 
Ashnik corporate presentation Dec 2012
Ashnik corporate presentation Dec 2012Ashnik corporate presentation Dec 2012
Ashnik corporate presentation Dec 2012Sachin Dabir
 
Sapphire Online 2009 Or1005
Sapphire Online 2009 Or1005Sapphire Online 2009 Or1005
Sapphire Online 2009 Or1005Shereen Zubair
 
How To Convert Your SAP BusinessObjects Unused Licenses To SAP Analytics Cloud
How To Convert Your SAP BusinessObjects Unused Licenses To SAP Analytics CloudHow To Convert Your SAP BusinessObjects Unused Licenses To SAP Analytics Cloud
How To Convert Your SAP BusinessObjects Unused Licenses To SAP Analytics CloudWiiisdom
 
New IDC Research on Software Analysis & Measurement
New IDC Research on Software Analysis & MeasurementNew IDC Research on Software Analysis & Measurement
New IDC Research on Software Analysis & MeasurementCAST
 
Measuring interaction in digital publications
Measuring interaction in digital publicationsMeasuring interaction in digital publications
Measuring interaction in digital publicationsWAN-IFRA
 
Solarmer Energy Profile
Solarmer Energy ProfileSolarmer Energy Profile
Solarmer Energy Profilesolarmer
 
Transportation Systems Maturity
Transportation Systems MaturityTransportation Systems Maturity
Transportation Systems MaturityJBF Consulting
 

Similar to Agile deployment predictive analytics on hadoop (20)

Managing a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectManaging a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) Project
 
Bango WiFi Market Data 1 Q09
Bango WiFi Market Data 1 Q09Bango WiFi Market Data 1 Q09
Bango WiFi Market Data 1 Q09
 
E bootcamp right track manufacturing solutions 3 min pitch 04172013
E bootcamp right track manufacturing solutions 3 min pitch 04172013E bootcamp right track manufacturing solutions 3 min pitch 04172013
E bootcamp right track manufacturing solutions 3 min pitch 04172013
 
"So – are we getting better?”
"So – are we getting better?”"So – are we getting better?”
"So – are we getting better?”
 
Business Services as a Resource to Business - Kristina Harrell
Business Services as a Resource to Business - Kristina HarrellBusiness Services as a Resource to Business - Kristina Harrell
Business Services as a Resource to Business - Kristina Harrell
 
Embraer day 2011_ny_ds(1)
Embraer day 2011_ny_ds(1)Embraer day 2011_ny_ds(1)
Agile Deployment of Predictive Analytics on Hadoop

  • 1. Agile Deployment of Predictive Analytics on Hadoop: Faster Insights through Open Standards. Hadoop Summit 2012. © 2012 Datameer, Inc. All rights reserved.
  • 2. Today's Session
       Ulrich Rueckert, Data Scientist, Datameer
       Michael Zeller, CEO, Zementis
       After this session, you will be able to…
       1.  Effectively deliver predictive solutions combining:
           a.  R, KNIME & Others [Model Development]
           b.  Zementis Universal PMML Plug-in [Model Deployment & Execution]
           c.  Datameer [Scalable Hadoop Infrastructure]
       2.  Identify PMML as a vendor-neutral & open standard to:
           a.  Incorporate predictive models from virtually any commercial vendor or open source tool
           b.  Apply such models on Big Data
       3.  Leverage a lightweight, agile deployment process for predictive analytics to:
           a.  Accelerate time-to-market
           b.  Lower cost and complexity
           c.  Reuse existing predictive assets
  • 3. Who is Datameer?
       §  “Business Intelligence on top of Hadoop”
       §  Established 2009 by Hadoop and enterprise software veterans
       §  Offices in Silicon Valley, New York and Germany
       §  Some customers: (customer logos, not reproduced in the transcript)
  • 4. Who is Zementis?
       §  Focus on Operational Predictive Analytics
       §  Offices in San Diego and Hong Kong
       §  Predictive Analytics Software Technology:
           •  ADAPA® Decision Engine (Predictive Models and Rules)
           •  ADAPA Add-in for Excel
           •  PMML Converter
           •  Universal PMML Plug-in (UPPI)
       §  Global Partner Network
  • 5. Big Data and Analytics
       §  People and Sensor Data
           •  Transaction records
           •  Social media
           •  Climate information
           •  Mobile GPS signals
           •  Healthcare
           •  Smart Grid
       §  90% of the data today was created in the last 2 years
       §  Benefits from Analytics
           •  Descriptive Analytics answers "What happened?"
           •  Predictive Analytics answers "What will happen next?"
  • 6. Operational Predictive Analytics
       [Chart: Score Distribution, 1st Lien Stand-Alone Loans. % Within Class (0% to 14%) vs. Score (50 to 1000) for Goods and Bads, each with a polynomial trend line.]
       [Chart: % of Delinquent Loans per Month. % of Delinquent Loans (0 to 90) vs. Months (Jan to Nov), with series labeled by Score from 700 to 950.]
  • 7. From Model Building to Deployment
       [Diagram: Model Building, then Model Deployment, then Integration / Execution. PMML models are passed to the UPPI running on the Datameer Server.]
       Simple Deployment & Execution
       1.  Upload PMML file(s) in DAS
       2.  PMML turns into custom function
       3.  Seamlessly score data in Datameer
  • 8. PMML: Predictive Model Markup Language
       •  PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.
       •  Mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.
       •  Supported by all leading data mining tools, commercial and open-source.
       •  Allows for the clear separation of tasks: model development vs. model deployment.
       •  Eliminates the need for custom code and proprietary model deployment solutions.
       •  Uniform deployment platform ensures scalability and reliability of model execution.
       [Side notes on slide: "Transformations" (diagram label); PMML book available on Amazon.com]
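To make the "XML-based language shared between compliant applications" point concrete, here is a minimal sketch in Python. It is not the Zementis UPPI or any Datameer API: the PMML document is a hand-written toy regression model, the field names (`age`, `income`, `score`) and coefficients are hypothetical, and the reader handles only this one simple model type. It assumes the PMML 4.1 namespace.

```python
import xml.etree.ElementTree as ET

# Hand-written toy PMML document (assumption: PMML 4.1 namespace).
# Field names and coefficients are hypothetical, chosen for illustration.
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Toy linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="toy" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.001"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Read the intercept and per-field coefficients from the toy model."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(record, intercept, coefficients):
    """Apply the linear model to one record (a dict of field values)."""
    return intercept + sum(c * record[name] for name, c in coefficients.items())


intercept, coefficients = load_linear_model(PMML_DOC)
print(score({"age": 40.0, "income": 50000.0}, intercept, coefficients))
```

The model file carries everything the scorer needs, so the tool that produced it and the system that executes it never have to share code. A production engine covers many more model types plus the data transformations that PMML can describe.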
  • 9. PMML: Predictive Model Management
       Integrating across all systems and processes
       [Diagram labels: Business Process; PMML; IBM SmartCloud; Amazon EC2; Applications (CRM, ERP, Excel, etc.)]
  • 10. PMML: One Standard, One Process
       [Diagram labels: Divisions; Service Providers; External Vendors; PMML; Applications]
  • 11. Demo Setup
       §  End-to-end Model Development Lifecycle
       §  PMML Standard as the Glue
       [Diagram labels: Data Analysis (Understand Client's Data); Model Development (Build Model(s) to Unlock Hidden Value); Design and Test (Demonstrate Model Performance); Real-time Deployment (Process Improvement and ROI); Model; Universal PMML Plug-In]
  • 12. Demo: Annual Marketing Campaign
       §  Which customers should we target?
       §  Split 2011 results in training and test set
       §  Learn model on training set
       §  Apply model on test set
       §  Fine-tune model until evaluation shows success
       §  Apply final model on 2012 customer list
       [Diagram labels: 2011 Campaign Results; 2012 Customer List; Subset for Training; Subset for Testing; Prediction Model; Model Evaluation; Fine-Tuned Prediction Model; Campaign Candidates]
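The demo workflow on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model or tooling used in the talk: the campaign data, the customer list, and the `segment` attribute are invented, and the "model" is simply the response rate per segment with a tunable decision threshold.

```python
from collections import defaultdict

# Hypothetical 2011 campaign results: (customer segment, responded? 1/0).
results_2011 = [
    ("urban", 1), ("urban", 1), ("urban", 0), ("urban", 1),
    ("rural", 0), ("rural", 0), ("rural", 1), ("rural", 0),
    ("urban", 1), ("rural", 0), ("urban", 0), ("rural", 0),
]

# Hypothetical 2012 customer list: (customer id, segment).
customers_2012 = [("c1", "urban"), ("c2", "rural"), ("c3", "urban")]


def split(rows, test_every=3):
    """Deterministic split: every `test_every`-th row goes to the test set."""
    train = [r for i, r in enumerate(rows) if i % test_every != 0]
    test = [r for i, r in enumerate(rows) if i % test_every == 0]
    return train, test


def learn(train):
    """Learn the response rate per segment from the training set."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for segment, responded in train:
        counts[segment][0] += responded
        counts[segment][1] += 1
    return {seg: hits / total for seg, (hits, total) in counts.items()}


def predict(model, segment, threshold=0.5):
    """Predict a response (1) when the learned rate reaches the threshold."""
    return 1 if model.get(segment, 0.0) >= threshold else 0


def accuracy(model, test, threshold=0.5):
    """Fraction of test rows whose prediction matches the actual outcome."""
    hits = sum(predict(model, seg, threshold) == resp for seg, resp in test)
    return hits / len(test)


train, test = split(results_2011)
model = learn(train)
print("test accuracy:", accuracy(model, test))

# Apply the final model to the 2012 list to pick campaign candidates.
candidates = [cid for cid, seg in customers_2012 if predict(model, seg) == 1]
print("campaign candidates:", candidates)
```

Fine-tuning in this sketch means re-running `accuracy` with different `threshold` values (or a richer model) until the evaluation on the held-out test set is acceptable. Only then is the model applied to the 2012 list.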
  • 13. Summary
       §  Avoid Vendor Lock-in
           •  Open Standards vs. Proprietary Code
           •  Best-of-Breed Tool Set
       §  Hadoop-based Scoring Paradigm
           •  Minimize Data Movement
           •  Massively Parallel Execution
           •  Scale with Business Demand
       §  Ease of Use and Fast ROI
           •  Leverage Datameer UI
           •  Deploy in Minutes vs. Months
           •  No Coding Skills Required
  • 14. Online Resources
       §  Learn More About PMML
           •  Data Mining Group website: http://www.dmg.org
           •  Join LinkedIn PMML Discussion Group: http://www.linkedin.com/groupRegistration?gid=2328634
           •  Articles, on-line videos, blogs: http://www.zementis.com/community.htm
       §  Product Info
           •  On Demand Webinar: http://data.datameer.com/power-of-big-data-insights-of-predictive-analytics/
           •  UPPI for Datameer: http://www.zementis.com/DAS-plugin.htm