©2014 DesignMind. All Rights Reserved.
An Analytics Sandbox
in a World of Big Data
Roberto Arnetoli
roberto@designmind.com
Vice President, Big Data Solutions
Andrew Eichenbaum
andrew@designmind.com
Principal Data Science Consultant
Platfora
DesignMind’s Expertise and Offering
Power BI
Applications
Databases
Data Warehousing
Big Data
BI & Data Visualization
Information Sharing & Collaboration
Cloud Computing
Data Science
Our Clients
Agenda
• Big Data and Self-Service Analytics
• Platfora
• Case Study: Peer-2-Peer Lending
• Demo
• Conclusion and Questions
Big Data and Self-Service Analytics
What is Big Data?
• Large data sets
• Excessive retrieval and processing time
• Structured and unstructured collections
SQL vs. Big Data
• volume, velocity, variety
[Figure: SQL and Big Data compared by Volume, Velocity, and Variety]
We tend to structure data
• We tend to prepare, transform and structure data
• Several advantages
• Several non-trivial disadvantages
[Diagram labels: Traditional Data Warehouse; Big Data Platform]
For today’s Data Scientists it is simply not enough!
[Diagram labels: mail, feeds, additional databases, multimedia, logs, social, geo, e-commerce, unstructured text, web; Traditional Data Warehouse; Big Data Platform]
For today’s Data Scientists it is simply not enough!
• self-service analytics platform
• ‘analytics sandbox’
• significantly reduce time and costs
DesignMind chooses Platfora
• Microsoft Gold Data Platform Partner and Silver BI Partner
• Cloudera Partner
• Platfora Partner
• data analytics winning solution
• maximize the value of their data
• make fact-based decisions
[Diagram labels: Big Data Platform; Traditional Data Warehouse; Self-Service Analytics]
Platfora
Platfora is an All-in-One Data Sandbox
Ingest
Select
Explore
Platfora Easily Ingests Data
• Delimited Text, XML, JSON, Raw Text, Avro
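To make the idea of ingesting several formats concrete, here is a minimal Python sketch that reads the same two records from delimited text, JSON (one object per line), and XML, and normalizes them to a common shape. It is not Platfora's ingestion mechanism; the sample data, the field names (loan_id, amount), and the function names are all hypothetical, and Avro is left out because it needs a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data; field names are illustrative only.
DELIMITED = "loan_id,amount\n1,5000\n2,12000\n"
JSON_LINES = '{"loan_id": 1, "amount": 5000}\n{"loan_id": 2, "amount": 12000}\n'
XML = (
    "<loans>"
    "<loan><loan_id>1</loan_id><amount>5000</amount></loan>"
    "<loan><loan_id>2</loan_id><amount>12000</amount></loan>"
    "</loans>"
)


def from_delimited(text):
    """Parse comma-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"loan_id": int(r["loan_id"]), "amount": int(r["amount"])} for r in reader]


def from_json_lines(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def from_xml(text):
    """Parse every <loan> element found under the root element."""
    root = ET.fromstring(text)
    return [
        {"loan_id": int(e.findtext("loan_id")), "amount": int(e.findtext("amount"))}
        for e in root.iter("loan")
    ]


records = from_delimited(DELIMITED)
```

Once every source is reduced to the same list-of-dicts shape, downstream selection and exploration steps no longer need to know which format a record came from.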
Platfora Means Hands Off ETL
• lenses
Platfora Means Hands Off ETL
• Platfora ETL process backed by Hadoop
- Automatic cluster creation on multiple platforms (Amazon, Cloudera, Hortonworks)
- Cluster sizes from one node to many
• Automatically handles the handoff of multiple files of any size to the cluster
• Scheduling available for data reprocessing or updates
Platfora Allows for Easy Data Exploration
Typical Big Data Warehousing Stack
• Complex linear process
• Data warehouse access tools have no easy way to access the data from earlier stages
• The only way to get new data in is to reprocess the data at the Ingestion and Transformation levels
[Diagram labels: Ingestion; Transformation; Ingest, Select, Explore]
Big Data Warehousing Tools
Pig
• Transformation
• Each step can be complex and needs a knowledgeable support staff
• Ingestion
• BI Tools
• Data warehousing
Platfora Sits Parallel to the Traditional Stack
[Diagram labels: Ingestion; Transformation; Ingest, Select, Explore; Data Catalog, Vizboards, Lenses]
Case Study: Peer-2-Peer Lending
What is P2P Lending?
Completed Loans: Months to Last Payment
• Loans can complete in two ways: Charge Off (Default) and Fully Paid.
• Normal loan durations are 36 and 60 months.
• Early payoffs and Charge Offs follow the same curve after two months of payments.
• The loan Charge Off rate is approximately 16% for loans completed in the first 18 months.
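As an illustration of how a charge-off rate for early-completing loans could be computed, the following Python sketch divides the number of charged-off loans by all loans that completed within a month cutoff. The loan records, the status labels, and the function name are hypothetical; this is not the data or the Platfora workflow behind the figure quoted above.

```python
from collections import Counter

# Hypothetical completed loans: (status, months to last payment).
completed_loans = [
    ("Fully Paid", 6),
    ("Charged Off", 9),
    ("Fully Paid", 12),
    ("Fully Paid", 15),
    ("Fully Paid", 17),
    ("Charged Off", 30),
    ("Fully Paid", 36),
]


def charge_off_rate(loans, max_months):
    """Share of loans completed within max_months that ended in a charge off."""
    early = [status for status, months in loans if months <= max_months]
    if not early:
        return 0.0  # no loans completed in the window
    return Counter(early)["Charged Off"] / len(early)


rate_18 = charge_off_rate(completed_loans, 18)
print(f"Charge-off rate within 18 months: {rate_18:.0%}")
```

With this made-up sample, five loans complete within 18 months and one of them is a charge off, so the sketch reports 20%; real loan data would of course give a different figure.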
Loan Stats: Average Revolving to Maximum Credit
• When loans are in funding, can we find predictors of default?
• We look at loan applicants’ total revolving credit (e.g. credit cards) vs. the average revolving credit balance
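One way to express this comparison is as a ratio of average revolving balance to total revolving credit, averaged per loan outcome. The Python sketch below assumes hypothetical field names (total_revolving_credit, avg_revolving_balance, status) and made-up applicants; it shows the arithmetic only and makes no claim about what the actual loan data shows.

```python
from statistics import mean

# Hypothetical applicants; field names and values are illustrative only.
applicants = [
    {"total_revolving_credit": 10000, "avg_revolving_balance": 2000, "status": "Fully Paid"},
    {"total_revolving_credit": 20000, "avg_revolving_balance": 5000, "status": "Fully Paid"},
    {"total_revolving_credit": 8000, "avg_revolving_balance": 6000, "status": "Charged Off"},
    {"total_revolving_credit": 0, "avg_revolving_balance": 0, "status": "Charged Off"},
]


def utilization(applicant):
    """Average revolving balance divided by total revolving credit, or None if no credit."""
    limit = applicant["total_revolving_credit"]
    if limit <= 0:
        return None  # ratio is undefined without any revolving credit
    return applicant["avg_revolving_balance"] / limit


def mean_utilization_by_status(rows):
    """Group the defined ratios by loan status and average each group."""
    groups = {}
    for row in rows:
        ratio = utilization(row)
        if ratio is not None:
            groups.setdefault(row["status"], []).append(ratio)
    return {status: mean(values) for status, values in groups.items()}


summary = mean_utilization_by_status(applicants)
```

Applicants with no revolving credit are skipped rather than counted as zero, so that an undefined ratio does not pull a group average down.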
Demo
Demo Notes
Conclusion
Concluding Remarks
• Quick introduction to Platfora and its abilities
- It is a data analytics sandbox that is complementary to current ETL/Warehouse implementations
- Allows data practitioners free range to access and use new data easily
• Platfora can do a lot more than shown
• Platfora is extensible:
- UDFs allow access to almost any Java routine
- Data ingestion can be scheduled
Questions
www.designmind.com

Story boards and shot lists for my a level piececharlottematthew16
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Platfora - An Analytics Sandbox In A World Of Big Data

  • 1. ©2014 DesignMind. All Rights Reserved. An Analytics Sandbox in a World of Big Data. Roberto Arnetoli, roberto@designmind.com, Vice President, Big Data Solutions. Andrew Eichenbaum, andrew@designmind.com, Principal Data Science Consultant. Platfora
  • 2. DesignMind’s Expertise and Offering: Power BI, Applications, Databases, Data Warehousing, Big Data, BI & Data Visualization, Information Sharing & Collaboration, Cloud Computing, Data Science
  • 3. Our Clients
  • 4. Agenda: Big Data and Self-Service Analytics; Platfora; Case Study: Peer-2-Peer Lending; Demo; Conclusion and Questions
  • 5. Big Data and Self-Service Analytics
  • 6. What is Big Data? Large data sets; excessive retrieval and processing time; structured and unstructured collections
  • 7. SQL vs. Big Data: volume, velocity, variety
  • 8. We tend to structure data: we tend to prepare, transform and structure data; several advantages; several non-trivial disadvantages. (Diagram: Traditional Data Warehouse, Big Data Platform)
  • 9. For today’s Data Scientists it is simply not enough! mail, feeds, additional databases, multimedia, logs, social, geo, e-commerce, unstructured text, web. (Diagram: Traditional Data Warehouse, Big Data Platform)
  • 10. For today’s Data Scientists it is simply not enough! self-service analytics platform; ‘analytics sandbox’; significantly reduce time and costs. (Diagram: Traditional Data Warehouse, Big Data Platform)
  • 11. DesignMind chooses Platfora: Microsoft Gold Data Platform Partner and Silver BI Partner; Cloudera Partner; Platfora Partner; data analytics winning solution; maximize the value of their data; make fact-based decisions. (Diagram: Big Data Platform, Traditional Data Warehouse, Self-Service Analytics)
  • 12. Platfora
  • 13. Platfora is an All in One Data Sandbox: Ingest, Select, Explore
  • 14. Platfora Easily Ingests Data: Delimited Text, XML, JSON, Raw Text, Avro
  • 15. Platfora Means Hands Off ETL: lenses
  • 16. Platfora Means Hands Off ETL. Platfora ETL process backed by Hadoop: automatic cluster creation on multiple platforms (Amazon, Cloudera, Hortonworks); cluster sizes from one node to many. Automatically handles the handoff of multiple files of any size to the cluster. Scheduling available for data reprocessing or updates.
  • 17. Platfora Allows for Easy Data Exploration
  • 18. Typical Big Data Warehousing Stack: complex linear process. Data warehouse access tools have no easy way to access the data from earlier stages. The only way to get new data in is to reprocess the data at the Ingestion and Transformation levels. (Diagram: Ingestion, Transformation; Ingest, Select, Explore)
  • 19. Big Data Warehousing Tools: Pig; Transformation; Ingestion; BI Tools; data warehousing. Each step can be complex and needs knowledgeable support staff.
  • 20. Platfora Sits Parallel to the Traditional Stack. (Diagram: Ingestion, Transformation; Ingest, Select, Explore; Data Catalog, Lenses, Vizboards)
  • 21. Case Study: Peer-2-Peer Lending
  • 22. What is P2P Lending
  • 24. Completed Loans: Months to Last Payment. Loans can complete in two ways: Charge Off (Default) and Fully Paid. Normal loan durations are 36 and 60 months. Early payoffs and Charge Offs follow the same curve after two months of payments. The loan Charge Off rate is approximately 16% for loans completed in the first 18 months.
  • 25. Loan Stats: Average Revolving to Maximum Credit. When loans are in funding, can we find predictors of default? We look at loan applicants' total revolving credit (e.g. credit cards) vs. the average revolving credit balance.
  • 26. Loan Stats: Average Revolving to Maximum Credit
  • 27. Demo
  • 28. Demo Notes
  • 29. Conclusion
  • 31. Concluding Remarks. Quick introduction to Platfora and its abilities: it is a data analytics sandbox that is complementary to current ETL/Warehouse implementations; it allows data practitioners free range to access and use new data easily. Platfora can do a lot more than shown. Platfora is extensible: UDFs allow access to almost any Java routine; data ingestion can be scheduled.
  • 32. Questions
  • 33. www.designmind.com
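
The case-study slides (24 and 25) refer to two quantities: the Charge Off rate among completed loans, and the relation between an applicant's average revolving balance and total revolving credit. The sketch below shows one way these could be computed outside Platfora. It is not the presenters' method: the `Loan` record, its field names, the status strings, and the exact ratio (average balance divided by total revolving credit) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Loan:
    """Hypothetical record for one completed loan (field names are assumptions)."""
    status: str                    # "Charged Off" or "Fully Paid"
    months_to_last_payment: int    # months between issue and final payment
    avg_revolving_balance: float   # average revolving credit balance
    total_revolving_credit: float  # total (maximum) revolving credit available


def charge_off_rate(loans: Iterable[Loan],
                    max_months: Optional[int] = None) -> Optional[float]:
    """Share of completed loans that ended in Charge Off.

    If max_months is given, only loans whose last payment fell within
    that many months are counted. Returns None when no loans qualify.
    """
    selected = [
        loan for loan in loans
        if max_months is None or loan.months_to_last_payment <= max_months
    ]
    if not selected:
        return None
    charged_off = sum(1 for loan in selected if loan.status == "Charged Off")
    return charged_off / len(selected)


def revolving_ratio(loan: Loan) -> Optional[float]:
    """Average revolving balance as a fraction of total revolving credit.

    Returns None when the applicant has no revolving credit, so the
    ratio is undefined rather than a division error.
    """
    if loan.total_revolving_credit <= 0:
        return None
    return loan.avg_revolving_balance / loan.total_revolving_credit
```

Calling `charge_off_rate(loans, max_months=18)` on a real loan export would give the figure that slide 24 reports as approximately 16%; `revolving_ratio` could then be compared between Charged Off and Fully Paid loans as a candidate predictor of default.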