© 2016 Pivotal Software, Inc. All rights reserved.
Large Scale Fraud Analytics
GemFire Greenplum Connector (G2C)
Background
•  Government fraud revenue retention program
•  Detecting & retaining ~$5B annually
–  Primary focus on identity theft
–  Processes up to 8 million cases per day
–  Current & historic data size ~60 TB (compressed)
•  Modifying architecture to integrate GemFire for scalable Java-based business logic, web service integration, and event-driven design
Fraud Systems Simplified
Prepare
•  Ingest
•  Restructure (ETL)
Score
•  Model Evaluation
Disposition
•  Business Logic
•  Prioritization
Respond
•  Investigation
•  Stop Payments
Business Logic Engine
ETL
Reporting
In-db Analytics
Application Services
Case Study Architecture – Scaling Up
GemFire
Greenplum
Spring Boot App Services
Informatica w/ PWX (ETL)
Business Objects (Reporting)
Legacy Logic Implementation
Logic Engine
In-db Analytics
Greenplum
Prepare
•  Ingest
•  Restructure (ETL)
Score
•  Model Evaluation
Disposition
•  Business Logic
•  Prioritization
Respond
•  Investigation
•  Stop Payments
Pivotal Greenplum (GPDB)
•  Postgres Community OSS
–  Original fork of 8.2.15
–  Massively parallel processing database
•  Master coordinates queries across segment databases
•  Supports in-database model evaluation
–  MADlib, PL/R, SAS
[Diagram: GPDB logical, physical, and software views, showing the master and segments]
Initial Implementation
•  Fraud model results evaluated by business logic engine
•  Flat file data extraction
–  Significant custom code to construct required object model
–  Table → CSV → POJO
•  Shared element in an otherwise distributed system
–  Performance considerations
[Diagram: GPDB, Legacy Logic Implementation]
Architecture Adjustments
•  New requirements introduced external integrations
–  Drives desire for web services
•  Desire to improve performance & simplify codebase
•  Expanding business logic
–  Logic engine run as a GemFire function
[Diagram: GemFire, GPDB, Legacy Logic Implementation, Spring Boot (App Services)]
GemFire Greenplum Connector
Context
[Diagram: Greenplum and GemFire side by side]
Greenplum labels: ANSI SQL, Analytical, Parallel Configurable Data Load, Data Science, Analytics & ML
GemFire labels: Native API, REST / HTTP, Transactional, Custom Apps (App 1, App 2)
Connecting label: Transactional data write-behind
GemFire Greenplum Connector (G2C)
•  Extension package for GemFire
•  Provides simple import and export of data between GemFire regions & Greenplum tables
–  Parallel data motion leveraging Greenplum’s external table interface
•  Simple mapping between table rows and PdxInstance
–  Flat object-relational mapping
–  Set of predefined type conversions
–  Configurable GemFire data collocation
G2C Data Interfaces
[Diagram: Greenplum (master, segments) and GemFire (data nodes); labels: JDBC / ODBC, Control Logic]
Sample Data Import / Export
GpdbService is the primary entry point for explicitly invoked data motion
1.  Import - loads the full table contents from Greenplum
2.  Export - sends region contents to Greenplum
// region: the GemFire Region whose data is moved (obtained elsewhere)
Cache cache = CacheFactory.getAnyInstance();
GpdbService gpdb = GpdbService.getInstance(cache);
long count;
count = gpdb.importRegion(region); // 1: import from Greenplum
count = gpdb.exportRegion(region); // 2: export to Greenplum
Basic Cache Configuration
Configured via GemFire extension framework
•  1) Each region maps to a JNDI data source backed by Greenplum
•  2) Link an entity type and table
•  3) Declare a field to be used as the key
–  Compound keys supported
•  4) Define a mapping between the fields and the table columns
–  Default auto-configuration
–  Optional name and column attributes for naming convention changes
–  Class used to control type conversion
–  Set of built-in types
<region name="Parent">
  <region-attributes refid="PARTITION">
    <partition-attributes/>
  </region-attributes>
  <!-- 1: data source backed by Greenplum -->
  <gpdb:store datasource="datasource">
    <gpdb:types>
      <!-- 2: entity type linked to a table -->
      <gpdb:pdx name="io.pivotal...entity.Parent" table="parent">
        <!-- 3: field used as the key -->
        <gpdb:id field="id" />
        <!-- 4: mapping between fields and table columns -->
        <gpdb:fields>
          <gpdb:field name="name" />
          <gpdb:field name="id" column="id" />
          <gpdb:field name="income" class="java.math.BigDecimal" />
        </gpdb:fields>
      </gpdb:pdx>
    </gpdb:types>
  </gpdb:store>
</region>
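As a rough illustration of step 4, the sketch below shows one way a flat field-to-column mapping with an optional type conversion could behave. It is not the connector's code: FieldMappingSketch, FieldMapping, and mapRow are hypothetical names invented for this example, and a plain Map stands in for both the table row and the resulting object.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration only; these are not the connector's classes. */
class FieldMappingSketch {

    /** One mapping: object field name, source column, and a value converter. */
    static final class FieldMapping {
        final String field;
        final String column;
        final Function<Object, Object> converter;

        FieldMapping(String field, String column, Function<Object, Object> converter) {
            this.field = field;
            this.column = column;
            this.converter = converter;
        }

        /** Column defaults to the field name, like an omitted column attribute. */
        static FieldMapping of(String field) {
            return new FieldMapping(field, field, Function.identity());
        }
    }

    /** Copies each mapped column into a flat field map, applying its converter. */
    static Map<String, Object> mapRow(Map<String, Object> row, List<FieldMapping> mappings) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (FieldMapping m : mappings) {
            Object raw = row.get(m.column);
            fields.put(m.field, raw == null ? null : m.converter.apply(raw));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<FieldMapping> mappings = List.of(
            FieldMapping.of("name"),
            new FieldMapping("id", "id", Function.identity()),
            // Assumed conversion: the column value is parsed into a BigDecimal.
            new FieldMapping("income", "income", v -> new BigDecimal(v.toString())));

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 7L);
        row.put("name", "Ada");
        row.put("income", "1234.50");

        System.out.println(mapRow(row, mappings));
    }
}
```

The mapping stays flat on purpose: one row becomes one object, with no nested types, which matches the "flat object-relational mapping" described for G2C.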
Configuring Collocation
Parent-child foreign key relationships supported through collocation
1.  Compound key configurations result in a HashMap-based key in GemFire
2.  Provided partition resolver works with compound keys
<region name="Child">
  <...>
  <!-- 2: provided partition resolver, routing on one field of the key -->
  <partition-resolver>
    <class-name>io.pivotal.gemfire.gpdb.IdPartitionResolver</class-name>
    <parameter name="field">
      <string>parentId</string>
    </parameter>
  </...>
  <!-- 1: compound key -->
  <gpdb:id>
    <gpdb:field ref="parentId" />
    <gpdb:field ref="id" />
  </gpdb:id>
  <gpdb:fields>
    <gpdb:field name="parentId"/>
    <gpdb:field name="id" />
  </...>
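The idea behind collocating on parentId can be sketched as follows. This is a simplified stand-in, not the connector's IdPartitionResolver: CollocationSketch, its methods, and the bucket count are assumptions made for illustration. It shows only that routing a compound child key by its parentId field lands the child in the same bucket as the parent.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of key-based collocation; not the connector's resolver. */
class CollocationSketch {
    // Assumed bucket count for this sketch.
    static final int BUCKETS = 113;

    /** The bucket is derived from a routing value, not from the whole key. */
    static int bucketFor(Object routingValue) {
        return Math.floorMod(routingValue.hashCode(), BUCKETS);
    }

    /** Builds a compound child key as a map of key field names to values. */
    static Map<String, Object> childKey(long parentId, long id) {
        Map<String, Object> key = new HashMap<>();
        key.put("parentId", parentId);
        key.put("id", id);
        return key;
    }

    /** Routes a compound key by one named field, here the parent's id. */
    static int bucketForCompoundKey(Map<String, Object> key, String routingField) {
        return bucketFor(key.get(routingField));
    }

    public static void main(String[] args) {
        long parentId = 42L;
        int parentBucket = bucketFor(parentId);
        int childA = bucketForCompoundKey(childKey(parentId, 1L), "parentId");
        int childB = bucketForCompoundKey(childKey(parentId, 2L), "parentId");
        // Both children resolve to the parent's bucket.
        System.out.println(parentBucket == childA && childA == childB);
    }
}
```

Because only parentId feeds the bucket choice, children with different id values still share their parent's bucket, so logic that reads a parent and its children can run on one member.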
Configuring Automatic Synchronization
●  Data exported to Greenplum via asynchronous eventing
○  Time and batch size triggers available
●  Causes each GemFire member to independently interact with Greenplum
○  Configure GPDB resource queues accordingly
<region name="Child">
  <...>
  <gpdb:store datasource="datasource">
    <gpdb:synchronize mode="automatic" time-interval="3000" persistent="false" />
    <gpdb:types>
      <...>
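How time and batch size triggers interact can be sketched with a small buffer that flushes when either condition is met. This is an illustration under assumptions, not the connector's implementation: BatchTriggerSketch and its members are hypothetical, the interval is assumed to be in milliseconds, and the time check runs only when an event arrives (a real implementation would more likely use a timer).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of time and batch size flush triggers; not connector code. */
class BatchTriggerSketch {
    private final int batchSize;
    private final long intervalMillis;
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();
    private long lastFlushMillis;

    BatchTriggerSketch(int batchSize, long intervalMillis, long startMillis) {
        this.batchSize = batchSize;
        this.intervalMillis = intervalMillis;
        this.lastFlushMillis = startMillis;
    }

    /** Queues an event, then flushes if either trigger has fired. */
    void onEvent(String event, long nowMillis) {
        pending.add(event);
        boolean sizeReached = pending.size() >= batchSize;
        boolean timeElapsed = nowMillis - lastFlushMillis >= intervalMillis;
        if (sizeReached || timeElapsed) {
            flush(nowMillis);
        }
    }

    /** Stands in for one export of a batch to the database. */
    private void flush(long nowMillis) {
        flushed.add(new ArrayList<>(pending));
        pending.clear();
        lastFlushMillis = nowMillis;
    }

    List<List<String>> flushedBatches() {
        return flushed;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        BatchTriggerSketch sync = new BatchTriggerSketch(3, 3000, 0);
        sync.onEvent("a", 100);   // queued
        sync.onEvent("b", 200);   // queued
        sync.onEvent("c", 300);   // batch size trigger fires
        sync.onEvent("d", 3400);  // time trigger fires, 3100 ms after the last flush
        System.out.println(sync.flushedBatches());
    }
}
```

Since each GemFire member runs its own batching independently, the number of concurrent exports grows with the cluster, which is why the slide recommends sizing GPDB resource queues to match.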
Case Study G2C Configuration Details
•  Existing required domain objects
–  Multiple many-to-one groupings
•  Wide tables / objects (500+ fields)
•  Data collocation configured on caseId
•  Source tables wrapped in views
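[Class diagram: CaseWrapper (caseId, …) grouping the following objects, each carrying caseId]
–  ModelScores (caseId, …)
–  Documents (caseId, …)
–  PriorHistory (caseId, …)
–  OtherData… (caseId, …)
–  LogicResults (caseId, …)
Multiplicity labels shown on the diagram: *, *, *, *, 1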
Simple Loading – Single Table per Object
[Sequence diagram]
Participants: :LoadTrigger, :GPDBService, :Region, :AsyncEventListener, :LogicEngine, results:Region
Messages, in order: Import(), put(), processEvents(), process(), put()
Complex Loading – Multiple Tables per Object
[Sequence diagram]
Participants: :LoadTrigger, :MergeLoader, :GPDBService, :Region, :LogicEngine, results:Region
Message labels: Import(), put(), process(), put(), assemble(), executeFunction(); includes a "par" (parallel) fragment
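One reading of the "par" fragment and assemble() is that several tables are imported concurrently and their rows are then grouped by caseId into one object per case. The sketch below illustrates that reading only; MergeLoaderSketch, Row, and the sample data are hypothetical, and in-memory lists stand in for the table imports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch: load several sources in parallel, then assemble by caseId. */
class MergeLoaderSketch {

    /** A child row keyed by caseId; the payload stands in for the real columns. */
    static final class Row {
        final long caseId;
        final String payload;

        Row(long caseId, String payload) {
            this.caseId = caseId;
            this.payload = payload;
        }
    }

    /** Groups rows from every source under their caseId. */
    static Map<Long, List<String>> assemble(List<List<Row>> sources) {
        Map<Long, List<String>> cases = new TreeMap<>();
        for (List<Row> source : sources) {
            for (Row row : source) {
                cases.computeIfAbsent(row.caseId, id -> new ArrayList<>()).add(row.payload);
            }
        }
        return cases;
    }

    /** Starts the stand-in imports concurrently, waits for both, then assembles. */
    static Map<Long, List<String>> loadAndAssemble() {
        CompletableFuture<List<Row>> scores = CompletableFuture.supplyAsync(
            () -> List.of(new Row(1, "score:0.91"), new Row(2, "score:0.12")));
        CompletableFuture<List<Row>> documents = CompletableFuture.supplyAsync(
            () -> List.of(new Row(1, "doc:A"), new Row(1, "doc:B")));
        // join() blocks until each import finishes; the source order is fixed here.
        return assemble(List.of(scores.join(), documents.join()));
    }

    public static void main(String[] args) {
        System.out.println(loadAndAssemble());
    }
}
```

With data collocated on caseId, an assembly step like this can run where the rows already live, without pulling child rows across members.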
Impacts & Results
•  Simplified implementation & code reduction
•  Maintained or improved data motion rates
–  Case study CPU bound
–  Additional improvements in the backlog
•  Improved system throughput
Questions?
Join the Apache Geode Community!
•  Check out: http://geode.incubator.apache.org
•  Subscribe: user-subscribe@geode.incubator.apache.org
•  Download: http://geode.incubator.apache.org/releases/
#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

More Related Content

What's hot

Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid DataWorks Summit
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkDataWorks Summit
 
Build your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceBuild your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceApache Geode
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainData Con LA
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsClaudiu Barbura
 
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement VMware Tanzu
 
Spark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsSpark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsClaudiu Barbura
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Spark Summit
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesDataWorks Summit
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive DataWorks Summit
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Databricks
 
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward
 
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Stephen Darlington
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...Spark Summit
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at ScaleElasticsearch
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Spark Summit
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...DataWorks Summit
 

What's hot (20)

Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
ApexMeetup Geode - Talk1 2016-03-17
ApexMeetup Geode - Talk1 2016-03-17ApexMeetup Geode - Talk1 2016-03-17
ApexMeetup Geode - Talk1 2016-03-17
 
Build your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceBuild your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open Source
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGain
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
 
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
 
Geode Meetup Apachecon
Geode Meetup ApacheconGeode Meetup Apachecon
Geode Meetup Apachecon
 
Spark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsSpark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internals
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
 
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
 
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at Scale
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
 

Viewers also liked

#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future DesignPivotalOpenSourceHub
 
#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode AdaptorPivotalOpenSourceHub
 
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...PivotalOpenSourceHub
 
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and GeodePivotalOpenSourceHub
 
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using GeodePivotalOpenSourceHub
 
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...PivotalOpenSourceHub
 
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache GeodePivotalOpenSourceHub
 
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)PivotalOpenSourceHub
 
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System ArchitecturesPivotalOpenSourceHub
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analyticsPivotalOpenSourceHub
 
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...PivotalOpenSourceHub
 
#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and FuturePivotalOpenSourceHub
 
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & GeodePivotalOpenSourceHub
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"PivotalOpenSourceHub
 
#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed SystemsPivotalOpenSourceHub
 
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + DellGet Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + Dellskahler
 
Build Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + DellBuild Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + Dellskahler
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal GemfireIn-Memory Computing Summit
 
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Christian Tzolov
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 

Viewers also liked (20)

#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design
 
#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor
 
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
 
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
 
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
 
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
 
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
 
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
 
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
 
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
 
#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future
 
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
 
#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems
 
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + DellGet Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
 
Build Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + DellBuild Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + Dell
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
 
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 

Similar to #GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
QCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformQCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformDeepak Chandramouli
 
Spring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in ActionSpring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in ActionJohn Blum
 
Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019Goutam Tadi
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to GreenplumDave Cramer
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetTigerGraph
 
Giga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching OverviewGiga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching Overviewjimliddle
 
Online Upgrade Using Logical Replication
 Online Upgrade Using Logical Replication Online Upgrade Using Logical Replication
Online Upgrade Using Logical ReplicationEDB
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
 






#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

  • 1.
  • 2. © 2016 Pivotal Software, Inc. All rights reserved. Large Scale Fraud Analytics: GemFire Greenplum Connector (G2C)
  • 3. Background
      • Government fraud revenue retention program
      • Detecting & retaining ~$5B annually
          - Primary focus on identity theft
          - Processes up to 8 million cases per day
          - Current & historic data size ~60 TB (compressed)
      • Modifying architecture to integrate GemFire for scalable Java-based business logic, web service integration, and event-driven design
  • 4. Fraud Systems Simplified
      • Prepare: Ingest; Restructure (ETL)
      • Score: Model Evaluation
      • Disposition: Business Logic; Prioritization
      • Respond: Investigation; Stop Payments
      • Diagram labels: Business Logic Engine, ETL, Reporting, In-db Analytics, Application Services
  • 5. Case Study Architecture: Scaling Up
      • Diagram labels: GemFire, Greenplum, Spring Boot App Services, Informatica w/ PWX (ETL), Business Objects (Reporting), Legacy Logic Implementation, Logic Engine, In-db Analytics, Greenplum
      • Prepare: Ingest; Restructure (ETL)
      • Score: Model Evaluation
      • Disposition: Business Logic; Prioritization
      • Respond: Investigation; Stop Payments
  • 6. Pivotal Greenplum (GPDB)
      • Postgres Community OSS
          - Original fork of 8.2.15
          - Massively parallel processing database
      • Master coordinates queries across segment databases
      • Supports in-database model evaluation
          - MADlib, PL/R, SAS
      • Diagram labels: GPDB Logical, GPDB Physical, GPDB Software, Master, Segments
  • 7. Initial Implementation
      • Fraud model results evaluated by business logic engine
      • Flat file data extraction
          - Significant custom code to construct required object model
          - Table → CSV → POJO
      • Shared element in an otherwise distributed system
          - Performance considerations
      • Diagram labels: GPDB, Legacy Logic Implementation
  • 8. Architecture Adjustments
      • New requirements introduced external integrations
          - Drives desire for web services
      • Desire to improve performance & simplify codebase
      • Expanding business logic
          - Logic engine run as a GemFire function
      • Diagram labels: GemFire, GPDB, Legacy Logic Implementation, Spring Boot (App Services)
  • 9. GemFire Greenplum Connector
  • 10. Context
      • Greenplum: ANSI SQL, Analytical, Parallel, Configurable Data Load
      • GemFire: Native API, REST / HTTP, Transactional
      • Custom Apps: App 1, App 2
      • Diagram labels: Transactional data write-behind; Data Science, Analytics & ML
  • 11. GemFire Greenplum Connector (G2C)
      • Extension package for GemFire
      • Provides simple import and export of data between GemFire regions & Greenplum tables
          - Parallel data motion leveraging Greenplum’s external table interface
      • Simple mapping between table rows and PdxInstance
          - Flat object-relational mapping
          - Set of predefined type conversions
          - Configurable GemFire data collocation
  • 12. Diagram labels: Greenplum, Master, Segments, GemFire, G2C, Data Interfaces, JDBC / ODBC, Data Node, Data Node, Control Logic
  • 13. Sample Data Import / Export
      GpdbService is the primary entry point for explicitly invoked data motion:
      1. Import: loads the full table contents from Greenplum
      2. Export: sends region contents to Greenplum

          Cache cache = CacheFactory.getAnyInstance();
          GpdbService gpdb = GpdbService.getInstance(cache);
          long count;
          count = gpdb.importRegion(region);  // (1) import
          count = gpdb.exportRegion(region);  // (2) export
  • 14. Basic Cache Configuration
      Configured via the GemFire extension framework:
      1. Each region maps to a JNDI data source backed by Greenplum
      2. Link an entity type and table
      3. Declare a field to be used as the key
          - Compound keys supported
      4. Define a mapping between fields and table columns
          - Default auto-configuration
          - Optional name and column attributes for naming convention changes
          - Class used to control type conversion
          - Set of built-in types

          <region name="Parent">
            <region-attributes refid="PARTITION">
              <partition-attributes/>
            </region-attributes>
            <gpdb:store datasource="datasource">  <!-- (1) -->
              <gpdb:types>
                <gpdb:pdx name="io.pivotal...entity.Parent" table="parent">  <!-- (2) -->
                  <gpdb:id field="id" />  <!-- (3) -->
                  <gpdb:fields>  <!-- (4) -->
                    <gpdb:field name="name" />
                    <gpdb:field name="id" column="id" />
                    <gpdb:field name="income" class="java.math.BigDecimal" />
                  </gpdb:fields>
                </gpdb:pdx>
              </gpdb:types>
            </gpdb:store>
          </region>
  • 15. Configuring Collocation
      Parent-child foreign key relationships are supported through collocation:
      1. Compound key configurations result in a HashMap-based key in GemFire
      2. The provided partition resolver works with compound keys

          <region name="Child">
            <...>
              <partition-resolver>  <!-- (2) -->
                <class-name>
                  io.pivotal.gemfire.gpdb.IdPartitionResolver
                </class-name>
                <parameter name="field">
                  <string>parentId</string>
                </parameter>
            </...>
            <gpdb:id>  <!-- (1) -->
              <gpdb:field ref="parentId" />
              <gpdb:field ref="id" />
            </gpdb:id>
            <gpdb:fields>
              <gpdb:field name="parentId"/>
              <gpdb:field name="id" />
            </...>
  • 16. Configuring Automatic Synchronization
      • Data exported to Greenplum via asynchronous eventing
          - Time and batch size triggers available
      • Causes each GemFire member to independently interact with Greenplum
          - Configure GPDB resource queues accordingly

          <region name="Child">
            <...>
            <gpdb:store datasource="datasource">
              <gpdb:synchronize mode="automatic" time-interval="3000" persistent="false" />
              <gpdb:types>
                <...>
  • 17. Case Study G2C Configuration Details
      • Existing required domain objects
          - Multiple many-to-one groupings
      • Wide tables / objects (500+ fields)
      • Data collocation configured on caseId
      • Source tables wrapped in views
      • Diagram: CaseWrapper (caseId, …) is the "1" side of many-to-one (*) links from ModelScores (caseId, …), Documents (caseId, …), PriorHistory (caseId, …), and OtherData… (caseId, …); LogicResults (caseId, …) is shown separately
  • 18. Simple Loading: Single Table per Object
      • Sequence diagram participants: :LoadTrigger, :GPDBService, :Region, :AsyncEventListener, :LogicEngine, results:Region
      • Messages, in order: Import(), put(), processEvents(), process(), put()
  • 19. Complex Loading: Multiple Tables per Object
      • Sequence diagram participants: :LoadTrigger, :MergeLoader, :GPDBService, :Region, :LogicEngine, results:Region
      • Messages and fragments shown: executeFunction(), Import(), put(), a "par" (parallel) fragment, assemble(), process(), put()
  • 20. Impacts & Results
      • Simplified implementation & code reduction
      • Maintained or improved data motion rates
          - Case study CPU bound
          - Additional improvements in the backlog
      • Improved system throughput
  • 21. Questions?
  • 22. Join the Apache Geode Community!
      • Check out: http://geode.incubator.apache.org
      • Subscribe: user-subscribe@geode.incubator.apache.org
      • Download: http://geode.incubator.apache.org/releases/
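
The collocation idea from slide 15 can be illustrated with a small, self-contained Java sketch. It does not use the GemFire or G2C APIs: `routingObject`, `bucketFor`, and `childKey` are hypothetical helpers, and the bucket count of 113 is only an illustrative value. The sketch assumes that a resolver routes on the `parentId` field of a map-based compound key, so that a child entry hashes to the same bucket as its parent.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: plain-Java stand-ins, not the connector's resolver.
class CollocationSketch {

    // Picks the routing value: the named field of a compound (map-based) key,
    // or the key itself when the key is a simple value.
    static Object routingObject(Object key, String field) {
        if (key instanceof Map) {
            return ((Map<?, ?>) key).get(field);
        }
        return key;
    }

    // Maps a routing value to one of bucketCount buckets.
    static int bucketFor(Object routingValue, int bucketCount) {
        return Math.floorMod(routingValue.hashCode(), bucketCount);
    }

    // Builds a compound child key holding both the foreign key and the child's own id.
    static Map<String, Object> childKey(long parentId, long id) {
        Map<String, Object> key = new HashMap<>();
        key.put("parentId", parentId);
        key.put("id", id);
        return key;
    }

    public static void main(String[] args) {
        int buckets = 113; // assumed bucket count, for illustration
        Object parentKey = Long.valueOf(42L); // parent keyed by its simple id
        int parentBucket = bucketFor(routingObject(parentKey, "parentId"), buckets);

        // Two children of parent 42 with different ids.
        for (long childId = 1; childId <= 2; childId++) {
            Map<String, Object> key = childKey(42L, childId);
            int childBucket = bucketFor(routingObject(key, "parentId"), buckets);
            System.out.println("child " + childId + " collocated: " + (childBucket == parentBucket));
        }
    }
}
```

Because only the `parentId` value feeds the hash, every child of a given parent lands in that parent's bucket regardless of the child's own `id`.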