4. Solutions: new mindset
Yesterday:
- Static applications
- Designed to fulfill current requirements
- Performance constraints
- Domain experts versus IT experts
Tomorrow:
- Flexible applications
- Designed to fulfill tomorrow's requirements
- Performance is not limiting
- Domain experts work hand in hand with IT experts
5. Evolution using Neo4j
[Diagram: Neo4j Platform]
- Platform components: Graph Transactions, Graph Analytics, Data Integration, Development & Admin, Analytics Tooling, Drivers & APIs, Discovery & Visualization
- Users: Developers, Admins, Applications, Business Users, Data Analysts, Data Scientists
- Neo4j Platform + 3rd Party Tools: “The Graph Advantage”
- “The Graph Advantage” + Domain know-how + Professional Services + PS Packages: Graph Based Solution
6. Neo4j based Solutions
Neo4j Graph Based Solutions
- Neo4j DB / Platform
- Data Integration Platform
- Blueprint data model
- Blueprint architecture
- Domain know-how
- Professional Services
7. Evolution using Neo4j
Neo4j enables Graph Based Solutions with a need for:
- Agility
- Intuitiveness
- High performance to support connected data scenarios
- Scalability when traversing connected data
8. Graph Based Solutions
Key Features:
- 360-degree view on data
- Using data connections as a value
- Intuitive: supports business needs
- Flexible: enabled for additional requirements
- Speed: real-time query enabled
- Resource efficient
Added Value:
- Enables up-sell / cross-sell
- Finding patterns within the data
- Detect anomalies
- Prevent rather than detect
- Enables conversation across functions
- Comply with regulations
- What-if analysis
Solution categories: Telco OSS, Telco BSS, GDPR, Fraud, Recommendations, MDM
10. The Impact of Fraud
Banking: $16B payment card fraud in 2014*
Payment card fraud alone accounts for over 16 billion dollars in losses for the banking sector in the US.
E-commerce: $32B yearly e-commerce fraud**
Fraud in e-commerce is estimated to cost over 32 billion dollars annually in the US.
Insurance: $80B estimated yearly impact***
The impact of fraud on the insurance industry is estimated to be $80 billion annually in the US.
*) Business Wire: http://www.businesswire.com/news/home/20150804007054/en/Global-Card-Fraud-Losses-Reach-16.31-Billion#.VcJZlvlVhBc
**) E-commerce expert Andreas Thim, Klarna, 2015
***) Coalition against insurance fraud: http://www.insurancefraud.org/article.htm?RecID=3274#.UnWuZ5E7ROA
24. Traditional Fraud Detection Methods
1. Endpoint-Centric: analysis of users and their end-points (PCs, mobile phones)
2. Navigation-Centric: analysis of navigation behavior and suspect patterns (IP addresses, user IDs)
3. Account-Centric: analysis of anomaly behavior by channel (comparing transactions, identity vetting)
25. Traditional Fraud Detection Methods: Weaknesses
DISCRETE ANALYSIS
1. Endpoint-Centric: analysis of users and their end-points
2. Navigation-Centric: analysis of navigation behavior and suspect patterns
3. Account-Centric: analysis of anomaly behavior by channel
Unable to detect:
• Fraud rings
• Fake IP addresses
• Hijacked devices
• Synthetic identities
• Stolen identities
• And more…
27. Fraud Detection With Connected Analysis
[Chart: axes "Revolving Debt" and "Number of Accounts", marking the band of normal behavior and a fraudulent pattern]
28. Augmented Fraud Detection
DISCRETE ANALYSIS
1. Endpoint-Centric: analysis of users and their end-points
2. Navigation-Centric: analysis of navigation behavior and suspect patterns
3. Account-Centric: analysis of anomaly behavior by channel
CONNECTED ANALYSIS
4. Cross Channel: analysis of anomaly behavior correlated across channels
5. Entity Linking: analysis of relationships to detect organized crime and collusion
29. Preventing Fraud
Data connections assist the business by identifying patterns:
- Networks of People: e.g. detecting fraud rings, finding connections and shortest paths
- Processes and Transactions: e.g. e-commerce fraud, AML
- Ownership: e.g. AML, tax fraud, legal entities
30. The Power of Cypher
Fraud Ring:
MATCH ring = (suspect:AccountHolder)-[*]->(contactInformation)<-[*..5]-(:AccountHolder)-[*]->(suspect)
RETURN ring
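To make the idea behind the query tangible without a database, here is a minimal Python sketch. It is not the Cypher query itself and not how Neo4j evaluates it. It only illustrates the underlying notion that account holders who share contact information form a connected group. The sample data, the identifiers, and the function name `find_rings` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: account holder -> contact information items.
HOLDERS = {
    "holder_a": {"phone:555-0101", "addr:1 Main St"},
    "holder_b": {"phone:555-0101", "ssn:123"},
    "holder_c": {"ssn:123", "addr:9 Elm St"},
    "holder_d": {"phone:555-0199"},
}


def find_rings(holders, min_size=2):
    """Group account holders that are linked through shared contact information."""
    # Invert the mapping: contact item -> holders that use it.
    by_contact = defaultdict(set)
    for holder, contacts in holders.items():
        for contact in contacts:
            by_contact[contact].add(holder)

    seen = set()
    rings = []
    for start in sorted(holders):
        if start in seen:
            continue
        # Walk holder -> contact -> holder until no new holders turn up.
        ring, stack = set(), [start]
        while stack:
            holder = stack.pop()
            if holder in ring:
                continue
            ring.add(holder)
            for contact in holders[holder]:
                stack.extend(by_contact[contact] - ring)
        seen |= ring
        # A single holder with unshared contact data is not a ring.
        if len(ring) >= min_size:
            rings.append(sorted(ring))
    return rings


print(find_rings(HOLDERS))
```

In this toy data, holder_a and holder_b share a phone number and holder_b and holder_c share an SSN, so the three end up in one group, while holder_d stays on its own and is not reported.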
31. Top Tier Electronic Payment Services
Case study: Comply with AML regulations
Challenge
• Needed to comply with AML regulation
• Unable to produce the required reports from the leading RDBMS-based systems
• Transactions are fragmented and transferred "from rings to rings"
Use of Neo4j
• Neo4j is used to store and report on transactions over the previous 24 months
• Business users / fraud analysts are enabled to investigate data and detect patterns
Result/Outcome
• Complies with regulations
• Neo4j also enabled the company to detect potential money laundering early and act against it
“We have been unable to detect AML fraud patterns in the SQL based operational systems. Graphs and Graph visualisation is a key enabler technology.”
– Top Tier Payment Service
33. What about Machine Learning?
Neo4j is an enabler technology:
• Automated detection of fraud patterns via Cypher
• Detecting paths
• Graph algorithms (e.g. centrality, community)
• Algorithms as background tasks -> mark corresponding nodes
• Automatically cancel business transactions
• Score and weigh identified patterns
• ….
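As a rough illustration of "run an algorithm in the background and mark the corresponding nodes", the following Python sketch computes a simple degree centrality over a handful of edges and returns the nodes that reach a threshold. It is a stand-in for the idea only. It does not use Neo4j or its graph algorithm procedures, and the edge data, the threshold, and the function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical edges: (account, shared device or card).
EDGES = [
    ("acct1", "device_x"),
    ("acct2", "device_x"),
    ("acct3", "device_x"),
    ("acct3", "card_9"),
    ("acct4", "card_7"),
]


def degree_centrality(edges):
    """Count how many edges touch each node (undirected degree)."""
    degree = defaultdict(int)
    for left, right in edges:
        degree[left] += 1
        degree[right] += 1
    return dict(degree)


def mark_suspicious(edges, threshold=3):
    """Return the nodes whose degree reaches the assumed alert threshold."""
    scores = degree_centrality(edges)
    return sorted(node for node, score in scores.items() if score >= threshold)


print(mark_suspicious(EDGES))
```

Here the shared device is the only node touched by three edges, so it is the one that gets marked. In a real deployment the score and threshold would come from the fraud team's own rules.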
34. Why Graph is Superior for Fraud Detection
Fraud Requirement | Traditional Approaches | Neo4j Approach
Find connected data patterns over an unlimited number of "hops" | Complex queries with hundreds of join tables | Simple single query traverses all enterprise systems
Real-time acting on incoming events in ever-changing formats for potential fraud | Limitations inherited from SQL database schema | Schema-free database enables connecting any nodes with each other
Effort required to add new data and systems | Days to weeks to rewrite schema and queries | Draw new data connections on the spot
Time to deployment | Months to years | Weeks to months
Response time to fraud requests | Minutes to hours per query | Milliseconds per query
Form of fraud incidents / investigations | Text reports that are not visual and prove very little | Visual patterns and the path to follow through your system
Bottom line | Long, ineffective and expensive | Easy, fast and affordable
37. [Diagram: existing environment]
- Sources: Money Transferring, Purchases, Bank Services
- Stores: Relational database, Data Lake
- Additional data: Merchant Data, Credit Score Data, Other 3rd Party Data
- Data Science team: Develop Patterns
Data Lake:
+ Good for MapReduce
+ Good for analytical workloads
- No holistic view
- Non-operational workloads
- Weeks-to-months processes
38. [Diagram: Neo4j powers a 360° view of transactions in real time]
- Sources: Money Transferring, Purchases, Bank Services
- SENSE: transaction stream into the Neo4j Cluster
- RESPOND: alerts & notification
- LOAD RELEVANT DATA: Relational database, Data Lake
- Additional data: Merchant Data, Credit Score Data, Other 3rd Party Data
- Data Science team: Visualization UI, Develop Patterns, Fine Tune Patterns
39. [Diagram: Neo4j powers a 360° view of transactions in real time]
- Sources: Money Transferring, Purchases, Bank Services
- SENSE: transaction stream into the Neo4j Cluster
- RESPOND: alerts & notification
- LOAD RELEVANT DATA: Relational database, Data Lake
- Additional data: Merchant Data, Credit Score Data, Other 3rd Party Data
- Data Science team: Visualization UI, Develop Patterns, Fine Tune Patterns
- Data set used to explore new insights
41. [Architecture diagram: Neo4j based fraud solution]
- Users: Fraud Analysts, Admin / Superuser
- User interfaces and outputs: UI for Fraud Analysis, Data Visualization, Management Dashboard, Fraud Reports, Real Time Alerts, Neo4j Browser, Admin UI
- Access: Neo4j Bolt Driver
- Center: Neo4j Database Cluster, Neo4j APOC, Fraud Detection Algorithms
- Data ingest: Data Ingest Mgmt., Batch, Real-Time, Data Buffering (Queue)
- Below: System Specific Adapters / Scripts / Connectors; Customer Data Sources / Systems / Applications
Legend: Neo4j Provided Components; Custom built Neo4j/Customer; Customer/SI
42. Neo4j powered Fraud Solution
Characteristic: Benefit for Fraud Solution
• Agility: constant catch-up with fraudster techniques supported; enabled for future requirements; solution can be built iteratively; fast implementation cycles; schema-free DB supports “connect anything”
• Intuitiveness: enables fraud analysts to use the technology; using visualization to detect patterns; drilling into suspicious patterns
• Speed: unlimited number of traversals to detect complex connections within the data; response time enables fraud prevention
• Leverage Data Connections: 360-degree customer view enabled / provided
• Scalability: hardware efficiency with real-time patterns
• TCO/ROI: adding on top of existing infrastructure protects investments
44. Who can help?
Professional Services:
- Extend and leverage domain expertise
- Best practices
- Using building blocks
- Don’t “re-invent the wheel”
- Speed up development and deployment
- Access to Neo4j infrastructure (Development, Support, Product Management)
45. How to build Next Generation Solutions with Neo4j
Stefan Kolmar, VP Field Engineering
May 2018
Editor's Notes
Good afternoon,
Welcome to the talk “Next Generation Solutions built on Neo4j”, where I would like to show you some ways to take advantage of graphs when building solutions.
My name is Stefan Kolmar and I run Field Engineering for Neo4j in EMEA and APAC.
For the agenda, I would like to start with some general remarks on how Neo4j fits into building next generation solutions. Then I would like to get a bit more specific on the advantages for fraud and recommendations,
and end with conclusions.
The question is: with all the changes in technology, the new abilities to work with data, and the ever-increasing amount of data, is a new way of thinking required?
I would say: “Yes”
Yesterday's applications were quite often static and were designed to fulfill the current requirements. Performance when accessing the data was a limiting factor, and quite often there was a disconnect between the domain experts and the ones who implemented the application afterwards: the IT experts.
Click
What we need tomorrow is flexible applications, designed to fulfill ever-changing requirements. If you use the right technology and you use it accordingly, performance is no longer a limiting factor.
The goal is that domain experts work hand in hand with IT experts.
How can we get to those solutions?
Let me first highlight the evolution using Neo4j:
With Neo4j as the graph database, we are building the Graph Platform as an entire infrastructure. Together with 3rd party tools you can build what I would call the graph advantage: an ecosystem based on graphs.
If you now combine this ecosystem with domain know-how, Professional Services, and potentially Professional Services packages, you can build a graph-based solution.
Neo4j enables Graph Based Solutions with a need for:
Agility: constantly changing requirements
Intuitiveness: so that everybody in your organisation can understand and influence the solution
High performance to support connected data scenarios
Scalability when traversing connected data rather than building ad-hoc SQL queries
Some categories of solutions we have seen and we have worked on with customers are in
Telco OSS and BSS
GDPR architectures as foundations for solutions,
Recommendations,
Fraud
And MDM
Key features of these solutions are
Just to set the stage… We’re not going to give you the complete market analysis on Fraud, you guys probably know this better than us. But what you can say, is that it’s a significant cost of business, and it’s getting increasingly complex.
Who are today’s fraudsters? Well, I’m sorry to say, but often they are more sophisticated than most systems one would have to prevent them. The biggest fraudsters don’t operate as lone anomalies today….
Who are today’s fraudsters?
Instead of an individual, fraudsters are typically organized in groups - also called Fraud rings
Instead of using their own identity, they manufacture identities - also called synthetic identities
They also perpetrate fraud using stolen identities and hijacked devices
Of course, there are many different types of fraud — and all with their own sets of complexity.
It can be credit card fraud (which we will address a little later)
The merchants themselves can be fraudulent
Fraud rings are a huge problem, and very hard to detect and stop with traditional systems.
Insurance fraud is of course very widespread, as is e-commerce fraud
But perhaps the most important fraud, is the fraud we don’t know about yet…
The challenge with fraud — especially from a data-perspective, is that it is so many things at the same time. First, it’s constantly evolving. It can appear both simple and complex at the same time. Sometimes a scheme involves few or many players, it appears both digitized and in analog forms, and there’s always that sensation that fraud is always “one step ahead”.
So let’s talk briefly about Fraud from a data-modeling perspective.
Because of the complex and varied nature of fraud, all the detection efforts we have in place store enormous amounts of data. Think of it in terms of storing every transaction made over a period of time, for example.
And finding clear anomalies is one, sort of traditional, way to go about detecting fraud. A credit card couldn’t be in two different locations at the same time, for example.
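The "same card in two places at once" check is a classic discrete rule, and a minimal Python sketch of it could look like the following. The transaction tuples, the 30-minute window, and the function name are assumptions for illustration. A real rule would also account for distance and travel time.

```python
from itertools import combinations

# Hypothetical transactions: (card id, minute of day, city).
TRANSACTIONS = [
    ("card_1", 600, "Berlin"),
    ("card_1", 605, "Madrid"),
    ("card_2", 610, "Berlin"),
    ("card_2", 900, "Paris"),
]


def impossible_location_pairs(transactions, window_minutes=30):
    """Return pairs of transactions on one card in different cities within the window."""
    flagged = []
    for first, second in combinations(transactions, 2):
        same_card = first[0] == second[0]
        different_city = first[2] != second[2]
        close_in_time = abs(first[1] - second[1]) <= window_minutes
        if same_card and different_city and close_in_time:
            flagged.append((first, second))
    return flagged


print(impossible_location_pairs(TRANSACTIONS))
```

Only card_1 is flagged here: it is used in two cities five minutes apart, whereas card_2 has several hours between its two transactions.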
These patterns can occur in many different forms and levels of complexity. However, the job of the fraud prevention team is basically to react to patterns by 1) detecting them and then 2) responding, and doing all this as fast as possible.
The key to achieve this is very much a question of the underlying technology.
If you store your data like this, as you would in a relational database, you will probably have pretty good success with your discrete analysis, finding the anomalies we talked about earlier.
However, if you’re looking to detect and respond to patterns, this structure won’t be anywhere near sufficient. Instead, what you need to do is re-imagine your data in the way it’s connected.
…which of course is data modelled as a graph. And this is how data is stored and queried in a graph database like Neo4j.
It’s obvious that traditional technologies, which were aimed at individuals and their behavior, are inadequate to detect and prevent sophisticated fraud rings. So why is that?
Let’s look at what traditional Fraud Detection looks like.
1) Endpoint-centric solution that analyzes the characteristics of the PC, mobile or telephony device used to access the enterprise system.
2) Navigation- and network-centric solution that analyzes the navigation of a session, usually by IP address and user ID, to see if it looks anomalous relative to normal user or peer group behavior.
3) User- or entity-centric solution, in which transactions are compared to what is expected of the user or entity. To support the identity prong, this layer also includes integration of external and internal data to help vet an identity, especially in a risky transaction (such as a new account application), or verify a suspect authentication or high-risk transaction.
What all these methods are examples of is discrete analysis, which is very effective if you know what to look for.
The weaknesses, though, are that discrete analysis does a poor job when you want to detect fraud rings, fake IP addresses, hijacked devices, synthetic identities, stolen identities, and more.
The challenge with systems that focus on discrete analysis for fraud detection is their inability to detect fraud rings or synthetic identities, as this requires additional context. Let’s take an example.
[In this simple fraud detection approach to detect credit card fraud, it is relatively easy to spot outliers. But what if the fraudster commits fraud while still exhibiting normal behavior. Well - this is exactly how fraud rings operate]
[A fraud ring rarely strays outside the normal behavior band. Instead they operate within normal limits and commit widespread fraud. This is very hard to detect by systems that are looking for outliers or activities outside the normal band.]
Today, financial services firms need to augment their discrete analysis capability with connected analysis. Whether it is a fraud ring or stolen or synthetic identities, a graph database is a powerful tool for this.
So if you want to prevent fraud from happening, you have a good chance of doing so by using the connections within the data,
such as for a network of people -> detecting fraud rings
Or
detecting connections within processes and transactions
Or
detecting connections within ownerships
to identify patterns for money laundering, tax fraud and legal entities.
Cypher is your friend: with Cypher you can traverse your connected data and easily identify connections/relationships.
Who is using Neo4j to identify fraud?
The first customer I want to highlight is a top tier payment service:
They were basically forced by regulations to put systems in place to identify potential money laundering.
Their SQL-based systems were unable to handle the requirements, and therefore they have built a system based on graphs with Neo4j.
As a result, they are able to group on one side several connected accounts as “senders” of money and on the other side the group of receivers.
If then, as an example, thousands of transactions are sent from a group of people in, let’s say, Tel Aviv to a group of people in Colombia, it is detected and can be investigated....
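The grouping idea described here can be sketched in a few lines of Python: count transfers per (sender group, receiver group) pair and surface the pairs with unusually many transfers. This is only an assumed simplification, not the customer's system. The transfer records, group labels, threshold, and function name are hypothetical, and in practice the groups themselves would be derived from the connections in the graph.

```python
from collections import Counter

# Hypothetical transfers: (sender account, sender group, receiver account, receiver group).
TRANSFERS = [
    ("s1", "group_a", "r1", "group_b"),
    ("s2", "group_a", "r2", "group_b"),
    ("s3", "group_a", "r1", "group_b"),
    ("s4", "group_c", "r3", "group_b"),
]


def heavy_flows(transfers, min_count=3):
    """Count transfers per (sender group, receiver group) and keep the busy pairs."""
    counts = Counter(
        (sender_group, receiver_group)
        for _, sender_group, _, receiver_group in transfers
    )
    return {pair: count for pair, count in counts.items() if count >= min_count}


print(heavy_flows(TRANSFERS))
```

With this toy data only the flow from group_a to group_b reaches the threshold, which is the kind of concentrated group-to-group flow an analyst would then investigate.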
Can Neo4j also be used to support machine learning?
Neo4j is an enabler technology:
Automated detection of fraud patterns via Cypher queries
Automated detection of fraud rings
Shortest paths / existence of paths
Using graph algorithms as the foundation (e.g. centrality, community)
Run pattern-detecting algorithms as background tasks and mark corresponding nodes
Automatically cancel business transactions (e.g. credit card)
….
If we now compare using Neo4j with traditional approaches, we see that
Neo4j supports finding connected patterns over an unlimited number of hops, whereas traditional approaches are limited by SQL joins
Neo4j helps to keep up with fraudsters constantly changing the way they commit fraud by allowing you to connect anything with anything, whereas SQL relies on a schema, and schema changes are hard at best
New data systems can be rapidly integrated
With the agility of Neo4j, deployment time can be greatly reduced
As a bottom line: Neo4j helps you to build an easy, fast and affordable solution.
If you look at existing environments, you most often see relational databases assisting data science teams in supporting investigations. This is good for discrete analysis, but does not provide a holistic view of data relationships.
To integrate the data, these data sources are then often pushed into a data lake. This is good for MapReduce algorithms, and it is good for analytical workloads. It still provides no holistic view and is not good for operational workloads. Typically you are talking here about fraud detection processes which take several days, if not more.
Now if you load the relevant data into Neo4j, you can provide a 360-degree view on the data in real time. With appropriate visualisation you give data scientists the ability to access the data in real time. Existing transactional systems are connected so that the transaction stream is loaded and made sense of in conjunction with the existing connected data, and you can respond in real time with alerts and notifications.
The data set can be used to explore new insights and to find and detect patterns.
So this is an example architecture with the building blocks you need to build a fraud solution based on Neo4j. In the center of this architecture you see the Neo4j database as a cluster. Fraud detection algorithms can work automatically and directly on the database to detect fraud patterns and derive connections. Applications on top, designed for fraud analysts or managers with dashboards, can access the database via the Bolt driver. The Neo4j database is directly connected in batch and/or real time with all other source databases to import the relevant data initially and to feed detections from fraud investigations into other systems. That way a CRM system can get the relevant information so that, as an example, a suspicious pattern or customer is marked in the CRM system, or a transaction such as a credit card transaction can be cancelled.
So if we summarize the characteristics of a Neo4j powered fraud solution, we can conclude that it adds value with
Agility: you can constantly catch up with fraudster activities and you are enabled for future requirements
Intuitiveness: fraud analysts are enabled to use the technology and can use visualisation techniques to detect patterns
Speed: traversals are cheap and provide performance so that you can detect complex connections
You can leverage data connections to get a 360-degree view on customers
Hardware is used efficiently with real-time patterns
Who can help –> additional help? Move to conclusions