Syoncloud Big Data for Retail Banking | Syoncloud

14/10/2013

Syoncloud Big Data for Retail Banking
Syoncloud offers a comprehensive Big Data / Data Science solution for retail banks.
We cover areas such as:
Individualization of product offers to existing clients
Early fraud detection and fraud damage mitigation
Prediction of product cancellations and client defections
Optimal allocation of cash to ATMs and bank branches
Minimization of usage of expensive bank channels such as branch visits
Reliable assessment of clients for debt products

Common Datasets
Common Datasets are used as a foundation for complex analysis.

Creation of Common Datasets for Analysis Related to Bank's Clients
We create a dataset of monthly expense and income categories for all clients, all their accounts and their complete history. This dataset is
created from bank account movements, direct debits and standing orders. Each account movement is usually accompanied by a movement
type code such as electricity, phone bill or restaurant. We also use the merchant's name, description and comment
fields to categorize each transaction. Direct debits and standing orders are also accompanied by type codes.
We recognize several categories of expenses such as housing expenses (rent or mortgage), energy expenses (gas and electricity), food and
household related expenses, education (schools, books, courses), car expenses (fuel and repairs), restaurants, big ticket items (TV, furniture),
taxes, recreation and hobbies, credit card and loan payments, luxury items and so on.
Income categories are salaries, dividends, tax refunds, social benefits, rental income, sales and so on. Simple regression analysis of this
dataset gives us overall trends for total expenses, income and savings as well as detailed trends for each income and expense category for
each client.
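
As an illustration only, the categorization and trend steps could look like the following Python sketch. The type codes, keywords, field names and function names are assumptions made for this example and are not the rules used in a real project. The trend is the slope of an ordinary least-squares line fitted to monthly totals.

```python
from collections import defaultdict

# Hypothetical type codes and keywords; every bank has its own code lists.
TYPE_CODE_CATEGORIES = {"ELEC": "energy", "GAS": "energy", "RENT": "housing", "SAL": "salary"}
KEYWORD_CATEGORIES = {"restaurant": "restaurants", "fuel": "car", "school": "education"}


def categorize(movement):
    """Return a category from the type code, falling back to the text fields."""
    code = movement.get("type_code")
    if code in TYPE_CODE_CATEGORIES:
        return TYPE_CODE_CATEGORIES[code]
    text = " ".join(
        movement.get(field, "") for field in ("merchant", "description", "comment")
    ).lower()
    for keyword, category in KEYWORD_CATEGORIES.items():
        if keyword in text:
            return category
    return "uncategorized"


def monthly_totals(movements):
    """Sum amounts per (client_id, category, month); dates are 'YYYY-MM-DD' strings."""
    totals = defaultdict(float)
    for m in movements:
        totals[(m["client_id"], categorize(m), m["date"][:7])] += m["amount"]
    return dict(totals)


def trend_slope(values):
    """Least-squares slope of the values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var
```

A positive slope for a client's monthly energy totals, for example, would indicate that this expense category is growing over time.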

Machine Learning and Predictions
We use a full range of machine learning algorithms and models to make predictions. There are two broad categories: supervised and
unsupervised algorithms.
Supervised learning algorithms use historical data to learn that certain combinations and values of inputs lead to certain outputs. We create
models that are trained and verified on samples of historical data. Sample data can be chosen randomly but we have seen better results if we
categorize our datasets first. In the case of a customer dataset we create categories such as age, income, location based on town size, education
and savings. Each category is split into brackets. For example, the age category is split into 20 five-year age brackets. We know how many
customers are in each age bracket, so we can sample a certain percentage of records from each age bracket. We sample the other
categories in the same way. These samples are ideal for seeing which category makes the largest contribution to overall results. For example, we can see that education
makes the largest contribution to the acceptance of a certain investment product.
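
The bracket-based sampling described above is a form of stratified sampling. A minimal Python sketch follows, assuming customers are plain dictionaries with an `age` field. The function names, the fixed seed and the rule of keeping at least one record per bracket are choices made for this example.

```python
import random
from collections import defaultdict


def age_bracket(age, width=5):
    """Map an age to the lower bound of its five-year bracket, e.g. 37 -> 35."""
    return (age // width) * width


def stratified_sample(customers, fraction, seed=0):
    """Sample the same fraction of customers from every age bracket."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_bracket = defaultdict(list)
    for customer in customers:
        by_bracket[age_bracket(customer["age"])].append(customer)
    sample = []
    for bracket in sorted(by_bracket):
        members = by_bracket[bracket]
        k = max(1, round(len(members) * fraction))  # keep at least one per bracket
        sample.extend(rng.sample(members, k))
    return sample
```

The same grouping can be repeated for income, town size, education or savings by swapping the bracket function.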
Unsupervised machine learning algorithms look for unknown patterns in available data.
For example, we look for patterns of unusual client behaviour to find early signs of fraud. In the past we were limited to statistical analysis of
behaviour that was common to all clients or to large groups of clients. With unsupervised learning models we can find patterns that surface
in only a small number of records.

Individualization of Product Offers
By individualizing product offers to existing clients, banks save money on expensive broad marketing campaigns for bank products.
Products are offered only to customers who need them and are likely to accept them, so customers see fewer irrelevant offers. This
requires deep knowledge of who accepted given products in the past.

http://www.syoncloud.com/Syoncloud_Big_Data_for_Retail_Banking

As an input for our models we use a dataset of subscriptions to bank products and services for each client. This dataset includes previous
subscriptions and cancellation dates. We also use the common dataset of income and expense categories for each client and CRM data about
clients. We have created separate models for each product and subscription. In order to prepare suitable models we have to not only choose
and verify the best learning algorithm but also find which categories and variables have the biggest influence.

Early Fraud Detection and Fraud Damage Mitigation
This includes detection of identity fraud, credit card fraud, wire fraud, attacks on internet and mobile banking, and money laundering.
New types of fraud and new schemes require flexible and fast detection algorithms. In the past, banks used only statistical and rule-based
algorithms to find out whether suspicious activity was taking place on a customer's account. These algorithms were limited because they could only recognize
known frauds, they required expensive maintenance, they did not work with the full history of each client and they had a high level of false
positives.
We utilized a dataset of known fraud cases. We have created several categories of these frauds such as overdraft fraud with stolen identity,
stolen credit card, consumer loan fraud, credit card top-up with fraudulent check, stolen checks, skimming with card duplication, attacks on
online banking with stolen customer credentials and/or security devices, rogue online merchant fraud using credit cards and so on. We use
neural networks with backpropagation, decision tree algorithms and classification to find patterns and unknown occurrences of these
frauds in our existing data.

Prediction of Product Cancellations and Client Defections
Prediction of bank product cancellations and client defections is very time sensitive. The bank has just days to act before a client irreversibly
decides to cancel a product or move to the competition. The bank needs to identify clients who are likely to defect, contact them and proactively
offer alternative products or solve the client's issues. It is much cheaper to retain highly profitable clients than to attract them back.
We have used account movements, debit and credit card movements, the client dataset from CRM, the product subscription dataset, call centre
and branch visit transactions and log information as primary data sources for our analysis. We have also utilized the common datasets of
income and expenses.
We have prepared time series of key events such as direct debit cancellations, income to the account from salaries, dividends and rents,
transfers to the client's accounts at different banks, call centre and branch contacts made by the client separated into categories, cancellations of
credit cards and so on.
We have prepared another set of clients who match the defecting clients in categories such as age, income, savings and location for the same time interval but
who still remain clients. We have prepared matching time series for these clients as well.
Based on this data we were able to create models that predict the behaviour of clients before they irreversibly decide to move to
competitors. We have used several supervised learning algorithms such as Support Vector Machines for binary classification and Neural
Networks with Backpropagation for predictions. From unsupervised machine learning algorithms we have utilized K-Means and Mean Shift
Clustering after Principal Component Analysis was applied to reduce the dimensions of the input data.
We have identified several hundred profitable clients in recent data who match patterns of clients who moved their accounts to
competitors. These clients should be contacted by their respective bank branches.
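
The comparison set of clients who stayed with the bank can be built by matching on coarse profile brackets. The Python sketch below is illustrative only. The field names, the bracket widths (5 years, 10,000 currency units) and the one-to-one pairing rule are assumptions made for this example.

```python
def profile(client):
    """Coarse matching key built from hypothetical age, income, savings and location fields."""
    return (
        client["age"] // 5,            # five-year age bracket
        client["income"] // 10_000,    # income bracket (assumed width)
        client["savings"] // 10_000,   # savings bracket (assumed width)
        client["town_size"],           # location category based on town size
    )


def matched_controls(defectors, retained):
    """Pair each defector with one not yet used retained client of the same profile."""
    pool = {}
    for client in retained:
        pool.setdefault(profile(client), []).append(client)
    pairs = []
    for defector in defectors:
        candidates = pool.get(profile(defector), [])
        if candidates:
            pairs.append((defector["id"], candidates.pop()["id"]))
    return pairs
```

Defectors without a matching retained client are simply left out of the pairs, so the two sets stay comparable.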

Optimal Allocation of Cash for ATMs and Bank Branches
Demand for cash is highly variable during the year at many ATMs and bank branch locations. The variability is caused by weather, local events,
vacations, tourism and so on. It is important to predict the right amount of cash that needs to be deposited into ATMs as well as bank branches. It
is costly to service ATMs too often, and it is also costly to have cash machines out of order due to lack of cash. At the same time we want to limit
the amount of unnecessary cash that is stored for long periods in ATMs and bank branches, because it leads to suboptimal cash allocation and
attracts crime.
As the primary datasets we have used ATM service logs, geographic locations of ATMs and bank branches, a withdrawal dataset for each ATM,
weather reports for ATM and bank branch locations, schedules of sports, cultural or other events as well as holidays for all locations. We
have utilized credit and debit card movements to assess demand for cash at various locations and during different times of the year. We
have used the common datasets of income to see when salaries, social benefits and other income arrived in clients' accounts at different
locations.
We have created a dataset of median amounts of cash withdrawals for each day of the year and hour of day for all ATMs. This dataset is used to
calculate the influence of weather, events, day of the week or holidays on demand for cash at a given location.
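
One possible reading of the median dataset is sketched below in Python. It sums the withdrawals in each concrete hour and then takes the median of those hourly totals across years. This interpretation, the record layout and the function name are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_withdrawals(withdrawals):
    """Median hourly cash total per (atm_id, day_of_year, hour) across years."""
    # First sum the cash taken out in each concrete hour of each year ...
    hourly = defaultdict(float)
    for w in withdrawals:
        ts = datetime.fromisoformat(w["timestamp"])
        day_of_year = ts.timetuple().tm_yday
        hourly[(w["atm_id"], ts.year, day_of_year, ts.hour)] += w["amount"]
    # ... then take the median of those totals over the available years.
    grouped = defaultdict(list)
    for (atm_id, _year, day, hour), total in hourly.items():
        grouped[(atm_id, day, hour)].append(total)
    return {key: median(totals) for key, totals in grouped.items()}
```

Note that the day-of-year index shifts by one after 29 February in leap years, so a production version would need to align calendar dates explicitly.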

We have prepared a dataset of significant cultural, sports and other events during the past 4 years with location coordinates. We have calculated
the influence of each event on cash demand for all ATMs that are within a 300m radius of the given event. We were able to sort all events based on
their influence on cash demand. This dataset is used to predict the influence of similar events.
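
Selecting the ATMs within 300m of an event needs a distance between two coordinates. A minimal Python sketch using the haversine great-circle formula follows. The record layout and function names are assumptions made for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def atms_near_event(event, atms, radius_m=300):
    """Return the ids of ATMs located within radius_m metres of the event."""
    return [
        atm["id"]
        for atm in atms
        if distance_m(event["lat"], event["lon"], atm["lat"], atm["lon"]) <= radius_m
    ]
```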
We have also calculated the correlation between cash demand and local weather parameters such as precipitation, temperature and wind at the location of each
ATM.
We have created a correlation dataset between the days when clients receive income, such as salaries and social benefits, and cash demand at
different locations.
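
The text does not say which correlation measure was used. As an assumption, the Python sketch below uses the Pearson coefficient, which measures the linear relationship between two equally long series such as daily temperature and daily cash withdrawn at one ATM.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Values close to 1 or -1 indicate a strong linear relationship. Values near 0 indicate that the parameter has little linear influence on cash demand at that location.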
We have prepared models that can predict cash demand for each day of the year for each ATM and bank branch location. These models take
into account results from historical datasets as well as weather forecast data and schedules of events. We have utilized algorithms such as Restricted
Boltzmann Machine, Perceptron and Gaussian Discriminant Analysis.

Minimize Use of Expensive Channels
We can minimize the use of expensive bank channels such as over-the-counter operations and other visits to bank branches as well as calls
to call centres.
This can be achieved by optimizing online banking and mobile banking applications, help pages and wizards as well as
web pages on the bank's websites. Another way to encourage reluctant clients to switch to cheaper channels is through targeted campaigns.
Our primary sources of data for analysis were web log files from the online banking application as well as mobile banking applications. We have
also used bank account movements with codes of bank channels, a dataset of call centre transactions, a CRM dataset with information about
customers and a dataset of transactions from bank branches.
An important dataset was complaints and enquiries from the call centre, emails, letters and branches. We have sorted this dataset by areas of
interest and correlated it with help web pages. We were able to identify help pages that were unclear and caused confusion and
unnecessary calls to the call centre. We have also identified several operations in online banking that were complex and generated a higher
number of complaints. We have uncovered several areas related to exchange rates during credit card payments that were not covered by
help pages but were often discussed over the phone or even during bank branch visits. Changes made to product-related web pages, self-help
pages, search optimization, online banking operations and mobile banking applications can bring quick savings on outsourced call centres
and bank branch visits.
We have analysed results from marketing campaigns to move reluctant clients to online and mobile banking or self-service kiosks. We have
used correlation analysis and we have seen that broad marketing campaigns were not efficient. We have analysed patterns of bank clients
who recently moved most of their operations online. This gave us a tool to select the portion of clients who are more likely to move online. These
customers should be targeted by personalized marketing campaigns or by demonstrations of the advantages at bank branches.

Assessment of Clients for Debt Products
In order to reliably assess risks and approve debt products for existing clients we need to take into account not just the current credit scores and
current disposable income of the clients but also the complete history of the client as well as the social context. This decreases risk for the bank as
well as increasing income from valuable clients who would otherwise be rejected.
As a primary source of data we have used the common dataset of income and expenses, the complete history of payment morale for credit cards,
consumer loans, mortgages, overdrafts and other debt products, and CRM information about clients.
We have used a Markov chain stochastic process to assess the debt and payment morale related behaviour of clients. This model was tested on
historical data of profitable and defaulted loans, credit cards and other debt products. We have noticed improved reliability of credit scores
and we were able to suggest suitable alternative debt products for rejected clients.
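
To illustrate the Markov chain idea, the Python sketch below estimates transition probabilities between monthly payment states from client histories. The state names (`on_time`, `late`, `default`) and the first-order chain are assumptions made for this example, not a description of the production model.

```python
from collections import Counter, defaultdict


def transition_matrix(histories):
    """Estimate P(next_state | state) from sequences of monthly payment states."""
    counts = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for state, nexts in counts.items()
    }


def path_probability(matrix, path):
    """Probability of following the given state path, conditional on its first state."""
    p = 1.0
    for current, following in zip(path, path[1:]):
        p *= matrix.get(current, {}).get(following, 0.0)
    return p
```

With such a matrix, the estimated probability that a client who currently pays on time will go through a late payment into default can be read off as a product of transition probabilities.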

Overview of Primary Datasets and Sizing Example
These are examples of primary datasets and sizing calculations. Each project is specific and not all datasets are available, but data
sizing calculations are likely to be similar.
Account movements for all active and former clients. This dataset includes the complete history of account movements for all current and
savings accounts. It contains 6 million unique clients and 23 million active and closed accounts. The average size of movements
per account is 1MB, which gives us 23TB of uncompressed denormalized CSV files.

Dataset of debit and credit card movements. This dataset contains 25 million unique card Ids. We have on average 3 thousand transactions per
single card number. The total number of records is 75 billion. Each record in uncompressed CSV form has 1kB. The total size of this dataset is
75TB.
Technical log files from internet and mobile banking applications have 50TB. These files include front-end Apache log files as well
as application logs.
Bank transactions, requests for help and complaints from the call centre. This dataset contains bank transactions, requests for help
and complaints from 1 million unique customers. The average number of interactions per customer is 35. The typical size of an interaction is
10kB. The total size of the dataset is 350GB.
CRM information about clients with historical values. This includes personal information about customers such as employment, education,
age and family status. The dataset includes current and historical information for about 6 million clients with a typical size of 100kB per client. The total
size is 600GB.
Direct debits and standing orders of bank clients with historical values. The typical number of standing orders and direct debits
per client with historical values is 50. The size of a single record is 1kB. The total size of the dataset for 6 million clients and 50 records per client is
300GB.
Product subscriptions data for all clients with complete history. The typical number of current and historical subscriptions per single
client is 12. This includes accounts, mortgages, loans, credit cards and other bank products. 6 million clients multiplied by an
average of 12 subscriptions per client and by 1kB per subscription gives 72GB.
Customer data from branch visits. This dataset includes over-the-counter bank transactions, help requests, product subscriptions and
cancellations, and complaints. The typical number of interactions per client is 10. We do have large differences in utilization of branch services
among clients. 3 million clients, 10 interactions per client and 10kB per interaction mean 300GB.
Dataset of debtors and dataset of failed applications for debt products. The total size of 1 million records in these datasets is 1GB.
Help files usage from mobile and internet banking. 6 million users multiplied by an average of 1000 clicks to help files
and by an average record size of 1kB gives 6TB.
The total size of all primary datasets is approximately 156TB. The result is calculated as a simple sum: 75TB + 50TB + 23TB + 6TB + 600GB
+ 350GB + 300GB + 300GB + 72GB + 1GB = 155.623TB, or roughly 156TB. We can reduce the overall size by using compression and by removing technical fields
that do not carry any business meaning from the datasets. Log files are also reduced by removing lines with no business meaning.
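
The sizing arithmetic above can be reproduced with a few lines of Python. The sketch assumes decimal units (1kB = 1000 bytes), which is what the figures in the text imply, and the dictionary keys are shorthand labels introduced for this example.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units, as implied by the text

DATASET_BYTES = {
    "account movements": 23_000_000 * 1 * MB,                    # accounts x 1MB
    "card movements": 25_000_000 * 3_000 * 1 * KB,               # cards x transactions x 1kB
    "web and mobile logs": 50 * TB,
    "help file usage": 6_000_000 * 1_000 * 1 * KB,               # users x clicks x 1kB
    "CRM": 6_000_000 * 100 * KB,                                 # clients x 100kB
    "call centre": 1_000_000 * 35 * 10 * KB,                     # customers x interactions x 10kB
    "direct debits and standing orders": 6_000_000 * 50 * 1 * KB,
    "branch visits": 3_000_000 * 10 * 10 * KB,                   # clients x interactions x 10kB
    "product subscriptions": 6_000_000 * 12 * 1 * KB,
    "debtors and failed applications": 1 * GB,
}

total_bytes = sum(DATASET_BYTES.values())
total_tb = total_bytes / TB
print(f"{total_tb:.3f} TB")  # prints "155.623 TB", i.e. roughly 156TB
```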

Implementation Steps
Isolation of Sensitive Data from Big Data Analytics
In order to isolate Big Data analytics from sensitive data we remove clients' names, addresses, telephone numbers and emails during data
export processes.
The next step is to create a process that replaces real credit and debit card numbers, account numbers and customer Ids with randomly
generated numbers. These randomly generated numbers must be identical for the same entity across different datasets to enable analytics.
This process stores pairs of matching real numbers and randomly generated numbers in tables. These tables are stored in a separate secure
relational database that is continuously updated. This database is also used to match randomly generated numbers with real numbers after
Big Data analyses are performed. This isolates data scientists and administrators from sensitive information, which is
accessible only to authorized bank employees.
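
A minimal Python sketch of this replacement process follows. In-memory dictionaries stand in for the separate secure relational database, and the class name, field names and 16-digit token format are assumptions made for this example.

```python
import secrets

SENSITIVE_FIELDS = ("name", "address", "telephone", "email")


class Pseudonymizer:
    """Maps real identifiers to stable random numbers, separately per entity type."""

    def __init__(self):
        # Stand-ins for the lookup tables kept in the secure database.
        self._forward = {}
        self._reverse = {}

    def token_for(self, entity_type, real_id):
        """Return the same random number every time for the same real identifier."""
        key = (entity_type, real_id)
        if key not in self._forward:
            token = f"{secrets.randbelow(10**16):016d}"
            while (entity_type, token) in self._reverse:  # avoid rare collisions
                token = f"{secrets.randbelow(10**16):016d}"
            self._forward[key] = token
            self._reverse[(entity_type, token)] = real_id
        return self._forward[key]

    def real_id_for(self, entity_type, token):
        """Reverse lookup, meant only for authorized bank employees."""
        return self._reverse[(entity_type, token)]


def export_record(record, pseudonymizer):
    """Drop directly identifying fields and replace the customer Id."""
    cleaned = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    cleaned["customer_id"] = pseudonymizer.token_for("customer", record["customer_id"])
    return cleaned
```

Because the mapping is stored rather than recomputed, the same customer receives the same replacement number in every exported dataset, which keeps joins across datasets possible.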

Extraction, Transformation and Loading of Primary Datasets
We have an initial ETL (Extraction, Transformation and Loading) of data as well as continuous processes for daily or hourly updates and imports
of recent data from the bank's production systems.
The initial extraction was performed by the bank's production and backup systems. Data was extracted in denormalized text form in CSV or
fixed-length field formats. This form is ideal for bulk uploads into Big Data systems. The denormalized form uses concrete values instead of
reference Ids as in relational databases.
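
As a small illustration of denormalization, the Python sketch below replaces reference Ids in movement records with the concrete values they point to and writes the result as CSV. The table layouts and column names are assumptions made for this example.

```python
import csv
import io


def denormalize(movements, merchants, type_codes):
    """Replace reference Ids with the concrete values they point to."""
    rows = []
    for m in movements:
        rows.append({
            "account": m["account"],
            "amount": m["amount"],
            "merchant_name": merchants[m["merchant_id"]],
            "movement_type": type_codes[m["type_id"]],
        })
    return rows


def to_csv(rows):
    """Serialize a non-empty list of denormalized rows as CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Each output line is self-contained, so it can be processed in bulk without joining lookup tables.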
Continuous data exports are channeled via JMS, MQ Series, CSV files and via Sqoop. Exported data are picked up by Big Data scripts such as
Pig or Hive. These scripts are triggered via Oozie processes.

Transformation of Input Data
Transformation rules and scripts are shared by the initial and continuous ETL processes. We have used Pig and Hive scripts and UDFs
(User Defined Functions) written in Java to perform transformation steps. Oozie workflows were used to chain transformation steps.
We have used several practical rules for data transformations:
Each file format is kept in its own directory inside HDFS (Hadoop Distributed File System).
Unprocessed and failed records are written into specific directories for manual investigation.
Intermediate result files are deleted only after all transformation steps have been performed successfully. This saves HDFS space while still making it
possible to investigate and re-run incomplete transformations.
Pig and Hive scripts are kept simple and single purpose. This enables easy debugging and re-use.
Java UDFs are used only if a given function is not available in the standard library or in the PiggyBank library.
Transformation scripts are reused for processing updates.
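
The rule about failed records can be illustrated with a short Python sketch. The transformations described here were written in Pig, Hive and Java, so Python is only a stand-in. The three-field record layout and the function name are assumptions made for this example.

```python
EXPECTED_FIELDS = 3  # hypothetical layout: account, date, amount


def transform_records(lines):
    """Split raw CSV lines into parsed records and rejects kept for manual review."""
    parsed, failed = [], []
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(",")
        try:
            if len(fields) != EXPECTED_FIELDS:
                raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(fields)}")
            account, date, amount = fields
            parsed.append({"account": account, "date": date, "amount": float(amount)})
        except ValueError as error:
            # Keep the raw line and the reason so the record can be investigated later.
            failed.append({"line_number": line_number, "raw": line, "reason": str(error)})
    return parsed, failed
```

In a Hadoop setup the two result lists would correspond to separate output directories, one for clean records and one for records awaiting manual investigation.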

Pow ered by Drupal

http://www.syoncloud.com/Syoncloud_Big_Data_for_Retail_Banking

5/5

More Related Content

What's hot

FinQLOUD platform for digital banking
FinQLOUD platform for digital bankingFinQLOUD platform for digital banking
FinQLOUD platform for digital bankingMaxim Orlovsky
 
World of digital banking v 2.0
World of digital banking v 2.0World of digital banking v 2.0
World of digital banking v 2.0Muthu Siva
 
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...Capgemini
 
Digital Banking vs. Branch Banking (Ashish Kumar)
Digital Banking vs. Branch Banking (Ashish Kumar)Digital Banking vs. Branch Banking (Ashish Kumar)
Digital Banking vs. Branch Banking (Ashish Kumar)2K13A19
 
How CGI is accelerating banks' digital transformation programs
How CGI is accelerating banks' digital transformation programsHow CGI is accelerating banks' digital transformation programs
How CGI is accelerating banks' digital transformation programsCGI Suomi
 
New trends in indian banking system
New trends in indian banking systemNew trends in indian banking system
New trends in indian banking systemRoy Thomas
 
Transformation and reconstruction of banks in the digital era
Transformation and reconstruction of banks in the digital eraTransformation and reconstruction of banks in the digital era
Transformation and reconstruction of banks in the digital eraAntonio Mazzone
 
Banking in the Digital Era - Microsoft India Perspective
Banking in the Digital Era - Microsoft India PerspectiveBanking in the Digital Era - Microsoft India Perspective
Banking in the Digital Era - Microsoft India PerspectiveMicrosoft India
 
Retail Banking Trends
Retail Banking TrendsRetail Banking Trends
Retail Banking Trendsguest14fb65
 
Innovations on Banking - Digital Banking Security in the Age of Open Banking
Innovations on Banking - Digital Banking Security in the Age of Open BankingInnovations on Banking - Digital Banking Security in the Age of Open Banking
Innovations on Banking - Digital Banking Security in the Age of Open BankingPetr Dvorak
 
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18Vladimir Ljubibratic
 
Selling Strategies Of Retail Banking
Selling Strategies Of Retail BankingSelling Strategies Of Retail Banking
Selling Strategies Of Retail BankingSuresh Singh
 
Retail banking challenges
Retail banking challenges Retail banking challenges
Retail banking challenges SCG International
 
Opportunities and challenges in Digital Banking pub
Opportunities and challenges in Digital Banking  pubOpportunities and challenges in Digital Banking  pub
Opportunities and challenges in Digital Banking pubYair Jacob Porat
 
Digital Disruption Nordic Retail Banking_10june_digital
Digital Disruption Nordic Retail Banking_10june_digitalDigital Disruption Nordic Retail Banking_10june_digital
Digital Disruption Nordic Retail Banking_10june_digitalIlkka Ruotsila
 
Bank of the future - Scenarios for online banking for millennials
Bank of the future - Scenarios for online banking for millennialsBank of the future - Scenarios for online banking for millennials
Bank of the future - Scenarios for online banking for millennialsEnhancers
 
Accenture banking 2016
Accenture banking 2016Accenture banking 2016
Accenture banking 2016ruttens.com
 
Retail banking
Retail banking Retail banking
Retail banking Syed Imbesat
 
Akanksha chawla XIMB
Akanksha chawla XIMBAkanksha chawla XIMB
Akanksha chawla XIMBING Vysya Bank
 

What's hot (20)

FinQLOUD platform for digital banking
FinQLOUD platform for digital bankingFinQLOUD platform for digital banking
FinQLOUD platform for digital banking
 
World of digital banking v 2.0
World of digital banking v 2.0World of digital banking v 2.0
World of digital banking v 2.0
 
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...
Experience-Led Digital Banking: Getting Customers to Buy with Low Cost Digita...
 
Digital Banking vs. Branch Banking (Ashish Kumar)
Digital Banking vs. Branch Banking (Ashish Kumar)Digital Banking vs. Branch Banking (Ashish Kumar)
Digital Banking vs. Branch Banking (Ashish Kumar)
 
How CGI is accelerating banks' digital transformation programs
How CGI is accelerating banks' digital transformation programsHow CGI is accelerating banks' digital transformation programs
How CGI is accelerating banks' digital transformation programs
 
New trends in indian banking system
New trends in indian banking systemNew trends in indian banking system
New trends in indian banking system
 
Transformation and reconstruction of banks in the digital era
Transformation and reconstruction of banks in the digital eraTransformation and reconstruction of banks in the digital era
Transformation and reconstruction of banks in the digital era
 
Banking in the Digital Era - Microsoft India Perspective
Banking in the Digital Era - Microsoft India PerspectiveBanking in the Digital Era - Microsoft India Perspective
Banking in the Digital Era - Microsoft India Perspective
 
Retail Banking Trends
Retail Banking TrendsRetail Banking Trends
Retail Banking Trends
 
Innovations on Banking - Digital Banking Security in the Age of Open Banking
Innovations on Banking - Digital Banking Security in the Age of Open BankingInnovations on Banking - Digital Banking Security in the Age of Open Banking
Innovations on Banking - Digital Banking Security in the Age of Open Banking
 
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18
Addiko Bank Digital Transformation Experience - Microsoft Sinergija 18
 
Retail banking v1.1
Retail banking v1.1Retail banking v1.1
Retail banking v1.1
 
Selling Strategies Of Retail Banking
Selling Strategies Of Retail BankingSelling Strategies Of Retail Banking
Selling Strategies Of Retail Banking
 
Retail banking challenges
Retail banking challenges Retail banking challenges
Retail banking challenges
 
Opportunities and challenges in Digital Banking pub
Opportunities and challenges in Digital Banking  pubOpportunities and challenges in Digital Banking  pub
Opportunities and challenges in Digital Banking pub
 
Digital Disruption Nordic Retail Banking_10june_digital
Digital Disruption Nordic Retail Banking_10june_digitalDigital Disruption Nordic Retail Banking_10june_digital
Digital Disruption Nordic Retail Banking_10june_digital
 
Bank of the future - Scenarios for online banking for millennials
Bank of the future - Scenarios for online banking for millennialsBank of the future - Scenarios for online banking for millennials
Bank of the future - Scenarios for online banking for millennials
 
Accenture banking 2016
Accenture banking 2016Accenture banking 2016
Accenture banking 2016
 
Retail banking
Retail banking Retail banking
Retail banking
 
Akanksha chawla XIMB
Akanksha chawla XIMBAkanksha chawla XIMB
Akanksha chawla XIMB
 

Viewers also liked

Retail banking
Retail bankingRetail banking
Retail bankingDharmik
 
Zeng Ming of Alibaba: "Making Sense of Big Data"
Zeng Ming of Alibaba: "Making Sense of Big Data"Zeng Ming of Alibaba: "Making Sense of Big Data"
Zeng Ming of Alibaba: "Making Sense of Big Data"sprie-stanford
 
Top 10 head of retail banking interview questions and answers
Top 10 head of retail banking interview questions and answersTop 10 head of retail banking interview questions and answers
Top 10 head of retail banking interview questions and answersbustteent
 
Enterprise Integration of Disruptive Technologies
Enterprise Integration of Disruptive TechnologiesEnterprise Integration of Disruptive Technologies
Enterprise Integration of Disruptive TechnologiesDataWorks Summit
 
IES Faculty - Intelligent Big Data: Opportunities for Real Estate Asset Manag...
IES Faculty - Intelligent Big Data: Opportunities for Real Estate Asset Manag...IES Faculty - Intelligent Big Data: Opportunities for Real Estate Asset Manag...
IES Faculty - Intelligent Big Data: Opportunities for Real Estate Asset Manag...IES VE
 
Heteroscedasticity
HeteroscedasticityHeteroscedasticity
HeteroscedasticityGeethu Rangan
 
Big-Data in Health Care: Patient data analyses has great potential and risks
Big-Data in Health Care: Patient data analyses has great potential and risksBig-Data in Health Care: Patient data analyses has great potential and risks
Big-Data in Health Care: Patient data analyses has great potential and risksDr. Jonathan Mall
 
Big data in health care
Big data in health careBig data in health care
Big data in health careyogita gaikwad
 
HouseCanary - PCBC Presentation
HouseCanary - PCBC PresentationHouseCanary - PCBC Presentation
HouseCanary - PCBC PresentationHouseCanary
 
Types of Banks in India
Types of Banks in IndiaTypes of Banks in India
Types of Banks in IndiaAMRIT8721073830
 
Big data in real estate
Big data in real estateBig data in real estate
Big data in real estatecutmytaxes
 
SPARK16 Presentation: Connecting Facilities Performance Data with Your Real E...
SPARK16 Presentation: Connecting Facilities Performance Data with Your Real E...SPARK16 Presentation: Connecting Facilities Performance Data with Your Real E...
SPARK16 Presentation: Connecting Facilities Performance Data with Your Real E...Urjanet
 
Caiib retail banking sample questions by murugan
Caiib retail banking sample questions by muruganCaiib retail banking sample questions by murugan
Caiib retail banking sample questions by muruganVinayak Kamath
 
Autocorrelation
AutocorrelationAutocorrelation
AutocorrelationMuhammad Ali
 
Big Data in Financial Services: How to Improve Performance with Data-Driven D...



Syoncloud big data for retail banking, Syoncloud

Syoncloud Big Data for Retail Banking | Syoncloud 14/10/2013

Syoncloud Big Data for Retail Banking
Syoncloud offers a comprehensive Big Data / Data Science solution for retail banks.
We cover areas such as:
Individualization of product offers to existing clients
Early fraud detection and fraud damage mitigation
Prediction of product cancellations and client defections
Optimal allocation of cash to ATMs and bank branches
Minimization of the use of expensive bank channels such as branch visits
Reliable assessment of clients for debt products

Common Datasets
Common Datasets are used as a foundation for complex analysis.

Creation of Common Datasets for Analysis Related to Bank's Clients
We create a dataset of monthly expense and income categories for all clients, all their accounts and their complete history. This dataset is created from bank account movements, direct debits and standing orders. Each account movement is usually accompanied by a movement type code such as electricity, phone bill or restaurant. We also use the merchant's name, description and comment fields to categorize each transaction. Direct debits and standing orders are also accompanied by type codes.

We recognize several categories of expenses such as housing expenses (rent or mortgage), energy expenses (gas and electricity), food and household related expenses, education (schools, books, courses), car expenses (fuel and repairs), restaurants, big ticket items (TV, furniture), taxes, recreation and hobby, credit card and loan payments, luxury items and so on.

Income categories are salaries, dividends, tax refunds, social benefits, rental income, sales and so on. Simple regression analysis of this dataset gives us overall trends for total expenses, incomes and savings as well as detailed trends for each category of income and expense for each client.
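
The aggregation and trend step described above could be sketched as follows. This is a minimal illustration, not Syoncloud's implementation: the type codes, the `CATEGORY_BY_CODE` mapping, the tuple layout of a movement and both function names are assumptions made for the example, and the "simple regression" is taken here to mean an ordinary least-squares line through the monthly totals.

```python
from collections import defaultdict

# Hypothetical mapping from movement type codes to categories.
CATEGORY_BY_CODE = {
    "ELEC": "energy",
    "GAS": "energy",
    "RENT": "housing",
    "REST": "restaurants",
    "SAL": "salary",
}


def monthly_totals(movements):
    """Sum amounts per (client, category, month).

    Each movement is a tuple (client_id, 'YYYY-MM', type_code, amount).
    Unknown type codes fall into the 'other' category.
    """
    totals = defaultdict(float)
    for client_id, month, code, amount in movements:
        category = CATEGORY_BY_CODE.get(code, "other")
        totals[(client_id, category, month)] += amount
    return dict(totals)


def trend_slope(values):
    """Least-squares slope of the values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


movements = [
    ("c1", "2013-01", "ELEC", 80.0),
    ("c1", "2013-01", "GAS", 40.0),
    ("c1", "2013-02", "ELEC", 130.0),
    ("c1", "2013-03", "ELEC", 140.0),
    ("c1", "2013-01", "SAL", 2000.0),
]
totals = monthly_totals(movements)
months = ["2013-01", "2013-02", "2013-03"]
energy = [totals.get(("c1", "energy", m), 0.0) for m in months]
print(energy, trend_slope(energy))
```

A positive slope indicates that the client's spending in that category is rising month over month; the same function applies to total expenses, incomes or savings.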

Machine Learning and Predictions
We use a full range of machine learning algorithms and models to make predictions. There are two broad categories: supervised and unsupervised algorithms.

Supervised learning algorithms use historical data to learn that certain combinations and values of inputs cause certain outputs. We create models that are trained and verified on samples of historical data. Sample data can be chosen randomly, but we have seen better results if we categorize our datasets first. In the case of the customer dataset we create categories such as age, income, location based on town size, education and savings. Each category is split into brackets. For example, the age category is split into 20 five-year age brackets. We know how many customers are in each age bracket, so we can sample a certain percentage of records from each age bracket. We sample the other categories in the same way. These samples are ideal for seeing which category makes the largest contribution to the overall results. For example, we can see that education makes the largest contribution to the acceptance of a certain investment product.

Unsupervised machine learning algorithms look for unknown patterns in the available data. For example, we find patterns of unusual client behaviour to detect early signs of fraud. In the past we were limited to statistical analysis of behaviour that was common to all clients or to large groups of clients. With unsupervised learning models we can find patterns that surface only in a small number of records.

Individualization of Product Offers
Individualization of product offers to existing clients lets banks save money on expensive broad marketing campaigns for bank products. Products are offered only to customers who need them and are likely to accept them. Customers should see fewer irrelevant offers. This requires deep knowledge of who accepted given products in the past.

http://www.syoncloud.com/Syoncloud_Big_Data_for_Retail_Banking 1/5
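
The bracket-based sampling described under Machine Learning and Predictions could look roughly like this. It is a sketch under stated assumptions: the customer record layout, the function names and the rule of taking at least one record per non-empty bracket are illustrative choices, not details given in the text.

```python
import random
from collections import Counter, defaultdict


def age_bracket(age, width=5):
    """Index of the five-year bracket an age falls into (0-4 -> 0, 5-9 -> 1, ...)."""
    return age // width


def stratified_sample(customers, fraction, seed=0):
    """Draw the same fraction of records from every age bracket.

    customers is a list of dicts with an 'age' key; fraction is in (0, 1].
    At least one record is taken from each non-empty bracket (an assumption).
    """
    rng = random.Random(seed)
    by_bracket = defaultdict(list)
    for customer in customers:
        by_bracket[age_bracket(customer["age"])].append(customer)
    sample = []
    for bracket in sorted(by_bracket):
        members = by_bracket[bracket]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# 400 synthetic customers aged 20 to 59, i.e. 8 brackets of 50 customers each.
customers = [{"id": i, "age": 20 + i % 40} for i in range(400)]
sample = stratified_sample(customers, fraction=0.1)
per_bracket = Counter(age_bracket(c["age"]) for c in sample)
print(len(sample), dict(per_bracket))
```

The same function can be reused for income, location, education or savings by swapping the bracket function.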

As input for our models we use a dataset of subscriptions to bank products and services for each client. This dataset includes previous subscriptions and cancellation dates. We also use the common dataset of income and expense categories for each client and CRM data about clients.

We have created separate models for each product and subscription. In order to prepare suitable models we have not only to choose and verify the best learning algorithm but also to find which categories and variables have the biggest influence.

Early fraud detection and fraud damage mitigation
This includes detection of identity fraud, credit card fraud, wire fraud, attacks on internet and mobile banking, and money laundering. New types of fraud and new schemes require flexible and fast detection algorithms.

In the past, banks used only statistical and rule-based algorithms to find out whether suspicious activity was taking place on a customer's account. These algorithms were limited because they could only recognize known frauds, required expensive maintenance, did not work with the full history of each client and had a high level of false positives.

We utilized a dataset of known fraud cases. We have created several categories of these frauds such as overdraft fraud with stolen identity, stolen credit card, consumer loan fraud, credit card top-up with a fraudulent check, stolen checks, skimming with card duplication, attacks on online banking with stolen customer credentials and/or security devices, rogue online merchant frauds using credit cards and so on.

We use neural networks with backpropagation, decision tree algorithms and classification to find patterns and unknown occurrences of these frauds in our existing data.

Prediction of Product Cancellations and Client Defections
Prediction of bank product cancellations and client defections is very time sensitive.
The bank has just days to act before a client irreversibly decides to cancel a product or move to the competition. The bank needs to identify clients who are likely to defect, contact them and proactively offer alternative products or solve the client's issues. It is much cheaper to retain highly profitable clients than to attract them back.

We have used account movements, debit and credit card movements, the client dataset from CRM, the product subscription dataset, call centre and branch visit transactions, and log information as primary data sources for our analysis. We have also utilized the common datasets of incomes and expenses.

We have prepared time series of key events such as direct debit cancellations, income to the account from salaries, dividends and rents, transfers to the client's accounts at different banks, call centre and branch contacts made by the client separated into categories, cancellations of credit cards and so on.

We have prepared another set of clients who match on categories such as age, income, savings and location for the same time interval but who still remain clients. We have prepared matching time series for these clients as well.

Based on this data we were able to create models that predict the behaviour of clients before they irreversibly decide to move to competitors. We have used several supervised learning algorithms such as Support Vector Machines for binary classification and a Neural Network with Backpropagation for predictions. From the unsupervised machine learning algorithms we have utilized K-Means and Mean Shift Clustering after Principal Component Analysis was applied to reduce the dimensions of the input data.

We have identified several hundred profitable clients in recent data who match the patterns of clients who moved their accounts to competitors. These clients should be contacted by their respective bank branches.

Optimal Allocation of Cash for ATMs and Bank Branches
Demand for cash is highly variable during the year at many ATM and bank branch locations.
The variability is caused by weather, local events, vacations, tourism and so on. It is important to predict the right amount of cash that needs to be deposited into ATMs as well as bank branches. It is costly to service ATMs too often, and it is also costly to have cash machines out of order due to lack of cash. At the same time we want to limit the amount of unnecessary cash that is stored for long periods in ATMs and bank branches. Excess cash leads to suboptimal cash allocation and attracts crime.

As the primary datasets we have used ATM service logs, geographic locations of ATMs and bank branches, a withdrawals dataset for each ATM, weather reports for ATM and bank branch locations, and schedules of sports, cultural and other events as well as holidays for all locations.

We have utilized credit and debit card movements to assess demand for cash at various locations and at different times of the year. We have used the common datasets of incomes to see when salaries, social benefits and other incomes arrive in clients' accounts at different locations.

We have created a dataset of median amounts of cash withdrawals for each day of the year and hour of the day for all ATMs. This dataset is used to calculate the influence of weather, events, day of the week or holidays on demand for cash at a given location.
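
A baseline of median withdrawals per ATM, day of the year and hour, as described above, could be built along these lines. This is a minimal sketch: the record layout (ATM id, ISO timestamp, amount withdrawn in that hour) and the function name are assumptions, and the sample figures are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_withdrawals(records):
    """Median withdrawn amount per (atm_id, day_of_year, hour).

    records: iterable of (atm_id, ISO timestamp, amount withdrawn in that hour).
    Several years of history give several values per key. Note that the
    day-of-year index shifts by one after 29 February in leap years.
    """
    buckets = defaultdict(list)
    for atm_id, timestamp, amount in records:
        ts = datetime.fromisoformat(timestamp)
        key = (atm_id, ts.timetuple().tm_yday, ts.hour)
        buckets[key].append(amount)
    return {key: median(values) for key, values in buckets.items()}


# Invented sample: the same ATM, date and hour in three consecutive years.
records = [
    ("atm-1", "2009-03-01T12:00:00", 900.0),
    ("atm-1", "2010-03-01T12:00:00", 1500.0),
    ("atm-1", "2011-03-01T12:00:00", 1000.0),
    ("atm-1", "2011-03-01T13:00:00", 400.0),
]
baseline = median_withdrawals(records)
print(baseline[("atm-1", 60, 12)])
```

Using the median rather than the mean keeps a single unusual day, such as a one-off local event, from distorting the baseline against which weather, events and holidays are later compared.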

We have prepared a dataset of significant cultural, sport and other events during the past 4 years with location coordinates. We have calculated the influence of each event on cash demand for all ATMs within a 300m radius of the given event. We were able to sort all events by their influence on cash demand. This dataset is used to predict the influence of similar events.

We have also calculated the correlation between cash demand and local weather parameters such as precipitation, temperature and wind at the location of each ATM.

We have created a correlation dataset between the days when clients receive incomes, such as salaries and social benefits, and cash demand at different locations.

We have prepared models that can predict cash demand for each day of the year for each ATM and bank branch location. These models take into account results from the historical datasets as well as weather forecast data and schedules of events. We have utilized algorithms such as Restricted Boltzmann Machine, Perceptron and Gaussian Discriminant Analysis.

Minimize Use of Expensive Channels
We can minimize the use of expensive bank channels such as over-the-counter operations and other visits to bank branches as well as calls to call centres. This can be achieved by optimizing online banking and mobile banking applications, help pages and wizards, as well as the web pages on the bank's websites. Another way to encourage reluctant clients to switch to cheaper channels is through targeted campaigns.

Our primary sources of data for analysis were web log files from the online banking application as well as from mobile banking applications. We have also used bank account movements with codes of bank channels, the dataset of call centre transactions, the CRM dataset with information about customers and the dataset of transactions from bank branches.

An important dataset was the complaints and enquiries from the call centre, emails, letters and branches.
We have sorted this dataset by areas of interest and correlated it with the help web pages. We were able to identify help pages that were unclear and caused confusion and unnecessary calls to the call centre. We have also identified several operations in online banking that were complex and generated a higher number of complaints. We have uncovered several areas related to exchange rates during credit card payments that were not covered by help pages but were often discussed over the phone or even during bank branch visits.

Changes made to product-related web pages, self-help pages, search optimization, online banking operations and mobile banking applications can bring quick savings on outsourced call centres and bank branch visits.

We have analysed results from marketing campaigns to move reluctant clients to online and mobile banking or self-service kiosks. We have used correlation analysis and we have seen that broad marketing campaigns were not efficient.

We have analysed patterns of bank clients who recently moved most of their operations online. This gave us a tool to select the portion of clients who are more likely to move online. These customers should be targeted by personalized marketing campaigns or by demonstrations of the advantages at bank branches.

Assessment of Clients for Debt Products
In order to reliably assess risks and approve debt products for existing clients we need to take into account not just the current credit scores and current disposable income of the clients but also the complete history of the client as well as the social context. This decreases risk for the bank and increases income from valuable clients who would otherwise be rejected.

As a primary source of data we have used the common dataset of incomes and expenses, the complete history of payment morale for credit cards, consumer loans, mortgages, overdrafts and other debt products, and CRM information about clients.
We have used a Markov chain stochastic process to assess the debt and payment morale related behaviour of clients. This model was tested on historical data of profitable and defaulted loans, credit cards and other debt products. We have noticed improved reliability of credit scores and we were able to suggest suitable alternative debt products for rejected clients.

Overview of Primary Datasets and Sizing Example
These are examples of primary datasets and sizing calculations. Each project is specific and not all datasets are available, but data sizing calculations are likely to be similar.

Account movements for all active and former clients. This dataset includes the complete history of account movements for all current and savings accounts. It contains 6 million unique clients and 23 million active and closed accounts. The average size of movements per account is 1MB, which gives us 23TB of uncompressed de-normalized CSV files.
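
The sizing arithmetic used throughout this overview is a product of entity count, records per entity and average record size. The sketch below applies it to the account movements figures above; the helper name and the use of decimal units (1MB = 10^6 bytes, 1TB = 10^12 bytes) are assumptions chosen to match the round numbers in the text.

```python
# Decimal units, assumed to match the round figures quoted in the text.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def dataset_size(entities, records_per_entity, bytes_per_record):
    """Uncompressed size in bytes: entities x records x average record size."""
    return entities * records_per_entity * bytes_per_record


# Account movements: 23 million accounts with about 1MB of movements each.
account_movements = dataset_size(23_000_000, 1, 1 * MB)
print(account_movements / TB, "TB")
```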
Debit and credit card movements. This dataset contains 25 million unique card IDs. We have on average 3 thousand transactions per card number. The total number of records is 75 billion. Each record in uncompressed CSV form is 1kB. The total size of this dataset is 75TB.
Technical log files from internet and mobile banking applications. These files total 50TB and include front-end Apache log files as well as application logs.
Bank transactions, requests for help and complaints from the call centre. This dataset contains bank transactions, requests for help and complaints from 1 million unique customers. The average number of interactions per customer is 35. The typical size of an interaction is 10kB. The total size of the dataset is 350GB.
CRM information about clients with historical values. This includes personal information about customers such as employment, education, age and family status. The dataset includes current and historical information for about 6 million clients with a typical size of 100kB per client. The total size is 600GB.
Direct debits and standing orders of bank clients with historical values. The typical number of standing orders and direct debits per client, including historical values, is 50. The size of a single record is 1kB. The total size of the dataset for 6 million clients and 50 records per client is 300GB.
Product subscription data for all clients with complete history. The typical number of current and historical subscriptions per client is 12. This includes accounts, mortgages, loans, credit cards and other bank products. 6 million clients multiplied by an average of 12 subscriptions per client and by 1kB per subscription gives 72GB.
Customer data from branch visits. This dataset includes over-the-counter bank transactions, help requests, product subscriptions and cancellations, and complaints. The typical number of interactions per client is 10.
We see large differences in the utilization of branch services among clients. 3 million clients with 10 interactions each and 10kB per interaction gives 300GB.
Dataset of debtors and dataset of failed applications for debt products. The total size of 1 million records in these datasets is 1GB.
Help file usage from mobile and internet banking. 6 million users multiplied by an average of 1000 clicks on help files and by an average record size of 1kB gives 6TB.
The total size of all primary datasets is approximately 156TB. The result is calculated as a simple sum: 75TB + 50TB + 23TB + 6TB + 600GB + 350GB + 300GB + 300GB + 72GB + 1GB = approximately 156TB.
We can reduce the overall size by using compression and by removing technical fields that do not carry any business meaning from the datasets. Log files are also reduced by removing lines with no business meaning.

Implementation Steps

Isolation of Sensitive Data from Big Data Analytics
In order to isolate Big Data analytics from sensitive data, we remove clients' names, addresses, telephone numbers and emails during the data export processes. The next step is to create a process that replaces real credit and debit card numbers, account numbers and customer IDs with randomly generated numbers. These randomly generated numbers must be identical for the same entity across different datasets to enable analytics. The process stores pairs of matching real numbers and randomly generated numbers in tables. These tables are stored in a separate, secure relational database that is continuously updated. This database is also used to match randomly generated numbers back to real numbers after the Big Data analyses are performed. This isolates data scientists and administrators from sensitive information, which is accessible only to authorized bank employees.
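The replacement step described above can be sketched roughly as follows. This is a minimal illustration, not the actual export process: the class `PseudonymTable`, the function `export_record` and the field names are hypothetical, and an in-memory dictionary stands in for the separate secure relational database.

```python
import secrets

# Hypothetical field names; a real export would use the bank's own schema.
SENSITIVE_FIELDS = {"name", "address", "phone", "email"}
ID_FIELDS = {"customer_id", "account_number", "card_number"}


class PseudonymTable:
    """In-memory stand-in for the secure table of real/random number pairs."""

    def __init__(self):
        self._real_to_pseudo = {}
        self._pseudo_to_real = {}

    def pseudonym_for(self, kind, real_id):
        # Reuse an existing pseudonym so the same entity gets the same
        # random number in every exported dataset.
        key = (kind, real_id)
        if key in self._real_to_pseudo:
            return self._real_to_pseudo[key]
        pseudo = f"{secrets.randbelow(10**16):016d}"
        while pseudo in self._pseudo_to_real:  # retry on a rare collision
            pseudo = f"{secrets.randbelow(10**16):016d}"
        self._real_to_pseudo[key] = pseudo
        self._pseudo_to_real[pseudo] = key
        return pseudo

    def real_id_for(self, pseudo):
        # Reverse lookup, intended only for authorized bank employees.
        return self._pseudo_to_real[pseudo]


def export_record(record, table):
    """Drop sensitive fields and replace identifiers with pseudonyms."""
    exported = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS:
            continue
        if field in ID_FIELDS:
            exported[field] = table.pseudonym_for(field, value)
        else:
            exported[field] = value
    return exported
```

Because the lookup is keyed by the real identifier, a customer exported from the CRM dataset and from the account movements dataset receives the same random number, which keeps joins between datasets possible without exposing the real value.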

Extraction, Transformation and Loading of Primary Datasets
We have an initial ETL (Extraction, Transformation and Loading) of data and continuous processes of daily or hourly updates and imports of recent data from the production systems of the bank.
The initial extraction was performed by the bank's production and backup systems. Data was extracted in denormalized text form in CSV or fixed-length field formats. This form is ideal for bulk uploads into Big Data systems. The denormalized form uses concrete values instead of reference IDs as in relational databases.
Continuous data exports are channeled via JMS, MQ Series, CSV files and Sqoop. Exported data is picked up by Big Data scripts such as Pig or Hive. These scripts are triggered via Oozie processes.

Transformation of Input Data
Transformation rules and scripts are shared by the initial and continuous ETL processes. We used Pig and Hive scripts and UDFs (User Defined Functions) written in Java to perform the transformation steps. Oozie workflows were used to chain the transformation steps.
We used several practical rules for data transformations:
Various file formats are separated into their own directories inside HDFS (Hadoop file system).
Unprocessed and failed records are written into specific directories for manual investigation.
Intermediate result files are deleted only after all transformation steps have been successfully performed. This saves HDFS space while still making it possible to investigate and re-run incomplete transformations.
Pig and Hive scripts are kept simple and single purpose. This enables easy debugging and re-use.
Java UDFs are used only if a given function is not available in the standard library or in the PiggyBank library.
Transformation scripts are reused for processing updates.
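The rule about routing failed records into their own directory can be illustrated with a small sketch. It is written in plain Python rather than Pig or Hive, uses the local file system as a stand-in for HDFS, and assumes a three-column record layout (account ID, date, amount). The names `transform`, `run_step`, `processed` and `failed` are hypothetical.

```python
import csv
from pathlib import Path


def transform(fields):
    """Validate and normalize one record; raise ValueError if it is malformed."""
    if len(fields) != 3:
        raise ValueError("expected 3 fields")
    account_id, date, amount = fields
    # float() raises ValueError for a non-numeric amount.
    return [account_id.strip(), date.strip(), f"{float(amount):.2f}"]


def run_step(input_file, out_dir):
    """Run a single-purpose transformation step over one CSV file.

    Good records go to out_dir/processed, rejected records go to
    out_dir/failed together with the reason, for manual investigation.
    """
    processed_dir = out_dir / "processed"
    failed_dir = out_dir / "failed"
    processed_dir.mkdir(parents=True, exist_ok=True)
    failed_dir.mkdir(parents=True, exist_ok=True)

    ok = bad = 0
    with input_file.open(newline="") as src, \
            (processed_dir / input_file.name).open("w", newline="") as good, \
            (failed_dir / input_file.name).open("w", newline="") as rejects:
        good_writer = csv.writer(good)
        reject_writer = csv.writer(rejects)
        for fields in csv.reader(src):
            try:
                good_writer.writerow(transform(fields))
                ok += 1
            except ValueError as exc:
                # Keep the original fields and append the failure reason.
                reject_writer.writerow(fields + [str(exc)])
                bad += 1
    return ok, bad
```

Keeping the rejected records next to the reason for rejection means a step can be re-run on just the contents of the failed directory once the cause has been investigated.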