Serverless Architectures in Banking: OpenWhisk on IBM Bluemix at Santander

Serverless Architectures in Banking:
Apache OpenWhisk on IBM Bluemix at Santander
IBM InterConnect 2017 – March 21, 2017

1
About the speakers
Daniel Krook
Software Architect/Engineer & Developer Advocate at IBM
krook@us.ibm.com
Luis Enriquez
Head of Platform Engineering & Architecture at Santander Group
luis.enriquez@gruposantander.com

2
Agenda
1 Serverless architectures
2 Apache OpenWhisk on IBM Bluemix
3 Check processing overview and solution
4 Results, conclusions, future directions

3
Santander is one of the world’s largest banks

4
Goals and results of the OpenWhisk Proof of Concept
Goals & Principles
• Hybrid solution
• Greater deployment choices
• Avoid vendor lock-in
• Scalability and elasticity
• Respond to workload peaks
• Asynchronous and event-driven
• Developer-friendly solution
• Efficiency
Results
• Automated process, reducing time and avoiding errors
• Elasticity, bursting into the cloud
• Simple and easy-to-maintain technical solution
• Significant cost-saving potential

5
Agenda
1 Serverless architectures
2 Apache OpenWhisk on IBM Bluemix
3 Check processing overview and solution
4 Results, conclusions, future directions

6
With a serverless platform, developers focus more on code, less on infrastructure
Bare metal → Virtual machines → Containers → Functions
Decreasing concern (and control) over stack implementation
Increasing focus on business logic

7
Serverless platforms address 12 Factors for developers
I Codebase Handled by developer (Manage versioning of functions themselves)
II Dependencies Handled by developer, facilitated by serverless platform (Runtimes and packages)
III Config Handled by platform (Environment variables or injected event parameters)
IV Backing services Handled by platform (Connection information injected as event parameters)
V Build, release, run Handled by platform (Deployed resources immutable and internally versioned)
VI Processes Handled by platform (Single stateless containers used)
VII Port binding Handled by platform (Actions or functions automatically discovered)
VIII Concurrency Handled by platform (Process model hidden and scales in response to demand)
IX Disposability Handled by platform (Lifecycle hidden from user, fast startup and elastic scale prioritized)
X Dev/prod parity Handled by developer (Developer is deployer. Scope of what differs narrower)
XI Logs Handled by platform (Developer writes to console.log, platform streams logs)
XII Admin processes Handled by developer (No distinction between one-off processes and long-running ones)

8
Emerging workloads are a good fit for event driven programming
Execute app logic in response to database change
Perform edge analytics in response to sensor input
Provide cognitive computing via a conversational bot
Schedule tasks according to a specific timetable
Invoke autoscaled mobile backend services

9
New cost models more accurately charge for compute time
While many applications must still be deployed in an always-on model, serverless architectures provide an alternative that can result in substantial cost savings for a variety of event-driven workloads.
Applications are billed by compute time (milliseconds) rather than reserved memory (GB/hour).
This means a greater linkage between cloud resources used and business operations executed.

10
Technological and business factors make serverless compelling
Serverless architectures are gaining traction
Cost models getting more granular and efficient
Growth of event driven workloads that need automated scale
Platforms to facilitate cloud native design for developers

11
Agenda
1 Serverless architectures
2 Apache OpenWhisk on IBM Bluemix
3 Check processing overview and solution
4 Results, conclusions, future directions

12
OpenWhisk enables these serverless, event-driven workloads
Serverless deployment and operations model
Optimized utilization, fine grained metering at any scale
Flexible, extensible, polyglot programming model
Open source and open ecosystem (Apache Incubator)
Ability to run in public, private, and hybrid models
Apache OpenWhisk: a cloud platform that executes code in response to events

13
Developers work with triggers, actions, rules, and packages
Data sources define events they emit as Triggers (T).
Developers map Actions (A) to Triggers via Rules (R).
Packages (P) provide integration with external services.
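The relationship between triggers, rules, and actions can be sketched with a toy in-memory model. This is not the OpenWhisk API: the `MiniWhisk` class, the `check-uploaded` trigger, and the two action names are hypothetical and only illustrate how a rule binds an action to a trigger.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Action = Callable[[dict], dict]


class MiniWhisk:
    """Toy stand-in for the trigger/rule/action model (not the OpenWhisk API)."""

    def __init__(self) -> None:
        self.actions: Dict[str, Action] = {}
        self.rules: Dict[str, List[str]] = defaultdict(list)

    def create_action(self, name: str, fn: Action) -> None:
        self.actions[name] = fn

    def create_rule(self, trigger: str, action: str) -> None:
        # A rule binds one action to one trigger; a trigger may have many rules.
        self.rules[trigger].append(action)

    def fire(self, trigger: str, event: dict) -> List[dict]:
        # Firing a trigger runs every action bound to it with the event payload.
        return [self.actions[name](event) for name in self.rules[trigger]]


def resize_scan(event: dict) -> dict:
    # Hypothetical action: pretend to resize the uploaded scan.
    return {"resized": event["doc_id"]}


def record_audit(event: dict) -> dict:
    # Hypothetical action: pretend to write an audit entry.
    return {"audited": event["doc_id"]}


whisk = MiniWhisk()
whisk.create_action("resize-scan", resize_scan)
whisk.create_action("record-audit", record_audit)
whisk.create_rule("check-uploaded", "resize-scan")
whisk.create_rule("check-uploaded", "record-audit")

results = whisk.fire("check-uploaded", {"doc_id": "check-001"})
print(results)
```

In this sketch a trigger with no rules simply runs nothing, which mirrors the idea that events only cost compute when an action is mapped to them.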

14
OpenWhisk
Comparison to traditional PaaS or IaaS models
Traditional Model
• Continuous polling often used
• Charged even when idling
• No auto-scaling in response to load
Serverless Model
• Introduces event-driven programming model
• Charges only for what is used
• Auto-scales in response to current load
[Diagram: In the traditional model, requests and polling reach an application in a CF container or VM, leaving idle compute resources. In the serverless model, a trigger reaches the OpenWhisk engine, which deploys an action within milliseconds, runs it, and frees up resources, drawing on a pool of actions (JS, Swift, Docker).]

15
Agenda
1 Serverless architectures
2 Apache OpenWhisk on IBM Bluemix
3 Check processing overview and solution
4 Results, conclusions, future directions

16
Business Drivers at Santander for a Serverless Architecture – 1/2
What value do microservices and serverless architectures provide?
Billing Model: Compared to a PaaS offering, FaaS charges the customer based on the actual time used by the service itself. Server uptime is not billed (serverless).
Low Complexity: Independent scalability, integration and delivery pipelines, testability, and development flows make it more streamlined and automated, resulting in less maintenance effort and savings on operations and development costs.
Integration Capability: Provides a great way to quickly and reliably connect or relay private/public/hybrid SOA or Cloud APIs at low cost.

17
Business Drivers at Santander for a Serverless Architecture – 2/2
What value do microservices and serverless architectures provide?
• However, the outcome depends on each scenario.
• Not everything can or should rely on FaaS. For example, very active back-ends or complex front-end applications would simply underperform.
• OpenWhisk in particular is excellent for designing a web of microservices whose purpose is to relay or orchestrate other services (e.g. IoT, reactive post-processes applied on other Cloud feeds).
• Microservices are another tool for architects to support the general IT Cloud transition, and as such should be used in conjunction with other solutions.

18
Proof of Concept: “OpenChecks” check processing
OpenWhisk by example: service enablement and orchestration
Scenario:
This PoC intends to show how OpenWhisk could improve the following business process: bank clerks’ manual entry of routing and account numbers when cashing Santander Bank customers’ checks.
The purpose of this proof of concept is to show how OpenWhisk can be used for an event-driven, serverless architecture that processes the deposit of checks to a bank account using optical character recognition (OCR), replacing manual input and avoiding the associated human errors.

19
Check data parsing with OCR overview
OCR will be used to parse the data at the bottom of the check representing:
• The routing number
• The account number
If this information is not readable or does not follow the presented format, the check will be considered invalid.
[Figure: sample check with the routing number and the deposit-from account number highlighted]
The handwritten amount is not currently parsable, nor is the deposit-to account information provided on the check itself. This data needs to be passed as metadata (that is, encoded in the file name as supplied by the bank clerk).
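The validation step described above might look like the following sketch. Several details are assumptions not given in the slides: the shape of the OCR text (a 9-digit routing number followed by an account number, separated by non-digit MICR symbols), the 4 to 17 digit account length, and the `<account>_<amount>.<ext>` file name layout. The routing checksum is the standard ABA 3-7-1 rule, added here as an extra check beyond what the deck states.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Assumed OCR output shape: routing number (9 digits), then account number,
# with non-digit MICR symbols or spaces around them (hypothetical format).
MICR_PATTERN = re.compile(r"^\D*(\d{9})\D+(\d{4,17})\D*$")


def aba_checksum_ok(routing: str) -> bool:
    """Standard ABA check: the 3-7-1 weighted digit sum is divisible by 10."""
    weights = (3, 7, 1) * 3
    return sum(int(d) * w for d, w in zip(routing, weights)) % 10 == 0


def parse_micr(ocr_text: str) -> Optional[Tuple[str, str]]:
    """Return (routing, account), or None when the check is considered invalid."""
    match = MICR_PATTERN.match(ocr_text.strip())
    if match is None:
        return None  # unreadable or not in the expected format
    routing, account = match.groups()
    if not aba_checksum_ok(routing):
        return None  # digits were read, but the routing number is not plausible
    return routing, account


def parse_filename(name: str) -> Tuple[str, Decimal]:
    """Read deposit-to account and amount from '<account>_<amount>.<ext>' (assumed)."""
    stem = name.rsplit(".", 1)[0]
    account, amount = stem.split("_", 1)
    return account, Decimal(amount)


print(parse_micr("T011000015T 1234567890O"))
print(parse_filename("9876543210_125.50.jpg"))
```

Returning `None` rather than raising keeps the "invalid check" outcome a normal result that a downstream statistics step can count.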

20
Deployment model approaches evaluated
This proof of concept evaluated three different deployment models, each with its advantages and disadvantages.
Cloud: Deployment of the computing engine on the cloud
• Serverless computing
Local: Deployment of the computing engine on premises
• Sensitive data
• Avoid vendor lock-in
Cloud Bursting: Deployment of the computing model on both cloud and on-premises
• Total cost of ownership

21
Logical architecture
Architectural Diagram as deployed on the cloud (Apache OpenWhisk on IBM Bluemix)

22
Workload split between public and private OpenWhisk and Cloudant instances

23
Workload split between public and private OpenWhisk and Cloudant instances (with hybrid scheduling)

24
“OpenChecks” OCR in a hybrid environment
Hybrid Deployment with Cloud Bursting: Workflow Highlights
• Checks are scanned and uploaded by the front-office clerks.
• OpenWhisk on Bluemix resizes the scans to smaller sizes and stores them, along with the originals, in remote databases.
• Databases are replicated to on-prem servers.
• On-prem OpenWhisk kicks off the OCR, parses the checks, and stores the result in a local database.
• Statistics such as the total amount processed and the total number of checks that could not be parsed are calculated alternately by the local and remote OpenWhisk systems, based on an arbitrary dispatching method. These stats are stored in a remote database, which is continuously replicated to a local instance.
• Clerks connect a local front-end to consult these statistics from the local database.
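The "arbitrary dispatching method" for the statistics step is not specified in the deck. One simple possibility is a round-robin that alternates between the local and remote systems, sketched below. The `make_dispatcher` and `compute_stats` names, the record shape, and the use of plain Python functions as stand-ins for the two OpenWhisk deployments are all assumptions.

```python
from itertools import cycle
from typing import Callable, Dict, List

StatsFn = Callable[[List[dict]], dict]


def compute_stats(checks: List[dict]) -> dict:
    """Total the amounts of parsed checks and count the ones OCR could not parse."""
    parsed = [c for c in checks if c["parsed"]]
    return {
        "total_amount": sum(c["amount"] for c in parsed),
        "unparsed": len(checks) - len(parsed),
    }


def make_dispatcher(targets: Dict[str, StatsFn]) -> StatsFn:
    """Return a function that sends each batch to the next target in turn."""
    order = cycle(targets.items())

    def dispatch(checks: List[dict]) -> dict:
        name, compute = next(order)
        result = compute(checks)
        result["computed_by"] = name  # record which system handled the batch
        return result

    return dispatch


# Both targets run the same logic here; in a hybrid setup each would invoke
# a different OpenWhisk deployment (on-prem or hosted).
dispatch = make_dispatcher({"local": compute_stats, "remote": compute_stats})

batch = [
    {"amount": 100.0, "parsed": True},
    {"amount": 0.0, "parsed": False},
    {"amount": 50.0, "parsed": True},
]
print(dispatch(batch))
print(dispatch(batch))
```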

25
Proof of Concept: OpenChecks OCR
Hybrid Deployment with Cloud Bursting: Demo Front-end Statistics Screenshot

26
Hybrid Deployment with Cloud Bursting: Architectural Highlights
• No data resides only in the cloud; there is always a local replica. If necessary (for regulatory concerns), only on-prem storage can be used.
• Tasks are split in a hybrid way: part of the flow is done on-prem, the rest in the cloud.
• This simulation stresses the versatile nature of OpenWhisk: it both orchestrates (e.g. database change feed handlers, statistics computation dispatcher) and processes (e.g. image resize, statistics computation). It is both the foreman and the laborer.
• General deployment as well as DevOps integration is quick and should not disrupt other services.
• Document-based CouchDB (or Cloudant) is used in anticipation of large data volumes.
• Communication relies entirely on HTTPS REST APIs.

27
Agenda
1 Serverless architectures
2 Apache OpenWhisk on IBM Bluemix
3 Check processing overview and solution
4 Results, conclusions, future directions

28
Cost savings estimation from a check processing use case
Estimating that
• Number of USA check transactions in 2016: 60 million [1]
• Average time of execution in seconds: 7 seconds
• Allocated memory per execution in GB: 0.256 GB
• Cost per GB-second of execution: 0.000017 USD
[1] https://www.federalreserve.gov/paymentsystems/check_govcheckprocannual.htm
Yearly Cost = # of Executions x Average Time (in seconds) x Allocated Memory per Execution (in GB) x USD per GB-second
With these estimates, the total yearly cost to process every paper check in 2016 would be approximately 1,830 USD if based on OpenWhisk.
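The yearly cost estimate can be reproduced with a few lines of arithmetic. The inputs are the figures from this slide; the function and parameter names are illustrative only.

```python
def yearly_cost(executions: int, avg_seconds: float,
                memory_gb: float, usd_per_gb_second: float) -> float:
    """Yearly cost = executions x seconds x GB per execution x USD per GB-second."""
    return executions * avg_seconds * memory_gb * usd_per_gb_second


cost = yearly_cost(
    executions=60_000_000,       # check transactions in 2016 (slide estimate)
    avg_seconds=7,               # average execution time
    memory_gb=0.256,             # allocated memory per execution
    usd_per_gb_second=0.000017,  # price per GB-second
)
print(f"{cost:,.2f} USD")  # about 1,827.84 USD, which the slide rounds to 1,830
```

Because cost scales linearly with execution time and memory, halving either one in the action code halves the bill.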

29
Hybrid Deployment with Cloud Bursting: Challenges and Potential Improvements
• As of today, the OCR service can only efficiently cover account and routing numbers for US bank checks. In the future, other technologies should be surveyed in order to handle the amounts on the checks.
• The front-end currently shows the check statistics, for demonstration purposes. It should be enhanced to allow data correction and validation by the clerks.
• Beyond use by the clerks at the banks, the same logic could be used to support mobile check deposit (with deposit-to account information inferred from the user, and amount data input manually).
• Another OpenWhisk function could be created in a similar fashion to integrate with Santander Bank internal systems of record.

30
Hybrid Deployment with Cloud Bursting: Challenges and Potential Improvements
• The full cost of an on-premises cluster of virtual machines or containers to run OpenWhisk (and CouchDB) in a highly available configuration should be weighed against the lower cost of using a hosted instance in Bluemix. It is a trade-off between cost and risk, but the key point is that OpenWhisk gives you the flexibility to decide.
• Use the new release of the Watson text analysis service rather than packaging Tesseract with MICR training data in a Docker container (or using Tesseract.js).
• With serverless, cost is tightly bound to value gained, so code optimizations are very important at scale.
• A lot of work has been done to make OpenWhisk actions runnable and testable locally, outside the OpenWhisk environment. This is key to an end-to-end workflow that requires versioning of functions.
• OpenWhisk native sequences and triggers/feeds should be preferred over manual programmatic action chaining in order to support composability.
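The last point can be illustrated locally. In an OpenWhisk sequence, the result of each action becomes the input of the next, so individual actions stay unaware of what follows them. The sketch below simulates that composition in plain Python; the `sequence` helper and the three stub actions are hypothetical and do not call OpenWhisk or a real OCR engine.

```python
from functools import reduce
from typing import Callable

Action = Callable[[dict], dict]


def sequence(*actions: Action) -> Action:
    """Compose actions so that each one receives the previous one's result."""
    def run(params: dict) -> dict:
        return reduce(lambda payload, action: action(payload), actions, params)
    return run


def resize(params: dict) -> dict:
    # Stub: a real action would produce a smaller copy of the scan.
    return {**params, "thumbnail": params["image"] + "@thumb"}


def ocr(params: dict) -> dict:
    # Stub: fixed values stand in for the OCR result.
    return {**params, "routing": "011000015", "account": "1234567890"}


def store(params: dict) -> dict:
    # Stub: a real action would write the document to a database.
    return {"stored": True, "doc": params}


process_check = sequence(resize, ocr, store)
print(process_check({"image": "check-001.jpg"}))
```

None of the stubs invokes another action directly, so any of them can be reused in a different sequence or tested on its own.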

31
Why use OpenWhisk on IBM Bluemix?
Provides a rich ecosystem of building blocks from various domains.
Supports an open ecosystem that allows sharing microservices via packages.
Takes care of low-level details such as scaling, load balancing, logging and fault tolerance.
Hides infrastructural complexity allowing developers to focus on business logic.
Allows developers to compose solutions using modern abstractions and chaining.
Charges only for code that runs.
Is open and designed to support an open community.
Supports multiple runtimes and arbitrary binary programs encapsulated in Docker containers.

32
Try Apache OpenWhisk on IBM Bluemix
bit.ly/ibm-ow

Thank you
Our purpose is to help people and
businesses prosper.
Our culture is based on the belief that
everything we do should be
