In the first webinar of the series we covered the importance of caching in microservice-based application architectures: in addition to improving performance, caching makes content from legacy systems available, promotes loose coupling and team autonomy, and provides air gaps that can limit failures from cascading through a system.
To reap these benefits, though, the right caching patterns must be employed. In this webinar, we will examine various caching patterns and shed light on how they deliver the capabilities our microservices need. What about rapidly changing data, or concurrent updates to the same data? What impact do these and other factors have on the various use cases and patterns?
Understanding the data access patterns covered in this webinar will help you make the right decision for each use case. Beyond the simplest of use cases, caching can be tricky business; join us for this webinar to see how best to use these patterns.
Jagdish Mirani, Cornelia Davis, Michael Stolz, Pulkit Chandra, Pivotal
3. Why Microservices Architectures Need a Cache
(covered in webinar #1)
(Diagram of the four benefits: Performance, Availability, Autonomy, Cost Reduction)
Performance:
In-memory performance
Horizontally scalable
4. Why Microservices Architectures Need a Cache
(covered in webinar #1)
Availability:
Resilient to Server or Availability Zone failures
Recovery process protects against any data loss
Increases the overall availability of data across the system
5. Why Microservices Architectures Need a Cache
(covered in webinar #1)
Autonomy:
Data APIs and versioning can provide teams with tailored views of data
without changing the backing store(s)
Versioning of Data APIs provides an easier evolutionary path for
changes to the data layer
6. Why Microservices Architectures Need a Cache
(covered in webinar #1)
Cost Reduction:
Economical alternative to expensive data access from legacy systems
Provides an on-ramp to a modern architecture while preserving
investments in legacy systems
8. Is There a Pattern To Caching Patterns?
Business use case driven
- Online shopping
- Ticketing
- Risk computations
Isolation vs. Consolidation
- Single shared database (anti-pattern)
- Database per service
Types of data
- Session state caching
- Application and user data caching
Data consistency
- Synchronous
- Asynchronous
Active lifespan of data
- Static data
- Fast data
Type of integration
- Data APIs
- Embedded in applications
- Separate process
Data Management
- Materialized views
- Data mirroring
- Data integration
- Data streams
- Query and search
Architecture
- Event driven
- CQRS
Lifecycle Management
- Versioning
- Parallel Deployments
Caching Styles
- Look-aside
- Read-through
- Write-through
- Write-behind
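The caching styles at the end of this list differ mainly in who talks to the backing store and when. As a minimal sketch (plain Python dicts standing in for a real cache and database; all names are illustrative, not any product's API), a look-aside read and a write-through update look like this:

```python
backing_store = {"sku-1": {"price": 9.99}}  # stand-in for a database
cache = {}                                  # stand-in for an in-memory cache

def get_product(sku):
    """Look-aside (cache-aside): the app checks the cache, then the store."""
    if sku in cache:                 # hit: serve from memory
        return cache[sku]
    value = backing_store[sku]       # miss: read the backing store
    cache[sku] = value               # populate the cache for next time
    return value

def update_product(sku, value):
    """Write-through: the store and the cache are updated synchronously."""
    backing_store[sku] = value
    cache[sku] = value
```

In the read-through and write-behind styles the cache itself performs the store access, synchronously and asynchronously respectively, instead of the application code shown here.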
9. What are Software Design Patterns?
Wikipedia, on "software design patterns":
"... general reusable solution to a commonly occurring problem within a given context in software design"
If it happens often, and it solves a problem, it's a pattern.
If it works against solving the problem, it's an anti-pattern.
10. Patterns We Will Cover Today
● Session state caching
● Caching application generated data, e.g. user data
● Caching reference data, product data, pricing
● Cache warming and freshness
● High availability across servers and availability zones
12. Session state caching: a common starting point
Session State Data
● Caching session or application state
● Metadata about the application's interactions
● Useful for business continuity of operations, and resilience to failures
● Fast access to state information is critical for elastic operations
● Session state caching: session data available within the scope of a session
Application or User Data
▪ Data generated by the application and users; populates the app's database
▪ Data about the user, such as preferences
▪ Data that is used by the application to do its job, such as product data or pricing
13. Why Use Session State Caching?
Benefits both user experience and infrastructure elasticity
▪ Capturing and externalizing state information improves the user experience
- State information can be saved in a highly available cache
▪ Elastic infrastructure for scaling microservices up and down
- Multiple application instances share the same user session context
- Multiple disparate applications can share the same user session context
▪ Improves the infrastructure's availability
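As a hedged sketch of the idea (a plain dict standing in for a highly available shared cache; function and field names are illustrative), externalizing session state so that any application instance can serve the next request might look like:

```python
import uuid

session_cache = {}  # stand-in for a highly available, shared cache cluster

def create_session(preferences):
    """Store the user's session state outside any one app instance."""
    session_id = str(uuid.uuid4())
    session_cache[session_id] = {"prefs": preferences, "cart": []}
    return session_id

def handle_request(session_id, item):
    """Any instance can run this: state lives in the cache, not the process."""
    state = session_cache[session_id]
    state["cart"].append(item)
    return state
```

Because the session id, not a particular process, locates the state, instances can be added or removed without logging users out.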
17. Application Data Cache
▪ Works well when ...
- Same data accessed frequently (read intensive)
- Large number of concurrent sessions
- Data that is slow or expensive to access from the backing store
- No need for handling concurrent updates to the same data
- User experience driven (performance and availability)
▪ Example: Online shopping
- Customer preferences
- Shopping carts
- Recent history
- Recommendation engine data
18. Partitioning (aka Sharding): how an in-memory cache scales horizontally
Take advantage of the memory and network bandwidth of all members of the cluster
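A minimal illustration of the idea (not any product's actual partitioning scheme; member names are hypothetical): hash each key to pick the cluster member that owns it, so adding members adds memory and network bandwidth.

```python
import zlib

members = ["cache-a", "cache-b", "cache-c"]  # hypothetical cluster members

def owner(key):
    """Deterministically map a key to the member that stores it."""
    return members[zlib.crc32(key.encode()) % len(members)]
```

Real caching products typically hash into a fixed number of buckets (or use consistent hashing) so that adding a member moves only a fraction of the data rather than remapping every key.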
19. Reference Data is a Prime Candidate for Caching
Application Data Cache
▪ Slowly changing
▪ Modest volumes
▪ Examples
- Product catalog information: pricing, description, …
- Geo information
- Profile information: name, address, preferences
- HR information
▪ Slowly changing data benefits from cache warming
21. Batch vs. Data Streams: streams reduce data latency
Data scope
- Batch: queries or processing over all or most of the data
- Streams: queries or processing over a rolling window or the most recent records
Data size
- Batch: large batches
- Streams: individual records or micro-batches
Performance
- Batch: latencies in minutes or hours
- Streams: latencies on the order of seconds or milliseconds
Analytics
- Batch: complex analytics
- Streams: simple response functions, aggregates, and rolling metrics
Tools
- Batch: batch ETL; monolithic, brittle, expensive
- Streams: scalable, adaptable, cloud-native
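The "rolling window" and "rolling metrics" rows above can be made concrete with a small sketch (the window size and the metric are arbitrary choices): a stream consumer keeps only the most recent records and updates a simple aggregate as each record arrives, instead of querying the whole data set.

```python
from collections import deque

window = deque(maxlen=5)  # rolling window over the 5 most recent records

def on_record(value):
    """Called once per incoming record; returns the current rolling average."""
    window.append(value)   # deque drops the oldest record automatically
    return sum(window) / len(window)
```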
22. Continuous Data Integration
Copyright (C) 2017 451 Research LLC
The key concepts of a continuous integration and delivery process can be applied to developing, deploying, and managing data integration pipelines, resulting in pipelines that are responsive to changing business and data processing requirements.
23. for Data Integration
Before Cloud
● Servers were expensive.
● Software was hard to acquire and slow to deploy.
● Nightly batches
● Integration came in large rollouts and required specialized skills.
Today
● Lower compute cost
● Software is core to business value,
developed in-house.
● Higher data volumes
● Scale-out architectures
● Streaming data sources
Why Microservices?
Integration flows can leverage this by operating as microservices that scale independently, are reusable, and interoperate over message queues.
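A toy sketch of that shape (in-process queues standing in for a message broker such as RabbitMQ; stage names and the transform are illustrative): each stage reads from one channel and writes to the next, so in a real deployment each stage could be deployed and scaled independently.

```python
import queue

source_out = queue.Queue()     # source -> processor channel
processor_out = queue.Queue()  # processor -> sink channel

def source(records):
    """Emit raw records onto the first channel."""
    for record in records:
        source_out.put(record)

def processor():
    """Transform each record; a trivial uppercase stands in here."""
    while not source_out.empty():
        processor_out.put(source_out.get().upper())

def sink():
    """Drain the final channel, e.g. into a cache or database."""
    drained = []
    while not processor_out.empty():
        drained.append(processor_out.get())
    return drained
```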
24. Cache Warming and Data Freshness
(Diagram: data sources and IoT events feed a Spring Cloud Data Flow / Spring Cloud Stream data pipeline, which uses a binding abstraction layer over transport options such as RabbitMQ to deliver data into Pivotal Cloud Cache)
25. Spring Cloud Data Flow
Drag-and-drop visual editor for data pipeline authoring and orchestration
26. Platform Leverage for Streaming Data
(Diagram: a data pipeline with visual design and integrated monitoring runs on Pivotal Cloud Foundry, which contributes auto scaling, auto healing, aggregated logging, integrated metrics, and transport and infrastructure transparency; backing services such as RabbitMQ provide the transport options)
30. Summary
▪ Session state caching
- “De-anonymizes” users during a session to improve user experience
- Is a critical part of how cloud-native apps achieve elasticity
▪ Application data caches can absorb the load from applications that need low-latency, high-frequency access to data (hot data)
▪ Reference data is a prime candidate for caching
▪ Data streams
- Can feed a caching layer for use cases that need low-latency access to fresh data
- Provide a uniform mechanism for both cache warming and the ongoing freshness of cached data
▪ High availability for applications requires component level availability as well
as system wide mechanisms for resilience after failures