Building resilient applications
with Polly
netponto - 2019-01-26
Nuno Caneco
/@nuno.caneco
/nunocaneco
nuno.caneco@gmail.com
Engineering Manager / Cluster Lead
@nunocaneco
Availability
Availability is the probability that a system will work as
required when required during the period of a mission.
The mission could be the 18-hour span of an aircraft flight. The mission period could
also be the 3 to 15-month span of a military deployment.
Availability includes non-operational periods associated with reliability, maintenance,
and logistics.
Availability levels
Nines                    Unavailable period / year
1 nine - 90%             36.5 days
1.5 nines - 95%          18.2 days
2 nines - 99%            3.7 days
3 nines - 99.9%          8.8 hours
4 nines - 99.99%         52.6 minutes
5 nines - 99.999%        5.3 minutes
6 nines - 99.9999%       31.5 seconds
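The yearly figures follow from simple arithmetic: the unavailable fraction multiplied by the length of a year. Below is a minimal sketch of that calculation, assuming a 365-day year. `AvailabilityMath` and `DowntimePerYear` are illustrative names, not part of Polly or of the original slides.

```csharp
using System;

public static class AvailabilityMath
{
    // Hypothetical helper: yearly downtime implied by an availability percentage.
    // Assumes a 365-day year (leap years are ignored).
    public static TimeSpan DowntimePerYear(double availabilityPercent)
    {
        if (availabilityPercent < 0 || availabilityPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(availabilityPercent));

        double unavailableFraction = 1.0 - availabilityPercent / 100.0;
        return TimeSpan.FromDays(365 * unavailableFraction);
    }
}
```

For example, `AvailabilityMath.DowntimePerYear(99.9).TotalHours` is about 8.76, which the table rounds to 8.8 hours.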
[Figure: Availability vs. Cost]
What is resilience?
Resiliency is the ability of a system to
gracefully handle and recover from failures.
Source MSDN - https://docs.microsoft.com/en-us/azure/architecture/patterns/category/resiliency
Ecosystem
User interface
Public API
Backend Services
Publish / Subscribe
Databases
BLOB Storage
3rd party systems
Crashing
You're working here →
This dependency crashes →
Things will go wrong
As a system's complexity grows, the number and types of issues that might occur and
affect its availability also increase:
Code Library
● Coding errors
● Edge cases
Hardware
● Disk
● Network card
External systems / APIs
● Coding errors
● Edge cases
● Network issues
● Degradation of service
● Request overload
● Unavailability
● Protocol issues (HTTPS, …)
Databases
● Coding errors
● Resource exhaustion
● Network issues
● Degradation of service
● Request overload
● Unavailability
Trust zones
Trusted: Application Code
Untrusted: Clients, App A, App B, DB, transitive dependencies, Network
The big question now is
How does the Application Code deal with
failures from dependencies?
What are we looking for?
● Choose your availability level:
Not every application has high availability requirements
● Reduce exposure to dependency failures:
if a dependency fails, the application should do its best to keep working
● Assume chaos:
Things will go wrong at some point. Be prepared!
● Beware of misbehaved clients:
Your clients might be evil.
● Fail fast:
In case of failure, the application must fail ASAP and report the problem
Introducing Polly
Polly
Polly is a .NET resilience and transient-fault-handling library that allows developers
to express policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and
Fallback in a fluent and thread-safe manner.
Polly targets .NET Standard 1.1 (coverage: .NET Framework 4.5-4.6.1, .NET Core 1.0,
Mono, Xamarin, UWP, WP8.1+) and .NET Standard 2.0+ (coverage: .NET Framework
4.6.1, .NET Core 2.0+, and later Mono, Xamarin and UWP targets).
https://github.com/App-vNext/Polly
Usage
// Create a policy
var policy = Policy
    .Handle<SomeExceptionType>()
    .Retry();
// Execute action with void return within a policy
policy.Execute(() => SomeAction());
// Execute action with return value within a policy
var result = policy.Execute(() => SomeAction()); // Implicit return type
Retry
"Maybe it's just a blip"
Automatically retry an operation in case of exception.
Timing:
● Immediate retry
● Wait and Retry
○ Constant backoff (e.g. wait 10 seconds before retry)
○ Dynamic backoff (e.g. exponential backoff)
Perseverance:
● Retry forever
● Give up after n attempts
https://github.com/App-vNext/Polly/wiki/Retry
Retry - code
// Retry once
Policy
    .Handle<SomeExceptionType>()
    .Retry();
// Retry multiple times
Policy
    .Handle<SomeExceptionType>()
    .Retry(3);
// Retry multiple times, calling an action on each retry
// with the current exception and retry count
Policy
    .Handle<SomeExceptionType>()
    .Retry(3, (exception, retryCount) =>
    {
        // do something
    });
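To make "give up after n attempts" concrete, the sketch below counts how often the delegate runs under `Retry(3)`: one initial attempt plus three retries, after which the last exception is rethrown to the caller. It assumes the Polly NuGet package with its synchronous `Policy` API. `RetrySketch` and the always-failing delegate are hypothetical.

```csharp
using System;
using Polly;

public static class RetrySketch
{
    // Hypothetical demo: returns how many times the delegate ran and whether the policy gave up.
    public static (int Attempts, bool GaveUp) Run()
    {
        int attempts = 0;
        bool gaveUp = false;

        var retry = Policy
            .Handle<TimeoutException>()
            .Retry(3);

        // Simulated dependency call that always fails.
        Action failingCall = () =>
        {
            attempts++;
            throw new TimeoutException("simulated blip");
        };

        try
        {
            retry.Execute(failingCall);
        }
        catch (TimeoutException)
        {
            // Retries are exhausted, so the last exception reaches the caller.
            gaveUp = true;
        }

        return (attempts, gaveUp);
    }
}
```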
Wait and retry - code
// Retry, waiting a specified duration between each retry.
// (The wait is imposed on catching the failure, before making the next try.)
Policy
.Handle<SomeExceptionType>()
.WaitAndRetry(new[]
{
TimeSpan.FromSeconds(1),
TimeSpan.FromSeconds(2),
TimeSpan.FromSeconds(3)
});
// Retry a specified number of times, using a function to calculate the duration to wait between retries
// based on the current retry attempt (allows for exponential backoff)
Policy
.Handle<SomeExceptionType>()
.WaitAndRetry(5, retryAttempt =>
TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))
);
// In this case will wait for
// 2 ^ 1 = 2 seconds then
// 2 ^ 2 = 4 seconds then
// 2 ^ 3 = 8 seconds then
// 2 ^ 4 = 16 seconds then
// 2 ^ 5 = 32 seconds
Timeout
"Don't wait forever"
Optimistic timeout
Optimistic timeout operates via CancellationToken and assumes delegates you
execute support co-operative cancellation. You must use Execute/Async(...) overloads
taking a CancellationToken, and the executed delegate must honor that
CancellationToken.
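A minimal sketch of an optimistic timeout, assuming the Polly NuGet package. The delegate receives the `CancellationToken` supplied by the policy and checks it between units of work. The 200 ms limit, the simulated work loop and the name `TimeoutSketch` are assumptions made for illustration.

```csharp
using System;
using System.Threading;
using Polly;
using Polly.Timeout;

public static class TimeoutSketch
{
    // Hypothetical demo: a slow, cancellation-aware delegate is cut off by the policy.
    public static string Run()
    {
        var timeout = Policy.Timeout(TimeSpan.FromMilliseconds(200), TimeoutStrategy.Optimistic);

        try
        {
            timeout.Execute(ct =>
            {
                // Simulated slow work (about 2.5 s in total) that honors the token.
                for (int i = 0; i < 50; i++)
                {
                    ct.ThrowIfCancellationRequested();
                    Thread.Sleep(50);
                }
            }, CancellationToken.None);

            return "completed";
        }
        catch (TimeoutRejectedException)
        {
            // Polly reports the timeout with its own exception type.
            return "timed out";
        }
    }
}
```

If the delegate ignored the token, an optimistic timeout could not interrupt it, which is the case the pessimistic strategy addresses.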
Pessimistic timeout
Pessimistic timeout allows calling code to 'walk away' from waiting for an executed
delegate to complete, even if it does not support cancellation. In synchronous
Fallback
"Degrade gracefully"
Provide a substitute value or substitute action in the event of failure.
Triggers:
● Exception
e.g. if user action threw a HttpRequestException, return "??"
● Specific return values
e.g. if user action returned -1, return "??"
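Both triggers can be combined in one policy. The sketch below assumes the Polly NuGet package and uses `Policy<TResult>` with `Handle` and `OrResult`. The lookup delegate, the "empty name" rule and the name `FallbackSketch` are hypothetical stand-ins for the user action on the slide.

```csharp
using System;
using System.Net.Http;
using Polly;

public static class FallbackSketch
{
    // Hypothetical demo: returns "??" when the lookup throws or yields an unusable value.
    public static string GetDisplayName(Func<string> lookup)
    {
        var fallback = Policy<string>
            .Handle<HttpRequestException>()                  // trigger 1: an exception
            .OrResult(name => string.IsNullOrEmpty(name))    // trigger 2: a specific return value
            .Fallback("??");

        return fallback.Execute(lookup);
    }
}
```

Under these assumptions, `GetDisplayName(() => "Ana")` returns the looked-up value unchanged, while an empty result or an `HttpRequestException` yields `"??"`.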
Fallback - code
// Provide a substitute value, if execution faults.
Policy
    .Handle<Whatever>()
    .Fallback<UserAvatar>(UserAvatar.Blank);
// Specify a func to provide a substitute value, if execution faults.
Policy
    .Handle<Whatever>()
    .Fallback<UserAvatar>(() => UserAvatar.GetRandomAvatar());
// Specify a substitute value or func, calling an action (e.g. for logging) if the fallback is invoked.
Policy
    .Handle<Whatever>()
    .Fallback<UserAvatar>(UserAvatar.Blank, onFallback: (exception, context) =>
    {
        // do something
    });
Fallback
Start: execution requested → Execute user delegate
● Exception? Yes → Return fallback value
● Fallback value? Yes → Return fallback value
● Otherwise → Return result
End
Cache
"You've asked that one before"
Provides a response from cache if known.
● Multiple cache providers
● Absolute expiration: expire after a given amount of time
● Sliding expiration: keep items that are being hit
Cache
Start: execution requested → Calculate cache key
● In cache? Yes → Return cached value
● In cache? No → Invoke user delegate → Put result in cache → Return result
End
Cache - code
// Define a cache policy in the .NET Framework, using the Polly.Caching.Memory NuGet package.
var memoryCacheProvider = new MemoryCacheProvider(MemoryCache.Default);
var cachePolicy = Policy.Cache(memoryCacheProvider, TimeSpan.FromMinutes(5));
// Define a cache policy with absolute expiration at midnight tonight.
var midnightCachePolicy = Policy.Cache(memoryCacheProvider, new AbsoluteTtl(DateTimeOffset.Now.Date.AddDays(1)));
// Define a cache policy with sliding expiration: items remain valid for another 5 minutes
// each time the cache item is used.
var slidingCachePolicy = Policy.Cache(memoryCacheProvider, new SlidingTtl(TimeSpan.FromMinutes(5)));
Demo I
Circuit breaker
"Stop doing it if it hurts"
Breaks the circuit (i.e. blocks executions) for a period, when faults exceed some
pre-configured threshold.
Circuit breaker state machine
Closed:
● Executes user action and returns the result
● Initial state
Open:
● User action will NOT be executed
● Fail fast by throwing a BrokenCircuitException
● Will remain open until durationOfBreak elapses
Half-open:
● The next call is treated as a trial to determine the
circuit's health
● If it throws an exception, the circuit returns to Open
● If there is no exception, the circuit transitions to Closed
Circuit Breaker - code
// Break the circuit after the specified number of consecutive exceptions
// and keep circuit broken for the specified duration.
Policy
.Handle<SomeExceptionType>()
.CircuitBreaker(2, TimeSpan.FromMinutes(1));
// Break the circuit after the specified number of consecutive exceptions
// and keep circuit broken for the specified duration,
// calling an action on change of circuit state.
Action<Exception, TimeSpan> onBreak = (exception, timespan) => { ... };
Action onReset = () => { ... };
CircuitBreakerPolicy breaker = Policy
.Handle<SomeExceptionType>()
.CircuitBreaker(2, TimeSpan.FromMinutes(1), onBreak, onReset);
Manually breaking the circuit
// Monitor the circuit state, for example for health reporting.
CircuitState state = breaker.CircuitState;
/*
CircuitState.Closed - Normal operation. Execution of actions allowed.
CircuitState.Open - The automated controller has opened the circuit. Execution of actions blocked.
CircuitState.HalfOpen - Recovering from open state, after the automated break duration has expired.
Execution of actions permitted. Success of subsequent action/s controls onward transition to Open or Closed
state.
CircuitState.Isolated - Circuit held manually in an open state. Execution of actions blocked.
*/
// Manually open (and hold open) a circuit breaker - for example to manually isolate a downstream service.
breaker.Isolate();
// Reset the breaker to closed state, to start accepting actions again.
breaker.Reset();
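The state machine can be observed with a small experiment. The sketch below assumes the Polly NuGet package. A breaker configured to open after 2 consecutive exceptions lets the first two failing calls through, then rejects the third with a `BrokenCircuitException` without running the delegate. `CircuitBreakerSketch` and the always-failing delegate are hypothetical.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

public static class CircuitBreakerSketch
{
    // Hypothetical demo: three calls against a dependency that always fails.
    public static (int Attempts, CircuitState State, bool FailedFast) Run()
    {
        var breaker = Policy
            .Handle<InvalidOperationException>()
            .CircuitBreaker(2, TimeSpan.FromMinutes(1));

        int attempts = 0;
        bool failedFast = false;

        // Simulated dependency call that always fails.
        Action failingCall = () =>
        {
            attempts++;
            throw new InvalidOperationException("dependency down");
        };

        for (int call = 0; call < 3; call++)
        {
            try
            {
                breaker.Execute(failingCall);
            }
            catch (BrokenCircuitException)
            {
                // Circuit is open: the delegate was not executed.
                failedFast = true;
            }
            catch (InvalidOperationException)
            {
                // Circuit still closed: the original exception is rethrown.
            }
        }

        return (attempts, breaker.CircuitState, failedFast);
    }
}
```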
Demo II
Bulkhead Isolation
"One fault shouldn't sink the whole ship"
Constrains the governed actions to a fixed-size resource pool, isolating their potential
to affect others.
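A minimal sketch of a bulkhead, assuming the Polly NuGet package. The pool has one execution slot and no queue, so a second call made while the first is still running is rejected with a `BulkheadRejectedException`. The pool size, the name `BulkheadSketch` and the synchronization events used to make the demo deterministic are assumptions made for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Bulkhead;

public static class BulkheadSketch
{
    // Hypothetical demo: returns true if a call is rejected while the only slot is busy.
    public static bool SecondCallIsRejected()
    {
        // One concurrent execution allowed, no queued executions.
        var bulkhead = Policy.Bulkhead(1, 0);

        using var entered = new ManualResetEventSlim(false);
        using var release = new ManualResetEventSlim(false);

        // First call occupies the single slot until it is released.
        Task first = Task.Run(() => bulkhead.Execute(() =>
        {
            entered.Set();
            release.Wait();
        }));

        entered.Wait();

        bool rejected = false;
        try
        {
            bulkhead.Execute(() => { });
        }
        catch (BulkheadRejectedException)
        {
            rejected = true;
        }

        release.Set();
        first.Wait();
        return rejected;
    }
}
```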
PolicyWrap
"Defense in depth"
Allows multiple policies to be combined.
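A minimal sketch of combining policies, assuming the Polly NuGet package. A fallback wraps a retry, so the retry runs closest to the delegate and the fallback only takes over once the retries are exhausted. `PolicyWrapSketch`, the retry count and the substitute value are hypothetical.

```csharp
using System;
using Polly;

public static class PolicyWrapSketch
{
    // Hypothetical demo: an always-failing call, retried twice, then replaced by a fallback.
    public static (string Result, int Attempts) Run()
    {
        var retry = Policy
            .Handle<TimeoutException>()
            .Retry(2);

        var fallback = Policy<string>
            .Handle<TimeoutException>()
            .Fallback("substitute value");

        // Outermost policy first: fallback wraps retry.
        var resilient = fallback.Wrap(retry);

        int attempts = 0;
        Func<string> failingCall = () =>
        {
            attempts++;
            throw new TimeoutException("simulated failure");
        };

        string result = resilient.Execute(failingCall);
        return (result, attempts);
    }
}
```

The order matters: wrapping the other way round would retry the fallback rather than the failing call.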
Demo III
ANY QUESTIONS?
THANK YOU
"Gold" sponsors
Twitter: @PremiumMinds https://www.premium-minds.com
"Silver" sponsors
"Bronze" sponsors
http://bit.ly/netponto-aval-79
* For those who cannot fill it in during the meeting,
we will send an email with the link in the afternoon
Upcoming in-person meetings
07/04/2018 – Lisboa
23/06/2018 – Lisboa
15/09/2018 – Lisboa
24/11/2018 – Lisboa
Save these dates in your calendar! :)

 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

Building resilient applications

  • 1. Building resilient applications with Polly netponto - 2019-01-26 Nuno Caneco
  • 5. Availability
      Availability is the probability that a system will work as required when required during the period of a mission.
      The mission could be the 18-hour span of an aircraft flight. The mission period could also be the 3 to 15-month span of a military deployment.
      Availability includes non-operational periods associated with reliability, maintenance, and logistics.
  • 6. Availability levels
      Nines                  Unavailable period / year
      1 nine    - 90%        36.5 days
      1.5 nines - 95%        18.2 days
      2 nines   - 99%        3.7 days
      3 nines   - 99.9%      8.8 hours
      4 nines   - 99.99%     52.6 minutes
      5 nines   - 99.999%    5.3 minutes
      6 nines   - 99.9999%   30 seconds
      [Chart axes: Availability, Cost]
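The figures in the availability table can be reproduced from the fraction of a year the system is allowed to be down. The worked examples below assume a 365-day year (the slide does not state the year length it used), and the table values appear to be rounded.

```latex
\begin{align*}
\text{downtime per year} &= (1 - A) \times 365\ \text{days} \\
A = 99\%:\quad & 0.01 \times 365\ \text{days} = 3.65\ \text{days} \\
A = 99.9\%:\quad & 0.001 \times 8760\ \text{h} = 8.76\ \text{h} \\
A = 99.999\%:\quad & 0.00001 \times 525{,}600\ \text{min} \approx 5.26\ \text{min}
\end{align*}
```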
  • 7. What is resilience?
      Resiliency is the ability of a system to gracefully handle and recover from failures.
      Source: MSDN - https://docs.microsoft.com/en-us/azure/architecture/patterns/category/resiliency
  • 8. Ecosystem
      User interface, Public API, Backend Services, Publish / Subscribe, Databases, BLOB Storage, 3rd party systems
  • 9. Crashing
      [Diagram labels: "You're working here →", "This dependency crashes →"]
  • 10. Things will go wrong
      As a system's complexity grows, the number and types of issues that can occur and affect its availability also increase:
      Code Library
      ● Coding errors
      ● Edge cases
      Hardware
      ● Disk
      ● Network card
      External systems / APIs
      ● Coding errors
      ● Edge cases
      ● Network issues
      ● Degradation of service
      ● Request overload
      ● Unavailability
      ● Protocol issues (HTTPS, …)
      Databases
      ● Coding errors
      ● Resource exhaustion
      ● Network issues
      ● Degradation of service
      ● Request overload
      ● Unavailability
  • 12. Trust zones
      [Diagram labels: Application Code, Clients, App A, App B, DB, Transitive dependencies, Network; zones: Trusted, Untrusted, Untrusted]
  • 13. The big question now is
      How does the Application Code deal with failures from dependencies?
  • 14. What are we looking for?
      ● Choose your availability level: not every application has high availability requirements.
      ● Reduce exposure to dependency failures: if a dependency fails, the application should do its best to behave.
      ● Assume chaos: things will go wrong at some point. Be prepared!
      ● Beware of misbehaving clients: your clients might be evil.
      ● Fail fast: in case of failure, the application must fail ASAP and report the problem.
  • 16. Polly
      Polly is a .NET resilience and transient-fault-handling library that allows developers to express policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback in a fluent and thread-safe manner.
      Polly targets .NET Standard 1.1 (coverage: .NET Framework 4.5-4.6.1, .NET Core 1.0, Mono, Xamarin, UWP, WP8.1+) and .NET Standard 2.0+ (coverage: .NET Framework 4.6.1, .NET Core 2.0+, and later Mono, Xamarin and UWP targets).
      https://github.com/App-vNext/Polly
  • 17. Usage
      using Polly;
      // Create a policy
      var policy = Policy
          .Handle<SomeExceptionType>()
          .Retry();
      // Execute an action with void return within a policy
      policy.Execute(() => SomeAction());
      // Execute an action with a return value within a policy
      var result = policy.Execute(() => SomeAction()); // Implicit return type
  • 18. Retry
      "Maybe it's just a blip"
      Automatically retry an operation in case of exception.
      Timing:
      ● Immediate retry
      ● Wait and retry
        ○ Constant backoff (e.g. wait 10 seconds before each retry)
        ○ Dynamic backoff (e.g. exponential backoff)
      Perseverance:
      ● Retry forever
      ● Give up after n attempts
      https://github.com/App-vNext/Polly/wiki/Retry
  • 19. Retry
  • 20. Retry - code
      // Retry once
      Policy
          .Handle<SomeExceptionType>()
          .Retry();
      // Retry multiple times
      Policy
          .Handle<SomeExceptionType>()
          .Retry(3);
      // Retry multiple times, calling an action on each retry
      // with the current exception and retry count
      Policy
          .Handle<SomeExceptionType>()
          .Retry(3, (exception, retryCount) =>
          {
              // do something
          });
  • 21. Wait and retry - code
      // Retry, waiting a specified duration between each retry.
      // (The wait is imposed on catching the failure, before making the next try.)
      Policy
          .Handle<SomeExceptionType>()
          .WaitAndRetry(new[]
          {
              TimeSpan.FromSeconds(1),
              TimeSpan.FromSeconds(2),
              TimeSpan.FromSeconds(3)
          });
      // Retry a specified number of times, using a function to calculate
      // the duration to wait between retries based on the current retry
      // attempt (allows for exponential backoff)
      Policy
          .Handle<SomeExceptionType>()
          .WaitAndRetry(5, retryAttempt =>
              TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))
          );
      // In this case will wait for
      //   2 ^ 1 = 2 seconds then
      //   2 ^ 2 = 4 seconds then
      //   2 ^ 3 = 8 seconds then
      //   2 ^ 4 = 16 seconds then
      //   2 ^ 5 = 32 seconds
  • 22. Timeout
      "Don't wait forever"
      Optimistic timeout
      Optimistic timeout operates via CancellationToken and assumes delegates you execute support co-operative cancellation. You must use Execute/Async(...) overloads taking a CancellationToken, and the executed delegate must honor that CancellationToken.
      Pessimistic timeout
      Pessimistic timeout allows calling code to 'walk away' from waiting for an executed delegate to complete, even if it does not support cancellation. In synchronous
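The Timeout slide has no accompanying code in the transcript, so here is a minimal sketch of both strategies. It assumes the classic `Policy` syntax used elsewhere in the deck (Polly v7 or earlier) and is written as a C# 9+ top-level program; the 200 ms limit and the simulated slow work are illustrative assumptions, not values from the talk.

```csharp
using System;
using System.Threading;
using Polly;
using Polly.Timeout;

// Optimistic: Polly cancels the token it passes in; the delegate must observe it.
var optimistic = Policy.Timeout(TimeSpan.FromMilliseconds(200), TimeoutStrategy.Optimistic);

bool optimisticTimedOut = false;
try
{
    optimistic.Execute(ct =>
    {
        // Simulated slow work (assumption): blocks until cancelled or 5 s pass.
        ct.WaitHandle.WaitOne(TimeSpan.FromSeconds(5));
        ct.ThrowIfCancellationRequested();
    }, CancellationToken.None);
}
catch (TimeoutRejectedException)
{
    optimisticTimedOut = true;
}

// Pessimistic: the caller stops waiting even though the delegate ignores cancellation.
var pessimistic = Policy.Timeout(TimeSpan.FromMilliseconds(200), TimeoutStrategy.Pessimistic);

bool pessimisticTimedOut = false;
try
{
    pessimistic.Execute(() => Thread.Sleep(TimeSpan.FromSeconds(2)));
}
catch (TimeoutRejectedException)
{
    pessimisticTimedOut = true;
}

Console.WriteLine($"optimistic timed out: {optimisticTimedOut}, pessimistic timed out: {pessimisticTimedOut}");
```

In both cases the caller sees a `TimeoutRejectedException`; the difference is whether the abandoned work is actually cancelled (optimistic) or merely left running in the background (pessimistic).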
  • 23. Fallback
      "Degrade gracefully"
      Provide a substitute value or substitute action in the event of failure.
      Triggers:
      ● Exception, e.g. if the user action threw an HttpRequestException, return "??"
      ● Specific return values, e.g. if the user action returned -1, return "??"
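  • 24. Fallback - code
      // Provide a substitute value, if execution faults.
      // (Policy<UserAvatar> is used because value fallbacks are defined on the generic policy builder.)
      Policy<UserAvatar>
          .Handle<Whatever>()
          .Fallback<UserAvatar>(UserAvatar.Blank);
      // Specify a func to provide a substitute value, if execution faults.
      Policy<UserAvatar>
          .Handle<Whatever>()
          .Fallback<UserAvatar>(() => UserAvatar.GetRandomAvatar());
      // Specify a substitute value or func, calling an action (e.g. for logging) if the fallback is invoked.
      Policy<UserAvatar>
          .Handle<Whatever>()
          .Fallback<UserAvatar>(UserAvatar.Blank, onFallback: (outcome, context) =>
          {
              // do something; outcome.Exception holds the handled exception
          });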
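The Fallback slides list "specific return values" as a trigger, but the code shown only handles exceptions. A minimal sketch of the result-based trigger, assuming the classic `Policy` syntax (Polly v7 or earlier) in a C# 9+ top-level program; the sentinel `-1` comes from the slide, while the substitute value `0` is an assumption chosen to keep the types consistent.

```csharp
using System;
using Polly;

// Treat a returned -1 as a failure and substitute 0 instead.
var fallback = Policy
    .HandleResult<int>(r => r == -1)
    .Fallback(0);

int substituted = fallback.Execute(() => -1);   // sentinel result triggers the fallback
int passedThrough = fallback.Execute(() => 42); // ordinary result is returned unchanged

Console.WriteLine($"{substituted}, {passedThrough}");
```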
  • 25. Fallback
      [Flowchart nodes: Start (execution requested) → Execute user delegate → Exception? → Fallback value? → Return result / Return fallback value → End]
  • 26. Cache
      "You've asked that one before"
      Provides a response from cache if known.
      ● Multiple cache providers
      ● Absolute expiration: expire after a given amount of time
      ● Sliding expiration: keep items that are being hit
  • 27. Cache
      [Flowchart: Start (execution requested) → Calculate cache key → On cache? Yes: Return cached value → End. No: Invoke user delegate → Put result in cache → Return result → End]
  • 28. Cache - code
      // Define a cache policy in the .NET Framework, using the Polly.Caching.Memory nuget package.
      var memoryCacheProvider = new MemoryCacheProvider(MemoryCache.Default);
      var cachePolicy = Policy.Cache(memoryCacheProvider, TimeSpan.FromMinutes(5));
      // Define a cache policy with absolute expiration at midnight tonight.
      var absoluteCachePolicy = Policy.Cache(memoryCacheProvider,
          new AbsoluteTtl(DateTimeOffset.Now.Date.AddDays(1)));
      // Define a cache policy with sliding expiration: items remain valid
      // for another 5 minutes each time the cache item is used.
      var slidingCachePolicy = Policy.Cache(memoryCacheProvider,
          new SlidingTtl(TimeSpan.FromMinutes(5)));
  • 30. Circuit breaker
      "Stop doing it if it hurts"
      Breaks the circuit (i.e. blocks executions) for a period, when faults exceed some pre-configured threshold.
  • 31. Circuit breaker state machine
      Closed:
      ● Executes the user action and returns the result
      ● Initial state
      Open:
      ● The user action will NOT be executed
      ● Fails fast by throwing a BrokenCircuitException
      ● Remains open until durationOfBreak elapses
      Half-open:
      ● The next call will be treated as a trial to determine the circuit's health
      ● If it throws an exception, the circuit returns to Open
      ● If there is no exception, the circuit transitions to Closed
  • 32. Circuit Breaker - code
      // Break the circuit after the specified number of consecutive exceptions
      // and keep circuit broken for the specified duration.
      Policy
          .Handle<SomeExceptionType>()
          .CircuitBreaker(2, TimeSpan.FromMinutes(1));
      // Break the circuit after the specified number of consecutive exceptions
      // and keep circuit broken for the specified duration,
      // calling an action on change of circuit state.
      Action<Exception, TimeSpan> onBreak = (exception, timespan) => { ... };
      Action onReset = () => { ... };
      CircuitBreakerPolicy breaker = Policy
          .Handle<SomeExceptionType>()
          .CircuitBreaker(2, TimeSpan.FromMinutes(1), onBreak, onReset);
  • 33. Manually breaking the circuit
      // Monitor the circuit state, for example for health reporting.
      CircuitState state = breaker.CircuitState;
      /*
      CircuitState.Closed   - Normal operation. Execution of actions allowed.
      CircuitState.Open     - The automated controller has opened the circuit. Execution of actions blocked.
      CircuitState.HalfOpen - Recovering from open state, after the automated break duration has expired.
                              Execution of actions permitted. Success of subsequent action/s controls
                              onward transition to Open or Closed state.
      CircuitState.Isolated - Circuit held manually in an open state. Execution of actions blocked.
      */
      // Manually open (and hold open) a circuit breaker - for example to manually isolate a downstream service.
      breaker.Isolate();
      // Reset the breaker to closed state, to start accepting actions again.
      breaker.Reset();
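To see the Closed → Open transition and the fail-fast behaviour from the state machine in action, the sketch below drives a breaker with a delegate that always fails. It assumes the classic `Policy` syntax (Polly v7 or earlier) in a C# 9+ top-level program; `InvalidOperationException` and the `failing` delegate stand in for a real dependency call and are purely illustrative.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

// Open the circuit after 2 consecutive handled exceptions, for 1 minute.
var breaker = Policy
    .Handle<InvalidOperationException>()
    .CircuitBreaker(2, TimeSpan.FromMinutes(1));

// Hypothetical dependency call that always fails.
Action failing = () => throw new InvalidOperationException("dependency down");

for (int i = 0; i < 2; i++)
{
    try { breaker.Execute(failing); }
    catch (InvalidOperationException) { /* the original exception is rethrown */ }
}

CircuitState stateAfterFailures = breaker.CircuitState;

// While the circuit is open, the delegate is not invoked at all.
bool failedFast = false;
try { breaker.Execute(failing); }
catch (BrokenCircuitException) { failedFast = true; }

// Manually close the circuit again.
breaker.Reset();
CircuitState stateAfterReset = breaker.CircuitState;

Console.WriteLine($"{stateAfterFailures}, failed fast: {failedFast}, {stateAfterReset}");
```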
  • 35. Bulkhead Isolation
      "One fault shouldn't sink the whole ship"
      Constrains the governed actions to a fixed-size resource pool, isolating their potential to affect others.
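The transcript shows no code for Bulkhead Isolation. A minimal sketch, assuming the classic `Policy` syntax (Polly v7 or earlier) in a C# 9+ top-level program: a bulkhead with a single execution slot and no queue rejects a second caller while the first is still running. The pool size of 1 and the `holdSlot` delegate are illustrative assumptions chosen to make the rejection deterministic.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Bulkhead;

// One concurrent execution allowed, no queue: extra callers are rejected immediately.
var bulkhead = Policy.Bulkhead(1);

using var entered = new ManualResetEventSlim();
using var release = new ManualResetEventSlim();

// Hypothetical long-running call that occupies the only slot until released.
Action holdSlot = () =>
{
    entered.Set();
    release.Wait();
};

Task first = Task.Run(() => bulkhead.Execute(holdSlot));
entered.Wait(); // the slot is now taken

bool rejected = false;
try
{
    bulkhead.Execute(() => Console.WriteLine("not reached while the slot is busy"));
}
catch (BulkheadRejectedException)
{
    rejected = true;
}

release.Set();
first.Wait();

Console.WriteLine($"second caller rejected: {rejected}");
```

A second overload, `Policy.Bulkhead(maxParallelization, maxQueuingActions)`, lets callers wait in a bounded queue instead of being rejected straight away.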
  • 36. PolicyWrap
      "Defense in depth"
      Allows multiple policies to be combined.
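As an illustration of combining policies, the sketch below wraps a retry inside a fallback, so the fallback value is only used once the retries are exhausted. It assumes the classic `Policy` syntax (Polly v7 or earlier) in a C# 9+ top-level program; the exception type, retry count and fallback string are illustrative assumptions.

```csharp
using System;
using Polly;

int attempts = 0;

// Outer policy: substitute a value if everything inside it fails.
var fallback = Policy<string>
    .Handle<InvalidOperationException>()
    .Fallback("fallback value");

// Inner policy: retry twice before letting the exception escape.
var retry = Policy<string>
    .Handle<InvalidOperationException>()
    .Retry(2);

// The outer policy wraps the inner one.
var resilient = fallback.Wrap(retry);

string result = resilient.Execute(() =>
{
    attempts++;
    throw new InvalidOperationException("dependency unavailable");
});

Console.WriteLine($"{result} after {attempts} attempts");
```

The order matters: the policy on which `Wrap` is called is the outermost one, so here the delegate runs once plus two retries before the fallback takes over.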
  • 40. “GOLD” sponsors
      Twitter: @PremiumMinds
      https://www.premium-minds.com
  • 43. http://bit.ly/netponto-aval-79
      * For those who cannot fill it in during the meeting, we will send an email with the link in the afternoon
  • 44. Upcoming in-person meetings
      07/04/2018 – Lisboa
      23/06/2018 – Lisboa
      15/09/2018 – Lisboa
      24/11/2018 – Lisboa
      Save these dates in your calendar! :)