Walter Rumsby nz.js(con) 22 June 2021
Observability
for JavaScript Developers
Hello
What is Observability?
Control theory
Observability is the ability to measure the
internal states of a system by examining
its outputs.
honeycomb.io
Observability allows you to answer
arbitrary questions about your distributed
system.
Why Should You Care?
Team Topologies
A Stream-aligned team is aligned to a flow of work from (usually) a
segment of the business domain.
A Platform team is a grouping of other team types that provide a
compelling internal product to accelerate delivery by
Stream-aligned teams.
A DevOps approach believes that developers will
deliver better software when they better understand
how their customers actually experience their
software. *
* Kubernetes is not required
Charity Majors
I don’t believe that you can be a senior engineer until you’ve spent enough time in
production.
Otherwise I don’t care how many data structures and algorithms you know, your
intuition will have been trained on something that isn’t real and that means I can’t
trust it.
10 PRINT "WALTER RULES"
20 GOTO 10
Observability tightens the
feedback loop.
How to Use Observability to Make
Decisions
Charity Majors & Liz Fong-Jones from 97 Things Every SRE Should Know
Cardinality matters for observability, because
high-cardinality information is the most useful
data for debugging or understanding a system.
Consider the usefulness of sorting by fields like
user IDs, shopping cart IDs, request IDs, or
myriad other IDs such as instances, container,
build number, spans and so forth.
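As a rough illustration of the idea in the quote above, the sketch below counts the distinct values of each field across a handful of made-up events. The event shape and the field names (`region`, `userId`, `requestId`) are assumptions for illustration only; they are not taken from the talk or the book.

```javascript
// Hypothetical events; the field names are illustrative assumptions.
const events = [
  { region: 'ap-southeast-2', userId: 'u-1', requestId: 'r-1' },
  { region: 'ap-southeast-2', userId: 'u-2', requestId: 'r-2' },
  { region: 'ap-southeast-2', userId: 'u-1', requestId: 'r-3' },
  { region: 'us-east-1', userId: 'u-3', requestId: 'r-4' },
];

// Cardinality of a field = number of distinct values it takes in the set.
function cardinality(items, field) {
  return new Set(items.map((item) => item[field])).size;
}

// Report the cardinality of every field that appears in the events.
function cardinalityByField(items) {
  const fields = new Set(items.flatMap((item) => Object.keys(item)));
  const result = {};
  for (const field of fields) {
    result[field] = cardinality(items, field);
  }
  return result;
}

console.log(cardinalityByField(events));
```

Here `requestId` has a different value for every event, so filtering on it narrows things down to a single request, while `region` has only two values and barely narrows anything.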
SELECT * FROM AwsLambdaInvocation WHERE
displayName LIKE 'my-service-production-%' AND userId =
'd83efc35-ef32-474f-a749-3469fb8fe2fc' SINCE 3 hours ago
LIMIT MAX
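For `userId` to be queryable like this, it has to be recorded as a custom attribute while the function runs. The following is a minimal sketch, assuming an agent object that exposes an `addCustomAttribute(key, value)` method (the New Relic Node agent documents a method with that name, but check the API documentation for the agent you use). The `withCustomAttributes` wrapper, the event shape, and the `cartId` field are hypothetical, and a stub agent is used so the example runs without any vendor SDK installed.

```javascript
// Wrap a handler so that domain-specific identifiers are recorded as custom
// attributes before the handler runs. `agent` is injected so this sketch can
// run with a stub; in a real service it would be your observability agent.
function withCustomAttributes(agent, pickAttributes, handler) {
  return async function instrumentedHandler(event) {
    const attributes = pickAttributes(event);
    for (const [key, value] of Object.entries(attributes)) {
      // Skip missing values rather than recording "undefined".
      if (value !== undefined && value !== null) {
        agent.addCustomAttribute(key, value);
      }
    }
    return handler(event);
  };
}

// Stub agent that only remembers what was recorded (an assumption standing in
// for a real agent).
function createStubAgent() {
  const recorded = {};
  return {
    recorded,
    addCustomAttribute(key, value) {
      recorded[key] = value;
    },
  };
}

// Hypothetical event shape and handler.
const stubAgent = createStubAgent();
const exampleHandler = withCustomAttributes(
  stubAgent,
  (event) => ({ userId: event.userId, cartId: event.cartId }),
  async (event) => ({ statusCode: 200, body: `hello ${event.userId}` })
);

exampleHandler({ userId: 'd83efc35-ef32-474f-a749-3469fb8fe2fc' }).then((response) => {
  console.log(response.statusCode, stubAgent.recorded);
});
```

Keeping the choice of attributes in one small function (`pickAttributes` here) makes it easier to see at a glance which facets of a request can be queried later.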
SELECT uniques(customAttributeKeys) FROM
AwsLambdaInvocation WHERE displayName LIKE
'my-service-production-%' SINCE 1 week ago
Things to Look Out For
Quotas, reserved words & solutions
that aren’t instrumented.
In Conclusion
More Info
@wrumsby
@mipsytipsy
O11ycast
opentelemetry.io
Team Topologies
Accelerate
conqahq.com/careers

Editor's Notes

  1. Hello. Today I want to talk to you about observability and why JavaScript developers should understand this topic. I want to cover: * What is observability? * Why should you care? * How to use observability to understand how your system works and make decisions * Where to find out more
  2. So, what is observability? The canonical definition is:
  3. but like other terms in our industry different people have different definitions, particularly if they don't fully understand the canonical definition, so a definition I like that might be easier to understand is:
  4. The two key words there are "arbitrary" and "distributed". "Arbitrary" is important because you want to be able to answer questions you haven't anticipated. You might have experience with Application Performance Monitoring tools that give you insight into diagnostic information about memory use, CPU utilisation, etc. Observability goes beyond monitoring to allow you to answer questions that are specific to your problem domain. We'll see some examples later of the sorts of arbitrary questions you might want to answer. "Distributed" is important because observability becomes more important as your system becomes more complex. If a request from a browser leads to multiple calls across multiple services and databases observability tools help you group these together as a single "distributed trace" so that you can see end-to-end what is happening. Again, we'll see some examples of distributed tracing later on.
  5. Maybe you work for an organisation where teams are structured like this. This diagram uses the team shape conventions from the book "Team Topologies". I really recommend this book if you're responsible for organising software teams. The development teams in this diagram are "Stream-aligned" teams and the operations team is a "Platform" team. Let's say you're in one of several development teams working on developing product and when you're ready to release you let the operations team know and they push that code to production, they monitor the production environment and they address any uptime issues with the system. Sounds pretty good, right? Let's go back and look at definitions that come from Team Topologies to better understand the team types:
  6. One of the motivations for a separate operations team is believing that this helps accelerate delivery of those Stream-aligned teams. So who knows the most about what's going on for your users in that situation? The operations team! Is that what you want? This means you and your teammates in the development team are disconnected from what might be frustrating the users of your software. You might even be unaware of things the operations team are doing to keep your software running. Let's say you use PostgreSQL as your application database. In which team would you want the person who has the most PostgreSQL experience working? If they're in the operations team then they might be having to plug holes in your team's solution that would have been avoided if that knowledge had been in your team as part of planning and development. If the operations team are serving multiple teams they also have to balance the needs of those other teams. That might mean there's a gap between when your team submits code that's ready to release and when it can be released. That doesn't seem like it helps accelerate delivery? I don't mean to hate on operations people here. In my experience they're smart people doing their best to make sure systems are available for users, but there are clearly some unintended consequences of this way of working and as our tools and practices have evolved we can consider other ways of working. What might have been the best solution 10 or 20 years ago might no longer be the best solution today.
  7. Over the last 5-10 years there has been a lot of talk about DevOps, but I would argue a lot of people don't understand the point of DevOps. A bit like our definition of observability above, the meaning of DevOps has become blurred. DevOps doesn't exist so you can use Kubernetes, or write lots of Bash scripts. DevOps is about bringing developers closer to operations because operations is closer to the customers. > DevOps is about bringing developers closer to customers. A DevOps approach believes that developers will deliver better software when they better understand how their customers actually experience their software. This means that a Stream-aligned team working according to DevOps principles is responsible for all of the following:
  8. With this model a "development" team has to also handle responsibility for: * release * configure * monitor * and including operational concerns in planning
  9. Change can be pretty scary and a model that can help traditional development teams transition to development teams working with a DevOps approach is to have a DevOps team acting as an "enabling" team bringing up the skills and capabilities of development teams. I don't believe that having a DevOps team like this is the end goal - it's part of the journey, not the destination. Still, change is scary and it's easy to forget why we're doing this. We're doing this because we believe it results in better software and there's research that backs this up.
  10. Monitoring is a responsibility for development teams following a DevOps approach. Observability takes monitoring beyond traditional measures like CPU time, memory use, etc.
  11. If you're interested in Observability then Charity Majors is someone who you should get to know. Charity is the CTO at Honeycomb and Honeycomb are a vendor in the observability space. I was recently watching some presentations from the 2017 Monitorama conference when Charity made this point: Emphasising this point, in an episode of the O11ycast podcast last year Charity said:
  12. That might sound controversial and I'm probably not helping by highlighting that quote out of context, but in context it's a statement I agree with. The thing that got me hooked on computer programming when I was 8 or 9 years old is that you could write some code and see the impact of it straight away. At that point in time I was probably writing stuff like:
  13. and I think that's why so much of my career involved writing UI code. There's this fast feedback loop you can get from changing a UI - does it work better? Does it look better? Sometimes it goes terribly wrong even though I didn't plan for that to be true. I believe what Charity is saying is that you want to get the same sort of feedback for your non-UI code. Once it's in production you want to be able to tell that it's working by observing it, and observability tools help us do that.
  14. There's one last point I want to make before looking at some tooling. A week or two ago I heard about a 2014 study that found that effective listening was the single highest contributor to team effectiveness, i.e. teams in environments where people felt like their team listened to each other and understood what was going on performed better than those in environments where that was not the case. I'd like to suggest more broadly that teams that observe and are aware of what's going on are more effective and that's why an approach that includes observability should lead to better software.
  15. I'm going to use New Relic as an observability solution. I'm choosing New Relic because it's the system I'm most familiar with. I'm not saying it's the best system or it's the right system for you. For any of these systems that offer a Node agent you'll likely find an agent on npm with fairly standard installation instructions.
  16. If I install the agent and use it to monitor an Express application, I'll get information about HTTP requests, memory use, etc. If I'm directly calling a database that the agent knows how to instrument I'll get some information about those database queries too. New Relic also supports monitoring of AWS Lambda functions, so with that setup I can see CloudWatch metrics like cold starts, memory use, etc.
  17. Here's a summary of a lambda invocation which is really just a high-level overview of what's going on, but I can already get an idea of where the work is being done and notice if there's something that stands out in terms of how long it's taking. Out of the box these tools do a lot for you, but my recommendation is that you look at the API documentation for your agent and search for the term "custom", because "custom" is where interesting things happen.
  18. Custom instrumentation, custom transactions and custom metrics are relatively "low level". For instance, you'd use custom instrumentation if you wanted to support monitoring for a database that the standard agent doesn't currently support. Custom events and in particular custom attributes are what's generally most useful. Other tools support similar ways of tracking custom information, maybe they call this "custom data" or something else. Your key word here is "custom" which indicates measuring something particular about your solution that the standard measures aren't aware of. What sort of something might you want to track?
  19. In the book [97 Things Every SRE Should Know](https://www.oreilly.com/library/view/97-things-every/9781492081487/ch14.html) Charity Majors and Honeycomb developer advocate Liz Fong-Jones talk about the concept of "cardinality" which refers to the number of unique items in a set. If there are a large number of unique items in a set then that set has high cardinality. From the book: … So, if I track user ID or shopping cart ID as a custom attribute of a lambda function invocation, when New Relic records the invocation I now have these additional attributes available. Out of the box New Relic doesn't know that user ID or shopping cart ID are important facets of my code, custom attributes let me add that information.
  20. New Relic has its own query language, NRQL, that allows you to query browser events, lambda invocations, even infrastructure events like available disk space on a server, e.g. I can write a query like: … `AwsLambdaInvocation` is a standard transaction provided by New Relic and the standard attributes captured by New Relic are documented in the [attribute dictionary](https://docs.newrelic.com/attribute-dictionary/?event=AwsLambdaInvocation). `userId` is a custom attribute that has been added to the standard attributes. If I'm aware that a particular user has been experiencing issues or I'm just trying to understand what's going on from my own testing of the system this attribute provides a way of filtering down the transactions to just those I'm interested in.
  21. I can even query the `customAttributeKeys` attribute on a transaction to work out which custom attributes are being tracked. NRQL can also be used to create alerts and dashboards. Alerts tell you when something's not right and can be set up to send a notification to Slack, or an incident response tool like PagerDuty, OpsGenie or VictorOps.
  22. The other key word in our definition of observability above was "distributed". Distributed tracing helps give you visibility into distributed systems - systems where there's more than one service doing the work. Let's start with some simple traces, ones that aren't distributed, we'll look at how they can help you ask questions, then we'll look at distributed tracing. Here is a trace of a Lambda function invocation. This is actually from our production environment last week. … What questions does this raise? Does anything look wrong here? What stands out for me is that there are 44 spans, none of them seem anomalous, the total duration here arguably isn't so bad, but do we know what's leading to those 44 spans? Are there situations where this function might perform poorly or generate additional spans? It turns out, yes, there is.
  23. If you have Lambda functions triggered by a REST API call, API Gateway will time out after 29 seconds, so we have alerts set up to warn us if functions take longer than 24 seconds to run. This particular alert prompted someone on our product engineering team to respond to the event and look into the cause of the issue. Changing the operation to perform a bulk get rather than multiple individual gets changes things so they look like this:
  24. Now there are only 5 spans and the function invocation takes about half as long as it did before. With this change we're spending less on DynamoDB and our customers are getting responses faster. Our software is better. Tracing can help you identify issues and resolve them. _Distributed_ tracing helps you understand calls when multiple services are involved.
  25. Here we can see how a call to one service invokes 2 other services. Different colours denote different services which means we can tell that service B runs a single query against DynamoDB that takes 21ms. Where we have an outline is where there is a discrepancy in the time taken measured from the perspective of the caller vs the perspective of the called function. This can be for a number of reasons: * DNS resolution * cold start * time taken to receive the request or send the response (we are talking about ms here) * other causes of network latency Once you understand these visualisations they help you spot things that aren't right, ask questions and can prompt you to try alternative solutions to address any issues.
  26. A DevOps mindset means development teams are responsible for building and running software (aka "you build it you run it"). By giving that responsibility to development teams those teams can make decisions that are more appropriate for their context. Part of running software is understanding how that software works in production and observability tools help you better understand what's going on with your software by allowing you to answer arbitrary questions of distributed systems.
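
Notes 23 and 24 describe replacing multiple individual gets with a bulk get. The sketch below shows the general shape of that change under stated assumptions: the data store is an in-memory stand-in that counts round trips, and `getAll`, `createFakeStore` and the item shape are hypothetical names, not the code from the talk. The chunk size reflects the fact that DynamoDB's BatchGetItem accepts at most 100 keys per request.

```javascript
// DynamoDB's BatchGetItem accepts at most 100 keys per request.
const MAX_BATCH_SIZE = 100;

// Split a list into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Fetch many items with one call per chunk instead of one call per key.
// `batchGet` is injected: it takes an array of keys and resolves to an array of items.
async function getAll(keys, batchGet, batchSize = MAX_BATCH_SIZE) {
  const results = [];
  for (const batch of chunk(keys, batchSize)) {
    results.push(...(await batchGet(batch)));
  }
  return results;
}

// In-memory stand-in for the data store that counts round trips (hypothetical).
function createFakeStore(items) {
  const byId = new Map(items.map((item) => [item.id, item]));
  const store = {
    calls: 0,
    async get(id) {
      store.calls += 1;
      return byId.get(id);
    },
    async batchGet(ids) {
      store.calls += 1;
      return ids.map((id) => byId.get(id)).filter(Boolean);
    },
  };
  return store;
}

async function main() {
  const ids = Array.from({ length: 40 }, (_, i) => `item-${i}`);
  const store = createFakeStore(ids.map((id) => ({ id })));

  // Before: one round trip (and one span) per key.
  for (const id of ids) {
    await store.get(id);
  }
  const individualCalls = store.calls;

  // After: one round trip per chunk of keys.
  store.calls = 0;
  await getAll(ids, (batch) => store.batchGet(batch));
  console.log({ individualCalls, batchedCalls: store.calls });
}

main();
```

A real BatchGetItem call can also return unprocessed keys that need to be retried, which this sketch leaves out for brevity.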