SlideShare a Scribd company logo
1 of 54
Download to read offline
~ @gianarb - https://gianarb.it ~
Debug like a pro on
Kubernetes
Golab - Florence - 2018
~ @gianarb - https://gianarb.it ~
Gianluca Arbezzano
Site Reliability Engineer @InfluxData
● http://gianarb.it
● @gianarb
What I like:
● I make dirty hacks that look awesome
● I grow my vegetables 🍅🌻🍆
● Travel for fun and work
1. Yo n !
Your team knows
and use Docker for
local development
and testing
2. Kub te !
Everyone speaks
about kubernetes.
3. Hir !
You don’t know why
but you hired a
DevOps that kind of
know k8s.
3. Ex i m !
You are moving
everything and
everyone to
kubernetes
Inspired by a true story
You need a good book
1. Short
2. Driven by experiences
3. Practical
4. Easy
We need to make our
hands dirty
Spin up a cluster that you
can break
Bring developers in the loop
Deploy CI on Kubernetes
Bring developers in the loop
Run your code in prod
Bring developers in the loop
Don’t be scared and write your
own tools!
K8s as code: From YAML to code (golang)
1. You have the ability to use Golang autocomplete as documentation, reference for every
kubernetes resources
2. You feel less a YAML engineer (great feeling btw)
3. Code is better than YAML! You can reuse it, compile it, embed it in other projects.
K8s as code: From YAML to code (golang)
Tiny cli
to make
the
migration
to golang
Some
manual
refactoring
K8s as code: From YAML to code (golang)
Tiny cli
to make
the
migration
to golang
Some
manual
refactoring
● Continue to improve our CI to validate that YAML and Go file are the same,
and the resources in Kubernetes are like the Go file.
● Maybe we will be able to remove the YAML at some point.
Examples
● Everything has an API because you should USE it TO make something good!
(cURL is good but you can make something better)
● Some of our tools:
○ Backup and Restore Operator for Persistent Volumes
○ We have a service to create runtime isolated environment to allow devs to test or product
people to have a safe environment to demo, try. We also use it for the integration and smoke
tests.
○ We have a tool to replicate environment locally on minikube to install and configure all the
dependencies
Instrumentation and
Observability
We need to have processes and tools that give us
the ability to take a real time picture of our system
Observability
It is all about how we collect and aggregate the data
Normal state vs Current state
Bring developers in the loop
You need knowledgeable devs to drive the team
Observability
Events, metrics, logs and traces
Golang and Kubernetes: OpenCensus
● Open Source project sponsored by Google
● It is a SPEC plus a set of libraries in different languages to instrument your
application
● To collect metrics, traces and events.
Golang and Kubernetes: OpenCensus
Common
Interface to
get stats and
traces from
your app
Different
exporters to
persist your
data
gianarb.it ~ @gianarb
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"} 3 1395066363000
# Escaping in label values:
msdos_file_access_time_seconds{path="C:DIRFILE.TXT",error="Cannot find file:n"FILE.TXT""}
1.458255915e9
# Minimalistic line:
metric_without_timestamp_and_labels 12.47
# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045
# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320
OpenMetrics
v2 Prometheus exposition format
gianarb.it ~ @gianarb
gianarb.it ~ @gianarb
func FetchMetricFamilies(url string, ch chan<- *dto.MetricFamily, certificate string, key string, skipServerCertCheck bool) error {
defer close(ch)
var transport *http.Transport
if certificate != "" && key != "" {
cert, err := tls.LoadX509KeyPair(certificate, key)
if err != nil {
return err
}
tlsConfig := &tls.Config{
Certificates: []tls.Certificate{cert},
InsecureSkipVerify: skipServerCertCheck,
}
tlsConfig.BuildNameToCertificate()
transport = &http.Transport{TLSClientConfig: tlsConfig}
} else {
transport = &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: skipServerCertCheck},
}
}
client := &http.Client{Transport: transport}
return decodeContent(client, url, ch)
}
https://github.com/prometheus/prom2json/blob/master/prom2json.go#L123
gianarb.it ~ @gianarb
More to read
● OpenMetrics: https://github.com/OpenObservability/OpenMetrics
● OpenMetrics mailing list: https://groups.google.com/forum/#protectkern-.
1667emrelaxforum/openmetrics
● WIP branch for Python library
https://github.com/prometheus/client_python/tree/openmetrics
● Thanks RICHARD for your work! I got some slides from here:
https://promcon.io/2018-munich/slides/openmetrics-transforming-the-prometh
eus-exposition-format-into-a-global-standard.pdf
Tracing
How do you “tell stories” about concurrent
systems?
How do you “tell stories” about
concurrent systems?
OpenTracing
gianarb.it ~ @gianarb
OpenTracing
I was just waiting for a new
standard!
cit. Troll
© 2017 InfluxData. All rights reserved.34
Typical problems with logs
¨ Which library do I need to use?
¨ Every library has a different format
¨ Every languages exposes a different format
© 2017 InfluxData. All rights reserved.35
Tracing is not something new
¨ There are vendors
¨ Every vendor has their own format
© 2017 InfluxData. All rights reserved.36
log log log
log log
log
Parent Span Span Context / Baggage
Child
Child
Child Span
¨ Spans - Basic unit of timing and causality. Can be tagged with
key/value pairs.
¨ Logs - Structured data recorded on a span.
¨ Span Context - serializable format for linking spans across network
boundaries. Carries baggage, such as a request and client IDs.
¨ Tracers - Anything that plugs into the OpenTracing API to record
information.
¨ ZipKin, Jaeger, LightStep, others
¨ Also metrics (Prometheus) and logging
© 2017 InfluxData. All rights reserved.37
OpenTracing
API
application logic
µ-service frameworks
Lambda functions
RPC & control-flow frameworks
existing instrumentation
tracing infrastructure
main()
I N S T A N A
J a e g e r
microservice process
© 2017 InfluxData. All rights reserved.38
import "github.com/opentracing/opentracing-go"
import ".../some_tracing_impl"
func main() {
opentracing.SetGlobalTracer(
// tracing impl specific:
some_tracing_impl.New(...),
)
...
}
https://github.com/opentracing/opentracing-go
Opentracing: Configure the GlobalTracer
© 2017 InfluxData. All rights reserved.39
func xyz(ctx context.Context, ...) {
...
span, ctx := opentracing.StartSpanFromContext(ctx, "operation_name")
defer span.Finish()
span.LogFields(
log.String("event", "soft error"),
log.String("type", "cache timeout"),
log.Int("waited.millis", 1500))
...
}
https://github.com/opentracing/opentracing-go
Opentracing: Create a Span from the Context
© 2017 InfluxData. All rights reserved.40
func xyz(parentSpan opentracing.Span, ...) {
...
sp := opentracing.StartSpan(
"operation_name",
opentracing.ChildOf(parentSpan.Context()))
defer sp.Finish()
...
}
https://github.com/opentracing/opentracing-go
Opentracing: Create a Child Span
Golang and Kubernetes: pprof
● It is the Golang native profiler
● You can use it via the `go pprof` command
● `import "runtime/pprof"` writes runtime profiling data
● `import "net/http/pprof"` serves via HTTP server runtime profiling data
Golang and Kubernetes: pprof
package main
import (
"log"
"net/http"
_ "net/http/pprof"
)
func main() {
log.Println(http.ListenAndServe("localhost:6060", nil))
}
$ go tool pprof http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in
/home/gianarb/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.009.pb.gz
File: main
Type: inuse_space
Time: Oct 15, 2018 at 4:22pm (CEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) png
Generating report in profile001.png
Golang and Kubernetes: pprof
https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go
// handleProfiles determines which profile to return to the requester.
func (h *Handler) handleProfiles(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/debug/pprof/cmdline":
httppprof.Cmdline(w, r)
case "/debug/pprof/profile":
httppprof.Profile(w, r)
case "/debug/pprof/symbol":
httppprof.Symbol(w, r)
case "/debug/pprof/all":
h.archiveProfilesAndQueries(w, r)
default:
httppprof.Index(w, r)
}
}
Golang and Kubernetes: pprof
https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go
// archiveProfilesAndQueries collects the following profiles:
// - goroutine profile
// - heap profile
// - blocking profile
// - mutex profile
// - (optionally) CPU profile
//
// It also collects the following query results:
//
// - SHOW SHARDS
// - SHOW STATS
// - SHOW DIAGNOSTICS
//
// All information is added to a tar archive and then compressed, before being
// returned to the requester as an archive file.
Golang and Kubernetes: pprof
https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go
var allProfs = []*prof{
{Name: "goroutine", Debug: 1},
{Name: "block", Debug: 1},
{Name: "mutex", Debug: 1},
{Name: "heap", Debug: 1},
}
gz := gzip.NewWriter(&resp)
tw := tar.NewWriter(gz)
// Collect and write out profiles.
for _, profile := range allProfs {
if profile.Name == "cpu" {
if err := pprof.StartCPUProfile(&buf); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
sleep(w, time.Duration(profile.Debug)*time.Second)
pprof.StopCPUProfile()
} else {
prof := pprof.Lookup(profile.Name)
if prof == nil {
http.Error(w, "unable to find profile "+profile.Name, http.StatusInternalServerError)
return
}
if err := prof.WriteTo(&buf, int(profile.Debug)); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
}
Golang and Kubernetes: pprof
https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go
// Collect and write out the queries.
var allQueries = []struct {
name string
fn func() ([]*models.Row, error)
}{
{"shards", h.showShards},
{"stats", h.showStats},
{"diagnostics", h.showDiagnostics},
}
tabW := tabwriter.NewWriter(&buf, 8, 8, 1, 't', 0)
for _, query := range allQueries {
rows, err := query.fn()
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
func (h *Handler) showDiagnostics() ([]*models.Row, error) {
diags, err := h.Monitor.Diagnostics()
if err != nil {
return nil, err
}
// Get a sorted list of diagnostics keys.
sortedKeys := make([]string, 0, len(diags))
for k := range diags {
sortedKeys = append(sortedKeys, k)
}
sort.Strings(sortedKeys)
rows := make([]*models.Row, 0, len(diags))
for _, k := range sortedKeys {
row := &models.Row{Name: k}
row.Columns = diags[k].Columns
row.Values = diags[k].Rows
rows = append(rows, row)
}
return rows, nil
}
Remote Debugging
in Kubernetes
Delve, GDB
Remote Debugging
In Kubernetes
DOESN’T APPLY!
That’s not totally true. You can expose the debugger
endpoint. It works. It doesn’t apply for production
workload.
~ @gianarb - https://gianarb.it ~
Thanks
@gianarb

More Related Content

What's hot

Using kubernetes to lose your fear of using containers
Using kubernetes to lose your fear of using containersUsing kubernetes to lose your fear of using containers
Using kubernetes to lose your fear of using containersjosfuecas
 
Kubernetes stack reliability
Kubernetes stack reliabilityKubernetes stack reliability
Kubernetes stack reliabilityOleg Chunikhin
 
WTF Do We Need a Service Mesh?
WTF Do We Need a Service Mesh? WTF Do We Need a Service Mesh?
WTF Do We Need a Service Mesh? Anton Weiss
 
Lifecycle of a pod
Lifecycle of a podLifecycle of a pod
Lifecycle of a podHarshal Shah
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPassPeter Vandenabeele
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Jace Liang
 
Kubernetes &amp; the 12 factor cloud apps
Kubernetes &amp; the 12 factor cloud appsKubernetes &amp; the 12 factor cloud apps
Kubernetes &amp; the 12 factor cloud appsAna-Maria Mihalceanu
 
Building Portable Applications with Kubernetes
Building Portable Applications with KubernetesBuilding Portable Applications with Kubernetes
Building Portable Applications with KubernetesKublr
 
Effective Building your Platform with Kubernetes == Keep it Simple
Effective Building your Platform with Kubernetes == Keep it Simple Effective Building your Platform with Kubernetes == Keep it Simple
Effective Building your Platform with Kubernetes == Keep it Simple Wojciech Barczyński
 
Kubernetes extensibility
Kubernetes extensibilityKubernetes extensibility
Kubernetes extensibilityDocker, Inc.
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingSreenivas Makam
 
K8s best practices from the field!
K8s best practices from the field!K8s best practices from the field!
K8s best practices from the field!DoiT International
 
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an Enterprise
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an EnterpriseKubeCon EU 2016: ITNW (If This Now What): Orchestrating an Enterprise
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an EnterpriseKubeAcademy
 
Network services on Kubernetes on premise
Network services on Kubernetes on premiseNetwork services on Kubernetes on premise
Network services on Kubernetes on premiseHans Duedal
 
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-step
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-stepSetting up CI/CD pipeline with Kubernetes and Kublr step-by-step
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-stepOleg Chunikhin
 
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB
 

What's hot (20)

Using kubernetes to lose your fear of using containers
Using kubernetes to lose your fear of using containersUsing kubernetes to lose your fear of using containers
Using kubernetes to lose your fear of using containers
 
Kubernetes stack reliability
Kubernetes stack reliabilityKubernetes stack reliability
Kubernetes stack reliability
 
WTF Do We Need a Service Mesh?
WTF Do We Need a Service Mesh? WTF Do We Need a Service Mesh?
WTF Do We Need a Service Mesh?
 
Lifecycle of a pod
Lifecycle of a podLifecycle of a pod
Lifecycle of a pod
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Kubernetes Logging
Kubernetes LoggingKubernetes Logging
Kubernetes Logging
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology
 
Kubernetes &amp; the 12 factor cloud apps
Kubernetes &amp; the 12 factor cloud appsKubernetes &amp; the 12 factor cloud apps
Kubernetes &amp; the 12 factor cloud apps
 
Building Portable Applications with Kubernetes
Building Portable Applications with KubernetesBuilding Portable Applications with Kubernetes
Building Portable Applications with Kubernetes
 
Effective Building your Platform with Kubernetes == Keep it Simple
Effective Building your Platform with Kubernetes == Keep it Simple Effective Building your Platform with Kubernetes == Keep it Simple
Effective Building your Platform with Kubernetes == Keep it Simple
 
Kubernetes extensibility
Kubernetes extensibilityKubernetes extensibility
Kubernetes extensibility
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
 
K8s best practices from the field!
K8s best practices from the field!K8s best practices from the field!
K8s best practices from the field!
 
Ingress overview
Ingress overviewIngress overview
Ingress overview
 
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an Enterprise
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an EnterpriseKubeCon EU 2016: ITNW (If This Now What): Orchestrating an Enterprise
KubeCon EU 2016: ITNW (If This Now What): Orchestrating an Enterprise
 
Network services on Kubernetes on premise
Network services on Kubernetes on premiseNetwork services on Kubernetes on premise
Network services on Kubernetes on premise
 
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-step
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-stepSetting up CI/CD pipeline with Kubernetes and Kublr step-by-step
Setting up CI/CD pipeline with Kubernetes and Kublr step-by-step
 
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 

Similar to Kubernetes debug like a pro

Fullstack workshop
Fullstack workshopFullstack workshop
Fullstack workshopAssaf Gannon
 
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...DevOps_Fest
 
202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUP202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUPRonald Hsu
 
Why we chose Argo Workflow to scale DevOps at InVision
Why we chose Argo Workflow to scale DevOps at InVisionWhy we chose Argo Workflow to scale DevOps at InVision
Why we chose Argo Workflow to scale DevOps at InVisionNebulaworks
 
GraalVM and MicroProfile - A Polyglot Microservices Solution
GraalVM and MicroProfile - A Polyglot Microservices SolutionGraalVM and MicroProfile - A Polyglot Microservices Solution
GraalVM and MicroProfile - A Polyglot Microservices SolutionRoberto Cortez
 
OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringGianluca Arbezzano
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoNETWAYS
 
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...César Hernández
 
Splunk n-box-splunk conf-2017
Splunk n-box-splunk conf-2017Splunk n-box-splunk conf-2017
Splunk n-box-splunk conf-2017Mohamad Hassan
 
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...Flink Forward
 
OWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA TestersOWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA TestersJavan Rasokat
 
Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Marco Pas
 
Declarative Infrastructure Tools
Declarative Infrastructure Tools Declarative Infrastructure Tools
Declarative Infrastructure Tools Yulia Shcherbachova
 
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using SwiftDiego Freniche Brito
 
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaKerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaSpark Summit
 
Release with confidence
Release with confidenceRelease with confidence
Release with confidenceJohn Congdon
 
Async programming and python
Async programming and pythonAsync programming and python
Async programming and pythonChetan Giridhar
 
maxbox starter72 multilanguage coding
maxbox starter72 multilanguage codingmaxbox starter72 multilanguage coding
maxbox starter72 multilanguage codingMax Kleiner
 
Writing multi-language documentation using Sphinx
Writing multi-language documentation using SphinxWriting multi-language documentation using Sphinx
Writing multi-language documentation using SphinxMarkus Zapke-Gründemann
 
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...Andrey Devyatkin
 

Similar to Kubernetes debug like a pro (20)

Fullstack workshop
Fullstack workshopFullstack workshop
Fullstack workshop
 
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...
DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned fr...
 
202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUP202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUP
 
Why we chose Argo Workflow to scale DevOps at InVision
Why we chose Argo Workflow to scale DevOps at InVisionWhy we chose Argo Workflow to scale DevOps at InVision
Why we chose Argo Workflow to scale DevOps at InVision
 
GraalVM and MicroProfile - A Polyglot Microservices Solution
GraalVM and MicroProfile - A Polyglot Microservices SolutionGraalVM and MicroProfile - A Polyglot Microservices Solution
GraalVM and MicroProfile - A Polyglot Microservices Solution
 
OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoring
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
 
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...
2018 (codeone) Graal VM and MicroProfile a polyglot microservices solution [d...
 
Splunk n-box-splunk conf-2017
Splunk n-box-splunk conf-2017Splunk n-box-splunk conf-2017
Splunk n-box-splunk conf-2017
 
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...
Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Fli...
 
OWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA TestersOWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA Testers
 
Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)
 
Declarative Infrastructure Tools
Declarative Infrastructure Tools Declarative Infrastructure Tools
Declarative Infrastructure Tools
 
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
 
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaKerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
 
Release with confidence
Release with confidenceRelease with confidence
Release with confidence
 
Async programming and python
Async programming and pythonAsync programming and python
Async programming and python
 
maxbox starter72 multilanguage coding
maxbox starter72 multilanguage codingmaxbox starter72 multilanguage coding
maxbox starter72 multilanguage coding
 
Writing multi-language documentation using Sphinx
Writing multi-language documentation using SphinxWriting multi-language documentation using Sphinx
Writing multi-language documentation using Sphinx
 
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...
HashiCorp Vault configuration as code via HashiCorp Terraform- stories from t...
 

More from Gianluca Arbezzano

Value of your metrics: goodbye monitoring, welcome observability
Value of your metrics: goodbye monitoring, welcome observabilityValue of your metrics: goodbye monitoring, welcome observability
Value of your metrics: goodbye monitoring, welcome observabilityGianluca Arbezzano
 
InfluxCloudi craft container orchestrator
InfluxCloudi craft container orchestratorInfluxCloudi craft container orchestrator
InfluxCloudi craft container orchestratorGianluca Arbezzano
 
Orbiter and how to extend Docker Swarm
Orbiter and how to extend Docker SwarmOrbiter and how to extend Docker Swarm
Orbiter and how to extend Docker SwarmGianluca Arbezzano
 
Overview and Opentracing in theory by Gianluca Arbezzano
Overview and Opentracing in theory by Gianluca ArbezzanoOverview and Opentracing in theory by Gianluca Arbezzano
Overview and Opentracing in theory by Gianluca ArbezzanoGianluca Arbezzano
 
Monitoring Pull vs Push, InfluxDB and Prometheus
Monitoring Pull vs Push, InfluxDB and PrometheusMonitoring Pull vs Push, InfluxDB and Prometheus
Monitoring Pull vs Push, InfluxDB and PrometheusGianluca Arbezzano
 
Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017Gianluca Arbezzano
 
Security Tips to run Docker in Production
Security Tips to run Docker in ProductionSecurity Tips to run Docker in Production
Security Tips to run Docker in ProductionGianluca Arbezzano
 
Jenkins in the real world - DevOpsCon 2017
Jenkins in the real world - DevOpsCon 2017Jenkins in the real world - DevOpsCon 2017
Jenkins in the real world - DevOpsCon 2017Gianluca Arbezzano
 
Monitor your application and sleep
Monitor your application and sleepMonitor your application and sleep
Monitor your application and sleepGianluca Arbezzano
 
Tick Stack - Listen your infrastructure and please sleep
Tick Stack - Listen your infrastructure and please sleepTick Stack - Listen your infrastructure and please sleep
Tick Stack - Listen your infrastructure and please sleepGianluca Arbezzano
 
Docker Novosibirsk Meetup #3 - Docker in Production
Docker Novosibirsk Meetup #3 - Docker in ProductionDocker Novosibirsk Meetup #3 - Docker in Production
Docker Novosibirsk Meetup #3 - Docker in ProductionGianluca Arbezzano
 
DockerDublin Meetup - News about Docker 1.13
DockerDublin Meetup -  News about Docker 1.13DockerDublin Meetup -  News about Docker 1.13
DockerDublin Meetup - News about Docker 1.13Gianluca Arbezzano
 
Time Series Database and Tick Stack
Time Series Database and Tick StackTime Series Database and Tick Stack
Time Series Database and Tick StackGianluca Arbezzano
 
Queue System and Zend\Queue implementation
Queue System and Zend\Queue implementationQueue System and Zend\Queue implementation
Queue System and Zend\Queue implementationGianluca Arbezzano
 
ZfDayIt 2014 - There is a module for everything
ZfDayIt 2014 - There is a module for everythingZfDayIt 2014 - There is a module for everything
ZfDayIt 2014 - There is a module for everythingGianluca Arbezzano
 

More from Gianluca Arbezzano (18)

Value of your metrics: goodbye monitoring, welcome observability
Value of your metrics: goodbye monitoring, welcome observabilityValue of your metrics: goodbye monitoring, welcome observability
Value of your metrics: goodbye monitoring, welcome observability
 
InfluxCloudi craft container orchestrator
InfluxCloudi craft container orchestratorInfluxCloudi craft container orchestrator
InfluxCloudi craft container orchestrator
 
Orbiter and how to extend Docker Swarm
Orbiter and how to extend Docker SwarmOrbiter and how to extend Docker Swarm
Orbiter and how to extend Docker Swarm
 
Overview and Opentracing in theory by Gianluca Arbezzano
Overview and Opentracing in theory by Gianluca ArbezzanoOverview and Opentracing in theory by Gianluca Arbezzano
Overview and Opentracing in theory by Gianluca Arbezzano
 
Monitoring Pull vs Push, InfluxDB and Prometheus
Monitoring Pull vs Push, InfluxDB and PrometheusMonitoring Pull vs Push, InfluxDB and Prometheus
Monitoring Pull vs Push, InfluxDB and Prometheus
 
Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017
 
Security Tips to run Docker in Production
Security Tips to run Docker in ProductionSecurity Tips to run Docker in Production
Security Tips to run Docker in Production
 
Jenkins in the real world - DevOpsCon 2017
Jenkins in the real world - DevOpsCon 2017Jenkins in the real world - DevOpsCon 2017
Jenkins in the real world - DevOpsCon 2017
 
Monitor your application and sleep
Monitor your application and sleepMonitor your application and sleep
Monitor your application and sleep
 
Tick Stack - Listen your infrastructure and please sleep
Tick Stack - Listen your infrastructure and please sleepTick Stack - Listen your infrastructure and please sleep
Tick Stack - Listen your infrastructure and please sleep
 
Docker Novosibirsk Meetup #3 - Docker in Production
Docker Novosibirsk Meetup #3 - Docker in ProductionDocker Novosibirsk Meetup #3 - Docker in Production
Docker Novosibirsk Meetup #3 - Docker in Production
 
DockerDublin Meetup - News about Docker 1.13
DockerDublin Meetup -  News about Docker 1.13DockerDublin Meetup -  News about Docker 1.13
DockerDublin Meetup - News about Docker 1.13
 
Docker 1.12 and SwarmKit
Docker 1.12 and SwarmKitDocker 1.12 and SwarmKit
Docker 1.12 and SwarmKit
 
Time Series Database and Tick Stack
Time Series Database and Tick StackTime Series Database and Tick Stack
Time Series Database and Tick Stack
 
Queue System and Zend\Queue implementation
Queue System and Zend\Queue implementationQueue System and Zend\Queue implementation
Queue System and Zend\Queue implementation
 
ZfDayIt 2014 - There is a module for everything
ZfDayIt 2014 - There is a module for everythingZfDayIt 2014 - There is a module for everything
ZfDayIt 2014 - There is a module for everything
 
Vagrant - PugMI
Vagrant - PugMIVagrant - PugMI
Vagrant - PugMI
 
Silex, iniziamo
Silex, iniziamoSilex, iniziamo
Silex, iniziamo
 

Recently uploaded

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 

Recently uploaded (20)

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 

Kubernetes debug like a pro

  • 1. ~ @gianarb - https://gianarb.it ~ Debug like a pro on Kubernetes Golab - Florence - 2018
  • 2. ~ @gianarb - https://gianarb.it ~ Gianluca Arbezzano Site Reliability Engineer @InfluxData ● http://gianarb.it ● @gianarb What I like: ● I make dirty hacks that look awesome ● I grow my vegetables 🍅🌻🍆 ● Travel for fun and work
  • 3. 1. Yo n ! Your team knows and use Docker for local development and testing 2. Kub te ! Everyone speaks about kubernetes. 3. Hir ! You don’t know why but you hired a DevOps that kind of know k8s. 3. Ex i m ! You are moving everything and everyone to kubernetes
  • 4. Inspired by a true story
  • 5. You need a good book 1. Short 2. Driven by experiences 3. Practical 4. Easy
  • 6. We need to make our hands dirty
  • 7. Spin up a cluster that you can break Bring developers in the loop
  • 8. Deploy CI on Kubernetes Bring developers in the loop
  • 9. Run your code in prod Bring developers in the loop
  • 10. Don’t be scared and write your own tools!
  • 11. K8s as code: From YAML to code (golang) 1. You have the ability to use Golang autocomplete as documentation, reference for every kubernetes resources 2. You feel less a YAML engineer (great feeling btw) 3. Code is better than YAML! You can reuse it, compile it, embed it in other projects.
  • 12. K8s as code: From YAML to code (golang) Tiny cli to make the migration to golang Some manual refactoring
  • 13. K8s as code: From YAML to code (golang) Tiny cli to make the migration to golang Some manual refactoring ● Continue to improve our CI to validate that YAML and Go file are the same, and the resources in Kubernetes are like the Go file. ● Maybe we will be able to remove the YAML at some point.
  • 14. Examples ● Everything has an API because you should USE it TO make something good! (cURL is good but you can make something better) ● Some of our tools: ○ Backup and Restore Operator for Persistent Volumes ○ We have a service to create runtime isolated environment to allow devs to test or product people to have a safe environment to demo, try. We also use it for the integration and smoke tests. ○ We have a tool to replicate environment locally on minikube to install and configure all the dependencies
  • 16. We need to have processes and tools that give us the ability to take a real time picture of our system
  • 17. Observability It is all about how we collect and aggregate the data
  • 18. Normal state vs Current state
  • 19. Bring developers in the loop You need knowledgeable devs to drive the team
  • 21. Golang and Kubernetes: OpenCensus ● Open Source project sponsored by Google ● It is a SPEC plus a set of libraries in different languages to instrument your application ● To collect metrics, traces and events.
  • 22. Golang and Kubernetes: OpenCensus Common Interface to get stats and traces from your app Different exporters to persist your data
  • 23. gianarb.it ~ @gianarb # HELP http_requests_total The total number of HTTP requests. # TYPE http_requests_total counter http_requests_total{method="post",code="200"} 1027 1395066363000 http_requests_total{method="post",code="400"} 3 1395066363000 # Escaping in label values: msdos_file_access_time_seconds{path="C:DIRFILE.TXT",error="Cannot find file:n"FILE.TXT""} 1.458255915e9 # Minimalistic line: metric_without_timestamp_and_labels 12.47 # A weird metric from before the epoch: something_weird{problem="division by zero"} +Inf -3982045 # A histogram, which has a pretty complex representation in the text format: # HELP http_request_duration_seconds A histogram of the request duration. # TYPE http_request_duration_seconds histogram http_request_duration_seconds_bucket{le="0.05"} 24054 http_request_duration_seconds_bucket{le="0.1"} 33444 http_request_duration_seconds_bucket{le="0.2"} 100392 http_request_duration_seconds_bucket{le="0.5"} 129389 http_request_duration_seconds_bucket{le="1"} 133988 http_request_duration_seconds_bucket{le="+Inf"} 144320 http_request_duration_seconds_sum 53423 http_request_duration_seconds_count 144320
  • 26. gianarb.it ~ @gianarb func FetchMetricFamilies(url string, ch chan<- *dto.MetricFamily, certificate string, key string, skipServerCertCheck bool) error { defer close(ch) var transport *http.Transport if certificate != "" && key != "" { cert, err := tls.LoadX509KeyPair(certificate, key) if err != nil { return err } tlsConfig := &tls.Config{ Certificates: []tls.Certificate{cert}, InsecureSkipVerify: skipServerCertCheck, } tlsConfig.BuildNameToCertificate() transport = &http.Transport{TLSClientConfig: tlsConfig} } else { transport = &http.Transport{ TLSClientConfig: &tls.Config{InsecureSkipVerify: skipServerCertCheck}, } } client := &http.Client{Transport: transport} return decodeContent(client, url, ch) } https://github.com/prometheus/prom2json/blob/master/prom2json.go#L123
  • 27. gianarb.it ~ @gianarb More to read ● OpenMetrics: https://github.com/OpenObservability/OpenMetrics ● OpenMetrics mailing list: https://groups.google.com/forum/#protectkern-. 1667emrelaxforum/openmetrics ● WIP branch for Python library https://github.com/prometheus/client_python/tree/openmetrics ● Thanks RICHARD for your work! I got some slides from here: https://promcon.io/2018-munich/slides/openmetrics-transforming-the-prometh eus-exposition-format-into-a-global-standard.pdf
  • 28. Tracing How do you “tell stories” about concurrent systems?
  • 29. How do you “tell stories” about concurrent systems?
  • 30.
  • 33. I was just waiting for a new standard! cit. Troll
  • 34. © 2017 InfluxData. All rights reserved.34 Typical problems with logs ¨ Which library do I need to use? ¨ Every library has a different format ¨ Every languages exposes a different format
  • 35. © 2017 InfluxData. All rights reserved.35 Tracing is not something new ¨ There are vendors ¨ Every vendor has their own format
  • 36. © 2017 InfluxData. All rights reserved.36 log log log log log log Parent Span Span Context / Baggage Child Child Child Span ¨ Spans - Basic unit of timing and causality. Can be tagged with key/value pairs. ¨ Logs - Structured data recorded on a span. ¨ Span Context - serializable format for linking spans across network boundaries. Carries baggage, such as a request and client IDs. ¨ Tracers - Anything that plugs into the OpenTracing API to record information. ¨ ZipKin, Jaeger, LightStep, others ¨ Also metrics (Prometheus) and logging
  • 37. © 2017 InfluxData. All rights reserved.37 OpenTracing API application logic µ-service frameworks Lambda functions RPC & control-flow frameworks existing instrumentation tracing infrastructure main() I N S T A N A J a e g e r microservice process
  • 38. © 2017 InfluxData. All rights reserved.38 import "github.com/opentracing/opentracing-go" import ".../some_tracing_impl" func main() { opentracing.SetGlobalTracer( // tracing impl specific: some_tracing_impl.New(...), ) ... } https://github.com/opentracing/opentracing-go Opentracing: Configure the GlobalTracer
  • 39. © 2017 InfluxData. All rights reserved.39 func xyz(ctx context.Context, ...) { ... span, ctx := opentracing.StartSpanFromContext(ctx, "operation_name") defer span.Finish() span.LogFields( log.String("event", "soft error"), log.String("type", "cache timeout"), log.Int("waited.millis", 1500)) ... } https://github.com/opentracing/opentracing-go Opentracing: Create a Span from the Context
  • 40. © 2017 InfluxData. All rights reserved.40 func xyz(parentSpan opentracing.Span, ...) { ... sp := opentracing.StartSpan( "operation_name", opentracing.ChildOf(parentSpan.Context())) defer sp.Finish() ... } https://github.com/opentracing/opentracing-go Opentracing: Create a Child Span
  • 41. Golang and Kubernetes: pprof ● It is the Golang native profiler ● You can use it via the `go pprof` command ● `import "runtime/pprof"` writes runtime profiling data ● `import "net/http/pprof"` serves via HTTP server runtime profiling data
  • 42. Golang and Kubernetes: pprof package main import ( "log" "net/http" _ "net/http/pprof" ) func main() { log.Println(http.ListenAndServe("localhost:6060", nil)) }
  • 43. $ go tool pprof http://localhost:6060/debug/pprof/heap Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap Saved profile in /home/gianarb/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.009.pb.gz File: main Type: inuse_space Time: Oct 15, 2018 at 4:22pm (CEST) Entering interactive mode (type "help" for commands, "o" for options) (pprof) png Generating report in profile001.png
  • 44.
  • 45. Golang and Kubernetes: pprof https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go // handleProfiles determines which profile to return to the requester. func (h *Handler) handleProfiles(w http.ResponseWriter, r *http.Request) { switch r.URL.Path { case "/debug/pprof/cmdline": httppprof.Cmdline(w, r) case "/debug/pprof/profile": httppprof.Profile(w, r) case "/debug/pprof/symbol": httppprof.Symbol(w, r) case "/debug/pprof/all": h.archiveProfilesAndQueries(w, r) default: httppprof.Index(w, r) } }
  • 46. Golang and Kubernetes: pprof https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go // archiveProfilesAndQueries collects the following profiles: // - goroutine profile // - heap profile // - blocking profile // - mutex profile // - (optionally) CPU profile // // It also collects the following query results: // // - SHOW SHARDS // - SHOW STATS // - SHOW DIAGNOSTICS // // All information is added to a tar archive and then compressed, before being // returned to the requester as an archive file.
  • 47. Golang and Kubernetes: pprof https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go var allProfs = []*prof{ {Name: "goroutine", Debug: 1}, {Name: "block", Debug: 1}, {Name: "mutex", Debug: 1}, {Name: "heap", Debug: 1}, }
  • 48. gz := gzip.NewWriter(&resp) tw := tar.NewWriter(gz) // Collect and write out profiles. for _, profile := range allProfs { if profile.Name == "cpu" { if err := pprof.StartCPUProfile(&buf); err != nil { http.Error(w, err.Error(), http.StatusInternalServerError) return } sleep(w, time.Duration(profile.Debug)*time.Second) pprof.StopCPUProfile() } else { prof := pprof.Lookup(profile.Name) if prof == nil { http.Error(w, "unable to find profile "+profile.Name, http.StatusInternalServerError) return } if err := prof.WriteTo(&buf, int(profile.Debug)); err != nil { http.Error(w, err.Error(), http.StatusInternalServerError) return } }
  • 49. Golang and Kubernetes: pprof https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/services/httpd/pprof.go // Collect and write out the queries. var allQueries = []struct { name string fn func() ([]*models.Row, error) }{ {"shards", h.showShards}, {"stats", h.showStats}, {"diagnostics", h.showDiagnostics}, } tabW := tabwriter.NewWriter(&buf, 8, 8, 1, 't', 0) for _, query := range allQueries { rows, err := query.fn() if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError) }
  • 50. func (h *Handler) showDiagnostics() ([]*models.Row, error) { diags, err := h.Monitor.Diagnostics() if err != nil { return nil, err } // Get a sorted list of diagnostics keys. sortedKeys := make([]string, 0, len(diags)) for k := range diags { sortedKeys = append(sortedKeys, k) } sort.Strings(sortedKeys) rows := make([]*models.Row, 0, len(diags)) for _, k := range sortedKeys { row := &models.Row{Name: k} row.Columns = diags[k].Columns row.Values = diags[k].Rows rows = append(rows, row) } return rows, nil }
  • 53. That’s not totally true. You can expose the debugger endpoint. It works. It doesn’t apply for production workload.
  • 54. ~ @gianarb - https://gianarb.it ~ Thanks @gianarb