SlideShare a Scribd company logo
1 of 44
Download to read offline
FLUENTD 101
BOOTSTRAP OF UNIFIED LOGGING
Open Source Summit Japan 2017
Fluentd Mini Summit / June 1, 2017
Satoshi Tagomori (@tagomoris)
Treasure Data, Inc.
Satoshi "Moris" Tagomori
(@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
Fluentd 101
What is Fluentd?
Fluentd 101
Fluentd 101
"Fluentd is an open source data collector
for unified logging layer."
https://www.fluentd.org/
"Unified Logging Layer" ?
"Fluentd decouples data sources from backend systems by
providing a unified logging layer in between."
SQL
"Unified Logging Layer" ?
"Fluentd decouples data sources from backend systems by
providing a unified logging layer in between."
SQL
Unified
Logging
Layer
"Unified Logging Layer" ?
"Fluentd decouples data sources from backend systems by
providing a unified logging layer in between."
SQL
"Unified Logging Layer" ?
"Fluentd decouples data sources from backend systems by
providing a unified logging layer in between."
SQL
AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
Simple Core w/ Plugin System
+
Various Plugins
Buffering, Retries, Failover
# logs from a file
<source>
@type tail
path /var/log/httpd.log
pos_file /tmp/pos_file
tag web.access
<parse>
@type apache2
</parse>
</source>
# logs from client libraries
<source>
@type forward
port 24224
</source>
# store logs to ES and HDFS
<match web.*>
@type copy
<store>
@type elasticsearch
logstash_format true
</store>
<store>
@type webhdfs
host namenode.local
port 50070
path /path/on/hdfs
<format>
@type json
</format>
</store>
</match>
Configuration (v0.14)
Implementation / Performance
• Fluentd is written in Ruby
• C extension libraries for performance requirement
• cool.io: Asynchronous I/O
• msgpack: Serialization/Deserialization for MessagePack
• Plugins in Ruby - easy to join the community
• Scaling for CPU cores
• multiprocess plugin (~ v0.12)
• multi process workers (v0.14 ~)
Package / Deployment
• Fluentd released on RubyGems.org
• rpm/deb package
• td-agent by Treasure Data
• Fluentd + some widely-used plugins
• td-agent2: Fluentd v0.12
• td-agent3 (beta now): Fluentd v0.14 (or v1)
• + msi package for Windows
• Docker images
• https://hub.docker.com/r/fluent/fluentd/
For Users in the Enterprise Sector
More security features and some others
https://fluentd.treasuredata.com/
Plugin System
3rd party input plugins
dstat
df AMQL
munin
jvmwatcher
SQL
3rd party output plugins
Graphite
Buffer OutputParserInput FormatterFilter
“output-ish”“input-ish”
“output-ish”“input-ish”
Storage
Helper
Buffer OutputParserInput FormatterFilter
Fluentd v0.14
Fluentd v0.12
Plugins
• Built-in plugins
• tail, forward, file, exec, exec_filter, copy, ...
• 3rd party plugins (755 plugins at May 19)
• fluent-plugin-xxx via rubygems.org
• webhdfs, kafka, elasticsearch, redshift, bigquery, ...
• Plugin script
• .rb script files on /etc/fluent/plugin
Ecosystem
Fluentd 101
Fluentd 101
Fluentd 101
Events
Events: Structured Logs
ec_service.shopping_cart
2017-03-30 16:35:37 +0100
{
"container_id": "bfdd5b9....",
"container_name": "/infallible_mayer",
"source": "stdout",
"event": "put an item to cart",
"item_id": 101,
"items": 10,
"client": "web"
}
tag
timestamp
record
Event: tag, timestamp and record
• Tag
• A dot-separated string
• to show what the event is / where the event from
• Timestamp
• An integer of unix time (~ v0.12)
• A structured timestamp with nano seconds (v0.14 ~)
• Record
• Key-value pairs
Data Source
router
input plugin
read / receive
raw data
eventeventeventevent
parser plugin
parse data into key-values
parse timestamp from record
add tags
output plugin
with buffering
eventeventeventevent
format plugin
buffer plugin
formatted data
Data Destination
write / send
Buffers and Retries
Buffer & Retry for Micro-Try&Error
Retry
Retry
Batch
Stream Error
Retry
Retry
Controlled Recovery from Long Outage
Buffer
(on-disk or in-memory)
Error
Overloaded!!
recovery
recovery + flow control
queued chunks
Last Resort: Secondary Output
Error
queued chunks
# store logs to ES, or file if it goes down
<match web.*>
@type elasticsearch
logstash_format true
<secondary>
@type secondary_file
path /data/backup/web
</secondary>
</match>
Configuration: Secondary Output (v0.14)
Forwarding Data via Network
Forward Plugin: Forwarding Data via Network
• Built-in plugin, TCP port 24224 (in default)
• with heartbeat protocol (default UDP in v0.12, TCP/TLS in v0.14)
• Transfer data via TCP
• From Fluentd to Fluentd
• From logger libraries to Fluentd
• From Fluent-bit to Fluentd
• From Docker logging driver to Fluentd
• Standard protocol

https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1

(Spec v0 at Fluentd v0.12 or earlier -> Spec v1 at Fluentd 0.14)
Forward: Features
• Load Balancing / High Availability
• Servers with weights
• Standby servers
• DNS round robin
• Controlling Data Transferring Semantics
• at-most-once / at-least-once transferring
• Spec v1 Features
• TLS support
• Simple authentication/authorization
• Efficient data transferring: Gzip Compression (v0.14)
Forward Plugin: Load Balancing
60
60
60
60
60
60
60
20
weight: 60 (default)
weight: 60 (default)
weight: 60 (default)
weight: 20
Balance Data with Configured Weight
Forward Plugin: Handling Server Failure
Detecting Server Down/Up using Heartbeat
Error
Forward Plugin: Standby Server
Detecting Server Down/Up using Heartbeat
standby: true
Error
standby: true
Forward Plugin: Without ACK (at-most-once)
forward output
buffer
forward input any output
forward output forward input any output
bufferbuffer
Without any troubles
With troubles about buffers in destination
forward output
buffer
forward input any output
forward output forward input any output
buffer buffer
Events will be lost in this case :(
Forward Plugin: With ACK (at-least-once)
forward output
buffer
forward input any output
forward output forward input any output
buffer
Forward output:
require_ack_response: true
forward output forward input any output
buffer
buffer
buffer ACK
with chunk id
forward output forward input any output
buffer
buffer
with chunk id
Forward Plugin: With ACK (at-least-once)
forward output
buffer
forward input any output
forward output forward input any output
Forward output:
require_ack_response: true
forward output forward input any output
buffer
buffer ACK missing
forward output forward input any output
with chunk id
buffer
buffer
with chunk id
retry
Forward output ensure to transfer
buffers to any living destinations :D
Fluentd has many good stuffs

for logging.
Discover more on docs.fluentd.org!
Happy Logging!
@tagomoris

More Related Content

What's hot

Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconN Masahiro
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELKGeert Pante
 
Logs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackLogs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackJosef Karásek
 
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...NoSQLmatters
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsCommand Prompt., Inc
 
Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Marco Pas
 
Lisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-onLisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-onGluster.org
 
まずやっとくPostgreSQLチューニング
まずやっとくPostgreSQLチューニングまずやっとくPostgreSQLチューニング
まずやっとくPostgreSQLチューニングKosuke Kida
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINXNGINX, Inc.
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Building IAM for OpenStack
Building IAM for OpenStackBuilding IAM for OpenStack
Building IAM for OpenStackSteve Martinelli
 
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナーNGINX, Inc.
 
Stephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large StateStephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large StateFlink Forward
 
Dynamic Reconfiguration of Apache ZooKeeper
Dynamic Reconfiguration of Apache ZooKeeperDynamic Reconfiguration of Apache ZooKeeper
Dynamic Reconfiguration of Apache ZooKeeperDataWorks Summit
 
Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Bilgin Ibryam
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracingViller Hsiao
 
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)I Goo Lee
 
Delivering High-Availability Web Services with NGINX Plus on AWS
Delivering High-Availability Web Services with NGINX Plus on AWSDelivering High-Availability Web Services with NGINX Plus on AWS
Delivering High-Availability Web Services with NGINX Plus on AWSNGINX, Inc.
 
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요Jo Hoon
 

What's hot (20)

Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at Kubecon
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
 
Logs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackLogs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK Stack
 
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)
 
Lisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-onLisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-on
 
まずやっとくPostgreSQLチューニング
まずやっとくPostgreSQLチューニングまずやっとくPostgreSQLチューニング
まずやっとくPostgreSQLチューニング
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINX
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Building IAM for OpenStack
Building IAM for OpenStackBuilding IAM for OpenStack
Building IAM for OpenStack
 
OCIコンテナサービス関連の技術詳細
OCIコンテナサービス関連の技術詳細OCIコンテナサービス関連の技術詳細
OCIコンテナサービス関連の技術詳細
 
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
 
Stephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large StateStephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large State
 
Dynamic Reconfiguration of Apache ZooKeeper
Dynamic Reconfiguration of Apache ZooKeeperDynamic Reconfiguration of Apache ZooKeeper
Dynamic Reconfiguration of Apache ZooKeeper
 
Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracing
 
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)
오픈소스로 만드는 DB 모니터링 시스템 (w/graphite+grafana)
 
Delivering High-Availability Web Services with NGINX Plus on AWS
Delivering High-Availability Web Services with NGINX Plus on AWSDelivering High-Availability Web Services with NGINX Plus on AWS
Delivering High-Availability Web Services with NGINX Plus on AWS
 
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요
[네전따] 네트워크 엔지니어에게 쿠버네티스는 어떤 의미일까요
 

Similar to Fluentd 101

Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOSconN Masahiro
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and ThenSATOSHI TAGOMORI
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container EraSadayuki Furuhashi
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesTimothy Spann
 
Experiences with Microservices at Tuenti
Experiences with Microservices at TuentiExperiences with Microservices at Tuenti
Experiences with Microservices at TuentiAndrés Viedma Peláez
 
Bootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxBootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxMaximiliano Accotto
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and FluentdN Masahiro
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin Kuberton
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Fluentd and docker monitoring
Fluentd and docker monitoringFluentd and docker monitoring
Fluentd and docker monitoringVinay Krishna
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Docker and Fluentd (revised)
Docker and Fluentd (revised)Docker and Fluentd (revised)
Docker and Fluentd (revised)SATOSHI TAGOMORI
 
Metasploitation part-1 (murtuja)
Metasploitation part-1 (murtuja)Metasploitation part-1 (murtuja)
Metasploitation part-1 (murtuja)ClubHack
 
Docker intro
Docker introDocker intro
Docker introspiddy
 
Inside Triton, July 2015
Inside Triton, July 2015Inside Triton, July 2015
Inside Triton, July 2015Casey Bisson
 

Similar to Fluentd 101 (20)

Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOScon
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container Era
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
 
Experiences with Microservices at Tuenti
Experiences with Microservices at TuentiExperiences with Microservices at Tuenti
Experiences with Microservices at Tuenti
 
Bootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxBootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on Linux
 
Fluentd and AWS at classmethod
Fluentd and AWS at classmethodFluentd and AWS at classmethod
Fluentd and AWS at classmethod
 
Fluentd meetup in japan
Fluentd meetup in japanFluentd meetup in japan
Fluentd meetup in japan
 
SQL on linux
SQL on linuxSQL on linux
SQL on linux
 
upload test 1
upload test 1upload test 1
upload test 1
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
JavaCro'14 - Continuous deployment tool – Aleksandar Dostić and Emir Džaferović
JavaCro'14 - Continuous deployment tool – Aleksandar Dostić and Emir DžaferovićJavaCro'14 - Continuous deployment tool – Aleksandar Dostić and Emir Džaferović
JavaCro'14 - Continuous deployment tool – Aleksandar Dostić and Emir Džaferović
 
Fluentd and docker monitoring
Fluentd and docker monitoringFluentd and docker monitoring
Fluentd and docker monitoring
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Docker and Fluentd (revised)
Docker and Fluentd (revised)Docker and Fluentd (revised)
Docker and Fluentd (revised)
 
Metasploitation part-1 (murtuja)
Metasploitation part-1 (murtuja)Metasploitation part-1 (murtuja)
Metasploitation part-1 (murtuja)
 
Docker intro
Docker introDocker intro
Docker intro
 
Inside Triton, July 2015
Inside Triton, July 2015Inside Triton, July 2015
Inside Triton, July 2015
 

More from SATOSHI TAGOMORI

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speedSATOSHI TAGOMORI
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsSATOSHI TAGOMORI
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of RubySATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)SATOSHI TAGOMORI
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script ConfusingSATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubySATOSHI TAGOMORI
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsSATOSHI TAGOMORI
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the WorldSATOSHI TAGOMORI
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamSATOSHI TAGOMORI
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessSATOSHI TAGOMORI
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage SystemsSATOSHI TAGOMORI
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd SeasonSATOSHI TAGOMORI
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToSATOSHI TAGOMORI
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In RubySATOSHI TAGOMORI
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldSATOSHI TAGOMORI
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceSATOSHI TAGOMORI
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra PerfectSATOSHI TAGOMORI
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraSATOSHI TAGOMORI
 
Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"SATOSHI TAGOMORI
 

More from SATOSHI TAGOMORI (20)

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise Business
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT To
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud Service
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra Perfect
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
 
Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"
 

Recently uploaded

Deep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampDeep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampVICTOR MAESTRE RAMIREZ
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
How to Improve the Employee Experience? - HRMS Software
How to Improve the Employee Experience? - HRMS SoftwareHow to Improve the Employee Experience? - HRMS Software
How to Improve the Employee Experience? - HRMS SoftwareNYGGS Automation Suite
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native BuildpacksVish Abrams
 
How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?AmeliaSmith90
 
Enterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze IncEnterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze Incrobinwilliams8624
 
Webinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptWebinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptkinjal48
 
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.Sharon Liu
 
Understanding Native Mobile App Development
Understanding Native Mobile App DevelopmentUnderstanding Native Mobile App Development
Understanding Native Mobile App DevelopmentMobulous Technologies
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageDista
 
Generative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilGenerative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilVICTOR MAESTRE RAMIREZ
 
eAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionseAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionsNirav Modi
 
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfTobias Schneck
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIIvo Andreev
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorShane Coughlan
 
New ThousandEyes Product Features and Release Highlights: March 2024
New ThousandEyes Product Features and Release Highlights: March 2024New ThousandEyes Product Features and Release Highlights: March 2024
New ThousandEyes Product Features and Release Highlights: March 2024ThousandEyes
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...OnePlan Solutions
 
Vectors are the new JSON in PostgreSQL (SCaLE 21x)
Vectors are the new JSON in PostgreSQL (SCaLE 21x)Vectors are the new JSON in PostgreSQL (SCaLE 21x)
Vectors are the new JSON in PostgreSQL (SCaLE 21x)Jonathan Katz
 

Recently uploaded (20)

Deep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampDeep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - Datacamp
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
How to Improve the Employee Experience? - HRMS Software
How to Improve the Employee Experience? - HRMS SoftwareHow to Improve the Employee Experience? - HRMS Software
How to Improve the Employee Experience? - HRMS Software
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native Buildpacks
 
How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?
 
Enterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze IncEnterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze Inc
 
Webinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptWebinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.ppt
 
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
 
Understanding Native Mobile App Development
Understanding Native Mobile App DevelopmentUnderstanding Native Mobile App Development
Understanding Native Mobile App Development
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
 
Generative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilGenerative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-Council
 
eAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionseAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspections
 
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS Calculator
 
New ThousandEyes Product Features and Release Highlights: March 2024
New ThousandEyes Product Features and Release Highlights: March 2024New ThousandEyes Product Features and Release Highlights: March 2024
New ThousandEyes Product Features and Release Highlights: March 2024
 
Salesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptxSalesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptx
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
 
Vectors are the new JSON in PostgreSQL (SCaLE 21x)
Vectors are the new JSON in PostgreSQL (SCaLE 21x)Vectors are the new JSON in PostgreSQL (SCaLE 21x)
Vectors are the new JSON in PostgreSQL (SCaLE 21x)
 

Fluentd 101

  • 1. FLUENTD 101 BOOTSTRAP OF UNIFIED LOGGING Open Source Summit Japan 2017 Fluentd Mini Summit / June 1, 2017 Satoshi Tagomori (@tagomoris) Treasure Data, Inc.
  • 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  • 7. "Fluentd is an open source data collector for unified logging layer." https://www.fluentd.org/
  • 8. "Unified Logging Layer" ? "Fluentd decouples data sources from backend systems by providing a unified logging layer in between." SQL
  • 9. "Unified Logging Layer" ? "Fluentd decouples data sources from backend systems by providing a unified logging layer in between." SQL Unified Logging Layer
  • 10. "Unified Logging Layer" ? "Fluentd decouples data sources from backend systems by providing a unified logging layer in between." SQL
  • 11. "Unified Logging Layer" ? "Fluentd decouples data sources from backend systems by providing a unified logging layer in between." SQL
  • 12. AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL Simple Core w/ Plugin System + Various Plugins Buffering, Retries, Failover
  • 13. # logs from a file <source> @type tail path /var/log/httpd.log pos_file /tmp/pos_file tag web.access <parse> @type apache2 </parse> </source> # logs from client libraries <source> @type forward port 24224 </source> # store logs to ES and HDFS <match web.*> @type copy <store> @type elasticsearch logstash_format true </store> <store> @type webhdfs host namenode.local port 50070 path /path/on/hdfs <format> @type json </format> </store> </match> Configuration (v0.14)
  • 14. Implementation / Performance • Fluentd is written in Ruby • C extension libraries for performance requirement • cool.io: Asynchronous I/O • msgpack: Serialization/Deserialization for MessagePack • Plugins in Ruby - easy to join the community • Scaling for CPU cores • multiprocess plugin (~ v0.12) • multi process workers (v0.14 ~)
  • 15. Package / Deployment • Fluentd released on RubyGems.org • rpm/deb package • td-agent by Treasure Data • Fluentd + some widely-used plugins • td-agent2: Fluentd v0.12 • td-agent3 (beta now): Fluentd v0.14 (or v1) • + msi package for Windows • Docker images • https://hub.docker.com/r/fluent/fluentd/
  • 16. For Users in the Enterprise Sector More security features and some others https://fluentd.treasuredata.com/
  • 18. 3rd party input plugins dstat df AMQL munin jvmwatcher SQL
  • 19. 3rd party output plugins Graphite
  • 21. Plugins • Built-in plugins • tail, forward, file, exec, exec_filter, copy, ... • 3rd party plugins (755 plugins at May 19) • fluent-plugin-xxx via rubygems.org • webhdfs, kafka, elasticsearch, redshift, bigquery, ... • Plugin script • .rb script files on /etc/fluent/plugin
  • 27. Events: Structured Logs ec_service.shopping_cart 2017-03-30 16:35:37 +0100 { "container_id": "bfdd5b9....", "container_name": "/infallible_mayer", "source": "stdout", "event": "put an item to cart", "item_id": 101, "items": 10, "client": "web" } tag timestamp record
  • 28. Event: tag, timestamp and record • Tag • A dot-separated string • to show what the event is / where the event from • Timestamp • An integer of unix time (~ v0.12) • A structured timestamp with nano seconds (v0.14 ~) • Record • Key-value pairs
  • 29. Data Source router input plugin read / receive raw data eventeventeventevent parser plugin parse data into key-values parse timestamp from record add tags output plugin with buffering eventeventeventevent format plugin buffer plugin formatted data Data Destination write / send
  • 31. Buffer & Retry for Micro-Try&Error Retry Retry Batch Stream Error Retry Retry
  • 32. Controlled Recovery from Long Outage Buffer (on-disk or in-memory) Error Overloaded!! recovery recovery + flow control queued chunks
  • 33. Last Resort: Secondary Output Error queued chunks
  • 34. # store logs to ES, or file if it goes down <match web.*> @type elasticsearch logstash_format true <secondary> @type secondary_file path /data/backup/web </secondary> </match> Configuration: Secondary Output (v0.14)
  • 36. Forward Plugin: Forwarding Data via Network • Built-in plugin, TCP port 24224 (in default) • with heartbeat protocol (default UDP in v0.12, TCP/TLS in v0.14) • Transfer data via TCP • From Fluentd to Fluentd • From logger libraries to Fluentd • From Fluent-bit to Fluentd • From Docker logging driver to Fluentd • Standard protocol
 https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1
 (Spec v0 at Fluentd v0.12 or earlier -> Spec v1 at Fluentd 0.14)
  • 37. Forward: Features • Load Balancing / High Availability • Servers with weights • Standby servers • DNS round robin • Controlling Data Transferring Semantics • at-most-once / at-least-once transferring • Spec v1 Features • TLS support • Simple authentication/authorization • Efficient data transferring: Gzip Compression (v0.14)
  • 38. Forward Plugin: Load Balancing 60 60 60 60 60 60 60 20 weight: 60 (default) weight: 60 (default) weight: 60 (default) weight: 20 Balance Data with Configured Weight
  • 39. Forward Plugin: Handling Server Failure Detecting Server Down/Up using Heartbeat Error
  • 40. Forward Plugin: Standby Server Detecting Server Down/Up using Heartbeat standby: true Error standby: true
  • 41. Forward Plugin: Without ACK (at-most-once) forward output buffer forward input any output forward output forward input any output bufferbuffer Without any troubles With troubles about buffers in destination forward output buffer forward input any output forward output forward input any output buffer buffer Events will be lost in this case :(
  • 42. Forward Plugin: With ACK (at-least-once) forward output buffer forward input any output forward output forward input any output buffer Forward output: require_ack_response: true forward output forward input any output buffer buffer buffer ACK with chunk id forward output forward input any output buffer buffer with chunk id
  • 43. Forward Plugin: With ACK (at-least-once) forward output buffer forward input any output forward output forward input any output Forward output: require_ack_response: true forward output forward input any output buffer buffer ACK missing forward output forward input any output with chunk id buffer buffer with chunk id retry Forward output ensure to transfer buffers to any living destinations :D
  • 44. Fluentd has many good stuffs
 for logging. Discover more on docs.fluentd.org! Happy Logging! @tagomoris