2. About Me
• A recovering software & QA engineer turned digital artist, once interested in fractals;
• now into data visualization based on large datasets rendered directly to the GPU (RGL, various Python GL libraries, etc.)
• github: jammink2; twitter: rijksband
4. WHAT’S FLUENTD?
An extensible & reliable data collection tool
• simple core + plugins
• buffering, HA (failover), load balancing, etc.
• like syslogd
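To make "buffering" and "HA (failover)" concrete, here is a minimal sketch in plain Ruby. It is not Fluentd's implementation: the `BufferedSender` class and the `primary`/`standby` lambdas are hypothetical names invented for illustration. Events wait in an in-memory buffer and are only removed once some server accepts them.

```ruby
# Hypothetical buffered sender: events wait in memory until some server accepts them.
class BufferedSender
  attr_reader :buffer

  def initialize(servers)
    @servers = servers # callables in priority order: primary first, then standbys
    @buffer = []
  end

  def emit(event)
    @buffer << event
  end

  # Try to deliver everything; keep whatever could not be delivered.
  def flush
    @buffer = @buffer.reject { |event| deliver(event) }
  end

  private

  # Returns true as soon as one server accepts the event.
  def deliver(event)
    @servers.any? do |server|
      begin
        server.call(event)
        true
      rescue IOError
        false # this server is down, fail over to the next one
      end
    end
  end
end

received = []
primary = ->(_event) { raise IOError, "primary is down" }
standby = ->(event) { received << event }

sender = BufferedSender.new([primary, standby])
sender.emit({ "message" => "hello" })
sender.flush
```

With the primary down, the event ends up in `received` via the standby. If every server fails, the event simply stays in `sender.buffer` until the next `flush`.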
5. What’s Fluentd?
> Data collector for unified logging layer
> Streaming data transfer based on JSON
> Written in Ruby
> Many plugins, distributed as Ruby gems: http://www.fluentd.org/plugins
> Used in production: http://www.fluentd.org/testimonials
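As a rough illustration of "data transfer based on JSON": conceptually, each event carries a tag, a timestamp, and a record of key/value pairs. The sketch below uses only Ruby's standard library. The `Event` struct and all sample values are assumptions made up for this example, not taken from the slides.

```ruby
require "json"

# Hypothetical event shape: tag for routing, Unix time, and a JSON-like record.
Event = Struct.new(:tag, :time, :record)

event = Event.new(
  "web.access",
  Time.utc(2015, 1, 1).to_i, # sample timestamp, chosen arbitrarily
  { "method" => "GET", "path" => "/index.html", "code" => 200 }
)

# Serialize as a [tag, time, record] triple so any consumer can parse it.
line = JSON.generate([event.tag, event.time, event.record])
puts line

# Round-trip to show nothing is lost in transit.
tag, time, record = JSON.parse(line)
```

Because the record is plain JSON, a destination does not need to know which source produced it, only how to read the keys it cares about.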
11. CORE / PLUGINS
Core (common concerns):
• Divide & Conquer
• Buffering & Retries
• Error Handling
• Message Routing
• Parallelism
Plugins (use-case specific):
• Read Data
• Parse Data
• Buffer Data
• Write Data
• Format Data
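The split between core and plugins can be sketched in a few lines of Ruby. This is an illustrative toy, not Fluentd's plugin API: `MiniCore`, `FlakyOutput`, and the `write(tag, record)` method are hypothetical names. The point is that routing and retries live in the core once, so a plugin only has to know how to write data.

```ruby
# Hypothetical mini core: it owns routing and retries (common concerns),
# while plugins only know how to write data (use-case specific).
class MiniCore
  def initialize(max_retries: 2)
    @max_retries = max_retries
    @routes = []
  end

  # Register an output plugin for every tag matching the given Regexp.
  def route(tag_regexp, plugin)
    @routes << [tag_regexp, plugin]
  end

  # Deliver one event to every matching plugin, retrying failed writes.
  def emit(tag, record)
    @routes.each do |tag_regexp, plugin|
      next unless tag_regexp.match?(tag)

      attempts = 0
      begin
        plugin.write(tag, record)
      rescue StandardError
        attempts += 1
        retry if attempts <= @max_retries
        raise # give up after the configured number of retries
      end
    end
  end
end

# A plugin that fails on its first call, to exercise the core's retry logic.
class FlakyOutput
  attr_reader :calls, :written

  def initialize
    @calls = 0
    @written = []
  end

  def write(tag, record)
    @calls += 1
    raise IOError, "temporary failure" if @calls == 1

    @written << [tag, record]
  end
end

output = FlakyOutput.new
core = MiniCore.new
core.route(/\Abackend\./, output)
core.emit("backend.apache", { "code" => 200 })
core.emit("web.access", { "code" => 404 }) # no matching route, ignored here
```

`FlakyOutput` contains no retry code at all, yet its first failure is recovered, because the core retried the write on its behalf.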
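25. M × N → M + N
[Diagram: log sources on one side, destinations on the other, connected through a central buffer/filter/route hub]
Sources: access logs (Apache), app logs (Frontend, Backend), system logs (syslogd), databases
Destinations: alerting (Nagios), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3)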
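The arithmetic behind "M × N → M + N" can be checked directly. The sketch below counts integrations for the sources and destinations named on this slide; the variable names are assumptions for illustration.

```ruby
# Sources and destinations as named on the slide.
sources = ["Apache", "Frontend", "Backend", "syslogd", "Databases"]
destinations = ["Nagios", "MongoDB", "MySQL", "Hadoop", "Amazon S3"]

# Without a hub, every source needs its own integration with every destination.
point_to_point = sources.product(destinations).size

# With a hub in the middle, each system only integrates with the hub once.
via_hub = sources.size + destinations.size

puts "point-to-point: #{point_to_point}, via hub: #{via_hub}"
```

With five systems on each side, that is 25 integrations point-to-point versus 10 through a hub, and the gap widens as either side grows.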
28.
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag backend.apache
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to MongoDB
<match backend.*>
  type mongo
  database fluent
  collection test
</match>
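The `<match backend.*>` pattern selects events by tag. In Fluentd's match syntax, `*` stands for exactly one dot-separated tag part. The Ruby sketch below imitates only that rule; `tag_match?` is a hypothetical helper, and other pattern features such as `**` are deliberately left out.

```ruby
# Simplified tag matching: "*" matches exactly one dot-separated tag part.
# Assumption: only literal parts and "*" are supported in this sketch.
def tag_match?(pattern, tag)
  pattern_parts = pattern.split(".")
  tag_parts = tag.split(".")
  return false unless pattern_parts.size == tag_parts.size

  pattern_parts.zip(tag_parts).all? do |pattern_part, tag_part|
    pattern_part == "*" || pattern_part == tag_part
  end
end

puts tag_match?("backend.*", "backend.apache")       # tag set by the tail source
puts tag_match?("backend.*", "web.access")           # different first part
puts tag_match?("backend.*", "backend.apache.error") # one part too many
```

So events tagged `backend.apache` by the tail source reach the MongoDB output, while an event arriving over the forward source is stored only if its tag also has the form `backend.<something>`.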
31.
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to ES and HDFS
<match *.*>
  type copy
  <store>
    type elasticsearch
    logstash_format true
  </store>
  <store>
    type webhdfs
    host namenode
    port 50070
    path /path/on/hdfs/
  </store>
</match>
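The `copy` output with two `<store>` sections fans each event out to every store. A minimal Ruby sketch of that idea follows. `CopyOutput` and `MemoryStore` are hypothetical stand-ins, with in-memory arrays in place of Elasticsearch and HDFS, and handing each store its own shallow copy of the record is a choice made for this sketch.

```ruby
# Hypothetical stand-in for a real destination: it just remembers what it received.
class MemoryStore
  attr_reader :name, :events

  def initialize(name)
    @name = name
    @events = []
  end

  def write(tag, record)
    @events << [tag, record]
  end
end

# Fan-out: every event is handed to every configured store.
class CopyOutput
  def initialize(stores)
    @stores = stores
  end

  def write(tag, record)
    # Sketch-level choice: give each store its own shallow copy of the record.
    @stores.each { |store| store.write(tag, record.dup) }
  end
end

search = MemoryStore.new("elasticsearch")
archive = MemoryStore.new("webhdfs")
copy = CopyOutput.new([search, archive])

copy.write("web.access", { "path" => "/index.html", "code" => 200 })
```

Both `search.events` and `archive.events` now hold the same event, which mirrors how one `<match>` block can feed a search index and a long-term archive at once.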