Night Owl by Boyd Meier of PROS
- 1. Night Owl
Log Monitoring using Elasticsearch and Hadoop
Boyd Meier (bmeier@pros.com)
Hadoop Meetup – October 16, 2013
© COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
- 3. Application Performance Monitoring
● Many servers
● Many applications
● Many log formats
● Many places to go look for information
● What if we could just look in one place and see everything?
- 4. Advanced Analysis
● The logs are too low-level
● The servers need the existing capacity
● The amount of data to be analyzed is huge
● Some analysis needs to be across multiple servers
● What if we want to change the analysis algorithms?
● How can we do analysis in the most flexible way possible?
- 5. Proactive Support
● See problems coming before they become crises
● Watch for errors and exceptions
● Track performance of the application
● Track usage of the application
● Enable checks we haven’t thought of yet
- 6. Some Analysis Questions
● What errors happen, and how often?
● Who did what, when?
● How long did it take to do a task?
● What else was happening on the server?
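As a rough illustration of the first and third questions, here is a minimal Python sketch over a handful of already-parsed log events. The event fields (`time`, `server`, `user`, `kind`, `task`, `error`) and the sample data are hypothetical and are not taken from the Night Owl system.

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed log events; field names and values are illustrative only.
EVENTS = [
    {"time": "2013-09-01T10:00:00", "server": "app1", "user": "alice", "kind": "start", "task": "t1"},
    {"time": "2013-09-01T10:00:30", "server": "app1", "user": "alice", "kind": "error", "error": "TimeoutException"},
    {"time": "2013-09-01T10:01:30", "server": "app1", "user": "alice", "kind": "end", "task": "t1"},
    {"time": "2013-09-01T10:02:00", "server": "app2", "user": "bob", "kind": "error", "error": "TimeoutException"},
    {"time": "2013-09-01T10:03:00", "server": "app2", "user": "bob", "kind": "error", "error": "NullPointerException"},
]

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def error_counts(events):
    """What errors happen, and how often? Count error events by error name."""
    return Counter(e["error"] for e in events if e["kind"] == "error")


def task_durations(events):
    """How long did each task take? Pair start/end events by task id."""
    starts = {}
    durations = {}
    # ISO-style timestamps sort correctly as plain strings.
    for e in sorted(events, key=lambda ev: ev["time"]):
        when = datetime.strptime(e["time"], TIME_FORMAT)
        task = e.get("task")
        if e["kind"] == "start":
            starts[task] = when
        elif e["kind"] == "end" and task in starts:
            durations[task] = (when - starts.pop(task)).total_seconds()
    return durations


if __name__ == "__main__":
    print(error_counts(EVENTS))
    print(task_durations(EVENTS))
```

The same grouping idea extends to "who did what, when" by keying on `user` and `server` instead of the error name.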
- 7. Constraints
● Very little budget – as much free stuff as possible
● Can’t use client machines
● Communications need to be secure
● Large amounts of data (GB/day/client)
● Minimize support’s dependence on client IT
- 9. Hadoop
● We have a lot of data (~2 GB/day with 3 clients)
● We need to process it in reasonable time
● We can’t afford a big machine for this
● We have lots of old machines lying around
● Sounds like a job for the elephant! But what about queries?
- 10. Elasticsearch
● Query performance on base Hadoop is painful
● Ad-hoc queries are required
● Hadoop integration
● Cluster deployment
● Looks promising! How do we get the data into the server?
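To make "ad-hoc queries" concrete, the sketch below builds the JSON body of an Elasticsearch search request in Python without sending it anywhere. It assumes the search body format of the 0.90 era, in which a `query_string` query is combined with a `date_histogram` facet; later Elasticsearch releases replaced facets with aggregations. The field names `level` and `host` and the helper name `build_error_search` are hypothetical; `@timestamp` is the timestamp field Logstash writes by default.

```python
import json


def build_error_search(query_text, interval="hour", size=10):
    """Build a search body: a Lucene-syntax query plus an events-over-time facet."""
    return {
        # Free-form query, so support staff can change it without new code.
        "query": {"query_string": {"query": query_text}},
        # 0.90-era facet that buckets matching events by time interval.
        "facets": {
            "events_over_time": {
                "date_histogram": {"field": "@timestamp", "interval": interval}
            }
        },
        # Number of matching events to return alongside the facet counts.
        "size": size,
    }


# Hypothetical field names; the real ones depend on how the logs are parsed.
body = build_error_search("level:ERROR AND host:app1")

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```

Because the query text is just a string, a new question becomes a new string rather than a new batch job, which is the contrast with base Hadoop that this slide draws.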
- 11. Logstash
● Handle many sources, not just logs
● Fan-in architecture to server
● Compressed, SSL encrypted data
● Can offload some logic on the client if desired
● Massively configurable
● Output to Elasticsearch
● Great! Now how about visualization?
- 12. Kibana
● Backed by Elasticsearch
● Supports dynamic queries
● View information over time
● Built-in support for Logstash
● Configurable, shareable dashboards
- 14. Hadoop Processing
● Pig scripts process the data
● Wonderdog from InfoChimps to integrate Pig and Elasticsearch
– There are issues:
• Cluster stability using Wonderdog
• Wonderdog Pig interface has not been updated in a while
• Currently evaluating elasticsearch-hadoop project from Elasticsearch.org
● Analysis results are stored in Elasticsearch for ease of access
- 18. Software
● Ubuntu 12.04.2 LTS (Precise)
● Cloudera CDH 4.3.1
– Hadoop 2.0.0
– HBase 0.94
– Hive 0.10
– Pig 0.11
● Elasticsearch 0.90.3
● Logstash 1.1.12
● Kibana 3 M3
- 19. Hardware Architecture
● 27 node cluster of commodity machines
● 42 TB of disk space
● Connected via 10 gigabit switch
● Each machine has:
– 8 GB RAM
– 2 TB SATA HDD
– Gigabit Ethernet
- 20. Performance
● Over the month of September:
– 188 million events ingested from 3 clients
– 57.5 GB storage used (1.92 GB / day)
● At that rate, 42 TB is enough space for:
– 142 billion events
– 60 years of data from these clients
– 1 year of data from 180 clients at the same volume per client
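The capacity figures above can be roughly reproduced with a few lines of arithmetic. The sketch below assumes a 30-day September, 1 TB = 1000 GB, and that all 42 TB is usable for event storage with no replication or index overhead; none of these assumptions is stated on the slide. Under them the projection comes to about 60 years and about 180 client-years, with roughly 137 billion events (about 141 billion if 1 TB = 1024 GB), a little under the slide's 142 billion, so the original presumably used slightly different rounding.

```python
# Figures taken from the slide.
EVENTS_INGESTED = 188_000_000
STORAGE_USED_GB = 57.5
CLIENTS = 3
CAPACITY_TB = 42

# Assumptions not stated on the slide.
DAYS_IN_SEPTEMBER = 30
GB_PER_TB = 1000

# Daily ingest rate across all three clients.
gb_per_day = STORAGE_USED_GB / DAYS_IN_SEPTEMBER

# How long the cluster lasts at that rate.
capacity_gb = CAPACITY_TB * GB_PER_TB
days_of_capacity = capacity_gb / gb_per_day
years_of_capacity = days_of_capacity / 365

# Events that fit, assuming the same average event size.
events_per_gb = EVENTS_INGESTED / STORAGE_USED_GB
events_capacity = events_per_gb * capacity_gb

# One year of data for N clients is the same as N client-years.
client_years = years_of_capacity * CLIENTS

if __name__ == "__main__":
    print(f"{gb_per_day:.2f} GB/day")
    print(f"{years_of_capacity:.1f} years for {CLIENTS} clients")
    print(f"{events_capacity / 1e9:.0f} billion events")
    print(f"{client_years:.0f} client-years")
```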
- 21. Resources
● Elasticsearch - http://www.elasticsearch.org/overview/
• http://github.com/elasticsearch/elasticsearch
● Logstash - http://www.elasticsearch.org/overview/logstash/
• https://github.com/logstash/logstash
● Kibana - http://www.elasticsearch.org/overview/kibana/
• https://github.com/elasticsearch/kibana
● ES – Hadoop - http://www.elasticsearch.org/overview/hadoop/
• http://github.com/elasticsearch/elasticsearch-hadoop
● Cloudera - http://www.cloudera.com/
- 22. World Headquarters
3100 Main Street, Suite #900
Houston, TX 77002
Phone: +1 713-335-5151
Sales: +1 855-846-0641
Fax: +1 713-335-8144
PROS Germany GmbH
Feringastrasse 6
85774 Unterfoehring
Munich
Tel.: +49 89 99216 270
Fax: +49 89 99216 200
European Headquarters - United Kingdom
Lakeside House
1 Furzeground Way
Stockley Park
Heathrow
UB11 1BD
Phone: +44 (0) 208 622 3555
Fax: +44 208 622 3230
Regional Office - Austin, TX
3600 Parmer Lane, Suite 205
Austin, Texas 78727
Regional Office - Cary, North Carolina
1000 Centre Green Way, #200
Cary, NC 27513
Phone: +1 919-228-6334