Mining Your Logs - Gaining Insight Through Visualization
2. Raffael Marty
• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PricewaterhouseCoopers
Applied Security Visualization
Publisher: Addison Wesley (August, 2008)
ISBN: 0321510100
Logging as a Service 2 © by Raffael Marty
3. Agenda
• Log Analysis
• History
• Log Architectures
• What’s Working and What’s Not?
• Future Needs
• Data Visualization
• Visualization Concepts
• Security Visualization Use-Cases
4. Log Analysis
10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/
HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/"
"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/
533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
TYhzVH8AAAEAAGOkBOQAAADA 655268
2010-12-28T18:12:10.031+00:00 frontend2-raffy
syslog-ng[19600]: syslog-ng starting up;
version='3.1.1'
2011-01-10T21:27:04.820+00:00 frontend2-raffy
kernel: : [ 664.107313] blocked inbound
IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:
6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255
LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126
PROTO=UDP SPT=17500 DPT=17500 LEN=160
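The kernel firewall record above is mostly a run of KEY=value tokens, so a rough first pass at field extraction needs no source-specific grammar. A minimal sketch, assuming the wrapped record has been joined back onto one line; the helper name `parse_kv` is made up for illustration:

```python
import re

# KEY=value tokens; the value may be empty (for example "OUT=").
KV_PATTERN = re.compile(r"(\w+)=(\S*)")

def parse_kv(record: str) -> dict:
    """Return the KEY=value pairs of a log record as a dict.

    Later duplicates overwrite earlier ones. The sample record has two
    LEN fields, so this naive approach silently drops one of them.
    """
    return {key: value for key, value in KV_PATTERN.findall(record)}

record = (
    "2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] "
    "blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 "
    "PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)
fields = parse_kv(record)
print(fields["SRC"], fields["DST"], fields["PROTO"], fields["DPT"])
```

The duplicate LEN field is a small reminder that even regular-looking formats need care before the extracted fields can be trusted.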
5. History
• 1980 Eric Allman develops syslogd(8)
• 1996 Intellitactics
• 1997 Tivoli Risk Manager developed by IBM Research
in Zurich (later Zurich Correlation Engine, ZCE)
• 1999 - 2010 A number of log management / SIEM
players enter the market (software, appliances)
• 2000 ArcSight - 2010 sold for $1.65bn to HP
• 2009 Loggly (logging as a service)
6. History - The Other View
• Network management (SNMP)
• IDS false positive reduction
• Security monitoring (multiple data sources)
• Unification of NOC and SOC (failed?)
• Application monitoring (moving up the stack)
- original tools failed due to architectural constraints
- new approaches have been presented
8. Log Management Today
less tools
DIY
• grep
• Perl
• SQL
MapReduce
• Open source
Log Management
• Open source
• Commercial
CEP and SIEM
• Open source
• Commercial
Advanced Analytics
• Not log specific!
9. Open Source Tools
• graylog2 • lire • MS Logparser
• logstash • LogSurfer • Sguil
• swatch • SEC • Octopussy
• tenshi • LogHound • Sagan
• logwatch • slct
• OSSEC • log2timeline
• snare • logzilla
• lasso • OSSIM
This list is likely incomplete!
10. Commercial Tools
this list is likely incomplete!
pixlcloud | Visualization in the Cloud 10 © PixlCloud LLC 2011
12. Log Mgmt Architecture
Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database
Processing:
- indexing
- context storage
- clustering
Storage:
- on board
- external storage array
- clusters
13. Log Mgmt Architecture
Data flow labels: raw; normalized or raw
Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database
Processing:
- indexing
- context storage
- clustering
Data Access:
- free-text search
- field-based search
- tagging schemas
14. Agents and Connectors
• piece of code to transport logs to a central location
• features:
- batch
- compress
- encrypt
- sign
- fail-over
• often additional features:
- parse
- normalize
- aggregate
- enrichment (context)
• special protocols:
- OPSEC, SDEE
- Windows
• file-based collection
• database collection
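Three of the agent features listed on this slide (batch, compress, sign) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not how any particular agent works: the shared secret, the dict layout, and the function names `pack_batch` / `unpack_batch` are all hypothetical, and encryption is left out because the standard library has no symmetric cipher.

```python
import gzip
import hashlib
import hmac

# Hypothetical secret shared by agent and collector (assumption).
SECRET_KEY = b"example-shared-secret"

def pack_batch(lines: list[str], key: bytes = SECRET_KEY) -> dict:
    """Batch log lines, gzip them, and sign the compressed payload."""
    # Assumes individual log lines contain no embedded newlines.
    payload = gzip.compress("\n".join(lines).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature, "count": len(lines)}

def unpack_batch(batch: dict, key: bytes = SECRET_KEY) -> list[str]:
    """Verify the signature, then decompress and split the batch."""
    expected = hmac.new(key, batch["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, batch["signature"]):
        raise ValueError("signature mismatch: batch was altered in transit")
    return gzip.decompress(batch["payload"]).decode("utf-8").split("\n")

batch = pack_batch([
    "syslog-ng[19600]: syslog-ng starting up; version='3.1.1'",
    "kernel: blocked inbound IN=eth0 SRC=10.0.20.109 DST=10.0.20.255",
])
print(batch["count"], len(batch["payload"]), unpack_batch(batch)[0])
```

Signing the compressed bytes rather than the plain text lets the collector reject a tampered batch before spending any effort on decompression.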
15. SIEM Architecture
Diagram labels: raw, normalized, context / tagging (asset context, identity context, ...), RDBMS
16. SIEM Architecture
• RDBMS schema
- Fixed number and type of fields
- New data sources with new fields?
‣ overloading
• RDBMS clusters are expensive and scale poorly
• Need a parser for every data source
• Slow historical data queries
• Hard to configure database efficiently
- because of different use-cases
17. SIEM Architecture Benefits
• Parsed data enables
- real-time correlation
- real-time statistics
- data augmentation (context) close to source
• Unified data access language
- over a fixed set of fields
• Real-time dashboards
18. Search vs. SIEM
• Full-text indexing
Example search: denied
- use index to find occurrences of ‘denied’
• Parsing at search time
Example search: user=rmarty
- use index to find ALL occurrences of ‘rmarty’
- apply parser to results
- remove results where user is not rmarty
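The two searches on this slide can be sketched with a toy inverted index. Everything here is an assumption made for illustration: the four sample events are invented, and the tokenizer and the `user=` regex are far simpler than what a real search engine would use.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")
USER_FIELD = re.compile(r"\buser=(\w+)")

def build_index(events):
    """Map each lowercase token to the set of event positions containing it."""
    index = defaultdict(set)
    for pos, event in enumerate(events):
        for token in TOKEN.findall(event.lower()):
            index[token].add(pos)
    return index

def full_text_search(events, index, term):
    """Full-text search: a pure index lookup, no parsing involved."""
    return [events[pos] for pos in sorted(index.get(term.lower(), ()))]

def field_search(events, index, user):
    """Search-time parsing for user=<user>."""
    # Step 1: use the index to find ALL occurrences of the value.
    candidates = full_text_search(events, index, user)
    results = []
    for event in candidates:
        # Step 2: apply the parser to each candidate.
        match = USER_FIELD.search(event)
        # Step 3: drop candidates where the user field is something else.
        if match and match.group(1) == user:
            results.append(event)
    return results

# Invented sample events (not taken from the slides).
events = [
    "sshd: access denied user=rmarty src=10.0.20.9",
    "sshd: session opened user=admin by rmarty",
    "httpd: GET /home/rmarty/index.html 200",
    "sshd: access denied user=admin src=10.0.20.109",
]
index = build_index(events)
print(len(full_text_search(events, index, "rmarty")))
print(field_search(events, index, "rmarty"))
```

The index lookup for the value returns more candidates than the field search finally keeps, which is the cost of deferring parsing to search time.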
19. New SIEM - Hybrid Models
• Use parsers for known data sources
• Collect everything else
• Index all data and use index for search
• Correlate parsed data
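A rough sketch of the hybrid idea: run parsers where a source is known, keep the raw line in every case so it can still be indexed, and hand only the parsed records to correlation. The two regexes, the sample lines, and the record layout are assumptions for illustration only.

```python
import re

# Hypothetical parsers for two "known" sources (illustrative regexes only).
PARSERS = {
    "sshd": re.compile(
        r"sshd\[\d+\]: (?P<status>Failed|Accepted) password "
        r"for (?P<user>\S+) from (?P<src>\S+)"
    ),
    "firewall": re.compile(r"blocked inbound .*SRC=(?P<src>\S+) DST=(?P<dst>\S+)"),
}

def ingest(line: str) -> dict:
    """Keep the raw line always; add parsed fields when a parser matches."""
    for source, pattern in PARSERS.items():
        match = pattern.search(line)
        if match:
            return {"raw": line, "source": source, "fields": match.groupdict()}
    # Unknown source: collected anyway, searchable through the raw text.
    return {"raw": line, "source": None, "fields": {}}

# Invented sample lines (not taken from the slides).
lines = [
    "Mar 22 10:00:52 frontend2 sshd[4711]: Failed password for rmarty from 10.0.20.9 port 52144 ssh2",
    "kernel: blocked inbound IN=eth0 OUT= SRC=10.0.20.109 DST=10.0.20.255 LEN=180",
    "myapp: cache warmed in 42 ms",
]
records = [ingest(line) for line in lines]
parsed = [r for r in records if r["source"] is not None]  # input for correlation
print(len(records), "collected,", len(parsed), "parsed")
```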
20. Categorization and Tagging
•How do you find all failed logins across any data source?
security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...
•Does not scale
- for new data sources
- for new events of existing sources
•Define a ‘taxonomy’ for all events
•Map events into taxonomy: id -> object, action, status
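Mapping events into a taxonomy can be sketched as a lookup table from a source-specific identifier to an (object, action, status) triple. The three entries below are lifted from the example search on this slide; the table layout and the `categorize` helper are assumptions for illustration.

```python
from typing import Optional

FAILED_LOGIN = {"object": "authentication", "action": "login", "status": "failure"}

# (source, lowercase id or message fragment, category). A real mapping
# would hold one entry per event type of every supported source.
MAPPINGS = [
    ("security", "538", FAILED_LOGIN),
    ("sshd", "authentication failure", FAILED_LOGIN),
    ("sshd", "failed password", FAILED_LOGIN),
]

def categorize(source: str, message: str) -> Optional[dict]:
    """Return the taxonomy entry for an event, or None if it is unmapped."""
    text = message.lower()
    for mapped_source, fragment, category in MAPPINGS:
        if source == mapped_source and fragment in text:
            return category
    return None

print(categorize("sshd", "Failed password for rmarty from 10.0.20.9"))
print(categorize("sshd", "session opened for user rmarty"))
```

Once every source is mapped, "all failed logins" becomes one taxonomy query instead of an ever-growing OR expression.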
21. Content Creation
• Rules, dashboards, reports, searches can use
taxonomy:
object=authentication AND action=login AND status=success
• All failures related to files:
object=file AND status=failure
• Approach scales well
• Mixing with other fields:
action=login AND user=rmarty
• Huge effort to build and maintain mappings
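The queries on this slide can be evaluated against categorized events with very little code. A minimal sketch, assuming events are plain dicts that already carry taxonomy fields, and supporting only `field=value` clauses joined by AND; the sample events are invented.

```python
def parse_query(query: str) -> dict:
    """Turn 'a=b AND c=d' into {'a': 'b', 'c': 'd'} (AND only, no quoting)."""
    conditions = {}
    for clause in query.split(" AND "):
        field, _, value = clause.strip().partition("=")
        conditions[field] = value
    return conditions

def search(events: list, query: str) -> list:
    """Return the events that satisfy every clause of the query."""
    conditions = parse_query(query)
    return [
        event for event in events
        if all(event.get(field) == value for field, value in conditions.items())
    ]

# Invented, already categorized events.
events = [
    {"object": "authentication", "action": "login", "status": "success", "user": "rmarty"},
    {"object": "authentication", "action": "login", "status": "failure", "user": "raffy"},
    {"object": "file", "action": "open", "status": "failure", "user": "rmarty"},
]
print(search(events, "object=file AND status=failure"))
print(search(events, "action=login AND user=rmarty"))
```

Taxonomy fields and ordinary fields such as `user` are queried the same way, which is what makes mixing them straightforward.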
22. Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs
23. Loggly
Architecture diagram: Data Sources -> Loggly -> Consumers
- Data collection: Proxies, API
- Distributed indexing and processing: Indexers and Search Machines
- Distributed data store: Log Archive
- Data access: API, user interface, UI extensions
24. Tool Usage
                      | DIY                             | MR                               | Log Mgmt                         | SIEM                           | LaaS
data sources          | known, only a few               | known, only a few                | unknown, many                    | known, many                    | -
analysis use-cases    | known, one or a few             | exploration, large-scale         | unknown, many                    | unknown, many                  | extend platform
dynamic use-cases     | no                              | no                               | yes                              | yes                            | yes
real-time correlation | no                              | no                               | no                               | yes                            | extend platform
cost                  | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription
Should you rather do it yourself (DIY)?
26. What’s Working
• Log collection
• Log centralization
• Alerting on a priori known patterns
• Solving specific, known use-cases for sets of
known data sources, e.g.,
- monitoring privileged access to financial servers
- generating compliance reports
- security forensics
27. What’s Not Working
• Log formats are all over the place and not documented
Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576
• No logging guidelines / developer education
• Parsing is broken
- based on regexes
- numerous mistakes
- doesn’t scale
28. What’s Not Working
• Normalization is broken:
- IP to hostnames (when to do DNS lookup)
- usernames (rmarty vs. ram vs. raffy)
• Categorization / Taxonomy
- doesn’t scale
- is buggy
- is always out of date
- expensive
• Prioritization has no working formula
• Anomaly detection is voodoo!
29. What Does It Mean?
• We don’t understand our data
• Security Operations Center (SOC) monitors all
corporate data sources. Analysts
- don’t know all the applications
- don’t know all the setups
- don’t know what log records are ‘normal’ behavior
--> Need tools to enable log owners to work
with their data
31. We Need Better Tools
• We will have more and more data to deal with
- SIEM needs to support new distributed, scalable data management
technologies
• More and more application layer data
- How are we going to deal with all the parsing / entity extraction?
- We need logging standards and guidelines
• How do we help analysts understand the data?
- What is important and what is not?
- Mapping problems to business process, business risk!
33. Data/Log Visualization
• Exploration and Discovery
• Answer Questions
• Communicate Information
• Support Decisions
34. Security Visualization
• We are nowhere!
• Visualization is an afterthought
• Sec Viz dichotomy
• Tools are lacking fundamental capabilities
• Users don’t understand the data, so how can
they understand visuals?
36. The Analysis Approach
Overview first -> Zoom -> Details on demand
Principle by Ben Shneiderman
40. Legible / Usable Graphs
Reducing non-data ink!
42. Ode to the Pie
48. Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans - bright spots
• Thanks to Chris Horsley
51. Firewall Log
Figure axes: Port, Source IP, Destination IP
52. IDS Sig Tuning - Treemap
Hierarchy: Source, Destination, Signature, Number of Events
Color: Priority
Size: Number of alerts
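The data behind such a treemap is a nested aggregation of alerts. A minimal sketch of that aggregation follows, with invented alerts and field names, and assuming that a larger priority number means a more severe alert; the drawing itself is left to whichever treemap tool is used.

```python
from collections import defaultdict

def build_hierarchy(alerts):
    """Nest alerts as source -> destination -> signature -> leaf.

    Each leaf holds the alert count (treemap size) and the highest
    priority seen (treemap color).
    """
    tree = defaultdict(lambda: defaultdict(dict))
    for alert in alerts:
        leaf = tree[alert["src"]][alert["dst"]].setdefault(
            alert["signature"], {"count": 0, "priority": alert["priority"]}
        )
        leaf["count"] += 1
        leaf["priority"] = max(leaf["priority"], alert["priority"])
    return tree

# Invented IDS alerts (not taken from the slides).
alerts = [
    {"src": "10.0.20.9", "dst": "10.0.20.255", "signature": "ICMP PING", "priority": 1},
    {"src": "10.0.20.9", "dst": "10.0.20.255", "signature": "ICMP PING", "priority": 1},
    {"src": "10.0.20.9", "dst": "10.0.20.255", "signature": "SSH brute force", "priority": 3},
    {"src": "10.0.20.109", "dst": "10.0.20.9", "signature": "ICMP PING", "priority": 1},
]
tree = build_hierarchy(alerts)
for src, destinations in tree.items():
    for dst, signatures in destinations.items():
        for signature, leaf in signatures.items():
            print(src, dst, signature, leaf["count"], leaf["priority"])
```

Large, low-priority leaves are the natural candidates when deciding which signatures to tune or disable.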
54. Visualization Future
• A solution to entity extraction
• Dynamic and interactive displays
• Computer aided intelligence / visualization
- Computer supported exploration
- Highly interactive
• Expert system that captures domain knowledge
- Collaborative
55. http://secviz.org
Share, discuss, challenge, and learn about security
visualization.
• List: secviz.org/mailinglist
• Twitter: @secviz