This document discusses how business analytics is shifting from relying solely on structured data to leveraging new unstructured data sources such as machine data. Traditional analytics approaches involve rigid schemas and long design cycles, while Splunk allows indexing and searching of heterogeneous machine data in real time without schemas. Splunk delivers value across IT, security, and business by integrating machine data with structured context data, supporting use cases such as customer analytics, product analytics, and digital intelligence.
2. Target Market Trends
• “Feeding transactional data into a traditional data warehouse no longer represents the extent of capabilities necessary for BI.”
• “The simple idea of building a traditional data warehouse to support a BI platform is no longer sufficient.”
• “… require new information management capabilities to integrate information from disparate, external and unstructured information sources.”
3. Traditional Analytics
Types:
• Business Intelligence
• Data mining
• OLAP
• Plain Analytics
Uses:
• Get a better sense of operations
• Cut costs
• Improve decision making
• Identify inefficient processes, which can lead to new business opportunities and process reengineering
Challenges:
• Raw information usually lives in decoupled systems or is spread across distributed systems
• Difficult to consolidate
• Involves going through the typical SDLC, which takes a lot of time
4. Typical Process for Structured Data
[Diagram: Applications, Database, ETL, Application Connector, Data Warehouse, Analytics Tool]
Early Structure Binding
• Decide what questions to ask
• Design the data schema
• Normalize the data
• Write database insertion code
• Create the queries
• Feed the results into an analytics tool
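The steps above can be sketched in a few lines of Python with the standard-library `sqlite3` module. This is only an illustration under stated assumptions: the table names, columns, and sample rows are hypothetical and do not come from the slides. It shows that a question decided up front is easy to answer, while a question nobody planned for (here, a breakdown by `device`) fails until the schema and insertion code are redesigned.

```python
import sqlite3

# Hypothetical schema and rows, used only to illustrate early structure binding.
conn = sqlite3.connect(":memory:")

# Steps 1-3: the question ("how many orders per customer?") is decided first,
# then a normalized schema is designed around it.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_date  TEXT NOT NULL
    );
""")

# Step 4: database insertion code written against that schema.
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, "2004-03-15"), (11, 2, "2005-07-01")])

# Step 5: the query that answers the planned question.
result = conn.execute("""
    SELECT c.name, COUNT(o.order_id)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.name
""").fetchall()

# Step 6: a stand-in for "feed the results into an analytics tool".
for name, order_count in result:
    print(f"{name}: {order_count} order(s)")

# An unplanned question needs a column the schema never captured.
try:
    conn.execute("SELECT device, COUNT(*) FROM orders GROUP BY device")
    new_question_supported = True
except sqlite3.OperationalError:
    new_question_supported = False
```

In this sketch the unplanned query is rejected because `device` was never part of the design, which is the cost of binding structure early.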
5. Business Analytics – Before Splunk
IT/Business Challenges
• Most organizations rely only on structured data for business analytics – not sufficient today!
• New data sources such as machine data are increasingly critical sources of insight – not leveraged by organizations
• Inability to scale / handle the data volume of new sources as data continues to grow
• Inability to deliver real-time insights to the business
• Most today rely on ETL, causing latency in analytics
• Existing solutions unable to do data mash-up across structured and machine data
Business Consequence
• Inability to gain real-time business insights from new data sources
• Business users across functions (sales ops, product managers, marketing, and customer support) cannot leverage new data sources for analytics
• Competitive disadvantage as other companies increasingly leverage machine data for business insights
• Unable to get insights from new data sources with their traditional structured analytics tools
6. Business Analytics – After Splunk
IT/Business Vision
• Deliver real-time business insight from machine data
• Enrich machine data with structured data to provide business context
• Complement existing BI technologies for insight into a new class of data
• Leverage search and interactive dashboards in Splunk or other third-party visualization tools
• Rapid time to value in gaining business insights from machine data
Business Benefits
• Application Analytics – to understand how customers are interacting with various online applications
• Content & Search Analytics – to understand how customers are accessing and searching for content served up over CDNs
• Real-time Sales Analytics – to gain real-time visibility into the products and services that customers are purchasing
• Service Cost Analytics – to gain insight into, for example, call detail records and the cost associated with completing each call
• Online Monetization Analytics – for example, online gaming companies introducing virtual goods and charging for them
• Marketing Analytics – understanding customer click-through for ads helps improve placement, pricing and click-through rates
7. Splunk Delivers Value Across IT and the Business
• Business Analytics
• Digital Intelligence
• Security and Compliance
• IT Operations
• App Management
• Industrial Data
Developer Platform (REST API, SDKs)
>SPLUNK
Small Data. Big Data. Huge Data.
8. Splunk Turns Machine Data into Operational Intelligence
• Applications: web logs; Log4J, JMS, JMX; .NET events; code and scripts
• Networking: configurations, syslog, SNMP, netflow
• Databases: configurations, audit/query logs, tables, schemas
• Virtualization & Cloud: hypervisor; guest OS, apps; cloud
• Linux/Unix: configurations; syslog; file system; ps, iostat, top
• Windows: registry, event logs, file system, sysinternals
• Customer Facing Data: click-stream data, shopping cart data, online transaction data
• Outside the Datacenter: manufacturing, logistics…; CDRs & IPDRs; power consumption; RFID data; GPS data
Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Changes, Tickets
9. Early vs. Late Binding Schema
Early Structure Binding - Traditional
SELECT customers.*
FROM customers
WHERE customers.customer_id NOT IN (
    SELECT customer_id
    FROM orders
    WHERE YEAR(orders.order_date) = 2004
)
Structured Data
• Schema – created at design time
• Homogeneous – must fit into tables or be converted to fit into tables
• Queries – understood at design time for maximum performance
• Must exactly match constraints
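To make the traditional query and the "must exactly match constraints" point concrete, here is a minimal runnable sketch using Python's `sqlite3` module. The sample rows are hypothetical. SQLite has no `YEAR()` function, so the sketch substitutes `strftime('%Y', ...)`; that substitution is an adaptation for SQLite, not part of the slide's query.

```python
import sqlite3

# Hypothetical data; only the table and column names follow the slide's query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        order_date  TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, '2004-03-15'), (11, 2, '2005-07-01');
""")

# Customers with no order in 2004. strftime('%Y', ...) stands in for YEAR().
customers_without_2004_orders = conn.execute("""
    SELECT customers.*
    FROM customers
    WHERE customers.customer_id NOT IN (
        SELECT customer_id
        FROM orders
        WHERE strftime('%Y', orders.order_date) = '2004'
    )
    ORDER BY customers.customer_id
""").fetchall()

# A row that does not fit the schema is rejected outright.
try:
    conn.execute("INSERT INTO orders VALUES (12, NULL, '2004-01-01')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With these rows the query returns Grace and Linus, and the insert with a missing `customer_id` is refused by the `NOT NULL` constraint.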
10. Early vs. Late Binding Schema
Late Structure Binding - Splunk
Structure Data
• Schema-less
• Heterogeneous – can come from any textual source
• Created at search time
• Constantly changing
• Queries/searches can be ad hoc
• No conversion required, no constraints
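The idea of late structure binding can be illustrated with a small Python sketch. It is not how Splunk is implemented and does not use any Splunk API; the event strings, the `key=value` convention, and the helper names `extract_fields` and `search` are all assumptions made for the example. Raw text is kept as-is, and fields are pulled out only when a search runs.

```python
import re
from collections import Counter

# Hypothetical raw events from different sources, stored exactly as written.
RAW_EVENTS = [
    "2011-12-05 07:05:22 action=addtocart product_id=K9-BD-01 status=200",
    "Dec 05 07:05:23 web01 sshd: login failed user=alice",
    "2011-12-05 07:05:24 action=purchase product_id=K9-BD-01 status=200",
    "2011-12-05 07:05:25 action=addtocart product_id=EST-17 status=503",
]

# Assumed convention: fields appear as key=value pairs somewhere in the text.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw):
    """Derive fields from one raw event at search time."""
    return dict(KV_PATTERN.findall(raw))


def search(events, **criteria):
    """Yield the extracted fields of every event matching all criteria."""
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(key) == value for key, value in criteria.items()):
            yield fields


# Two ad hoc questions asked of the same untouched raw data.
cart_adds = Counter(f["product_id"] for f in search(RAW_EVENTS, action="addtocart"))
failed_users = [f["user"] for f in search(RAW_EVENTS) if "user" in f]
```

Neither question required a schema change or a reload: the heterogeneous events stay unconverted, and a new question only needs a new search.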
11. Analytics
Early Structure Binding (days, weeks or months; destructive)
• Decide the question(s) you want to ask
• Design the schema
• Normalize the data and write DB insertion code
• Create SQL and feed into analytics tool
Late Binding Schema (minutes; non-destructive)
• Write data (or events) to log files
• Collect the log files
• Create searches, graphs, and reports using Splunk
12. Example: Business Visibility From Machine Data
Customer interacts with service online or from any device
Machine Data (from customer interaction)
66.57.19.112 ..[05/Dec/2011 07:05:22:152] "GET /card.do?action=addtocart&itemid=EST-17&product_id=K9-BD-01&JSESSIONID=SD7SLSFF8ADFF8 HTTP 1.1" 200 3923 AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2
Fields highlighted in the event: action, product, user session, user browser information
Product Information (correlated with product information from database)
Product_id=K9-BD-01
Product Name=2 TB Portable Drive
Manufacturer=iomega
Geo location Data
Location data based on where the customer purchased / interacted with service
Real-Time Business Insights from Machine Data
– What products are popular in what region?
– Which products are customers leaving in the cart?
– What are interaction paths by devices?
– How can we improve customer experience?
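The correlation in this example can be sketched in Python. This is an illustration only, not Splunk's implementation: the regular expression, the `PRODUCTS` lookup table, and the cleaned-up form of the log line (for instance `JSESSIONID=` as the session parameter separator) are assumptions made so the example runs. Geo lookup is omitted because the slide gives no location values.

```python
import re
from urllib.parse import parse_qs, urlsplit

# The slide's event, with extraction damage repaired (an assumption).
LOG_LINE = (
    '66.57.19.112 ..[05/Dec/2011 07:05:22:152] '
    '"GET /card.do?action=addtocart&itemid=EST-17&product_id=K9-BD-01'
    '&JSESSIONID=SD7SLSFF8ADFF8 HTTP 1.1" 200 3923 '
    'AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2'
)

# Stand-in for the structured product database mentioned on the slide.
PRODUCTS = {
    "K9-BD-01": {"product_name": "2 TB Portable Drive", "manufacturer": "iomega"},
}

# Hypothetical pattern: client IP, quoted request, status, bytes, user agent.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) .*?"(?P<method>[A-Z]+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) (?P<agent>.*)$'
)


def parse_event(line):
    """Extract the fields called out on the slide from one raw event."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognized log line")
    params = parse_qs(urlsplit(match["url"]).query)
    return {
        "ip": match["ip"],
        "status": int(match["status"]),
        "action": params["action"][0],
        "product_id": params["product_id"][0],
        "session": params["JSESSIONID"][0],
        "browser": match["agent"],
    }


def enrich(event, products):
    """Add business context from structured data to a machine-data event."""
    return {**event, **products.get(event["product_id"], {})}


event = enrich(parse_event(LOG_LINE), PRODUCTS)
```

After enrichment the event carries both what happened (an `addtocart` action in a given session) and what it means for the business (the product name and manufacturer), which is what questions such as "which products are customers leaving in the cart?" need.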
13. Getting Structured Data In Splunk
Log files
CSV lookup
Splunk Connector
• Access data at scale
• In real time
• Easy set-up & maintenance
Structured databases
Applications
Web Servers
Other systems
14. DB Connect: Business Context to Machine Data
Structured Data > Machine Data > Business Analytics
• Rate plans, customer profile, geo location > Device activation, RADIUS, application logs > Sales Analytics
• Customer profile, service subscription > Application, server and network logs > Customer Analytics
• Product descriptions, customer profile > Application logs, authentication logs > Product Analytics
15. Getting Business Insights from Splunk
User Interface: Splunk (Dashboards, Searches, Pivot)
User Interface: Third Party (Schedule, SDK/APIs, ODBC)
16. Positioning Splunk for Business Analytics
>New class of data for business analytics
>Enrich machine data with structured data
>Real-time business insights
>Complement traditional BI Tools
17. Splunk Complements Existing BI Tools
Features | Splunk | Leading BI Tools
Focus | Platform for real-time operational intelligence | Data visualization and business intelligence software
Value | Collect, index, search, monitor, report on, analyze massive streams of machine data | Analyze, visualize and share structured data
Users | IT, Operations, Security, Developers, Analysts, Business Users (as consumers) | Business Users and Analysts (already using data discovery tool)
Use Cases | IT Ops, App Management, Security, Digital Intelligence, Business Analytics from machine data, Internet of Things | Marketing, HR, Sales Reporting, Supply Chain Analysis
18. Scales to TBs/day and Thousands of Users
• Automatic load balancing linearly scales indexing
• Distributed search and MapReduce linearly scale search and reporting
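The MapReduce pattern named on this slide can be sketched generically in Python. This is not Splunk's search implementation; the shard contents and the function names are hypothetical. Each shard (standing in for one indexer) computes a partial result over its own events, and a final step merges the partials.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical events, already split across three indexers.
INDEXER_SHARDS = [
    [{"action": "purchase", "product_id": "K9-BD-01"},
     {"action": "addtocart", "product_id": "EST-17"}],
    [{"action": "purchase", "product_id": "EST-17"}],
    [{"action": "purchase", "product_id": "K9-BD-01"},
     {"action": "purchase", "product_id": "K9-BD-01"}],
]


def map_shard(events):
    """Map step: count purchases per product within a single shard."""
    return Counter(e["product_id"] for e in events if e["action"] == "purchase")


def reduce_counts(partials):
    """Reduce step: merge the per-shard counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_purchase_counts(shards):
    """Run the map step on every shard concurrently, then reduce."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(map_shard, shards))
    return reduce_counts(partials)


totals = distributed_purchase_counts(INDEXER_SHARDS)
```

Because each map step touches only its own shard, adding shards adds capacity without changing the search itself, which is the property the slide describes as linear scaling.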
19. Summary
> Real Time Architecture
> Universal Machine Data Platform
> Schema on the Fly
> Agile Reporting and Analytics
> Scales from Desktop to Enterprise
> Fast Time to Value
> Passionate and Vibrant Community
Editor's Notes
Key Challenges
Requires pre-defined schema – limits flexibility
Difficult to handle data diversity in real time
Adding new and changing data sources is hard
Scaling for large volumes of data is difficult
Time consuming with long deployments