1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Make Streaming Analytics Work for You: The Devil is in the Details
Kanishk Mahajan, Director Product Management, Hortonworks
Ryan Medlin, Director Software Engineering, Neustar
June 2016
Agenda
• Building a Stream Analytics Platform: Requirements
• Building a Stream Analytics Platform: Capabilities
• Use Case: CyberSecurity - Threat Protection
• Q&A
Building a Stream Analytics Platform: Requirements
5 Attributes of a Streaming Platform
• Ingest
• Process
• Analyze
• Visualize
• Respond
Data Flow versus Stream Analytics
• Data Flow
–Ingest and route terabytes of data into a “unified firehose”
–Actively manage the latency and quality of these data flows across highly variable data formats, sizes and speeds
–Produces streams: “unbounded sequences of events”
• Real Time Stream Analytics
–Sub-second event processing with linear scalability to billions of events
–Predictive analytics at scale
–Real-time data aggregation across edge nodes while processing tens of millions of events and hundreds of gigabytes per second
–Guaranteed no data loss, with events processed in order
6 key focus areas for building a Streaming Platform
• Common Abstraction Layer
• Latency
• Lambda Architecture
–“Orchestrate” over static and real time data
• Scale-out
• Rapid Application Development
• Data Visualization
Focus Areas for Streaming Platform
Common Abstraction Layer
–Select one or more streaming engines
–Select one or more cloud providers
–Select one or more event sources
–…Future Proof
Focus Areas for Streaming Platform
Latency
–500 milliseconds or less: Dashboards, Security Incidents, Asset Performance
–20 milliseconds or less: Ad Networks, Preventive Maintenance
–…
Focus Areas for Streaming Platform
Lambda Architecture
–Integrate static and real time data
–Enrichment
–Orchestration of Batch workflows
–Predictive Classification and Scoring
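One way to picture "integrate static and real time data" is the serving-side merge of a Lambda Architecture: a batch view computed offline is combined with a speed-layer view covering events that arrived after the last batch run. The sketch below is illustrative only. The names `build_speed_view`, `merged_count`, `batch_view` and `speed_view` are hypothetical and do not come from the deck or from any specific product.

```python
from collections import Counter

def build_speed_view(recent_events):
    """Count events per key for data that arrived after the last batch run."""
    view = Counter()
    for event in recent_events:
        view[event["key"]] += 1
    return view

def merged_count(key, batch_view, speed_view):
    """Answer a query by adding the batch result to the real-time increment."""
    return batch_view.get(key, 0) + speed_view.get(key, 0)

# Hypothetical data: the batch view is precomputed, the speed view is live.
batch_view = {"sensor-a": 1000, "sensor-b": 250}
recent = [{"key": "sensor-a"}, {"key": "sensor-a"}, {"key": "sensor-c"}]
speed_view = build_speed_view(recent)

print(merged_count("sensor-a", batch_view, speed_view))
```

When the next batch run absorbs the recent events, the speed view for that period would be discarded so nothing is counted twice.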
Focus Areas for Streaming Platform
Scale Out
–Linear Scale out or scale down
–Resource Management
–Handle Transient workloads
Focus Areas for Streaming Platform
Rapid Application Development
–Aggregations
–Filters
–Multi Stream Correlations
–Splits
–Joins
–Normalizations
–Business Rules Editor
Focus Areas for Streaming Platform
Data Visualization
–Time Series Visualizations
–Metrics Dashboarding
–Trends
–Comparisons
–Thresholds
–Custom UI extensions
Real Time Stream Analytics versus Streaming Engine
• Abstracts the underlying streaming engine: Storm, Spark Streaming, Flink, ...
• Out-of-the-box support for multiple (cloud) event sources: Kafka, AWS Kinesis, Azure Event Hubs
• Built-in operators for Complex Event Processing
• Built-in real-time dashboarding: metrics and events
• Pluggable workflow management
• Business Rules Editor and Rapid Application Development framework
• Cloud deployment
• Scalable architecture
• Handles different latency requirements
Building a Stream Analytics Platform: Capabilities
Stream Analytics Platform - Capabilities
• Continuous, Real Time Processing
–Continuous Query and near real time processing of event data
–Distributed, scalable platform for processing continuous unbounded streams of data
–Abstraction across multiple open source stream engine technologies
–Large Scale Event Processing
–No coding approach for event processing logic
• Event Source
–Multiple event source support
–Multiple data format support
–Drag and Drop, no coding approach
• Event Storage
–Multiple event storage support
–Event Retention time period support
–Drag and Drop, no coding approach
Streaming Platform Capabilities - Analytics
Detect Trends
Alert on probable Failures
Descriptive Proactive Predictive Prescriptive
• Rules Calculations on predefined settings
• High Throughput, Low Latency Event Processing Framework
• Train Models Offline
• Provide Query interface and storage of Models
• Integrate Real Time Data with Batch Data
• Connect with Enterprise Applications
Reporting and Dashboarding
Visual Pattern Recognition
Monitor System
Alert before Failure
Take Action on Data
Avoid Failure and Drive Operational Efficiency
Stream Analytics Platform - Capabilities
• Filtering
–Event Selection based on pre-defined criteria
• Transformations
–Aggregations
–Event Prioritization
–Event Deduplication
–Event Time Stamping
–Custom User Defined Functions
• Enrichment
–Pull in reference data for additional context
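The filtering, transformation and enrichment capabilities above can be sketched as a chain of small operators over an event stream. This is a minimal illustration in plain Python, not the API of any streaming engine. The event fields, the `REFERENCE_DATA` lookup table and the 60-degree threshold are all assumptions made up for the example.

```python
import time

# Hypothetical reference data used for enrichment, keyed by device id.
REFERENCE_DATA = {"dev-1": {"site": "plant-a"}, "dev-2": {"site": "plant-b"}}

def select(events, predicate):
    """Filtering: keep only events that match a pre-defined criterion."""
    return (e for e in events if predicate(e))

def deduplicate(events):
    """Event deduplication: drop events whose id has already been seen."""
    seen = set()
    for e in events:
        if e["id"] not in seen:
            seen.add(e["id"])
            yield e

def timestamp(events, clock=time.time):
    """Event time stamping: attach a processing time to each event."""
    for e in events:
        yield {**e, "processed_at": clock()}

def enrich(events, reference):
    """Enrichment: pull in reference data for additional context."""
    for e in events:
        yield {**e, **reference.get(e["device"], {})}

raw = [
    {"id": 1, "device": "dev-1", "temp": 71},
    {"id": 1, "device": "dev-1", "temp": 71},  # duplicate of the first event
    {"id": 2, "device": "dev-2", "temp": 40},  # filtered out by the predicate
    {"id": 3, "device": "dev-2", "temp": 93},
]
hot = select(raw, lambda e: e["temp"] > 60)
results = list(enrich(timestamp(deduplicate(hot)), REFERENCE_DATA))
for event in results:
    print(event)
```

On a truly unbounded stream the `seen` set would grow without limit, so a real deployment would likely bound it by time window or size.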
Stream Analytics Platform - Capabilities
• Correlations
–Correlation of in-flight event data from multiple event streams
• Temporal Calculations
–Time-based calculations on in-flight data in an event stream, e.g. a windowing computation with a time-based window
• Pattern Detection
–Calculating trends on in-flight data in an event stream
–Outlier detection: pattern matching against historical data
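As a rough illustration of a time-based window and of outlier detection against historical data, the sketch below counts events per fixed (tumbling) window and flags a value that lies far from the historical mean. It assumes a finite, in-order list of `(timestamp_seconds, key)` pairs and a simple standard-deviation test. The function names and the threshold of 3 are hypothetical choices, not something the deck prescribes.

```python
from collections import defaultdict
from statistics import mean, pstdev

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, key) events into fixed windows and count per key."""
    counts = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start][key] += 1
    return {start: dict(per_key) for start, per_key in counts.items()}

def is_outlier(value, history, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    sigma = pstdev(history)
    if sigma == 0:
        return value != mean(history)
    return abs(value - mean(history)) / sigma > threshold

events = [(0.5, "login"), (3.2, "login"), (9.9, "login"),
          (10.1, "login"), (14.0, "logout")]
windows = tumbling_window_counts(events, window_seconds=10)
print(windows)

# Compare a new per-window count with counts observed in earlier windows.
print(is_outlier(50, history=[3, 4, 3, 5, 4]))
```

A production engine would also have to deal with late or out-of-order events, which this sketch ignores.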
Stream Analytics Platform - Capabilities
• Serialization/Deserialization
–Marshalling/Unmarshalling event data into a series of bytes
–Drag and Drop, no coding support
• Rule Engine
–Support for business rules (business directives that impact a decision)
–Support for invoking external business rule engine
• Visualization
–Reporting and Dashboarding
–Visual Pattern Recognition
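To make "marshalling/unmarshalling event data into a series of bytes" concrete, here is a small round trip using only the Python standard library. JSON with a 4-byte length prefix is an assumption chosen for the sketch. A real platform might use Avro or another schema-based format instead.

```python
import json
import struct

def serialize(event):
    """Marshal an event into bytes: 4-byte big-endian length + UTF-8 JSON."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def deserialize(buffer):
    """Unmarshal every length-prefixed event found in a byte buffer."""
    events, offset = [], 0
    while offset < len(buffer):
        (length,) = struct.unpack_from(">I", buffer, offset)
        offset += 4
        events.append(json.loads(buffer[offset:offset + length].decode("utf-8")))
        offset += length
    return events

# Two events concatenated on the wire, then decoded back into dictionaries.
wire = (serialize({"device": "dev-1", "temp": 71})
        + serialize({"device": "dev-2", "temp": 93}))
print(deserialize(wire))
```

The length prefix lets a reader split a continuous byte stream back into individual events without scanning for delimiters.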
Stream Analytics - Capabilities Summary
# Capability
1. Filtering
2. Transformations
3. Enrichment
4. Correlations
5. Temporal Calculations
6. Pattern Detection
# Capability
1. Continuous Real Time Processing
2. Event Sources
3. Event Storage
4. Temporal Calculations
5. Rules Engine
6. Visualization
# Capability
7. Filtering
8. Transformation
9. Event Storage
10. Enrichment
11. Correlations
12. Pattern Detection
Use Cases
CyberSecurity - Threat Protection
CyberSecurity - Threat Protection Use Case
CyberSecurity - Threat Protection - Steps
Applying Streaming Analytics
• Expose
• Provide Visibility into your enterprise
• Passively watch network traffic and construct events to describe the activity it sees.
• “Ingest”
• Transmit
• Compress, Encrypt, De-identify event data
• Send to the cloud for centralized log retention, real-time threat analysis and incident investigation
CyberSecurity - Threat Protection - Steps
Applying Streaming Analytics
• Prioritize
• Aggregate intelligence across all control points to identify and prioritize those systems that remain compromised and require immediate remediation.
• Allow analysts to “zero in” on just those events of most importance.
• Significantly reduce the number of incidents that security analysts need to investigate.
• Combine global telemetry from cyber intelligence networks with local customer context across endpoints, networks and email, to uncover attacks that would otherwise evade detection.
• “Correlate” and “Enrich”
CyberSecurity - Threat Protection - Steps
Applying Streaming Analytics
• Analyze
• Create comprehensive detection rules, behavioral analytics and guided investigations to detect the latest threats.
• Enable quick and nimble data exploration across billions of events to proactively hunt for hidden indicators of compromise (IOCs).
• Provide agile investigation tooling/UI to pivot from one indicator to the next, reconstruct the attack storyline and plan a forceful response to disrupt the attack.
• Allow security analysts to visualize all related attack components, e.g. all files used in the attack, all email addresses, all malicious IP addresses, etc.
• “Predictive”, “Proactive”, “Prescriptive” and “Descriptive” Analytics
• “Real Time Analytics across large volumes of data at scale”
CyberSecurity - Threat Protection - Steps
Applying Streaming Analytics
• Remediate
• Generate alerts to expedite validating and scoping the incident.
• Enrich alerts with supporting data, such as threat intelligence and point-in-time context on the users impacted, actions taken and hosts involved, to help validate and scope the incident.
• Allow a security investigator to hunt for indicators of compromise across all your endpoints from a single console.
• Leverage your existing installations of Endpoint Protection and Email Security for enforcement without having to install any new endpoint agents.
• Export rich security intelligence into third-party security incident and event management systems (SIEMs).
• “Notifications”, “Event Storage”, “Workflow”
Copyright © 2015 Neustar, Inc. All Rights Reserved 28
AREAS OF FOCUS AND INNOVATION FOR NEUSTAR
• Security
• User
• Device
• Anomaly Detection
• Multi-Factor Authentication (PKI, etc.)
• Policy Creation/Decisioning/Enforcement
• Local / Edge / Distributed functionality
• Openness and Standardization (OCF/OneM2M/MQTT/CoAP/GitHub)
• Partnerships
DEVICE AND IOT PLATFORM
(Architecture diagram; labels recovered from the slide)
• Devices
• Data Processing Services
• Local & Offline Functionality
• Device and User Authentication
• Device Directory & Identity
• User and Device Policy
• Device Certification & Public/Private Key Distribution/MFA
• OIC/OCF and IoTivity standards-based platform for interoperability & device definition
• Absolute, Group/Collection, Local & cloud
• Verification / Fraud / Anomaly detection
• Location Services
• Future
• Peer-to-Peer communication & Discovery, decentralization
• Secure Discovery & Communication
• OCF based security & communication protocols
• Data pipeline, enrichment and analytics
CORE DATA PLATFORM REQUIREMENTS
• Provide ingestion for all machine data exhaust from all system components (e.g. Anomaly Detection, Monitoring/Alerting).
• Deserialize Avro messages using the given schema(s) and, by default, store each message in HDFS.
• Given a message attribute (for example device type and customer ID, or device ID), execute a rule to determine the next course of action, which should comprise one or more of these actions:
– Notify a user via email, SMS or a 3rd party API
– Notify or pass the message to the registry service/API
– Store data in HDFS
– Store data in a relational data store (HBase/Phoenix or other?)
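The rule-driven routing described above might look roughly like the following. This is a sketch under stated assumptions: messages are taken to be already deserialized into dictionaries, the action functions are stand-ins that only record what they would do, and the rule conditions (a thermostat reading above 90, a "gateway" device type) are invented for illustration.

```python
def notify_user(message, log):
    """Stand-in for an email, SMS or 3rd party API notification."""
    log.append(("notify_user", message["device_id"]))

def notify_registry(message, log):
    """Stand-in for passing the message to the registry service/API."""
    log.append(("notify_registry", message["device_id"]))

def store_hdfs(message, log):
    """Stand-in for the default write to HDFS."""
    log.append(("store_hdfs", message["device_id"]))

def store_relational(message, log):
    """Stand-in for a write to a relational data store."""
    log.append(("store_relational", message["device_id"]))

# Each rule pairs a predicate on message attributes with one or more actions.
RULES = [
    (lambda m: m["device_type"] == "thermostat" and m["value"] > 90,
     [notify_user, store_relational]),
    (lambda m: m["device_type"] == "gateway",
     [notify_registry]),
]

def route(message, log):
    """Store every message by default, then run the actions of matching rules."""
    store_hdfs(message, log)
    for predicate, actions in RULES:
        if predicate(message):
            for action in actions:
                action(message, log)

log = []
route({"device_id": "d1", "device_type": "thermostat", "value": 95}, log)
route({"device_id": "d2", "device_type": "gateway", "value": 0}, log)
print(log)
```

Keeping rules as data (a list of predicate and action pairs) makes it possible to load or edit them without touching the routing code.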
Outgoing Messages
Turn a Light on Manually
Inbound Message Flow
Incoming Messages
INTERESTING PROBLEMS
• We need to work in offline and temporarily offline modes, so data replication is the hardest problem to solve, and the right approach depends on the specific use case.
• We also need to run a subset of the cloud inbound real-time service rules and processing locally. How to reuse this logic both in the cloud and on a Local Gateway is a design challenge with tradeoffs.
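One possible shape for these two problems, offered as a sketch rather than Neustar's actual design: keep the rule logic in a plain function that both the cloud service and the gateway call, and have the gateway buffer messages while offline and replay them on reconnect. `evaluate_rules`, `LocalGateway`, the alert threshold and the bounded buffer are all hypothetical.

```python
from collections import deque

def evaluate_rules(message):
    """Shared rule logic: the same function can run in the cloud or locally."""
    return ["alert"] if message.get("value", 0) > 90 else []

class LocalGateway:
    """Applies rules locally and forwards messages when a connection exists."""

    def __init__(self, send_to_cloud, max_buffered=1000):
        self.send_to_cloud = send_to_cloud
        self.online = True
        # Bounded buffer: when full, the oldest message is dropped (a tradeoff).
        self.buffer = deque(maxlen=max_buffered)

    def handle(self, message):
        actions = evaluate_rules(message)  # decisions still work offline
        if self.online:
            self.send_to_cloud(message)
        else:
            self.buffer.append(message)
        return actions

    def reconnect(self):
        """Mark the gateway online and replay buffered messages in order."""
        self.online = True
        while self.buffer:
            self.send_to_cloud(self.buffer.popleft())

sent = []
gateway = LocalGateway(send_to_cloud=sent.append)
gateway.online = False
print(gateway.handle({"id": 1, "value": 95}))
gateway.reconnect()
print(sent)
```

The bounded buffer trades completeness for memory safety. Use cases that cannot lose data would need durable local storage and a replication protocol instead.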
Q & A

More Related Content

What's hot

Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFHortonworks
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHortonworks
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidDataWorks Summit
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Data Con LA
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Visualizing Big Data in Realtime
Visualizing Big Data in RealtimeVisualizing Big Data in Realtime
Visualizing Big Data in RealtimeDataWorks Summit
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseDataWorks Summit
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouHortonworks
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Hortonworks
 
Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks
 
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...DataWorks Summit
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceDataWorks Summit/Hadoop Summit
 

What's hot (20)

Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDF
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare Transformation
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using Druid
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Intro to Spark & Zeppelin - Crash Course - HS16SJ
Intro to Spark & Zeppelin - Crash Course - HS16SJIntro to Spark & Zeppelin - Crash Course - HS16SJ
Intro to Spark & Zeppelin - Crash Course - HS16SJ
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Visualizing Big Data in Realtime
Visualizing Big Data in RealtimeVisualizing Big Data in Realtime
Visualizing Big Data in Realtime
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 
Building a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with RBuilding a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with R
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
 
Make Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for YouMake Streaming IoT Analytics Work for You
Make Streaming IoT Analytics Work for You
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications
 
Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4
 
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open Source
 

Similar to Make Streaming Analytics work for you: The Devil is in the Details

Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming dataCarolyn Duby
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionMilind Pandit
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk
 
What’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTWhat’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTSplunk
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHortonworks
 
StreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformStreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformAtul Sharma
 
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream Splunk
 
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDruid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDataWorks Summit
 
Splunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk
 
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...Splunk
 
Solving Cybersecurity at Scale
Solving Cybersecurity at ScaleSolving Cybersecurity at Scale
Solving Cybersecurity at ScaleDataWorks Summit
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHaimo Liu
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsSlim Baltagi
 
SplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT BreakoutSplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT BreakoutSplunk
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto MeetupHortonworks
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 

Similar to Make Streaming Analytics work for you: The Devil is in the Details (20)

Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming data
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi Introduction
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
 
What’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTWhat’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINT
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
 
StreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformStreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics Platform
 
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream
New Splunk Management Solutions Update: Splunk MINT and Splunk App for Stream
 
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDruid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
 
Splunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk MINT and Stream Breakout
Splunk MINT and Stream Breakout
 
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
 
Solving Cybersecurity at Scale
Solving Cybersecurity at ScaleSolving Cybersecurity at Scale
Solving Cybersecurity at Scale
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
 
SplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT BreakoutSplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT Breakout
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto Meetup
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 

More from DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Make Streaming Analytics work for you: The Devil is in the Details

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Make Streaming Analytics Work for you: The Devil is in the Details Kanishk Mahajan June 2016 Director Product Management, Hortonworks Ryan Medlin Director Software Engineering, Neustar
  • 2. Agenda
      • Building a Stream Analytics Platform: Requirements
      • Building a Stream Analytics Platform: Capabilities
      • Use Case
          – CyberSecurity - Threat Protection
      • Q&A
  • 3. Building a Stream Analytics Platform: Requirements
  • 4. 5 Attributes of a Streaming Platform
      • Ingest
      • Process
      • Analyze
      • Visualize
      • Respond
  • 5. Data Flow versus Stream Analytics
      • Data Flow
          – Ingest and route terabytes of data into a "unified firehose"
          – Actively manage the latency and quality of these data flows across highly variable data formats, data sizes and data speeds
          – Produces streams: an "unbounded sequence of events"
      • Real Time Stream Analytics
          – Sub-second event processing with linear scalability to billions of events
          – Predictive analytics at scale
          – Real time data aggregation across edge nodes while processing 10s of millions of events and 100s of gigabytes per second
          – Guaranteed no data loss, with events processed in order
  • 6. 6 key focus areas for building a Streaming Platform
      • Common Abstraction Layer
      • Latency
      • Lambda Architecture
          – "Orchestrate" over static and real time data
      • Scale-out
      • Rapid Application Development
      • Data Visualization
  • 7. Focus Areas for Streaming Platform: Common Abstraction Layer
      – Select one or more streaming engines
      – Select one or more cloud providers
      – Select one or more event sources
      – … Future proof
  • 8. Focus Areas for Streaming Platform: Latency
      – 500 milliseconds or less: Dashboards, Security Incidents, Asset Performance
      – 20 milliseconds or less: Ad Networks, Preventive Maintenance
      – …
  • 9. Focus Areas for Streaming Platform: Lambda Architecture
      – Integrate static and real time data
      – Enrichment
      – Orchestration of batch workflows
      – Predictive classification and scoring
  • 10. Focus Areas for Streaming Platform: Scale Out
      – Linear scale out or scale down
      – Resource management
      – Handle transient workloads
  • 11. Focus Areas for Streaming Platform: Rapid Application Development
      – Aggregations
      – Filters
      – Multi-stream correlations
      – Splits
      – Joins
      – Normalizations
      – Business Rules Editor
  • 12. Focus Areas for Streaming Platform: Data Visualization
      – Time series visualizations
      – Metrics dashboarding
      – Trends
      – Comparisons
      – Thresholds
      – Custom UI extensions
  • 13. Real Time Stream Analytics versus Streaming Engine
      • Abstracts the underlying streaming engine: Storm, Spark Streaming, Flink, etc.
      • OOTB support for multiple (cloud) event sources: Kafka, AWS Kinesis, Azure Event Hub
      • Built-in operators for Complex Event Processing
      • Built-in real time dashboarding: metrics and events
      • Pluggable workflow management
      • Business Rules Editor and Rapid Application Development framework
      • Cloud deployment
      • Scalable architecture
      • Handles different latency requirements
      Hortonworks Inc Confidential 2016
  • 14. Building a Stream Analytics Platform: Capabilities
  • 15. Stream Analytics Platform - Capabilities
      • Continuous, Real Time Processing
          – Continuous query and near real time processing of event data
          – Distributed, scalable platform for processing continuous unbounded streams of data
          – Abstraction across multiple open source stream engine technologies
          – Large scale event processing
          – No-coding approach for event processing logic
      • Event Source
          – Multiple event source support
          – Multiple data format support
          – Drag and drop, no-coding approach
      • Event Storage
          – Multiple event storage support
          – Event retention time period support
          – Drag and drop, no-coding approach
  • 16. Streaming Platform Capabilities - Analytics
      • Analytics stages: Descriptive, Proactive, Predictive, Prescriptive
      • Diagram labels: Detect Trends; Alert on probable Failures; Reporting and Dashboarding; Visual Pattern Recognition; Monitor System; Alert before Failure; Take Action on Data; Avoid Failure and Drive Operational Efficiency
      • Platform requirements
          – Rules calculations on predefined settings
          – High throughput, low latency event processing framework
          – Train models offline
          – Provide query interface and storage of models
          – Integrate real time data with batch data
          – Connect with enterprise applications
  • 17. Stream Analytics Platform - Capabilities
      • Filtering
          – Event selection based on pre-defined criteria
      • Transformations
          – Aggregations
          – Event prioritization
          – Event deduplication
          – Event time stamping
          – Custom user defined functions
      • Enrichment
          – Pull in reference data for additional context
  • 18. Stream Analytics Platform - Capabilities
      • Correlations
          – Correlation of in-flight event data from multiple event streams
      • Temporal Calculations
          – Time-based calculations on in-flight data in an event stream, e.g. a windowing computation with a time-based window
      • Pattern Detection
          – Calculating trends on in-flight data in an event stream
          – Outlier detection: pattern matching with historical data
  • 19. Stream Analytics Platform - Capabilities
      • Serialization/Deserialization
          – Marshalling/unmarshalling event data into a series of bytes
          – Drag and drop, no-coding support
      • Rule Engine
          – Support for business rules (business directives that impact a decision)
          – Support for invoking an external business rule engine
      • Visualization
          – Reporting and dashboarding
          – Visual pattern recognition
  • 20. Stream Analytics - Capabilities Summary
      • # / Capability: 1. Filtering; 2. Transformations; 3. Enrichment; 4. Correlations; 5. Temporal Calculations; 6. Pattern Detection
      • # / Capability: 1. Continuous Real Time Processing; 2. Event Sources; 3. Event Storage; 4. Temporal Calculations; 5. Rules Engine; 6. Visualization
      • # / Capability: 7. Filtering; 8. Transformation; 9. Event Storage; 10. Enrichment; 11. Correlations; 12. Pattern Detection
  • 21. Use Cases: CyberSecurity - Threat Protection
  • 22. CyberSecurity - Threat Protection Use Case
  • 23. CyberSecurity - Threat Protection - Steps Applying Streaming Analytics
      • Expose
          – Provide visibility into your enterprise
          – Passively watch network traffic and construct events to describe the activity seen
          – "Ingest"
      • Transmit
          – Compress, encrypt and de-identify event data
          – Send to the cloud for centralized log retention, real-time threat analysis and incident investigation
  • 24. CyberSecurity - Threat Protection - Steps Applying Streaming Analytics
      • Prioritize
          – Aggregate intelligence across all control points to identify and prioritize those systems that remain compromised and require immediate remediation
          – Allow analysts to "zero in" on just the most important events
          – Significantly reduce the number of incidents that security analysts need to investigate
          – Combine global telemetry from cyber intelligence networks with local customer context across endpoints, networks and email to uncover attacks that would otherwise evade detection
          – "Correlate" and "Enrich"
  • 25. CyberSecurity - Threat Protection - Steps Applying Streaming Analytics
      • Analyze
          – Create comprehensive detection rules, behavioral analytics and guided investigations to detect the latest threats
          – Enable quick and nimble data exploration across billions of events to proactively hunt for hidden indicators of compromise (IOCs)
          – Provide agile investigation tooling/UI to pivot from one indicator to the next, reconstruct the attack storyline and plan a forceful response to disrupt the attack
          – Allow security analysts to visualize all related attack components, e.g. all files used in the attack, all email addresses, all malicious IP addresses
          – "Proactive", "Predictive", "Prescriptive" and "Descriptive" analytics
          – "Real Time Analytics across large volumes of data at scale"
  • 26. CyberSecurity - Threat Protection - Steps Applying Streaming Analytics
      • Remediate
          – Generate alerts to expedite validating and scoping the incident
          – Enrich alerts with supporting data, such as threat intelligence and point-in-time context about the users impacted, actions taken and hosts involved, to help validate and scope the incident
          – Allow a security investigator to hunt for indicators of compromise across all your endpoints from a single console
          – Leverage your existing installations of Endpoint Protection and Email Security for enforcement without having to install any new endpoint agents
          – Export rich security intelligence into third-party security incident and event management systems (SIEMs)
          – "Notifications", "Event Storage", "Workflow"
  • 27. Make Streaming Analytics Work for you: The Devil is in the Details. June 2016. Kanishk Mahajan, Director Product Management, Hortonworks; Ryan Medlin, Director Software Engineering, Neustar
  • 28. Copyright © 2015 Neustar, Inc. All Rights Reserved 28
  • 29. AREAS OF FOCUS AND INNOVATION FOR NEUSTAR
      • Security
      • User
      • Device
      • Anomaly Detection
      • Multi-Factor Authentication (PKI, etc)
      • Policy Creation/Decisioning/Enforcement
      • Local / Edge / Distributed functionality
      • Openness and Standardization (OCF/OneM2M/MQTT/CoAP/GitHub)
      • Partnerships
  • 30. DEVICE AND IOT PLATFORM
      Diagram labels: Devices; Data Processing; Services; Local & Offline Functionality; Device and User Authentication; Device Directory & Identity; User and Device Policy; Device Certification & Public / Private Key Distribution/MFA; OIC/OCF and IoTivity Standards Based Platform for Interoperability & device definition; Absolute, Group/Collection, Local & cloud; Verification / Fraud / Anomaly detection; Location Services; Future Peer-to-Peer communication & Discovery, decentralization; Secure Discovery & Communication; OCF based security & communication protocols; Data pipeline, enrichment and analytics
  • 31. CORE DATA PLATFORM REQUIREMENTS
      • Provide ingestion for all machine data exhaust from all system components (e.g. Anomaly Detection, Monitoring/Alerting)
      • Deserialize Avro messages using the given schema(s) and, by default, store each message in HDFS
      • Given a message attribute (for example device type and customer ID, or device ID), execute a rule to determine the next course of action, which should consist of one or more of these actions:
          – Notify a user via email, SMS or a 3rd-party API
          – Notify or pass the message to the registry service/API
          – Store data in HDFS
          – Store data in a relational data store (HBase with Phoenix, or other?)
  • 32. Copyright © 2015 Neustar, Inc. All Rights Reserved 32
  • 33. Copyright © 2015 Neustar, Inc. All Rights Reserved 33
  • 34. Outgoing Messages
  • 35. Turn a Light on Manually: Inbound Message Flow
  • 36. Incoming Messages
  • 37. INTERESTING PROBLEMS
      • We need to work in offline and temporarily offline modes, so data replication is the hardest problem to solve, and the approach depends on the specific use case.
      • We also need to run a subset of the cloud inbound real time service rules and processing locally. How to reuse this logic both in the cloud and on a Local Gateway is a design challenge with tradeoffs.
  • 38. Q & A
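
The filtering, deduplication, enrichment and time-window capabilities listed on slides 17 and 18 can be illustrated with a small, engine-independent sketch. This is not how Storm, Spark Streaming or Flink implement these operators; it is plain Python over an in-memory list, and the event fields (`id`, `ts`, `host`, `severity`), the `owners` lookup table and all function names are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: {"id": str, "ts": int seconds, "host": str, "severity": int}
Event = Dict[str, Any]


def filter_events(events: Iterable[Event], min_severity: int) -> Iterator[Event]:
    """Filtering: keep only events that meet a predefined criterion."""
    for event in events:
        if event["severity"] >= min_severity:
            yield event


def deduplicate(events: Iterable[Event]) -> Iterator[Event]:
    """Event deduplication: drop events whose id has already been seen.

    The seen-set grows without bound; a real stream would expire old ids.
    """
    seen = set()
    for event in events:
        if event["id"] not in seen:
            seen.add(event["id"])
            yield event


def enrich(events: Iterable[Event], owners: Dict[str, str]) -> Iterator[Event]:
    """Enrichment: pull in reference data (host -> owner) for extra context."""
    for event in events:
        yield {**event, "owner": owners.get(event["host"], "unknown")}


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Temporal calculation: count events per host in fixed, non-overlapping windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for event in events:
        window_start = (event["ts"] // window_seconds) * window_seconds
        counts[(window_start, event["host"])] += 1
    return dict(counts)


def run_pipeline(events, owners, min_severity=3, window_seconds=60):
    """Chain the operators: filter -> deduplicate -> enrich -> window counts."""
    enriched = list(enrich(deduplicate(filter_events(events, min_severity)), owners))
    return enriched, tumbling_window_counts(enriched, window_seconds)
```

Chaining generators keeps each operator independent, which loosely mirrors the drag-and-drop operator model the slides describe; a production engine would add state expiry, ordering guarantees and parallelism.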

Editor's Notes

  1. Allow Network sensors to be centrally managed from the cloud with no additional management consoles.
  2. This shows the core real time use case we need for anomaly detection.
  3. Data processing
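
The rule-driven routing in the Core Data Platform Requirements (slide 31) could look roughly like the sketch below: every message is stored in HDFS by default, and rules keyed on message attributes add further actions. The action names, the `Rule` type, the example rules and the message fields are all hypothetical; the slides do not specify a rule format, and no real HDFS, HBase or notification API is called here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence, Tuple

Message = Dict[str, Any]  # hypothetical, already-deserialized message

# Hypothetical action names standing in for the four actions on the slide.
DEFAULT_ACTIONS: Tuple[str, ...] = ("store_hdfs",)


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Message], bool]  # predicate over message attributes
    actions: Tuple[str, ...]            # e.g. "notify_user", "notify_registry"


def actions_for(message: Message, rules: Sequence[Rule]) -> List[str]:
    """Return the ordered, de-duplicated actions for one message.

    The default action (store in HDFS) always applies; matching rules
    append their own actions after it.
    """
    actions = list(DEFAULT_ACTIONS)
    for rule in rules:
        if rule.matches(message):
            for action in rule.actions:
                if action not in actions:
                    actions.append(action)
    return actions


# Illustrative rules only; thresholds and customer ids are made up.
EXAMPLE_RULES = [
    Rule(
        "hot-thermostat",
        lambda m: m.get("device_type") == "thermostat" and m.get("temperature", 0) > 90,
        ("notify_user",),
    ),
    Rule(
        "acme-reporting",
        lambda m: m.get("customer_id") == "acme",
        ("store_relational",),
    ),
]
```

Keeping rules as plain data plus a pure `actions_for` function is one way to approach the reuse problem raised on slide 37, since the same rule set could in principle be evaluated in the cloud or on a local gateway.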