© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Analysing streaming data in real time
Javier Ramirez
@supercoco9
AWS Tech Evangelist
A N T 2
Ville Kurkinen
Principal Architect
F-Secure Oyj
SUMMIT Stockholm
A simple problem (until you know the details)
• I want to calculate the total and average of several numbers
A simple big data problem (until you know the details)
• They might be MANY numbers, more than you can store in memory or on a single hard drive
A simple streaming problem
• The dataset is not static; new numbers are coming in all the time
A simplish streaming problem
• The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
A quite standard streaming problem
• And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
An elastic and scalable streaming problem
• Flow will not be constant (from a few events per second to thousands)
An almost real-life streaming analytics scenario
• And I don’t want just the total average, but the total per month, per week, per day, per hour, per minute…
A real business problem you can solve with streaming
• We need pretty dashboards with current status, comparison with the past, trends, and anomaly detection
• To run this reliably, we need advanced monitoring, alerts, and autoscaling
• No, I am not hiring a whole new operations team to manage the system
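The core of the problem above (totals and averages per time bucket, without keeping every number, and with stale readings arriving late) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the class name `WindowedStats`, the use of epoch seconds, and the one-minute default are hypothetical choices, not something the talk prescribes.

```python
from collections import defaultdict


class WindowedStats:
    """Running total and average per fixed-size time bucket."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        # bucket start (epoch seconds) -> [count, total]
        self.buckets = defaultdict(lambda: [0, 0.0])

    def add(self, event_time, value):
        # Bucket by the time the reading was taken, not the time it arrived,
        # so stale readings still update the bucket they belong to.
        start = int(event_time // self.window_seconds) * self.window_seconds
        bucket = self.buckets[start]
        bucket[0] += 1
        bucket[1] += value

    def stats(self, bucket_start):
        # Returns (total, average); average is None for an empty bucket.
        count, total = self.buckets.get(bucket_start, (0, 0.0))
        average = total / count if count else None
        return total, average


stats = WindowedStats(window_seconds=60)
stats.add(5, 10.0)    # reading taken in the first minute
stats.add(65, 4.0)    # reading taken in the second minute
stats.add(30, 20.0)   # stale reading for the first minute, arriving late
print(stats.stats(0))   # (30.0, 15.0)
print(stats.stats(60))  # (4.0, 4.0)
```

Only a count and a sum are kept per bucket, so memory grows with the number of buckets rather than the number of readings. A production system would also have to expire old buckets and decide how late is too late.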
http://gunshowcomic.com/648
Probably less than you think
~20 lines of Java code (plus a few hundred with imports, POJOs, and boilerplate, because Java)
OR
a simple GROUP BY statement in SQL with streaming extensions (plus a few lines of boilerplate for schema definition)
Apache Kafka
A distributed streaming platform
Apache Flink
Stateful computations over data streams
Elasticsearch
Search & Analyze data in real time
Distributed systems are hard to manage at scale
Software & Internet, Education Technology, BioTech and Pharma,
Media and Entertainment, Financial Services, Social Media,
Telecommunications, Travel & Transportation, Real Estate,
Logistics & Operations, Publishing, Other
Amazon and open source
Amazon is committed to improving open-source
Apache Kafka and Elasticsearch
https://aws.amazon.com/opensource/
Amazon Go video analytics
Amazon.com online catalog
Amazon CloudWatch logs
Amazon S3 events
AWS metering
Amazon Kinesis Data Firehose
• Zero administration and seamless elasticity
• Direct-to-data store integration
• Serverless continuous data transformations
• Near real-time
Ingest Transform Deliver
Amazon S3
Amazon Redshift
Amazon Elasticsearch Service
AWS IoT
Amazon Kinesis Agent
Amazon Kinesis Streams
Amazon CloudWatch Logs
Amazon CloudWatch Events
Apache Kafka
Amazon Kinesis Data Streams
• Easy administration and low cost
• Real-time, elastic performance
• Secure, durable storage
• Available to multiple real-time analytics applications
Amazon Kinesis - Firehose vs. Streams
Amazon Kinesis Data Streams is for use cases that require custom
processing, per incoming record, with sub-1 second processing latency, and
a choice of stream processing frameworks. Allows multiple consumers,
different consumer patterns, and stream replay
Amazon Kinesis Data Firehose is for use cases that require zero
administration, ability to use existing analytics tools based on Amazon S3,
Amazon Redshift, and Amazon ES, and a data latency of 60 seconds or
higher
Data is stored in the order it was received for a set duration of time, and can be replayed indefinitely during this time.
• AT_SEQUENCE_NUMBER - Start reading from the position denoted by a specific sequence number, provided in the value StartingSequenceNumber.
• AFTER_SEQUENCE_NUMBER - Start reading right after the position denoted by a specific sequence number, provided in the value StartingSequenceNumber.
• AT_TIMESTAMP - Start reading from the position denoted by a specific timestamp, provided in the value Timestamp.
• TRIM_HORIZON - Start reading at the last untrimmed record in the shard in the system, which is the oldest data record in the shard.
• LATEST - Start reading just after the most recent record in the shard, so that you always read the most recent data in the shard.
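To make the five starting positions above concrete, here is a toy in-memory model of one shard. It is a sketch, not the Kinesis API: the function `start_index` and the `(sequence_number, arrival_time)` tuple layout are hypothetical, and the handling of a timestamp that falls between two records (start at the next later record) is an assumption stated in the code.

```python
def start_index(records, iterator_type, sequence_number=None, timestamp=None):
    """Return the index of the first record a reader would receive.

    `records` is a list of (sequence_number, arrival_time) tuples, oldest
    first, standing in for the untrimmed records of a single shard.
    """
    if iterator_type == "TRIM_HORIZON":
        return 0  # oldest untrimmed record
    if iterator_type == "LATEST":
        return len(records)  # only records added after this point are read
    if iterator_type in ("AT_SEQUENCE_NUMBER", "AFTER_SEQUENCE_NUMBER"):
        for i, (seq, _) in enumerate(records):
            if seq == sequence_number:
                return i if iterator_type == "AT_SEQUENCE_NUMBER" else i + 1
        raise ValueError("sequence number not found in shard")
    if iterator_type == "AT_TIMESTAMP":
        # Assumption: with no exact match, start at the next later record.
        for i, (_, arrived) in enumerate(records):
            if arrived >= timestamp:
                return i
        return len(records)
    raise ValueError(f"unknown iterator type: {iterator_type}")


shard = [(100, 10.0), (101, 20.0), (102, 30.0)]
print(start_index(shard, "TRIM_HORIZON"))                                # 0
print(start_index(shard, "AFTER_SEQUENCE_NUMBER", sequence_number=101))  # 2
print(start_index(shard, "AT_TIMESTAMP", timestamp=15.0))                # 1
```

Replaying a stream then amounts to asking for an earlier position (TRIM_HORIZON, a saved sequence number, or a timestamp) instead of LATEST.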
Time-based seek
Log processing at Netflix using Kinesis Data Streams
Netflix’s Amazon Kinesis Streams-based solution has proven to be highly scalable, each day
processing billions of traffic flows. Typically, about 1,000 Amazon Kinesis shards work in
parallel to process the data stream. “Amazon Kinesis Streams processes multiple terabytes of
log data each day, yet events show up in our analytics in seconds. We can discover and
respond to issues in real time, ensuring high availability and a great customer experience.”
“Our solution built on Amazon Kinesis enables us to identify ways to increase efficiency, reduce
costs, and improve resiliency for the best customer experience,”
John Bennett, Senior Software Engineer, Netflix
[Figure: streaming data pipeline]
Sources: mobile devices, metering, click streams, IoT sensors, logs
Sample log record: [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
Tools: AWS SDKs, Amazon Kinesis Agent, Amazon Kinesis Producer Library, Amazon Kinesis Consumer Library
Stages: stream ingestion, real-time applications (seconds), streaming ETL (minutes)
Destinations: Amazon S3, Amazon Redshift, Amazon Elasticsearch, Splunk
Processing a data stream with Apache Spark
https://spark.apache.org/docs/2.3.1/streaming-kinesis-integration.html
Processing a data stream with AWS Lambda
[Figure: data producer, Kinesis Data Streams, Lambda service, Lambda function A, Lambda function B, Amazon SNS]
Data producer: continuously streams data to Kinesis Data Streams
Lambda service: continuously polls for new data, 1 poll per second
Lambda service: automatically invokes your function(s) when data is found
• Stateless
• Lambda polls each shard once per second
• Scales with your data
ANALYSING CYBER THREATS IN NEAR REAL-TIME
Ville Kurkinen
Principal Architect
F-Secure Oyj
Finland
We are trusted by companies for which cyber security is absolutely critical
5/5 Top UK Banks
3/5 Top US Banks
3/5 Top Singapore Banks
4/5 Top South African Banks
5/5 Top Nordic Banks
Endpoint protection
New cyber security solutions
F-SECURE
• Founded in 1988
• 1,600+ employees
• Listed on NASDAQ OMX, Helsinki
• ~30 offices around the globe
• Revenue of €190 million in 2018
• 100,000+ corporate customers and tens of millions of consumer customers
© F-Secure
F-SECURE RAPID DETECTION & RESPONSE SERVICE
• Email notification with details in portal
• Phone call in case of an incident
• Rapid 30-minute Detection to Response
• 24/7 Threat Hunting Service
• Actionable Expert Guidance to Respond
• Direct Dialog with Threat Analysts
• Global Intelligence Reports
• Decoy Sensors
RAPID DETECTION & RESPONSE SERVICE: COMBINING MAN & MACHINE
[Figure: Rapid Detection & Response data flow]
F-SECURE RAPID DETECTION & RESPONSE CENTER: threat hunters, incident responders, forensic experts
YOUR ORGANIZATION: Windows sensors, Mac sensors, Linux sensors, network sensor, router
Internet, Attacker, IoT
CLOUD-BASED AI/ML ANALYTICS PLATFORM: big data analytics, real-time behavior analytics, reputational analytics
ANOMALY
RESPONSE GUIDANCE: SOC, CSIRT, IT Help Desk, Partner
DETECT ATTACKS IN MINUTES WITHOUT DROWNING IN ALERTS
ANALYZED EVENTS PER DEPLOYMENT
2 billion DATA EVENTS/MONTH
• Endpoint sensors
• Network sensors
• Decoy sensors
Average number from a customer organization with ~1300 endpoints
900,000 SUSPICIOUS EVENTS
Real-time behavioral analysis of the raw data events supported by AI and machine learning
• Event Enrichment
• Host & User Profiling
• Anomaly Detection
• Detection Significance Analysis
Training set: true / false positive decisions by the hunters
25 DETECTIONS
Detections of which the customer was notified, after threat hunters have analyzed the machine-filtered detections
15 REAL THREATS
Customer confirmed that these were real threats
© F-Secure Confidential
ARCHITECTURE
[Figure: architecture diagram with numbered steps 1 to 8]
WHAT’S NEXT?
Managed Kafka: Migrating from RabbitMQ to Managed Kafka as stateful data processing infrastructure.
Kinesis Data Analytics: More real-time processing of statistics data calculation from telemetry and statistics streams.
Kinesis auto-scaling: Automating Kinesis shard management by splitting / merging shards based on load for increased elasticity and cost management.
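The load-based shard management mentioned for Kinesis auto-scaling boils down to a small sizing calculation. The sketch below is not F-Secure's implementation. The function names, the 25% headroom, and the default per-shard write limits (1 MB/s and 1,000 records/s, the commonly documented figures, which should be checked against current quotas) are all assumptions made for illustration.

```python
import math


def target_shards(bytes_per_sec, records_per_sec, headroom=0.25,
                  max_bytes_per_shard=1_000_000, max_records_per_shard=1_000):
    """Shards needed for the observed write load plus some headroom."""
    # Whichever limit is hit first (bytes or record count) decides the size.
    needed = max(bytes_per_sec / max_bytes_per_shard,
                 records_per_sec / max_records_per_shard)
    return max(1, math.ceil(needed * (1 + headroom)))


def scaling_action(current_shards, desired_shards):
    """Decide whether shards should be split, merged, or left alone."""
    if desired_shards > current_shards:
        return "split"
    if desired_shards < current_shards:
        return "merge"
    return "hold"


desired = target_shards(bytes_per_sec=3_000_000, records_per_sec=500)
print(desired, scaling_action(2, desired))  # 4 split
```

A real controller would also smooth the load metric over time so that brief spikes do not trigger repeated split and merge cycles.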
f-secure.com
Amazon Kinesis Data Analytics
• Interact with streaming data in real time using SQL or integrated Java applications
• Build fully managed and elastic stream processing applications
KDA for Java for sophisticated applications
Utilizes Apache Flink, a framework and distributed engine for stateful processing of data streams
• Simple programming: easy to use and flexible APIs make building apps fast
• High performance: in-memory computing provides low latency & high throughput
• Stateful processing: durable application state saves
• Strong data integrity: exactly-once processing and consistent state
Kinesis Data Analytics – Java Applications
1. Build Java applications using open source (Apache Flink)
2. Upload your application code to Kinesis Data Analytics
3. Run your application in a fully managed and elastic service
How do you build an application?
Streaming operators are applied to data streams in a pipeline
[Figure: a pipeline from a Source through DataStream, KeyedDataStream, and DataStream stages to Sinks, using the operators filter, keyBy, window, and apply]
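The operator chain named on this slide (filter, keyBy, window, apply) can be mimicked in plain Python to show what each step contributes. This is a batch-style sketch and deliberately does not use the Flink API: the function `run_pipeline`, the `(timestamp, key, value)` tuple layout, and the choice of an average as the applied function are hypothetical.

```python
from collections import defaultdict


def run_pipeline(events, window_seconds=60):
    """events: iterable of (timestamp, key, value) tuples (the source)."""
    # filter: drop readings that carry no value
    kept = (e for e in events if e[2] is not None)

    # keyBy + window: group values by (key, start of tumbling window)
    groups = defaultdict(list)
    for ts, key, value in kept:
        window_start = int(ts // window_seconds) * window_seconds
        groups[(key, window_start)].append(value)

    # apply: compute one result per key and window, ready for a sink
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}


events = [(0, "a", 1.0), (10, "a", 3.0), (70, "a", 5.0),
          (5, "b", None), (20, "b", 2.0)]
print(run_pipeline(events))
# {('a', 0): 2.0, ('a', 60): 5.0, ('b', 0): 2.0}
```

A stream processor does the same work incrementally, holding per-key window state and emitting each result when its window closes rather than after all input has been read.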
Extensible integrations with AWS services
• Easily add sources and sinks to an application
• Build custom connectors for other data sources and sinks
Example sources and destinations (sinks) shown: Apache Kafka, RabbitMQ, Elasticsearch, Apache Cassandra
Automatically back up your application
Create and restore your application to a previous point in time (snapshots)
Running application state is automatically backed up by default (checkpoints)
Application scaling – resources and parallelism
Resources
• Kinesis Processing Units (KPUs) are used to run code
• Each KPU is 1 vCPU and 4 GB memory
• 50 GB of running application storage per KPU
• Automatic or provisioned scaling
Parallelism
• Number of instances of a task
• Default versus operator parallelism
• Maximum defines the largest possible parallelism for an application
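The per-KPU figures above translate directly into the resources an application gets for a given KPU count. The helper below is a small worked example of that arithmetic. The function name `kpu_resources` is hypothetical; only the per-KPU numbers come from the slide.

```python
def kpu_resources(kpus):
    """Resources provided by `kpus` Kinesis Processing Units."""
    if kpus < 1:
        raise ValueError("an application needs at least one KPU")
    return {
        "vcpus": kpus * 1,        # 1 vCPU per KPU
        "memory_gb": kpus * 4,    # 4 GB memory per KPU
        "storage_gb": kpus * 50,  # 50 GB running application storage per KPU
    }


print(kpu_resources(4))
# {'vcpus': 4, 'memory_gb': 16, 'storage_gb': 200}
```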
KDA for SQL for simple and fast use cases
• Sub-second end-to-end processing latencies
• SQL steps can be chained together in serial or parallel steps
• Build applications with one or hundreds of queries
• Pre-built functions include everything from sum and count distinct to machine learning algorithms
• Aggregations run continuously using window operators
• Fully managed and elastic
Easily connect to Kinesis Data Streams and Kinesis Data Firehose delivery streams
Writing Streaming SQL
Pumps (continuous query)
CREATE OR REPLACE PUMP calls_per_ip_pump AS
INSERT INTO calls_per_ip_stream
SELECT STREAM "eventTimestamp",
COUNT(*),
"sourceIPAddress"
FROM source_sql_stream_001 ctrail
GROUP BY "sourceIPAddress",
STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
Anomaly detection with SQL
Pumps (continuous query)
CREATE OR REPLACE PUMP "STREAM_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT "ANOMALY_SCORE", "ANOMALY_EXPLANATION"
FROM TABLE(RANDOM_CUT_FOREST_WITH_EXPLANATION(
CURSOR(SELECT STREAM * FROM "SOURCE_SQL_STREAM_001"),
100, 256, 100000, 1, true))
WHERE ANOMALY_SCORE > 0
Aggregating Streaming Data?
• Aggregations (count, sum, min, …) take granular real-time data and turn it into insights
• Data is continuously processed, so you need to tell the application when you want results
• Tumbling windows, sliding windows, and custom windows
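The difference between the window types listed above is easiest to see by asking which windows a single event falls into. The sketch below uses a general model and is not Kinesis Data Analytics' exact semantics: the function names are hypothetical, times are plain integers in seconds, and the sliding case is modelled as fixed-size windows that start every `slide` seconds.

```python
def tumbling_window(ts, size):
    """The single [start, end) window of length `size` that contains ts."""
    start = (ts // size) * size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Every [start, end) window of length `size`, starting at a multiple
    of `slide`, that contains ts."""
    windows = []
    start = (ts // slide) * slide
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


print(tumbling_window(125, 60))      # (120, 180)
print(sliding_windows(125, 60, 30))  # [(90, 150), (120, 180)]
```

With tumbling windows each event is counted exactly once. With overlapping windows the same event contributes to several results, which smooths the output at the cost of more state.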
[Figure: inside an Amazon Kinesis Data Analytics application, SQL code joins an in-application stream (from the streaming source) with an in-application table (from Amazon S3) and writes to the destination]
https://aws.amazon.com/blogs/big-data/build-and-run-streaming-applications-with-apache-flink-and-amazon-kinesis-data-analytics-for-java-applications/
aws.amazon.com/kinesis
aws.amazon.com/kinesis/getting-started
aws.amazon.com/msk
aws.amazon.com/msk/getting-started
Thank you!
Javier Ramirez
@supercoco9
Ville Kurkinen
Principal Architect
F-Secure Oyj

M&E Leadership Session: The State of the Industry, What's New from AWS for M&...Amazon Web Services
 
Modern Data Platforms - Thinking Data Flywheel on the Cloud
Modern Data Platforms - Thinking Data Flywheel on the CloudModern Data Platforms - Thinking Data Flywheel on the Cloud
Modern Data Platforms - Thinking Data Flywheel on the CloudAlluxio, Inc.
 
Breaking Up the Monolith with Containers
Breaking Up the Monolith with ContainersBreaking Up the Monolith with Containers
Breaking Up the Monolith with ContainersAmazon Web Services
 

Streaming Analytics with AWS Kinesis

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT. Analysing streaming data in real time. Javier Ramirez (@supercoco9), AWS Tech Evangelist. A N T 2. Ville Kurkinen, Principal Architect, F-Secure Oyj
  • 2. SUMMIT Stockholm
  • 3. A simple problem (until you know the details) • I want to calculate the total and average of several numbers
  • 4. A simple big data problem (until you know the details) • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive
  • 5. A simple streaming problem • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time
  • 6. A simplish streaming problem • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time • From different sensors, which are geo distributed and moving. We will be adding and removing sensors all the time
  • 7. A quite standard streaming problem • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time • From different sensors, which are geo distributed and moving. We will be adding and removing sensors all the time • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
  • 8. An elastic and scalable streaming problem • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time • From different sensors, which are geo distributed and moving. We will be adding and removing sensors all the time • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data • Flow will not be constant (from a few events per second to thousands)
  • 9. An almost real-life streaming analytics scenario • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time • From different sensors, which are geo distributed and moving. We will be adding and removing sensors all the time • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data • Flow will not be constant (from a few events per second to thousands) • And I don’t want just the total average, but total per month, per week, per day, per hour, per minute…
  • 10. A real business problem you can solve with streaming • I want to calculate the total and average of several numbers • They might be MANY numbers, more than you can store in memory, or in a single hard drive • The dataset is not static, new numbers are coming all the time • From different sensors, which are geo distributed and moving. We will be adding and removing sensors all the time • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data • Flow will not be constant (from a few events per second to thousands) • And I don’t want just the total average, but total per month, per week, per day, per hour, per minute… • We need pretty dashboards with current status, comparison with the past, trends, and anomaly detection • To run this reliably, we need advanced monitoring, alerts, and autoscaling • No, I am not hiring a whole new operations team to manage the system
  • 14. Probably less than you think: ~20 lines of Java code (plus a few hundred more for imports, POJOs, and boilerplate, because Java) OR a simple GROUP BY statement in SQL with streaming extensions (plus a few lines of boilerplate for schema definition)
  • 17. Apache Kafka: a distributed streaming platform • Apache Flink: stateful computations over data streams • Elasticsearch: search and analyze data in real time
  • 18. Distributed systems are hard to manage at scale
  • 19. Software & Internet Education Technology BioTech and Pharma Media and Entertainment Financial Services Social Media Telecommunications Travel & Transportation Real Estate Logistics & Operations Publishing Other
  • 20. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Amazon and open source Amazon is committed to improving open-source Apache Kafka and Elasticsearch https://aws.amazon.com/opensource/
  • 23. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Amazon Go video analytics Amazon.com online catalog Amazon CloudWatch logs Amazon S3 events AWS metering
  • 24. Amazon Kinesis Data Firehose • Zero administration and seamless elasticity • Direct-to-data store integration • Serverless continuous data transformations • Near real-time
  • 25. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Ingest Transform Deliver Amazon S3 Amazon Redshift Amazon Elasticsearch Service AWS IoT Amazon Kinesis Agent Amazon Kinesis Streams Amazon CloudWatch Logs Amazon CloudWatch Events Apache Kafka
  • 26. Amazon Kinesis Data Streams • Easy administration and low cost • Real-time, elastic performance • Secure, durable storage • Available to multiple real-time analytics applications
  • 27. Amazon Kinesis - Firehose vs. Streams • Kinesis Data Streams is for use cases that require custom processing, per incoming record, with sub-1 second processing latency, and a choice of stream processing frameworks. It allows multiple consumers, different consumer patterns, and stream replay. • Kinesis Data Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon ES, and a data latency of 60 seconds or higher.
  • 28. Data is stored in the order it was received for a set duration of time, and can be replayed indefinitely during this time.
  • 29. • AT_SEQUENCE_NUMBER - Start reading from the position denoted by a specific sequence number, provided in the value StartingSequenceNumber. • AFTER_SEQUENCE_NUMBER - Start reading right after the position denoted by a specific sequence number, provided in the value StartingSequenceNumber. • AT_TIMESTAMP - Start reading from the position denoted by a specific time stamp, provided in the value Timestamp. • TRIM_HORIZON - Start reading at the last untrimmed record in the shard in the system, which is the oldest data record in the shard. • LATEST - Start reading just after the most recent record in the shard, so that you always read the most recent data in the shard.
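To make the five starting positions concrete, here is a small in-memory model of a shard. It is only an illustration of the semantics listed above and does not call the Kinesis API; the `Record` type, the `start_index` function, and the sample sequence numbers and timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Record:
    seq: int    # sequence number, increasing within the shard
    ts: float   # arrival timestamp in seconds
    data: str


def _first(records: List[Record], pred: Callable[[Record], bool]) -> int:
    """Index of the first record matching pred, or len(records) if none."""
    for i, record in enumerate(records):
        if pred(record):
            return i
    return len(records)


def start_index(records: List[Record], iterator_type: str,
                sequence_number: Optional[int] = None,
                timestamp: Optional[float] = None) -> int:
    """Return the list index where reading would start in this toy shard."""
    if iterator_type == "TRIM_HORIZON":
        return 0                      # oldest record still retained
    if iterator_type == "LATEST":
        return len(records)           # only records that arrive from now on
    if iterator_type == "AT_SEQUENCE_NUMBER":
        return _first(records, lambda r: r.seq >= sequence_number)
    if iterator_type == "AFTER_SEQUENCE_NUMBER":
        return _first(records, lambda r: r.seq > sequence_number)
    if iterator_type == "AT_TIMESTAMP":
        return _first(records, lambda r: r.ts >= timestamp)
    raise ValueError(f"unknown iterator type: {iterator_type}")


shard = [Record(100, 10.0, "a"), Record(101, 20.0, "b"), Record(102, 30.0, "c")]
```

With this model, replay is simply reading again from an earlier index: a consumer that stored sequence number 101 as its checkpoint would resume with `AFTER_SEQUENCE_NUMBER`, while a new consumer that wants the full retained history would use `TRIM_HORIZON`.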
  • 30. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Time-based seek
  • 31. Log processing at Netflix using Kinesis Data Streams. Netflix’s Amazon Kinesis Streams-based solution has proven to be highly scalable, each day processing billions of traffic flows. Typically, about 1,000 Amazon Kinesis shards work in parallel to process the data stream. “Amazon Kinesis Streams processes multiple terabytes of log data each day, yet events show up in our analytics in seconds. We can discover and respond to issues in real time, ensuring high availability and a great customer experience.” “Our solution built on Amazon Kinesis enables us to identify ways to increase efficiency, reduce costs, and improve resiliency for the best customer experience.” John Bennett, Senior Software Engineer, Netflix
  • 32. Amazon S3 Amazon Redshift Amazon Elasticsearch Splunk Real-Time Applications (seconds) Streaming ETL (minutes) Stream Ingestion [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test Mobile device Metering Click streams IoT sensors Logs AWS SDKs Amazon Kinesis Agent Amazon Kinesis Producer Library Amazon Kinesis Consumer Library
  • 33. Processing a data stream with Apache Spark https://spark.apache.org/docs/2.3.1/streaming-kinesis-integration.html
  • 34. Processing a data stream with AWS Lambda. Data producer: continuously streams data into Kinesis Data Streams. Lambda service: continuously polls for new data, 1 poll per second, and automatically invokes your function(s) when data is found (Lambda function A, Lambda function B, Amazon SNS). • Stateless • Lambda polls each shard once per second • Scales with your data
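A stateless Lambda consumer for this setup could look roughly like the sketch below. Kinesis record payloads arrive base64-encoded under `Records[].kinesis.data` in the invocation event; everything else here is an assumption for illustration, in particular the JSON payload shape with a `value` field and the helper `_fake_record` used to build a test event.

```python
import base64
import json


def handler(event, context):
    """Sum the 'value' field of every record in one Kinesis batch.

    The function keeps no state between invocations: each call only
    sees the batch of records that the Lambda service polled for it.
    """
    total = 0.0
    count = 0
    for record in event["Records"]:
        # Kinesis payloads are delivered base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        reading = json.loads(payload)   # assumed payload: {"value": <number>}
        total += reading["value"]
        count += 1
    return {"count": count, "total": total}


def _fake_record(value):
    """Build a minimal record with the same nesting as a Kinesis event record."""
    data = base64.b64encode(json.dumps({"value": value}).encode()).decode()
    return {"kinesis": {"data": data}}


sample_event = {"Records": [_fake_record(1.5), _fake_record(2.5)]}
```

Because the handler is stateless, any aggregate that must survive across batches (for example a daily total) would have to be written to an external store.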
  • 35. ANALYSING CYBER THREATS IN NEAR REAL-TIME Ville Kurkinen Principal Architect F-Secure Oyj Finland
  • 36. F-SECURE • Founded in 1988 • +1600 employees • Listed on NASDAQ OMX, Helsinki • ~30 offices around the globe • Revenue of €190 million in 2018 • +100,000 corporate customers and tens of millions of consumer customers. We are trusted by companies for which cyber security is absolutely critical: 5/5 Top UK Banks, 3/5 Top US Banks, 3/5 Top Singapore Banks, 4/5 Top South African Banks, 5/5 Top Nordic Banks. Endpoint protection. New cyber security solutions.
  • 37. © F-Secure F-SECURE RAPID DETECTION & RESPONSE SERVICE • Email notification with details in portal • Phone call in case of an incident • Rapid 30-minute Detection to Response • 24/7 Threat Hunting Service • Actionable Expert Guidance to Respond • Direct Dialog with Threat Analysts • Global Intelligence Reports
  • 38. RAPID DETECTION & RESPONSE SERVICE: COMBINING MAN & MACHINE. YOUR ORGANIZATION: Windows Sensors, Mac Sensors, Linux Sensors, Decoy Sensors, Network Sensor, Router, Internet, Attacker. CLOUD-BASED AI/ML ANALYTICS PLATFORM: Big data analytics, Real-time behavior analytics, Reputational analytics, ANOMALY. F-SECURE RAPID DETECTION & RESPONSE CENTER: Threat hunters, Incident responders, Forensic experts. RESPONSE GUIDANCE: SOC, CSIRT, IT Help Desk, Partner, IoT
  • 39. DETECT ATTACKS IN MINUTES WITHOUT DROWNING IN ALERTS • 2 billion DATA EVENTS/MONTH: endpoint sensors, network sensors, decoy sensors. Average number from a customer organization with ~1300 endpoints • 900,000 SUSPICIOUS EVENTS: real-time behavioral analysis of the raw data events supported by AI and machine learning (Event Enrichment, Host & User Profiling, Anomaly Detection, Detection Significance Analysis). Training set: true / false positive decisions by the hunters • 25 DETECTIONS: detections of which the customer was notified, after threat hunters have analyzed the machine-filtered detections • 15 REAL THREATS: customer confirmed that these were real threats
  • 40. ANALYZED EVENTS PER DEPLOYMENT © F-Secure Confidential
  • 43. WHAT’S NEXT? • Managed Kafka: migrating from RabbitMQ to Managed Kafka as stateful data processing infrastructure. • Kinesis Data Analytics: more real-time processing of statistics data calculation from telemetry and statistics streams. • Kinesis auto-scaling: automating Kinesis shard management by splitting / merging shards based on load for increased elasticity and cost management.
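The shard auto-scaling idea can be illustrated with a small sizing calculation. This is a hedged sketch and not F-Secure's implementation: the per-shard write limits used here (1,000 records per second and 1 MiB per second) are the commonly documented Kinesis Data Streams quotas and should be checked against current AWS documentation, and the 25% headroom and the function names are assumptions.

```python
import math

# Assumed per-shard write quotas; verify against current AWS documentation.
RECORDS_PER_SHARD_PER_S = 1_000
BYTES_PER_SHARD_PER_S = 1024 * 1024


def required_shards(records_per_s: float, bytes_per_s: float,
                    headroom: float = 0.25) -> int:
    """Shards needed to absorb the observed load plus a safety margin."""
    by_records = records_per_s / RECORDS_PER_SHARD_PER_S
    by_bytes = bytes_per_s / BYTES_PER_SHARD_PER_S
    needed = max(by_records, by_bytes) * (1 + headroom)
    return max(1, math.ceil(needed))


def scaling_action(current_shards: int, target_shards: int) -> str:
    """Decide whether shards should be split, merged, or left alone."""
    if target_shards > current_shards:
        return "split"
    if target_shards < current_shards:
        return "merge"
    return "none"
```

For example, a load of 2,500 records per second at 500 KB per second is bound by the record limit, so `required_shards(2500, 500_000)` asks for 4 shards once headroom is included. A production scaler would also smooth the load over a time window to avoid splitting and merging repeatedly.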
  • 45. Amazon Kinesis Data Analytics • Interact with streaming data in real-time using SQL or integrated Java applications • Build fully managed and elastic stream processing applications
  • 46. KDA for Java for sophisticated applications. Utilizes Apache Flink, a framework and distributed engine for stateful processing of data streams • Simple programming: easy to use and flexible APIs make building apps fast • High performance: in-memory computing provides low latency & high throughput • Stateful processing: durable application state saves • Strong data integrity: exactly-once processing and consistent state
  • 47. Kinesis Data Analytics – Java Applications: 1. Build Java applications using open source (Apache Flink) 2. Upload your application code to Kinesis Data Analytics 3. Run your application in a fully managed and elastic service
  • 48. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T How do you build an application? Streaming operators are applied to data streams in a pipeline Source Sink DataStream KeyedDataStream DataStream Sink keyBy, window filter apply
  • 49. Extensible integrations with AWS services • Easily add sources and sinks to an application • Build custom connectors for other data sources and sinks Example Sources Example Destinations (Sinks) Apache Kafka Apache Kafka RabbitMQ RabbitMQ Elasticsearch Apache Cassandra
  • 50. Automatically back up your application • Create and restore your application to a previous point in time (snapshots) • Running application state is automatically backed up by default (checkpoints)
  • 51. Application scaling – resources and parallelism. Resources • Kinesis Processing Units (KPUs) are used to run code • Each KPU is 1 vCPU and 4 GB memory • 50 GB of running application storage per KPU • Automatic or provisioned scaling. Parallelism • Number of instances of a task • Default versus operator parallelism • Maximum defines the largest possible parallelism for an application
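The per-KPU figures on this slide translate directly into the resources an application receives. The helper below is just that arithmetic, using the slide's numbers (1 vCPU, 4 GB memory, 50 GB running application storage per KPU); the function name is made up for the example.

```python
def kpu_resources(kpus: int) -> dict:
    """Total resources for a given number of KPUs, per the slide's figures."""
    if kpus < 1:
        raise ValueError("an application needs at least one KPU")
    return {
        "vcpu": kpus * 1,         # 1 vCPU per KPU
        "memory_gb": kpus * 4,    # 4 GB memory per KPU
        "storage_gb": kpus * 50,  # 50 GB running application storage per KPU
    }
```

An application scaled to 4 KPUs would therefore have 4 vCPUs, 16 GB of memory, and 200 GB of running application storage.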
  • 52. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KDA for SQL for simple and fast use cases • Sub-second end to end processing latencies • SQL steps can be chained together in serial or parallel steps • Build applications with one or hundreds of queries • Pre-built functions include everything from sum and count distinct to machine learning algorithms • Aggregations run continuously using window operators • Fully managed and elastic
  • 53. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Easily connect to Kinesis Data streams and Kinesis Data Firehose delivery streams Amazon Kinesis Data Streams Amazon Kinesis Data Firehose
  • 55. Writing Streaming SQL: pumps (continuous query) CREATE OR REPLACE PUMP calls_per_ip_pump AS INSERT INTO calls_per_ip_stream SELECT STREAM "eventTimestamp", COUNT(*), "sourceIPAddress" FROM source_sql_stream_001 ctrail GROUP BY "sourceIPAddress", STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE), STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
  • 56. Anomaly detection with SQL: pumps (continuous query) CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM" SELECT "ANOMALY_SCORE", "ANOMALY_EXPLANATION" FROM TABLE (RANDOM_CUT_FOREST_WITH_EXPLANATION(CURSOR(SELECT STREAM * FROM "SOURCE_SQL_STREAM_001"), 100, 256, 100000, 1, true)) WHERE ANOMALY_SCORE > 0
  • 57. Aggregating Streaming Data? • Aggregations (count, sum, min,…) take granular real time data and turn it into insights • Data is continuously processed so you need to tell the application when you want results • Tumbling windows, sliding windows, and custom windows
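A tumbling window emits one result per fixed, non-overlapping bucket, whereas a sliding window keeps a result that always covers the most recent interval. The sketch below shows the sliding case in plain Python, as an illustration of the concept rather than of how Kinesis Data Analytics evaluates windows. The class name `SlidingSum` is hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class SlidingSum:
    """Sum of the values seen in the last width_s seconds.

    Assumes events arrive in non-decreasing timestamp order.
    """

    def __init__(self, width_s: float):
        self.width_s = width_s
        self._items = deque()   # (timestamp, value) pairs inside the window
        self._total = 0.0

    def add(self, ts: float, value: float) -> float:
        """Add an event and return the sum over (ts - width_s, ts]."""
        self._items.append((ts, value))
        self._total += value
        # Evict everything that has slid out of the window.
        while self._items and self._items[0][0] <= ts - self.width_s:
            _, old_value = self._items.popleft()
            self._total -= old_value
        return self._total


window = SlidingSum(width_s=60)
outputs = [window.add(ts, v) for ts, v in [(0, 1.0), (30, 2.0), (60, 4.0), (100, 8.0)]]
```

Every incoming event produces an updated result, which is what "you need to tell the application when you want results" means in practice: a tumbling window answers once per bucket, a sliding window answers continuously.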
  • 58. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T In-application stream Amazon Kinesis Data Analytics application SQL code joining table and stream streaming source destination Amazon S3 In-application table
  • 59. https://aws.amazon.com/blogs/big-data/build-and-run-streaming-applications-with-apache-flink-and-amazon-kinesis-data-analytics-for-java-applications/
  • 61. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T aws.amazon.com/kinesis aws.amazon.com/kinesis/getting-started aws.amazon.com/msk aws.amazon.com/msk/getting-started
  • 63. Thank you! Javier Ramirez @supercoco9 Ville Kurkinen Principal Architect F-Secure Oyj

Editor's Notes

  1. 4 minutes (for slides 22 and 23) Hopefully the value of data streaming is very clear at this stage, however it is very important to notice that companies face many challenges as they attempt to build out real-time data streaming capabilities, and embark on generating real-time analytics. Data streams are difficult to setup, tricky to scale, hard to achieve high availability, complex to integrate into broader ecosystems, error prone and complex to manage over time, and can become very expensive to maintain. These challenges have often been enough of a reason for many companies to shy away from such projects. At AWS it has been our core focus over the last 5 years to build a solution that removes these challenges.
  2. The AWS solution is easy to set up and use, has high availability and durability (by default, data is replicated across three Availability Zones), and is fully managed and scalable, which reduces the complexity of managing the system over time and of scaling as demand increases. It also comes with seamless integration into other core AWS services, such as Elasticsearch for log analytics, S3 for data lake storage, Redshift for data warehousing, and Lambda for serverless processing. Finally, with AWS you only pay for what you use, making the solution very cost effective.
  3. Purpose of the slide: breaks out the top 6 benefits of the service. Supports open-source APIs and tools - We provide an open-source-compatible version of Elasticsearch. If you are currently using self-managed Elasticsearch, you can easily migrate it to the service. We take care of the management of the cluster (undifferentiated heavy lifting), and you can continue to use the same open-source tools and APIs that you are already using. Easy to use - You use the console, SDK, or CLI to create a cluster; we then do the work of deploying the cluster and making it available via an endpoint that you can access through a REST API. Scalable - We make it very easy to scale your clusters. With just a few commands we will seamlessly deploy a new cluster for you and you can continue to run uninterrupted. Secure - We provide a number of different security options. You can use IAM and VPC to secure access to your cluster. Highly available - We provide 100% data redundancy in two Availability Zones. Tightly integrated with other AWS services - On the ingest side you can easily send CloudWatch Logs to Amazon ES, you can use Kinesis Firehose to stream data to Amazon ES, and we also offer integration with AWS IoT. For cluster creation, CloudFormation also supports Amazon ES.
  4. Purpose of the slide – Gives them the confidence that they are not alone if they use Amazon ES regardless of their vertical. Key takeaway is that Amazon Elasticsearch Service usage is not isolated to a few verticals, or high-tech companies. Almost all enterprises today are using some form of log analytics and operational monitoring to ensure the success of their business.
  5. 4 minutes So finally, I would like to introduce the AWS services that we have built to enable real-time analytics for our customers. The Kinesis family consists of 3 core services for data streaming (note we also have a fourth service, Kinesis Video Streams, enabling our customers to stream and analyze video and audio in real time; although we are not covering that today, it is a very exciting capability). Kinesis Data Streams enables customers to capture and store data. Kinesis Data Analytics allows customers to build real-time applications in SQL or Java (with fully managed Flink). And Kinesis Data Firehose enables customers to load streaming data into streams, data lakes, and/or warehouses, and is a very effective way of conducting ETL on continuous, high-velocity data. We will go into the details of these services tomorrow during Damian Wylie’s session. Finally, we are very excited to present the latest service, which we announced at re:Invent 2018; it is currently in public preview and has already achieved a run rate of $5 million. Amazon Managed Streaming for Kafka is a fully managed service for Apache Kafka, a highly popular open-source framework for data streaming. Customers who chose to use Kafka have until now managed clusters either on premises or on EC2, with many of the challenges that we spoke about before. With the introduction of Amazon MSK, customers can now lift and shift their existing workloads and get the full benefits of a fully managed service where clusters are set up automatically and can be created or torn down on demand. This is a very exciting opportunity this year, and if you hear of any customers who use Apache Kafka, do mention Amazon MSK and convince them to give it a go. Another huge advantage of these 4 services is that they give our customers the flexibility to choose the right streaming technology depending on their use case, needs, and preferences. Damian will discuss this in depth tomorrow, but we are certainly excited to be able to offer our customers choice in this space.
  6. We often get questions from customers regarding when to use Amazon Kinesis Data Streams or Amazon Kinesis Data Firehose. Amazon Kinesis Data Streams is for use cases that require custom processing, per incoming record, with sub-1 second processing latency, and a choice of stream processing frameworks. Amazon Kinesis Data Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon ES, and a data latency of 60 seconds or higher. In many cases customers leverage both services: KDS for real-time event processing, and then KDF to load the streaming data into data stores for more thorough analysis.
  7. 3 minutes To understand the basics of real-time analytics and data streaming, there are 5 core stages to understand. First, the source of the data: where is the data coming from? Mobile, web click-stream, log analytics, IoT devices, smart devices, etc. Data then needs to be ingested into the stream. This requires a solution that can scale to capture data coming from hundreds of thousands of devices, reliably, into one stream for analysis. Damian will dig into some of the details of this tomorrow. Data is then stored in the order it was received for a set duration of time, and can be replayed indefinitely during this time. While the data is stored in the stream, it can be processed by real-time applications to generate real-time analytics and execute real-time ETL, and the continuous data can then be delivered to an end destination such as a data lake (S3, then analyzed by Athena), a warehouse (Redshift), or other databases such as DynamoDB.
  9. Expected questions: What is time-based seek? How exactly does replay work?
  12. Highlight that we have earned the trust of some of the most demanding industries, such as finance. Explain the multiple reasons why banks focus on security, and what’s driving them to do more today. Explain how broadly we do business with them, and how we can help them. Share first story here: some cool case from a renowned bank (anonymous of course), with an incident response & forensics angle.
  13. UNMATCHED NETWORK VISIBILITY BY NETWORK, DECOY & ENDPOINT SENSORS We have the sensors collecting the relevant data and sending the data to our cloud. We have real time behavioral analytics, and big data analytics to process the data, we will look for anomalies from two perspectives: known bad behavior and unknown bad behavior. All anomalies will be raised to our experts in Rapid Detection Center, they will further verify the anomalies, and alert the customer or partner in less than 30 minutes when something critical is discovered. Threat Analysts will walk the customer or partner through necessary steps to contain and remediate the threat. Alerts are in 2 high-level categories: High level alerts which are critical, e.g. strong indication of an ongoing breach. Customer/partner is alerted via phone and email, case is verified and the customer’s/partner’s critical incident response process is initiated when needed. Medium/low level alerts which are non-critical. Typically these are spy/adware or other potentially unwanted programs discovered from employee PCs. With more details: Your organization – what is deployed: End-point sensors: Windows (7 and later, 2008 R2 or later), Mac (MacOS 10.11 (El Capitan) and MacOS 10.12 (Sierra)), Linux (CentOS 6, CentOS 7, RHEL 6, RHEL 7, Debian 7 and Debian 8) We collect behavioral metadata – not the insides of e.g. document files Collected data and privacy issues are described in Privacy Policy We collect roughly 5 MB of data per typical Windows office user / day (use this to calculate network impact) We are constantly working to reduce the amount of data collected Honeypots (decoy sensors) are a good very low noise way to build traps for the attacker. Once someone is accessing honeypot, it is immediately correlated with the information from other sensors to filter out false alarms. If there’s clear pattern suggesting malicious behavior, the customer is alerted. 
Honeypots are build on top of Linux including the necessary components to mimic critical assets. We provide several predefined flavors of honeypots, based on popular setups: *NIX web server (HTTP, HTTPS, MySQL, SSH) *NIX ftp (FTP, SSH) *NIX VoIP server (SIP, SSH) Windows server (SMB, MSSQL, TFTP) Windows workstation (SMB services) It is possible to configure a honeypot with any combination of the services mentioned above. Threat intelligence – Global and industry specific All the Threat Intelligence (internal and external) have been connected and implemented into the core of RDS, which is a AI assisted threat hunting platform. RDS has been build as native global, high-performance, low-latency cloud service. Following describes how it works in high-level: When sensor is installed it looks for signs of compromise and then starts collecting relevant behavioral data Data is send from sensors and is received by data ingestion front-ends (distributed globally) Received data goes through very low latency data enrichment process where additional information is added to the events for example file/URL reputation After data enrichment events go through real-time detection engine which is looking for anomalies based on observed behavior, this is done in multiple levels if needed with data correlation If detection is triggered a baseliner algorithm (machine learning based) is driven to filter out potential false positives If result from baseliner is malicious, then the detection is raised to RDC to be taken further by RDS threat analysts Depending on the length of the observed behavior the steps from 1 to 3 typically takes from less than 1 minute to few minutes. 
We can utilize industry/customer specific detection algorithms depending on the need Once the data has been analyzed by real-time detection it goes into big data storage (if possible we always utilize pseudonymized data) We utilize stored data for threat hunting in the following ways: Data is being analyzed automatically by various algorithms (for example statistical analytics & machine learning) to find new anomalies and these are then further analyzed and correlated either by other algorithms or by the Threat Analysts. We use both organization specific and global analytics when applicable. We utilize threat hunting driven by data science where new algorithms are tested against sets of data to discover new previously unknown threats. Gained insights are transferred to new detection algorithms and new competences for Threat Analysts. We also utilize the data to improve our false positive rates and to improve the performance by e.g. collecting less data from the sensors Rapid Detection Center At the core of the RDS are the cyber security experts. We have 3 types of skills available: Threat Analysts (24/7) act as the first level. They constantly work to monitor the service and hunt for threats. Once they get an indication that something suspicious is happening, they will first verify the case by collecting necessary evidence, then make the decision on the priority, if priority is high, then customer/partner is alerted immediately with necessary actionable intelligence. If the case is non-critical, then the case is described with guidance to remediate and send to the customer/partner. Threat Analyst also keep the customer/partner up-to-date on any ongoing investigations. Incident Responders (24/7). IR personnel are typically involved in complex cases where customer/partner is not able to manage the case internally. Incident Responders will help the customer remotely or on-site to contain and remediate the case, and with evidence gathering for legal purposes. 
We offer both experienced case leaders and technical incident responders. We have also worked together with law enforcement and know how to collect evidence to be used in courts. RDS has been designed to be deployed during IR case as threat hunting service to quick gain visibility when the customer network has already been breached. Forensic experts. We are one of very few organizations globally who can handle very wide range of forensic tasks ranging from internal networks to deep reverse engineering of unique malware samples. This allows us to handle even the most complicated nation state originated attacks and the investigations that ensue the breach attempt.
  14. F-Secure began applying machine learning more than 10 years ago (2008) in a malware detection engine called Hydra. Other client-side components, including DeepGuard, BlackLight, and Gemini, all leverage a machine-learning-based malware detection engine used in conjunction with client-side behavioral analysis logic. In 2017 F-Secure launched its AI Center of Excellence, which currently applies techniques such as reinforcement learning, GANs, and federated learning. F-Secure is also an active member of the pan-European SHERPA project, which aims to better understand adversarial attacks against machine learning and potential malicious uses of machine learning.
  15. Note: A Technical Service Manager (TSM) is mandatory whenever the reseller partner is not responsible for the deployments and the first-line support. The TSM supports customer deployments, ensures settings are configured correctly, and supports production use with customer care's help. The TSM monitors the service level (product, support), drives feature adoption, and acts as a local escalation point.
  16. This is the Processing Section. Are KDS and KDF the only streams that KDA can work with (with MSK on the roadmap)? Can the output be sent to any of the consumers on slide 17? Would KDA ever be replaced by another consumer completely? If so, why, and for what use cases? What is the standard architecture here? KDS/KDF -> KDA -> Lambda/ES/EMR -> S3/Redshift/DynamoDB? If so, we should talk about multiple consumers working in a workflow to execute effectively across many use cases.
  17. Is this specific to Java? Why is it in this section? What are the points we are making here? Two processes, one for S3 and one for DynamoDB? keyBy, window, and filter need explanations.
  18. RabbitMQ? Are these the only three sources? What is the message here? Destinations could be a stream/messaging queue, a data lake/warehouse/database, or an analytical service such as Elasticsearch. Anything else? (Again, is this specific to the Java application?)
  19. We support the majority of the ANSI 2011 SQL standard. St
  20. Some customers don’t have normalized or easily structured data in their streams. These capabilities provide mechanisms to transform data ahead of the SQL code (pre-processing).
  21. Some customers don’t have normalized or easily structured data in their streams. These capabilities provide mechanisms to transform data ahead of the SQL code (pre-processing).
  22. Proofs of concept typically take less than a day.
  23. We have a wealth of content on the website, including case studies, webinars, many blogs, and technical documentation. Thank you for listening, and I would now like to hand over to Ajit to provide some more insight into sales opportunities.
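Note 17 asks for explanations of keyBy, window, and filter. A minimal sketch of what the three operations mean, applied to the talk's running example (totals and averages of sensor readings), might look like the following. It is plain Python and not the Flink API. The function name `filter_key_window`, the event tuple layout, and the sample readings are assumptions made for illustration. It also works on a finite list and ignores late or out-of-order data, which a real streaming engine has to handle.

```python
from collections import defaultdict


def filter_key_window(events, window_seconds, keep):
    """Total and average per sensor per tumbling window.

    events: iterable of (timestamp_seconds, sensor_id, value) tuples
            (hypothetical layout, chosen for this sketch).
    keep:   predicate on the value; readings that fail it are dropped.
    """
    # (sensor_id, window_start) -> [running total, count]
    acc = defaultdict(lambda: [0.0, 0])
    for ts, sensor_id, value in events:
        # filter: discard readings we do not want to aggregate
        if not keep(value):
            continue
        # window: assign the reading to a fixed, non-overlapping time bucket
        window_start = ts - (ts % window_seconds)
        # keyBy: aggregate each sensor independently of the others
        slot = acc[(sensor_id, window_start)]
        slot[0] += value
        slot[1] += 1
    return {
        key: {"total": total, "average": total / count}
        for key, (total, count) in acc.items()
    }


if __name__ == "__main__":
    readings = [
        (0, "a", 10.0),
        (5, "a", 20.0),
        (7, "b", 4.0),
        (12, "a", 30.0),
        (13, "b", -1.0),  # negative reading, dropped by the filter below
    ]
    for key, stats in filter_key_window(readings, 10, lambda v: v >= 0).items():
        print(key, stats)
```

Keeping only a running total and a count per key and window is what lets the average be computed without storing every number, which is the constraint the opening slides set out.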