JUSTIN LONG | justin@skymind.io
Deep Learning with GPUs in Production
AI By the Bay 2017
DEEPLEARNING4J & KAFKA
April 2019
OBJECTIVES
By the end of this presentation, you should…
1. Know the Deeplearning4j stack and how it works
2. Understand why aggregation is useful
3. Have an example of using Deeplearning4j and Kafka together
the Deeplearning4j stack
DL4J Ecosystem
Deeplearning4j, ScalNet
Build, train, and deploy neural networks on the JVM and in Spark.
ND4J / libND4J
High-performance linear algebra on GPU/CPU. NumPy for the JVM.
DataVec
Data ingestion, normalization, and vectorization. Pandas integration.
SameDiff
Symbolic differentiation and computation graphs.
Arbiter
Hyperparameter search for optimizing neural networks.
RL4J
Reinforcement learning on the JVM.
Model Import
Import neural nets from ONNX, TensorFlow, Keras (Theano, Caffe).
Jumpy
Python API for ND4J.
DL4J Training API
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .updater(new AMSGrad(0.05))   // AMSGrad updater with learning rate 0.05
    .l2(5e-4)
    .activation(Activation.RELU)  // default activation for layers that do not set their own
    .list(
        new ConvolutionLayer.Builder(5, 5).stride(1, 1).nOut(20).build(),
        new SubsamplingLayer.Builder(PoolingType.MAX).kernelSize(2, 2).build(),
        new ConvolutionLayer.Builder(5, 5).stride(1, 1).nOut(50).build(),
        new SubsamplingLayer.Builder(PoolingType.MAX).kernelSize(2, 2).padding(2, 2).build(),
        new DenseLayer.Builder().nOut(500).build(),
        new DenseLayer.Builder().nOut(nClasses).activation(Activation.SOFTMAX).build(),
        new LossLayer.Builder().lossFunction(LossFunction.MCXENT).build()
    )
    .setInputType(InputType.convolutionalFlat(28, 28, 1))  // flattened 28x28 single-channel input
    .build();

MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
model.fit(...);
DL4J Training Features
An extensive, feature-rich library
- Large set of layers, including VAE
- Elaborate architectures, e.g. center loss
- Listeners: score and performance, checkpoint
- Extensive Eval classes
- Custom Activation, Custom Layers
- Learning Rate Schedules
- Dropout, WeightNoise, WeightConstraints
- Transfer Learning
- And so much more
Inference with imported models
//Import model
model = KerasModelImport.import...
//Featurize input data into an INDArray
INDArray features = …
//Get prediction
INDArray prediction = model.output(features);
Featurizing Data
DataVec: A tool for ETL
Runs natively on Spark with GPUs and CPUs
Designed to support all major types of input data (text, CSV, audio, image, and video), with input formats specific to each
Define Schemas and Transform Process
Serialize the transform processes, which makes them more portable when they are needed in production environments.
DataVec Schema
Define Schemas
Schema inputDataSchema = new Schema.Builder()
    .addColumnsString("CustomerID", "MerchantID")
    .addColumnInteger("NumItemsInTransaction")
    .addColumnCategorical("MerchantCountryCode", Arrays.asList("USA", "CAN", "FR", "MX"))
    // $0.0 or more, no maximum limit, no NaN and no Infinite values
    .addColumnDouble("TransactionAmountUSD", 0.0, null, false, false)
    .addColumnCategorical("FraudLabel", Arrays.asList("Fraud", "Legit"))
    .build();
DataVec Transform Process
Basic Transform Example
- Filter rows by column value
- Handle invalid values with replacement (e.g. negative dollar amounts)
- Handle datetimes, e.g. extract the hour of day
- Operate on columns in place
- Derive new columns from existing columns
- Join multiple sources of data
- And much more...
Serialize to JSON!
https://gist.github.com/eraly/3b15d35eb4285acd444f2f18976dd226
DataVec Data Analysis
DataAnalysis dataAnalysis =
    AnalyzeSpark.analyze(schema, parsedInputData, maxHistogramBuckets);
HtmlAnalysis.createHtmlAnalysisFile(dataAnalysis, new File("DataVecAnalysis.html"));
Parallel Inference
Model model =
    ModelSerializer.restoreComputationGraph("PATH_TO_YOUR_MODEL_FILE", false);
ParallelInference pi = new ParallelInference.Builder(model)
    .inferenceMode(InferenceMode.BATCHED)
    .batchLimit(32)
    .workers(2)
    .build();
INDArray result = pi.output(..);
DL4J Transfer Learning API
- Ability to freeze layers
- Modify layers, add new layers, change graph structure, etc.
- FineTuneConfiguration for changing learning
- Helper functions to presave featurized frozen layer outputs (.featurize method in TransferLearningHelper)
Example with VGG16 that keeps the bottleneck layer and below frozen and edits new layers:
https://github.com/deeplearning4j/dl4j-examples/blob/5381c5f86170dc544522eb7926d8fbf8119bec67/dl4j-examples/src/main/java/org/deeplearning4j/examples/transferlearning/vgg16/EditAtBottleneckOthersFrozen.java#L74-L90
DL4J Training UI
Helps with training and tuning by tracking gradients and updates. Works with Spark.
Parallel Inference
• Skymind integrates Deeplearning4j into its commercial model server, SKIL
• Underlying code uses the ParallelInference class
• Promising scalability as minibatch size and the number of local devices increase
[Chart: Commercial Performance, plotted against minibatch size]
• The ParallelInference class automatically picks up available GPUs and balances requests across them
• Backpressure can be handled by “batching” the requests in a queue
• Single-node: scaling out is up to the programmer, or you can use a commercial solution like SKIL
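The idea of handling backpressure by batching requests in a queue can be sketched in plain Java. This is only an illustration of the pattern, not how ParallelInference is implemented internally; the class name `RequestBatcher`, its methods, and the capacity values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a bounded queue that hands out requests in batches.
class RequestBatcher<T> {
    private final BlockingQueue<T> queue;
    private final int batchLimit;

    RequestBatcher(int capacity, int batchLimit) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.batchLimit = batchLimit;
    }

    /** Returns false when the queue is full, which tells the producer to slow down. */
    boolean offer(T request) {
        return queue.offer(request);
    }

    /** Waits up to timeoutMs for a first request, then drains up to batchLimit in total. */
    List<T> nextBatch(long timeoutMs) {
        List<T> batch = new ArrayList<>(batchLimit);
        try {
            T first = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                queue.drainTo(batch, batchLimit - 1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch;
    }

    public static void main(String[] args) {
        RequestBatcher<Integer> batcher = new RequestBatcher<>(64, 32);
        for (int i = 0; i < 40; i++) {
            batcher.offer(i);
        }
        System.out.println("first batch size: " + batcher.nextBatch(10).size());
    }
}
```

A consumer thread would call `nextBatch` in a loop and pass each batch to the model, while producers treat a `false` from `offer` as the signal to back off or buffer upstream, for example in Kafka.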
Parallel GPUs
[Diagram: ParallelInference distributing requests across two GPUs]
Example
https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/inference/ParallelInferenceExample.java
ParallelInference pi = new ParallelInference.Builder(model)
    .inferenceMode(InferenceMode.BATCHED)
    .batchLimit(32)
    .workers(2)
    .build();
Backlogged Inference
Prerequisites
What is anomaly detection?
In layman’s terms, anomaly detection is the identification of rare events or items that differ significantly from the norm of a dataset.
Something is not like the others...
The Problem
How do we monitor 1 terabyte of CDN logs per day and detect anomalies?
We want to monitor the health of a live sports score websocket API. Let’s analyze packet logs from a server farm streaming the latest NFL game. It produces 1 TB of logs per day, with lines that look like:
91739747923947 live.nfl.org GET /panthers_chargers 0 1554863750 250 6670 wss 0
Let’s do some math. This line is 73 bytes...
Analysis
What’s the most efficient way to monitor for system disruptions?
I’ve seen attempts to perform anomaly detection on every single packet! Ummm, okay, so if we have 1 TB of logs per day and each line is 73 bytes, how many lines is that?
1e+12 bytes / 73 bytes = 13,698,630,137 log lines
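The arithmetic above can be reproduced with a few lines of Java. This is a minimal sketch using the slide's figures (1e12 bytes per day, 73 bytes per line); the class and method names are illustrative only, and the per-second rate is an average that ignores bursts.

```java
// Illustrative helper for the log-volume arithmetic; names are hypothetical.
class LogVolume {
    /** Number of log lines in one day's volume, rounded to the nearest line. */
    static long linesPerDay(long bytesPerDay, int bytesPerLine) {
        return Math.round((double) bytesPerDay / bytesPerLine);
    }

    /** Average arrival rate in lines per second over a 24-hour day. */
    static double linesPerSecond(long linesPerDay) {
        return linesPerDay / 86_400.0;
    }

    public static void main(String[] args) {
        long lines = linesPerDay(1_000_000_000_000L, 73);
        System.out.println(lines + " lines/day");
        System.out.println(Math.round(linesPerSecond(lines)) + " lines/sec on average");
    }
}
```

Averaged over the day, this works out to roughly 158,500 lines per second that a per-packet detector would have to keep up with.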
Available Hardware
I have a workstation with 2x Titan X Pascal GPUs at home.
Titan X has 342.9 GFLOPS of FP64 (double) computing power.
Sounds like a lot. Can we process a terabyte of logs per day?
Let’s benchmark it!
Data Vectorization
Format of the log file is:
{id} {domain} {http_method} {uri} {server_errors} {timestamp} {round_trip} {payload_size} {protocol} {client_errors}
How anomalous is our packet when comparing errors, timing, and round trip?
Let’s build an input using the above...
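One way to turn a log line in the format above into a numeric input is sketched below in plain Java. The choice of features (server errors, client errors, round trip, payload size) is an assumption made for illustration, as is the class name `LogVectorizer`; the talk does not specify which fields feed the model, and a real pipeline would likely use DataVec and normalize the values.

```java
// Hypothetical vectorizer for the whitespace-delimited log format shown above.
class LogVectorizer {
    // Zero-based field positions, following the slide's field order.
    static final int SERVER_ERRORS = 4;
    static final int ROUND_TRIP = 6;
    static final int PAYLOAD_SIZE = 7;
    static final int CLIENT_ERRORS = 9;

    /** Converts one log line into {serverErrors, clientErrors, roundTrip, payloadSize}. */
    static double[] vectorize(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 10) {
            throw new IllegalArgumentException("expected 10 fields, got " + fields.length);
        }
        return new double[] {
            Double.parseDouble(fields[SERVER_ERRORS]),
            Double.parseDouble(fields[CLIENT_ERRORS]),
            Double.parseDouble(fields[ROUND_TRIP]),
            Double.parseDouble(fields[PAYLOAD_SIZE])
        };
    }

    public static void main(String[] args) {
        String line = "91739747923947 live.nfl.org GET /panthers_chargers 0 1554863750 250 6670 wss 0";
        System.out.println(java.util.Arrays.toString(vectorize(line)));
    }
}
```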
MLP Architecture
We need to encode our data into a representation that has some sort of computational meaning. Potentially a small MLP encoder can work.
Model size: 158 parameters (very small)
Benchmarks: 43,166 logs/sec on 2xGPU
Total Capacity: 3,729,542,400 logs/day
We need at least 8 GPUs!!! And backpressure!
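The capacity figure and the "at least 8 GPUs" conclusion follow from the benchmark if throughput is assumed to scale linearly with GPU count, which is a simplification. A short sketch of that calculation, with hypothetical class and method names:

```java
// Illustrative capacity estimate; assumes linear scaling across GPUs.
class GpuCapacity {
    static final long SECONDS_PER_DAY = 86_400L;

    /** Items processed per day at a sustained per-second rate. */
    static long dailyCapacity(long itemsPerSecond) {
        return itemsPerSecond * SECONDS_PER_DAY;
    }

    /** GPUs needed for a daily load, given a benchmark measured on benchGpus devices. */
    static long gpusNeeded(long itemsPerDay, long benchItemsPerSecond, int benchGpus) {
        double perGpuPerDay = (double) dailyCapacity(benchItemsPerSecond) / benchGpus;
        return (long) Math.ceil(itemsPerDay / perGpuPerDay);
    }

    public static void main(String[] args) {
        System.out.println("capacity on 2 GPUs: " + dailyCapacity(43_166L) + " logs/day");
        System.out.println("GPUs needed: " + gpusNeeded(13_698_630_137L, 43_166L, 2));
    }
}
```

With 13,698,630,137 lines per day and 43,166 logs/sec on two GPUs, the estimate comes to 8 GPUs, before allowing any headroom for bursts.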
Analysis
What if there was a better way?
We already know we can leverage Kafka for backpressure. That eliminates high burst loads. What if there was a way we could turn 13 billion packet logs into a fraction of that?
Aggregate!
We can add a Spark streaming component, use microbatching, and aggregate into smaller sequences.
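To make the aggregation step concrete, here is a minimal plain-Java sketch that collapses individual packets into one summary per second. It stands in for the Spark streaming microbatch described above and is not Spark code; the class name `SecondAggregator` and the choice of summary statistics are assumptions for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-second aggregator: many packets in, one summary per second out.
class SecondAggregator {
    /** Running totals for all packets observed within one second. */
    static final class Summary {
        long packets;
        long serverErrors;
        long clientErrors;
        long roundTripTotal;

        double meanRoundTrip() {
            return packets == 0 ? 0.0 : (double) roundTripTotal / packets;
        }
    }

    // TreeMap keeps summaries ordered by timestamp.
    private final Map<Long, Summary> bySecond = new TreeMap<>();

    /** Folds one packet into the summary for its epoch second. */
    void add(long epochSecond, int serverErrors, int clientErrors, long roundTrip) {
        Summary s = bySecond.computeIfAbsent(epochSecond, k -> new Summary());
        s.packets++;
        s.serverErrors += serverErrors;
        s.clientErrors += clientErrors;
        s.roundTripTotal += roundTrip;
    }

    Map<Long, Summary> summaries() {
        return bySecond;
    }

    public static void main(String[] args) {
        SecondAggregator agg = new SecondAggregator();
        agg.add(1554863750L, 0, 0, 250);
        agg.add(1554863750L, 1, 0, 300);
        agg.add(1554863751L, 0, 0, 240);
        System.out.println(agg.summaries().size() + " summaries from 3 packets");
    }
}
```

However many packets arrive, the model only sees one record per second, which is where the reduction from billions of inputs comes from.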
LSTM Architecture
Our MLP encoder turns into an LSTM sequence encoder. We aggregate across a rolling window of 30 seconds, every second. Do we become more efficient?
Model size: 14,178 parameters (small)
Benchmarks: 1,494 aggregations/sec on 2xGPU
Total Capacity: 129,081,600 aggregations/day
Aggregation gains significant efficiency.
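The rolling window that feeds the sequence encoder can be sketched as a fixed-size buffer of per-second feature vectors. This is an illustration under assumptions: the class name `RollingWindow` is hypothetical, and the feature vectors are whatever per-second aggregates the pipeline produces.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical rolling window: keeps the most recent `size` time steps.
class RollingWindow {
    private final int size;
    private final Deque<double[]> steps = new ArrayDeque<>();

    RollingWindow(int size) {
        this.size = size;
    }

    /**
     * Adds the newest per-second feature vector and returns the full window,
     * ordered oldest to newest, or null while the window is still filling up.
     */
    double[][] push(double[] features) {
        steps.addLast(features);
        if (steps.size() > size) {
            steps.removeFirst();
        }
        return steps.size() == size ? steps.toArray(new double[0][]) : null;
    }

    public static void main(String[] args) {
        RollingWindow window = new RollingWindow(30);
        int emitted = 0;
        for (int second = 0; second < 60; second++) {
            if (window.push(new double[] {second}) != null) {
                emitted++;
            }
        }
        System.out.println("windows emitted: " + emitted);
    }
}
```

Emitting one 30-step window per second means a single aggregated stream produces 86,400 sequences per day, far below the 129,081,600 aggregations per day benchmarked above.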
Lessons
Still need additional hardware.
Spark streaming will still require additional hardware. However, you are optimizing that stage without requiring expensive GPU usage. Aggregation across all packets also gives the big picture, which is an indicator of health.
Number of parameters.
While the models used for this thought experiment are small, you could very well increase their size by 10x for performance or dimensionality. That requires additional hardware.
Real Code
Github Example
Kafka, Keras, and Deeplearning4j.
A simplified real-world example involves a data science team training in Python via Keras, importing the model into Deeplearning4j and Java, and deploying it to perform inference on data fed by Kafka.
Repository.
https://github.com/crockpotveggies/kafka-streams-machine-learning-examples
Questions?
help@skymind.ai

🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 

Deep learning with kafka

  • 1. JUSTIN LONG | justin@skymind.io
  • 2. Deep Learning with GPUs in Production AI By the Bay 2017 DEEPLEARNING4J & KAFKA April 2019
  • 3. | OBJECTIVES By the end of this presentation, you should… 1. Know the Deeplearning4j stack and how it works 2. Understand why aggregation is useful 3. Have an example of using Deeplearning4j and Kafka together
  • 5. DL4J Ecosystem Deeplearning4j, ScalNet Build, train, and deploy neural networks on JVM and in Spark. ND4J /libND4J High performance linear algebra on GPU/CPU. Numpy for JVM. DataVec Data ingestion, normalization, and vectorization. Pandas integration. SameDiff Symbolic differentiation and computation graphs. Arbiter Hyperparameter search for optimizing neural networks. RL4J Reinforcement learning on JVM. Model Import Import neural nets from ONNX, TensorFlow, Keras (Theano, Caffe). Jumpy Python API for ND4J.
  • 6. DL4J Training API MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder() .updater(new AMSGrad(0.05)) .l2(5e-4).activation(Activation.RELU) .list( new ConvolutionLayer.Builder(5, 5).stride(1, 1).nOut(20).build(), new SubsamplingLayer.Builder(PoolingType.MAX).kernelSize(2, 2).build(), new ConvolutionLayer.Builder(5, 5).stride(1, 1).nOut(50).build(), new SubsamplingLayer.Builder(PoolingType.MAX).kernelSize(2, 2).padding(2,2).build(), new DenseLayer.Builder().nOut(500).build(), new DenseLayer.Builder().nOut(nClasses).activation(Activation.SOFTMAX).build(), new LossLayer.Builder().lossFunction(LossFunction.MCXENT).build() ) .setInputType(InputType.convolutionalFlat(28, 28, 1)) .build() MultiLayerNetwork model = new MultiLayerNetwork(conf); model.init(); model.fit(...);
  • 7. DL4J Training Features A very extensive, feature-rich library - Large set of layers, including VAE - Elaborate architectures, e.g. center loss - Listeners: score and performance, checkpoint - Extensive Eval classes - Custom Activation, Custom Layers - Learning Rate Schedules - Dropout, WeightNoise, WeightConstraints - Transfer Learning - And so much more
  • 8. Inference with imported models //Import model model = KerasModelImport.import... //Featurize input data into an INDArray INDArray features = … //Get prediction INDArray prediction = model.output(features)
  • 9. Featurizing Data DataVec: A tool for ETL Runs natively on Spark with GPUs and CPUs Designed to support all major types of input data (text, CSV, audio, image and video) with these specific input formats Define Schemas and Transform Process Serialize the transform processes, which allows them to be more portable when they’re needed for production environments.
  • 10. DataVec Schema Define Schemas Schema inputDataSchema = new Schema.Builder() .addColumnsString("CustomerID", "MerchantID") .addColumnInteger("NumItemsInTransaction") .addColumnCategorical("MerchantCountryCode", Arrays.asList("USA","CAN","FR","MX")) .addColumnDouble("TransactionAmountUSD",0.0,null,false,false) //$0.0 or more, no maximum limit, no NaN and no Infinite values .addColumnCategorical("FraudLabel", Arrays.asList("Fraud","Legit")) .build()
  • 11. DataVec Transform Process Basic Transform Example - Filter rows by column value - Handle invalid values with replacement (-ve $ amt) - Handle datetime, extract hour of day etc - Operate on columns in place - Derive new columns from existing columns - Join multiple sources of data - AND much more... Serialize to JSON!! https://gist.github.com/eraly/3b15d35eb4285acd444f2f18976dd226
  • 12. DataVec Data Analysis DataAnalysis dataAnalysis = AnalyzeSpark.analyze(schema, parsedInputData, maxHistogramBuckets); HtmlAnalysis.createHtmlAnalysisFile(dataAnalysis, new File("DataVecAnalysis.html"));
  • 13. Parallel Inference Model model = ModelSerializer.restoreComputationGraph("PATH_TO_YOUR_MODEL_FILE", false); ParallelInference pi = new ParallelInference.Builder(model) .inferenceMode(InferenceMode.BATCHED) .batchLimit(32) .workers(2) .build(); INDArray result = pi.output(..);
  • 14. DL4J Transfer Learning API - Ability to freeze layers - Modify layers, add new layers; change graph structure etc - FineTuneConfiguration for changing learning - Helper functions to presave featurized frozen layer outputs (.featurize method in TransferLearningHelper) Example with vgg16 that keeps bottleneck and below frozen and edits new layers: https://github.com/deeplearning4j/dl4j-examples/blob/5381c5f86170dc544522eb7926d8fbf8119bec67/dl4j-examples/src/main/java/org/deeplearning4j/examples/transferlearning/vgg16/EditAtBottleneckOthersFrozen.java#L74-L90
  • 15. DL4J Training UI Helps with training and tuning by tracking gradients and updates. Works with Spark.
  • 17. Commercial Performance • Skymind integrates Deeplearning4j into its commercial model server, SKIL • Underlying code uses ParallelInference class • Promising scalability as minibatch size and number of local devices increase [Chart; x-axis: minibatch size]
  • 18. Parallel GPUs • ParallelInference class automatically picks up available GPUs and balances requests across them • Backpressure can be handled by “batching” the requests in a queue • Single-node; it is up to the programmer to scale out, or to use a commercial solution like SKIL [Diagram: ParallelInference dispatching to GPUs]
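The "batching the requests in a queue" idea can be sketched with the JDK alone. This is a rough illustration and not how ParallelInference is implemented: the class `RequestBatcher`, its method names, and the capacity and batch limit values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a bounded queue that hands requests to a worker in batches. */
final class RequestBatcher<T> {
    private final BlockingQueue<T> queue;
    private final int batchLimit;

    RequestBatcher(int capacity, int batchLimit) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.batchLimit = batchLimit;
    }

    /** Returns false when the queue is full, so the producer can slow down or retry. */
    boolean submit(T request) {
        return queue.offer(request);
    }

    /** Waits up to timeoutMillis for one request, then drains up to batchLimit in total. */
    List<T> nextBatch(long timeoutMillis) {
        List<T> batch = new ArrayList<>(batchLimit);
        try {
            T first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                queue.drainTo(batch, batchLimit - 1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch;
    }

    public static void main(String[] args) {
        RequestBatcher<Integer> batcher = new RequestBatcher<>(100, 32);
        for (int i = 0; i < 50; i++) {
            batcher.submit(i);
        }
        System.out.println(batcher.nextBatch(10).size()); // 32
        System.out.println(batcher.nextBatch(10).size()); // 18
    }
}
```

A worker thread would call `nextBatch` in a loop and pass each batch to the model in one call. The bounded queue is what provides backpressure: once it fills up, `submit` returns false instead of letting memory grow.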
  • 21. Prerequisites What is anomaly detection? In layman’s terms, anomaly detection is the identification of rare events or items that are significantly different from the “normal” of a dataset. Something is not like the others...
  • 22. The Problem How to monitor 1 terabyte of CDN logs per day and detect anomalies. We want to monitor the health of a live sports score websocket API. Let’s analyze packet logs from a server farm streaming the latest NFL game. It produces 1 TB of logs per day with files that look like: 91739747923947 live.nfl.org GET /panthers_chargers 0 1554863750 250 6670 wss 0 Let’s do some math. This line is 73 bytes...
  • 23. Analysis What’s the most efficient way to monitor for system disruptions? I’ve seen attempts to perform anomaly detection on every single packet! Ummm okay so if we have 1 TB of logs per day and each line is 73 bytes, that is how many lines…. 1e+12 bytes / 73 bytes = 13,698,630,137 log lines
  • 24. Available Hardware I have a 2 x Titan X Pascal GPU Workstation at home. Titan X has 342.9 GFLOPS of FP64 (double) computing power. Sounds like a lot? Can we process a terabyte of logs per day? Let’s benchmark it!
  • 25. Data Vectorization Format of log file is: {id} {domain} {http_method} {uri} {server_errors} {timestamp} {round_trip} {payload_size} {protocol} {client_errors} How anomalous is our packet when comparing errors, timing, and round trip? Let’s build an input using the above...
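One way to turn a log line in this format into a numeric input is sketched below. The choice of features (server errors, client errors, round trip, payload size) is an assumption made for illustration, since the slide does not list the exact inputs, and `LogVectorizer` is a hypothetical name.

```java
import java.util.Arrays;

/** Hypothetical parser for the ten space-separated fields of one log line. */
final class LogVectorizer {
    static final int FIELD_COUNT = 10;

    private static String[] fields(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != FIELD_COUNT) {
            throw new IllegalArgumentException(
                "expected " + FIELD_COUNT + " fields, got " + f.length);
        }
        return f;
    }

    /** Returns {server_errors, client_errors, round_trip, payload_size} (assumed feature set). */
    static double[] toFeatures(String line) {
        String[] f = fields(line);
        return new double[] {
            Double.parseDouble(f[4]), // server_errors
            Double.parseDouble(f[9]), // client_errors
            Double.parseDouble(f[6]), // round_trip
            Double.parseDouble(f[7])  // payload_size
        };
    }

    /** Returns the timestamp field, useful later for grouping lines by second. */
    static long timestamp(String line) {
        return Long.parseLong(fields(line)[5]);
    }

    public static void main(String[] args) {
        String line = "91739747923947 live.nfl.org GET /panthers_chargers 0 1554863750 250 6670 wss 0";
        System.out.println(Arrays.toString(toFeatures(line))); // [0.0, 0.0, 250.0, 6670.0]
        System.out.println(timestamp(line));                   // 1554863750
    }
}
```

In practice the raw values would still need normalization before being fed to a network.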
  • 26. MLP Architecture We need to encode our data into a representation that has some sort of computational meaning. Potentially a small MLP encoder can work. Model size: 158 parameters (very small) Benchmarks: 43,166 logs/sec on 2xGPU Total Capacity: 3,729,542,400 logs/day We need at least 8 GPUs!!! And backpressure!
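The capacity arithmetic from the slides can be checked in a few lines. The sketch uses only figures quoted in the talk (1 TB per day, 73 bytes per line, 43,166 logs/sec on 2 GPUs). It also assumes that throughput scales linearly with GPU count, which is a simplification. `CapacityEstimate` is a hypothetical name.

```java
/** Hypothetical back-of-the-envelope calculator for the figures in the talk. */
final class CapacityEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    /** Number of log lines produced per day for a given daily volume and line size. */
    static long linesPerDay(double bytesPerDay, int bytesPerLine) {
        return Math.round(bytesPerDay / bytesPerLine);
    }

    /** Items a benchmark rig can process in one day at a sustained rate. */
    static long dailyCapacity(long itemsPerSecond) {
        return itemsPerSecond * SECONDS_PER_DAY;
    }

    /** GPUs required, assuming throughput scales linearly with GPU count. */
    static int gpusNeeded(long itemsPerDay, long benchmarkCapacityPerDay, int gpusInBenchmark) {
        double perGpu = (double) benchmarkCapacityPerDay / gpusInBenchmark;
        return (int) Math.ceil(itemsPerDay / perGpu);
    }

    public static void main(String[] args) {
        long lines = linesPerDay(1e12, 73);
        long capacity = dailyCapacity(43_166L);
        System.out.println(lines);                          // 13698630137
        System.out.println(capacity);                       // 3729542400
        System.out.println(gpusNeeded(lines, capacity, 2)); // 8
    }
}
```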
  • 27. Analysis What if there was a better way? We already know we can leverage Kafka for backpressure. That eliminates high burst loads. What if there was a way we could turn 13 billion packet logs into a fraction of that? Aggregate! We can add a Spark streaming component, use microbatching and aggregate into smaller sequences.
  • 28. LSTM Architecture Our MLP encoder turns into an LSTM sequence encoder. We aggregate across a rolling window of 30 seconds, every second. Do we become more efficient? Model size: 14,178 parameters (small) Benchmarks: 1,494 aggregations/sec on 2xGPU Total Capacity: 129,081,600 aggregations/day Aggregation gains significant efficiency.
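The rolling 30-second window, advanced once per second, can be illustrated in plain Java. The talk does this step with Spark streaming. The sketch below swaps in a simple in-memory deque to show the windowing logic only, and the per-second summary fields (request count, error count, mean round trip) are assumed for illustration. `RollingWindow` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical rolling window over per-second summaries, oldest bucket first. */
final class RollingWindow {
    private final int windowSeconds;
    private final Deque<double[]> buckets = new ArrayDeque<>();

    RollingWindow(int windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        this.windowSeconds = windowSeconds;
    }

    /** Adds the summary for one second and evicts the oldest bucket once the window is exceeded. */
    void addSecond(double requestCount, double errorCount, double meanRoundTrip) {
        buckets.addLast(new double[] {requestCount, errorCount, meanRoundTrip});
        if (buckets.size() > windowSeconds) {
            buckets.removeFirst();
        }
    }

    /** True once the window holds a full sequence. */
    boolean isFull() {
        return buckets.size() == windowSeconds;
    }

    /** Copies the window into a [timeStep][feature] array suitable as one sequence input. */
    double[][] sequence() {
        double[][] out = new double[buckets.size()][];
        int i = 0;
        for (double[] bucket : buckets) {
            out[i++] = bucket.clone();
        }
        return out;
    }

    public static void main(String[] args) {
        RollingWindow window = new RollingWindow(30);
        for (int second = 0; second < 35; second++) {
            window.addSecond(100 + second, 0, 250);
        }
        double[][] seq = window.sequence();
        System.out.println(seq.length); // 30
        System.out.println(seq[0][0]);  // 105.0
    }
}
```

With one sequence emitted per second, a single monitored stream produces at most 86,400 sequences per day, regardless of how many raw packets arrive.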
  • 29. Lessons Still need additional hardware. Spark streaming will still require additional hardware. However, you’re optimizing this and not requiring expensive GPU usage. Aggregation across all packets also gives the big picture, which is an indicator of health. Number of parameters. While the models used for this thought experiment are small, you could very well increase the size by 10x for performance or dimensionality. That requires additional hardware.
  • 31. GitHub Example Kafka, Keras, and Deeplearning4j. A simplified real-world example involves a data science team training in Python via Keras, importing your model into Deeplearning4j and Java, and deploying your model to perform inference from data fed by Kafka. Repository. https://github.com/crockpotveggies/kafka-streams-machine-learning-examples