SlideShare a Scribd company logo
1 of 33
Download to read offline
Spark NLP: State of the
Art Natural Language
Processing at Scale
Maziyar Panahi
David Talby
Agenda
Introducing Spark NLP
Accuracy
Performance
Scale
Ease of Use
Spark NLP: Apache License 2.0
Spark NLP in Industry
NLP Industry Survey by Gradient Flow,
an independent data science research & insights company, September 2020
Which NLP libraries does your organization use?
! Tokenization
! Sentence Detector
! Stop Words Removal
! Normalizer
! Stemmer
! Lemmatizer
! NGrams
! Regex Matching
! Text Matching
! Chunking
! Date Matcher
! Part-of-speech tagging
! Dependency parsing
! Sentiment Detection (ML models)
! Spell Checker (ML and DL models)
! Word Embeddings
! BERT Embeddings
! ELMO Embeddings
! ALBERT Embeddings
! XLNet Embeddings
! Universal Sentence Encoder
! BERT Sentence Embeddings
! Sentence Embeddings
! Chunk Embeddings
! Unsupervised keywords extraction
! Language Detection & Identification
! Multi-class Text Classification
! Multi-label Text Classification
! Multi-class Sentiment Analysis
! Named entity recognition
! Easy TensorFlow integration
! Full integration with Spark
ML functions
! +250 pre-trained models in 46
Spark NLP: Apache License 2.0
Trusted By
Vision of Spark NLP
Python, Scala, and Java
! ACCURACY
! PERFORMANCE
! SCALABILITY
! “State of the art” means the
best peer-reviewed academic
results
! For example: Best F1 score on
CoNLL-2003 NER benchmark
for a system in production
! Spark NLP uses Bi-LSTM + Char-
CNN + CRF + Word
Embeddings
Accuracy: State-of-the-art Models
Named Entity Recognition
! The best F1 score on CoNLL-2003
NER benchmark for a system in
production by using Spark NLP
! BERT Large model was used to
train our Bi-LSTM + Char-CNN +
CRF model
Accuracy: State-of-the-art Models
Named Entity Recognition
! Everything must work right out of the
box. Meaning, there is no extra steps to
make something work
! All the parameters are default
! CoNLL 2003 dataset is used in this
benchmark. The eng.train was used for
training and
the eng.testa was used for evaluating the
model
Accuracy: State-of-the-art Models
Named Entity Recognition
Accuracy: State-of-the-art Models
Transformers & Embeddings
Transformers & Embeddings
Spark NLP: 50 Word
Embeddings
! BERT
! Small BERT
! BioBERT
! CovidBERT
! ALBERT
! ELECTRA
! XLNet
! ELMO
! GloVe
Accuracy: State-of-the-art Models
Multi-class & Multi-label Text Classifications
! Multi-class text classification to detect
emotions, cyberbullying, fake news, spams,
etc.
! Multi-label text classification to detect toxic
comments, movie genre, etc.
! 90+ pertained Word and Sentence
Embeddings models
! Language-Agnostic BERT Sentence
Embedding
! Universal Sentence Encoder as an input for
text classifications
Accuracy: State-of-the-art Models
SentimentDL, ClassifierDL, and MultiClassifierDL
! BERT
! Small BERT
! BioBERT
! CovidBERT
! LaBSE
! ALBERT
! ELECTRA
! XLNet
! ELMO
! Universal Sentence
Encoder
! GloVe
! 100 dimensions
! 200
dimensions
! 128 dimensions
! 256
dimensions
! 300
dimensions
! 512 dimensions
! 768
dimensions
! 1024
dimensions
! tfhub_ues
! tfhub_use_lg
! glove_6B_100
! glove_6B_300
! glove_840B_300
! bert_base_cased
! bert_base_uncased
! bert_large_cased
! bert_large_uncased
! bert_multi_uncased
! electra_small_uncased
! elmo
! ... 90+ Word & Sentence models
! 2 classes (positive/negative)
! 3 classes (0, 1, 2)
! 4 classes (Sports, Business,
etc.)
! 5 classes (1.0, 2.0, 3.0, 4.0, 5.0)
! ... 100 classes!
Accuracy: State-of-the-art Models
Language Detection & Identification
! LanguageDetectorDL is a state-of-the-art
TensorFlow/Keras model
! Uses the positions of the characters
! It is around 3 MB to 5 MB
! It has been trained over 8 million Wikipedia
pages
! It has between 97% to 99% accuracy for text
longer than 140 characters
Accuracy: State-of-the-art Models
Context Spell Checker
! Ability to consider OCR specific error
patterns
! Ability to leverage the context
! Ability to preserve and even correct custom
patterns
! Flexibility to incorporate your own custom
patterns
Improved Speed &
Accuracy
Performance
Named Entity Recognition
Performance:
BERT Embeddings
! Transformers are slow!
! They need GPUs
! It depends highly on max sequence
length
Spark NLP 2.6:
! Improve the memory consumption by
30%
! Improve performance by more than 70%
with dynamic shape
Performance:
BERT Embeddings
! 24 new smaller models!
! Tiny BERT
! Mini BERT
! Small BERT
! Medium BERT
Example:
! BERT-Tiny is 24x times smaller and
28x times faster than BERT-Base
Performance:
Hardware
! Optimized builds of Spark NLP for
both Intel and Nvidia
! Benchmark done on AWS: Train a
Named Entity Recognition in
French for 80 Epochs with 512
batch size
! Intel outperformed Nvidia:
Cascade Lake was 19% faster &
46% cheaper than Tesla P-100
Scale: Distribution & Parallelism
! Zero code changes to scale a pipeline to any
Spark cluster
! Only natively distributed open-source NLP
library
! Spark provides execution planning, caching,
serialization, and shuffling
! Caveats
! Speedup depends on what you actually do
! Spark configurations matter
! Cluster tuning based on your data is advised
Scale: Distribution & Parallelism
Recognize Entity DL Pipeline
! Amazon full reviews, 15 million sentences,
and
255 million tokens
! Single node, 32G memory & 32 cores
! 10x workers with 32G memory & 16 cores
! The pipeline includes sentence detection,
tokenization, word embeddings, and NER
NOTE:
! Single node is dedicated Dell Server
! 10 Nodes are in Databricks on AWS
Scale: Distribution & Parallelism
BERT Embeddings
! Amazon full reviews, 15 million sentences,
and 255 million tokens
! Single node with 64G memory & 32 cores
! 10x workers with 32G memory & 16 cores
! 128 max sequence length
NOTE:
! Single node is dedicated Dell Server
! 10 Nodes are in Databricks on AWS
Easy to Use
Python, Scala, and Java
! Pretrained pipelines
! Pretrained models
! Training your own models
Easy to Use
Pretrained Pipelines
! Over 90+ pretrained pipelines
! Full support for 13 languages
! Simple and easy to use
! Works online and offline
! Not flexible
Easy to Use
Pretrained Models
! Over 250+ pretrained models
! Support 46 languages
! Works online and offline
! Flexible & customized pipelines
! Caveat: some models depend on each
other
Easy to Use
Train your own POS tagging models
! POS() accepts token-tag format
! POS Tagger is based on
Perceptron Average algorithm
! Language-agnostic and supports
any language
Easy to Use
Train your own NER models
! CoNLL 2003 format as input
! Accepts 50+ Word Embeddings
models
! Train on CPU or GPU
! Extended metrics and evaluation
! Built-in validation split with
metrics
Easy to Use
Train your own NER models
! BERT with 2 layers & 768
dimensions
! 16 minutes training
! 91% Micro F1 on Dev
! 90% conll_eval on Dev
! Full CoNLL 2003 training dataset
! Google Colab with GPU
Easy to Use
Train your own multi-class classifiers
! Supports up to 100 classes
! Accepts 90+ Word & Sentence Embeddings
models
! Train on CPU or GPU
! Extended metrics and evaluation
! Built-in validation split with metrics
Spark NLP
! 73 total releases
! Release every two weeks for
the
past 3 years
! A single unified library for all
your NLP/NLU need
! Active community on Slack &
GitHub
Thank You!
Getting started, docs, videos, examples:
nlp.johnsnowlabs.com
Community:
spark-nlp.slack.com

More Related Content

What's hot

인공지능추천시스템 airs개발기_모델링과시스템
인공지능추천시스템 airs개발기_모델링과시스템인공지능추천시스템 airs개발기_모델링과시스템
인공지능추천시스템 airs개발기_모델링과시스템NAVER D2
 
[226]대용량 텍스트마이닝 기술 하정우
[226]대용량 텍스트마이닝 기술 하정우[226]대용량 텍스트마이닝 기술 하정우
[226]대용량 텍스트마이닝 기술 하정우NAVER D2
 
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Databricks
 
Ultrasound Nerve Segmentation
Ultrasound Nerve Segmentation Ultrasound Nerve Segmentation
Ultrasound Nerve Segmentation Sneha Ravikumar
 
コンピュータビジョンの研究開発状況
コンピュータビジョンの研究開発状況コンピュータビジョンの研究開発状況
コンピュータビジョンの研究開発状況cvpaper. challenge
 
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View SynthesisKento Doi
 
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...Deep Learning JP
 
Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Jinpyo Lee
 
物体検出の歴史(R-CNNからSSD・YOLOまで)
物体検出の歴史(R-CNNからSSD・YOLOまで)物体検出の歴史(R-CNNからSSD・YOLOまで)
物体検出の歴史(R-CNNからSSD・YOLOまで)HironoriKanazawa
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsRoelof Pieters
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기NAVER D2
 
LLM+LangChainで特許調査・分析に取り組んでみた
LLM+LangChainで特許調査・分析に取り組んでみたLLM+LangChainで特許調査・分析に取り組んでみた
LLM+LangChainで特許調査・分析に取り組んでみたKunihiroSugiyama1
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn nsAndrew Brozek
 
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)WON JOON YOO
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelYanbin Kong
 
機械学習デザインパターンおよび機械学習システムの品質保証の取り組み
機械学習デザインパターンおよび機械学習システムの品質保証の取り組み機械学習デザインパターンおよび機械学習システムの品質保証の取り組み
機械学習デザインパターンおよび機械学習システムの品質保証の取り組みHironori Washizaki
 
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)Amazon Web Services Korea
 
Word2vec algorithm
Word2vec algorithmWord2vec algorithm
Word2vec algorithmAndrew Koo
 
【DL輪読会】Perceiver io a general architecture for structured inputs & outputs
【DL輪読会】Perceiver io  a general architecture for structured inputs & outputs 【DL輪読会】Perceiver io  a general architecture for structured inputs & outputs
【DL輪読会】Perceiver io a general architecture for structured inputs & outputs Deep Learning JP
 

What's hot (20)

인공지능추천시스템 airs개발기_모델링과시스템
인공지능추천시스템 airs개발기_모델링과시스템인공지능추천시스템 airs개발기_모델링과시스템
인공지능추천시스템 airs개발기_모델링과시스템
 
[226]대용량 텍스트마이닝 기술 하정우
[226]대용량 텍스트마이닝 기술 하정우[226]대용량 텍스트마이닝 기술 하정우
[226]대용량 텍스트마이닝 기술 하정우
 
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
 
Ultrasound Nerve Segmentation
Ultrasound Nerve Segmentation Ultrasound Nerve Segmentation
Ultrasound Nerve Segmentation
 
コンピュータビジョンの研究開発状況
コンピュータビジョンの研究開発状況コンピュータビジョンの研究開発状況
コンピュータビジョンの研究開発状況
 
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...
【DL輪読会】DiffRF: Rendering-guided 3D Radiance Field Diffusion [N. Muller+ CVPR2...
 
Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)
 
物体検出の歴史(R-CNNからSSD・YOLOまで)
物体検出の歴史(R-CNNからSSD・YOLOまで)物体検出の歴史(R-CNNからSSD・YOLOまで)
物体検出の歴史(R-CNNからSSD・YOLOまで)
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
 
LLM+LangChainで特許調査・分析に取り組んでみた
LLM+LangChainで特許調査・分析に取り組んでみたLLM+LangChainで特許調査・分析に取り組んでみた
LLM+LangChainで特許調査・分析に取り組んでみた
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn ns
 
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)
딥 러닝 자연어 처리를 학습을 위한 파워포인트. (Deep Learning for Natural Language Processing)
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
機械学習デザインパターンおよび機械学習システムの品質保証の取り組み
機械学習デザインパターンおよび機械学習システムの品質保証の取り組み機械学習デザインパターンおよび機械学習システムの品質保証の取り組み
機械学習デザインパターンおよび機械学習システムの品質保証の取り組み
 
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)
AI 비지니스 무엇을 어떻게 준비하고 해야 하는가? - 정우진 (AWS 사업개발 담당)
 
Word2vec algorithm
Word2vec algorithmWord2vec algorithm
Word2vec algorithm
 
【DL輪読会】Perceiver io a general architecture for structured inputs & outputs
【DL輪読会】Perceiver io  a general architecture for structured inputs & outputs 【DL輪読会】Perceiver io  a general architecture for structured inputs & outputs
【DL輪読会】Perceiver io a general architecture for structured inputs & outputs
 
Rpa definiftion
Rpa definiftionRpa definiftion
Rpa definiftion
 

Similar to Spark NLP: State of the Art Natural Language Processing at Scale

Advanced Natural Language Processing with Apache Spark NLP
Advanced Natural Language Processing with Apache Spark NLPAdvanced Natural Language Processing with Apache Spark NLP
Advanced Natural Language Processing with Apache Spark NLPDatabricks
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch BasicsShifa Khan
 
Recurrent Neural Network : Multi-Class & Multi Label Text Classification
Recurrent Neural Network : Multi-Class & Multi Label Text ClassificationRecurrent Neural Network : Multi-Class & Multi Label Text Classification
Recurrent Neural Network : Multi-Class & Multi Label Text ClassificationAmit Agarwal
 
Atlanta MLconf Machine Learning Conference 09-23-2016
Atlanta MLconf Machine Learning Conference 09-23-2016Atlanta MLconf Machine Learning Conference 09-23-2016
Atlanta MLconf Machine Learning Conference 09-23-2016Chris Fregly
 
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016MLconf
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...Dataconomy Media
 
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...Edureka!
 
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worldsmbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling WorldsIstvan Rath
 
Build a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationBuild a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationCraig Chao
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description LanguagesAkana
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description LanguagesAkana
 
An Introduction to Natural Language Processing
An Introduction to Natural Language ProcessingAn Introduction to Natural Language Processing
An Introduction to Natural Language ProcessingTyrone Systems
 
Listen and look at your PHP code
Listen and look at your PHP codeListen and look at your PHP code
Listen and look at your PHP codeGabriele Santini
 
Serverless Functions and Machine Learning: Putting the AI in APIs
Serverless Functions and Machine Learning: Putting the AI in APIsServerless Functions and Machine Learning: Putting the AI in APIs
Serverless Functions and Machine Learning: Putting the AI in APIsNordic APIs
 
Programming Languages #devcon2013
Programming Languages #devcon2013Programming Languages #devcon2013
Programming Languages #devcon2013Iván Montes
 
Google Cloud Platform Munich
Google Cloud Platform MunichGoogle Cloud Platform Munich
Google Cloud Platform MunichVMware Tanzu
 
Envisioning the Future of Language Workbenches
Envisioning the Future of Language WorkbenchesEnvisioning the Future of Language Workbenches
Envisioning the Future of Language WorkbenchesMarkus Voelter
 

Similar to Spark NLP: State of the Art Natural Language Processing at Scale (20)

Advanced Natural Language Processing with Apache Spark NLP
Advanced Natural Language Processing with Apache Spark NLPAdvanced Natural Language Processing with Apache Spark NLP
Advanced Natural Language Processing with Apache Spark NLP
 
DSL Best Practices
DSL Best PracticesDSL Best Practices
DSL Best Practices
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
 
Recurrent Neural Network : Multi-Class & Multi Label Text Classification
Recurrent Neural Network : Multi-Class & Multi Label Text ClassificationRecurrent Neural Network : Multi-Class & Multi Label Text Classification
Recurrent Neural Network : Multi-Class & Multi Label Text Classification
 
Atlanta MLconf Machine Learning Conference 09-23-2016
Atlanta MLconf Machine Learning Conference 09-23-2016Atlanta MLconf Machine Learning Conference 09-23-2016
Atlanta MLconf Machine Learning Conference 09-23-2016
 
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
 
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...
Keras Tutorial For Beginners | Creating Deep Learning Models Using Keras In P...
 
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worldsmbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
 
Build a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationBuild a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimization
 
DeepPavlov 2019
DeepPavlov 2019DeepPavlov 2019
DeepPavlov 2019
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description Languages
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description Languages
 
An Introduction to Natural Language Processing
An Introduction to Natural Language ProcessingAn Introduction to Natural Language Processing
An Introduction to Natural Language Processing
 
Antlr Conexaojava
Antlr ConexaojavaAntlr Conexaojava
Antlr Conexaojava
 
Listen and look at your PHP code
Listen and look at your PHP codeListen and look at your PHP code
Listen and look at your PHP code
 
Serverless Functions and Machine Learning: Putting the AI in APIs
Serverless Functions and Machine Learning: Putting the AI in APIsServerless Functions and Machine Learning: Putting the AI in APIs
Serverless Functions and Machine Learning: Putting the AI in APIs
 
Programming Languages #devcon2013
Programming Languages #devcon2013Programming Languages #devcon2013
Programming Languages #devcon2013
 
Google Cloud Platform Munich
Google Cloud Platform MunichGoogle Cloud Platform Munich
Google Cloud Platform Munich
 
Envisioning the Future of Language Workbenches
Envisioning the Future of Language WorkbenchesEnvisioning the Future of Language Workbenches
Envisioning the Future of Language Workbenches
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...gajnagarg
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 

Recently uploaded (20)

Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 

Spark NLP: State of the Art Natural Language Processing at Scale

  • 1. Spark NLP: State of the Art Natural Language Processing at Scale Maziyar Panahi David Talby
  • 3. Spark NLP: Apache License 2.0
  • 4. Spark NLP in Industry NLP Industry Survey by Gradient Flow, an independent data science research & insights company, September 2020 Which NLP libraries does your organization use?
  • 5. ! Tokenization ! Sentence Detector ! Stop Words Removal ! Normalizer ! Stemmer ! Lemmatizer ! NGrams ! Regex Matching ! Text Matching ! Chunking ! Date Matcher ! Part-of-speech tagging ! Dependency parsing ! Sentiment Detection (ML models) ! Spell Checker (ML and DL models) ! Word Embeddings ! BERT Embeddings ! ELMO Embeddings ! ALBERT Embeddings ! XLNet Embeddings ! Universal Sentence Encoder ! BERT Sentence Embeddings ! Sentence Embeddings ! Chunk Embeddings ! Unsupervised keywords extraction ! Language Detection & Identification ! Multi-class Text Classification ! Multi-label Text Classification ! Multi-class Sentiment Analysis ! Named entity recognition ! Easy TensorFlow integration ! Full integration with Spark ML functions ! +250 pre-trained models in 46 Spark NLP: Apache License 2.0
  • 7. Vision of Spark NLP Python, Scala, and Java ! ACCURACY ! PERFORMANCE ! SCALABILITY
  • 8. ! “State of the art” means the best peer-reviewed academic results ! For example: Best F1 score on CoNLL-2003 NER benchmark for a system in production ! Spark NLP uses Bi-LSTM + Char- CNN + CRF + Word Embeddings Accuracy: State-of-the-art Models Named Entity Recognition
  • 9. ! The best F1 score on CoNLL-2003 NER benchmark for a system in production by using Spark NLP ! BERT Large model was used to train our Bi-LSTM + Char-CNN + CRF model Accuracy: State-of-the-art Models Named Entity Recognition
  • 10. ! Everything must work right out of the box. Meaning, there is no extra steps to make something work ! All the parameters are default ! CoNLL 2003 dataset is used in this benchmark. The eng.train was used for training and the eng.testa was used for evaluating the model Accuracy: State-of-the-art Models Named Entity Recognition
  • 12. Transformers & Embeddings Spark NLP: 50 Word Embeddings ! BERT ! Small BERT ! BioBERT ! CovidBERT ! ALBERT ! ELECTRA ! XLNet ! ELMO ! GloVe
  • 13. Accuracy: State-of-the-art Models Multi-class & Multi-label Text Classifications ! Multi-class text classification to detect emotions, cyberbullying, fake news, spams, etc. ! Multi-label text classification to detect toxic comments, movie genre, etc. ! 90+ pertained Word and Sentence Embeddings models ! Language-Agnostic BERT Sentence Embedding ! Universal Sentence Encoder as an input for text classifications
  • 14. Accuracy: State-of-the-art Models SentimentDL, ClassifierDL, and MultiClassifierDL ! BERT ! Small BERT ! BioBERT ! CovidBERT ! LaBSE ! ALBERT ! ELECTRA ! XLNet ! ELMO ! Universal Sentence Encoder ! GloVe ! 100 dimensions ! 200 dimensions ! 128 dimensions ! 256 dimensions ! 300 dimensions ! 512 dimensions ! 768 dimensions ! 1024 dimensions ! tfhub_ues ! tfhub_use_lg ! glove_6B_100 ! glove_6B_300 ! glove_840B_300 ! bert_base_cased ! bert_base_uncased ! bert_large_cased ! bert_large_uncased ! bert_multi_uncased ! electra_small_uncased ! elmo ! ... 90+ Word & Sentence models ! 2 classes (positive/negative) ! 3 classes (0, 1, 2) ! 4 classes (Sports, Business, etc.) ! 5 classes (1.0, 2.0, 3.0, 4.0, 5.0) ! ... 100 classes!
  • 15. Accuracy: State-of-the-art Models Language Detection & Identification ! LanguageDetectorDL is a state-of-the-art TensorFlow/Keras model ! Uses the positions of the characters ! It is around 3 MB to 5 MB ! It has been trained over 8 million Wikipedia pages ! It has between 97% to 99% accuracy for text longer than 140 characters
  • 16. Accuracy: State-of-the-art Models Context Spell Checker ! Ability to consider OCR specific error patterns ! Ability to leverage the context ! Ability to preserve and even correct custom patterns ! Flexibility to incorporate your own custom patterns
  • 18. Performance: BERT Embeddings ! Transformers are slow! ! They need GPUs ! It depends highly on max sequence length Spark NLP 2.6: ! Improve the memory consumption by 30% ! Improve performance by more than 70% with dynamic shape
  • 19. Performance: BERT Embeddings ! 24 new smaller models! ! Tiny BERT ! Mini BERT ! Small BERT ! Medium BERT Example: ! BERT-Tiny is 24x times smaller and 28x times faster than BERT-Base
  • 20. Performance: Hardware ! Optimized builds of Spark NLP for both Intel and Nvidia ! Benchmark done on AWS: Train a Named Entity Recognition in French for 80 Epochs with 512 batch size ! Intel outperformed Nvidia: Cascade Lake was 19% faster & 46% cheaper than Tesla P-100
  • 21. Scale: Distribution & Parallelism ! Zero code changes to scale a pipeline to any Spark cluster ! Only natively distributed open-source NLP library ! Spark provides execution planning, caching, serialization, and shuffling ! Caveats ! Speedup depends on what you actually do ! Spark configurations matter ! Cluster tuning based on your data is advised
  • 22. Scale: Distribution & Parallelism Recognize Entity DL Pipeline ! Amazon full reviews, 15 million sentences, and 255 million tokens ! Single node, 32G memory & 32 cores ! 10x workers with 32G memory & 16 cores ! The pipeline includes sentence detection, tokenization, word embeddings, and NER NOTE: ! Single node is dedicated Dell Server ! 10 Nodes are in Databricks on AWS
  • 23. Scale: Distribution & Parallelism BERT Embeddings ! Amazon full reviews, 15 million sentences, and 255 million tokens ! Single node with 64G memory & 32 cores ! 10x workers with 32G memory & 16 cores ! 128 max sequence length NOTE: ! Single node is dedicated Dell Server ! 10 Nodes are in Databricks on AWS
  • 24. Easy to Use Python, Scala, and Java ! Pretrained pipelines ! Pretrained models ! Training your own models
  • 25. Easy to Use Pretrained Pipelines ! Over 90+ pretrained pipelines ! Full support for 13 languages ! Simple and easy to use ! Works online and offline ! Not flexible
  • 26. Easy to Use Pretrained Models ! Over 250+ pretrained models ! Support 46 languages ! Works online and offline ! Flexible & customized pipelines ! Caveat: some models depend on each other
  • 27. Easy to Use Train your own POS tagging models ! POS() accepts token-tag format ! POS Tagger is based on Perceptron Average algorithm ! Language-agnostic and supports any language
  • 28. Easy to Use Train your own NER models ! CoNLL 2003 format as input ! Accepts 50+ Word Embeddings models ! Train on CPU or GPU ! Extended metrics and evaluation ! Built-in validation split with metrics
  • 29. Easy to Use Train your own NER models ! BERT with 2 layers & 768 dimensions ! 16 minutes training ! 91% Micro F1 on Dev ! 90% conll_eval on Dev ! Full CoNLL 2003 training dataset ! Google Colab with GPU
  • 30. Easy to Use Train your own multi-class classifiers ! Supports up to 100 classes ! Accepts 90+ Word & Sentence Embeddings models ! Train on CPU or GPU ! Extended metrics and evaluation ! Built-in validation split with metrics
  • 31. Spark NLP ! 73 total releases ! Release every two weeks for the past 3 years ! A single unified library for all your NLP/NLU need ! Active community on Slack & GitHub
  • 32.
  • 33. Thank You! Getting started, docs, videos, examples: nlp.johnsnowlabs.com Community: spark-nlp.slack.com