DeepPavlov is an open-source framework for the development of production-ready chat-bots and complex conversational systems, as well as NLP and dialog systems research.
6. DeepPavlov.ai
Personal AI Assistants
Voice is universal
There’s no user manual needed, and people of all ages,
across all types of devices, and in many different
geographies can use the Assistant.
Scott Huffman
VP, Engineering, Google Assistant
9. DeepPavlov.ai
Modular dialog system
Are there any comedy movies to see this
weekend?
text data
NLU
(Natural Language Understanding)
• Domain detection
• Intent detection
• Entities detection
DM
(Dialogue manager)
• Dialogue state
• Policy
intent = request_movie
entities = { genre = 'comedy', date = 'weekend' }
semantic frame
NLG
(Natural Language Generation)
• Generative models
• Templates
action = request_location
system action
Where are you?
text data
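The NLU → DM → NLG flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword rules, the `SemanticFrame` class, the `inform_movies` action and the template table are all hypothetical stand-ins for trained models, and none of these names come from DeepPavlov.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SemanticFrame:
    """Output of NLU: an intent plus extracted entities (hypothetical type)."""
    intent: str
    entities: Dict[str, str] = field(default_factory=dict)


def nlu(text: str) -> SemanticFrame:
    """Toy NLU: keyword rules stand in for intent and entity detection models."""
    lowered = text.lower()
    entities = {}
    if "comedy" in lowered:
        entities["genre"] = "comedy"
    if "weekend" in lowered:
        entities["date"] = "weekend"
    intent = "request_movie" if "movie" in lowered else "unknown"
    return SemanticFrame(intent, entities)


def dialogue_manager(frame: SemanticFrame, state: Dict[str, str]) -> str:
    """Toy policy: update the dialogue state, then ask for a missing slot."""
    state.update(frame.entities)
    if frame.intent == "request_movie" and "location" not in state:
        return "request_location"
    return "inform_movies"  # hypothetical action, not on the slide


# Template-based NLG: one canned sentence per system action.
TEMPLATES = {
    "request_location": "Where are you?",
    "inform_movies": "Here are some movies you might like.",
}


def nlg(action: str) -> str:
    return TEMPLATES[action]


state: Dict[str, str] = {}
frame = nlu("Are there any comedy movies to see this weekend?")
action = dialogue_manager(frame, state)
reply = nlg(action)
```

With the example utterance from the slide, the frame carries the `request_movie` intent with `genre` and `date` entities, the policy picks `request_location`, and the template produces "Where are you?".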
10. DeepPavlov.ai
Modular dialog system
• Scalability problem
Paek, Tim, and Roberto Pieraccini. "Automating spoken dialogue management design using machine learning: An industry
perspective." Speech communication 50.8 (2008): 716-729.
13. DeepPavlov.ai
DeepPavlov
• DeepPavlov is for
- development of production-ready chat-bots and complex conversational systems,
- NLP and dialog systems research.
• DeepPavlov’s goal is to provide AI-application developers and researchers with:
- a set of pre-trained NLP models, pre-defined dialog system components (ML/DL/rule-based) and conversational agent templates for typical scenarios;
- a framework for implementing and testing their own dialog models;
- tools for application integration with adjacent infrastructure (messengers, helpdesk
software etc.);
- benchmarking environment for conversational models and uniform access to
relevant datasets.
• Distributed under Apache v2 license
17. DeepPavlov.ai
From Conversational AI to Artificial General Intelligence
• Personal assistants as platforms
for AI skills
• Evolution towards
personalization and integration
of emerging AI skills
• Minsky’s ‘Society of mind’
19. DeepPavlov.ai
GitHub Stars & Pip Downloads
Total downloads
36,914
Total downloads - 30 days
4,592
Total downloads - 7 days
1,736
https://pepy.tech/project/deeppavlov
20. DeepPavlov.ai
DeepPavlov Open Source Library
Task-Oriented Factoid Chit-Chat
Named Entity Recognition √ √
Coreference resolution √ √
Intent recognition √ √
Insults detection √ √
Q&A √
Dialogue Policy √ √
Dialog history √ √
Language model √ √ √
…
Dataset DSTC-2 SQuAD reddit
Skill Skill Skill
Agent
Models/Components
A dialogue agent combines complementary
skills to help the user.
21. DeepPavlov.ai
Core concepts
Agent is a conversational agent communicating
with users in natural language (text).
Skill fulfills user’s goal in some domain. It receives
input utterance and returns response and confidence.
Skill Manager performs selection of
the Skill to generate response.
Component is a reusable
functional component of Skill.
Chainer builds an agent/component pipeline from
heterogeneous Components (rule-based/ML/DL). It allows
training and running inference on the models in a pipeline as a whole.
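The relationship between Agent, Skill and Skill Manager can be sketched as follows. This is a minimal illustration, not the DeepPavlov API: the `Skill` type alias, the two toy skills, `select_by_confidence` and this `Agent` class are all hypothetical names. The only thing taken from the slide is that a skill returns a response with a confidence and that a skill manager selects which skill answers.

```python
from typing import Callable, List, Tuple

# A skill maps an utterance to (response, confidence). Hypothetical signature.
Skill = Callable[[str], Tuple[str, float]]


def weather_skill(utterance: str) -> Tuple[str, float]:
    """Toy domain skill: confident only when the user mentions the weather."""
    if "weather" in utterance.lower():
        return "It is sunny today.", 0.9
    return "I can only talk about the weather.", 0.1


def chitchat_skill(utterance: str) -> Tuple[str, float]:
    """Toy fallback skill with a constant, moderate confidence."""
    return "Tell me more!", 0.5


def select_by_confidence(candidates: List[Tuple[str, float]]) -> str:
    """Skill manager: return the response with the highest confidence."""
    return max(candidates, key=lambda candidate: candidate[1])[0]


class Agent:
    """Collects a response from every skill and lets the manager choose one."""

    def __init__(self, skills: List[Skill],
                 skill_manager: Callable[[List[Tuple[str, float]]], str]):
        self.skills = skills
        self.skill_manager = skill_manager

    def __call__(self, utterance: str) -> str:
        candidates = [skill(utterance) for skill in self.skills]
        return self.skill_manager(candidates)


agent = Agent([weather_skill, chitchat_skill], select_by_confidence)
```

Calling `agent("What is the weather like?")` returns the weather skill's answer, while `agent("Hello")` falls through to the chit-chat skill because its confidence is higher.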
25. DeepPavlov.ai
Use cases: Human assistance
1. Request routing
1. Domain classification
2. Routing to an operator
3. Operator replies
2. Ranking of pre-defined answers
1. Semantic embedding
2. Ranking of replies
3. Best answers are presented to an operator
4. Operator replies
26. DeepPavlov.ai
Use cases: Question Answering
3. Frequently asked questions
1. Semantic embedding
2. Scoring of replies
3. Automated reply if the best answer has a high confidence
4. Routing to an operator in the case of low confidence
5. Operator replies
4. Knowledge base Q&A
1. Semantic embedding
2. Search for an answer in a collection of documents
3. Automated reply if the best answer has a high confidence
4. Routing to an operator in the case of low confidence
5. Operator replies
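The FAQ flow on this slide (embed, score, auto-reply on high confidence, otherwise route to an operator) might look roughly like the sketch below. Everything here is an assumption made for illustration: bag-of-words cosine similarity stands in for a real semantic embedding, and the `FAQ` entries, the `answer` function and the 0.6 threshold are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Tuple


def embed(text: str) -> Counter:
    """Bag-of-words counts; a stand-in for a learned semantic embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical FAQ collection: known question -> pre-defined answer.
FAQ = {
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "What are your opening hours?": "We are open 9:00-18:00 on weekdays.",
}


def answer(question: str, threshold: float = 0.6) -> Tuple[str, str]:
    """Return ("bot", reply) on high confidence, else ("operator", question)."""
    query = embed(question)
    scored = [(cosine(query, embed(known)), known) for known in FAQ]
    best_score, best_question = max(scored)
    if best_score >= threshold:
        return "bot", FAQ[best_question]
    return "operator", question
```

A near-verbatim question such as "how do I reset my password" is answered automatically, whereas "My parcel arrived damaged" scores below the threshold and is handed to an operator.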
27. DeepPavlov.ai
Use cases: Rule-based bot
5. Simple bot
1. Semantic embedding
2. Selection of the most relevant dialogue script
3. Natural language answer generation
6. Other NLP tasks
1. Semantic embedding
2. Sentiment analysis
3. Entity recognition tagging
4. Integration with BPM system
31. DeepPavlov.ai
Features
Component | Description
NER | Based on a neural Named Entity Recognition network with Bi-LSTM+CRF architecture.
Slot filling | Based on fuzzy Levenshtein search to extract normalized slot values from text. The components either rely on NER results or perform needle-in-a-haystack search.
Classification | Component for classification tasks (intents, sentiment, etc.). Based on a shallow-and-wide Convolutional Neural Network architecture. The model allows multilabel classification of sentences.
Automatic spelling correction component | Pipelines that use candidate search in a static dictionary and an ARPA language model to correct spelling errors.
Ranking | Based on LSTM-based deep learning models for non-factoid answer selection. The model ranks responses or contexts from a database by their relevance to the given context.
Question Answering | Based on R-NET: Machine Reading Comprehension with Self-matching Networks. The model finds the answer to a question in a given context (SQuAD task format).
Morphological tagging | Based on the character-based approach to morphological tagging of Heigold et al., 2017, "An extensive empirical evaluation of character-based morphological tagging for 14 languages". A state-of-the-art model for Russian and several other languages. The model assigns morphological tags in UD format to sequences of words.
Skills
Goal-oriented bot | Based on the Hybrid Code Networks (HCNs) architecture. It predicts responses in goal-oriented dialog. The model is customizable: embeddings, slot filler and intent classifier can be switched on and off on demand.
Seq2seq goal-oriented bot | The dialogue agent predicts responses in a goal-oriented dialog and can handle multiple domains (the pretrained bot allows calendar scheduling, weather information retrieval, and point-of-interest navigation). The model is end-to-end differentiable and does not need to explicitly model dialogue state or belief trackers.
ODQA | An open-domain question answering skill. The skill accepts free-form questions about the world and outputs an answer based on its Wikipedia knowledge.
Embeddings
Pre-trained embeddings for the Russian language | Word vectors for the Russian language trained on joint Russian Wikipedia and Lenta.ru corpora.
32. DeepPavlov.ai
Automatic spelling correction
• We provide two types of pipelines for spelling correction: levenshtein_corrector uses simple Damerau-Levenshtein distance to find correction candidates, and brillmoore uses a statistics-based error model for it. In both cases correction candidates are chosen based on context with the help of a kenlm language model.
Correction method | F-measure | Speed (sentences/s)
Yandex.Speller | 69.59 | 5.
[DP] Damerau Levenshtein 1 + lm | 53.50 | 29.3
[DP] Brill Moore top 4 + lm | 52.91 | 0.6
Hunspell + lm | 44.61 | 2.1
JamSpell | 39.64 | 136.2
[DP] Brill Moore top 1 | 39.17 | 2.4
Hunspell | 32.06 | 20.3
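To make the candidate-search step concrete, here is a small sketch of edit-distance-based candidate generation. It is not the levenshtein_corrector implementation: it uses the optimal string alignment variant of Damerau-Levenshtein distance (insertions, deletions, substitutions and adjacent transpositions), the function names `osa_distance` and `candidates` are hypothetical, and the language-model reranking step described on the slide is left out.

```python
from typing import Iterable, List


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "teh" -> "the".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def candidates(word: str, vocabulary: Iterable[str], max_distance: int = 1) -> List[str]:
    """Dictionary words within max_distance edits of the (possibly misspelled) word."""
    return sorted(w for w in vocabulary if osa_distance(word, w) <= max_distance)
```

For the typo "teh" and the vocabulary `["the", "ten", "tea", "then"]`, the distance-1 candidates are "tea", "ten" and "the"; choosing among them is where a context-aware language model would come in.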
34. DeepPavlov.ai
Sentence classification
BERT models
BERT (Bidirectional Encoder Representations from Transformers) showed state-of-the-art results on a
wide range of NLP tasks in English.
deeppavlov.models.bert.BertClassifierModel provides an easy-to-use solution for classification
problems using pre-trained BERT.
Neural Networks on Keras
deeppavlov.models.classifiers.KerasClassificationModel contains a number of different
neural network configurations for classification tasks:
• dcnn_model – Deep CNN with the number of layers determined by the given number of kernel sizes and filters,
• cnn_model – Shallow-and-wide CNN with max pooling after convolution,
• cnn_model_max_and_aver_pool – Shallow-and-wide CNN with max and average pooling concatenation after convolution,
• bilstm_model – Bidirectional LSTM,
• bilstm_bilstm_model – 2-layer bidirectional LSTM,
• bilstm_cnn_model – Bidirectional LSTM followed by shallow-and-wide CNN,
• cnn_bilstm_model – Shallow-and-wide CNN followed by bidirectional LSTM,
• bilstm_self_add_attention_model – Bidirectional LSTM followed by self additive attention layer,
• bilstm_self_mult_attention_model – Bidirectional LSTM followed by self multiplicative attention layer,
• bigru_model – Bidirectional GRU model.
35. DeepPavlov.ai
Intent recognition
[Figure: F1 versus number of training samples]
Source of all data except DeepPavlov: https://www.slideshare.net/KonstantinSavenkov/nlu-intent-detection-benchmark-by-intento-august-2017
36. DeepPavlov.ai
Sentence classification
• Pre-trained models
Blank cells repeat the value from the row above.
Task | Dataset | Lang | Model | Metric | Valid | Test | Downloads
28 intents | DSTC 2 | En | DSTC 2 emb | Accuracy | 0.7732 | 0.7868 | 800 Mb
 | | | Wiki emb | | 0.9602 | 0.9593 | 8.5 Gb
7 intents | SNIPS-2017 | | DSTC 2 emb | F1 | 0.8685 | – | 800 Mb
 | | | Wiki emb | | 0.9811 | – | 8.5 Gb
 | | | Tfidf + SelectKBest + PCA + Wiki emb | | 0.9673 | – | 8.6 Gb
 | | | Wiki emb weighted by Tfidf | | 0.9786 | – | 8.5 Gb
Insult detection | Insults | | Reddit emb | ROC-AUC | 0.9271 | 0.8618 | 6.2 Gb
5 topics | AG News | | Wiki emb | Accuracy | 0.8876 | 0.9011 | 8.5 Gb
Sentiment | Twitter mokoron | Ru | RuWiki+Lenta emb w/o preprocessing | | 0.9972 | 0.9971 | 6.2 Gb
 | | | RuWiki+Lenta emb with preprocessing | | 0.7811 | 0.7749 | 6.2 Gb
 | RuSentiment | | RuWiki+Lenta emb | F1 | 0.6393 | 0.6539 | 6.2 Gb
 | | | ELMo | | 0.7066 | 0.7301 | 700 Mb
Intent | Yahoo-L31 | | Yahoo-L31 on ELMo pre-trained on Yahoo-L6 | ROC-AUC | 0.9269 | – | 700 Mb
38. DeepPavlov.ai
Neural Ranking
Trained with triplet loss and hard negative sampling
Tan, Ming & Dos Santos, Cicero & Xiang, Bing & Zhou, Bowen. (2015). LSTM-based Deep Learning Models for
Non-factoid Answer Selection.
Dataset | Model config | Validation (Recall@1) | Test1 (Recall@1) | Downloads
Ubuntu V2 | ranking_ubuntu_v2_interact | 52.9 | 52.4 | 8913M
Ubuntu V2 | ranking_ubuntu_v2_mt_interact | 59.2 | 58.7 | 8906M
Dataset | Model config | Val (accuracy) | Test (accuracy) | Val (F1) | Test (F1) | Val (log_loss) | Test (log_loss) | Downloads
paraphraser.ru | paraphrase_ident_paraphraser | 83.8 | 75.4 | 87.9 | 80.9 | 0.468 | 0.616 | 5938M
Quora Question Pairs | paraphrase_ident_qqp | 87.1 | 87.0 | 83.0 | 82.6 | 0.300 | 0.305 | 8134M
Quora Question Pairs | paraphrase_ident_qqp | 87.7 | 87.5 | 84.0 | 83.8 | 0.287 | 0.298 | 8136M
Model | Validation (Recall@1) | Test1 (Recall@1)
Architecture II (HLQA(200) CNNQA(4000) 1-MaxPooling Tanh) | 61.8 | 62.8
QA-LSTM basic-model(max pooling) | 64.3 | 63.1
ranking_insurance | 72.0 | 72.2
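The training objective named on this slide, triplet loss with hard negative sampling, can be illustrated with plain Python. The hinge form below, max(0, margin - cos(context, positive) + cos(context, negative)), follows Tan et al. (2015). The way the hard negative is picked here (the most similar candidate from a given pool) and the function names are assumptions for the sake of the example, not a description of the DeepPavlov training code.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hardest_negative(context: Vector, negatives: List[Vector]) -> Vector:
    """Hard negative sampling: the wrong response that looks most relevant."""
    return max(negatives, key=lambda negative: cosine(context, negative))


def triplet_loss(context: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.1) -> float:
    """Hinge loss that pushes the positive above the negative by at least margin."""
    return max(0.0, margin - cosine(context, positive) + cosine(context, negative))


# Toy 2-d "embeddings" of a context, its true response and two wrong responses.
context = [1.0, 0.0]
positive = [4.0, 3.0]                    # cosine with context: 0.8
negatives = [[0.0, 1.0], [3.0, 4.0]]     # cosines with context: 0.0 and 0.6

hard = hardest_negative(context, negatives)
loss = triplet_loss(context, positive, hard, margin=0.3)
```

In this toy case the hard negative is `[3.0, 4.0]`; with a margin of 0.3 the positive does not beat it by enough, so the loss is positive, while the easy negative `[0.0, 1.0]` would contribute nothing.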
39. DeepPavlov.ai
Text QA (SQuAD) + Open Domain QA
R-NET: Machine Reading Comprehension with Self-matching Networks. (2017)
Text QA (SQuAD)
Model (single model) | EM (dev) | F-1 (dev)
DeepPavlov BERT | 80.88 | 88.49
DeepPavlov R-Net | 71.49 | 80.34
BiDAF + Self Attention + ELMo | – | 85.6
R-Net | 71.1 | 79.5
Open Domain (Wiki) QA
Blank cells repeat the value from the row above.
Model | Lang | Ranker@5 F1 | Ranker@5 EM
DeepPavlov | En | 37.83 | 31.26
DrQA [1] | | - | 27.1
R3 [4] | | 37.5 | 29.1
DeepPavlov with RuBERT reader | Ru | 42.02 | 29.56
DeepPavlov | | 28.56 | 18.17
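For readers unfamiliar with the EM and F1 columns, the sketch below shows how SQuAD-style exact match and token-level F1 are commonly computed for a single prediction. It follows the usual normalization (lowercasing, stripping punctuation and English articles) but is a simplified illustration rather than the official evaluation script. Scores here lie in [0, 1], whereas the tables report percentages averaged over a dataset.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, "The Eiffel Tower" matches "eiffel tower." exactly after normalization, while "in Paris France" against "Paris" gets no exact match but an F1 of one half.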
40. DeepPavlov.ai
Task-Oriented Dialog (DSTC-2)
Model | Test turn textual accuracy
basic bot | 0.3809
bot with slot filler & fasttext embeddings | 0.5317
bot with slot filler & intents | 0.5248
bot with slot filler & intents & embeddings | 0.5145
bot with slot filler & embeddings & attention | 0.5551
Bordes and Weston (2016) [4] | 0.411
Perez and Liu (2016) [5] | 0.487
Eric and Manning (2017) [6] | 0.480
Williams et al. (2017) [1] | 0.556
Jason D. Williams, Kavosh Asadi, Geoffrey Zweig “Hybrid Code Networks: practical and efficient end-to-
end dialog control with supervised and reinforcement learning” – 2017
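As a rough illustration of the metric in the table above, turn-level textual accuracy can be read as the fraction of dialogue turns where the predicted response text equals the reference response. The sketch below assumes exactly that definition, with case and surrounding whitespace ignored; both the definition and the `turn_accuracy` name are assumptions, not taken from the slide.

```python
from typing import Sequence


def turn_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Share of turns whose predicted text equals the reference text."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predicted, reference)
    )
    return matches / len(reference)
```

With two turns where only the first prediction matches its reference, `turn_accuracy` returns 0.5.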
41. DeepPavlov.ai
Sequence-To-Sequence Dialogue Bot For Goal-Oriented Task
Model | Test BLEU
DeepPavlov implementation of KV Retrieval Net | 13.2
KV Retrieval Net from [1] | 13.2
Copy Net from [1] | 11.0
Attn. Seq2Seq from [1] | 10.2
Rule-Based from [1] | 6.60
[1] Mihail Eric, Lakshmi Krishnan, Francois Charette, and Christopher D. Manning, “Key-Value Retrieval Networks for Task-Oriented Dialogue” – 2017
Model | Test BLEU (Weather) | Test BLEU (Navigation) | Test BLEU (Schedules)
DeepPavlov implementation of KV Retrieval Net | 14.6 | 12.5 | 11.9
Wen et al. [2] | 14.9 | 13.7 | -
[2] Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin and Ting Liu. Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation. COLING 2018.
42. DeepPavlov.ai
Latest release
DeepPavlov 0.3.0
* BERT-based models for ranking, NER,
classification and Text Q&A (SQuAD)
* New SMN, DAM, DAM-USE-T ranking models
* Multilingual NER for 100 languages
* New AIML wrapper component
43. DeepPavlov.ai
Future steps
• Better usability
- Improved Python API
- Tutorials and how-to examples
• Support for script-based skills
- Python API with script uploading from file
- DSL
- GUI tool for fast script prototyping
• Skill manager
- Implementation of baseline multi-skill manager with ranking model
- Adding rich context to the skill manager
• Research
- Training with low data (transfer learning, language models etc.)
- Better dialogue models combining knowledge graphs and deep learning to address lack of common
sense in current solutions
44. DeepPavlov.ai
• Code
- https://github.com/deepmipt/DeepPavlov
• Documentation
- http://docs.deeppavlov.ai/
• Demo (experimental, not all models have the same performance as in the library)
- http://demo.ipavlov.ai/
• Tutorials
- Simple text classification skill of DeepPavlov
▪ https://towardsdatascience.com/simple-text-classification-skill-of-deeppavlov-54bc1b61c9ea
- Open-domain question answering with DeepPavlov
▪ https://medium.com/deeppavlov/open-domain-question-answering-with-deeppavlov-c665d2ee4d65
• References
- Burtsev M., et al. DeepPavlov: Open-Source Library for Dialogue Systems // Proceedings of ACL 2018, System Demonstrations (2018): 122-127.
- Burtsev M., et al. DeepPavlov: An Open Source Library for Conversational AI // Proceedings of NeurIPS 2018, MLOSS Workshop, 2018.
45. DeepPavlov.ai
Q&A
1. What is an "ideal" framework for development of conversational
agents?
2. Do we need an "operating system" for conversational AI agents? If yes,
then what should it look like?
3. What are the most promising fields of application/verticals for
conversational AI right now?
4. Looking into the future of ML/AI, how will conversational AI evolve
and interrelate with other research and technology directions?