16/11/2020 1
A/Prof Truyen Tran
With contribution from Vuong Le, Hung
Le, Thao Le, Tin Pham & Dung Nguyen
Deakin University
December 2020
Deep learning 1.0 and Beyond
A tutorial
Part II
@truyenoz
truyentran.github.io
truyen.tran@deakin.edu.au
letdataspeak.blogspot.com
goo.gl/3jJ1O0
linkedin.com/in/truyen-tran
“[By 2023] …
Emergence of the
generally agreed upon
"next big thing" in AI
beyond deep learning.”
Rodney Brooks
rodneybrooks.com
“[…] general-purpose computer
programs, built on top of far richer
primitives than our current
differentiable layers—[…] we will
get to reasoning and abstraction,
the fundamental weakness of
current models.”
Francois Chollet
blog.keras.io
“Software 2.0 is written in
neural network weights”
Andrej Karpathy
medium.com/@karpathy
DL 1.0 has been fantastic, but has serious limitations
(but not always its fault)
DL builds glorified function approximators using gradient descent
 Great at interpolating. Think GPT-X.
 One-step input/output mapping
 Requires differentiability
Little systematic generalization
#REF: Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018).
Data hungry to cover all possible
patterns
 Computation demanding to process large
data
 Energy inefficient
 Prohibitive for small labs to compete
 Engineering effort is huge → technical debt
A little too much heuristics; a lack of theory.
DL 1.0 has been fantastic, but has serious limitations
(but not always its fault) (cont.)
#REF: Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018).
Lack natural mechanism to
incorporate prior knowledge, e.g.,
common sense
Assumes stationarity
 Changes cause trouble → expensive retraining
 No causality → spurious correlations can be “learnt”
Sensitive to adversarial attacks
Lack of reasoning
 Pure pattern recognizer
 Little explainability
  Trust issue
To be fair, many of these problems are common issues of statistical learning!
DL 1.0 is great, but it struggles to solve many AI/ML problems
Learn to organize and remember ultra-long sequences
Learn to generate arbitrary objects, with zero or few supports
Reasoning about object, relation,
causality, self and other agents
Imagine scenarios, act on the world and learn from the feedback
Continual learning, never-ending, across
tasks, domains, representations
Learn by socializing
Learn just by observing and self-prediction
Organizing and reasoning about (common-
sense) knowledge
Automated discovery of physical laws
Solve genetics, neuroscience and
healthcare
Automate physical sciences
Automate software engineering
Agenda
Deep learning 1.0
 Classic models
 Transformers
 Graph neural networks
 Unsupervised learning
Deep learning 2.0
 Neural memories
 Neural reasoning
 Theory of mind
 A system view
1960s-1990s
 Hand-crafting rules,
domain-specific, logic-
based
 High in reasoning
 Can’t scale.
 Fail on unseen cases.
2020s-2030s
 Learning + reasoning, general
purpose, human-like
 Has contextual and common-
sense reasoning
 Requires less data
 Adapt to change
 Explainable
1990s-present
 Machine learning, general
purpose, statistics-based
 Low in reasoning
 Needs lots of data
 Less adaptive
 Little explanation
Photo credit: DARPA
System 1: Intuitive
• Fast
• Implicit/automatic
• Pattern recognition
• Multiple
System 2: Analytical
• Slow
• Deliberate/rational
• Careful analysis
• Single, sequential
• Hypothetical thought
• Decoupled from data rep
Single Memory
• Facts
• Semantics
• Events and relational associations
• Working space – temporal buffer
Pattern recognition
Reasoning
Current neural networks offerings
No storage of intermediate results
Little choices over what to compute and what to use
Lack of conditional computation
Little support for complex chained reasoning
Little support for rapid switching of tasks
Credit: hexahedria
What is missing? A memory
Use multiple pieces of information
Store intermediate results (RAM like)
Episodic recall of previous tasks (Tape like)
Encode/compress & generate/decompress
long sequences
Learn/store programs (e.g., fast weights)
Store and query external knowledge
Spatial memory for navigation
Rare but important events (e.g., snake
bite)
Needed for complex control
Short-cuts for ease of gradient
propagation = constant path length
Division of labour: program, execution
and storage
Working memory is an indicator of IQ in humans
Memory enables reasoning
Expert reasoning was enabled by a large long-term
memory, acquired through experience
Working memory for analytic reasoning
 WM is a system to support information binding to a coordinate system
 Reasoning as deliberative hypothesis testing → memory-retrieval based hypothesis generation
 Higher-order cognition = creating & manipulating relations → representation of premises, temporarily stored in WM.
Reasoning over concepts & relations requires semantic
memory
Memory is critical for episodic future thinking (mental
simulation)
“[…] one cannot hope to
understand reasoning
without understanding the
memory processes […]”
(Thompson and Feeney, 2014)
Agenda
Deep learning 1.0
 Classic models
 Transformers
 Graph neural networks
 Unsupervised learning
Deep learning 2.0
 Neural memories
 Neural reasoning
 Theory of mind
 A system view
Recall: Memory networks
 Input is a set → load into memory, which is NOT updated.
 State is an RNN with attention reading from inputs
 Concepts: query, key and content + content addressing.
 Deep models, but constant path length from input to output.
 Equivalent to an RNN with a shared input set.
Sukhbaatar, Sainbayar, Jason Weston, and Rob
Fergus. "End-to-end memory networks." Advances in
neural information processing systems. 2015.
MANN: Memory-Augmented Neural Networks
(a constant path length)
Long-term dependency
E.g., outcome depends on the far past
Memory is needed (e.g., as in LSTM)
Complex program requires multiple computational steps
Each step can be selective (attentive) to certain memory cell
Operations: Encoding | Decoding | Retrieval
Learning a Turing machine
 Can we learn a (neural)
program that learns to
program from data?
Visual reasoning is a
specific program of two
inputs (visual, linguistic)
Neural Turing machine (NTM)
(simulating a differentiable Turing machine)
A controller that takes
input/output and talks to an
external memory module.
Memory has read/write
operations.
The main issue is where to write,
and how to update the memory
state.
All operations are differentiable.
Source: rylanschaeffer.github.io
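The “where to write, and how to update the memory state” question is answered in the NTM by a differentiable erase-then-add write. A minimal numpy sketch following Graves et al. (2014); the sharp one-hot write weighting is illustrative, in practice the weights come from learned content/location addressing:

```python
import numpy as np

def ntm_write(M, w, erase, add):
    """One differentiable NTM write: selectively erase, then add.

    M:     (n, d) memory matrix
    w:     (n,)   write weights over locations (sum to 1)
    erase: (d,)   erase vector, entries in [0, 1]
    add:   (d,)   add vector
    """
    M = M * (1.0 - np.outer(w, erase))  # wipe content where w and erase are high
    return M + np.outer(w, add)         # write new content at the same locations

M = np.ones((3, 2))                     # toy memory
w = np.array([0.0, 1.0, 0.0])           # sharp focus on slot 1
M = ntm_write(M, w, erase=np.ones(2), add=np.array([5.0, -1.0]))
```

Because every operation is a smooth function of the weights, gradients flow through the write and the controller can learn its addressing behaviour.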
NTM operations
medium.com/@aidangomez
rylanschaeffer.github.io
NTM unrolled in time with LSTM as controller
#Ref: https://medium.com/snips-ai/ntm-lasagne-a-library-for-neural-turing-machines-in-lasagne-2cdce6837315
MANN for reasoning
Three steps:
 Store data into memory
 Read query, process sequentially, consult memory
 Output answer
Behind the scenes:
 Memory contains data & results of intermediate steps
Drawbacks of current MANNs:
 No memory of controllers → less modularity and compositionality when the query is complex
 No memory of relations → much harder to chain predicates.
Source: rylanschaeffer.github.io
Failures of item-only MANNs for reasoning
Relational representation is NOT stored → can’t be reused later in the chain
A single memory of items and relations → can’t understand how relational reasoning occurs
The memory-memory relationship is coarse, since it is represented as either a dot product or a weighted sum.
Self-attentive associative memories (SAM)
Learning relations automatically over time
Hung Le, Truyen Tran, Svetha Venkatesh, “Self-
attentive associative memory”, ICML'20.
NUTM = NTM + NSM
Hung Le, Truyen Tran, Svetha Venkatesh,
“Neural stored-program memory”, ICLR'20.
Computing devices vs neural counterparts
FSM (1943) ↔ RNNs (1982)
PDA (1954) ↔ Stack RNN (1993)
TM (1936) ↔ NTM (2014)
UTM/VNA (1936/1945) ↔ NUTM (2019)
Agenda
Deep learning 1.0
 Classic models
 Transformers
 Graph neural networks
 Unsupervised learning
Deep learning 2.0
 Neural memories
 Neural reasoning
 Theory of mind
 A system view
A testbed: Visual QA
Q: What color is the thing with the same size as the blue cylinder?
A: green
• Requires multi-step reasoning: find the blue cylinder ➔ locate the other object of the same size ➔ determine its color (green).
(Figure: Visual QA sits at the intersection of Computer Vision, Natural Language Processing and Machine Learning, touching on qualitative spatial reasoning, relational/temporal inference, commonsense, object recognition, scene graphs, parsing, symbol binding, systematic generalisation, learning to classify entailment, unsupervised learning, reinforcement learning, program synthesis, action graphs, event detection, object discovery, and learning to reason.)
Learning to reason
Learning is to improve oneself by experiencing ~ acquiring knowledge & skills
Reasoning is to deduce knowledge from previously acquired knowledge in response to a query (or cues)
Learning to reason is to improve the ability to decide if a knowledge base entails a predicate.
 E.g., given a video f, determine whether the person with the hat turns before singing.
Hypotheses:
 Reasoning as just-in-time program synthesis.
 It employs conditional computation.
Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM
(JACM) 44.5 (1997): 697-725.
(Dan Roth; ACM
Fellow; IJCAI John
McCarthy Award)
Why neural reasoning?
Reasoning is not necessarily achieved by making
logical inferences
There is a continuity between [algebraically rich
inference] and [connecting together trainable
learning systems]
Central to reasoning is composition rules to guide
the combinations of modules to address new tasks
“When we observe a visual scene, when
we hear a complex sentence, we are
able to explain in formal terms the
relation of the objects in the scene, or
the precise meaning of the sentence
components. However, there is no
evidence that such a formal analysis
necessarily takes place: we see a scene,
we hear a sentence, and we just know
what they mean. This suggests the
existence of a middle layer, already a
form of reasoning, but not yet formal
or logical.”
Bottou, Léon. "From machine learning to machine
reasoning." Machine learning 94.2 (2014): 133-149.
The two approaches to neural reasoning
Implicit chaining of predicates through recurrence:
 Step-wise query-specific attention to relevant concepts & relations.
 Iterative concept refinement & combination, e.g., through a working
memory.
 Answer is computed from the last memory state & question embedding.
Explicit program synthesis:
 There is a set of modules, each performing a pre-defined operation.
 The question is parsed into a symbolic program.
 The program is implemented as a computational graph constructed by chaining separate modules.
 The program is executed to compute an answer.
MACNet: Memory-Attention-Composition
(reasoning by progressive refinement of selected data)
Hudson, Drew A., and Christopher D. Manning.
"Compositional attention networks for machine
reasoning." arXiv preprint arXiv:1803.03067 (2018).
LOGNet: Relational object reasoning with language binding
• Key insight: Reasoning is chaining of relational predicates to arrive at a final conclusion
→ Needs to uncover spatial relations, conditioned on the query
→ Chaining is query-driven
→ Objects/language need binding
→ Object semantics is query-dependent
→ Everything is end-to-end differentiable
System 1: visual
representation
System 2: High-level
reasoning
Thao Minh Le, Vuong Le, Svetha Venkatesh, and
Truyen Tran, “Dynamic Language Binding in
Relational Visual Reasoning”, IJCAI’20.
Language-binding Object Graph Network for VQA
Thao Minh Le, Vuong Le,
Svetha Venkatesh, and
Truyen Tran, “Dynamic
Language Binding in
Relational Visual
Reasoning”, IJCAI’20.
Transformer as implicit reasoning
Reasoning as (free-)energy minimisation
The classic Belief Propagation algorithm is a minimisation algorithm for the Bethe free energy!
Transformers perform relational, iterative state refinement, which makes them great candidates for implicit relational reasoning.
Heskes, Tom. "Stable fixed points of loopy belief propagation are local minima of the bethe free
energy." Advances in neural information processing systems. 2003.
Ramsauer, Hubert, et al. "Hopfield networks is all you need." arXiv preprint
arXiv:2008.02217 (2020).
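The Hopfield view can be made concrete: iterating the transformer-style attention update pulls a noisy query onto the nearest stored pattern. A minimal numpy sketch in the spirit of Ramsauer et al.; the inverse temperature beta and the toy patterns are illustrative:

```python
import numpy as np

def hopfield_retrieve(q, X, beta=8.0, steps=3):
    """Modern-Hopfield retrieval: iterated attention over stored
    patterns converges to the pattern closest to the query.

    q: (d,)   query (a noisy / partial pattern)
    X: (m, d) stored patterns
    """
    for _ in range(steps):
        p = np.exp(beta * (X @ q))
        p /= p.sum()          # softmax attention over stored patterns
        q = p @ X             # one transformer-style attention update
    return q

X = np.array([[1.0, 1.0, -1.0],
              [-1.0, 1.0, 1.0]])      # two stored patterns
noisy = np.array([0.9, 1.1, -0.8])    # corrupted copy of the first pattern
out = hopfield_retrieve(noisy, X)
```

With a large enough beta the retrieval is essentially one-step, which is the paper's argument that a transformer layer performs associative recall.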
Source: mccormickml.com/2020/03/10/question-answering-with-a-fine-tuned-BERT/
On SQuAD, Answer = start/end positions
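The start/end formulation reduces extractive QA to two classifications over token positions. A sketch of the idea, not the exact BERT head (which applies the scoring vectors to every token and softmaxes); names and the toy data are ours:

```python
import numpy as np

def extract_span(H, w_start, w_end):
    """SQuAD-style span extraction: two scoring vectors over the
    contextual token representations pick the answer span.

    H: (T, d) token representations of [question; passage]
    w_start, w_end: (d,) learned scoring vectors
    """
    start = int(np.argmax(H @ w_start))
    end = start + int(np.argmax(H[start:] @ w_end))  # enforce end >= start
    return start, end

# Toy representations where token 2 "looks like" a start and token 4 an end.
H = np.zeros((6, 3))
H[2, 0] = 1.0
H[4, 1] = 1.0
span = extract_span(H, w_start=np.array([1.0, 0, 0]),
                    w_end=np.array([0, 1.0, 0]))
```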
Anonymous, “Neural spatio-temporal reasoning with object-centric self-
supervised learning”, https://openreview.net/pdf?id=rEaz5uTcL6Q
Mao, Jiayuan, et al. "The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences
From Natural Supervision." International Conference on Learning Representations. 2019.
NS-CL: Neuro-Symbolic Concept Learner
Question
parser
Extract object proposals from the image, from which a feature vector is obtained using RoI Align. Each object feature is denoted oᵢ.
Object concepts of the same attribute are mapped into an embedding space. For example, sphere, cube and cylinder are mapped into the shape embedding space. This mapping is a classification problem:
Pr[oᵢ is a cube] = σ( (⟨ShapeOf(oᵢ), v^cube⟩ − γ) / τ )
where
 ShapeOf(·) is a neural network
 v^cube is the concept embedding of cube, to be learned
 σ is the sigmoid function
 γ and τ are shifting and scaling constants.
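The concept score is easy to compute once the pieces are in place. In this sketch a cosine similarity stands in for the learned attribute network (ShapeOf), and the gamma/tau values are illustrative, not NS-CL's trained constants:

```python
import numpy as np

def concept_prob(obj_feature, concept_embedding, gamma=0.2, tau=0.1):
    """NS-CL-style concept score: sigmoid of the shifted, scaled
    similarity between the (mapped) object feature and the concept
    embedding. Cosine similarity stands in for the learned
    attribute network; gamma and tau are shift/scale constants."""
    sim = np.dot(obj_feature, concept_embedding) / (
        np.linalg.norm(obj_feature) * np.linalg.norm(concept_embedding))
    return 1.0 / (1.0 + np.exp(-(sim - gamma) / tau))

v_cube = np.array([0.3, 1.2, -0.5])
p_match = concept_prob(v_cube, v_cube)                        # identical -> high
p_off = concept_prob(np.array([1.0, 0, 0]), np.array([0, 1.0, 0]))  # orthogonal -> low
```

Because the score is a smooth function of both the object feature and the concept embedding, concept learning reduces to ordinary gradient-based classification.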
Concept learner
Program execution
Works on object-based visual representation
An intermediate set of objects is represented by a vector: an attention mask over all objects in the scene. For example, Filter(Green_cube) outputs a mask (0,1,0,0).
The output mask is fed into the next module (e.g., Relate)
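The mask-passing execution can be sketched on a toy scene. The module name Filter follows the slide; the hard 0/1 masks replace NS-CL's soft attention for clarity, and the scene itself is made up:

```python
import numpy as np

# Toy scene: each object is a (color, shape) pair.
scene = [("green", "cube"), ("red", "sphere"), ("green", "sphere")]

def Filter(mask, predicate):
    """Keep only the objects satisfying the predicate. NS-CL uses
    soft attention masks; hard 0/1 masks are used here for clarity."""
    return np.array([m * float(predicate(o)) for m, o in zip(mask, scene)])

all_objects = np.ones(len(scene))
green = Filter(all_objects, lambda o: o[0] == "green")
green_cubes = Filter(green, lambda o: o[1] == "cube")  # chain modules by passing the mask
```

Chaining modules is then just feeding one module's output mask into the next, which is exactly the computational graph a parsed program describes.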
Agenda
Deep learning 1.0
 Classic models
 Transformers
 Graph neural networks
 Unsupervised learning
Deep learning 2.0
 Neural memories
 Neural reasoning
 Theory of mind
 A system view
Contextualized recursive reasoning
Thus far, QA tasks are straightforward and
objective:
Questioner: I will ask about what I don’t know.
Answerer: I will answer what I know.
Real life can be tricky, more subjective:
Questioner: I will ask only questions I think
they can answer.
Answerer 1: This is what I think they want from
an answer.
Answerer 2: I will answer only what I think
they think I can.
Source: religious studies project
 We need Theory of Mind to function socially.
Sally and Anne
Sally Anne
Sally puts her cake
into her basket
Sally’s basket Anne’s box
Sally goes out of
the room.
Anne takes Sally’s cake out of Sally’s basket and puts it into Anne’s box
Sally comes back to
the room
Photo: wikipedia
Social dilemma: Stag Hunt games
Difficult decision: individual outcomes (selfish) or group outcomes
(cooperative).
 Together hunt Stag (both cooperative): both have more meat.
 Solely hunt Hare (both selfish): both have less meat.
 One hunts Stag (cooperative), the other hunts Hare (selfish): only the hare hunter has meat.
Human evidence: Self-interested but considerate of others
(cultures vary).
Idea: Belief-based guilt-aversion
 One experiences loss if it lets other down.
 Necessitates Theory of Mind: reasoning about other’s mind.
A neural theory of mind
Predicts: next-step action probability, goal, successor representations
Rabinowitz, Neil C., et al.
"Machine theory of
mind." arXiv preprint
arXiv:1802.07740 (2018).
Theory of Mind Agent with Guilt Aversion (ToMAGA)
Update Theory of Mind
 Predict whether the other’s behaviour is cooperative or uncooperative
 Update the zero-order belief (what the other will do)
 Update the first-order belief (what the other thinks about me)
Guilt Aversion
 Compute the expected material reward of
other based on Theory of Mind
 Compute the psychological rewards, i.e.
“feeling guilty”
 Reward shaping: subtract the expected loss of
the other.
Nguyen, Dung, et al. "Theory of Mind with Guilt
Aversion Facilitates Cooperative Reinforcement
Learning." Asian Conference on Machine Learning.
PMLR, 2020.
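The guilt-aversion shaping can be sketched on the Stag Hunt. The payoff numbers and the guilt coefficient are illustrative, and this is a sketch of the idea (subtract the expected letdown of the other player), not ToMAGA's exact update:

```python
import numpy as np

# Illustrative Stag Hunt payoffs: PAYOFF[my_action, other_action],
# action 0 = hunt stag (cooperate), 1 = hunt hare (defect).
PAYOFF = np.array([[4.0, 0.0],    # both stag: 4 | I stag, other hare: 0
                   [3.0, 2.0]])   # I hare, other stag: 3 | both hare: 2

def shaped_reward(my_action, other_action, belief_other_expects, guilt=0.5):
    """Guilt-averse reward shaping: subtract a psychological penalty
    proportional to how far the other falls short of what (our
    first-order belief says) they expected to receive."""
    material = PAYOFF[my_action, other_action]
    other_material = PAYOFF[other_action, my_action]
    letdown = max(0.0, belief_other_expects - other_material)
    return material - guilt * letdown

r_coop = shaped_reward(0, 0, belief_other_expects=4.0)    # mutual stag: no guilt
r_betray = shaped_reward(1, 0, belief_other_expects=4.0)  # defecting triggers guilt
```

With the guilt term, defecting against an expected cooperator pays less than cooperating, which is why the Theory-of-Mind belief estimate matters: the penalty depends on what the agent believes the other expected.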
System 1: Intuitive
• Fast
• Implicit/automatic
• Pattern recognition
• Multiple
System 2: Analytical
• Slow
• Deliberate/rational
• Careful analysis
• Single, sequential
• Hypothetical thought
• Decoupled from data rep
Single Memory
• Facts
• Semantics
• Events and relational associations
• Working space – temporal buffer
Pattern recognition
Reasoning
Summary
Deep learning 1.0
 Classic models
 Transformers
 Graph neural networks
 Unsupervised learning
Deep learning 2.0
 Neural memories
 Neural reasoning
 Theory of mind
 A system view
End of part II
References
Anonymous, “Neural spatio-temporal reasoning with object-centric self-supervised learning”, https://openreview.net/pdf?id=rEaz5uTcL6Q
Bello, Irwan, et al. "Neural optimizer search with reinforcement learning." arXiv preprint arXiv:1709.07417 (2017).
Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." IEEE
transactions on pattern analysis and machine intelligence 35.8 (2013): 1798-1828.
Bottou, Léon. "From machine learning to machine reasoning." Machine learning 94.2 (2014): 133-149.
Dehghani, Mostafa, et al. "Universal Transformers." International Conference on Learning Representations. 2018.
Kien Do, Truyen Tran, and Svetha Venkatesh. "Graph Transformation Policy Network for Chemical Reaction
Prediction." KDD’19.
Kien Do, Truyen Tran, Svetha Venkatesh, “Learning deep matrix representations”, arXiv preprint arXiv:1703.01454
Gilmer, Justin, et al. "Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017).
Ha, David, Andrew Dai, and Quoc V. Le. "Hypernetworks." arXiv preprint arXiv:1609.09106 (2016).
Heskes, Tom. "Stable fixed points of loopy belief propagation are local minima of the bethe free energy." Advances in
neural information processing systems. 2003.
Hudson, Drew A., and Christopher D. Manning. "Compositional attention networks for machine reasoning." arXiv preprint arXiv:1803.03067 (2018).
Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2017). Progressive growing of gans for improved quality, stability, and
variation. arXiv preprint arXiv:1710.10196.
Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM (JACM) 44.5 (1997): 697-725.
Hung Le, Truyen Tran, Svetha Venkatesh, “Self-attentive associative memory”, ICML'20.
Hung Le, Truyen Tran, Svetha Venkatesh, “Neural stored-program memory”, ICLR'20.
Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Dynamic Language Binding in Relational Visual
Reasoning”, IJCAI’20.
Le-Khac, Phuc H., Graham Healy, and Alan F. Smeaton. "Contrastive Representation Learning: A Framework and
Review." arXiv preprint arXiv:2010.05113 (2020).
Liu, Xiao, et al. "Self-supervised learning: Generative or contrastive." arXiv preprint arXiv:2006.08218 (2020).
Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018).
Mao, Jiayuan, et al. "The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural
Supervision." International Conference on Learning Representations. 2019.
Nguyen, Dung, et al. "Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning." Asian
Conference on Machine Learning. PMLR, 2020.
Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X-ray structure of dopamine transporter elucidates antidepressant
mechanism." Nature 503.7474 (2013): 85-90.
Pham, Trang, et al. "Column Networks for Collective Classification." AAAI. 2017.
Ramsauer, Hubert, et al. "Hopfield networks is all you need." arXiv preprint arXiv:2008.02217 (2020).
Rabinowitz, Neil C., et al. "Machine theory of mind." arXiv preprint arXiv:1802.07740 (2018).
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information
processing systems. 2015.
Tay, Yi, et al. "Efficient transformers: A survey." arXiv preprint arXiv:2009.06732 (2020).
Xie, Tian, and Jeffrey C. Grossman. "Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable
Prediction of Material Properties." Physical review letters 120.14 (2018): 145301.
You, Jiaxuan, et al. "GraphRNN: Generating realistic graphs with deep auto-regressive models." ICML (2018).
 
AI for tackling climate change
AI for tackling climate changeAI for tackling climate change
AI for tackling climate changeDeakin University
 
Deep learning for episodic interventional data
Deep learning for episodic interventional dataDeep learning for episodic interventional data
Deep learning for episodic interventional dataDeakin University
 
Deep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining IIDeep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining IIDeakin University
 
Deep learning for biomedicine
Deep learning for biomedicineDeep learning for biomedicine
Deep learning for biomedicineDeakin University
 

More from Deakin University (11)

Deep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advancesDeep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advances
 
AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of Materials
 
Generative AI: Shifting the AI Landscape
Generative AI: Shifting the AI LandscapeGenerative AI: Shifting the AI Landscape
Generative AI: Shifting the AI Landscape
 
AI in the Covid-19 pandemic
AI in the Covid-19 pandemicAI in the Covid-19 pandemic
AI in the Covid-19 pandemic
 
AI for tackling climate change
AI for tackling climate changeAI for tackling climate change
AI for tackling climate change
 
AI for drug discovery
AI for drug discoveryAI for drug discovery
AI for drug discovery
 
Deep learning for episodic interventional data
Deep learning for episodic interventional dataDeep learning for episodic interventional data
Deep learning for episodic interventional data
 
Deep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining IIDeep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining II
 
AI that/for matters
AI that/for mattersAI that/for matters
AI that/for matters
 
Deep learning for biomedicine
Deep learning for biomedicineDeep learning for biomedicine
Deep learning for biomedicine
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

Deep learning 1.0 and Beyond, Part 2

  • 1. 16/11/2020 1 A/Prof Truyen Tran With contribution from Vuong Le, Hung Le, Thao Le, Tin Pham & Dung Nguyen Deakin University December 2020 Deep learning 1.0 and Beyond A tutorial Part II @truyenoz truyentran.github.io truyen.tran@deakin.edu.au letdataspeak.blogspot.com goo.gl/3jJ1O0 linkedin.com/in/truyen-tran
  • 2. 16/11/2020 2 “[By 2023] … Emergence of the generally agreed upon "next big thing" in AI beyond deep learning.” Rodney Brooks rodneybrooks.com “[…] general-purpose computer programs, built on top of far richer primitives than our current differentiable layers—[…] we will get to reasoning and abstraction, the fundamental weakness of current models.” Francois Chollet blog.keras.io “Software 2.0 is written in neural network weights” Andrej Karpathy medium.com/@karpathy
  • 3. DL 1.0 has been fantastic, but has serious limitations (but not always its fault) DL builds glorified function approximators using gradient descent  Great at interpolating. Think GPT-X.  One-step input/output mapping  Require differentiability Little systematic generalization #REF: Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018). Data hungry to cover all possible patterns  Computation demanding to process large data  Energy inefficient  Prohibitive for small labs to compete  Engineering effort is huge  Technical debt A little too much heuristic. Lack of theory.
  • 4. DL 1.0 has been fantastic, but has serious limitations (but not always its fault) (cont.) #REF: Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018). Lack natural mechanism to incorporate prior knowledge, e.g., common sense Assume stationarity  Changes cause trouble  Expensive retraining  No causality  Random correlations can be “learnt” Sensitive to adversarial attacks Lack of reasoning  Pure pattern recognizer  Little explainability   Trust issue To be fair, many of these problems are common issues of statistical learning!
  • 5. DL 1.0 is great, but it struggles to solve many AI/ML problems Learn to organize and remember ultra-long sequences Learn to generate arbitrary objects, with zero support Reasoning about object, relation, causality, self and other agents Imagine scenarios, act on the world and learn from the feedback Continual learning, never-ending, across tasks, domains, representations Learn by socializing Learn just by observing and self-prediction Organizing and reasoning about (common-sense) knowledge Automated discovery of physical laws Solve genetics, neuroscience and healthcare Automate physical sciences Automate software engineering
  • 6. Neural memories Theory of mind Neural reasoning A system view Deep learning 2.0 16/11/2020 6 Classic models Transformers Graph neural networks Unsupervised learning Deep learning 1.0 Agenda
  • 7. 1960s-1990s  Hand-crafting rules, domain-specific, logic- based  High in reasoning  Can’t scale.  Fail on unseen cases. 16/11/2020 7 2020s-2030s  Learning + reasoning, general purpose, human-like  Has contextual and common- sense reasoning  Requires less data  Adapt to change  Explainable 1990s-present  Machine learning, general purpose, statistics-based  Low in reasoning  Needs lots of data  Less adaptive  Little explanation Photo credit: DARPA
  • 8. System 1: Intuitive • Fast • Implicit/automatic • Pattern recognition • Multiple System 2: Analytical • Slow • Deliberate/rational • Careful analysis • Single, sequential • Hypothetical thought • Decoupled from data rep Single Memory • Facts • Semantics • Events and relational associations • Working space – temporal buffer Pattern recognition Reasoning
  • 9. Current neural networks offerings 16/11/2020 9 No storage of intermediate results Little choices over what to compute and what to use Lack of conditional computation Little support for complex chained reasoning Little support for rapid switching of tasks Credit: hexahedria
  • 10. What is missing? A memory Use multiple pieces of information Store intermediate results (RAM like) Episodic recall of previous tasks (Tape like) Encode/compress & generate/decompress long sequences Learn/store programs (e.g., fast weights) Store and query external knowledge Spatial memory for navigation 16/11/2020 10 Rare but important events (e.g., snake bite) Needed for complex control Short-cuts for ease of gradient propagation = constant path length Division of labour: program, execution and storage Working-memory is an indicator of IQ in human
  • 11. Memory enables reasoning Expert reasoning was enabled by a large long-term memory, acquired through experience Working memory for analytic reasoning  WM is a system to support information binding to a coordinate system  Reasoning as deliberative hypothesis testing  memory-retrieval based hypothesis generation  Higher order cognition = creating & manipulating relations  representation of premises, temporarily stored in WM. Reasoning over concepts & relations requires semantic memory Memory is critical for episodic future thinking (mental simulation) 16/11/2020 11 “[…] one cannot hope to understand reasoning without understanding the memory processes […]” (Thompson and Feeney, 2014)
  • 12. Neural memories Theory of mind Neural reasoning A system view Deep learning 2.0 16/11/2020 12 Classic models Transformers Graph neural networks Unsupervised learning Deep learning 1.0 Agenda
  • 13. Recall: Memory networks  Input is a set  Load into memory, which is NOT updated.  State is a RNN with attention reading from inputs  Concepts: Query, key and content + Content addressing.  Deep models, but constant path length from input to output.  Equivalent to a RNN with shared input set. 16/11/2020 13 Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.
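The query–key–content addressing on this slide can be sketched in a few lines of NumPy (a toy illustration; names such as `memory_read` are ours, not from Sukhbaatar et al.):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(query, keys, contents):
    """Content-based addressing: score each memory slot by
    query-key similarity, then return the attention-weighted
    sum of the stored contents."""
    scores = keys @ query      # (slots,) similarity per slot
    weights = softmax(scores)  # soft attention over slots
    return weights @ contents  # (content_dim,) blended read-out

# Toy memory: 3 slots, 2-d keys and 2-d contents.
keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
contents = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
out = memory_read(np.array([5.0, 0.0]), keys, contents)
```

Because the read is a softmax-weighted sum, the whole operation stays differentiable, which is what lets the controller be trained end-to-end.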
  • 14. MANN: Memory-Augmented Neural Networks (a constant path length) Long-term dependency E.g., outcome depends on the far past Memory is needed (e.g., as in LSTM) Complex program requires multiple computational steps Each step can be selective (attentive) to certain memory cell Operations: Encoding | Decoding | Retrieval
  • 15. 16/11/2020 15 Learning a Turing machine  Can we learn a (neural) program that learns to program from data? Visual reasoning is a specific program of two inputs (visual, linguistic)
  • 16. Neural Turing machine (NTM) (simulating a differentiable Turing machine) A controller that takes input/output and talks to an external memory module. Memory has read/write operations. The main issue is where to write, and how to update the memory state. All operations are differentiable. Source: rylanschaeffer.github.io
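The write operation just described (erase then add, both gated by a differentiable attention vector) can be sketched as follows (a minimal NumPy sketch of the memory update only, not the full controller):

```python
import numpy as np

def ntm_write(memory, w, erase, add):
    """Differentiable NTM write: each slot i is partially erased
    (scaled by 1 - w[i] * erase) and then the add vector is
    blended in, weighted by the attention w."""
    M = memory * (1.0 - np.outer(w, erase))  # erase phase
    M = M + np.outer(w, add)                 # add phase
    return M

memory = np.zeros((4, 3))               # 4 slots, width 3
w = np.array([0.0, 1.0, 0.0, 0.0])      # sharp focus on slot 1
new_M = ntm_write(memory, w,
                  erase=np.ones(3),
                  add=np.array([1.0, 2.0, 3.0]))
```

With a soft (non one-hot) `w`, the same update writes a little into every slot, which is exactly what makes the addressing trainable by gradient descent.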
  • 18. 16/11/2020 18 NTM unrolled in time with LSTM as controller #Ref: https://medium.com/snips-ai/ntm-lasagne-a-library-for-neural-turing-machines-in-lasagne-2cdce6837315
  • 19. MANN for reasoning Three steps:  Store data into memory  Read query, process sequentially, consult memory  Output answer Behind the scene:  Memory contains data & results of intermediate steps Drawbacks of current MANNs:  No memory of controllers  Less modularity and compositionality when query is complex  No memory of relations  Much harder to chain predicates. 16/11/2020 19 Source: rylanschaeffer.github.io
  • 20. Failures of item-only MANNs for reasoning Relational representation is NOT stored  Can’t reuse later in the chain A single memory of items and relations  Can’t understand how relational reasoning occurs The memory-memory relationship is coarse since it is represented as either a dot product or a weighted sum. 16/11/2020 20
  • 21. Self-attentive associative memories (SAM) Learning relations automatically over time 16/11/2020 21 Hung Le, Truyen Tran, Svetha Venkatesh, “Self- attentive associative memory”, ICML'20.
  • 22. NUTM = NTM + NSM Hung Le, Truyen Tran, Svetha Venkatesh, “Neural stored-program memory”, ICLR'20.
  • 23. Computing devices vs neural counterparts FSM (1943) ↔ RNNs (1982) PDA (1954) ↔ Stack RNN (1993) TM (1936) ↔ NTM (2014) UTM/VNA (1936/1945) ↔ NUTM (2019)
  • 24. Neural memories Theory of mind Neural reasoning A system view Deep learning 2.0 16/11/2020 24 Classic models Transformers Graph neural networks Unsupervised learning Deep learning 1.0 Agenda
  • 25. 25 What color is the thing with the same size as the blue cylinder? blue • Requires multi-step reasoning: find blue cylinder ➔ locate other object of the same size ➔ determine its color (green). A testbed: Visual QA
  • 26. 26 Reasoning Qualitative spatial reasoning Relational, temporal inference Commonsense Object recognition Scene graphs Computer Vision Natural Language Processing Machine learning Visual QA Parsing Symbol binding Systematic generalisation Learning to classify entailment Unsupervised learning Reinforcement learning Program synthesis Action graphs Event detection Object discovery
  • 27. Learning to reason Learning is to improve itself by experiencing ~ acquiring knowledge & skills Reasoning is to deduce knowledge from previously acquired knowledge in response to a query (or a cue) Learning to reason is to improve the ability to decide if a knowledge base entails a predicate.  E.g., given a video f, determines if the person with the hat turns before singing. Hypotheses:  Reasoning as just-in-time program synthesis.  It employs conditional computation. 16/11/2020 27 Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM (JACM) 44.5 (1997): 697-725. (Dan Roth; ACM Fellow; IJCAI John McCarthy Award)
  • 28. Why neural reasoning? Reasoning is not necessarily achieved by making logical inferences There is a continuity between [algebraically rich inference] and [connecting together trainable learning systems] Central to reasoning are composition rules to guide the combinations of modules to address new tasks 16/11/2020 28 “When we observe a visual scene, when we hear a complex sentence, we are able to explain in formal terms the relation of the objects in the scene, or the precise meaning of the sentence components. However, there is no evidence that such a formal analysis necessarily takes place: we see a scene, we hear a sentence, and we just know what they mean. This suggests the existence of a middle layer, already a form of reasoning, but not yet formal or logical.” Bottou, Léon. "From machine learning to machine reasoning." Machine learning 94.2 (2014): 133-149.
  • 29. The two approaches to neural reasoning Implicit chaining of predicates through recurrence:  Step-wise query-specific attention to relevant concepts & relations.  Iterative concept refinement & combination, e.g., through a working memory.  Answer is computed from the last memory state & question embedding. Explicit program synthesis:  There is a set of modules, each performing a pre-defined operation.  Question is parsed into a symbolic program.  The program is implemented as a computational graph constructed by chaining separate modules.  The program is executed to compute an answer. 16/11/2020 29
  • 30. MACNet: Composition-Attention- Control (reasoning by progressive refinement of selected data) 16/11/2020 30 Hudson, Drew A., and Christopher D. Manning. "Compositional attention networks for machine reasoning." arXiv preprint arXiv:1803.03067 (2018).
  • 31. LOGNet: Relational object reasoning with language binding 31 • Key insight: Reasoning is chaining of relational predicates to arrive at a final conclusion → Needs to uncover spatial relations, conditioned on query → Chaining is query-driven → Objects/language need binding → Object semantics is query-dependent → Everything is end-to-end differentiable System 1: visual representation System 2: High-level reasoning Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Dynamic Language Binding in Relational Visual Reasoning”, IJCAI’20.
  • 32. 32 Language-binding Object Graph Network for VQA Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Dynamic Language Binding in Relational Visual Reasoning”, IJCAI’20.
  • 34. Transformer as implicit reasoning Reasoning as (free-) energy minimisation The classic Belief Propagation algorithm is a minimization algorithm of the Bethe free energy! The Transformer’s relational, iterative state refinement makes it a great candidate for implicit relational reasoning. 16/11/2020 34 Heskes, Tom. "Stable fixed points of loopy belief propagation are local minima of the bethe free energy." Advances in neural information processing systems. 2003. Ramsauer, Hubert, et al. "Hopfield networks is all you need." arXiv preprint arXiv:2008.02217 (2020).
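The Hopfield-network view of attention cited above can be illustrated directly: one retrieval update of a modern Hopfield network has the same form as a single attention step, softmax(β·q·Kᵀ)·K (toy NumPy sketch; `beta` plays the role of the inverse temperature in Ramsauer et al.):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hopfield_retrieve(state, patterns, beta=8.0):
    """One update of a modern Hopfield network: attend over the
    stored patterns with the current state as the query, then
    return the attention-weighted combination of the patterns."""
    weights = softmax(beta * (patterns @ state))
    return weights @ patterns

# Two stored patterns; a noisy cue is pulled to the nearest one.
patterns = np.array([[1.0, 0.0], [0.0, 1.0]])
retrieved = hopfield_retrieve(np.array([0.9, 0.1]), patterns)
```

Large `beta` makes the softmax nearly one-hot, so a partial cue converges to the closest stored pattern in one step, which is the fixed-point view of self-attention that underlies the "implicit reasoning" argument.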
  • 36. 16/11/2020 36 Anonymous, “Neural spatio-temporal reasoning with object-centric self-supervised learning”, https://openreview.net/pdf?id=rEaz5uTcL6Q Answer placeholder
  • 37. 38 Mao, Jiayuan, et al. "The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision." International Conference on Learning Representations. 2019. NS-CL: Neuro-Symbolic Concept Learner Question parser
  • 38. Extract object proposals from the image, from which a feature vector is obtained using RoI Align. Each object feature is denoted as o_i. Object concepts of the same attribute are mapped into an embedding space. For example, sphere, cube, and cylinder are mapped into the shape embedding space. This mapping is a classification problem! Pr[cube] = σ((⟨ShapeOf(o_i), v_cube⟩ − γ) / τ) where  ShapeOf(·) is a neural network  v_cube is the concept embedding of cube, to be learned  σ is the sigmoid function  γ and τ are scaling constants. Concept learner
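The concept-classification formula above can be written out as a tiny function (pure-Python sketch; the embeddings and the constants γ = 0.2, τ = 0.1 are illustrative values, not those of the paper):

```python
import math

def concept_prob(shape_embedding, concept_embedding, gamma=0.2, tau=0.1):
    """Pr[concept] = sigmoid((<ShapeOf(o_i), v_concept> - gamma) / tau).
    Here shape_embedding stands in for ShapeOf(o_i) and
    concept_embedding for the learned concept vector v_concept."""
    dot = sum(a * b for a, b in zip(shape_embedding, concept_embedding))
    return 1.0 / (1.0 + math.exp(-(dot - gamma) / tau))

# A well-aligned embedding scores high; a misaligned one scores low.
p_cube = concept_prob([0.9, 0.1], [1.0, 0.0])
p_sphere = concept_prob([0.9, 0.1], [0.0, 1.0])
```

The small temperature τ sharpens the sigmoid, so the classifier behaves almost like a hard decision while staying differentiable for end-to-end training.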
  • 39. Program execution Work on object-based visual representation An intermediate set of objects is represented by a vector, as an attention mask over all objects in the scene. For example, Filter(Green_cube) outputs a mask (0,1,0,0). The output mask is fed into the next module (e.g., Relate) 40
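How modules chain over attention masks can be illustrated with two toy modules (hypothetical sketches; the real NS-CL modules are learned, not hand-coded like this):

```python
import numpy as np

def filter_concept(mask, concept_scores):
    """Filter: keep objects matching a concept by multiplying the
    incoming mask with per-object concept probabilities."""
    return mask * concept_scores

def relate(mask, relation_matrix):
    """Relate: move attention along a relation; relation_matrix[i, j]
    scores 'object j stands in the relation to object i'."""
    return np.clip(mask @ relation_matrix, 0.0, 1.0)

scene_mask = np.ones(4)                      # start: attend to all 4 objects
green_cube = np.array([0.0, 1.0, 0.0, 0.0])  # concept scores per object
left_of = np.zeros((4, 4))
left_of[1, 3] = 1.0                          # object 3 is left of object 1

m = filter_concept(scene_mask, green_cube)   # mask (0, 1, 0, 0)
m = relate(m, left_of)                       # attention moves to object 3
```

Because each module maps a soft mask to a soft mask, arbitrary programs can be composed by chaining them, and gradients flow through the whole chain.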
  • 40. Neural memories Theory of mind Neural reasoning A system view Deep learning 2.0 16/11/2020 41 Classic models Transformers Graph neural networks Unsupervised learning Deep learning 1.0 Agenda
  • 41. Contextualized recursive reasoning Thus far, QA tasks are straightforward and objective: Questioner: I will ask about what I don’t know. Answerer: I will answer what I know. Real life can be tricky, more subjective: Questioner: I will ask only questions I think they can answer. Answerer 1: This is what I think they want from an answer. Answerer 2: I will answer only what I think they think I can. 16/11/2020 42 Source: religious studies project  We need Theory of Mind to function socially.
  • 42. Sally and Anne Sally Anne Sally puts her cake into her basket Sally’s basket Anne’s box Sally goes out of the room. Anne takes Sally’s cake out of Sally’s basket and put this cake into Anne’s box Sally comes back to the room 1 2 4 5 3 Photo: wikipedia
  • 43. Social dilemma: Stag Hunt games Difficult decision: individual outcomes (selfish) or group outcomes (cooperative).  Together hunt Stag (both are cooperative): Both have more meat.  Solely hunt Hare (both are selfish): Both have less meat.  One hunts Stag (cooperative), other hunts Hare (selfish): Only the one hunting Hare gets meat. Human evidence: Self-interested but considerate of others (cultures vary). Idea: Belief-based guilt-aversion  One experiences loss if it lets other down.  Necessitates Theory of Mind: reasoning about other’s mind.
  • 44. A neural theory of mind (figure labels: successor representations, next-step action probability, goal) Rabinowitz, Neil C., et al. "Machine theory of mind." arXiv preprint arXiv:1802.07740 (2018).
  • 45. Theory of Mind Agent with Guilt Aversion (ToMAGA) Update Theory of Mind  Predict whether other’s behaviour are cooperative or uncooperative  Updated the zero-order belief (what other will do)  Update the first-order belief (what other think about me) Guilt Aversion  Compute the expected material reward of other based on Theory of Mind  Compute the psychological rewards, i.e. “feeling guilty”  Reward shaping: subtract the expected loss of the other. Nguyen, Dung, et al. "Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning." Asian Conference on Machine Learning. PMLR, 2020.
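The reward-shaping step can be sketched as follows (a simplified reading of the ToMAGA idea; `guilt_weight` and the exact form of the penalty are our assumptions, not the paper's equations):

```python
def shaped_reward(material_reward, expected_other_reward,
                  predicted_other_reward, guilt_weight=1.0):
    """Guilt-aversion shaping: subtract the expected loss inflicted
    on the other agent, i.e. how far the other falls short of what
    (we believe, via Theory of Mind) they expected to receive."""
    letdown = max(0.0, expected_other_reward - predicted_other_reward)
    return material_reward - guilt_weight * letdown

# Cooperative outcome: the other gets what they expected -> no guilt.
r_coop = shaped_reward(4.0, expected_other_reward=4.0,
                       predicted_other_reward=4.0)
# Defecting: the other expected 4 but is predicted to get 1 -> guilt.
r_defect = shaped_reward(3.0, expected_other_reward=4.0,
                         predicted_other_reward=1.0)
```

The psychological penalty only fires when the agent believes it is letting the other down, which is why maintaining first-order beliefs about the other's expectations is a prerequisite.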
  • 46. System 1: Intuitive • Fast • Implicit/automatic • Pattern recognition • Multiple System 2: Analytical • Slow • Deliberate/rational • Careful analysis • Single, sequential • Hypothetical thought • Decoupled from data rep Single Memory • Facts • Semantics • Events and relational associations • Working space – temporal buffer Pattern recognition Reasoning
  • 47. Neural memories Theory of mind Neural reasoning A system view Deep learning 2.0 16/11/2020 48 Classic models Transformers Graph neural networks Unsupervised learning Deep learning 1.0 Summary
  • 48. End of part II 16/11/2020 49
  • 49. References Anonymous, “Neural spatio-temporal reasoning with object-centric self-supervised learning”, https://openreview.net/pdf?id=rEaz5uTcL6Q Bello, Irwan, et al. "Neural optimizer search with reinforcement learning." arXiv preprint arXiv:1709.07417 (2017). Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." IEEE transactions on pattern analysis and machine intelligence 35.8 (2013): 1798-1828. Bottou, Léon. "From machine learning to machine reasoning." Machine learning 94.2 (2014): 133-149. Dehghani, Mostafa, et al. "Universal Transformers." International Conference on Learning Representations. 2018. Kien Do, Truyen Tran, and Svetha Venkatesh. "Graph Transformation Policy Network for Chemical Reaction Prediction." KDD’19. Kien Do, Truyen Tran, Svetha Venkatesh, “Learning deep matrix representations”, arXiv preprint arXiv:1703.01454. Gilmer, Justin, et al. "Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017). Ha, David, Andrew Dai, and Quoc V. Le. "Hypernetworks." arXiv preprint arXiv:1609.09106 (2016). Heskes, Tom. "Stable fixed points of loopy belief propagation are local minima of the bethe free energy." Advances in neural information processing systems. 2003. Hudson, Drew A., and Christopher D. Manning. "Compositional attention networks for machine reasoning." arXiv preprint arXiv:1803.03067 (2018). Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2017). Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196. Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM (JACM) 44.5 (1997): 697-725. Hung Le, Truyen Tran, Svetha Venkatesh, “Self-attentive associative memory”, ICML'20. Hung Le, Truyen Tran, Svetha Venkatesh, “Neural stored-program memory”, ICLR'20. 16/11/2020 50
  • 50. Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Dynamic Language Binding in Relational Visual Reasoning”, IJCAI’20. Le-Khac, Phuc H., Graham Healy, and Alan F. Smeaton. "Contrastive Representation Learning: A Framework and Review." arXiv preprint arXiv:2010.05113 (2020). Liu, Xiao, et al. "Self-supervised learning: Generative or contrastive." arXiv preprint arXiv:2006.08218 (2020). Marcus, Gary. "Deep learning: A critical appraisal." arXiv preprint arXiv:1801.00631 (2018). Mao, Jiayuan, et al. "The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision." International Conference on Learning Representations. 2019. Nguyen, Dung, et al. "Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning." Asian Conference on Machine Learning. PMLR, 2020. Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X-ray structure of dopamine transporter elucidates antidepressant mechanism." Nature 503.7474 (2013): 85-90. Pham, Trang, et al. "Column Networks for Collective Classification." AAAI. 2017. Ramsauer, Hubert, et al. "Hopfield networks is all you need." arXiv preprint arXiv:2008.02217 (2020). Rabinowitz, Neil C., et al. "Machine theory of mind." arXiv preprint arXiv:1802.07740 (2018). Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015. Tay, Yi, et al. "Efficient transformers: A survey." arXiv preprint arXiv:2009.06732 (2020). Xie, Tian, and Jeffrey C. Grossman. "Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties." Physical review letters 120.14 (2018): 145301. You, Jiaxuan, et al. "GraphRNN: Generating realistic graphs with deep auto-regressive models." ICML (2018). 16/11/2020 51 References (cont.)