The dominant paradigm for retrieving information today is search and fetch (e.g. Google). However, reasoning (i.e. manipulating knowledge in response to a question) is starting to come of age.
I’m going to cover a few recently published neural-network approaches to machine reasoning as well as related background:
- Example problems
- Knowledge graphs
- Iterative reasoning networks (specifically, MACnets)
About Octavian:
https://www.octavian.ai/about
https://medium.com/octavian-ai/our-mission-eeb434d8cb91
3. Today I’ll cover:
1. What is machine reasoning?
2. Knowledge
3. Neural reasoning approaches
4. Iterative reasoning with MACnets
4. Goals for this session
• Introduce you to interesting ideas
• Get you excited about neural reasoning!
• We will not cover the full technical details, but I’ve included links to all of the material
6. What is machine reasoning?
• A system that answers questions about knowledge using deduction or induction
• That is, doing something more complex than search-and-retrieve
• E.g. “Who is most likely to win the World Cup?”
• E.g. “Which bus line visits the most pubs?”
• Can be single-shot (e.g. a Google search) or interactive (e.g. a chatbot)
7. What is machine reasoning?
• Many systems do reasoning on a limited set of questions (e.g. Google Maps answers routing questions)
• We’re most interested in general(izable) systems: how can we answer a broad range of questions?
8. Knowledge
• Reasoning requires knowledge
• Many ways to represent knowledge:
• Sequences (e.g. language strings)
• Images
• Vectors
• Graphs
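As a minimal sketch of what those representations look like in practice (the fact and all names here are illustrative, not from the talk), the same piece of knowledge can be stored in each form:

# The same fact, "London Bridge connects to Bank", in several representations

# As a sequence (a language string):
as_sequence = "London Bridge connects to Bank"

# As a vector (e.g. an embedding produced by some trained encoder):
as_vector = [0.12, -0.80, 0.33]  # placeholder values

# As a graph (a node list plus an edge list):
nodes = ["London Bridge", "Bank"]
edges = [("London Bridge", "connects_to", "Bank")]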
10. Knowledge graphs
• Can represent a diverse range of information
• Can be continually extended
• Google’s Knowledge Graph has over 1bn entities and helps answer 30bn monthly searches
• Wikidata contains around 50 million entities and is freely available
12. A brief survey of neural reasoning approaches
• Recurrent cells (LSTM/GRU)
• RNN translation: question → answer, or question → database query
• Neural Turing Machines
• MACnets
• Interactive question answering (e.g. via reinforcement learning)
• MacGraph
Note: these are just selected highlights; there are many, many variations of these ideas in the literature
18. The challenge: answer questions about images
• The CLEVR dataset
• Synthetic
• Question, Answer, Image triples
• Each question comes as English and as a functional program
Image source: Compositional Attention Networks for Machine Reasoning
19. Memory, Attention, and Composition network (MACnet)
• Introduced by Drew Hudson and Christopher Manning at ICLR, April 2018
• Answers questions on the CLEVR dataset to 99% accuracy (humans get 93%)
Image source: Compositional Attention Networks for Machine Reasoning
20. Key idea: use RNN iteration as an instruction cycle (from Neural Turing Machines)
(Diagram: the input passes through a repeated recurrent cell to produce the answer)
21. Key idea: attention over image and text gives interpretability
Image source: Compositional Attention Networks for Machine Reasoning
22. Key idea: use the question words as the instructions
(Diagram: attention over the question words transforms the control state into the next control state)
Image source: Compositional Attention Networks for Machine Reasoning
Can we achieve recursion/algorithms through self-talk?
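As a rough sketch of the control idea above (my own simplification, not the authors’ implementation): the previous control state scores each question word, and the attention-weighted sum of the words becomes the next control state.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def next_control(control, question_words):
    # control: [d] previous control state; question_words: [n, d] biLSTM outputs
    scores = question_words @ control   # compare the control state to each word
    weights = softmax(scores)           # normalise the scores into attention
    return weights @ question_words     # weighted sum becomes the next control

# Toy usage: 5 question words, 8-dimensional states
words = np.random.randn(5, 8)
c2 = next_control(np.random.randn(8), words)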
23. Key idea: have separate control and memory states
(Diagram: a control sequence c1 c2 c3 c4 and a memory sequence m1 m2 m3 m4 evolve in parallel over time)
24. Key idea: preprocess image and text through existing architectures
• The image is passed through ResNet101
• The text is passed through a biLSTM, giving the “question words” (per-word outputs) and the “question” (final states)
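A hedged sketch of that preprocessing in Keras (shapes and layer sizes are illustrative assumptions, not the paper’s exact configuration):

import tensorflow as tf

# Image path: a frozen ResNet gives a grid of spatial features
resnet = tf.keras.applications.ResNet101(include_top=False, weights="imagenet")
resnet.trainable = False
image = tf.random.uniform([1, 224, 224, 3])   # placeholder image batch
image_features = resnet(image)                # [1, 7, 7, 2048] feature grid

# Text path: a biLSTM gives per-word outputs ("question words")
# and final states that are combined into the "question" vector
embedded = tf.random.uniform([1, 12, 300])    # placeholder embedded question
bilstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(256, return_sequences=True, return_state=True))
question_words, fwd_h, fwd_c, bwd_h, bwd_c = bilstm(embedded)
question = tf.concat([fwd_h, bwd_h], axis=-1)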
25. MAC network performs iterative reasoning
(Diagram: each reasoning step applies attention over the question words and attention over the image)
Image source: Compositional Attention Networks for Machine Reasoning
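Putting the pieces together, one reasoning step looks roughly like this (a heavily simplified sketch of the paper’s control/read/write units, not the authors’ code):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, table):
    return softmax(table @ query) @ table   # score, normalise, weighted sum

def mac_step(control, memory, question_words, image_cells):
    control = attend(control, question_words)          # what to do this step
    retrieved = attend(control * memory, image_cells)  # control-guided image read
    return control, memory + retrieved                 # integrate into memory

# Toy usage: 4 reasoning steps over 5 question words and 16 image cells
d = 8
words, cells = np.random.randn(5, d), np.random.randn(16, d)
control, memory = np.random.randn(d), np.ones(d)
for _ in range(4):
    control, memory = mac_step(control, memory, words, cells)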
32. CLEVR-Graph: answering questions about mass transit graphs
• Synthetic dataset
• Question, Answer, Graph triples
• Each question comes as English, as a functional program and as Cypher
34. Question to Cypher query translation
“How clean is Spoon Street?”
MATCH (var1)
WHERE var1.name = "Spoon Street"
WITH 1 AS foo, var1.cleanliness AS var2
RETURN var2
Result: DIRTY
38. In reality the output elements often derive from specific input elements
Image source: Distill
39. This input–output mapping is hard work for the RNN, since everything is encoded together
Image source: TensorFlow tutorials
40. … therefore use attention
Image source: Distill
FREQUENTLY USED TECHNIQUE
41. … therefore use attention
Softmax becomes the attention mechanism:
• Normalised sum of exponentials
• Result sums to 1.0
• “Increases contrast”
FREQUENTLY USED TECHNIQUE
Image source: Distill
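For reference, softmax in code (the standard definition, written in numpy):

import numpy as np

def softmax(scores):
    exps = np.exp(scores - scores.max())   # subtract max for numerical stability
    return exps / exps.sum()               # normalised: the result sums to 1.0

softmax(np.array([1.0, 2.0, 4.0]))   # -> array([0.042, 0.114, 0.844])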
42. … therefore use attention
Image source: TensorFlow tutorials
43. Seq2seq results
• 100% translation accuracy on (reasonably simple) CLEVR-Graph question–Cypher pairs
• Google: “Human evaluations show that [Seq2Seq] has reduced translation errors by 60% compared to our previous phrase-based system”
50. Attention
1. Compare query to each element in array giving scores
2. Apply softmax to normalise and focus scores
3. Multiply each element by its score
4. Sum all the elements
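Those four steps translate almost line-for-line into numpy (a minimal sketch; the shapes are my own choice):

import numpy as np

def attention(query, elements):
    # query: [d], elements: [n, d] -> a single [d] read vector
    scores = elements @ query                # 1. compare query to each element
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # 2. softmax: normalise and focus
    weighted = elements * weights[:, None]   # 3. multiply each element by its score
    return weighted.sum(axis=0)              # 4. sum all the elements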
51. Neural graph memory
• Store a table of nodes and a table of edges
• Use attention (aka content addressing) to retrieve data
Nodes table: rows of (node_id, node_props)
Edges table: rows of (from_node, edge_props, to_node)
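A minimal sketch of such a read, reusing the attention function from the previous slide (row contents are packed into vectors; all names here are illustrative):

import numpy as np

def attention(query, table):
    scores = table @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ table

# Each row packs an id and its properties (nodes), or from/props/to (edges)
d = 8
node_table = np.random.randn(10, d)   # rows of (node_id, node_props)
edge_table = np.random.randn(20, d)   # rows of (from_node, edge_props, to_node)

# Content addressing: a query vector from the RNN cell retrieves a blend
# of the rows most similar to it
query = np.random.randn(d)
node_read = attention(query, node_table)
edge_read = attention(query, edge_table)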
52. Let the RNN cell read from a memory
(Diagram: the four attention steps applied as a memory read inside the cell)
Image source: Distill
53. What is a neural network?
• A neural network transforms signals through trainable layers
54. What is a neural network?
• Trained via backpropagation of errors and gradient descent
(Diagram: the error signal propagates backwards through the layers)
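In a framework like Keras those two slides amount to a few lines (a generic example, not tied to any model from the talk):

import tensorflow as tf

# Trainable layers transform the input signal...
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])

# ...and fit() runs backpropagation of errors plus gradient descent
model.compile(optimizer="sgd", loss="mse")
x = tf.random.uniform([64, 4])
y = tf.random.uniform([64, 1])
model.fit(x, y, epochs=3, verbose=0)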
55. LSTM cell
(Diagram: the long-term state passes straight through the cell; both a short-term state and a long-term state are carried from step to step)
“If you consider the LSTM cell as a black box, it can be used very much like a basic cell, except it will perform much better; training will converge faster and it will detect long-term dependencies in the data.” -- Safari Books
https://www.safaribooksonline.com/library/view/neural-networks-and/9781492037354/ch04.html
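In code the black-box view is literal: an LSTM layer is a drop-in replacement for a basic recurrent cell (a generic Keras example):

import tensorflow as tf

inputs = tf.random.uniform([2, 10, 4])              # batch of 2 sequences, 10 steps each

# A basic recurrent cell...
basic_out = tf.keras.layers.SimpleRNN(16)(inputs)   # -> [2, 16]

# ...and an LSTM used in exactly the same way; internally it carries both
# a short-term state (h) and a long-term state (c) between steps
lstm_out = tf.keras.layers.LSTM(16)(inputs)         # -> [2, 16]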