In this talk, we present how Watson, IBM's famous Jeopardy!-playing computer, works (based on papers published by IBM); we look at some aspects of potential scoring approaches; and we examine how Watson compares to several well-known cognitive systems, closing with preliminary thoughts on using it in future artificial intelligence and cognitive science research.
1. Tetherless World Constellation
Why Watson Won: A Cognitive Perspective
Jim Hendler
and Simon Ellis
Tetherless World Professor of Computer, Web and Cognitive Sciences
Director, Rensselaer Institute for Data Exploration and Applications
Rensselaer Polytechnic Institute (RPI)
http://www.cs.rpi.edu/~hendler
@jahendler (twitter)
5. Is Watson cognitive?
“The computer's techniques for unraveling Jeopardy! clues sounded
just like mine. That machine zeroes in on key words in a clue, then
combs its memory (in Watson's case, a 15-terabyte data bank of
human knowledge) for clusters of associations with those words. It
rigorously checks the top hits against all the contextual information it
can muster: the category name; the kind of answer being sought; the
time, place, and gender hinted at in the clue; and so on. And when it
feels ‘sure’ enough, it decides to buzz. This is all an instant, intuitive
process for a human Jeopardy! player, but I felt convinced that under
the hood my brain was doing more or less the same thing.”
— Ken Jennings
9. Question analysis
What is the question asking for?
Which terms in the question refer to the answer?
Given any natural language question, how can Watson
accurately discover this information?
Who is the president of Rensselaer Polytechnic Institute?
Question Analysis
Focus Terms: “Who”, “president of Rensselaer Polytechnic Institute”
Answer Types: Person, President
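To make the slide's example concrete, here is a toy sketch of question analysis. The regexes and the `WH_TYPES` table are illustrative assumptions, not Watson's actual rules, which rely on full NLP parsing.

```python
import re

# Toy mapping from interrogative words to expected answer types.
# Both this table and the regexes below are illustrative assumptions.
WH_TYPES = {"who": ["Person"], "where": ["Location"], "when": ["Date"]}

def analyze_question(question):
    """Extract a crude focus phrase and expected answer types."""
    q = question.strip().rstrip("?")
    match = re.match(r"(?i)(who|where|when|what)\s+(?:is|was|are|were)\s+(.+)", q)
    if not match:
        return {"focus": q, "types": []}
    wh, focus = match.group(1).lower(), match.group(2)
    types = list(WH_TYPES.get(wh, []))
    # A leading role noun ("the president of ...") suggests a finer type.
    role = re.match(r"(?:the\s+)?(\w+)\s+of\b", focus)
    if role:
        types.append(role.group(1).capitalize())
    return {"focus": focus, "types": types}

result = analyze_question("Who is the president of Rensselaer Polytechnic Institute?")
print(result["focus"])
print(result["types"])  # ['Person', 'President']
```

Even this crude version recovers the focus terms and answer types shown above; the real system does this with deep parses rather than patterns.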
10. Parsing and semantic analysis
What information about a previously unseen piece of
English text can Watson determine?
How is this information useful?
Natural Language Parsing
- grammatical structure
- parts of speech
- relationships between words
- ...etc.
Semantic Analysis
- meanings of words, phrases, etc.
- synonyms, entailment
- hypernyms, hyponyms
- ...etc.
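The two columns above can be illustrated with a toy sketch: a lookup-based part-of-speech tagger and a hypernym chain. The lexicon entries here are made-up assumptions; a real system would use a full parser and resources like WordNet.

```python
# Toy lexicon; real systems use full parsers and semantic resources.
# Every entry below is an illustrative assumption.
POS = {"the": "DET", "president": "NOUN", "of": "ADP",
       "leads": "VERB", "institute": "NOUN"}
HYPERNYMS = {"president": "leader", "leader": "person"}

def tag(tokens):
    """Look up a part of speech for each token (defaulting to NOUN)."""
    return [(t, POS.get(t.lower(), "NOUN")) for t in tokens]

def is_a(word, ancestor):
    """Follow the hypernym chain: is `word` a kind of `ancestor`?"""
    while word in HYPERNYMS:
        word = HYPERNYMS[word]
        if word == ancestor:
            return True
    return False

print(tag(["The", "president", "leads"]))
print(is_a("president", "person"))  # True: president -> leader -> person
```

The hypernym check is exactly the kind of information the semantic-analysis column refers to, just at toy scale.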
13. Primary Search
Primary Search is used to generate the corpus of
information from which to take candidate answers,
passages, supporting evidence, and essentially all textual
input to the system
It formulates queries based on the results of Question
Analysis
These queries are passed into a (cached) search engine
which returns a set number of highly relevant documents
and their ranks.
On the open Web this could be a regular search engine.
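A minimal sketch of Primary Search, assuming a tiny in-memory "index" in place of a real search engine; the query-formulation heuristics and the `INDEX` contents are illustrative assumptions, and `lru_cache` stands in for the caching mentioned above.

```python
# Sketch of Primary Search: turn Question Analysis output into queries
# and run them against a (stubbed, cached) search engine.
from functools import lru_cache

# Tiny stand-in for a document index; contents are assumptions.
INDEX = {
    "Shirley Ann Jackson": "Shirley Ann Jackson is the President of RPI",
    "RPI": "Rensselaer Polytechnic Institute is in Troy New York",
}

def formulate_queries(focus_terms):
    """Build simple queries: the full focus phrase plus each term alone."""
    return [" ".join(focus_terms)] + list(focus_terms)

@lru_cache(maxsize=None)          # "cached" search, as on the slide
def search(query, limit=2):
    """Return up to `limit` (title, rank) pairs sharing a term with the query."""
    terms = set(query.lower().split())
    hits = [(title, len(terms & set(text.lower().split())))
            for title, text in INDEX.items()]
    hits = [h for h in hits if h[1] > 0]
    hits.sort(key=lambda h: -h[1])
    return tuple(hits[:limit])

print(formulate_queries(["president", "RPI"]))
print(search("president of RPI"))
```

The returned (title, rank) pairs are what the downstream candidate-generation stage consumes.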
14. Candidate Generation
Candidate Generation generates a wide net of possible
answers for the question from each document.
Using each document, and the passages created by
Search Result Processing, we generate candidates using
three techniques:
Title of Document (T.O.D.): Adds the title of the document as a
candidate.
Wikipedia Title Candidate Generation: Adds any noun phrases
within the document's passage texts that are also the titles of
Wikipedia articles.
Anchor Text Candidate Generation: Adds candidates based on
the hyperlinks and metadata within the document.
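The three techniques above can be sketched over a toy document. The document schema (`title`, `passages`, `links`) and the `WIKI_TITLES` lookup set are assumed for illustration; the real system extracts noun phrases rather than doing substring checks.

```python
# Sketch of the three candidate-generation techniques on the slide.
doc = {
    "title": "Rensselaer Polytechnic Institute",
    "passages": ["Shirley Ann Jackson served as its president."],
    "links": ["Shirley Ann Jackson", "Troy, New York"],
}
WIKI_TITLES = {"Shirley Ann Jackson", "Troy, New York"}  # assumed lookup set

def generate_candidates(document):
    candidates = set()
    # 1. Title of Document (T.O.D.): the title itself is a candidate.
    candidates.add(document["title"])
    # 2. Wikipedia Title Candidate Generation: phrases in passage text
    #    that are also Wikipedia article titles (substring check here).
    for passage in document["passages"]:
        for title in WIKI_TITLES:
            if title in passage:
                candidates.add(title)
    # 3. Anchor Text Candidate Generation: hyperlink anchor texts.
    candidates.update(document["links"])
    return candidates

print(sorted(generate_candidates(doc)))
```

Note how deliberately wide the net is: wrong candidates like the document title are kept and left for the scorers to reject.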
17. Scoring
Analyzes how well a candidate answer relates to the
question
Two basic types of scoring algorithm
Context-independent scoring
Context-dependent scoring
18. Types of scorers
Context-independent
Question Analysis
Ontologies (DBpedia, YAGO, etc)
Type hierarchy reasoning
Context-dependent
Analyzes features of the natural language environment where
candidates were found
Relies on “passages” found during search
Many special-purpose ones were used in Jeopardy!
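The two scorer families can be illustrated with one toy scorer of each kind. The type table and the scoring formulas are illustrative assumptions, not Watson's actual resources or algorithms.

```python
# Minimal sketches of the two scorer families.
# The candidate-type table is an assumed stand-in for DBpedia/YAGO lookups.
CANDIDATE_TYPES = {"Shirley Ann Jackson": {"Person", "President"},
                   "Rensselaer Polytechnic Institute": {"Organization"}}

def type_match_score(candidate, wanted_types):
    """Context-independent: does the candidate's type fit the question?"""
    have = CANDIDATE_TYPES.get(candidate, set())
    return len(have & set(wanted_types)) / max(len(wanted_types), 1)

def passage_overlap_score(question, passage):
    """Context-dependent: how much question vocabulary the passage shares."""
    q, p = set(question.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q)

print(type_match_score("Shirley Ann Jackson", ["Person", "President"]))  # 1.0
print(passage_overlap_score("president of rpi",
                            "shirley ann jackson is the president of rpi"))
```

The first scorer never looks at where the candidate was found; the second looks only at the passage context. That is the whole distinction on this slide.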
19. Scorers
Passage Term Match
Textual Alignment
Skip-Bigram
Each of these scores supporting evidence
These scores are then merged to produce a single candidate
score
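The merge step can be sketched as a weighted sum. In Watson the weights are tuned by a learning algorithm; the weights and evidence values below are made-up numbers for illustration.

```python
# Sketch of score merging: per-scorer evidence combined with weights
# that, in Watson, machine learning tunes; these weights are assumptions.
WEIGHTS = {"passage_term_match": 0.5, "textual_alignment": 0.3,
           "skip_bigram": 0.2}

def merge_scores(scorer_outputs):
    """Weighted sum of per-scorer evidence into one candidate score."""
    return sum(WEIGHTS[name] * score for name, score in scorer_outputs.items())

evidence = {"passage_term_match": 0.8, "textual_alignment": 0.6,
            "skip_bigram": 1.0}
print(round(merge_scores(evidence), 2))  # 0.78
```

Candidates are then ranked by this single merged score, which is how dozens of heterogeneous scorers end up comparable.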
20. Example: Textual Alignment
Finds an optimal alignment of a question and a passage
Assigns “partial credit” for close matches
“Who is the President of RPI?”
Who
President of RPI.
Shirley Ann Jackson is the President of RPI.
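A very small sketch of the idea, using the slide's example. The "partial credit" rule here (shared four-letter prefix as crude stemming) is an assumption standing in for Watson's much richer alignment algorithm.

```python
# Toy alignment with "partial credit": exact term matches score 1,
# shared-prefix (crude stemming) matches score 0.5.
def term_credit(q_term, p_term):
    q_term, p_term = q_term.lower(), p_term.lower()
    if q_term == p_term:
        return 1.0
    if len(q_term) >= 4 and q_term[:4] == p_term[:4]:
        return 0.5
    return 0.0

def alignment_score(question, passage):
    """Best credit per question term, normalized by question length."""
    q_terms = question.rstrip("?.").lower().split()
    p_terms = passage.rstrip("?.").lower().split()
    total = sum(max((term_credit(q, p) for p in p_terms), default=0.0)
                for q in q_terms)
    return total / len(q_terms)

score = alignment_score("Who is the President of RPI?",
                        "Shirley Ann Jackson is the President of RPI.")
print(round(score, 2))
```

Here "is the President of RPI" aligns fully while "Who" finds no counterpart, so the passage gets a high but imperfect score, exactly the graded judgment alignment is meant to provide.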
21. Skip-Bigram
Constructs a graph
Nodes represent terms (syntactic objects)
Edges represent relations
Extracts skip-bigrams
A skip-bigram is a pair of nodes that are either directly connected or
have only one intermediate node between them
Skip-bigrams represent close relationships between terms
Scores based on number of common skip-bigrams
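The steps above can be sketched over toy term graphs. The adjacency lists stand in for real dependency parses, which Watson would build with its parser; everything else follows the slide directly.

```python
# Sketch of skip-bigram scoring over a term graph.
def skip_bigrams(graph):
    """Pairs of nodes directly connected or one hop apart."""
    pairs = set()
    for a, neighbors in graph.items():
        for b in neighbors:
            pairs.add(frozenset((a, b)))          # direct edge
            for c in graph.get(b, []):
                if c != a:
                    pairs.add(frozenset((a, c)))  # one intermediate node
    return pairs

def skip_bigram_score(question_graph, passage_graph):
    """Fraction of the question's skip-bigrams found in the passage."""
    q, p = skip_bigrams(question_graph), skip_bigrams(passage_graph)
    return len(q & p) / max(len(q), 1)

# Toy "parses" (adjacency lists are assumptions, not real parser output).
q_graph = {"president": ["of"], "of": ["rpi"]}
p_graph = {"jackson": ["president"], "president": ["of"], "of": ["rpi"]}
print(skip_bigram_score(q_graph, p_graph))
```

Because skip-bigrams tolerate one intervening node, the passage still scores well even though it inserts "jackson" ahead of the question's structure.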
23. Watson Summary
• Watson works by
– Analyzing the question
• natural language parsing
• text extraction
– Generating a large number of
candidates
• mostly search heuristics
– Scoring each
• through multiple scorers
• with weights adjusted by learning algorithm
– Returning top candidate
24. MiniDeepQA (Not Watson!)
RPI students implementing a DeepQA pipeline to explore
the principles underlying this kind of Q/A system
(THIS IS NOT WATSON!)
Pipeline development
Data caching
Graphical and command line interfaces
Parsing
Scoring
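The summary's feed-forward pipeline can be sketched as straight function composition, which is roughly how a MiniDeepQA-style system is organized. Every stage body below is a trivial stub; only the stage names and their order come from the talk.

```python
# Sketch of a feed-forward DeepQA-style pipeline as function composition.
# Stage internals are toy stubs; the stage sequence follows the talk.
def question_analysis(question):
    return {"question": question, "focus": question.rstrip("?")}

def primary_search(state):
    # Stub: pretend the last focus word names the best document.
    state["documents"] = ["doc-about-" + state["focus"].split()[-1]]
    return state

def candidate_generation(state):
    state["candidates"] = [d.replace("doc-about-", "")
                           for d in state["documents"]]
    return state

def scoring(state):
    # Stub scorer: longer candidates score higher.
    state["scored"] = sorted(((len(c), c) for c in state["candidates"]),
                             reverse=True)
    return state

def answer(state):
    return state["scored"][0][1] if state["scored"] else None

PIPELINE = [question_analysis, primary_search, candidate_generation, scoring]

def run(question):
    state = question
    for stage in PIPELINE:
        state = stage(state)
    return answer(state)

print(run("Who leads RPI?"))
```

The point is architectural: each stage consumes the previous stage's output and nothing flows backward, which is the feed-forward property discussed later in the talk.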
30. Scoring
One of DeepQA's main strengths is aggregating a
number of different scoring algorithms capable of
running in parallel.
RPI scorers are primitive compared to IBM's, but
allow us to explore the principles
allow us to explore different algorithms for computing
scores
allow us to create new ones not tried by IBM
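The parallel-scoring idea can be sketched with `concurrent.futures`. The two scorers here are trivial stand-ins invented for the example; only the pattern (run every scorer concurrently, then combine) reflects the design described above.

```python
# Sketch of running several scorers in parallel and combining them.
from concurrent.futures import ThreadPoolExecutor

def length_scorer(candidate, passage):
    # Toy scorer: favors candidates that cover more of the passage.
    return min(len(candidate) / len(passage), 1.0)

def overlap_scorer(candidate, passage):
    # Toy scorer: fraction of candidate words found in the passage.
    c, p = set(candidate.lower().split()), set(passage.lower().split())
    return len(c & p) / len(c)

SCORERS = [length_scorer, overlap_scorer]

def score_candidate(candidate, passage):
    """Run every scorer concurrently and average the results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(s, candidate, passage) for s in SCORERS]
        scores = [f.result() for f in futures]
    return sum(scores) / len(scores)

print(score_candidate("Shirley Ann Jackson",
                      "Shirley Ann Jackson is the President of RPI"))
```

Because the scorers are independent, adding a new one is just another entry in `SCORERS`, which is exactly what makes the framework a good vehicle for experimenting with new algorithms.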
31. Scoring Principles: combine evidence
He was the Prime Minister of Canada in 1993.
candidates could include Trudeau, Harper, Campbell, Chretien,
Mulroney…
Try (Re-search):
Trudeau was Prime Minister of Canada in 1993 (doesn't match)
Campbell was Prime Minister of Canada in 1993 (MATCH)
Chretien was Prime Minister of Canada in 1993 (MATCH)
Scoring: Re-search & type match
Trudeau: Re-search NO; Type: Yes
Campbell: Re-search YES; Type: No
Chretien: Re-search YES; Type: Yes
WHO WAS CHRETIEN?
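The slide's combination of evidence can be sketched directly. Both evidence tables below hard-code the results shown above for this one example; in a real system they would come from retrieval and a type ontology, and the weights would be learned rather than assumed.

```python
# Sketch of combining evidence: a "re-search" check (does "X was Prime
# Minister of Canada in 1993" find textual support?) and a type check.
# Both tables encode the slide's example results, not real retrieval.
RESEARCH_SUPPORT = {"Trudeau": False, "Campbell": True, "Chretien": True}
TYPE_MATCH = {"Trudeau": True, "Campbell": False, "Chretien": True}

def combined_score(candidate, w_research=0.6, w_type=0.4):
    """Weighted combination of the two evidence sources (weights assumed)."""
    return (w_research * RESEARCH_SUPPORT.get(candidate, False)
            + w_type * TYPE_MATCH.get(candidate, False))

ranked = sorted(RESEARCH_SUPPORT, key=combined_score, reverse=True)
print(ranked[0])  # Chretien: the only candidate both sources support
```

No single evidence source picks Chretien; only the combination does, which is the principle the slide is making.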
32. New Scoring types
We can explore how new kinds of information can be
added to the Watson scoring pipeline
Example: new NLP extraction techniques
Example: Specialized Web Sources
Adding an ML-based extractor built by Heng Ji
Database advisor project
Example: More complex inferencing
Jeopardy! questions are unambiguous; real-world questions aren't:
• Where is Montreal?
• Who is Jim Hendler?
Example: Special purpose reasoning…
33. Special purpose reasoning
• Can we match (or steer) large-scale simulations to help answer NL questions?
- e.g. answer questions such as “Why” and “How”, integrated with large-scale simulations
34. Alternate Universe Reasoning
(Contexts)
How can a Watson reasoner appropriately use Q/A
contexts?
Where was Yoda born?
Very little is known about Yoda's early life. He was from a remote planet, but which one remains a mystery.
Where was Yoda made?
The Yoda puppet was originally designed and built by Stuart Freeborn for LucasFilm and Industrial Light & Magic.
Where did Yoda live?
Jedi Master Yoda went into voluntary exile on Dagobah
Where did Yoda live in the Phantom Menace?
35. But back to the original question
• Q: How does Watson fare as a
cognitive model?
• A: Poorly
– no conversational ability
– no concept of self
– no deeper reasoning
…
• Q: How does Watson fare as a model
of question answering?
36. Watson and Q/A
• Watson’s feed-forward pipeline has
the following properties
– lots of candidates generated
• the more the better
– “ad hoc” filtering pipelines
• domain-independent scorers usually score lower
than domain-dependent ones
– no “counter-reasoning” between
answers
• separately scored, only comparison is
numbers
37. Production rules, modules, etc
Production-rule-style architectures, cf. ACT-R (Anderson 1974; …2012)
- modularization, but not Watson style
- parallelization, but in rule productions (procedural memory)
- declarative memory is fact based
Watson is not well correlated, except for using search for declarative
memory
38. Network based
Network based architectures (cf. spreading activation (Collins 75),
marker-passing (Hendler 86) … Microsaint 2006)
- positive activations
- inhibitory nodes (or other negative enforcers)
Watson has no negative inhibition, but does use network-based scorers
39. MAC/FAC
MAC/FAC (Gentner & Forbus, 1991)
“Many are called, but few are chosen” model of analogical reasoning
Strong correspondence in performance, not in mechanism
New work by Forbus (SME) uses a more feed-forward mechanism
(Discussions in progress)
40. Cognitive Architecture? Watson as “component”
Decision Making
Memory
Reasoning
Watson, Cogito, and Clarion
41. Summary
• Watson won by a combination of
– natural language processing
– search technologies
– semantic typing (minimal reasoning)
– scoring heuristics
– machine learning (scorer tuning)
• Watson Q/A has some interesting analogies to
cognitive architectures of the past
– but mainly at a “level of abstraction”
• Watson as a memory component in a more
complex cognitive system is a very intriguing
possibility