CS626: NLP, Speech and the Web
Pushpak Bhattacharyya
CSE Dept., IIT Bombay
Lectures 16, 18: Probabilistic parsing; parsing phenomena
26th August, 2014
(Lecture 17 was on MDL and morphology, by Vasu)
Example of sentence labeling: Parsing

[S1 [S [S [VP [VB Come] [NP [NNP July]]]]
     [, ,]
     [CC and]
     [S [NP [DT the] [JJ IIT] [NN campus]]
        [VP [AUX is]
            [ADJP [JJ abuzz]
                  [PP [IN with]
                      [NP [ADJP [JJ new] [CC and] [VBG returning]]
                          [NNS students]]]]]]
     [. .]]
Noisy Channel Modeling

Source: sentence; Target: parse

T* = argmax_T P(T|S)
   = argmax_T P(T) . P(S|T)
   = argmax_T P(T), since given the parse the sentence is completely determined, so P(S|T) = 1
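The argmax reduction can be sketched directly: with P(S|T) = 1, picking the best parse is just picking the parse with the highest prior P(T). The candidate parses and their probabilities below are hypothetical, purely for illustration:

```python
# Noisy-channel selection: T* = argmax_T P(T) . P(S|T), and with
# P(S|T) = 1 for every parse that yields S, this is argmax_T P(T).
# The candidate parses and their P(T) values are hypothetical.
candidates = {"t1": 0.0045, "t2": 0.0015}

def best_parse(parses):
    """Return the parse with the highest prior probability P(T)."""
    return max(parses, key=parses.get)

print(best_parse(candidates))  # → t1
```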
Language Models

N-grams: sequences of n consecutive words/characters

Probabilistic / Stochastic Context Free Grammars:
Simple probabilistic models capable of handling recursion
A CFG with probabilities attached to rules
Rule probabilities: how likely is it that a particular rewrite rule is used?
PCFGs
Why PCFGs?
Intuitive probabilistic models for tree-structured languages
Algorithms are extensions of HMM algorithms
Better than the n-gram model for language modeling
Data-driven parsing is tied to what is seen in the training data.
"Stand right, walk left." — often a message on airport escalators.
"Walk left" can come from: "After climbing, we realized the walk left was still considerable."
"Stand right" can come from: "The stand right though it seemed at that time, does not stand the scrutiny now."
Then the given sentence "Stand right, walk left" will never be parsed correctly.
Chomsky Hierarchy

Properties of the hierarchy:
The containment is proper: Type 3 (regular grammar) ⊂ Type 2 (CFG) ⊂ Type 1 (CSG) ⊂ Type 0.
Grammar is not learnable, even for the lowest type, from positive examples alone.
Laws of machine learning

"Learning from vacuum (zero knowledge) is impossible."
"Inductive bias decides what to learn."
In the case of a POS tagger, the bias is the tagset we assume.
Some language phenomena

PP-attachment
Happens for verb phrases.
Phenomenon, basic form: V - N1 - P - N2
Examples:
Saw boys with bat_and_ball
Saw mountains with telescope

Trees for V-N1-P-N2:
Noun attachment: [VP [V] [NP [N1] [PP [P] [N2]]]]
Verb attachment: [VP [V] [NP [N1]] [PP [P] [N2]]]
Wh-pronoun dropping

Category A:
I know the boy whom you met yesterday.
I know the boy you met yesterday.

Category B:
I know the boy who met you yesterday.
*I know the boy met you yesterday. (disregard the possibility of "...know <that>...")

Rule: the wh-pronoun can be dropped if it refers to the object of the sentence corresponding to the subordinate clause.
Wh-pronoun (2)

Complex sentences
    Main clause        Subordinate clause
A)  "I know the boy"   "You met the boy yesterday"
B)  "I know the boy"   "The boy met you yesterday"
Wh-pronoun (3)

Category B
The man who lives in Delhi saw the picture.
*The man lives in Delhi saw the picture.

Category A
1. The man whom you met yesterday saw the picture.
   The man you met yesterday saw the picture.
2. The man whom you argued with saw the picture.
   The man you argued with saw the picture.
Parsing example: constituency, dependency, deep semantics
How to parse?
Example
“Deeds people do, follow them to the end.”
Basic Sentence Structure

Sentence = Subject + Predicate
Predicate = Verb + Complement ("Object" in some cases)
Constituency Parsing

Sentences are said to be composed of constituents (or phrases).
Constituency parse tree for the core of our example, "Deeds follow them.":

[S [NP [NNS Deeds]] [VP [V follow] [NP [PRON them]]]]
Dependency Parsing

Captures dependencies between entities
A step closer to deep semantics
Dependency parse tree for the core of our example, "Deeds follow them":

root → follow (V_main)
follow → Deeds  [nsubj]
follow → them   [dobj (direct object)]
Deep Semantics

WSD resolved
Attachment resolved
Co-reference resolved

follow (synonym: pursue)
follow → deeds  [agent]
follow → them   [object]
Complex Sentence

A sentence is said to be complex if it has more than one verb.

"Deeds people do, follow them" — Constituency

[S [NP [NNS Deeds] [S' [NP [NNS people]] [VP [V do]]]]
   [VP [V follow] [NP [PRON them]]]]
"Deeds people do, follow them" — Dependency

root → follow (v-main)
follow → Deeds  [nsubj]
follow → them   [dobj]
Deeds → do      [wh-clause]
do → people     [nsubj]
"Deeds people do, follow them" — Deep Semantics

follow:01 (synonym: pursue)
  agent:  deeds:01
  object: them
do:01
  agent:  people
  object: deeds:01
"Deeds people do, follow them to the end" — Constituency

[S [NP [NNS Deeds] [S' [NP [NNS people]] [VP [V do]]]]
   [VP [V follow] [NP [PRON them]] [PP [P to] [NP [DT the] [NN end]]]]]
"Deeds people do, follow them to the end" — Dependency

root → follow (v-main)
follow → Deeds  [nsubj]
follow → them   [dobj]
follow → to
to → end        [pobj]
end → the       [det]
Deeds → do      [wh-clause]
do → people     [nsubj]
"Deeds people do, follow them to the end" — Deep Semantics

follow:01 (synonym: pursue)
  agent:  deeds:01 (Entry-point)
  object: them
  goal:   end (@definite)
do:01 (Entry-point)
  agent:  people
  object: deeds:01
Probability of a sentence

P(S) = P(W0, W1, W2, W3, ..., Wi, ..., Wn)

•Frequentist approach
P(S) = #S / #sentences in the corpora

•N-gram based approach
P(S) = P(W0, W1, ..., Wn)
     = P(W0) . P(W1|W0) . P(W2|W0 W1) . P(W3|W0 W1 W2) ... P(Wn|W0 ... Wn-1)
     ≈ ∏_{i=1}^{n} P(Wi | Wi-1)   (bigram assumption)
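Under the bigram assumption, P(S) is a product of conditional probabilities estimated from counts. A minimal MLE sketch; the two-sentence corpus and the sentence markers <s>, </s> are hypothetical, not from the lecture:

```python
# Sketch of the bigram decomposition P(S) = prod_i P(W_i | W_{i-1}),
# with MLE estimates P(w2|w1) = count(w1 w2) / count(w1). No smoothing.
from collections import Counter

corpus = [["<s>", "the", "gunman", "sprayed", "</s>"],
          ["<s>", "the", "building", "</s>"]]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        unigrams[w1] += 1
        bigrams[(w1, w2)] += 1

def bigram_prob(sentence):
    """P(sentence) under the MLE bigram model."""
    p = 1.0
    for w1, w2 in zip(sentence, sentence[1:]):
        p *= bigrams[(w1, w2)] / unigrams[w1]
    return p

print(bigram_prob(["<s>", "the", "gunman", "sprayed", "</s>"]))  # → 0.5
```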
Probability of sentence (cont.)

• POS tag based
W0–t0, W1–t1, W2–t2, ..., Wn–tn
P(S) = Σ_{T : tag sequence} P(W, T)
P(W, T) = P(T) . P(W|T) = ∏_{i=0}^{n} P(t_i | t_{i-1}) . P(w_i | t_i)

• Parse tree based
P(W) = Σ_{Tr : all parse trees} P(W, Tr)
P(W, Tr) = P(Tr) . P(W|Tr) = P(Tr)
Uses of P(W)
1. Quality assurance of translation output
2. Natural language generation (generating sentences from non-textual inputs like pie charts)
3. Automatic summary generation
4. Speech-to-text and text-to-speech: next-word prediction
Back to probabilistic parsing
Several Important Points
Phrases capture hierarchy
Span of phrases?
What pairs form dependency?
What is the label on the dependency arc?
Which disambiguation?
Words
Attachments
Co-reference
All these are amenable to machine learning
Formal Definition of PCFG

A PCFG consists of:
A set of terminals {w_k}, k = 1, ..., V
  {w_k} = {child, teddy, bear, played, ...}
A set of non-terminals {N^i}, i = 1, ..., n
  {N^i} = {NP, VP, DT, ...}
A designated start symbol N^1
A set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals and non-terminals
  E.g., NP → DT NN
A corresponding set of rule probabilities

Rule Probabilities
Rule probabilities are such that, for all i: Σ_j P(N^i → ζ^j) = 1
E.g., P(NP → DT NN) = 0.2, P(NP → NN) = 0.5, P(NP → NP PP) = 0.3
P(NP → DT NN) = 0.2 means 20% of the parses in the training data use the rule NP → DT NN.
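The normalization condition can be checked mechanically. A small sketch using the slide's NP probabilities; representing rules as right-hand-side tuples is an implementation choice, not part of the formalism:

```python
# Check the PCFG well-formedness condition: for every non-terminal N_i,
# the probabilities of its rewrite rules must sum to 1.
from math import isclose

rules = {
    # RHS tuple -> probability; values from the slide's NP example.
    "NP": {("DT", "NN"): 0.2, ("NN",): 0.5, ("NP", "PP"): 0.3},
}

def is_well_formed(grammar):
    """True iff every non-terminal's rule probabilities sum to 1."""
    return all(isclose(sum(probs.values()), 1.0) for probs in grammar.values())

print(is_well_formed(rules))  # → True
```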
Probabilistic Context Free Grammars

S → NP VP      1.0
NP → DT NN     0.5
NP → NNS       0.3
NP → NP PP     0.2
PP → P NP      1.0
VP → VP PP     0.6
VP → VBD NP    0.4
DT → the       1.0
NN → gunman    0.5
NN → building  0.5
VBD → sprayed  1.0
NNS → bullets  1.0
P → with       1.0
Example Parse t1

The gunman sprayed the building with bullets.

[S [NP [DT The] [NN gunman]]
   [VP [VP [VBD sprayed] [NP [DT the] [NN building]]]
       [PP [P with] [NP [NNS bullets]]]]]

P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045
Another Parse t2

The gunman sprayed the building with bullets.

[S [NP [DT The] [NN gunman]]
   [VP [VBD sprayed]
       [NP [NP [DT the] [NN building]]
           [PP [P with] [NP [NNS bullets]]]]]]

P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
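P(t) for either parse is simply the product of the probabilities of all rules used in the tree. A sketch for t2; encoding each rule as an (LHS, RHS-tuple) pair is an implementation choice:

```python
# P(t) = product of rule probabilities over all rules used in the tree.
from math import prod

RULE_P = {  # the slide's grammar
    ("S", ("NP", "VP")): 1.0, ("NP", ("DT", "NN")): 0.5,
    ("NP", ("NNS",)): 0.3, ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0, ("VP", ("VP", "PP")): 0.6,
    ("VP", ("VBD", "NP")): 0.4, ("DT", ("the",)): 1.0,
    ("NN", ("gunman",)): 0.5, ("NN", ("building",)): 0.5,
    ("VBD", ("sprayed",)): 1.0, ("NNS", ("bullets",)): 1.0,
    ("P", ("with",)): 1.0,
}

def tree_prob(rules_used):
    """Multiply the probabilities of the rules used in a parse tree."""
    return prod(RULE_P[r] for r in rules_used)

# Rules of t2, read off the tree top-down, left to right.
t2 = [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("DT", ("the",)),
      ("NN", ("gunman",)), ("VP", ("VBD", "NP")), ("VBD", ("sprayed",)),
      ("NP", ("NP", "PP")), ("NP", ("DT", "NN")), ("DT", ("the",)),
      ("NN", ("building",)), ("PP", ("P", "NP")), ("P", ("with",)),
      ("NP", ("NNS",)), ("NNS", ("bullets",))]
print(tree_prob(t2))  # ≈ 0.0015
```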
Probability of a sentence

Notation:
w_ab — the subsequence w_a ... w_b
N^j dominates w_a ... w_b, or yield(N^j) = w_a ... w_b
  (e.g., NP dominates "the sweet teddy bear")

P(w_1m) = Σ_t P(w_1m, t)
        = Σ_t P(t) . P(w_1m | t)
        = Σ_{t : yield(t) = w_1m} P(t)

where t is a parse tree of the sentence, since P(w_1m | t) = 1 if t is a parse tree for the sentence w_1m.
Assumptions of the PCFG model

Consider the tree [S [NP(1) The child] [VP ... [NP(2) the toy]]], with NP locations marked 1 and 2.

Place invariance:
P(NP → DT NN) is the same in locations 1 and 2

Context-freeness:
P(NP → DT NN | anything outside "The child") = P(NP → DT NN)

Ancestor-freeness: at 2,
P(NP → DT NN | its ancestor is VP) = P(NP → DT NN)
Probability of a parse tree

Domination: we say N^j dominates the span from k to l, written N^j ⇒* w_kl, if w_k ... w_l is derived from N^j.

P(tree | sentence) = P(tree | S_1,l), where S_1,l means that the start symbol S dominates the word sequence w_1,l.

P(t | s) approximately equals the joint probability of constituent non-terminals dominating the sentence fragments (next slide).
Probability of a parse tree (cont.)

Consider the tree:
[S_1,l [NP_1,2 [DT_1,1 w_1,1] [N_2,2 w_2,2]]
       [VP_3,l [V_3,3 w_3,3]
               [PP_4,l [P_4,4 w_4,4] [NP_5,l w_5...l]]]]

P(t | s) = P(t | S_1,l)
= P(NP_1,2, DT_1,1, w_1,1, N_2,2, w_2,2, VP_3,l, V_3,3, w_3,3, PP_4,l, P_4,4, w_4,4, NP_5,l, w_5...l | S_1,l)
= P(NP_1,2, VP_3,l | S_1,l) * P(DT_1,1, N_2,2 | NP_1,2) * P(w_1,1 | DT_1,1) * P(w_2,2 | N_2,2) * P(V_3,3, PP_4,l | VP_3,l) * P(w_3,3 | V_3,3) * P(P_4,4, NP_5,l | PP_4,l) * P(w_4,4 | P_4,4) * P(w_5...l | NP_5,l)

(using the chain rule, context-freeness and ancestor-freeness)
Exercise: Probabilistic Dependency Tree
Take the cue from PCFG
Probability of a sentence is the probability that the sentence is the yield of S, or the probability that S dominates the sentence
This probability is the probability of different non-terminals dominating segments in the sentence
Get a probabilistic formulation of
dependency tree
Problems with probabilities in PCFG

Grammar rules:
S → aSb   P = 0.7
S → ε     P = 0.3

Language of the above grammar: L = {a^n b^n} = {ε, ab, a²b², a³b³, ...}

For a²b² (derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb):
P(t | a²b²) = 0.7 * 0.7 * 0.3 = 0.147
Where is the rest of the mass, 0.853? It is distributed over the other strings of the language.
Problems with probabilities (2)

Grammar rules:
S → aSb   P = 0.7
S → ab    P = 0.2
S → ε     P = 0.1

Language of the above grammar: L = {a^n b^n} = {ε, ab, a²b², a³b³, ...}

For ab there are two derivations: S → ab directly (0.2), or S ⇒ aSb ⇒ ab via S → ε (0.7 * 0.1):
P(ab) = 0.2 + 0.7 * 0.1 = 0.27
Where is the rest of the mass, 0.73? Again, distributed over the other strings of the language.
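The "missing" mass is not lost: it is spread over the other strings of the language. A sketch for the second grammar (S → aSb 0.7 | ab 0.2 | ε 0.1), summing P(aⁿbⁿ) over n:

```python
# Total probability mass of the language of S -> aSb (0.7) | ab (0.2) | eps (0.1).
def p_anbn(n):
    """P(a^n b^n): either n applications of S -> aSb ending in S -> eps,
    or n-1 applications ending in S -> ab (two derivations per string)."""
    if n == 0:
        return 0.1                         # the empty string, S -> eps
    return 0.7 ** n * 0.1 + 0.7 ** (n - 1) * 0.2

print(p_anbn(1))                           # → 0.27, as on the slide
print(sum(p_anbn(n) for n in range(200)))  # ≈ 1.0: the mass sums to one
```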
HMM ↔ PCFG

O observed sequence ↔ w_1m sentence
X state sequence ↔ t parse tree
μ model ↔ G grammar

Three fundamental questions:

How likely is a certain observation given the model? ↔ How likely is a sentence given the grammar?
P(O | μ) ↔ P(w_1m | G)

How to choose a state sequence which best explains the observations? ↔ How to choose a parse which best supports the sentence?
argmax_X P(X | O, μ) ↔ argmax_t P(t | w_1m, G)

How to choose the model parameters that best explain the observed data? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences?
argmax_μ P(O | μ) ↔ argmax_G P(w_1m | G)
Interesting Probabilities

The gunman sprayed the building with bullets
  1      2       3    4     5      6     7

Inside probability β_NP(4,5): what is the probability of having an NP at this position such that it will derive "the building"?
Outside probability α_NP(4,5): what is the probability of starting from N^1 and deriving "The gunman sprayed", an NP, and "with bullets"?

Random variables to be considered:
The non-terminal being expanded, e.g., NP
The word span covered by the non-terminal, e.g., (4,5) refers to the words "the building"

While calculating probabilities, consider:
The rule to be used for expansion, e.g., NP → DT NN
The probabilities associated with the RHS non-terminals, e.g., the DT subtree's and the NN subtree's inside/outside probabilities
Outside Probability

α_j(p,q): the probability of beginning with N^1 and generating the non-terminal N^j_pq and all words outside w_p ... w_q:

α_j(p,q) = P(w_1(p-1), N^j_pq, w_(q+1)m | G)

Inside Probability

β_j(p,q): the probability of generating the words w_p ... w_q starting with the non-terminal N^j_pq:

β_j(p,q) = P(w_pq | N^j_pq, G)

Outside & Inside Probabilities: example

The gunman sprayed the building with bullets
  1      2       3    4     5      6     7

α_NP(4,5) for "the building" = P(The gunman sprayed, NP_4,5, with bullets | G)
β_NP(4,5) for "the building" = P(the building | NP_4,5, G)
Shared Sub-Problems: Example
CKY Parsing: CNF

CKY parsing requires that the grammar consist of ε-free, binary rules, i.e., Chomsky Normal Form.
All rules of the form:
A → B C  or  D → w
What does the tree look like?
What if my CFG isn't in CNF?
CKY Algorithm

Illustrating the CYK [Cocke, Younger, Kasami] algorithm

S → NP VP      1.0
NP → DT NN     0.5
NP → NNS       0.3
NP → NP PP     0.2
PP → P NP      1.0
VP → VP PP     0.6
VP → VBD NP    0.4
• DT → the       1.0
• NN → gunman    0.5
• NN → building  0.5
• VBD → sprayed  1.0
• NNS → bullets  1.0
• P → with       1.0
CYK: Start with (0,1)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New chart entry (from-to): DT0-1.
CYK: Keep filling diagonals

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: NN1-2. Chart so far: DT0-1, NN1-2.
CYK: Try getting higher level structures

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: NP0-2 (from DT0-1 and NN1-2). Chart so far: DT0-1, NN1-2, NP0-2.
CYK: Diagonal continues

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: VBD2-3. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

No new constituent: the cells (0,3) and (1,3) are marked empty (---). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: DT3-4. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: NN4-5. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5.
CYK: starts filling the 5th column

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: NP3-5 (from DT3-4 and NN4-5). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: VP2-5 (from VBD2-3 and NP3-5). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

No new constituent: the remaining cells ending at position 5 are marked empty (---). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5.
CYK: S found, but NO termination!

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: S0-5 (from NP0-2 and VP2-5). An S is found, but parsing does not terminate since the sentence is not yet fully covered. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: P5-6. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

No new constituent: the remaining cells ending at position 6 are marked empty (---). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6.
CYK: Control moves to last column

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entries: NNS6-7 and NP6-7. Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: PP5-7 (from P5-6 and NP6-7). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: NP3-7 (from NP3-5 and PP5-7). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7.
CYK (cont…)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: VP2-7 (from VP2-5 and PP5-7, and also from VBD2-3 and NP3-7). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7, VP2-7.
CYK: filling the last column

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

No new constituent: the remaining cells ending at position 7, except (0,7), are marked empty (---). Chart so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7, VP2-7.
CYK: terminates with S in (0,7)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

New entry: S0-7 (from NP0-2 and VP2-7). Parsing terminates with S covering the whole span (0,7).

Final chart:
DT0-1    NP0-2    S0-5    S0-7
NN1-2
VBD2-3   VP2-5    VP2-7
DT3-4    NP3-5    NP3-7
NN4-5
P5-6     PP5-7
NNS6-7, NP6-7
(all other cells are empty)
CYK: Extracting the Parse Tree

The parse tree is obtained by keeping back pointers.

[S(0-7) [NP(0-2) [DT(0-1) The] [NN(1-2) gunman]]
        [VP(2-7) [VBD(2-3) sprayed]
                 [NP(3-7) [NP(3-5) [DT(3-4) the] [NN(4-5) building]]
                          [PP(5-7) [P(5-6) with] [NP(6-7) [NNS(6-7) bullets]]]]]]
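A back-pointer chart like the one above can be built with a minimal Viterbi-style probabilistic CKY sketch (0-indexed spans; folding the unary NP → NNS into the lexical table is an implementation shortcut, not part of the slide's grammar):

```python
# Viterbi CKY: chart[(i, j)][A] holds the best probability of non-terminal
# A over span (i, j); back[(i, j, A)] records the split and children used,
# so the best parse can be read off from S in (0, n) via back pointers.
from collections import defaultdict

LEXICAL = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
           ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0, ("P", "with"): 1.0,
           ("NP", "bullets"): 0.3}   # NP -> NNS -> bullets, folded in: 0.3 * 1.0
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
          ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}

def cky(words):
    n = len(words)
    chart = defaultdict(dict)
    back = {}
    for i, w in enumerate(words):                 # lexical (diagonal) cells
        for (A, word), p in LEXICAL.items():
            if word == w:
                chart[(i, i + 1)][A] = p
                back[(i, i + 1, A)] = w
    for length in range(2, n + 1):                # longer spans, bottom-up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):             # split point
                for (A, B, C), p in BINARY.items():
                    cand = p * chart[(i, k)].get(B, 0.0) * chart[(k, j)].get(C, 0.0)
                    if cand > chart[(i, j)].get(A, 0.0):
                        chart[(i, j)][A] = cand
                        back[(i, j, A)] = (k, B, C)
    return chart, back

words = "the gunman sprayed the building with bullets".split()
chart, back = cky(words)
print(chart[(0, 7)]["S"])   # ≈ 0.0045, probability of the best parse
print(back[(0, 7, "S")])    # (2, 'NP', 'VP'): S -> NP(0,2) VP(2,7)
```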
Probabilistic parse tree construction

Probabilistic Context Free Grammars
S → NP VP      1.0
NP → DT NN     0.5
NP → NNS       0.3
NP → NP PP     0.2
PP → P NP      1.0
VP → VP PP     0.6
VP → VBD NP    0.4
DT → the       1.0
NN → gunman    0.5
NN → building  0.5
VBD → sprayed  1.0
NNS → bullets  1.0
P → with       1.0
Calculating Inside Probabilities β_j(p,q)

Base case:
β_j(k,k) = P(w_k | N^j_kk, G) = P(N^j → w_k | G)

The base case is used for rules which derive the words (terminals) directly.
E.g., suppose N^j = NN is being considered and NN → building is one of the rules, with probability 0.5:

β_NN(5,5) = P(building | NN_5,5, G) = P(NN → building | G) = 0.5
Induction Step: Assuming Grammar in Chomsky Normal Form

Induction step:
β_j(p,q) = P(w_pq | N^j_pq, G)
         = Σ_{r,s} Σ_{d=p}^{q-1} P(N^j → N^r N^s) * β_r(p,d) * β_s(d+1,q)

Consider different splits of the words, indicated by d. E.g., for "the huge building", split at d=2 or d=3.
Consider different non-terminals to be used in the rule: NP → DT NN, NP → DT NNS are available options.
Sum over all of these.
The Bottom-Up Approach

The idea of induction: consider "the gunman".
Base cases: apply the unary rules DT → the (prob = 1.0) and NN → gunman (prob = 0.5).
Induction: the probability that an NP covers these two words
= P(NP → DT NN) * P(DT deriving "the") * P(NN deriving "gunman")
= 0.5 * 1.0 * 0.5 = 0.25
Parse Triangle (probability of the output observation, i.e., the sentence)

• A parse triangle is constructed for calculating β_j(p,q).
• Probability of a sentence using β_j(p,q):

P(w_1m | G) = P(N^1 ⇒* w_1m | G) = P(w_1m | N^1_1m, G) = β_1(1, m)
Parse Triangle

• Fill the diagonal with β_j(k,k):

The (1):      β_DT(1,1) = 1.0
gunman (2):   β_NN(2,2) = 0.5
sprayed (3):  β_VBD(3,3) = 1.0
the (4):      β_DT(4,4) = 1.0
building (5): β_NN(5,5) = 0.5
with (6):     β_P(6,6) = 1.0
bullets (7):  β_NNS(7,7) = 1.0
Parse Triangle (cont.)

• Calculate using the induction formula:

β_NP(1,2) = P(the gunman | NP_1,2, G)
          = P(NP → DT NN) * β_DT(1,1) * β_NN(2,2)
          = 0.5 * 1.0 * 0.5 = 0.25
Example Parse t1

The gunman sprayed the building with bullets.

[S [NP [DT The] [NN gunman]]
   [VP [VP [VBD sprayed] [NP [DT the] [NN building]]]
       [PP [P with] [NP [NNS bullets]]]]]

The rule used here is VP → VP PP.
Another Parse t2

The gunman sprayed the building with bullets.

[S [NP [DT The] [NN gunman]]
   [VP [VBD sprayed]
       [NP [NP [DT the] [NN building]]
           [PP [P with] [NP [NNS bullets]]]]]]

The rule used here is VP → VBD NP.
β calculation

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

β_VP(3,7) = P(sprayed the building with bullets | VP_3,7, G)
          = P(VP → VP PP) * β_VP(3,5) * β_PP(6,7)
            + P(VP → VBD NP) * β_VBD(3,3) * β_NP(4,7)
          = 0.6 * 0.1 * 0.3 + 0.4 * 1.0 * 0.015 = 0.018 + 0.006 = 0.024
Parse tree for the given sentence using probabilistic CYK parsing

•Two parse trees are possible because the sentence has attachment ambiguity.
•In total, 16 multiplications are required to build both parse trees using probabilistic CYK.
•This number of multiplications is less than that of a probabilistic parsing which prepares the two parse trees independently, with 28 multiplications.

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

β chart:
βDT(0-1) = 1.0    βNP(0-2) = 0.25   βS(0-7) = 0.006
βNN(1-2) = 0.5
βVBD(2-3) = 1.0   βVP(2-5) = 0.1    βVP(2-7) = 0.024
βDT(3-4) = 1.0    βNP(3-5) = 0.25   βNP(3-7) = 0.015
βNN(4-5) = 0.5
βP(5-6) = 1.0     βPP(5-7) = 0.3
βNNS(6-7) = 1.0   βNP(6-7) = 0.3
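The β chart above can be reproduced with the inside algorithm: the same chart loop as CKY, but summing over rules and split points instead of maximizing. A minimal sketch with 0-indexed spans as in the chart; folding the unary NP → NNS into the lexical table is an implementation shortcut:

```python
# Inside algorithm: beta[(i, j, A)] = P(words[i:j] | A spans (i, j), G).
# beta[(0, n, 'S')] is then the total probability of the sentence.
from collections import defaultdict

LEXICAL = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
           ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0, ("P", "with"): 1.0,
           ("NP", "bullets"): 0.3}   # NP -> NNS -> bullets, folded in: 0.3 * 1.0
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
          ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}

def inside(words):
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):                 # base case: lexical rules
        for (A, word), p in LEXICAL.items():
            if word == w:
                beta[(i, i + 1, A)] += p
    for length in range(2, n + 1):                # induction: sum, not max
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):             # split point d
                for (A, B, C), p in BINARY.items():
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k, j, C)]
    return beta

beta = inside("the gunman sprayed the building with bullets".split())
print(beta[(2, 5, "VP")])   # ≈ 0.1, matching the chart's βVP(2-5)
print(beta[(0, 7, "S")])    # ≈ 0.006, the total sentence probability
```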