
(1)

CS626: NLP, Speech and the Web

Pushpak Bhattacharyya

CSE Dept., IIT Bombay

Lecture 16, 18: Probabilistic parsing; parsing phenomena

26th August, 2014

(Lecture 17 was on MDL and morphology, by Vasu)

(2)

Example of Sentence Labeling: Parsing

[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ IIT] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]]

(3)

Noisy Channel Modeling

Source: sentence S
Target: parse tree T

T* = argmax_T P(T | S)
   = argmax_T P(T) · P(S | T)
   = argmax_T P(T), since the sentence is completely determined given the parse, i.e. P(S | T) = 1

(4)

Language Models

N-grams: sequence of n consecutive words/characters

Probabilistic / Stochastic Context Free Grammars:

Simple probabilistic models capable of handling recursion

A CFG with probabilities attached to rules

Rule probabilities → how likely is it that a particular rewrite rule is used?

(5)

PCFGs

Why PCFGs?

Intuitive probabilistic models for tree-structured languages

Algorithms are extensions of HMM algorithms

Better than the n-gram model for language modeling.

(6)

Data driven parsing is tied up with what is seen in the training data

“Stand right, walk left.” — often a message on airport escalators.

“Walk left” can come from: “After climbing, we realized the walk left was still considerable.”

“Stand right” can come from: “The stand, right though it seemed at that time, does not stand the scrutiny now.”

Then the given sentence “Stand right walk left” will never be parsed correctly.

(7)

Chomsky Hierarchy

Properties of the hierarchy:-

The containment is proper.

Grammar is not learnable even for the lowest type from positive examples alone.

Type 3 (regular grammar) ⊂ Type 2 (CFG) ⊂ Type 1 (CSG) ⊂ Type 0

(8)

Laws of machine learning

“Learning from vacuum (zero knowledge) is impossible”.

“Inductive bias decides what to learn”.

In the case of a POS tagger, the bias is the tagset we assume.

(9)

Some language phenomena

(10)

PP-attachment

Happens for verb phrases.

Phenomenon, basic form: V N1 P N2

Examples:
saw boys with bat_and_ball
saw mountains with telescope

(11)

Tree for V-N1-P-N2

Noun attachment: [VP V [NP N1 [PP P N2]]]

Verb attachment: [VP V [NP N1] [PP P N2]]

(12)

Wh-pronoun dropping

Category A:-

I know the boy whom you met yesterday.

I know the boy you met yesterday.

Category B:-

I know the boy who met you yesterday.

*I know the boy met you yesterday. (Disregard the possibility of “...know <that>...”.)

Rule: the wh-pronoun can be dropped if it refers to the object of the sentence corresponding to the subordinate clause.

(13)

Wh-pronoun (2)

Complex sentences

Main Clause Subordinate clause

A) “I know the boy” “You met the boy yesterday”

B) “I know the boy” “The boy met you yesterday”

(14)

Wh-pronoun (3)

Category B

The man who lives in Delhi saw the picture.

The man lives in Delhi saw the picture. (*)

Category A

1.

The man whom you met yesterday saw the picture.

The man you met yesterday saw the picture.

2.

The man whom you argued with saw the picture.

The man you argued with saw the picture.

(15)

Parsing example: constituency, dependency, deep semantics

(16)

How to parse?

Example

“Deeds people do, follow them to the end.”

Basic sentence structure: Subject + Predicate, where the predicate is Verb + Complement (“object” in some cases).

(17)

Constituency Parsing

Sentences are said to be composed of constituents (or phrases).

Constituent parse tree for the core of our example: “Deeds follow them.”

[S [NP [NNS Deeds]] [VP [V follow] them]]

(18)

Dependency Parsing

Captures dependencies between entities

A step closer to deep semantics

Dependency parse tree for the core of our example: “Deeds follow them ”

root → follow (V_main)
follow → Deeds (nsubj)
follow → them (dobj, direct object)

(19)

Deep Semantics

WSD resolved

Attachment resolved

Co-reference resolved

follow (synonym: pursue): agent = deeds, object = them

(20)

Complex Sentence

A sentence is said to be complex if it has more than one verb.

“Deeds people do, follow them”

(21)

Constituency

[S [NP [NNS Deeds] [S' [NP [NNS people]] [VP [V do]]]] [VP [V follow] [NP [PRON them]]]]

“Deeds people do, follow them”

(22)

Dependency

root → follow (v-main)
follow → Deeds (nsubj)
follow → them (dobj)
Deeds → do (wh-clause)
do → people (nsubj)

“Deeds people do, follow them”

(23)

Deep Semantics

follow-01 (synonym: pursue): agent = do-01, object = them
do-01: agent = people, object = deeds

“Deeds people do, follow them”

(24)

Constituency

[S [NP [NNS Deeds] [S' [NP [NNS people]] [VP [V do]]]] [VP [V follow] [NP [PRON them]] [PP [P to] [NP [DT the] [NN end]]]]]

“Deeds people do, follow them to the end”

(25)

Dependency

root → follow (v-main)
follow → Deeds (nsubj)
follow → them (dobj)
Deeds → do (wh-clause)
do → people (nsubj)
follow → to
to → end (pobj)
end → the (det)

“Deeds people do, follow them to the end”

(26)

Deep Semantics

follow-01 (synonym: pursue): agent = do-01, object = them
do-01: agent = people, object = Deeds
“to the end”: end (@definite), go1 — entry points in the semantic graph

“Deeds people do, follow them to the end”

(27)

Probability of a sentence

(28)

Probability of sentence

P(W0, W1, W2, W3, …, Wi, …, Wn)

• Frequentist approach:
P(S) = #S / #sentences in the corpora

• N-grams based approach:
P(S) = P(W0, W1, …, Wn)
     = P(W0) · P(W1 | W0) · P(W2 | W0 W1) · P(W3 | W0 W1 W2) · … · P(Wn | W0 … Wn-1)

With the bigram assumption,
P(S) = P(W0) · ∏_{i=1..n} P(Wi | Wi-1)
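As a concrete illustration of the bigram decomposition above, here is a minimal Python sketch; the toy corpus, the <s> padding symbol and the unsmoothed MLE counts are assumptions for illustration only, not something given on the slide:

```python
from collections import Counter

# Tiny toy corpus; a real language model would be estimated from a large corpus.
corpus = [
    "the gunman sprayed the building",
    "the gunman saw the building",
]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    words = ["<s>"] + sent.split()          # pad with a start symbol
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def bigram_prob(sentence):
    """P(sentence) under the bigram assumption: product of P(w_i | w_{i-1}).
    Plain MLE counts, no smoothing, so unseen bigrams get probability 0."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

print(bigram_prob("the gunman sprayed the building"))   # 0.125 on this toy corpus
```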

(29)

Probability of sentence

POS tag based:
W0–t0, W1–t1, W2–t2, …, Wn–tn
P(S) = Σ over tag sequences T of P(W, T)
P(S) = Σ_T P(T) · P(W | T) = Σ_T ∏_{i=0..n} P(ti | ti-1) · P(wi | ti)

Parse tree based:
P(W) = Σ over all parse trees Tr of P(W, Tr)
P(W, Tr) = P(Tr) · P(W | Tr) = P(Tr), since P(W | Tr) = 1

(30)

Uses of P(W)

1. Quality ensuring in translation output

2. Natural language generation. (Generating sentences from non textual inputs like pie charts)

3. Automatic summary generation

4. Speech to text and text to speech – next word prediction

(31)

Back to probabilistic parsing

(32)

Several Important Points

Phrases capture hierarchy

Span of phrases?

What pairs form dependency?

What is the label on the dependency arc?

Which disambiguation?

Words

Attachments

Co-reference

All these are amenable to machine learning

(33)

Formal Definition of PCFG

A PCFG consists of:

A set of terminals {w_k}, k = 1, …, V
  e.g. {w_k} = {child, teddy, bear, played, …}

A set of non-terminals {N_i}, i = 1, …, n
  e.g. {N_i} = {NP, VP, DT, …}

A designated start symbol N_1

A set of rules {N_i → ζ_j}, where ζ_j is a sequence of terminals and non-terminals
  e.g. NP → DT NN

A corresponding set of rule probabilities

(34)

Rule Probabilities

Rule probabilities are such that, for each non-terminal N_i:

Σ_j P(N_i → ζ_j) = 1

E.g., P(NP → DT NN) = 0.2, P(NP → NN) = 0.5, P(NP → NP PP) = 0.3

P(NP → DT NN) = 0.2 means that 20% of the NP expansions in the training-data parses use the rule NP → DT NN.

(35)

Probabilistic Context Free Grammars

S → NP VP        1.0
NP → DT NN       0.5
NP → NNS         0.3
NP → NP PP       0.2
PP → P NP        1.0
VP → VP PP       0.6
VP → VBD NP      0.4
DT → the         1.0
NN → gunman      0.5
NN → building    0.5
VBD → sprayed    1.0
NNS → bullets    1.0
P → with         1.0
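A minimal sketch of how such a grammar can be held in code and the normalization constraint Σ_j P(N_i → ζ_j) = 1 checked; the Python layout (rule-to-probability dictionary) is an illustrative assumption, not something prescribed by the slides:

```python
from collections import defaultdict

# The grammar above as a rule -> probability mapping (layout is illustrative only).
pcfg = {
    ("S",   ("NP", "VP")):   1.0,
    ("NP",  ("DT", "NN")):   0.5,
    ("NP",  ("NNS",)):       0.3,
    ("NP",  ("NP", "PP")):   0.2,
    ("PP",  ("P", "NP")):    1.0,
    ("VP",  ("VP", "PP")):   0.6,
    ("VP",  ("VBD", "NP")):  0.4,
    ("DT",  ("the",)):       1.0,
    ("NN",  ("gunman",)):    0.5,
    ("NN",  ("building",)):  0.5,
    ("VBD", ("sprayed",)):   1.0,
    ("NNS", ("bullets",)):   1.0,
    ("P",   ("with",)):      1.0,
}

# Check sum_j P(N_i -> zeta_j) = 1 for every non-terminal N_i.
totals = defaultdict(float)
for (lhs, _rhs), p in pcfg.items():
    totals[lhs] += p
assert all(abs(total - 1.0) < 1e-9 for total in totals.values())
```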

(36)

Example Parse t1

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]] [VP0.6 [VP0.4 [VBD1.0 sprayed] [NP0.5 [DT1.0 the] [NN0.5 building]]] [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]

P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045
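P(t1) is simply the product of the probabilities of the rules used in the tree, which works out to 0.0045 (together with P(t2) = 0.0015 on the next slide, this is consistent with the sentence probability 0.006 computed later in the parse triangle). A small self-contained sketch, with the rules read off the tree above and the probabilities taken from the grammar on the previous slide:

```python
import math

# Rules used in parse t1 (VP attachment), read off the tree above, each paired
# with its probability from the grammar.
t1_rules = [
    ("S -> NP VP", 1.0), ("NP -> DT NN", 0.5), ("DT -> the", 1.0),
    ("NN -> gunman", 0.5), ("VP -> VP PP", 0.6), ("VP -> VBD NP", 0.4),
    ("VBD -> sprayed", 1.0), ("NP -> DT NN", 0.5), ("DT -> the", 1.0),
    ("NN -> building", 0.5), ("PP -> P NP", 1.0), ("P -> with", 1.0),
    ("NP -> NNS", 0.3), ("NNS -> bullets", 1.0),
]

p_t1 = math.prod(p for _, p in t1_rules)
print(p_t1)   # ~0.0045; together with P(t2) = 0.0015 this gives P(sentence) = 0.006
```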

(37)

Another Parse t2

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]] [VP0.4 [VBD1.0 sprayed] [NP0.2 [NP0.5 [DT1.0 the] [NN0.5 building]] [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]]

P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015

(38)

Probability of a sentence

Notation:
w_ab — the subsequence w_a … w_b
N_j dominates w_a … w_b, i.e. yield(N_j) = w_a … w_b (e.g. an NP dominating “the sweet teddy bear”)

Probability of a sentence = P(w_1m):
P(w_1m) = Σ_t P(w_1m, t)
        = Σ_{t : yield(t) = w_1m} P(t) · P(w_1m | t)
        = Σ_{t : yield(t) = w_1m} P(t)
where t is a parse tree of the sentence, since P(w_1m | t) = 1 if t is a parse tree for the sentence w_1m.

(39)

Assumptions of the PCFG model

Place invariance: P(NP → DT NN) is the same in locations 1 and 2.

Context-free: P(NP → DT NN | anything outside “The child”) = P(NP → DT NN).

Ancestor-free: at location 2, P(NP → DT NN | its ancestor is VP) = P(NP → DT NN).

(Tree: S → NP VP, with NP at location 1 covering “The child” and NP at location 2, under the VP, covering “The toy”.)

(40)

Probability of a parse tree

Domination: we say N_j dominates from k to l if W_k,l is derived from N_j.

P(tree | sentence) = P(tree | S_1,l), where S_1,l means that the start symbol S dominates the word sequence W_1,l.

P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).

(41)

Probability of a parse tree (cont.)

(Tree: S_1,l → NP_1,2 VP_3,l; NP_1,2 → DT_1,1 N_2,2; VP_3,l → V_3,3 PP_4,l; PP_4,l → P_4,4 NP_5,l; the w_i,i are the words.)

P(t | s) = P(t | S_1,l)
= P(NP_1,2, DT_1,1, w_1,1, N_2,2, w_2,2, VP_3,l, V_3,3, w_3,3, PP_4,l, P_4,4, w_4,4, NP_5,l, w_5…l | S_1,l)
= P(NP_1,2, VP_3,l | S_1,l) * P(DT_1,1, N_2,2 | NP_1,2) * P(w_1,1 | DT_1,1) * P(w_2,2 | N_2,2) * P(V_3,3, PP_4,l | VP_3,l) * P(w_3,3 | V_3,3) * P(P_4,4, NP_5,l | PP_4,l) * P(w_4,4 | P_4,4) * P(w_5…l | NP_5,l)

(using the chain rule, context-freeness and ancestor-freeness)

(42)

Exercise: Probabilistic Dependency Tree

Take the cue from PCFG

Probability of a sentence is the probability that the sentence is the yield of S, or the probability that S dominates the sentence

This probability is the probability of different non-terminals dominating segments in the sentence

Get a probabilistic formulation of the dependency tree.

(43)

Problems with probabilities in PCFG

Grammar rules:
S → aSb    P = 0.7
S → ε      P = 0.3

Language of the above grammar: L = {ε, ab, a^2b^2, a^3b^3, …} = {a^n b^n : n ≥ 0}

For the string a^2b^2 (derivation S → aSb → aaSbb → aabb):
P(t) = 0.7 * 0.7 * 0.3 = 0.147
Where is the rest of 0.853?

(44)

Problems with probabilities

Grammar rules:
S → aSb    P = 0.7
S → ab     P = 0.2
S → ε      P = 0.1

Language of the above grammar: L = {ε, ab, a^2b^2, a^3b^3, …} = {a^n b^n : n ≥ 0}

For the string ab there are two derivations: S → ab (0.2), and S → aSb → ab via S → ε (0.7 * 0.1):
P(ab) = 0.2 + 0.7 * 0.1 = 0.27
Where is the rest of 0.73?
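A small numeric sketch of where the remaining mass lives for this second grammar: a^n b^n can be derived either by n−1 recursions followed by S → ab, or by n recursions followed by S → ε, so P(a^n b^n) = 0.7^(n−1)·0.2 + 0.7^n·0.1 for n ≥ 1, with P(ε) = 0.1. Summing over longer and longer strings, the total tends to 1. The code layout is mine, not from the slides:

```python
# P(a^n b^n) under S -> aSb (0.7) | ab (0.2) | eps (0.1):
#   n = 0 :  S -> eps                                        -> 0.1
#   n >= 1:  S -> aSb ... -> ab   (n-1 recursions, then ab)  -> 0.7**(n-1) * 0.2
#        or  S -> aSb ... -> eps  (n recursions, then eps)   -> 0.7**n * 0.1
def p_string(n):
    return 0.1 if n == 0 else 0.7 ** (n - 1) * 0.2 + 0.7 ** n * 0.1

print(p_string(1))                          # P(ab) = 0.2 + 0.07 = 0.27, as on the slide
print(sum(p_string(n) for n in range(50)))  # ~1.0: the "missing" 0.73 sits on the other strings
```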

(45)

HMM ↔ PCFG

O, observed sequence ↔ w_1m, sentence
X, state sequence ↔ t, parse tree
μ, model ↔ G, grammar

Three fundamental questions

(46)

HMM ↔ PCFG

How likely is a certain observation given the model? ↔ How likely is a sentence given the grammar?
P(O | μ) ↔ P(w_1m | G)

How to choose a state sequence which best explains the observations? ↔ How to choose a parse which best supports the sentence?
argmax_X P(X | O, μ) ↔ argmax_t P(t | w_1m, G)

(47)

HMM ↔ PCFG

How to choose the model parameters that best explain the observed data? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences?
argmax_μ P(O | μ) ↔ argmax_G P(w_1m | G)

(48)

Interesting Probabilities

The gunman sprayed the building with bullets
 1      2       3     4     5      6      7

β_NP(4, 5): what is the probability of having an NP at this position such that it will derive “the building”? — Inside probability

α_NP(4, 5): what is the probability of starting from N_1 and deriving “The gunman sprayed”, an NP, and “with bullets”? — Outside probability

(49)

Interesting Probabilities

Random variables to be considered

The non-terminal being expanded.

E.g., NP

The word-span covered by the non-terminal.

E.g., (4,5) refers to words “the building”

While calculating probabilities, consider:
The rule to be used for expansion, e.g. NP → DT NN
The probabilities associated with the RHS non-terminals, e.g. the DT subtree’s inside/outside probabilities and the NN subtree’s inside/outside probabilities

(50)

Outside Probability

α_j(p, q): the probability of beginning with N_1 and generating the non-terminal N_j,pq and all words outside w_p … w_q:

α_j(p, q) = P(w_1(p-1), N_j,pq, w_(q+1)m | G)

(Picture: N_1 at the root, N_j over w_p … w_q, with w_1 … w_(p-1) to the left and w_(q+1) … w_m to the right.)

(51)

Inside Probabilities

β_j(p, q): the probability of generating the words w_p … w_q starting with the non-terminal N_j,pq:

β_j(p, q) = P(w_pq | N_j,pq, G)

(Picture: as above, with N_j dominating w_p … w_q.)

(52)

Outside & Inside Probabilities: Example

The gunman sprayed the building with bullets
 1      2       3     4     5      6      7

α_NP(4, 5) for “the building” = P(The gunman sprayed, NP_4,5, with bullets | G)

β_NP(4, 5) for “the building” = P(the building | NP_4,5, G)

(53)

Shared Sub-Problems: Example

(54)

CKY Parsing: CNF

CKY parsing requires that the grammar consist of ε-free, binary rules, i.e. Chomsky Normal Form.

All rules of the form: A → B C or A → a

What does the tree look like?
What if my CFG isn’t in CNF?

A → B C        D → w
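For the “What if my CFG isn’t in CNF?” question, one standard step is binarization: a rule such as A → B C D is split into binary rules by introducing fresh non-terminals. A minimal sketch; the A|<C.D> naming scheme is an assumption for illustration, and ε-removal and unary-rule elimination are not handled here:

```python
def binarize(rules):
    """Split rules with more than two RHS symbols into binary rules, e.g.
    A -> B C D  becomes  A -> B A|<C.D>  and  A|<C.D> -> C D."""
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_nt = f"{lhs}|<{'.'.join(rhs[1:])}>"
            out.append((lhs, (rhs[0], new_nt)))
            lhs, rhs = new_nt, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out

print(binarize([("A", ("B", "C", "D"))]))
# [('A', ('B', 'A|<C.D>')), ('A|<C.D>', ('C', 'D'))]
```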

(55)

CKY Algorithm
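The algorithm itself is not spelled out on this slide; the following is a minimal CKY recognizer sketch for a grammar in CNF, consistent with the chart that the next slides fill cell by cell. The data layout (word → tags, (B, C) → parents) is an assumption for illustration, and the unary rule NP → NNS is folded into the lexicon to keep the rules binary:

```python
from collections import defaultdict

def cky_recognize(words, lexical, binary, start="S"):
    """CKY recognition for a grammar in CNF.
    lexical: word -> set of tags (A -> a rules)
    binary:  (B, C) -> set of parents A (A -> B C rules)
    chart[(i, j)] holds the non-terminals that derive words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):                      # diagonal: spans of length 1
        chart[(i, i + 1)] |= lexical.get(w, set())
    for width in range(2, n + 1):                      # longer spans, bottom up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                  # split point
                for B in chart[(i, k)]:
                    for C in chart[(k, j)]:
                        chart[(i, j)] |= binary.get((B, C), set())
    return start in chart[(0, n)], chart

lexical = {"the": {"DT"}, "gunman": {"NN"}, "building": {"NN"}, "sprayed": {"VBD"},
           "with": {"P"}, "bullets": {"NNS", "NP"}}   # NP -> NNS folded into the lexicon
binary = {("NP", "VP"): {"S"}, ("DT", "NN"): {"NP"}, ("NP", "PP"): {"NP"},
          ("P", "NP"): {"PP"}, ("VP", "PP"): {"VP"}, ("VBD", "NP"): {"VP"}}

ok, chart = cky_recognize("the gunman sprayed the building with bullets".split(),
                          lexical, binary)
print(ok)                     # True: S ends up in cell (0, 7)
print(sorted(chart[(2, 7)]))  # ['VP'] for "sprayed the building with bullets"
```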

(56)

Illustrating the CYK [Cocke, Younger, Kasami] Algorithm

S → NP VP        1.0
NP → DT NN       0.5
NP → NNS         0.3
NP → NP PP       0.2
PP → P NP        1.0
VP → VP PP       0.6
VP → VBD NP      0.4
DT → the         1.0
NN → gunman      0.5
NN → building    0.5
VBD → sprayed    1.0
NNS → bullets    1.0
P → with         1.0

(57)

CYK: Start with (0,1)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far (from-to): DT0-1

(58)

CYK: Keep filling diagonals

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2

(59)

CYK: Try getting higher level structures

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2

(60)

CYK: Diagonal continues

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3

(61)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3 (no new constituents; more empty cells marked)

(62)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4

(63)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5

(64)

CYK: starts filling the 5th column

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5

(65)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5

(66)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5 (no new constituents this step)

(67)

CYK: S found, but NO termination!

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5

(68)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6

(69)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6 (no new constituents this step)

(70)

CYK: Control moves to the last column

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7

(71)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7

(72)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7

(73)

CYK (cont…)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7, VP2-7

(74)

CYK: filling the last column

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries so far: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7, VP2-7 (no new constituents this step)

(75)

CYK: terminates with S in (0,7)

Sentence: 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7
Chart entries: DT0-1, NN1-2, NP0-2, VBD2-3, DT3-4, NN4-5, NP3-5, VP2-5, S0-5, P5-6, NNS6-7, NP6-7, PP5-7, NP3-7, VP2-7, S0-7
S appears in cell (0, 7), so the sentence is accepted.

(76)

CYK: Extracting the Parse Tree

The parse tree is obtained by keeping back pointers.

[S(0-7) [NP(0-2) [DT(0-1) The] [NN(1-2) gunman]] [VP(2-7) [VBD(2-3) sprayed] [NP(3-7) [NP(3-5) [DT(3-4) the] [NN(4-5) building]] [PP(5-7) [P(5-6) with] [NP(6-7) [NNS(6-7) bullets]]]]]]
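The slide only states that back pointers are kept; one way to realize this is a probabilistic (Viterbi-style) CKY that stores, per chart cell and non-terminal, the best probability and a back pointer, and then reads the tree out recursively. This is a sketch under that assumption (NP → NNS is folded into the lexicon as in the earlier recognizer sketch); with the toy rule probabilities it returns the VP-attachment parse t1:

```python
from collections import defaultdict

def viterbi_cky(words, lexical, binary, start="S"):
    """Probabilistic CKY that keeps back pointers so the best tree can be
    read back out of the chart (assumes binary rules; a sketch only).
    lexical: (tag, word) -> prob      binary: (A, B, C) -> prob for A -> B C"""
    n = len(words)
    best = defaultdict(dict)   # best[(i, j)][A] = highest prob of A over words[i:j]
    back = {}                  # back[(i, j, A)] = (split k, B, C)
    for i, w in enumerate(words):
        for (tag, word), p in lexical.items():
            if word == w:
                best[(i, i + 1)][tag] = p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    if B in best[(i, k)] and C in best[(k, j)]:
                        q = p * best[(i, k)][B] * best[(k, j)][C]
                        if q > best[(i, j)].get(A, 0.0):
                            best[(i, j)][A] = q
                            back[(i, j, A)] = (k, B, C)

    def tree(i, j, A):
        if (i, j, A) not in back:          # width-1 cell: a lexical entry
            return (A, words[i])
        k, B, C = back[(i, j, A)]
        return (A, tree(i, k, B), tree(k, j, C))

    p = best[(0, n)].get(start, 0.0)
    return (p, tree(0, n, start)) if p > 0 else (0.0, None)

# The toy grammar, with NP -> NNS folded into the lexicon to keep the rules binary.
lexical = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
           ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0, ("NP", "bullets"): 0.3,
           ("P", "with"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5, ("NP", "NP", "PP"): 0.2,
          ("PP", "P", "NP"): 1.0, ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}

prob, parse = viterbi_cky("the gunman sprayed the building with bullets".split(),
                          lexical, binary)
print(prob)    # 0.0045: the VP-attachment parse t1 wins over t2 (0.0015)
print(parse)
```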

(77)

Probabilistic parse tree construction

(78)

Probabilistic Context Free Grammars

S → NP VP        1.0
NP → DT NN       0.5
NP → NNS         0.3
NP → NP PP       0.2
PP → P NP        1.0
VP → VP PP       0.6
VP → VBD NP      0.4
DT → the         1.0
NN → gunman      0.5
NN → building    0.5
VBD → sprayed    1.0
NNS → bullets    1.0
P → with         1.0

(79)

Calculating Inside Probabilities β_j(p, q)

Base case:
β_j(k, k) = P(w_k | N_j,kk, G) = P(N_j → w_k | G)

The base case is used for rules which derive the words (terminals) directly.
E.g., suppose N_j = NN is being considered and NN → building is one of the rules, with probability 0.5:

β_NN(5, 5) = P(building | NN_5,5, G) = P(NN → building | G) = 0.5

(80)

Induction Step (assuming the grammar is in Chomsky Normal Form)

β_j(p, q) = P(w_pq | N_j,pq, G)
          = Σ_{r,s} Σ_{d=p..q-1} P(N_j → N_r N_s) * β_r(p, d) * β_s(d+1, q)

(Picture: N_j spanning w_p … w_q, split into N_r over w_p … w_d and N_s over w_d+1 … w_q.)

Consider different splits of the words, indicated by d; e.g. for “the huge building”, split after d = 2 or d = 3.
Consider different non-terminals to be used in the rule: e.g. NP → DT NN and NP → DT NNS are available options.
Sum over all of these.
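A minimal Python sketch of this β computation for the toy grammar, following the base case and the induction step above (sum over rule choices and split points). Word positions are 1-based as on the slides, and the unary rule NP → NNS is folded into the lexical probabilities to keep the rules binary; the encoding is an illustrative assumption:

```python
from collections import defaultdict

# Same toy-grammar encoding as in the earlier sketches; NP -> NNS is folded
# into the lexical probabilities (P(NP over "bullets") = 0.3 * 1.0).
lexical = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
           ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0, ("NP", "bullets"): 0.3,
           ("P", "with"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5, ("NP", "NP", "PP"): 0.2,
          ("PP", "P", "NP"): 1.0, ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}

def inside(words):
    """beta[(p, q)][N_j] = P(w_p..w_q | N_j over span (p, q), G); positions are 1-based."""
    n = len(words)
    beta = defaultdict(lambda: defaultdict(float))
    for k, w in enumerate(words, start=1):            # base case: beta_j(k, k)
        for (tag, word), prob in lexical.items():
            if word == w:
                beta[(k, k)][tag] += prob
    for width in range(2, n + 1):                     # induction on span length
        for p in range(1, n - width + 2):
            q = p + width - 1
            for d in range(p, q):                     # split point
                for (j, r, s), prob in binary.items():
                    beta[(p, q)][j] += prob * beta[(p, d)][r] * beta[(d + 1, q)][s]
    return beta

beta = inside("the gunman sprayed the building with bullets".split())
print(beta[(1, 2)]["NP"])   # 0.25, as computed on the parse-triangle slides
print(beta[(1, 7)]["S"])    # ~0.006 = beta_1(1, m), the probability of the sentence
```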

(81)

The Bottom-Up Approach

The idea of induction. Consider “the gunman”.

Base cases: apply the unary rules
DT → the        Prob = 1.0
NN → gunman     Prob = 0.5

Induction: probability that an NP covers these two words
= P(NP → DT NN) * P(DT deriving the word “the”) * P(NN deriving the word “gunman”)
= 0.5 * 1.0 * 0.5 = 0.25

(Tree: [NP0.5 [DT1.0 The] [NN0.5 gunman]])

(82)

Parse Triangle (probability of the output observation, i.e. the sentence)

A parse triangle is constructed for calculating β_j(p, q).

Probability of a sentence using β_j(p, q):
P(w_1m | G) = P(N_1 ⇒ w_1m | G) = P(w_1m | N_1,1m, G) = β_1(1, m)

(83)

Parse Triangle

Words: The (1), gunman (2), sprayed (3), the (4), building (5), with (6), bullets (7)

Fill the diagonal with β_j(k, k):
β_DT(1,1) = 1.0, β_NN(2,2) = 0.5, β_VBD(3,3) = 1.0, β_DT(4,4) = 1.0, β_NN(5,5) = 0.5, β_P(6,6) = 1.0, β_NNS(7,7) = 1.0

(84)

Parse Triangle

Calculate using the induction formula:
β_NP(1, 2) = P(the gunman | NP_1,2, G)
           = P(NP → DT NN) * β_DT(1, 1) * β_NN(2, 2)
           = 0.5 * 1.0 * 0.5 = 0.25

(Diagonal as before: β_DT(1,1) = 1.0, β_NN(2,2) = 0.5, β_VBD(3,3) = 1.0, β_DT(4,4) = 1.0, β_NN(5,5) = 0.5, β_P(6,6) = 1.0, β_NNS(7,7) = 1.0; new cell: β_NP(1,2) = 0.25)

(85)

Example Parse t1

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]] [VP0.6 [VP0.4 [VBD1.0 sprayed] [NP0.5 [DT1.0 the] [NN0.5 building]]] [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]

The rule used at the top of the VP here is VP → VP PP.

(86)

Another Parse t2

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]] [VP0.4 [VBD1.0 sprayed] [NP0.2 [NP0.5 [DT1.0 the] [NN0.5 building]] [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]]

The rule used at the top of the VP here is VP → VBD NP.

(87)

β calculation

β_VP(3, 7) = P(sprayed the building with bullets | VP_3,7, G)
           = P(VP → VP PP) * β_VP(3, 5) * β_PP(6, 7)
             + P(VP → VBD NP) * β_VBD(3, 3) * β_NP(4, 7)
           = 0.6 * 0.1 * 0.3 + 0.4 * 1.0 * 0.015
           = 0.018 + 0.006 = 0.024

(88)

0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

Parse trees for the given sentence using probabilistic CYK parsing:
• Two parse trees are possible because the sentence has attachment ambiguity.
• In total, 16 multiplications are required to build both parse trees using probabilistic CYK.
• This is fewer multiplications than a probabilistic parser that builds the two parse trees independently, which needs 28 multiplications.

(89)

Parse triangle with β values (word spans 0–7): 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

β_DT(0-1) = 1.0    β_NP(0-2) = 0.25    β_S(0-7) = 0.006
β_NN(1-2) = 0.5
β_VBD(2-3) = 1.0   β_VP(2-5) = 0.1     β_VP(2-7) = 0.024
β_DT(3-4) = 1.0    β_NP(3-5) = 0.25    β_NP(3-7) = 0.015
β_NN(4-5) = 0.5
β_P(5-6) = 1.0     β_PP(5-7) = 0.3
β_NNS(6-7) = 1.0 (and β_NP(6-7) = 0.3 via NP → NNS)
