• No results found

(Lecture 20– CYK; Inside Probability; Parse Tree construction)

N/A
N/A
Protected

Academic year: 2022

Share "(Lecture 20– CYK; Inside Probability; Parse Tree construction)"

Copied!
51
0
0

Loading.... (view fulltext now)

Full text

(1)

CS460/626 : Natural Language

Processing/Speech, NLP and the Web

(Lecture 20– CYK; Inside Probability; Parse Tree construction)

Pushpak Bhattacharyya CSE Dept.,

IIT Bombay

16

th

Feb, 2012

(2)

CYK Parsing

(3)

Shared Sub-Problems: Example

(4)

CKY Parsing: CNF

CKY parsing requires that the grammar consist of ε-free, binary rules =

Chomsky Normal Form

A → B C D → w

Chomsky Normal Form

All rules of the form:

A BC or A a

What does the tree look like?

(5)

CKY Algorithm

(6)

Illustrating CYK [Cocke, Younger, Kashmi] Algo

S → NP VP 1.0

NP → DT NN 0.5

NP → NNS 0.3

NP → NP PP 0.2

• DT → the 1.0

• NN → gunman 0.5

• NN → building 0.5

• VBD → sprayed 1.0

NP → NP PP 0.2

PP → P NP 1.0

VP → VP PP 0.6

VP → VBD NP 0.4

• VBD → sprayed 1.0

• NNS → bullets 1.0

(7)

CYK: Start with (0,1)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT

1 ---

2 --- --- 2 --- ---

-- 3 --- ---

--

--- -

4 ---

-

--- --

--- -

--- -

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(8)

CYK: Keep filling diagonals

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT

1 --- NN

2 --- --- 2 --- ---

-- 3 --- ---

--

--- -

4 ---

-

--- --

--- -

--- -

(9)

CYK: Try getting higher level structures

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP

1 --- NN

2 --- --- 2 --- ---

-- 3 --- ---

--

--- -

4 ---

-

--- --

--- -

--- -

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(10)

CYK: Diagonal continues

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP

1 --- NN

2 --- --- VBD

2 --- --- --

VBD 3 --- ---

--

--- -

4 ---

-

--- --

--- -

--- -

(11)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

1 --- NN ---

-- 2 --- ---

--

VBD 3 --- ---

--

--- -

4 ---

-

--- --

--- -

--- -

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(12)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

1 --- NN ---

-- 2 --- ---

--

VBD 3 --- ---

--

--- -

DT 4 --- --- --- ---

(13)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

1 --- NN ---

-

--- -

- -

2 --- --- --

VBD --- -

3 --- --- --

--- -

DT

4 ---

-

--- --

--- -

--- -

NN

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(14)

CYK: starts filling the 5 th column

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

1 --- NN ---

-

--- -

- -

2 --- --- --

VBD --- -

3 --- --- --

--- -

DT NP

4 --- --- --- --- NN

(15)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

1 --- NN ---

-

--- -

- -

2 --- --- --

VBD --- -

VP 3 --- ---

--

--- -

DT NP

4 ---

-

--- --

--- -

--- -

NN

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(16)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

1 --- NN ---

-

--- -

--- -

- - -

2 --- --- --

VBD --- -

VP 3 --- ---

--

--- -

DT NP

4 --- --- --- --- NN

(17)

CYK: S found, but NO termination!

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S

1 --- NN ---

-

--- -

--- -

- - -

2 --- --- --

VBD --- -

VP 3 --- ---

--

--- -

DT NP

4 ---

-

--- --

--- -

--- -

NN

5 ---

-

--- --

--- -

--- -

--- -

6 ---

-

--- --

--- -

--- -

--- -

--- -

(18)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S

1 --- NN ---

-

--- -

--- -

- - -

2 --- --- --

VBD --- -

VP 3 --- ---

--

--- -

DT NP

4 --- --- --- --- NN

(19)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S --- -

1 --- NN ---

-

--- -

--- -

--- -

- - - -

2 --- --- --

VBD --- -

VP --- -

3 --- --- --

--- -

DT NP ---

-

4 ---

-

--- --

--- -

--- -

NN --- -

5 ---

-

--- --

--- -

--- -

--- -

P

6 ---

-

--- --

--- -

--- -

--- -

--- -

(20)

CYK: Control moves to last column

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

1 --- NN ---

-

--- -

--- -

--- -

2 --- --- --

VBD --- -

VP --- -

3 --- --- --

--- -

DT NP ---

-

4 ---

-

--- --

--- -

--- -

NN --- -

(21)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

1 --- NN ---

-

--- -

--- -

--- -

- - - -

2 --- --- --

VBD --- -

VP --- -

3 --- --- --

--- -

DT NP ---

-

4 ---

-

--- --

--- -

--- -

NN --- -

5 ---

-

--- --

--- -

--- -

--- -

P PP

6 ---

-

--- --

--- -

--- -

--- -

--- -

NP NNS

(22)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

1 --- NN ---

-

--- -

--- -

--- -

2 --- --- --

VBD --- -

VP --- -

3 --- --- --

--- -

DT NP ---

-

NP

4 ---

-

--- --

--- -

--- -

NN --- -

--- -

(23)

CYK (cont…)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

1 --- NN ---

-

--- -

--- -

--- -

- - - -

2 --- --- --

VBD --- -

VP --- -

VP 3 --- ---

--

--- -

DT NP ---

-

NP

4 ---

-

--- --

--- -

--- -

NN --- -

--- -

5 ---

-

--- --

--- -

--- -

--- -

P PP

6 ---

-

--- --

--- -

--- -

--- -

--- -

NP NNS

(24)

CYK: filling the last column

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

1 --- NN ---

-

--- -

--- -

--- -

--- -

- - - - -

2 --- --- --

VBD --- -

VP --- -

VP 3 --- ---

--

--- -

DT NP ---

-

NP 4 --- --- --- --- NN --- ---

(25)

CYK: terminates with S in (0,7)

0

The

1

gunman

2

sprayed

3

the

4

building

5

with

6

bullets

7

.

To From

1 2 3 4 5 6 7

0 DT NP ---

-

--- -

S ---

-

S

1 --- NN ---

-

--- -

--- -

--- -

--- -

- - - - -

2 --- --- --

VBD --- -

VP --- -

VP 3 --- ---

--

--- -

DT NP ---

-

NP

4 ---

-

--- --

--- -

--- -

NN --- -

--- -

5 ---

-

--- --

--- -

--- -

--- -

P PP

6 ---

-

--- --

--- -

--- -

--- -

--- -

NP NNS

(26)

CYK: Extracting the Parse Tree

The parse tree is obtained by keeping back pointers.

S (0-7) NP (0-

2)

VP (2- 7)

2) 7)

VBD (2- 3)

NP (3- 7)

DT (0- 1)

NN (1- 2)

The gunma

n

NP (3- 5)

PP (5- 7)

(27)

Probabilistic parse tree

construction

(28)

Interesting Probabilities

N

1

NP

(4, 5) β

NP

What is the probability of having a NP at this position such that it will derive

“the building” ? -

Inside Probabilities

The gunman sprayed the building with bullets 1 2 3 4 5 6 7

Outside Probabilities

(29)

Interesting Probabilities

Random variables to be considered

The non-terminal being expanded.

E.g., NP

The word-span covered by the non-terminal.

E.g., (4,5) refers to words “the building”

While calculating probabilities, consider:

The rule to be used for expansion : E.g., NP → DT NN

The probabilities associated with the RHS non-

terminals : E.g., DT subtree’s inside/outside

probabilities & NN subtree’s inside/outside

probabilities

(30)

Outside Probability

α j (p,q) : The probability of beginning with N 1

& generating the non-terminal N j pq and all words outside w p ..w q

1( 1) ( 1)

( , ) ( ,

j

, | )

j

p q P w

p

N

pq

w

q m

G

α =

+

N

1

N

j

(31)

Inside Probabilities

β j (p,q) : The probability of generating the words w p ..w q starting with the non-terminal

N j pq . ( , ) ( |

j

, )

j

p q P w

pq

N

pq

G

β =

N

1

w

1

………w

p-1

w

p

…w

q

w

q+1

……… w

m

β

α N

1

N

j

(32)

Outside & Inside Probabilities: example

4,5

(4, 5) for "the building"

(The gunman sprayed, , with bullets | )

NP

P NP G

α

=

N

1

(4, 5) for "the building" (the building |

4,5

, )

NP

P NP G

β =

NP

(33)

Calculating Inside probabilities β j (p,q)

Base case:

( , ) ( |

j

, ) (

j

| )

j

k k P w

k

N

kk

G P N

kk

w

k

G

β = = →

Base case is used for rules which derive the words or terminals directly

words or terminals directly

E.g., Suppose N j = NN is being considered &

NN → building is one of the rules with probability 0.5

5,5

5,5

(5, 5) ( | , )

( | ) 0.5

NN

P building NN G P NN building G

β =

= → =

(34)

Induction Step: Assuming Grammar in Chomsky Normal Form

Induction step :

w

N

j

N

r

N

s

w w w

1

,

( , ) ( | , )

( ) *

( , ) * ( 1, )

j

j pq pq

q

j r s

r s d p r

p q P w N G P N N N

p d d q β

β β

=

=

= →

+

∑ ∑

w

p

w

d

w

d+1

w

q

β

s

( d + 1, ) q

Consider different splits of the words - indicated by d E.g., the huge building

Split here for d=2 d=3

(35)

The Bottom-Up Approach

The idea of induction

Consider “the gunman”

Base cases : Apply unary rules DT → the Prob = 1.0

The gunman

NP

0.5

DT

1.0

NN

0.5

DT → the Prob = 1.0 NN → gunman Prob = 0.5

Induction : Prob that a NP covers these 2 words

= P (NP → DT NN) * P (DT deriving the word

“the”) * P (NN deriving the word “gunman”)

= 0.5 * 1.0 * 0.5 = 0.25

The gunman

(36)

Parse Triangle

1 1

( | ) ( | ) ( | , ) (1, )

P w G = P Nw G = P w N G = β m

• A parse triangle is constructed for calculating β

j

(p,q)

• Probability of a sentence using β

j

(p,q):

1 1

1 1 1 1 1

(

m

| ) (

m

| ) (

m

|

m

, ) (1, )

P w G = P Nw G = P w N G = β m

(37)

Parse Triangle

The (1)

gunman (2)

sprayed (3)

the (4)

building (5)

with (6)

bullets (7) 0

1 2

DT

1.0 β =

NN

0.5 β =

VBD

1.0 β =

from to

2 3 4 5 6

• Fill diagonals with β

j

( , ) k k

NN

0.5 β =

NNS

1.0 β =

VBD

1.0 β =

DT

1.0 β =

P

1.0

β =

(38)

Parse Triangle

The (1)

gunman (2)

sprayed (3)

the (4)

building (5)

with (6)

bullets (7) 1

2 3 4

DT

1.0 β =

NN

0.5 β =

VBD

1.0 β =

DT

1.0 β =

NP

0.25 β =

4 5 6 7

• Calculate using induction formula

NN

0.5 β =

NNS

1.0 β =

DT

1.0 β =

P

1.0

β =

(39)

Example Parse t 1

S

1.0

NP

0.5

VP

0.6

DT NN

0.5

VP PP

Rule used here is

VP → VP PP

The gunman sprayed the building with bullets.

DT

1.0

NN

0.5

VBD

1.0

NP

0.5

PP

1.0

DT

1.0

NN

0.5

P

1.0

NP

0.3

NNS

1.0

bullet s

with building

th e The gunman

sprayed

VP

0.4

(40)

Another Parse t 2

S

1.0

NP

0.5

VP

0.4

DT

1.0

NN

0.5

VBD

1.0

NP

0.2

Rule used here is

VP → VBD NP

The gunman sprayed the building with bullets.

DT

1.0

NN

0.5

VBD

1.0

NP

0.5

PP

1.0

DT

1.0

NN

0.5

P

1.0

NP

0.3

The gunmansprayed

NP

0.2

(41)

Parse Triangle

The (1) gunman (2)

sprayed (3)

the (4)

building (5)

with (6)

bullets (7) 1

2 3 4

DT

1.0 β =

NN

0.5 β =

VBD

1.0 β =

DT

1.0 β =

NP

0.25 β =

NP

0.25 β =

VP

1.0 β =

0.015 β

NP

=

0.186 β

VP

=

0.0465 β

S

=

5 6 7

NN

0.5 β =

NNS

1.0 β =

P

1.0 β =

(3, 7) (sprayed the building with bullets |

3,7

, )

( ) * (3, 5) * (6, 7)

( ) * (3, 3) * (4, 7)

0.6 *1.0 * 0.3 0.4 *1.0 * 0.015 0.186

VP

VP PP

VBD NP

P VP G

P VP VP PP P VP VBD NP

β

β β

β β

=

= →

+ →

= + =

PP

0.3

β =

(42)

Different Parses

Consider

Different splitting points : E.g., 5th and 3

rd

position E.g., 5th and 3

rd

position

Using different rules for VP expansion : E.g., VP → VP PP, VP → VBD NP

Different parses for the VP “sprayed the

(43)

The Viterbi-like Algorithm for PCFGs

i

( , ) highest inside probability parse of N

pq

i

p q

δ =

• Very similar to calculation of inside probabilities β

ii

(p,q)

• Instead of summing over all ways of constructing the parse for w

pq

– Choose only the best way (the maximum probability one!)

(44)

Calculation of δ i (p,q)

(3, 7) (sprayed the building with bullets |

3,7

, )

max{ ( ) * (3, 5) * (6, 7),

( ) * (3, 3) * (4, 7)}

VP

VP PP

VBD NP

P VP G

P VP VP PP P VP VBD NP δ

δ δ

δ δ

=

= →

This rule is →

= max{0.6 *1.0 * 0.3, 0.4 *1.0 * 0.015} = 0.18

This rule is chosen

VP

0.4

VP PP

VP

0.4

VBD NP

(45)

Viterbi-like Algorithm

Base case:

Induction :

ψ

i

(p,q) stores

RHS of the rule selected

Position of splitting

( , ) ( , )

i

k k

i

k k

δ = β

Example : ψ

VP

(3,7) stores VP, PP and split position = 5 because VP → VP PP is the

rule used.

Backtracing : Start from ψ 1 (1,7)

and δ 1 (1,7) and backtrace.

(46)

Example

ψ

1

(1,7) records S → NP VP & split position as 2

S

NP VP

ψ

NP

(1,2) records NP → DT NN & split position as 1

ψ

VP

(3,7) records VP → VP PP & split position as 5

The gunman sprayed the building with bullets

1 2 3 4 5 6 7

(47)

Example

S

NP VP

DT NN VP PP

The gunman sprayed the building with bullets

1 2 3 4 5 6 7

DT NN VP PP

(48)

Grammar Induction

Annotated corpora like Penn Treebank

Counts used as follows:

Sample training data:

# NP DT NN is used P(NP DT NN) =

# An NP rule is used

→ → 2

= 5

Sample training data:

NP

DT NN

NP

DT NNS

NP

NNS

NP

DT NN

NP

PRP

(49)

Grammar Induction for Unannotated Corpora: EM algorithm

Start with initial estimates for rule probabilities

Compute probability of each parse of a sentence according to current estimates of rule probabilities

current estimates of rule probabilities

Compute expectation of how often a rule is used (summing probabilities of rules used in previous step)

Refine rule probabilities so that training corpus likelihood increases

EXPECTATION PHASE

MAXIMIZATION PHASE

(50)

Outside Probabilities α j (p,q)

Base case:

Inductive step for calculating :

1

(1, ) 1 for start symbol (1, ) 0 for 1

j

m

m j

α α

=

= ≠

( , )

f

p e

α

N

1

( , )

j p q

α

N

fpe

N

jpq

N

g

( , )

f

p e

α

( 1, )

g

q e

β +

(

f j g

)

P NN N

(51)

Probability of a Sentence

• Joint probability of a sentence w

1m

and that there is a constituent spanning words w

p

to w

q

is given as:

1 1

(

m

,

pq

| ) (

m

|

pqj

, )

j

( , )

j

( , )

j j

P w N G = ∑ P w N G = ∑ α p q β p q

N

1

(The gunman....bullets,

4,5

| )

(The gunman...bullets |

j

, )

P N G

P N G

= ∑

The gunman sprayed the building with bullets

1 2 3 4 5 6 7

N

1

NP

(The gunman...bullets |

4,5

, ) (4, 5) (4, 5)

(4, 5) (4, 5) ...

j j

NP NP

VP VP

P N G

α β

α β

=

=

+ +

References

Related documents

„ Is there a parsing algorithm for arbitrary CFGs that combines dynamic programming and top-down control. „ Yes:

Now we can compute the probability of sentences like I want English food or I want Chinese food by simply multiplying the appropriate bigram probabilities to- gether, as follows:.

However, the previous algorithm produced a parse DAG, because it maintains a set of formulas as node of the parse tree.... Let sub(F ) denote the set of subformulas

However, the previous algorithm produced a parse DAG, because it maintains a set of formulas as node of the parse tree.... Let sub(F ) denote the set of subformulas

Top down parsing gave us the Verb Attachment Parse Tree (i.e., I used a

„ Top down parsing gave us the Verb Attachment Parse Tree (i.e., I used a. Attachment Parse Tree (i.e., I used a

Top down parsing gave us the Verb Attachment Parse Tree (i.e., I used a

The Macroeconomic Policy and Financing for Development Division of ESCAP is undertaking an evaluation of this publication, A Review of Access to Finance by Micro, Small and Medium