(1)

CS621: Artificial Intelligence

Pushpak Bhattacharyya

CSE Dept., IIT Bombay

Lecture 35: HMM; Forward and Backward Probabilities

19th Oct, 2010

(2)

HMM Definition

Set of states: S, where |S| = N

Start state: S_0 /* P(S_0) = 1 */

Output alphabet: O, where |O| = M

Transition probabilities: A = {a_ij} /* from state i to state j */

Emission probabilities: B = {b_j(o_k)} /* prob. of emitting or absorbing o_k from state j */

Initial state probabilities: Π = {p_1, p_2, p_3, …, p_N}

Each p_i = P(o_0 = ε, S_i | S_0)
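To make the definition concrete, here is a minimal Python sketch of these components, filled in with the urn model used later in this lecture. The names (states, alphabet, trans_p, emit_p, init_p) are illustrative, and the initial distribution is an assumption read off the ε-arcs of the Viterbi slide, not something the definition fixes.

# Minimal sketch of the HMM components defined above.
# Numbers are the urn model introduced later in the lecture.

states = ["U1", "U2", "U3"]      # S, with |S| = N = 3
alphabet = ["R", "G", "B"]       # O, with |O| = M = 3

# A = {a_ij}: P(next state j | current state i)
trans_p = {"U1": {"U1": 0.1, "U2": 0.4, "U3": 0.5},
           "U2": {"U1": 0.6, "U2": 0.2, "U3": 0.2},
           "U3": {"U1": 0.3, "U2": 0.4, "U3": 0.3}}

# B = {b_j(o_k)}: P(emitting o_k | state j)
emit_p = {"U1": {"R": 0.3, "G": 0.5, "B": 0.2},
          "U2": {"R": 0.1, "G": 0.4, "B": 0.5},
          "U3": {"R": 0.6, "G": 0.1, "B": 0.3}}

# Π = {p_i}: assumed here from the ε-arcs on the Viterbi slide
init_p = {"U1": 0.5, "U2": 0.3, "U3": 0.2}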

(3)

Three basic problems (contd.)

Problem 1: Likelihood of a sequence

Forward Procedure

Backward Procedure

Problem 2: Best state sequence

Viterbi Algorithm

Problem 3: Re-estimation

Baum-Welch (Forward-Backward Algorithm)

(4)

Forward and Backward Probability Calculation

(5)

Forward probability F(k,i)

Define F(k,i) = probability of being in state S_i having seen o_0 o_1 o_2 … o_k

F(k,i) = P(o_0 o_1 o_2 … o_k, S_i)

With m as the length of the observed sequence:

P(observed sequence) = P(o_0 o_1 o_2 … o_m)
                     = Σ_{p=0..N} P(o_0 o_1 o_2 … o_m, S_p)
                     = Σ_{p=0..N} F(m, p)

(6)

Forward probability (contd.)

F(k, q)
= P(o_0 o_1 o_2 … o_k, S_q)
= P(o_0 o_1 o_2 … o_{k-1}, o_k, S_q)
= Σ_{p=0..N} P(o_0 o_1 o_2 … o_{k-1}, S_p, o_k, S_q)
= Σ_{p=0..N} P(o_0 o_1 o_2 … o_{k-1}, S_p) · P(o_k, S_q | o_0 o_1 o_2 … o_{k-1}, S_p)
= Σ_{p=0..N} F(k-1, p) · P(o_k, S_q | S_p)        /* Markov assumption drops the history */
= Σ_{p=0..N} F(k-1, p) · P(S_p --o_k--> S_q)

[Trellis figure: observations O_0 O_1 O_2 O_3 … O_k O_{k+1} … O_{m-1} O_m aligned above states S_0 S_1 S_2 S_3 … S_p S_q … S_m S_final]
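As a hedged sketch of this recursion in Python, reusing the illustrative dictionaries from the HMM-definition slide: forward() fills the F(k, q) table left to right. One convention choice is assumed rather than dictated by the slides: o_k is emitted by the state reached at step k, so the arc P(S_p --o_k--> S_q) factors as trans_p[p][q] * emit_p[q][o_k] (the combined-arc diagrams later in the deck attach the emission to the source state instead, which just shifts where the ε-step is folded in).

def forward(obs, states, init_p, trans_p, emit_p):
    # F[k][q] = P(o_0 o_1 ... o_k, S_q), per the definition above
    F = [{q: init_p[q] * emit_p[q][obs[0]] for q in states}]
    # Recursion: F(k, q) = sum_p F(k-1, p) * P(S_p --o_k--> S_q)
    for k in range(1, len(obs)):
        F.append({q: sum(F[k - 1][p] * trans_p[p][q] for p in states)
                     * emit_p[q][obs[k]]
                  for q in states})
    return F

# P(observed sequence) = sum_p F(m, p), as on the previous slide:
# sum(forward("RRGGBRGR", states, init_p, trans_p, emit_p)[-1].values())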

(7)

Backward probability B(k,i)

Define B(k,i) = probability of seeing o_k o_{k+1} o_{k+2} … o_m given that the state was S_i

B(k,i) = P(o_k o_{k+1} o_{k+2} … o_m | S_i)

With m as the length of the observed sequence:

P(observed sequence) = P(o_0 o_1 o_2 … o_m)
                     = P(o_0 o_1 o_2 … o_m | S_0)
                     = B(0, 0)

(8)

Backward probability (contd.)

B(k, p)
= P(o_k o_{k+1} o_{k+2} … o_m | S_p)
= P(o_{k+1} o_{k+2} … o_m, o_k | S_p)
= Σ_{q=0..N} P(o_{k+1} o_{k+2} … o_m, o_k, S_q | S_p)
= Σ_{q=0..N} P(o_k, S_q | S_p) · P(o_{k+1} o_{k+2} … o_m | o_k, S_q, S_p)
= Σ_{q=0..N} P(o_{k+1} o_{k+2} … o_m | S_q) · P(o_k, S_q | S_p)
= Σ_{q=0..N} B(k+1, q) · P(S_p --o_k--> S_q)

[Trellis figure: observations O_0 O_1 O_2 O_3 … O_k O_{k+1} … O_{m-1} O_m aligned above states S_0 S_1 S_2 S_3 … S_p S_q … S_m S_final]
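The mirror-image sketch for B(k, p), under the same assumed conventions as the forward sketch. With an initial distribution standing in for the explicit start state S_0, the likelihood identity B(0,0) = P(observed sequence) becomes the weighted sum in the last comment.

def backward(obs, states, trans_p, emit_p):
    # B[k][p] = P(o_k o_{k+1} ... o_m | S_p), per the definition above
    m = len(obs)
    B = [{} for _ in range(m + 1)]
    B[m] = {p: 1.0 for p in states}          # empty suffix has probability 1
    # Recursion: B(k, p) = sum_q P(S_p --o_k--> S_q) * B(k+1, q)
    for k in range(m - 1, -1, -1):
        for p in states:
            B[k][p] = sum(trans_p[p][q] * emit_p[q][obs[k]] * B[k + 1][q]
                          for q in states)
    return B

# Consistency check against the forward procedure:
# sum_p init_p[p] * emit_p[p][obs[0]] * B[1][p]  should equal  sum_p F(m, p)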

(9)

Continuing with the Urn example

Colored ball choosing:

Urn 1: # of Red = 30, # of Green = 50, # of Blue = 20
Urn 2: # of Red = 10, # of Green = 40, # of Blue = 50
Urn 3: # of Red = 60, # of Green = 10, # of Blue = 30

(10)

Example (contd.)

Given:

Transition probability:

      U1   U2   U3
U1   0.1  0.4  0.5
U2   0.6  0.2  0.2
U3   0.3  0.4  0.3

Observation/output probability:

      R    G    B
U1   0.3  0.5  0.2
U2   0.1  0.4  0.5
U3   0.6  0.1  0.3

Observation: RRGGBRGR

What is the corresponding state sequence?
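These two tables are exactly the trans_p and emit_p dictionaries sketched after the HMM definition, so the likelihood half of the task (Problem 1) is a direct call to the forward sketch above; the state-sequence question is Problem 2, taken up by Viterbi on the following slides. Illustrative usage, assuming the same dictionaries:

# Problem 1 for the urn HMM: likelihood of the observation sequence
F = forward("RRGGBRGR", states, init_p, trans_p, emit_p)
print(sum(F[-1].values()))        # P(RRGGBRGR)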

(11)

Diagrammatic representation (1/2)

[State diagram over U1, U2, U3: each arc carries its transition probability from the table above (e.g. U1→U1: 0.1, U2→U1: 0.6, U3→U1: 0.3, …) and each state carries its emission probabilities (U1: R 0.3, G 0.5, B 0.2; U2: R 0.1, G 0.4, B 0.5; U3: R 0.6, G 0.1, B 0.3).]

(12)

Diagrammatic representation (2/2)

[Same diagram with the emission and transition probabilities multiplied onto each arc, e.g. the U1→U3 arc carries R: 0.15, G: 0.25, B: 0.10 (= 0.5 × U1's emission probabilities); the U2→U1 arc carries R: 0.06, G: 0.24, B: 0.30; the U3→U2 arc carries R: 0.24, G: 0.04, B: 0.12; and so on.]
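The multiplied arc labels are exactly the combined notation P(S_i --o--> S_j) = P(o|S_i) · P(S_j|S_i) introduced a few slides later, with the emission taken from the source urn as this diagram does; a one-line dictionary comprehension reproduces the figure's numbers:

# Combined emission+transition weight on each arc of this diagram:
# P(Ui --o--> Uj) = P(o | Ui) * P(Uj | Ui)
arc_p = {(i, o, j): emit_p[i][o] * trans_p[i][j]
         for i in states for o in alphabet for j in states}

print(arc_p[("U1", "R", "U3")])   # 0.3 * 0.5 = 0.15, as on the U1->U3 arc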

(13)

Observations and states

       O_1  O_2  O_3  O_4  O_5  O_6  O_7  O_8
OBS:   R    R    G    G    B    R    G    R
State: S_1  S_2  S_3  S_4  S_5  S_6  S_7  S_8

S_i = U_1 / U_2 / U_3, a particular state

S: state sequence
O: observation sequence
S* = "best" possible state (urn) sequence

Goal: Maximize P(S*|O) by choosing the "best" S

(14)

Grouping terms

We introduce the states S_0 and S_9 as initial and final states respectively.
After S_8 the next state is S_9 with probability 1, i.e., P(S_9|S_8) = 1.
O_0 is the ε-transition.

       O_0  O_1  O_2  O_3  O_4  O_5  O_6  O_7  O_8
Obs:   ε    R    R    G    G    B    R    G    R
State: S_0  S_1  S_2  S_3  S_4  S_5  S_6  S_7  S_8  S_9

P(S) · P(O|S)
= [P(O_0|S_0) · P(S_1|S_0)] · [P(O_1|S_1) · P(S_2|S_1)] · [P(O_2|S_2) · P(S_3|S_2)] ·
  [P(O_3|S_3) · P(S_4|S_3)] · [P(O_4|S_4) · P(S_5|S_4)] · [P(O_5|S_5) · P(S_6|S_5)] ·
  [P(O_6|S_6) · P(S_7|S_6)] · [P(O_7|S_7) · P(S_8|S_7)] · [P(O_8|S_8) · P(S_9|S_8)]
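A sketch of this factorization as code: the joint probability of one candidate urn sequence with the observations is the product of the bracketed per-step terms (the ε-step folds into the assumed init_p, and the final P(S_9|S_8) = 1 drops out; joint_p and state_seq are illustrative names).

def joint_p(state_seq, obs, init_p, trans_p, emit_p):
    # P(S) * P(O|S) as a product of per-step emission and transition terms
    p = init_p[state_seq[0]] * emit_p[state_seq[0]][obs[0]]
    for k in range(1, len(obs)):
        p *= trans_p[state_seq[k - 1]][state_seq[k]] * emit_p[state_seq[k]][obs[k]]
    return p

# e.g. joint_p(["U3", "U1", "U1", "U2", "U2", "U3", "U1", "U3"],
#              "RRGGBRGR", init_p, trans_p, emit_p)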

(15)

Introducing useful notation

       O_0  O_1  O_2  O_3  O_4  O_5  O_6  O_7  O_8
Obs:   ε    R    R    G    G    B    R    G    R
State: S_0  S_1  S_2  S_3  S_4  S_5  S_6  S_7  S_8  S_9

[Figure: the same alignment drawn as a chain S_0 → S_1 → S_2 → … → S_8 → S_9, each arrow labelled with the symbol (ε, R, R, G, G, B, R, G, R) emitted on that step.]

P(O_k|S_k) · P(S_{k+1}|S_k) = P(S_k --O_k--> S_{k+1})

(16)

Viterbi Algorithm for the Urn problem (first two symbols)

[Tree: from the start state S_0, ε-arcs with probabilities 0.5, 0.3, 0.2 lead to U1, U2, U3; each node then branches to U1, U2, U3 on the first R and again on the second R, using the combined arc probabilities of the earlier diagram. Scores after the second R: 0.015, 0.04, 0.075*, 0.018, 0.006, 0.006, 0.048*, 0.036.]

*: winner sequences
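The tree is unrolling the Viterbi recursion: per state and per input symbol, keep only the best-scoring predecessor (the starred winners) plus a backpointer for recovery. A hedged Python sketch, under the same assumed dictionaries and emission convention as the forward sketch:

def viterbi(obs, states, init_p, trans_p, emit_p):
    # Problem 2: argmax_S P(S, O), with backpointers for recovery
    V = [{q: init_p[q] * emit_p[q][obs[0]] for q in states}]
    back = [{}]
    for k in range(1, len(obs)):
        V.append({}); back.append({})
        for q in states:
            # choose the winning sequence per state per iteration
            p_best, p_from = max(
                (V[k - 1][p] * trans_p[p][q] * emit_p[q][obs[k]], p)
                for p in states)
            V[k][q], back[k][q] = p_best, p_from
    last = max(V[-1], key=V[-1].get)       # final winner
    path = [last]
    for k in range(len(obs) - 1, 0, -1):   # work backward
        path.append(back[k][path[-1]])
    return list(reversed(path)), V[-1][last]

# path, score = viterbi("RRGGBRGR", states, init_p, trans_p, emit_p)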

(17)

Probabilistic FSM

[Two-state probabilistic FSM over S1 and S2, each arc labelled (symbol: probability): S1→S1 (a1: 0.1), (a2: 0.2); S1→S2 (a1: 0.3), (a2: 0.4); S2→S1 (a1: 0.2), (a2: 0.3); S2→S2 (a1: 0.3), (a2: 0.2).]

The question here is: "what is the most likely state sequence given the output sequence seen?"

(18)

Developing the tree

Start: P(S1) = 1.0, P(S2) = 0.0.

Reading a1: arcs from S1 score 0.1 (to S1) and 0.3 (to S2); the arcs from S2 (0.2 and 0.3) carry probability 0.0.

  S1: 1.0 × 0.1 = 0.1     S2: 1.0 × 0.3 = 0.3

Reading a2: arcs 0.2 and 0.4 from S1, 0.3 and 0.2 from S2.

  S1: 0.1 × 0.2 = 0.02 or 0.3 × 0.3 = 0.09     S2: 0.1 × 0.4 = 0.04 or 0.3 × 0.2 = 0.06

Choose the winning sequence per state per iteration.

(19)

Tree structure contd…

Reading a1 again, from the winners 0.09 (at S1) and 0.06 (at S2):

  S1: 0.09 × 0.1 = 0.009 or 0.06 × 0.2 = 0.012     S2: 0.09 × 0.3 = 0.027 or 0.06 × 0.3 = 0.018

Reading the final a2, from the winners 0.012 (at S1) and 0.027 (at S2):

  S1: 0.012 × 0.2 = 0.0024 or 0.027 × 0.3 = 0.0081     S2: 0.012 × 0.4 = 0.0048 or 0.027 × 0.2 = 0.0054

The problem being addressed by this tree is

  S* = argmax_S P(S | a1-a2-a1-a2, µ)

where a1-a2-a1-a2 is the output sequence and µ the model or the machine.

(20)

Path found (working backward):

S1 --a1--> S2 --a2--> S1 --a1--> S2 --a2--> S1

Problem statement: Find the best possible sequence

  S* = argmax_S P(S | O, µ)

where S: state sequence, O: output sequence, µ: model or machine.

Model or machine = {S_0, S, A, T}

  S_0: start symbol    S: state collection    A: alphabet set    T: transitions

T is defined as P(S_i --a_k--> S_j)  ∀ i, j, k

(21)

Tabular representation of the tree

Ending    Latest symbol observed
state     ε     a1                                a2             a1                a2
S1        1.0   (1.0×0.1, 0.0×0.2) = (0.1, 0.0)   (0.02, 0.09)   (0.009, 0.012)    (0.0024, 0.0081)
S2        0.0   (1.0×0.3, 0.0×0.3) = (0.3, 0.0)   (0.04, 0.06)   (0.027, 0.018)    (0.0048, 0.0054)

Note: every cell records the two candidate probabilities for sequences ending in that state (1st tuple element: via predecessor S1; 2nd: via predecessor S2), and the bold-faced value in each cell is the winner. The final winner is 0.0081, ending in state S1; since it is the 2nd element of its tuple it came via S2, and going backward cell by cell we recover the sequence.
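Running the same recursion over this PFSM (arc probabilities P(S_i --a--> S_j) read off the tree slides) reproduces every cell of the table and the backward recovery. A self-contained sketch:

# Combined arc probabilities P(Si --a--> Sj), read off the tree slides
arc = {"a1": {"S1": {"S1": 0.1, "S2": 0.3}, "S2": {"S1": 0.2, "S2": 0.3}},
       "a2": {"S1": {"S1": 0.2, "S2": 0.4}, "S2": {"S1": 0.3, "S2": 0.2}}}

score = {"S1": 1.0, "S2": 0.0}           # the ε column: start in S1
back = []
for sym in ["a1", "a2", "a1", "a2"]:
    step = {}
    for j in score:                       # each cell: best predecessor wins
        p, i = max((score[i] * arc[sym][i][j], i) for i in score)
        step[j] = (p, i)
    back.append({j: step[j][1] for j in step})
    score = {j: step[j][0] for j in step}

state = max(score, key=score.get)         # final winner: S1 with 0.0081
path = [state]
for bp in reversed(back):                 # work backward through pointers
    state = bp[state]
    path.append(state)
print(score[path[0]], list(reversed(path)))   # 0.0081 ['S1','S2','S1','S2','S1']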
