
(1)

CS621: Artificial Intelligence

Pushpak Bhattacharyya

CSE Dept., IIT Bombay

Lecture 31,32– Hidden Markov Model

5th Oct, 12th Oct, 2010

(lecture 30 was on brain computation by Prof. Rohit Manchanda)

(2)

Observations leading to why probability is needed

Many intelligence tasks are sequence labeling tasks

Tasks carried out in layers

Within a layer, there are limited windows of information

This naturally calls for strategies for dealing with uncertainty

Probability and Markov processes give a way

(3)

“I went with my friend to the bank to withdraw some money, but was disappointed to find it closed”

Issues raised by this sentence:

POS: bank (N/V), closed (V/Adj)
Sense: bank (financial institution), withdraw (take away)
Pronoun drop: but (I/friend/money/bank?) was disappointed
Scope: "with my friend" (what does it attach to?)
Co-referencing: it -> bank

(4)

HMM

(5)

A Motivating Example: Colored Ball Choosing

Urn 1: # of Red = 30, # of Green = 50, # of Blue = 20
Urn 2: # of Red = 10, # of Green = 40, # of Blue = 50
Urn 3: # of Red = 60, # of Green = 10, # of Blue = 30

Probability of transition to another urn after picking a ball:

     U1   U2   U3
U1   0.1  0.4  0.5
U2   0.6  0.2  0.2
U3   0.3  0.4  0.3
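The urn model on this slide translates directly into transition and emission tables. A minimal sketch in Python (the function name `sample_sequence`, the choice of starting urn, and the use of `random.choices` are my own, not from the slides):

```python
import random

# Transition probabilities between urns (each row sums to 1).
A = {
    "U1": {"U1": 0.1, "U2": 0.4, "U3": 0.5},
    "U2": {"U1": 0.6, "U2": 0.2, "U3": 0.2},
    "U3": {"U1": 0.3, "U2": 0.4, "U3": 0.3},
}

# Ball-colour probabilities per urn, derived from the ball counts
# (e.g. Urn 1: 30 red out of 100 balls = 0.3).
B = {
    "U1": {"R": 0.3, "G": 0.5, "B": 0.2},
    "U2": {"R": 0.1, "G": 0.4, "B": 0.5},
    "U3": {"R": 0.6, "G": 0.1, "B": 0.3},
}

def sample_sequence(start, length, rng):
    """Generate (states, observations) by drawing a ball, then moving urns."""
    state, states, obs = start, [], []
    for _ in range(length):
        states.append(state)
        colours, weights = zip(*B[state].items())
        obs.append(rng.choices(colours, weights)[0])
        nxt, weights = zip(*A[state].items())
        state = rng.choices(nxt, weights)[0]
    return states, obs
```

Each row of A and B sums to 1, which is worth asserting before building anything on top of the model.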

(6)

Example (contd.)

Given the transition probabilities:

     U1   U2   U3
U1   0.1  0.4  0.5
U2   0.6  0.2  0.2
U3   0.3  0.4  0.3

and the observation probabilities:

     R    G    B
U1   0.3  0.5  0.2
U2   0.1  0.4  0.5
U3   0.6  0.1  0.3

Observation: RRGGBRGR
State Sequence: ??

Not so easily computable.
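"Not so easily computable" can be made concrete: a brute-force search must score every one of 3^8 = 6561 state sequences for these 8 observations. A sketch, assuming (since the slide does not give one) a uniform initial distribution over urns:

```python
from itertools import product

# Transition and observation tables from the example.
A = {
    "U1": {"U1": 0.1, "U2": 0.4, "U3": 0.5},
    "U2": {"U1": 0.6, "U2": 0.2, "U3": 0.2},
    "U3": {"U1": 0.3, "U2": 0.4, "U3": 0.3},
}
B = {
    "U1": {"R": 0.3, "G": 0.5, "B": 0.2},
    "U2": {"R": 0.1, "G": 0.4, "B": 0.5},
    "U3": {"R": 0.6, "G": 0.1, "B": 0.3},
}
pi = {"U1": 1/3, "U2": 1/3, "U3": 1/3}  # assumed uniform start, not from the slide

def best_sequence(obs):
    """Score all |S|^n state sequences and keep the most probable one."""
    best, best_p = None, -1.0
    for seq in product(A, repeat=len(obs)):
        p = pi[seq[0]] * B[seq[0]][obs[0]]
        for prev, cur, o in zip(seq, seq[1:], obs[1:]):
            p *= A[prev][cur] * B[cur][o]
        if p > best_p:
            best, best_p = seq, p
    return best, best_p

seq, p = best_sequence("RRGGBRGR")  # enumerates all 6561 candidates
```

The exponential blow-up in sequence length is exactly what the Viterbi-style dynamic programming later in the lecture avoids.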

(7)

Diagrammatic representation (1/2)

[State-transition diagram over U1, U2, U3. Each arc carries a transition probability (e.g., 0.1, 0.4, 0.5 out of U1) and each state carries its emission probabilities: U1: R 0.3, G 0.5, B 0.2; U2: R 0.1, G 0.4, B 0.5; U3: R 0.6, G 0.1, B 0.3.]

(8)

Diagrammatic representation (2/2)

[The same three-state machine with transition and emission probabilities merged: each arc from Ui to Uj is labeled, for each colour c, with the product P(c|Ui).P(Uj|Ui), e.g., the U2 to U1 arc carries R 0.06, G 0.24, B 0.30.]

(9)

Example (contd.)

Here:

S = {U1, U2, U3}
V = {R, G, B}

For observation sequence O = {o1 … on} and state sequence Q = {q1 … qn}:

A = the transition matrix:

     U1   U2   U3
U1   0.1  0.4  0.5
U2   0.6  0.2  0.2
U3   0.3  0.4  0.3

B = the observation matrix:

     R    G    B
U1   0.3  0.5  0.2
U2   0.1  0.4  0.5
U3   0.6  0.1  0.3

π is the initial state distribution: πi = P(q1 = Ui)

(10)

Observations and states

       O1 O2 O3 O4 O5 O6 O7 O8
OBS:    R  R  G  G  B  R  G  R
State: S1 S2 S3 S4 S5 S6 S7 S8

Si = U1/U2/U3, a particular state
S: State sequence
O: Observation sequence

S* = “best” possible state (urn) sequence

Goal: choose the "best" S, i.e., S* = argmax_S P(S|O)

(11)

Goal

Maximize P(S|O), where S is the State Sequence and O is the Observation Sequence.

S* = argmax_S P(S|O)

(12)

False Start

       O1 O2 O3 O4 O5 O6 O7 O8
OBS:    R  R  G  G  B  R  G  R
State: S1 S2 S3 S4 S5 S6 S7 S8

P(S|O) = P(S1-8|O1-8)
       = P(S1|O).P(S2|S1,O).P(S3|S1-2,O)...P(S8|S1-7,O)

By the Markov assumption (a state depends only on the previous state):

P(S|O) = P(S1|O).P(S2|S1,O).P(S3|S2,O)...P(S8|S7,O)

(A false start: the factors P(Sk|Sk-1,O) still condition on the whole observation sequence and are not directly available from the model; Bayes' theorem gives a usable decomposition.)

(13)

Bayes' Theorem

P(A|B) = P(A).P(B|A) / P(B)

P(A): Prior
P(B|A): Likelihood

argmax_S P(S|O) = argmax_S P(S).P(O|S)

(P(O) does not depend on S, so it drops out of the argmax.)

(14)

State Transition Probability

P(S) = P(S1-8)
     = P(S1).P(S2|S1).P(S3|S1-2).P(S4|S1-3)...P(S8|S1-7)

By the Markov assumption (k=1):

P(S) = P(S1).P(S2|S1).P(S3|S2).P(S4|S3)...P(S8|S7)

(15)

Observation Sequence probability

P(O|S) = P(O1|S1-8).P(O2|O1,S1-8).P(O3|O1-2,S1-8)...P(O8|O1-7,S1-8)

Assumption: the ball drawn depends only on the urn chosen:

P(O|S) = P(O1|S1).P(O2|S2).P(O3|S3)...P(O8|S8)

Hence:

P(S).P(O|S)
= P(S1).P(S2|S1).P(S3|S2)...P(S8|S7)
  . P(O1|S1).P(O2|S2).P(O3|S3)...P(O8|S8)
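Under the two assumptions, P(S).P(O|S) for a candidate state sequence is just a product of table lookups. A small sketch (`joint_score` is my name for it; π is again assumed uniform, which the slides leave open):

```python
# Transition and observation tables from the urn example;
# pi is an assumed uniform initial distribution.
A = {
    "U1": {"U1": 0.1, "U2": 0.4, "U3": 0.5},
    "U2": {"U1": 0.6, "U2": 0.2, "U3": 0.2},
    "U3": {"U1": 0.3, "U2": 0.4, "U3": 0.3},
}
B = {
    "U1": {"R": 0.3, "G": 0.5, "B": 0.2},
    "U2": {"R": 0.1, "G": 0.4, "B": 0.5},
    "U3": {"R": 0.6, "G": 0.1, "B": 0.3},
}
pi = {"U1": 1/3, "U2": 1/3, "U3": 1/3}

def joint_score(states, obs):
    """P(S).P(O|S) = P(S1) . prod P(Sk|Sk-1) . prod P(Ok|Sk)."""
    p = pi[states[0]] * B[states[0]][obs[0]]
    for k in range(1, len(states)):
        p *= A[states[k - 1]][states[k]] * B[states[k]][obs[k]]
    return p
```

For instance, joint_score(["U1", "U2"], "RG") = (1/3) . 0.3 . 0.4 . 0.4 = 0.016.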

(16)

Grouping terms

P(S).P(O|S)
= [P(O0|S0).P(S1|S0)].
  [P(O1|S1).P(S2|S1)].
  [P(O2|S2).P(S3|S2)].
  [P(O3|S3).P(S4|S3)].
  [P(O4|S4).P(S5|S4)].
  [P(O5|S5).P(S6|S5)].
  [P(O6|S6).P(S7|S6)].
  [P(O7|S7).P(S8|S7)].
  [P(O8|S8).P(S9|S8)]

       O0 O1 O2 O3 O4 O5 O6 O7 O8
Obs:    ε  R  R  G  G  B  R  G  R
State: S0 S1 S2 S3 S4 S5 S6 S7 S8 S9

We introduce the states S0 and S9 as the initial and final states respectively. After S8 the next state is S9 with probability 1, i.e., P(S9|S8)=1; O0 is the ε-transition.

(17)

Introducing useful notation

       O0 O1 O2 O3 O4 O5 O6 O7 O8
Obs:    ε  R  R  G  G  B  R  G  R
State: S0 S1 S2 S3 S4 S5 S6 S7 S8 S9

[Diagram: the chain S0 -ε-> S1 -R-> S2 -R-> S3 -G-> S4 -G-> S5 -B-> S6 -R-> S7 -G-> S8 -R-> S9]

P(Ok|Sk).P(Sk+1|Sk) = P(Sk -Ok-> Sk+1)

i.e., the product of emitting Ok in Sk and then moving to Sk+1 is written as the probability of the arc from Sk to Sk+1 labeled Ok.

(18)

Probabilistic FSM

[Two-state probabilistic FSM over S1 and S2; each arc is labeled (symbol : probability):

S1 -> S1: (a1:0.1), (a2:0.2)
S1 -> S2: (a1:0.3), (a2:0.4)
S2 -> S1: (a1:0.2), (a2:0.3)
S2 -> S2: (a1:0.3), (a2:0.2)]

The question here is: "what is the most likely state sequence given the output sequence seen?"

(19)

Developing the tree

Start: P(S1) = 1.0, P(S2) = 0.0

On a1:
  S1 -> S1 (0.1): 1.0 x 0.1 = 0.1
  S1 -> S2 (0.3): 1.0 x 0.3 = 0.3
  S2 -> S1 (0.2): 0.0 x 0.2 = 0.0
  S2 -> S2 (0.3): 0.0 x 0.3 = 0.0

On a2:
  S1 -> S1 (0.2): 0.1 x 0.2 = 0.02
  S1 -> S2 (0.4): 0.1 x 0.4 = 0.04
  S2 -> S1 (0.3): 0.3 x 0.3 = 0.09
  S2 -> S2 (0.2): 0.3 x 0.2 = 0.06

Choose the winning sequence per state per iteration: after a1 the survivors are S1 (0.1) and S2 (0.3); after a2 they are S1 (0.09, via S2) and S2 (0.06, via S2).

(20)

Tree structure contd…

On a1 (third symbol), extending the survivors S1 (0.09) and S2 (0.06):
  S1 -> S1 (0.1): 0.09 x 0.1 = 0.009
  S1 -> S2 (0.3): 0.09 x 0.3 = 0.027
  S2 -> S1 (0.2): 0.06 x 0.2 = 0.012
  S2 -> S2 (0.3): 0.06 x 0.3 = 0.018
Survivors: S1 (0.012, via S2), S2 (0.027, via S1)

On a2 (fourth symbol):
  S1 -> S1 (0.2): 0.012 x 0.2 = 0.0024
  S1 -> S2 (0.4): 0.012 x 0.4 = 0.0048
  S2 -> S1 (0.3): 0.027 x 0.3 = 0.0081
  S2 -> S2 (0.2): 0.027 x 0.2 = 0.0054
Survivors: S1 (0.0081, via S2), S2 (0.0054, via S2)

The problem being addressed by this tree is S* = argmax_S P(S | a1 a2 a1 a2, µ)

a1-a2-a1-a2 is the output sequence and µ the model or the machine.
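The "winner per state per iteration" bookkeeping in this tree is exactly the Viterbi algorithm. A minimal sketch for this FSM (the function name and the `trans` dict layout are mine; the arc probabilities are read off the tree):

```python
def viterbi(obs, states, start, trans):
    """trans[s][sym][s2] = P(move to s2 emitting sym | currently in s)."""
    V = dict(start)   # best probability of reaching each state so far
    back = []         # backpointers, one dict per symbol consumed
    for sym in obs:
        new_v, ptr = {}, {}
        for s2 in states:
            prev = max(states, key=lambda s: V[s] * trans[s][sym][s2])
            new_v[s2] = V[prev] * trans[prev][sym][s2]
            ptr[s2] = prev
        V = new_v
        back.append(ptr)
    last = max(states, key=lambda s: V[s])
    path = [last]
    for ptr in reversed(back):   # walk the backpointers to recover the path
        path.append(ptr[path[-1]])
    return path[::-1], V[last]

trans = {
    "S1": {"a1": {"S1": 0.1, "S2": 0.3}, "a2": {"S1": 0.2, "S2": 0.4}},
    "S2": {"a1": {"S1": 0.2, "S2": 0.3}, "a2": {"S1": 0.3, "S2": 0.2}},
}
start = {"S1": 1.0, "S2": 0.0}
path, p = viterbi(["a1", "a2", "a1", "a2"], ["S1", "S2"], start, trans)
# path recovers S1 S2 S1 S2 S1 with probability 0.0081, the tree's winning value
```

Keeping only one winner per state per step is what cuts the search from exponential to linear in the sequence length.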

(21)

Path found (working backward): S1 -> S2 -> S1 -> S2 -> S1, emitting a1, a2, a1, a2.

Problem statement: find the best possible sequence

S* = argmax_S P(S | O, µ)

where S: State sequence, O: Output sequence, µ: Model or Machine

Model or Machine = {S0, S, A, T}
S0: start state; S: state collection; A: alphabet set; T: transitions

T is defined as P(Si -ak-> Sj) for all i, j, k
