**CS 208: Automata Theory and Logic**

**Lecture 5: Pumping Lemma and Myhill-Nerode Theorem**
Ashutosh Trivedi

A

start B

b

∀x(La(x)→ ∃y.(x<y)∧Lb(y))

a

b

a

**Ashutosh Trivedi – 2 of 15**

**Closure Properties of Regular Languages**

Operations that preserveregularityof languages:

– union, intersection, complement, difference

– concatenation and Kleene closure (star) – Reversal

– reversalwof a stringwis defined as: w=

(ε ifw=ε

ax ifw=xawherex∈Σ^{∗}anda∈Σ

– L={w : w∈L}.

– Swap initial and accepting states, and reverse the transitions, i.e.
δ(s,a) =s^{0}iffδ(s^{0},a) =s.

– Proof of correctness is via structural induction over regular expressions – Homomorphismandinverse-homomorphism

– String homomorphism is a functionh: Σ→Γ^{∗}
– Extended string homomorphismˆh: Σ^{∗}→Γ^{∗}

– ForL∈Σ^{∗}we defineh(L)⊆Γ^{∗}ash(L) ={ˆh(w) : w∈L}.
– ForL∈Γ^{∗}we defineh^{−1}(L)⊆Σ^{∗}ash^{−1}(L) ={w : ˆh(w)∈L}.

**Closure Properties of Regular Languages**

Operations that preserveregularityof languages:

– union, intersection, complement, difference – concatenation and Kleene closure (star)

– Reversal

– reversalwof a stringwis defined as: w=

(ε ifw=ε

ax ifw=xawherex∈Σ^{∗}anda∈Σ

– L={w : w∈L}.

– Swap initial and accepting states, and reverse the transitions, i.e.
δ(s,a) =s^{0}iffδ(s^{0},a) =s.

– Proof of correctness is via structural induction over regular expressions – Homomorphismandinverse-homomorphism

– String homomorphism is a functionh: Σ→Γ^{∗}
– Extended string homomorphismˆh: Σ^{∗}→Γ^{∗}

– ForL∈Σ^{∗}we defineh(L)⊆Γ^{∗}ash(L) ={ˆh(w) : w∈L}.
– ForL∈Γ^{∗}we defineh^{−1}(L)⊆Σ^{∗}ash^{−1}(L) ={w : ˆh(w)∈L}.

**Ashutosh Trivedi – 2 of 15**

**Closure Properties of Regular Languages**

Operations that preserveregularityof languages:

– union, intersection, complement, difference – concatenation and Kleene closure (star) – Reversal

– reversalwof a stringwis defined as:

w=

(ε ifw=ε

ax ifw=xawherex∈Σ^{∗}anda∈Σ

– L={w : w∈L}.

– Swap initial and accepting states, and reverse the transitions, i.e.

δ(s,a) =s^{0}iffδ(s^{0},a) =s.

– Proof of correctness is via structural induction over regular expressions

– Homomorphismandinverse-homomorphism
– String homomorphism is a functionh: Σ→Γ^{∗}
– Extended string homomorphismˆh: Σ^{∗}→Γ^{∗}

– ForL∈Σ^{∗}we defineh(L)⊆Γ^{∗}ash(L) ={ˆh(w) : w∈L}.
– ForL∈Γ^{∗}we defineh^{−1}(L)⊆Σ^{∗}ash^{−1}(L) ={w : ˆh(w)∈L}.

**Closure Properties of Regular Languages**

Operations that preserveregularityof languages:

– union, intersection, complement, difference – concatenation and Kleene closure (star) – Reversal

– reversalwof a stringwis defined as:

w=

(ε ifw=ε

ax ifw=xawherex∈Σ^{∗}anda∈Σ

– L={w : w∈L}.

– Swap initial and accepting states, and reverse the transitions, i.e.

δ(s,a) =s^{0}iffδ(s^{0},a) =s.

– Proof of correctness is via structural induction over regular expressions – Homomorphismandinverse-homomorphism

– String homomorphism is a functionh: Σ→Γ^{∗}
– Extended string homomorphismˆh: Σ^{∗}→Γ^{∗}

**Ashutosh Trivedi – 3 of 15**

**Closure under Homomorphism**

Example: Leth(0) =abandh(1) =εandL=10^{∗}1 thenh(L) = (ab)^{∗}.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

### Proof.

– Consider the regular expressionE(L)characterizingL, – Replace the alphabetsainE(L)by stringh(a)

– It is easy to see (by structural induction) that the corresponding expression is also a regular expression.

### Corollary

Regular languages are closed under projections (dropping of certain alphabets).

### Theorem (Closure under Substitution)

For a substitution h: Σ→REGEX(Γ)if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

**Closure under Homomorphism**

Example: Leth(0) =abandh(1) =εandL=10^{∗}1 thenh(L) = (ab)^{∗}.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

### Proof.

– Consider the regular expressionE(L)characterizingL, – Replace the alphabetsainE(L)by stringh(a)

– It is easy to see (by structural induction) that the corresponding expression is also a regular expression.

### Corollary

Regular languages are closed under projections (dropping of certain alphabets).

### Theorem (Closure under Substitution)

For a substitution h: Σ→REGEX(Γ)if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

**Ashutosh Trivedi – 3 of 15**

**Closure under Homomorphism**

Example: Leth(0) =abandh(1) =εandL=10^{∗}1 thenh(L) = (ab)^{∗}.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

### Proof.

– Consider the regular expressionE(L)characterizingL, – Replace the alphabetsainE(L)by stringh(a)

– It is easy to see (by structural induction) that the corresponding expression is also a regular expression.

### Corollary

Regular languages are closed under projections (dropping of certain alphabets).

### Theorem (Closure under Substitution)

For a substitution h: Σ→REGEX(Γ)if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

**Closure under Homomorphism**

Example: Leth(0) =abandh(1) =εandL=10^{∗}1 thenh(L) = (ab)^{∗}.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

### Proof.

– Consider the regular expressionE(L)characterizingL, – Replace the alphabetsainE(L)by stringh(a)

### Corollary

Regular languages are closed under projections (dropping of certain alphabets).

### Theorem (Closure under Substitution)

For a substitution h: Σ→REGEX(Γ)if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

**Ashutosh Trivedi – 3 of 15**

**Closure under Homomorphism**

Example: Leth(0) =abandh(1) =εandL=10^{∗}1 thenh(L) = (ab)^{∗}.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

### Proof.

– Consider the regular expressionE(L)characterizingL, – Replace the alphabetsainE(L)by stringh(a)

### Corollary

Regular languages are closed under projections (dropping of certain alphabets).

### Theorem (Closure under Substitution)

For a substitution h: Σ→REGEX(Γ)if L⊆Σ^{∗}is regular then so is h(L)⊆Γ^{∗}.

**Closure under Inverse-Homomorphism**

Example: Leth(0) =abandh(1) =εandL= (ab)^{∗}thenh^{−1}(L) = (0+1)∗.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Γ^{∗}is regular then so is h^{−1}(L)⊆Σ^{∗}.

### Proof.

– Consider the DFAA(L) = (S,Σ, δ,s0,F)characterizingL,
– The DFA corresponding toh^{−1}(L)is(S,Γ, γ,s0,F)such that

γ(s,a) = ˆδ(s,h(a)).

– Proof via induction on string size thatˆγ(s,w) = ˆδ(s,h(w)).

**Ashutosh Trivedi – 4 of 15**

**Closure under Inverse-Homomorphism**

Example: Leth(0) =abandh(1) =εandL= (ab)^{∗}thenh^{−1}(L) = (0+1)∗.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Γ^{∗}is regular then so is h^{−1}(L)⊆Σ^{∗}.

### Proof.

– Consider the DFAA(L) = (S,Σ, δ,s0,F)characterizingL,
– The DFA corresponding toh^{−1}(L)is(S,Γ, γ,s0,F)such that

γ(s,a) = ˆδ(s,h(a)).

– Proof via induction on string size thatˆγ(s,w) = ˆδ(s,h(w)).

**Closure under Inverse-Homomorphism**

Example: Leth(0) =abandh(1) =εandL= (ab)^{∗}thenh^{−1}(L) = (0+1)∗.

### Theorem (Closure under Homomorphism)

For a homomorphism h: Σ→Γ^{∗}if L⊆Γ^{∗}is regular then so is h^{−1}(L)⊆Σ^{∗}.

### Proof.

– Consider the DFAA(L) = (S,Σ, δ,s0,F)characterizingL,
– The DFA corresponding toh^{−1}(L)is(S,Γ, γ,s0,F)such that

γ(s,a) = ˆδ(s,h(a)).

– Proof via induction on string size thatˆγ(s,w) = ˆδ(s,h(w)).

**Ashutosh Trivedi – 5 of 15**

Pumping Lemma

Myhill-Nerode Theorem

**Some languages are not regular!**

Let’s do mental computations again.

– The language{0^{n}1^{n} : n≥0}

– The set of strings having an equal number of 0’s and 1’s

– The set of strings with an equal number of occurrences of 01 and 10.

– The language{ww : w∈ {0,1}^{∗}}
– The language{ww : w∈ {0,1}^{∗}}
– The language{0^{i}1^{j} : i>j}
– The language{0^{i}1^{j} : i≤j}

– The language of palindromes of{0,1}

**Ashutosh Trivedi – 6 of 15**

**Some languages are not regular!**

Let’s do mental computations again.

– The language{0^{n}1^{n} : n≥0}

– The set of strings having an equal number of 0’s and 1’s

– The set of strings with an equal number of occurrences of 01 and 10.

– The language{ww : w∈ {0,1}^{∗}}
– The language{ww : w∈ {0,1}^{∗}}
– The language{0^{i}1^{j} : i>j}
– The language{0^{i}1^{j} : i≤j}

– The language of palindromes of{0,1}

**A simple observation about DFA**

start E O

0 1

0 1

computation start E

E

O

string .

.

. 0

1

computation E start

E

E

string .

.

. 0

0

**Ashutosh Trivedi – 8 of 15**

**A simple observation about DFA**

Image source: Wikipedia

– LetA= (S,Σ, δ,s0,F)be a DFA.

– For every stringw∈Σ^{∗}of the length greater than or equal to the
number of states ofA, i.e.|w| ≥ |S|, we have that

– the uniquecomputationofAonwre-visits at least one state.

**Pumping Lemma**

### Theorem (Pumping Lemma for Regular Languages)

If L is a regular language, then

there exists a constant (pumping length) p such that for every string w∈L s.t.|w| ≥p

there exists a division of w in strings x,y,and z s.t. w=xyz such that 1. |y|>0,

2. |xy| ≤p, and

3. for all i≥0we have that xy^{i}z∈L.

– LetAbe the DFA acceptingLandpbe the set of states inA. – Letw= (a1a2. . .ak)∈Lbe any string of length≥p.

– Lets0a1s1a2s2. . .akskbe the run ofwonA.

– Letibe the index of first state that the run revisits and letjbe the index of second occurrence of that state, i.e.si=sj,

– Letx=a1a2. . .ai−1andy=aiai+1. . .aj−1, andz=ajaj+1. . .ak. – notice that|y|>0 and|xy| ≤n

– Also, notice that for alli≥0 the stringxy^{i}zis also inL.

**Ashutosh Trivedi – 9 of 15**

**Pumping Lemma**

### Theorem (Pumping Lemma for Regular Languages)

If L is a regular language, then

there exists a constant (pumping length) p such that for every string w∈L s.t.|w| ≥p

there exists a division of w in strings x,y,and z s.t. w=xyz such that 1. |y|>0,

2. |xy| ≤p, and

3. for all i≥0we have that xy^{i}z∈L.

– LetAbe the DFA acceptingLandpbe the set of states inA.

– Letw= (a1a2. . .ak)∈Lbe any string of length≥p.

– Lets0a1s1a2s2. . .akskbe the run ofwonA.

– Letibe the index of first state that the run revisits and letjbe the index of second occurrence of that state, i.e.si=sj,

– Letx=a1a2. . .ai−1andy=aiai+1. . .aj−1, andz=ajaj+1. . .ak. – notice that|y|>0 and|xy| ≤n

– Also, notice that for alli≥0 the stringxy^{i}zis also inL.

**Applying Pumping Lemma**

How to show that a languageLis non-regular.

1. Assume thatLis regular and get contradiction withpumping lemma.

2. Letnbe the pumping length.

3. (Cleverly) find arepresentativestringwofLof size greater or equal to n.

4. Try out all ways to break the string intoxyztriplet satisfying that

|y|>0 and|xy| ≤n. If the step 3 was clever enough, there will be finitely many cases to consider.

5. For every triplet show that for someithe stringxy^{i}zis not inL, and
hence it yields contradiction with pumping lemma.

Examples: 1.73, 1.74, 1.75, and 1.77.

**Ashutosh Trivedi – 11 of 15**

Pumping Lemma

Myhill-Nerode Theorem

**Equivalence and Minimization of DFA**

Minimization of a DFA:

– Two statesq,q^{0}areequivalent,q≡q^{0}, if for all stringswwe have that
ˆδ(q,w)∈Fif and only ifˆδ(q^{0},w)∈F.

– It is easy to see that≡is anequivalence relationand thus it partitions the set of all states intoequivalence classes.

– States in the same class can be merged without changing the language of the DFA.

– Quotient Construction: To minimize a DFA find all classes of equivalent states and merge them.

– Given such an equivalence relation,≡, formalize this quotient construction and prove its correctness.

**Ashutosh Trivedi – 12 of 15**

**Equivalence and Minimization of DFA**

Minimization of a DFA:

– Two statesq,q^{0}areequivalent,q≡q^{0}, if for all stringswwe have that
ˆδ(q,w)∈Fif and only ifˆδ(q^{0},w)∈F.

– It is easy to see that≡is anequivalence relationand thus it partitions the set of all states intoequivalence classes.

– States in the same class can be merged without changing the language of the DFA.

– Quotient Construction: To minimize a DFA find all classes of equivalent states and merge them.

– Given such an equivalence relation,≡, formalize this quotient construction and prove its correctness.

**Equivalence and Minimization of DFA**

How to find equivalent states:

– Notice that an accepting stateqis distinguishable from a
non-accepting stateq^{0}asδ(ˆq, ε)∈Fwhileδ(ˆq^{0}, ε)6∈F.

– We can mark such state pairsdistinguishable.

– Then iteratively keep on marking statesdistinguishableif in one step after reading a same alphabet they respectively reach to two

distinguishable states.

– If in a step no new distinguishable state is marked then the process terminates.

– This process suggests an algorithm that is known astable filling algorithm.

**Ashutosh Trivedi – 13 of 15**

**Equivalence and Minimization of DFA**

How to find equivalent states:

– Notice that an accepting stateqis distinguishable from a
non-accepting stateq^{0}asδ(ˆq, ε)∈Fwhileδ(ˆq^{0}, ε)6∈F.

– We can mark such state pairsdistinguishable.

– Then iteratively keep on marking statesdistinguishableif in one step after reading a same alphabet they respectively reach to two

distinguishable states.

– If in a step no new distinguishable state is marked then the process terminates.

– This process suggests an algorithm that is known astable filling algorithm.

**Equivalence and Minimization of DFA**

How to find equivalent states:

– Notice that an accepting stateqis distinguishable from a
non-accepting stateq^{0}asδ(ˆq, ε)∈Fwhileδ(ˆq^{0}, ε)6∈F.

– We can mark such state pairsdistinguishable.

– Then iteratively keep on marking statesdistinguishableif in one step after reading a same alphabet they respectively reach to two

distinguishable states.

– If in a step no new distinguishable state is marked then the process terminates.

– This process suggests an algorithm that is known astable filling algorithm.

**Ashutosh Trivedi – 13 of 15**

**Equivalence and Minimization of DFA**

How to find equivalent states:

^{0}asδ(ˆq, ε)∈Fwhileδ(ˆq^{0}, ε)6∈F.

– We can mark such state pairsdistinguishable.

distinguishable states.

– If in a step no new distinguishable state is marked then the process terminates.

– This process suggests an algorithm that is known astable filling algorithm.

**Myhill-Nerode Theorem**

– LetLbe a language

– Two stringsxandyaredistinguishableinLif there existszsuch that exactly one ofxzandyzinL.

– We define a relationRL(Myhill-Nerode relation) such that stringsx,y we have that(x,y)∈RLis ifxandyare not distinguishable inL.

– It is easy to see thatRAis anequivalence relationand thus it partitions the set of all strings intoequivalence classes.

### Theorem (Myhill-Nerode Theorem)

A language L is regular if and only if RLhas a finite number of equivalence classes. Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of RL.

### Corollary

There exists a unique minimal DFA for every regular language.

**Ashutosh Trivedi – 14 of 15**

**Myhill-Nerode Theorem**

– LetLbe a language

– Two stringsxandyaredistinguishableinLif there existszsuch that exactly one ofxzandyzinL.

– We define a relationRL(Myhill-Nerode relation) such that stringsx,y we have that(x,y)∈RLis ifxandyare not distinguishable inL.

– It is easy to see thatRAis anequivalence relationand thus it partitions the set of all strings intoequivalence classes.

### Theorem (Myhill-Nerode Theorem)

A language L is regular if and only if RLhas a finite number of equivalence classes. Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of RL.

### Corollary

There exists a unique minimal DFA for every regular language.

**Myhill-Nerode Theorem**

– LetLbe a language

– Two stringsxandyaredistinguishableinLif there existszsuch that exactly one ofxzandyzinL.

– We define a relationRL(Myhill-Nerode relation) such that stringsx,y we have that(x,y)∈RLis ifxandyare not distinguishable inL.

– It is easy to see thatRAis anequivalence relationand thus it partitions the set of all strings intoequivalence classes.

### Theorem (Myhill-Nerode Theorem)

A language L is regular if and only if RLhas a finite number of equivalence classes. Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of RL.

### Corollary

**Ashutosh Trivedi – 15 of 15**

**Myhill-Nerode Theorem**

### Theorem (Myhill-Nerode Theorem)

A language L is regular if and only if RLhasa finite number of equivalence classes. Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of RL.

### Proof.

The “Only if” direction:

– LetLbe regular and DFAA= (S,Σ, δ,s0,F)accepts this languages.

– The indistinguishability relationRLis defined using states ofA(L): two strings are indistinguishable ifδ(ˆs0,x) = ˆδ(s0,y).

– Notice that this relation has finitely many partitions (number of states ofAand strings in one class are indistinguishable.

**Myhill-Nerode Theorem**

### Theorem (Myhill-Nerode Theorem)

A language L is regular if and only if RLhasa finite number of equivalence classes. Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of RL.

### Proof.

The “if” direction:

– LetRLbe the indistinguishability relation with finitely many equivalence classes.

– Let each class represent a state of a DFA, where starting state is the class containingε, and the set final states is the set of equivalence classes containing strings inL.

– For two equivalence classescandc^{0}we have thatδ(c,a) =c^{0}if for
some arbitrary stringwincwe have thatwa∈c^{0}. By definition of