## CS310 : Automata Theory 2020

### Lecture 7: Regular expressions

Instructor: S. Akshay

IIT Bombay, India

27-01-2020

### Recap

I Deterministic finite state automata (DFA)

I Formal definition, run, accepting run, (extended) transition relation I Regular languages, closure under Union, intersection and

complementation

I Non-deterministic finite state automata (NFA)

I From NFA to DFA: The Subset construction, worst-case exponential blowup

I Applications to text search using automata.

I -NFA and conversion to NFA.

I Today: An algebraic view of regular languages.

### Recap

I Deterministic finite state automata (DFA)

I Formal definition, run, accepting run, (extended) transition relation I Regular languages, closure under Union, intersection and

complementation

I Non-deterministic finite state automata (NFA)

I From NFA to DFA: The Subset construction, worst-case exponential blowup

I Applications to text search using automata.

I -NFA and conversion to NFA.

I Today: An algebraic view of regular languages.

### Recap

I Deterministic finite state automata (DFA)

I Formal definition, run, accepting run, (extended) transition relation I Regular languages, closure under Union, intersection and

complementation

I Non-deterministic finite state automata (NFA)

I From NFA to DFA: The Subset construction, worst-case exponential blowup

I Applications to text search using automata.

I -NFA and conversion to NFA.

I Today: An algebraic view of regular languages.

### Adding transitions

An -nondeterministic finite-state automaton (-NFA) is a 5-tuple

(Q,Σ, δ,q0,F) where

I Q is a finite set of states

I Σ is a finite alphabet, i.e., set of input symbols

I δ :Q×(Σ∪)→P(Q) is a function that takes a state and input symbol and returns the set of possible next states,

I q_{0}∈Q is the start/initial state

I F ⊆Q is the set of final/accepting states.

### -closure

Definition

Let (Q,Σ, δ,q_{0},F) be an-NFA. For each setS ⊆Q,EClose(S) is the set of
states reachable via -transitions from S.

Inductive definition ofEClose(S): I for all q ∈S,q ∈EClose(S)

I if q∈EClose(S) andq^{0} ∈δ(q, ), then q^{0} ∈EClose(S).

### -closure

Definition

Let (Q,Σ, δ,q_{0},F) be an-NFA. For each setS ⊆Q,EClose(S) is the set of
states reachable via -transitions from S.

Inductive definition of EClose(S):

I for all q ∈S,q ∈EClose(S)

I if q∈EClose(S) andq^{0} ∈δ(q, ), then q^{0} ∈EClose(S).

### Extended transition relation for -NFA

Extended transition relation

Let ˆδ:Q×Σ^{∗}→ P(Q) be defined as:

δ(q, ) =ˆ EClose({q}) δ(q,ˆ wa) = [

q^{0}∈ˆδ(q,w)

EClose(δ(q^{0},a))

Acceptance

An-NFAA acceptsw iff ˆδ(q0,w)∩F 6=∅.

### Extended transition relation for -NFA

Extended transition relation

Let ˆδ:Q×Σ^{∗}→ P(Q) be defined as:

δ(q, ) =ˆ EClose({q}) δ(q,ˆ wa) = [

q^{0}∈ˆδ(q,w)

EClose(δ(q^{0},a))

Acceptance

An -NFAA acceptsw iff ˆδ(q0,w)∩F 6=∅.

### Removing -transitions

Theorem

For any-NFAA, there exists an NFA A^{0} (without -transitions) such that
L(A) =L(A^{0}).

Proof: Let A= (Q,Σ, δ,q0,F) be an-NFA. Then, we construct NFA
A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) s.t.,

I Q^{0} =Q∪˜q_{0},q_{0}^{0} = ˜q_{0}, Σ is the same but no -transitions are used.

I F =

(F ∪ {˜q0} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q0,a))) ifq = ˜q0

EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA, there exists an NFA A^{0} (without -transitions) such that
L(A) =L(A^{0}).

Proof: Let A= (Q,Σ, δ,q0,F) be an-NFA. Then, we construct NFA
A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) s.t.,

I Q^{0} =Q∪˜q_{0},q_{0}^{0} = ˜q_{0}, Σ is the same but no -transitions are used.

I F =

(F ∪ {˜q0} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q0,a))) ifq = ˜q0

EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA, there exists an NFA A^{0} (without -transitions) such that
L(A) =L(A^{0}).

Proof: Let A= (Q,Σ, δ,q0,F) be an-NFA. Then, we construct NFA
A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) s.t.,

I Q^{0} =Q∪˜q_{0},q_{0}^{0} = ˜q_{0}, Σ is the same but no -transitions are used.

I F =

(F ∪ {˜q0} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q0,a))) ifq = ˜q0

EClose(δ(q,a)) otherwise

### Removing -transitions

q_{0}

start q_{1} q_{2}

b

a

a,b

a

q_{0} q_{1} q_{2}

˜
q_{0}
start

a b a

a

a,b

a a

a

a

Can we simplify this?

### Removing -transitions

q_{0}

start q_{1} q_{2}

b

a

a,b

a

q_{0} q_{1} q_{2}

˜
q_{0}
start

a b a

a

a,b

a a

a

a

### Removing -transitions

Theorem

For any-NFAA= (Q,Σ, δ,q_{0},F), there exists an NFA

A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) (without -transitions) such thatL(A) =L(A^{0}).

Proof: Construction in 3 steps:

1. Saturate: repeatedly add shortcuts that make -transitions redundant.

2. Fix final states: if some state reachable from initial state by -transitions is final, then make initial state as final!

3. Remove-transitions.

Can we do this in one shot?

I Q^{0}=Q,q_{0}^{0} =q0, Σ is the same but no-transitions are used.
I F =

(F ∪ {q_{0}} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q_{0},a))) ifq =q_{0}
EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA= (Q,Σ, δ,q_{0},F), there exists an NFA

A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) (without -transitions) such thatL(A) =L(A^{0}).

Proof: Construction in 3 steps:

1. Saturate: repeatedly add shortcuts that make -transitions redundant.

2. Fix final states: if some state reachable from initial state by -transitions is final, then make initial state as final!

3. Remove-transitions.

Can we do this in one shot?

I Q^{0}=Q,q_{0}^{0} =q0, Σ is the same but no-transitions are used.
I F =

(F ∪ {q_{0}} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q_{0},a))) ifq =q_{0}
EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA= (Q,Σ, δ,q_{0},F), there exists an NFA

A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) (without -transitions) such thatL(A) =L(A^{0}).

Proof: Construction in 3 steps:

1. Saturate: repeatedly add shortcuts that make -transitions redundant.

2. Fix final states: if some state reachable from initial state by -transitions is final, then make initial state as final!

3. Remove-transitions.

Can we do this in one shot?

I Q^{0} =Q,q_{0}^{0} =q0, Σ is the same but no-transitions are used.

I F =

(F ∪ {q_{0}} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q_{0},a))) ifq =q_{0}
EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA= (Q,Σ, δ,q_{0},F), there exists an NFA

A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) (without -transitions) such thatL(A) =L(A^{0}).

Proof: Construction in 3 steps:

1. Saturate: repeatedly add shortcuts that make -transitions redundant.

3. Remove-transitions.

Can we do this in one shot?

I Q^{0} =Q,q_{0}^{0} =q0, Σ is the same but no-transitions are used.

I F =

(F ∪ {q_{0}} ifEClose({q_{0}})∩F 6=∅

F otherwise

I δ^{0}(q,a) =

(EClose(δ(EClose(q_{0},a))) ifq =q_{0}
EClose(δ(q,a)) otherwise

### Removing -transitions

Theorem

For any-NFAA= (Q,Σ, δ,q_{0},F), there exists an NFA

A^{0} = (Q^{0},Σ, δ^{0},q^{0},F^{0}) (without -transitions) such thatL(A) =L(A^{0}).

Proof: Construction in 3 steps:

1. Saturate: repeatedly add shortcuts that make -transitions redundant.

3. Remove-transitions.

Can we do this in one shot?

I Q^{0} =Q,q_{0}^{0} =q0, Σ is the same but no-transitions are used.

I F =

(F ∪ {q_{0}} ifEClose({q_{0}})∩F 6=∅

F otherwise

### Another view of regular languages

Consider any regular language

### Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words

### Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words Can these be used to generate all regular languages?

### Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

### Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

The Kleene Star operation

For a language L, itsKleene closure, denoted L^{∗} is the set of all strings
obtained by taking any number of strings from Lwith possible repetitions
and concatenating all of them.

### Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

The Kleene Star operation

For a language L, itsKleene closure, denoted L^{∗} is the set of all strings
obtained by taking any number of strings from Lwith possible repetitions
and concatenating all of them.

### Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar
L::= Σ| ∅ |L+L|L◦L|L^{∗}

### Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar

L::= Σ| ∅ |L+L|L◦L|L^{∗}

I L is being recursively defined.

I |separates the constructors I + is the union

I ◦ is concatenation, often we just writeL_{1}L_{2}

### Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar

L::= Σ| ∅ |L+L|L◦L|L^{∗}

I L is being recursively defined.

I |separates the constructors I + is the union

I ◦ is concatenation, often we just writeL_{1}L_{2}

I In the above notation, different occurrences ofLare not required to be same expressions

### Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union
I ((0)1) + (0)^{∗} =?

I (01)2 =?

I 01^{∗}+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)^{∗}(0 ++ 11)

### Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union

I ((0)1) + (0)^{∗} =?
I (01)2 =?

I 01^{∗}+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)^{∗}(0 ++ 11)

### Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union
I ((0)1) + (0)^{∗} =?

I (01)2 =?

I 01^{∗}+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)^{∗}(0 ++ 11)

### Examples

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union
I ((0)1) + (0)^{∗} =?

I (01)2 =?

I 01^{∗}+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)^{∗}(0 ++ 11)

### Examples

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union
I ((0)1) + (0)^{∗} =?

I (01)2 =?

I 01^{∗}+ 10 + 11 =?

Is a regular expression unique?

### From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof: exercise! Try structural induction.

### From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof:

exercise! Try structural induction.

### From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof: exercise! Try structural induction.

### From DFA to Regular expressions

Theorem

Every regular language can be defined by a regular expression!