• No results found

Lecture 8: Regular expressions and NFA

N/A
N/A
Protected

Academic year: 2022

Share "Lecture 8: Regular expressions and NFA"

Copied!
48
0
0

Loading.... (view fulltext now)

Full text

(1)

CS310 : Automata Theory 2020

Lecture 8: Regular expressions and NFA

Instructor: S. Akshay

IIT Bombay, India

28-01-2020

(2)

Recap

I Deterministic finite state automata (DFA): run, accepting run, (extended) transition relation

I Regular languages: closure under Union, intersection and complementation

I Non-deterministic finite state automata (NFA): the Subset construction, worst-case exponential blowup

I Applications 1: text search using automata.

I -NFA: conversion to NFA.

I Yesterday and today: Regular expressions

(3)

Recap

I Deterministic finite state automata (DFA): run, accepting run, (extended) transition relation

I Regular languages: closure under Union, intersection and complementation

I Non-deterministic finite state automata (NFA): the Subset construction, worst-case exponential blowup

I Applications 1: text search using automata.

I -NFA: conversion to NFA.

I Yesterday and today: Regular expressions

(4)

Another view of regular languages

Consider any regular language

(5)

Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words

(6)

Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words Can these be used to generate all regular languages?

(7)

Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

(8)

Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

The Kleene Star operation

For a language L, itsKleene closure, denoted L is the set of all strings obtained by taking any number of strings from Lwith possible repetitions and concatenating all of them.

(9)

Another view of regular languages

Consider any regular language I it can be letters 1 ora or etc.

I it can be emptyset.

I it can be a word, e.g., 110: obtained as a (finite) concatenation of letters.

I it can be several words: obtained as a (finite) union of sets of words I it can be something that “repeats”!

The Kleene Star operation

For a language L, itsKleene closure, denoted L is the set of all strings obtained by taking any number of strings from Lwith possible repetitions and concatenating all of them.

L =∪i≥0Li

(10)

Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar L::= Σ| ∅ |L+L|L◦L|L

(11)

Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar

L::= Σ| ∅ |L+L|L◦L|L I L is being recursively defined.

I |separates the constructors I + is the union

I ◦ is concatenation, often we just writeL1L2

(12)

Regular expressions as a grammar

Given an alphabet Σ,

Regular expressions are all those generated by following grammar

L::= Σ| ∅ |L+L|L◦L|L I L is being recursively defined.

I |separates the constructors I + is the union

I ◦ is concatenation, often we just writeL1L2

I In the above notation, different occurrences ofLare not required to be same expressions

(13)

Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union I ((0)1) + (0) =?

I (01)2 =?

I 01+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(14)

Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union

I ((0)1) + (0) =? I (01)2 =?

I 01+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(15)

Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union I ((0)1) + (0) =?

I (01)2 =?

I 01+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(16)

Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union I ((0)1) + (0) =?

I (01)2 =?

I 01+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(17)

Examples

Given regular expressions for the following languages over Σ ={a,b,c} 1. All words containing at least one a.

2. words containing at least one aor oneb.

3. words containing at least one aand one b.

Order of precedence

Paranthesis are used or define precedence order: Kleene>Concat >union I ((0)1) + (0) =?

I (01)2 =?

I 01+ 10 + 11 =?

Is a regular expression unique?

Simplify this: 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(18)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity: Only for +: L+M =M +L

Identity and Annihilators I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication. Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(19)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N

I Commutativity: Only for +: L+M =M +L Identity and Annihilators

I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication. Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(20)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity:

Only for +: L+M =M +L Identity and Annihilators

I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication. Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(21)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity: Only for +: L+M =M+L

Identity and Annihilators I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication. Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(22)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity: Only for +: L+M =M+L

Identity and Annihilators I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication. Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(23)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity: Only for +: L+M =M+L

Identity and Annihilators I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication.

Distributivity: L(M +N) = LM+LN,(M +N)L=ML+NL

(24)

Algebraic laws of regular expressions

Let Land M be regular expressions. We often identify the regular expression with the language of strings it represents.

Associativity and commutativity

I Associativity: L+ (M+N) = (L+M) +N,L(MN) = (LM)N I Commutativity: Only for +: L+M =M+L

Identity and Annihilators I L+∅=∅+L=L I L=L=L I ∅L=∅

Union (denoted +) acts as addition and concat as multiplication.

Distributivity: L(M +N) = LM+LN,(M +N)L =ML+NL

(25)

Algebraic laws of regular expressions (Contd)

I (L) =L I ∅ =

I L+=LL =LL I +L=L I L =+L+

Simplify the following expressions: 1. 0 + 01

2. 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(26)

Algebraic laws of regular expressions (Contd)

I (L) =L I ∅ =

I L+=LL =LL I +L=L I L =+L+

Simplify the following expressions:

1. 0 + 01

2. 1∗01 + 1∗01(0 ++ 11)(0 ++ 11)

(27)

Proving “new” laws

Consider

(L+S) = (LS)

ReplacingLby 0 and S by 1 we get

(0 + 1) = (01)

which is of course same. From this can we conclude identity?

(28)

Proving “new” laws

Consider

(L+S) = (LS) Replacing Lby 0 and S by 1 we get

(0 + 1) = (01) which is of course same.

From this can we conclude identity?

(29)

Proving “new” laws

Consider

(L+S) = (LS) Replacing Lby 0 and S by 1 we get

(0 + 1) = (01)

which is of course same. From this can we conclude identity?

(30)

A test for laws of regular expression

I Let R be a regular expression withL1, ..,Ln as variables

I Let R0 be the concrete regular expression by replacing L1, ...,Ln by fresh symbols a1, ...,an in R.

Theorem

For all languagesL1, . . .Ln, each stringw ∈L(R) can be written as w =w1w2...wk ∈L(R) such thatwi is in some languageLji, and the word aj1, ...,ajn ∈L(R0).

Proof sketch.

SinceR and R0 have same structure, the word produced in R must have a counterpart inR0.

Exercise: (read the) formal proof via structural induction.

Due to the above theorem, we can only analyze concrete languages for proving laws of regular expressions. Use it carefully!

(31)

A test for laws of regular expression

I Let R be a regular expression withL1, ..,Ln as variables

I Let R0 be the concrete regular expression by replacing L1, ...,Ln by fresh symbols a1, ...,an in R.

Theorem

For all languages L1, . . .Ln, each stringw ∈L(R) can be written as w =w1w2...wk ∈L(R) such thatwi is in some languageLji, and the word aj1, ...,ajn ∈L(R0).

Proof sketch.

SinceR and R0 have same structure, the word produced in R must have a counterpart inR0.

Exercise: (read the) formal proof via structural induction.

Due to the above theorem, we can only analyze concrete languages for proving laws of regular expressions. Use it carefully!

(32)

A test for laws of regular expression

I Let R be a regular expression withL1, ..,Ln as variables

I Let R0 be the concrete regular expression by replacing L1, ...,Ln by fresh symbols a1, ...,an in R.

Theorem

For all languages L1, . . .Ln, each stringw ∈L(R) can be written as w =w1w2...wk ∈L(R) such thatwi is in some languageLji, and the word aj1, ...,ajn ∈L(R0).

Proof sketch.

Since R and R0 have same structure, the word produced in R must have a counterpart in R0.

Exercise: (read the) formal proof via structural induction.

Due to the above theorem, we can only analyze concrete languages for proving laws of regular expressions. Use it carefully!

(33)

A test for laws of regular expression

I Let R be a regular expression withL1, ..,Ln as variables

I Let R0 be the concrete regular expression by replacing L1, ...,Ln by fresh symbols a1, ...,an in R.

Theorem

For all languages L1, . . .Ln, each stringw ∈L(R) can be written as w =w1w2...wk ∈L(R) such thatwi is in some languageLji, and the word aj1, ...,ajn ∈L(R0).

Proof sketch.

Since R and R0 have same structure, the word produced in R must have a counterpart in R0.

Exercise: (read the) formal proof via structural induction.

Due to the above theorem, we can only analyze concrete languages for

Use it carefully!

(34)

A test for laws of regular expression

I Let R be a regular expression withL1, ..,Ln as variables

I Let R0 be the concrete regular expression by replacing L1, ...,Ln by fresh symbols a1, ...,an in R.

Theorem

For all languages L1, . . .Ln, each stringw ∈L(R) can be written as w =w1w2...wk ∈L(R) such thatwi is in some languageLji, and the word aj1, ...,ajn ∈L(R0).

Proof sketch.

Since R and R0 have same structure, the word produced in R must have a counterpart in R0.

Exercise: (read the) formal proof via structural induction.

Due to the above theorem, we can only analyze concrete languages for

(35)

Regular expressions and regular languages

Going from regular expressions to DFA and back

1. Given any regular expression, show that the language it defines is regular.

2. Given any regular language, show that there exists a regular expression defining it.

(36)

From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof: exercise! Try structural induction.

(37)

From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof:

exercise! Try structural induction.

(38)

From Regular expressions to DFA

Theorem

Every language defined by a regular expression is regular!

Proof: exercise! Try structural induction.

(39)

Regular expressions to -NFA

Theorem

For a given RE R, there is an-NFAA such thatL(R) =L(A).

Proof.

We have six constructors for RE. Three of them are base cases

I I ∅ I a

The other three are inductive I R1+R2

I R1R2

I R ...

(40)

Regular expressions to -NFA

Theorem

For a given RE R, there is an-NFAA such thatL(R) =L(A).

Proof.

We have six constructors for RE.

Three of them are base cases I

I ∅ I a

The other three are inductive I R1+R2

I R1R2

I R ...

(41)

Regular expressions to -NFA

Theorem

For a given RE R, there is an-NFAA such thatL(R) =L(A).

Proof.

We have six constructors for RE.

Three of them are base cases I

I ∅ I a

The other three are inductive I R1+R2

I R1R2

I R ...

(42)

Regular expressions to -NFA

Theorem

For a given RE R, there is an-NFAA such thatL(R) =L(A).

Proof.

We have six constructors for RE.

Three of them are base cases I

I ∅ I a

The other three are inductive I R1+R2

I R1R2

I R ...

(43)

Regular expressions to NFA II

Proof(contd.) base case:

-NFAs for the base cases I

q1

start q2

I ∅

q1

start q2

I a

q1

start a q2

...

(44)

Regular expressions to NFA II

Proof(contd.) base case:

-NFAs for the base cases I

q1

start q2

I ∅

q1

start q2

I a

q1

start a q2

...

(45)

Regular expressions to NFA II

Proof(contd.) base case:

-NFAs for the base cases I

q1

start q2

I ∅

q1

start q2

I a

q1

start a q2

...

(46)

Regular expressions to NFA III

Proof(contd.) induction step:

-NFAs for the inductive constructors I L1+L2

q0

start

q01 q02

qf1 qf2

qf

L1 L2

I L1L2

q0

start q01 L1 qf1 q02 L2 qf2 qf

I L1

q0

start q01 L1 qf1 qf

(47)

Regular expressions to NFA III

Proof(contd.) induction step:

-NFAs for the inductive constructors I L1+L2

q0

start

q01 q02

qf1 qf2

qf

L1 L2

I L1L2

q0

start q01 L1 qf1 q02 L2 qf2 qf

I L1

q0

start q01 L1 qf1 qf

(48)

Regular expressions to NFA III

Proof(contd.) induction step:

-NFAs for the inductive constructors I L1+L2

q0

start

q01 q02

qf1 qf2

qf

L1 L2

I L1L2

q0

start q01 L1 qf1 q02 L2 qf2 qf

I L1

L1

References

Related documents

Moreover, the number of states is the smallest DFA recognizing L is equal to the number of equivalence classes of R

– Nondeterministic finite state automata (also with ε-transitions) – Kleene’s regular expressions, also appeared as Type-3 languages in.. Chomsky’s

I Text processing: Given input as English alphabet, output can be capitalized words. I Translation: Given input in English, output can be

– Nondeterministic finite state automata (also with ε-transitions) – Kleene’s regular expressions, also appeared as Type-3 languages in!. Chomsky’s

I Non-deterministic finite state automata (NFA): the Subset construction, worst-case exponential blowup.. I Applications 1: text search

Regular expressions in emacs and lexical analyzers I Search function takes regular expressions.. I Try search

cbna CS310 : Automata Theory 2019 Instructor: Ashutosh Gupta IITB, India 2..

All languages arising from MSGs are finitely generated, so the language ac- cepted by the message-passing automaton on Figure 2 shows that not all regular MSC languages can be