Outline
[read Chapter 2]
[suggested exercises 2.2, 2.3, 2.4, 2.6]
Learning from examples
General-to-specific ordering over hypotheses
Version spaces and candidate elimination algorithm
Picking new examples
The need for inductive bias
Note: a simple approach assuming no noise; illustrates key concepts
Training Examples for EnjoySport
Sky   Temp  Humid   Wind    Water  Forecst  EnjoySport
Sunny Warm  Normal  Strong  Warm   Same     Yes
Sunny Warm  High    Strong  Warm   Same     Yes
Rainy Cold  High    Strong  Warm   Change   No
Sunny Warm  High    Strong  Cool   Change   Yes
What is the general concept?
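The table above can be encoded directly for experimentation; the tuple form below (attribute order as in the table) is one convenient sketch, not part of the original formulation:

```python
# EnjoySport training examples as (attribute-tuple, label) pairs.
# Attribute order: Sky, Temp, Humid, Wind, Water, Forecast.
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
```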
Representing Hypotheses
Many possible representations.

Here, h is a conjunction of constraints on attributes. Each constraint can be:
- a specific value (e.g., Water = Warm)
- don't care (e.g., "Water = ?")
- no value allowed (e.g., "Water = ∅")

For example:

  Sky     AirTemp  Humid  Wind    Water  Forecst
  ⟨Sunny  ?        ?      Strong  ?      Same⟩

Prototypical Concept Learning Task
Given:
- Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
- Target function c: EnjoySport : X → {0, 1}
- Hypotheses H: conjunctions of literals, e.g. ⟨?, Cold, High, ?, ?, ?⟩
- Training examples D: positive and negative examples of the target function ⟨x1, c(x1)⟩, ..., ⟨xm, c(xm)⟩

Determine:
- A hypothesis h in H such that h(x) = c(x) for all x in D

The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
Instances, Hypotheses, and More-General-Than
[Figure: instances X and hypotheses H, ordered from specific to general]
  x1 = ⟨Sunny, Warm, High, Strong, Cool, Same⟩
  x2 = ⟨Sunny, Warm, High, Light, Warm, Same⟩
  h1 = ⟨Sunny, ?, ?, Strong, ?, ?⟩
  h2 = ⟨Sunny, ?, ?, ?, ?, ?⟩
  h3 = ⟨Sunny, ?, ?, ?, Cool, ?⟩
(h2 is more general than both h1 and h3)
Find-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
     For each attribute constraint a_i in h
       If the constraint a_i in h is satisfied by x
       Then do nothing
       Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
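A minimal Python sketch of Find-S on the EnjoySport data. The '0' marker for the empty constraint and '?' for don't-care are encoding choices for this sketch, not part of the algorithm:

```python
def find_s(examples):
    """Find-S: start with the most specific hypothesis and minimally
    generalize it on each positive example."""
    n = len(examples[0][0])
    h = ["0"] * n  # most specific hypothesis: no value allowed anywhere
    for x, positive in examples:
        if not positive:
            continue  # Find-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] == "0":        # first positive example: adopt its value
                h[i] = value
            elif h[i] != value:    # conflicting value: generalize to don't-care
                h[i] = "?"
    return tuple(h)

EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(EXAMPLES))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The result is the maximally specific consistent hypothesis, matching the h4 reached in the search trace.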
Hypothesis Space Search by Find-S
[Figure: Find-S search of the hypothesis space, from specific to general]
Training examples:
  x1 = ⟨Sunny Warm Normal Strong Warm Same⟩, +
  x2 = ⟨Sunny Warm High Strong Warm Same⟩, +
  x3 = ⟨Rainy Cold High Strong Warm Change⟩, −
  x4 = ⟨Sunny Warm High Strong Cool Change⟩, +
Hypothesis sequence:
  h0 = ⟨∅, ∅, ∅, ∅, ∅, ∅⟩
  h1 = ⟨Sunny Warm Normal Strong Warm Same⟩
  h2 = h3 = ⟨Sunny Warm ? Strong Warm Same⟩
  h4 = ⟨Sunny Warm ? Strong ? ?⟩
Complaints about Find-S
Can't tell whether it has learned the concept
Can't tell when the training data is inconsistent
Picks a maximally specific h (why?)
Depending on H, there might be several!

Version Spaces
A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example ⟨x, c(x)⟩ in D.

  Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)

The version space, VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

  VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}

The List-Then-Eliminate Algorithm
1. VersionSpace ← a list containing every hypothesis in H
2. For each training example ⟨x, c(x)⟩
     remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
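The algorithm is only feasible when H can be enumerated. A sketch for the conjunctive language, enumerating only the attribute values that appear in the data plus '?' (a simplifying assumption to keep the space tiny):

```python
from itertools import product

def matches(h, x):
    """A conjunctive hypothesis matches x iff each non-'?' constraint equals x's value."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def list_then_eliminate(examples):
    """Enumerate every conjunctive hypothesis, then eliminate any that
    misclassifies some training example."""
    n = len(examples[0][0])
    values = [sorted({x[i] for x, _ in examples}) + ["?"] for i in range(n)]
    return [h for h in product(*values)
            if all(matches(h, x) == label for x, label in examples)]

EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
vs = list_then_eliminate(EXAMPLES)
print(len(vs))  # 6 hypotheses survive
```

For EnjoySport the surviving six hypotheses are exactly the version space bounded by S and G.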
Example Version Space
S: {⟨Sunny, Warm, ?, Strong, ?, ?⟩}

   ⟨Sunny, ?, ?, Strong, ?, ?⟩   ⟨Sunny, Warm, ?, ?, ?, ?⟩   ⟨?, Warm, ?, Strong, ?, ?⟩

G: {⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩}
Representing Version Spaces
The General boundary, G, of version space VS_{H,D} is the set of its maximally general members.

The Specific boundary, S, of version space VS_{H,D} is the set of its maximally specific members.

Every member of the version space lies between these boundaries:

  VS_{H,D} = {h ∈ H | (∃s ∈ S)(∃g ∈ G)(g ≥ h ≥ s)}

where x ≥ y means x is more general than or equal to y.
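The ≥ ordering is easy to test for conjunctive hypotheses; a small sketch (ignoring the '∅' empty constraint, which no instance satisfies):

```python
def more_general_or_equal(g, h):
    """g >= h: every instance matched by h is also matched by g.
    For conjunctive hypotheses this holds iff each constraint of g
    is '?' or equals the corresponding constraint of h."""
    return all(cg == "?" or cg == ch for cg, ch in zip(g, h))

g = ("Sunny", "?", "?", "?", "?", "?")
s = ("Sunny", "Warm", "?", "Strong", "?", "?")
print(more_general_or_equal(g, s))  # True:  g covers everything s covers
print(more_general_or_equal(s, g))  # False: s constrains attributes g leaves open
```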
Candidate Elimination Algorithm
G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d, do
- If d is a positive example
  - Remove from G any hypothesis inconsistent with d
  - For each hypothesis s in S that is not consistent with d
    - Remove s from S
    - Add to S all minimal generalizations h of s such that
      1. h is consistent with d, and
      2. some member of G is more general than h
    - Remove from S any hypothesis that is more general than another hypothesis in S
- If d is a negative example
  - Remove from S any hypothesis inconsistent with d
  - For each hypothesis g in G that is not consistent with d
    - Remove g from G
    - Add to G all minimal specializations h of g such that
      1. h is consistent with d, and
      2. some member of S is more specific than h
    - Remove from G any hypothesis that is less general than another hypothesis in G
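A compact Python sketch of the algorithm for the conjunctive language. Because S stays a singleton in this language, the pruning of non-minimal S members is omitted; '0' and '?' are encoding choices as before:

```python
def matches(h, x):
    """h matches x iff every non-'?' constraint equals x's value ('0' matches nothing)."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general(g, h):
    """g >= h in the general-to-specific ordering over conjunctive hypotheses."""
    return all(cg == "?" or cg == ch for cg, ch in zip(g, h))

def min_generalize(s, x):
    """Minimal generalization of s that covers the positive example x."""
    return tuple(v if c == "0" else (c if c == v else "?") for c, v in zip(s, x))

def min_specializations(g, x, values):
    """Minimal specializations of g that exclude the negative example x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for v in values[i] if v != x[i]]

def candidate_elimination(examples):
    n = len(examples[0][0])
    values = [sorted({x[i] for x, _ in examples}) for i in range(n)]
    S = {("0",) * n}   # maximally specific boundary
    G = {("?",) * n}   # maximally general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            S = {min_generalize(s, x) for s in S}
            S = {s for s in S if any(more_general(g, s) for g in G)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                else:
                    new_G.update(h for h in min_specializations(g, x, values)
                                 if any(more_general(h, s) for s in S))
            # keep only maximally general members of G
            G = {g for g in new_G
                 if not any(g2 != g and more_general(g2, g) for g2 in new_G)}
    return S, G

EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(EXAMPLES)
print(sorted(S))  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(sorted(G))  # [('?', 'Warm', '?', '?', '?', '?'), ('Sunny', '?', '?', '?', '?', '?')]
```

The final boundaries match the S and G of the example version space above.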
Example Trace
S0: {⟨∅, ∅, ∅, ∅, ∅, ∅⟩}    G0: {⟨?, ?, ?, ?, ?, ?⟩}
[Figure: trace of the S and G boundaries as each EnjoySport training example is processed]
What Next Training Example?
S: {⟨Sunny, Warm, ?, Strong, ?, ?⟩}

   ⟨Sunny, ?, ?, Strong, ?, ?⟩   ⟨Sunny, Warm, ?, ?, ?, ?⟩   ⟨?, Warm, ?, Strong, ?, ?⟩

G: {⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩}
How Should These Be Classified?

S: {⟨Sunny, Warm, ?, Strong, ?, ?⟩}

   ⟨Sunny, ?, ?, Strong, ?, ?⟩   ⟨Sunny, Warm, ?, ?, ?, ?⟩   ⟨?, Warm, ?, Strong, ?, ?⟩

G: {⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩}

  ⟨Sunny Warm Normal Strong Cool Change⟩
  ⟨Rainy Cool Normal Light Warm Same⟩

What Justifies this Inductive Leap?
  + ⟨Sunny Warm Normal Strong Cool Change⟩
  + ⟨Sunny Warm Normal Light Warm Same⟩

  S: ⟨Sunny Warm Normal ? ? ?⟩

Why believe we can classify the unseen ⟨Sunny Warm Normal Strong Warm Same⟩?

An UNBiased Learner
Idea: choose H that expresses every teachable concept (i.e., H is the power set of X).

Consider H′ = disjunctions, conjunctions, negations over the previous H. E.g.,

  ⟨Sunny Warm Normal ? ? ?⟩ ∨ ¬⟨? ? ? ? ? Change⟩

What are S, G in this case?
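A tiny demonstration of why the unbiased learner cannot generalize: on a hypothetical four-instance domain with H the power set of X, the version space splits its votes exactly in half on every unseen instance:

```python
from itertools import combinations

# Toy domain: 4 instances; H = power set of X, so every concept is representable.
X = [0, 1, 2, 3]
H = [frozenset(s) for r in range(len(X) + 1) for s in combinations(X, r)]

# Training data: instance 0 is positive, instance 1 is negative.
version_space = [h for h in H if 0 in h and 1 not in h]

# On the unseen instance 2, the surviving hypotheses disagree evenly,
# so the learner can never classify anything it has not literally seen.
votes_for_2 = sum(1 for h in version_space if 2 in h)
print(len(version_space), votes_for_2)  # 4 2
```

The S boundary here is just the disjunction of observed positives and G the negated disjunction of observed negatives, so no inductive leap is possible.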
Inductive Bias
Consider
- concept learning algorithm L
- instances X, target concept c
- training examples D_c = {⟨x, c(x)⟩}
- let L(x_i, D_c) denote the classification assigned to the instance x_i by L after training on data D_c

Definition: the inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples D_c

  (∀x_i ∈ X) [(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)]

where A ⊢ B means A logically entails B.
Inductive Systems and Equivalent Deductive Systems
[Figure: an inductive system (the candidate elimination algorithm using hypothesis space H, given training examples and a new instance) and an equivalent deductive system (a theorem prover given training examples, a new instance, and the assertion "H contains the target concept"); each outputs a classification of the new instance, or "don't know"]
Three Learners with Different Biases
1. Rote learner: store examples; classify x_i if and only if it matches a previously observed example.
2. Version space candidate elimination algorithm
3. Find-S
Summary Points
1. Concept learning as search through H
2. General-to-specific ordering over H
3. Version space candidate elimination algorithm
4. S and G boundaries characterize the learner's uncertainty
5. Learner can generate useful queries
6. Inductive leaps possible only if the learner is biased
7. Inductive learners can be modelled by equivalent deductive systems