Consider a concept learning problem where the data D, which concerns ancient Egyptian vases discovered in archeological excavations, is expressed as tuples of five attributes: damaged, color, material, kingdom, markings. Examples are classified as either valuable (+) or not valuable (-), and D consists of the following:

Assume that all possible values of each attribute are represented in D above.

(a) What is the size of the hypothesis space searched by the candidate elimination algorithm (CEA) using the data D given above?
(b) Suppose the CEA has seen examples 1 and 2 only so far. Show its current specific boundary S_2 and general boundary G_2 for the version space.
(c) Show S_3 and G_3 after the CEA also sees example 3.
(d) Show S_5 and G_5 after the CEA also sees the final two examples 4 and 5.
Expert Answer
- Initialize G to the set of maximally general hypotheses in H
- Initialize S to the set of maximally specific hypotheses in H
- For each training example d, do
  - If d is a positive example
    - Remove from G any hypothesis inconsistent with d
    - For each hypothesis s in S that is not consistent with d
      - Remove s from S
      - Add to S all minimal generalizations h of s such that h is consistent with d, and some member of G is more general than h
      - Remove from S any hypothesis that is more general than another hypothesis in S
  - If d is a negative example
    - Remove from S any hypothesis inconsistent with d
    - For each hypothesis g in G that is not consistent with d
      - Remove g from G
      - Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
      - Remove from G any hypothesis that is less general than another hypothesis in G
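The candidate elimination steps listed above can be sketched in Python as follows. This is a minimal illustration, not the answer to parts (b) through (d): it assumes conjunctive hypotheses written as tuples, with `"?"` meaning "any value" and `None` standing for the empty constraint. The function names, the three attributes, their values, and the three training examples are all hypothetical toy choices made here, since the table for D is not reproduced above.

```python
from typing import Iterator, List, Sequence, Set, Tuple

ANY = "?"      # wildcard constraint: any attribute value is accepted
EMPTY = None   # empty constraint: no attribute value is accepted

Hypothesis = Tuple[object, ...]
Instance = Tuple[str, ...]


def covers(h: Hypothesis, x: Instance) -> bool:
    """True if hypothesis h classifies instance x as positive."""
    return all(c == ANY or c == v for c, v in zip(h, x))


def more_general_or_equal(h1: Hypothesis, h2: Hypothesis) -> bool:
    """True if every instance covered by h2 is also covered by h1."""
    if EMPTY in h2:  # h2 covers nothing, so anything is at least as general
        return True
    return all(a == ANY or a == b for a, b in zip(h1, h2))


def min_generalization(s: Hypothesis, x: Instance) -> Hypothesis:
    """The unique minimal generalization of s that covers x."""
    return tuple(v if c is EMPTY else (c if c == v else ANY)
                 for c, v in zip(s, x))


def min_specializations(g: Hypothesis, x: Instance,
                        domains: Sequence[Sequence[str]]) -> Iterator[Hypothesis]:
    """Minimal specializations of g that exclude x (one wildcard replaced)."""
    for i, values in enumerate(domains):
        if g[i] == ANY:
            for v in values:
                if v != x[i]:
                    yield g[:i] + (v,) + g[i + 1:]


def candidate_elimination(examples: List[Tuple[Instance, bool]],
                          domains: Sequence[Sequence[str]]
                          ) -> Tuple[Set[Hypothesis], Set[Hypothesis]]:
    n = len(domains)
    S: Set[Hypothesis] = {tuple([EMPTY] * n)}  # maximally specific boundary
    G: Set[Hypothesis] = {tuple([ANY] * n)}    # maximally general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if covers(g, x)}
            new_S: Set[Hypothesis] = set()
            for s in S:
                if covers(s, x):
                    new_S.add(s)
                else:
                    h = min_generalization(s, x)
                    if any(more_general_or_equal(g, h) for g in G):
                        new_S.add(h)
            # Drop members of S that are more general than another member.
            S = {s for s in new_S
                 if not any(s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G: Set[Hypothesis] = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                else:
                    for h in min_specializations(g, x, domains):
                        if any(more_general_or_equal(h, s) for s in S):
                            new_G.add(h)
            # Drop members of G that are less general than another member.
            G = {g for g in new_G
                 if not any(g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G


# Hypothetical toy data (NOT the vase data D): (color, material, markings).
domains = [("red", "blue"), ("clay", "stone"), ("yes", "no")]
examples = [
    (("red", "clay", "yes"), True),
    (("blue", "clay", "yes"), True),
    (("red", "stone", "no"), False),
]
S, G = candidate_elimination(examples, domains)
print("S =", sorted(S))
print("G =", sorted(G))
```

On this toy input the sketch ends with S = {(?, clay, yes)} and G = {(?, clay, ?), (?, ?, yes)}. Running it on the real problem would require replacing `domains` and `examples` with the five attributes and five examples from D.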