[Figure: example feature terms T1-T5]
Inductive learning methods can be defined as those methods that systematically produce intensional concept descriptions from extensional ones. In other words, from the specific knowledge provided by domain examples, an inductive learning method is able to obtain general domain knowledge.
There are two families of inductive learning methods. One of them, which includes algorithms such as ID3 (Quinlan, 1986) or C4.5 (Quinlan, 1993), requires that domain objects be represented as attribute-value pairs. The other family comprises the relational techniques, including FOIL (Quinlan, 1990) and the Inductive Logic Programming (ILP) systems, which handle examples and domain knowledge represented as Horn clauses. In some tasks a structured representation of the domain objects is more natural: an object is represented by a set of features whose values may be, in turn, objects with their own features. Feature terms are a formalism for the structured representation of objects.
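To make the contrast concrete, here is a minimal sketch in Python; the object and its features are hypothetical, chosen only for illustration.

```python
# One hypothetical toy object, shown in both representations.

# Flat attribute-value pairs, as required by algorithms like ID3 or C4.5:
flat = {"fuel": "diesel", "cylinders": 4, "owner_age": 42}

# Structured, feature-term-like representation: the values of the
# features 'engine' and 'owner' are themselves objects with features.
structured = {
    "sort": "vehicle",
    "engine": {"sort": "engine", "fuel": "diesel", "cylinders": 4},
    "owner": {"sort": "person", "age": 42},
}
```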
In general, inductive methods can be characterized as search methods over a hypothesis space. The search techniques usually comply with certain biases: constraints upon the hypothesis space effectively searched, and strategies for searching certain subspaces before others. To constrain the hypothesis space, relational learners introduce a partial order between hypotheses (e.g. for Horn clauses it is θ-subsumption). Then, this ordered space is searched according to the bias of each inductive method.
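As a rough sketch of how such an ordered search proceeds, the following Python fragment enumerates attribute-value hypotheses from general to specific. All names are hypothetical, and the wildcard ordering is a simplified stand-in for θ-subsumption, not any particular learner's bias.

```python
# General-to-specific search over attribute-value hypotheses.
# '?' is a wildcard matching any value; "covers more examples"
# induces the partial order used to structure the search.

def covers(hyp, example):
    """A hypothesis covers an example if every non-wildcard feature matches."""
    return all(v == "?" or example.get(f) == v for f, v in hyp.items())

def specializations(hyp, domains):
    """One step down the partial order: fix a single wildcard to a value."""
    for f, v in hyp.items():
        if v == "?":
            for value in domains[f]:
                yield {**hyp, f: value}

def most_specific_covering(hyp, positives, domains):
    """Search bias: greedily specialize while all positives stay covered."""
    for child in specializations(hyp, domains):
        if all(covers(child, e) for e in positives):
            return most_specific_covering(child, positives, domains)
    return hyp
```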
Feature terms offer a representation formalism that is a subset of first-order logic where subsumption provides a well-defined and natural way to define generalization relationships: a feature term ψ is more general than (or equal to) another feature term ψ' if and only if ψ subsumes ψ'.
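Under this definition, a subsumption test can be sketched directly on the nested representation above. The following Python function is a deliberate simplification: real feature-term subsumption also involves a sort hierarchy and path equality (shared variables), both omitted here.

```python
def subsumes(general, specific):
    """True if `general` subsumes `specific`: every feature present in
    `general` must appear in `specific` with a value that is itself
    subsumed. Atoms must match exactly (no sort hierarchy here)."""
    if isinstance(general, dict):
        return (isinstance(specific, dict) and
                all(f in specific and subsumes(v, specific[f])
                    for f, v in general.items()))
    return general == specific
```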
The inductive learning methods that we have developed (INDIE and DISC) use feature terms to uniformly represent examples and generalizations, so both are considered partial descriptions and have a structured representation. The "more general than" relation commonly used in machine learning corresponds here to the subsumption relation among feature terms: saying that a description D subsumes an example e1 is equivalent to saying that D is more general than e1.
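For instance, using the `subsumes` sketch above (D and e1 are hypothetical terms, not taken from INDIE or DISC):

```python
D  = {"sort": "vehicle", "engine": {"fuel": "diesel"}}
e1 = {"sort": "vehicle",
      "engine": {"sort": "engine", "fuel": "diesel", "cylinders": 4},
      "owner": {"sort": "person", "age": 42}}

assert subsumes(D, e1)       # D is more general than the example e1
assert not subsumes(e1, D)   # the converse does not hold
```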
The anti-unification of two examples e1 and e2 is a description that is also a term of the formalism, and it is a most specific generalization subsuming both e1 and e2. In our example above, T2 is the anti-unification of the feature terms T4 and T5.
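A sketch of anti-unification in the same simplified setting (again ignoring sort hierarchies and shared variables) keeps exactly the structure on which the two terms agree:

```python
def anti_unify(t1, t2):
    """Most specific generalization of two terms: keep each common
    feature whose values can themselves be generalized; distinct
    atoms have no common structure, so the feature is dropped."""
    if isinstance(t1, dict) and isinstance(t2, dict):
        out = {}
        for f in t1.keys() & t2.keys():
            g = anti_unify(t1[f], t2[f])
            if g is not None:
                out[f] = g
        return out
    return t1 if t1 == t2 else None
```

By construction, the result subsumes both inputs under the `subsumes` sketch given earlier.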
Induction from a set of examples means obtaining a description that generalizes those examples. When the examples have a structured representation (feature terms), induction amounts to searching for a term that subsumes the examples, themselves represented as terms. Thus, if T3, T4 and T5 are considered examples, the feature term T1 can be viewed as their generalization.
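Putting the sketches together, induction over a set of example terms can be approximated by folding anti-unification across them (this reuses `anti_unify` from above; the example terms are hypothetical):

```python
from functools import reduce

examples = [
    {"sort": "vehicle", "engine": {"fuel": "diesel", "cylinders": 4}},
    {"sort": "vehicle", "engine": {"fuel": "diesel", "cylinders": 6}},
    {"sort": "vehicle", "engine": {"fuel": "diesel", "cylinders": 4},
     "owner": {"sort": "person"}},
]

# A term subsuming every example: the fold of pairwise anti-unifications.
generalization = reduce(anti_unify, examples)
print(generalization)
# e.g. {'sort': 'vehicle', 'engine': {'fuel': 'diesel'}} (key order may vary)
```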
References
Quinlan, J.R. (1986), Induction of Decision Trees. Machine Learning, 1, pp. 81-106.
Quinlan, J.R. (1990), Learning Logical Definitions from Relations. Machine Learning, 5, pp. 239-266.
Quinlan, J.R. (1993), C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, California.