When bioscientists want to analyze the behaviour of a molecule, they must first obtain it in pure form from a biological sample. This sample could be a biological tissue (animal or plant) that contains, besides the molecule of interest, hundreds of other molecules that must be removed. Protein purification is the process that isolates one molecule from many others. Once the molecule of interest has been isolated, its structure, function, electrical and physical properties and behaviour can be analyzed. The development of techniques and methods for the separation and purification of biological macromolecules (such as proteins) has been an important prerequisite for many of the advances made in biosciences and biotechnology over the past three decades. The main problems that can appear in a purification process are generally related to denaturation, proteolysis and contamination with pyrogens, nucleic acids, bacteria and viruses.
The usual way to find a purification procedure is to search the literature for previous purifications of the protein of interest. If the same source as in the published experiments can be used, the same purification process will be useful. The main difficulty is that the sources used in the literature may be unavailable. In that case, the purification process has to be modified according to the characteristics of the available source. The chosen purification process is optimized by systematically varying parameters such as the composition of the extraction method. The extraction of a protein from a solid source implies a compromise between the recovery of the protein and its purity.
CHROMA has a case base containing experiments obtained from the literature (the journal Comparative Biochemistry and Physiology). CHROMA searches this case base, and the result is one or several experiments close to the new experiment, providing a first approximation of how the protein of interest can be purified. We want to emphasize that the adequacy of the proposed solution can only be evaluated in the laboratory, which makes CHROMA difficult to evaluate.
The main task of CHROMA is the purification task. Given a new experiment and a base of solved experiments, the goal of the purification task is to find a sequence of chromatographic techniques (a purification plan) that purifies the protein of the new experiment.
The domain expert uses different strategies to find a purification plan:
M1) Searching for an experiment using exactly the same sample for the same protein.
M2) Searching for experiments purifying the same protein but from other kinds of sample. If more than one is found, the domain expert chooses one of them according to some specific criteria.
M3) If the sample of the current experiment satisfies some specific domain properties (e.g. the current protein belongs to a special family of proteins), the domain expert knows which purification plan to apply without searching for past experiments.
M4) If the domain expert has not found any experiment in the literature purifying the protein of the current experiment, he tries to build a purification plan by trial and error in the laboratory. The steps of this purification plan are built according to the characteristics of each purification technique.
Each of these strategies has been modelled in CHROMA by a different problem solving method. In particular, strategy M1 has been modelled by the equal-sample method, which detects whether there is an experiment in the case base having the same protein and sample as the current experiment.
The analogy-by-determination method is a case-based method, used to model strategy M2, that retrieves experiments from the case base that purify the same protein. Given a protein P, several experiments purifying P can be retrieved. The analogy-by-determination method interacts with the user in order to let him decide the most appropriate one.
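The two retrieval steps described above can be sketched as follows. This is only an illustrative sketch, not CHROMA's implementation: the `Experiment` record, the placeholder case base, and the `choose` callback (standing in for the interaction with the user) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Experiment:
    """Hypothetical case record: protein, sample and purification plan."""
    protein: str
    sample: str
    plan: Tuple[str, ...]  # ordered chromatographic techniques


# Placeholder case base (assumption): names are not literature data.
CASE_BASE = [
    Experiment("protein-P", "sample-X", ("technique-1", "technique-2")),
    Experiment("protein-P", "sample-Y", ("technique-3",)),
    Experiment("protein-Q", "sample-X", ("technique-2", "technique-4")),
]


def equal_sample(case_base: List[Experiment], protein: str,
                 sample: str) -> Optional[Tuple[str, ...]]:
    """Return the plan of a case with the same protein and sample, if any."""
    for case in case_base:
        if case.protein == protein and case.sample == sample:
            return case.plan
    return None


def analogy_by_determination(
    case_base: List[Experiment],
    protein: str,
    choose: Callable[[List[Experiment]], Experiment],
) -> Optional[Tuple[str, ...]]:
    """Retrieve all cases purifying the protein and let `choose` pick one."""
    candidates = [c for c in case_base if c.protein == protein]
    if not candidates:
        return None
    return choose(candidates).plan
```

For instance, `analogy_by_determination(CASE_BASE, "protein-P", lambda cs: cs[0])` selects the first retrieved experiment; in an interactive setting the callback would present the candidates to the user instead.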
Strategy M3 has been modelled by a classification method called purify-by-class. This method uses intensional concept descriptions to determine the class to which an experiment belongs. The purify-by-class method needs two input models: new experiment and class descriptions. The new experiment model contains the description of a sample from which a protein has to be purified. The class descriptions model contains the descriptions of the classes to which a purification experiment can belong. This model is not provided by the domain expert, so during the KM analysis a KA-Task has to be associated with it. This KA-Task is solved using a learning method that induces the descriptions of the classes from the experiments contained in the experiments model.
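As a rough illustration of classification with intensional descriptions, the sketch below represents each class description as a mapping from a feature to the set of values it allows, and attaches a plan to each class. This representation, the feature names and the class names are assumptions made for the example; the learning method that induces the descriptions is not shown.

```python
from typing import Dict, Optional, Set, Tuple

Sample = Dict[str, str]            # feature -> value of the new experiment
Description = Dict[str, Set[str]]  # feature -> values allowed by the class

# Hypothetical class descriptions model: class -> (description, plan).
CLASS_DESCRIPTIONS: Dict[str, Tuple[Description, Tuple[str, ...]]] = {
    "class-1": (
        {"feature-a": {"v1", "v2"}, "feature-b": {"v5"}},
        ("technique-1", "technique-2"),
    ),
    "class-2": ({"feature-a": {"v3"}}, ("technique-3",)),
}


def satisfies(sample: Sample, description: Description) -> bool:
    """True if every constrained feature of the sample has an allowed value."""
    return all(sample.get(f) in allowed for f, allowed in description.items())


def purify_by_class(
    sample: Sample,
    class_descriptions: Dict[str, Tuple[Description, Tuple[str, ...]]],
) -> Optional[Tuple[str, Tuple[str, ...]]]:
    """Return (class name, plan) for the first class the sample satisfies."""
    for name, (description, plan) in class_descriptions.items():
        if satisfies(sample, description):
            return name, plan
    return None
```

A sample missing a constrained feature does not satisfy the description, so the method fails (returns `None`) when no class matches.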
During the KM analysis of the domain, four PSMs have been associated with the purification task. Let us suppose that in the CHROMA application the methods equal-sample, purify-by-class, analogy-by-determination, and default-plan are tried sequentially in this order. If a new experiment aims to purify a protein that is not used in any experiment of the case base, the only applicable method is default-plan. With the sequential order, all the other methods have to be executed (and fail) before the solution is obtained from default-plan.
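The cost of the sequential order can be seen in a small sketch. The method order, the stub methods and the placeholder names below are assumptions for illustration only; each stub simply fails when the protein is unknown, which is the situation discussed above.

```python
from typing import Callable, Dict, List, Optional, Tuple

Plan = Tuple[str, ...]
Method = Callable[[Dict[str, str]], Optional[Plan]]

KNOWN_PROTEINS = {"protein-P"}  # hypothetical proteins present in the case base


def case_based_stub(problem: Dict[str, str]) -> Optional[Plan]:
    """Stand-in for a method that fails when the protein is unknown."""
    if problem["protein"] in KNOWN_PROTEINS:
        return ("technique-1",)
    return None


def default_plan(problem: Dict[str, str]) -> Optional[Plan]:
    """Stand-in for the default method, which always proposes a plan."""
    return ("default-technique",)


# Assumed fixed order of the four methods.
METHODS: List[Tuple[str, Method]] = [
    ("equal-sample", case_based_stub),
    ("purify-by-class", case_based_stub),
    ("analogy-by-determination", case_based_stub),
    ("default-plan", default_plan),
]


def solve_sequentially(
    methods: List[Tuple[str, Method]], problem: Dict[str, str]
) -> Tuple[Optional[Plan], List[str]]:
    """Try each method in order; return the first plan and the methods tried."""
    tried = []
    for name, method in methods:
        tried.append(name)
        plan = method(problem)
        if plan is not None:
            return plan, tried
    return None, tried


plan, tried = solve_sequentially(METHODS, {"protein": "protein-new"})
```

For the unknown protein, `tried` lists all four methods: the first three are executed only to fail before `default-plan` produces the plan.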
The KM analysis of the domain suggests a more intelligent strategy to select the appropriate method. We propose to use a lazy problem-centred selection of the method, taking into account an Applicability Conditions model and a Preferences model. In particular, the Applicability Conditions model in CHROMA contains the following knowledge:
If the protein of the current problem is not purified in any experiment in the case base, the only applicable method is the default-plan method.
If there is no experiment in the case base using the same sample as the current problem, the applicable methods include the analogy-by-determination method.
If the sample of the current problem does not satisfy any class description, the applicable methods are the analogy-by-determination method and the default-plan method. As we will see later, to evaluate this condition CHROMA needs an additional model called control sample.
The Preferences model contains preferences provided by the domain expert in order to choose one method if more than one is applicable. In CHROMA the Preferences model contains the following preferences:
1) If applicable, equal-sample is preferable to the others (since an identical precedent assures an appropriate solution).
2) default-plan is the least preferable.
3) analogy-by-determination and purify-by-class are equally preferable if both are applicable.
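One way to read the preferences is as a ranking over the methods, where equally preferable methods share a rank. The sketch below is an assumption-laden illustration: the numeric ranks are invented for the example, and it assumes that the method tied with purify-by-class is analogy-by-determination.

```python
from typing import Iterable, List

# Hypothetical encoding of the preferences: a lower rank is more preferable.
PREFERENCE_RANK = {
    "equal-sample": 0,               # preferable to all others
    "analogy-by-determination": 1,   # assumed tie with purify-by-class
    "purify-by-class": 1,
    "default-plan": 2,               # least preferable
}


def most_preferred(applicable: Iterable[str]) -> List[str]:
    """Return the applicable methods sharing the best rank, sorted by name."""
    applicable = list(applicable)
    if not applicable:
        return []
    best = min(PREFERENCE_RANK[m] for m in applicable)
    return sorted(m for m in applicable if PREFERENCE_RANK[m] == best)
```

When the result contains two methods, the preferences alone do not decide between them; either could be tried first.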
The lazy problem-centred strategy has been implemented using a selection task at the meta-level of the purification task. The selection task has as input the control sample model, which contains the description of a sample S. Each feature A of the sample S has as values the disjunction of the values that A takes in all the case base experiments. The selection task is solved using the following method:
If there is no experiment in the case base using the current protein, the purification plan is always obtained using the default-plan method.
If the protein was already used and there is an experiment having the same sample as the new one, the equal-sample method can be used.
If the new experiment belongs to some solution class, the purify-by-class method can be used (the analogy-by-determination method could also be used).
Otherwise, the new experiment can only be solved using the analogy-by-determination method.
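The control sample and the selection steps can be sketched as below. This is a minimal sketch under stated assumptions: samples are represented as feature-to-value dictionaries, the three boolean inputs are assumed to be computed elsewhere, and the method returned in the final branch is an assumption, since the text does not name it unambiguously.

```python
from typing import Dict, Iterable, List, Set

Sample = Dict[str, str]


def build_control_sample(case_samples: Iterable[Sample]) -> Dict[str, Set[str]]:
    """For each feature, collect every value it takes in the case base."""
    control: Dict[str, Set[str]] = {}
    for sample in case_samples:
        for feature, value in sample.items():
            control.setdefault(feature, set()).add(value)
    return control


def select_methods(
    protein_in_case_base: bool,
    same_sample_found: bool,
    matches_some_class: bool,
) -> List[str]:
    """Follow the selection steps and return the methods that can be used."""
    if not protein_in_case_base:
        return ["default-plan"]
    if same_sample_found:
        return ["equal-sample"]
    if matches_some_class:
        return ["purify-by-class", "analogy-by-determination"]
    # Assumption: the remaining case falls back to analogy-by-determination.
    return ["analogy-by-determination"]


# Hypothetical samples from two case base experiments.
control_sample = build_control_sample([
    {"feature-a": "v1", "feature-b": "v5"},
    {"feature-a": "v2", "feature-b": "v5"},
])
```

Here `control_sample` maps `feature-a` to the set of `v1` and `v2`, and `feature-b` to the set containing only `v5`. Because the selection runs before any purification method, an unknown protein goes straight to `default-plan` without executing the other methods.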
A detailed description of CHROMA and the methods that it uses can be found in:
E. Armengol, E. Plaza (1995); Integrating induction in a Case-based Reasoner. Lecture Notes in Artificial Intelligence. Springer. num. 984, pp. 3-17. (Extended version IIIA-RR-95-02)