Machine Learning

The use of complementary techniques of machine learning to discover knowledge in real complex domains

Publication Type:

Thesis

Source:

Universitat Politècnica de Catalunya, Barcelona, p.240 (2002)

Keywords:

Artificial Intelligence; clinical data; data analysis and modeling; data mining; diagnosis; Fuzzy; fuzzy representation; Machine Learning; WOWA aggregation

Abstract:

This thesis is concerned with developing and refining a collection of methods and tools which can be applied to the different steps of the Data Mining process. Data Mining is understood as the analysis of data using sophisticated tools and methods, which include aspects of data representation, data exploration, knowledge discovery, data modelling and data aggregation. Data Mining can be applied in real and complex domains, such as the domain of clinical prognosis, as well as to artificial test (benchmark) data. Medical informatics is a dynamic area where new approaches and techniques are constantly being developed, the objective being to improve current data representation, modelling and aggregation methods to achieve better diagnosis and prognosis. In this work we focus on two medical data domains: prognosis for ICU patients and diagnosis of Sleep Apnea cases, although it is proposed that the techniques have general use for any data domain. Fuzzy logic techniques are a key approach used for data processing and representation. Existing techniques, such as neural networks, tree induction and standard statistical analysis methods (correlation, principal components and regression models), are benchmarked on the data.
We carry out a survey of existing techniques, authors and their approaches, in order to establish their strong and weak points, limitations, and opportunities where improvement may be achieved.
The first major area under consideration is data representation: how to define a unified scheme which encompasses different data types, such as numeric, continuous, ordered categorical, unordered categorical, binary and fuzzy; how to define membership functions; how to measure differences and similarities in the data. This is followed by a comprehensive benchmarking of existing AI and statistical algorithms on a real ICU medical dataset, comparing the ‘Data Mining’ results to methods proposed by the author.
We define ‘fuzzy covariance’ as a value which permits the measurement of the relation between two fuzzy variables. Previous fuzzy covariance work was limited to the covariance of a fuzzy cluster to its fuzzy prototype [Gustafson79]. More recent authors [Nakamori97][Wangh95][Watada94] have created specialised fuzzy covariance calculations tailored for specific applications. In this work, a general fuzzy covariance algorithm, which measures the fuzzy covariance between two fuzzy variables, has been conceived, developed and tested. The initial work, based on the Hartigan joining algorithm and fuzzy covariances, evolves into and is contrasted with the later work on data and attribute fusion using the WOWA aggregation operator.
‘Aggregation operators’ are considered as a method for modelling data for clinical diagnosis. They use ‘relevance’ and ‘reliability’ meta-data together with grades of membership to enhance the information which the aggregation operator receives in order to model the data. We also make enhancements to the WOWA operator to enable it to process data with missing values, and we develop a novel method for learning the weighting vectors.
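As an illustration of the kind of operator referred to above, the following is a minimal sketch of the standard WOWA (Weighted OWA) aggregation, not the enhanced version developed in the thesis. It assumes two weighting vectors that each sum to 1: `p` (importance of each data source) and `w` (importance of each rank position). The function name `wowa` and the use of piecewise-linear interpolation for the quantifier function are assumptions made for this example; missing values and the learning of the weighting vectors are not covered.

```python
def wowa(values, p, w):
    """Sketch of WOWA aggregation with a piecewise-linear quantifier.

    values: numbers to aggregate
    p:      importance weight of each value's source (sums to 1)
    w:      OWA weight of each rank position, largest first (sums to 1)
    """
    n = len(values)
    if n == 0 or len(p) != n or len(w) != n:
        raise ValueError("values, p and w must be non-empty and of equal length")

    # Interpolation nodes (i/n, w_1 + ... + w_i), starting at (0, 0).
    cum_w = [0.0]
    for wi in w:
        cum_w.append(cum_w[-1] + wi)

    def w_star(x):
        # Piecewise-linear interpolation of the nodes (an assumption of
        # this sketch; other monotone interpolations are possible).
        if x <= 0.0:
            return 0.0
        if x >= 1.0:
            return cum_w[-1]
        k = min(int(x * n), n - 1)
        t = x * n - k
        return cum_w[k] + t * (cum_w[k + 1] - cum_w[k])

    # Visit the values from largest to smallest, accumulating the
    # importance weights of the sources seen so far.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    acc_p = 0.0
    prev = 0.0
    result = 0.0
    for i in order:
        acc_p += p[i]
        cur = w_star(acc_p)
        result += (cur - prev) * values[i]  # omega_i * a_sigma(i)
        prev = cur
    return result
```

With uniform `w` the sketch reduces to the weighted mean defined by `p`, and with uniform `p` it reduces to the OWA operator defined by `w`, which is the behaviour expected of WOWA.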

Context-GMM: Incremental Learning of Sparse Priors for Gaussian Mixture Regression

Publication Type:

Conference Paper

Source:

2012 IEEE International Conference on Robotics and Biomimetics (ROBIO 2012), Guangzhou, China (2012)

Keywords:

Probabilistic Learning; Gaussian Mixtures Learning

Incremental Learning of an Optical Flow Model for Sensorimotor Anticipation in a Mobile Robot

Publication Type:

Conference Paper

Source:

ICDL-EpiRob 2012: IEEE Conference on Development and Learning and Epigenetic Robotics, IEEE, San Diego, California (2012)

Keywords:

Developmental learning; Developmental robotics

Melody, bassline and harmony representations for music version identification

Publication Type:

Conference Paper

Source:

Int. World Wide Web Conf., Workshop on Advances on Music Information Retrieval (AdMIRe), WWW, Lyon, France, p.887-894 (2012)

URL:

http://www2012.wwwconference.org/proceedings/forms/companion.htm#8

Abstract:

In this paper we compare the use of different musical representations for the task of version identification (i.e. retrieving alternative performances of the same musical piece). We automatically compute descriptors representing the melody and bass line using a state-of-the-art melody extraction algorithm, and compare them to a harmony-based descriptor. The similarity of descriptor sequences is computed using a dynamic programming algorithm based on nonlinear time series analysis which has been successfully used for version identification with harmony descriptors. After evaluating the accuracy of individual descriptors, we assess whether performance can be improved by descriptor fusion, for which we apply a classification approach, comparing different classification algorithms. We show that both melody and bass line descriptors carry useful information for version identification, and that combining them increases version detection accuracy. Whilst harmony remains the most reliable musical representation for version identification, we demonstrate how in some cases performance can be improved by combining it with melody and bass line descriptions. Finally, we identify some of the limitations of the proposed descriptor fusion approach, and discuss directions for future research.

Combining two lazy learning methods for classification and knowledge discovery.

Publication Type:

Conference Paper

Source:

International Conference on Knowledge Discovery and Information Retrieval, INSTICC, Senart, Paris (2011)

Keywords:

Machine Learning; Lazy learning methods; knowledge discovery; classification; medical diagnosis

Abstract:

The goal of this paper is to construct a classifier for diagnosing malignant melanoma. We experimented with two lazy learning methods, k-NN and LID, and compared their results with the ones produced by decision trees. We performed this comparison because we are also interested in building a domain model that can serve as a basis for dermatologists to propose a good characterization of early melanomas. We show that lazy learning methods perform better than decision trees in terms of sensitivity and specificity. We have seen that both lazy learning methods produce complementary results (k-NN has high specificity and LID has high sensitivity), suggesting that a combination of both could be a good classifier. We report experiments confirming this point. Concerning the construction of a domain model, we propose to use the explanations provided by the lazy learning methods, and we see that the resulting theory is as predictive and useful as the one obtained from decision trees.

Similarity Measures over Refinement Graphs

Publication Type:

Journal Article

Source:

Machine Learning, Volume 87, Issue 1, p.57-92 (2012)

Keywords:

CBR; Similarity; Machine Learning; Feature Terms

Abstract:

Similarity assessment plays a key role in lazy learning methods such as k-nearest neighbor or case-based reasoning. In this paper we show how refinement graphs, which were originally introduced for inductive learning, can be employed to assess and reason about similarity. We define and analyze two similarity measures, $S_{\lambda}$ and $S_{\pi}$, based on refinement graphs. The anti-unification-based similarity, $S_{\lambda}$, assesses similarity by finding the anti-unification of two instances, which is a description capturing all the information common to these two instances. The property-based similarity, $S_{\pi}$, is based on a process of disintegrating the instances into a set of properties, and then analyzing these property sets.
Moreover, these similarity measures are applicable to any representation language for which a refinement graph that satisfies the requirements we identify can be defined. Specifically, we present a refinement graph for feature terms, in which several languages of increasing expressiveness can be defined. The similarity measures are empirically evaluated on relational data sets belonging to languages of different expressiveness.

Empirical hardness for mixed auctions

Publication Type:

Book Chapter

Source:

Lecture notes in computer science, Springer, Volume 5988, p.161-170 (2010)

Analysing the behaviour of robot teams through relational sequential pattern mining

Publication Type:

Report

Source:

CoRR arXiv:1010.6234v1 [cs.AI] (2010)

Using Transfer Learning to Speed-Up Reinforcement Learning: A Case-Based Approach

Publication Type:

Conference Paper

Source:

IEEE Latin American Robotics Symposium and Intelligent Robotics Meeting, IEEE, Brasil, p.55-60 (2010)

Concept Convergence in Empirical Domains

Publication Type:

Conference Paper

Source:

DS10: 13th International Conference on Discovery Science, p.281-295 (2010)

Keywords:

Machine Learning; Argumentation
