Publications

Supervised learning using Mahalanobis distance for record linkage

Publication Type:

Conference Proceedings

Source:

6th International Summer School on Aggregation Operators (AGOP 2011), Lulu.com, Univ. of Sannio, Benevento, Italy, pp. 223-228 (2011)

ISBN:

978-1-4477-7019-0

URL:

http://agop2011.ciselab.org/proceedings

Keywords:

data privacy; record linkage; disclosure risk; Mahalanobis distance; fuzzy measure; Choquet integral

Abstract:

In data privacy, record linkage is a well-known technique for evaluating the disclosure risk of protected data. The idea is to link records from different databases that refer to the same individuals. In this paper we introduce a new parametrized variation of record linkage based on the Mahalanobis distance, together with a supervised learning method that determines the optimal simulated covariance matrix for the linkage process. We evaluate our proposal and compare it with other parametrized and non-parametrized variations of record linkage, such as those based on the weighted mean or on the Choquet integral, for which the optimal fuzzy measure is determined.
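
As a rough illustration of distance-based record linkage with a Mahalanobis-type distance, the following minimal sketch links each protected record to its nearest original record and counts correct re-identifications. It is not the method of the paper: the function name `link_records`, the synthetic data, and the additive-noise protection are assumptions made for the example, and the matrix is simply the inverse sample covariance rather than one obtained by supervised learning.

```python
import numpy as np

def link_records(original, protected, sigma_inv):
    """Return, for each protected record, the index of the nearest
    original record under d(a, b)^2 = (a - b)^T sigma_inv (a - b).
    (Hypothetical helper, named for this example only.)"""
    # Pairwise differences, shape (n_protected, n_original, n_attributes).
    diffs = protected[:, None, :] - original[None, :, :]
    # Squared Mahalanobis-type distance for every pair of records.
    dists = np.einsum("mnd,de,mne->mn", diffs, sigma_inv, diffs)
    return dists.argmin(axis=1)

# Synthetic data (assumption): 50 records with 3 numeric attributes,
# "protected" by adding a small amount of Gaussian noise.
rng = np.random.default_rng(0)
original = rng.normal(size=(50, 3))
protected = original + rng.normal(scale=0.05, size=original.shape)

# Stand-in for the learned matrix: inverse of the sample covariance.
sigma_inv = np.linalg.inv(np.cov(original, rowvar=False))

links = link_records(original, protected, sigma_inv)
# Record i of `protected` was derived from record i of `original`,
# so a link is correct when links[i] == i.
reidentified = int((links == np.arange(len(original))).sum())
print(f"re-identified {reidentified} of {len(original)} records")
```

The share of correctly linked records serves as an empirical estimate of disclosure risk. A supervised variant, as described in the abstract, would replace `sigma_inv` with a matrix chosen to maximize that share on labeled record pairs.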

Projects: