Heuristic Supervised Approach for Record Linkage
Publication Type:Conference Paper
Source:Modeling Decisions for Artificial Intelligence (MDAI), Springer Berlin / Heidelberg, Volume 7647, Girona, Catalonia, p.210-221 (2012)
Record linkage is a well-known technique used to link records from one database to records from another database that refer to the same individuals. Although it is usually used in database integration, it is also used in the data privacy field to evaluate the disclosure risk of protected datasets. In this paper we compare two supervised algorithms that rely on distance-based record linkage, specifically using the Choquet fuzzy integral to compute the distance between records. The first approach solves a linear optimization problem that determines the optimal fuzzy measure for the linkage, while the second is a gradient-type algorithm with constraints for identifying the fuzzy measure. We show the advantages and drawbacks of both algorithms and the situations in which each works better.
Choquet integral for record linkage
Publication Type:Journal Article
Source:Annals of Operations Research, Springer US, Volume 195, Issue 1, p.97-110 (2012)
Record linkage is used in data privacy to evaluate the disclosure risk of protected data. It models potential attacks in which an intruder attempts to link records from the protected data to the original data. In this paper we introduce a novel distance-based record linkage method, which uses the Choquet integral to compute the distance between records. We use a fuzzy measure to weight each subset of variables from each record. This allows us to improve on standard record linkage and provides insightful information about the re-identification risk of each variable and of the interactions between variables. To do that, we use a supervised learning approach that determines the optimal fuzzy measure for the linkage.
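To make the idea of a Choquet-integral distance concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the per-variable squared difference, the variable names (age, zip, income) and the helper names choquet_integral, choquet_distance and additive_measure are all hypothetical. The fuzzy measure is stored as a dictionary from subsets of variables to values in [0, 1].

```python
from itertools import combinations


def choquet_integral(values, mu):
    """Discrete Choquet integral of non-negative `values` (dict: variable -> number)
    with respect to a fuzzy measure `mu` (dict: frozenset of variables -> [0, 1])."""
    order = sorted(values, key=values.get)  # variables by increasing value
    total, previous = 0.0, 0.0
    for k, var in enumerate(order):
        coalition = frozenset(order[k:])  # variables whose value is >= the current one
        total += (values[var] - previous) * mu[coalition]
        previous = values[var]
    return total


def choquet_distance(a, b, mu):
    """Distance between two records: Choquet integral of the per-variable
    squared differences (the squared difference is an assumption of this sketch)."""
    diffs = {var: (a[var] - b[var]) ** 2 for var in a}
    return choquet_integral(diffs, mu)


def additive_measure(weights):
    """Build the additive fuzzy measure induced by per-variable weights.
    With this measure the Choquet integral reduces to a weighted mean."""
    names = list(weights)
    return {frozenset(c): sum(weights[v] for v in c)
            for r in range(len(names) + 1)
            for c in combinations(names, r)}


# Hypothetical variables and weights, chosen only for illustration.
weights = {"age": 0.5, "zip": 0.3, "income": 0.2}
mu_additive = additive_measure(weights)

a = {"age": 1.0, "zip": 2.0, "income": 3.0}
b = {"age": 0.0, "zip": 0.0, "income": 0.0}

d_additive = choquet_distance(a, b, mu_additive)

# A non-additive measure: the pair {zip, income} counts for more than the
# sum of its parts, which models an interaction between the two variables.
mu_interaction = dict(mu_additive)
mu_interaction[frozenset({"zip", "income"})] = 0.9
d_interaction = choquet_distance(a, b, mu_interaction)
```

With the additive measure the result coincides with the weighted mean of the squared differences. Raising the value assigned to the pair {zip, income} increases the distance, which is how a fuzzy measure can express interactions that a plain weight vector cannot.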
Sequential mixed auctions
Solving Sequential Mixed Auctions with Integer Programming
Improving function filtering for computationally demanding DCOPs
Publication Type:Conference Paper
Source:Workshop on Distributed Constraint Reasoning at IJCAI 2011, Barcelona, p.99-111 (2011)
In this paper we focus on solving DCOPs in computationally demanding scenarios. GDL solves DCOPs optimally, but it requires exponentially large cost functions, which makes it impractical in such settings. Function filtering is a technique that reduces the size of cost functions. We improve the effectiveness of function filtering to reduce the amount of resources required to solve DCOPs optimally. As a result, we enlarge the range of problems solvable by algorithms that employ function filtering.
Container loading for nonorthogonal objects: an approximation using local search and simulated annealing
Supervised Learning Methods on Distance Based Record Linkage
Source:Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain, p.25 (2010)
Keywords:record linkage; data privacy; disclosure risk; optimization; fuzzy measure; Choquet integral
Record linkage is the task of identifying records that correspond to the same entity in one or more data sources. In the data privacy context, it can be used to evaluate the disclosure risk of protected data by counting the number of records linked between a data set and its protected version. In this project we introduce two parametrized variations of distance-based record linkage. One uses a weighted mean and the other the Choquet integral to compute the distance between records. These methods allow us to improve on standard record linkage and provide insightful information about the re-identification risk of each variable and, in the second method, of the interactions between variables. To do that, we apply a supervised learning approach to both methods, which determines the optimal weights and the optimal fuzzy measure, respectively, for maximizing the linkage between two data files.
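The weighted-mean variant and the linkage count it is meant to maximize can be sketched in a few lines of Python. This is only an illustration under assumptions that are not taken from the abstract: records are numeric tuples, protected record i is the masked version of original record i, the per-variable distance is a squared difference, and the names weighted_distance and linked_fraction as well as the toy data are hypothetical.

```python
def weighted_distance(a, b, weights):
    """Weighted mean of per-variable squared differences between two records."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def linked_fraction(original, protected, weights):
    """Fraction of protected records whose nearest original record is the
    one they were derived from (assumed to share the same index)."""
    hits = 0
    for i, p in enumerate(protected):
        nearest = min(range(len(original)),
                      key=lambda j: weighted_distance(original[j], p, weights))
        hits += (nearest == i)
    return hits / len(protected)


# Toy data: the first variable is only lightly perturbed by the protection,
# the second one is heavily perturbed.
original = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
protected = [(1.1, 29.0), (2.2, 11.0), (2.9, 21.0)]

equal_weights = linked_fraction(original, protected, (0.5, 0.5))
first_only = linked_fraction(original, protected, (1.0, 0.0))
```

On this toy data equal weights link no record correctly, while putting all the weight on the lightly perturbed variable links every record. A supervised method would search for such weights automatically, and the weights it finds indicate which variables carry the most re-identification risk.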