Statistical Disclosure Control (SDC) has been a very active area of research in recent years. The goal is to transform an original database X into a protected one X', such that X' does not reveal any relation between the confidential and (quasi-)identifier attributes of the original data, and such that X' can still be used to compute reliable statistical information about X.
Several properties of confidential data have been defined and discussed in the literature. However, most of them have been studied mainly from a theoretical point of view, with the final implementation dismissed as a trivial problem for data practitioners to solve. In this seminar, I will present three such properties (k-anonymity, p-sensitiveness and l-diversity) from both points of view: theoretical research results and practical implementations. All the implementations used in the experiments presented in this seminar were obtained by modifying the CBFS algorithm, a very practical and intuitive microaggregation technique. In addition, I will present some results on applying CBFS to very large databases.
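
As a rough illustration of what two of these properties require of a protected table, the following minimal sketch measures the k-anonymity level and a distinct-values form of l-diversity. It is not the CBFS algorithm and not the implementation used in the seminar; the function names, the attribute names and the toy records are all assumptions made for this example, and "distinct" l-diversity is only one of the variants found in the literature.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def anonymity_level(records, quasi_identifiers):
    """Largest k for which the (non-empty) table is k-anonymous:
    the size of the smallest quasi-identifier group."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(len(group) for group in groups.values())


def distinct_diversity_level(records, quasi_identifiers, confidential):
    """Largest l for distinct l-diversity: the smallest number of
    different confidential values found in any group."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(
        len({record[confidential] for record in group})
        for group in groups.values()
    )


# Hypothetical protected table X' (toy data, for illustration only).
protected = [
    {"age": "30-39", "zip": "081**", "disease": "flu"},
    {"age": "30-39", "zip": "081**", "disease": "asthma"},
    {"age": "40-49", "zip": "082**", "disease": "flu"},
    {"age": "40-49", "zip": "082**", "disease": "flu"},
]

k = anonymity_level(protected, ["age", "zip"])
l = distinct_diversity_level(protected, ["age", "zip"], "disease")
```

In this toy table every quasi-identifier combination appears twice, so the table is 2-anonymous, yet the second group holds a single disease value, so its diversity level is only 1. This is the kind of gap between k-anonymity and the stronger properties that motivates studying them together.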
