In this paper we apply two types of clustering, fuzzy (fuzzy
c-Means) and crisp (k-Means), to graph statistical data in order
to evaluate the information loss caused by perturbation as part of
the anonymization process for a data privacy application.
We place special emphasis on two major node types: hubs, which
are nodes with a high relative degree, and bridges, which
act as connecting nodes between different regions of the graph.
By clustering the graph's statistical data before and after
perturbation, we can measure the change in characteristics and
therefore the information loss. We partition the nodes into three
groups: hubs/global bridges, local bridges, and all other nodes.
We suspect that the partitions of these nodes are best represented
in fuzzy form, especially for nodes in frontier regions of the
graph, which may have an ambiguous assignment.
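
The before/after comparison described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the two node statistics (degree and local clustering coefficient), the toy edge lists EDGES and PERTURBED, the helper names, and the "membership shift" score are all assumptions made for illustration. It runs a plain fuzzy c-Means with three clusters on the node statistics of the original and the perturbed graph, aligns the clusters by sorting their centers, and reports the mean change in membership per node as a rough proxy for information loss.

```python
import random

def node_features(edges):
    """Per-node statistics (degree, local clustering coefficient); an assumed feature set."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    feats = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        # count edges that connect two neighbours of v
        links = sum(1 for x in nbrs for y in nbrs if x < y and y in adj[x])
        cc = 2.0 * links / (k * (k - 1)) if k > 1 else 0.0
        feats[v] = (float(k), cc)
    return feats

def fuzzy_c_means(points, c=3, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-Means; returns (centers, memberships), each membership row sums to 1."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([val / total for val in row])
    centers = []
    for _ in range(iters):
        # centers are means of the points weighted by membership**m
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # memberships follow from the relative distances to all centers
        for i in range(n):
            dist = [max(1e-12, sum((points[i][d] - centers[j][d]) ** 2
                                   for d in range(dim)) ** 0.5)
                    for j in range(c)]
            for j in range(c):
                u[i][j] = 1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
    return centers, u

def sorted_memberships(points, c=3):
    """Memberships with cluster columns ordered by center, so two runs are comparable."""
    centers, u = fuzzy_c_means(points, c=c)
    order = sorted(range(c), key=lambda j: centers[j])
    return [[row[j] for j in order] for row in u]

def membership_shift(edges_before, edges_after, c=3):
    """Mean per-node change in fuzzy membership, in [0, 1]; a hypothetical loss proxy."""
    fb, fa = node_features(edges_before), node_features(edges_after)
    nodes = sorted(set(fb) & set(fa))
    ub = sorted_memberships([fb[v] for v in nodes], c)
    ua = sorted_memberships([fa[v] for v in nodes], c)
    per_node = [0.5 * sum(abs(p - q) for p, q in zip(rb, ra))
                for rb, ra in zip(ub, ua)]
    return sum(per_node) / len(per_node)

# Toy graph (illustrative only): two small communities joined through node 4.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 4),
         (4, 5), (5, 6), (5, 7), (5, 8), (6, 7), (7, 8)]
# Assumed perturbation: edge (1, 2) removed, edge (1, 6) added.
PERTURBED = [e for e in EDGES if e != (1, 2)] + [(1, 6)]
```

Calling membership_shift(EDGES, PERTURBED) gives a value between 0 (memberships unchanged) and 1. The features are left unscaled here, so degree dominates the distance; a crisp k-Means comparison would replace each membership row with a one-hot assignment to the nearest center.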