Safety Analysis of High-Dimensional Anonymized Data from Multiple Perspectives

EasyChair Preprint 4999
16 pages • Date: February 21, 2021

Abstract

Recently, large-scale data collection has driven data utilization in the medical, financial, advertising, and several other fields. This increasing use of data necessitates privacy risk considerations. K-anonymization and other anonymization methods have been used to minimize data privacy risks, but they are unsuitable for the large, high-dimensional datasets required in machine learning and other data mining techniques. Although subsequent methods such as matrix decomposition anonymization can anonymize high-dimensional data while maintaining a high level of utility, they do not clarify the safety of the anonymized data or adequately analyze its privacy risks. Therefore, in this study, we performed a multi-perspective analysis of the privacy risks of datasets anonymized with several anonymization methods, using various safety metrics. In addition, we propose a new technique for evaluating the privacy risk of each attribute of anonymized data. Experimental results showed that our method effectively analyzed the privacy risks of high-dimensional anonymized data. Furthermore, our evaluation of resistance to data re-identification using existing techniques showed that each anonymization method is suited to particular attack types, and that it is important to assess data safety using various metrics before publishing.

Keyphrases: Anonymization, Privacy, safety metrics