Utility preserving query log anonymization via semantic microaggregation
Information Sciences, 2013 • Elsevier
Query logs are of great interest for scientists and companies for research, statistical and commercial purposes. However, the availability of query logs for secondary uses raises privacy issues since they allow the identification and/or revelation of sensitive information about individual users. Hence, query anonymization is crucial to avoid identity disclosure. To enable the publication of privacy-preserved – but still useful – query logs, in this paper, we present an anonymization method based on semantic microaggregation. Our proposal aims at minimizing the disclosure risk of anonymized query logs while retaining their semantics as much as possible. First, a method to map queries to their formal semantics extracted from the structured categories of the Open Directory Project is presented. Then, a microaggregation method is adapted to perform a semantically-grounded anonymization of query logs. To do so, appropriate semantic similarity and semantic aggregation functions are proposed. Experiments performed using real AOL query logs show that our proposal better retains the utility of anonymized query logs than other related works, while also minimizing the disclosure risk.
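The general idea of microaggregating over a category hierarchy can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify their similarity or aggregation functions, so the prefix-based `path_similarity`, the `common_ancestor` aggregate, the greedy grouping in `microaggregate`, and the example category paths are all assumptions made for illustration. Each record is simplified here to a single slash-separated category path.

```python
from typing import List, Sequence


def path_similarity(a: str, b: str) -> float:
    """Fraction of leading path components shared by two category paths."""
    pa, pb = a.split("/"), b.split("/")
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return shared / max(len(pa), len(pb))


def common_ancestor(paths: Sequence[str]) -> str:
    """Deepest category that is a prefix of every path (used as the aggregate)."""
    prefix = []
    for parts in zip(*(p.split("/") for p in paths)):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return "/".join(prefix)


def microaggregate(records: List[str], k: int) -> List[str]:
    """Greedy grouping into clusters of at least k records.

    Every record is replaced by the common ancestor of its cluster, so each
    published value is shared by at least k records.
    """
    if k < 1 or 0 < len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    out = [""] * len(records)
    while remaining:
        if len(remaining) < 2 * k:
            # Too few records left to form two clusters: take them all.
            group = list(remaining)
        else:
            # Seed with the first remaining record and add its k-1 most
            # similar neighbours.
            seed = remaining[0]
            ranked = sorted(
                remaining[1:],
                key=lambda i: -path_similarity(records[seed], records[i]),
            )
            group = [seed] + ranked[: k - 1]
        representative = common_ancestor([records[i] for i in group])
        for i in group:
            out[i] = representative
        remaining = [i for i in remaining if i not in group]
    return out


# Hypothetical category paths, not taken from any real log.
records = [
    "Top/Sports/Soccer",
    "Top/Sports/Tennis",
    "Top/Health/Fitness",
    "Top/Health/Nutrition",
]
anonymized = microaggregate(records, k=2)
```

In this sketch, generalizing to a shared ancestor trades detail for indistinguishability: a larger `k` tends to push clusters toward shallower, less informative categories.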