TY - JOUR
T1 - Robust Estimation Method against Poisoning Attacks for Key-Value Data with Local Differential Privacy
AU - Horigome, Hikaru
AU - Kikuchi, Hiroaki
AU - Fujita, Masahiro
AU - Yu, Chia Mu
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2024/7
Y1 - 2024/7
N2 - Local differential privacy (LDP) protects user information from potential threats by randomizing data on individual devices before transmission to untrusted collectors. This method enables collectors to derive user statistics by analyzing randomized data, thereby presenting a promising avenue for privacy-preserving data collection. In the context of key–value data, in which discrete and continuous values coexist, PrivKV has been introduced as an LDP protocol to ensure secure collection. However, this framework is susceptible to poisoning attacks. To address this vulnerability, we propose an expectation maximization (EM)-based algorithm combined with a cryptographic protocol to facilitate secure random sampling. Our LDP protocol, known as emPrivKV, exhibits two key advantages: it improves the accuracy of statistical information estimation from randomized data, and enhances resilience against the manipulation of statistics, that is, poisoning attacks. These attacks involve malicious users manipulating the analysis results without detection. This study presents the empirical results of applying the emPrivKV protocol to both synthetic and open datasets, highlighting a notable improvement in the precision of statistical value estimation and robustness against poisoning attacks. As a result, emPrivKV improved the frequency and the mean gains by (Formula presented.) and (Formula presented.), respectively, compared to PrivKV, with the number of fake users being (Formula presented.) of the genuine users. Our findings contribute to the ongoing discourse on refining LDP protocols for key–value data in scenarios involving privacy-sensitive information.
AB - Local differential privacy (LDP) protects user information from potential threats by randomizing data on individual devices before transmission to untrusted collectors. This method enables collectors to derive user statistics by analyzing randomized data, thereby presenting a promising avenue for privacy-preserving data collection. In the context of key–value data, in which discrete and continuous values coexist, PrivKV has been introduced as an LDP protocol to ensure secure collection. However, this framework is susceptible to poisoning attacks. To address this vulnerability, we propose an expectation maximization (EM)-based algorithm combined with a cryptographic protocol to facilitate secure random sampling. Our LDP protocol, known as emPrivKV, exhibits two key advantages: it improves the accuracy of statistical information estimation from randomized data, and enhances resilience against the manipulation of statistics, that is, poisoning attacks. These attacks involve malicious users manipulating the analysis results without detection. This study presents the empirical results of applying the emPrivKV protocol to both synthetic and open datasets, highlighting a notable improvement in the precision of statistical value estimation and robustness against poisoning attacks. As a result, emPrivKV improved the frequency and the mean gains by (Formula presented.) and (Formula presented.), respectively, compared to PrivKV, with the number of fake users being (Formula presented.) of the genuine users. Our findings contribute to the ongoing discourse on refining LDP protocols for key–value data in scenarios involving privacy-sensitive information.
KW - expectation maximization
KW - key-value data
KW - LDP
UR - http://www.scopus.com/inward/record.url?scp=85199632746&partnerID=8YFLogxK
U2 - 10.3390/app14146368
DO - 10.3390/app14146368
M3 - Article
AN - SCOPUS:85199632746
SN - 2076-3417
VL - 14
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 14
M1 - 6368
ER -