Distributed Dual Averaging Based Data Clustering

Mykola Servetnyk, Carrson C. Fung*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

A multiagent distributed clustering scheme is proposed herein to process data collected by dispersed sensors that are not under centralized control. Two methods based on the distributed dual averaging (DDA) algorithm are proposed. Both can incorporate network structure and do not require the exchange of centroid estimates, which makes them appealing for security-conscious applications. The first method provides the framework for distributed clustering using the DDA algorithm with a predefined regularization parameter. The second method, called Adaptive DDA (ADDA), relaxes the first method's assumption of a priori knowledge about the centroids without losing clustering performance. This is achieved by properly regularizing the problem, with a data-driven approach used to determine the regularization parameter. The proposed methods are further extended, via the proposed Bin method, to scenarios in which processing agents store unbalanced amounts of data with non-IID class distributions. Experiments are conducted on both real-life and synthetic data. Numerical results show the efficacy of the proposed approaches compared with a state-of-the-art centralized algorithm and other distributed approaches.
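
For readers unfamiliar with DDA, the sketch below illustrates the generic update on a toy single-centroid problem. It is not the paper's clustering formulation: the quadratic local loss, the ring topology, the step-size schedule, and the names `ring_weights` and `dda_single_centroid` are all assumptions made for illustration, and ADDA and the Bin method are not reproduced. It only shows the property the abstract highlights, namely that agents exchange dual variables (accumulated gradients) with neighbours rather than their centroid estimates.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical mixing matrix: lazy, symmetric, doubly stochastic ring (n >= 3)."""
    eye = np.eye(n)
    return 0.5 * eye + 0.25 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

def dda_single_centroid(local_data, P, steps=4000, step0=0.5):
    """Generic DDA sketch for min_x sum_i 0.5 * ||x - mean(local_data[i])||^2.

    local_data: list of (m_i, d) arrays, one per agent (assumed toy setup).
    P: (n, n) doubly stochastic matrix encoding the network.
    Returns an (n, d) array of running-average estimates, one row per agent.
    """
    n = len(local_data)
    local_means = np.stack([a.mean(axis=0) for a in local_data])
    z = np.zeros_like(local_means)      # dual variables, the only quantity exchanged
    x = np.zeros_like(local_means)      # primal (centroid) estimates, kept local
    x_avg = np.zeros_like(local_means)  # running average of the primal iterates
    for t in range(1, steps + 1):
        g = x - local_means             # gradient of each agent's local quadratic loss
        z = P @ z + g                   # mix neighbours' duals, add own gradient
        alpha = step0 / np.sqrt(t)      # decaying step size (assumed schedule)
        x = -alpha * z                  # argmin_x <z, x> + ||x||^2 / (2 * alpha)
        x_avg += (x - x_avg) / t
    return x_avg
```

With four agents on a ring whose local means are (0, 0), (2, 0), (0, 2) and (2, 2), every row of the returned array should approach the global mean (1, 1) as `steps` grows, at the slow rate typical of subgradient-style methods.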

Original language: English
Pages (from-to): 372-379
Number of pages: 8
Journal: IEEE Transactions on Big Data
Volume: 9
Issue number: 1
State: Published - 1 Feb 2023

Keywords

  • Clustering algorithms
  • distributed algorithms
  • security conscious algorithm
  • subgradient methods
  • unbalanced data
