Distributed Dual Averaging Based Data Clustering

Mykola Servetnyk, Carrson C. Fung

Research output: Contribution to journal › Article › peer-review


A multiagent distributed clustering scheme is proposed to process data collected by dispersed sensors that are not under centralized control. Two methods based on the distributed dual averaging (DDA) algorithm are proposed. Both incorporate the network structure and do not require the exchange of centroid estimates, which makes them appealing for security-conscious applications. The first method provides a framework for distributed clustering using the DDA algorithm with a predefined regularization parameter. The second method, called Adaptive DDA (ADDA), relaxes the first method's assumption of a priori knowledge about the centroids without losing clustering performance. This is achieved by properly regularizing the problem, with a data-driven approach used to determine the regularization parameter. Both methods are further extended, via the proposed Bin method, to scenarios where processing agents store unbalanced amounts of data with non-IID class distributions. Experiments are conducted on both real-life and synthetic data. Numerical results show the efficacy of the proposed approaches compared with a state-of-the-art centralized algorithm and other distributed approaches.
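
For readers unfamiliar with dual averaging, the following is a minimal sketch of the generic DDA iteration applied to a toy single-centroid problem. It is not the paper's clustering method: the quadratic local loss, the ring network, the step size 1/sqrt(t), and all names (`dda_single_centroid`, `LOCAL_DATA`, `RING_WEIGHTS`) are assumptions made for illustration. Each agent mixes only the dual variables `z` with its neighbors and recovers its own centroid estimate locally.

```python
import math


def dda_single_centroid(local_data, weights, num_iters=3000, step_scale=1.0):
    """Generic distributed dual averaging for one shared centroid (toy sketch).

    local_data[i] is the list of points held by agent i.
    weights is a doubly stochastic mixing matrix for the network (assumed).
    Agent i minimizes f_i(x) = 1/(2*m_i) * sum_p ||x - p||^2 over its own points,
    whose gradient is x - mean(local points).
    """
    n = len(local_data)
    dim = len(local_data[0][0])
    local_means = [
        [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        for pts in local_data
    ]
    z = [[0.0] * dim for _ in range(n)]      # dual variables (exchanged)
    x = [[0.0] * dim for _ in range(n)]      # primal estimates (kept local)
    x_avg = [[0.0] * dim for _ in range(n)]  # running average of x per agent

    for t in range(1, num_iters + 1):
        # Local gradient of the quadratic loss at the current estimate.
        grads = [
            [x[i][d] - local_means[i][d] for d in range(dim)]
            for i in range(n)
        ]
        # Dual update: mix neighbors' dual variables, then add the local gradient.
        z = [
            [
                sum(weights[i][j] * z[j][d] for j in range(n)) + grads[i][d]
                for d in range(dim)
            ]
            for i in range(n)
        ]
        # Primal update with proximal function 0.5*||x||^2: x = -alpha * z.
        alpha = step_scale / math.sqrt(t)
        x = [[-alpha * z[i][d] for d in range(dim)] for i in range(n)]
        # Incremental running average of the primal iterates.
        x_avg = [
            [x_avg[i][d] + (x[i][d] - x_avg[i][d]) / t for d in range(dim)]
            for i in range(n)
        ]
    return x_avg


# Hypothetical toy data: four agents, two 2-D points each, overall mean (2, -1).
LOCAL_DATA = [
    [(1.0, -1.0), (2.0, 0.0)],
    [(3.0, -2.0), (2.0, -1.0)],
    [(2.0, -2.0), (1.0, 0.0)],
    [(3.0, 0.0), (2.0, -2.0)],
]
# Ring network: each agent weights itself 0.5 and each of its two neighbors 0.25.
RING_WEIGHTS = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]

estimates = dda_single_centroid(LOCAL_DATA, RING_WEIGHTS)
for agent, est in enumerate(estimates):
    print(agent, [round(v, 2) for v in est])
```

In this toy setting every agent's averaged estimate should approach the mean of the local means, which equals the overall data mean here because the agents hold equal amounts of data. The unbalanced, non-IID case that the abstract mentions is outside the scope of this sketch.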

Original language: English
Journal: IEEE Transactions on Big Data
State: Accepted/In press - 2022


  • Approximation algorithms
  • Big Data
  • Clustering algorithms
  • Convergence
  • distributed algorithms
  • Distributed databases
  • security conscious algorithm
  • subgradient methods
  • Symmetric matrices
  • Task analysis
  • unbalanced data

