Abstract
The problem of computing batches of matrix multiplications in distributed computing systems with stragglers is studied. Unlike existing works in the literature, the matrices in a batch are assumed to be sparse, and the sparsity levels for matrices in different batches can be different. A novel coding scheme, called generalized sparse code (GSC), is proposed, in which the matrices are partitioned into smaller chunks that are re- grouped and encoded by respective sparse codes. The expected runtime of the proposed GSC scheme is analyzed, based on which a task assignment problem associated with the proposed GSC is formulated and solved. The solution follows the reverse water-filling principle, by which an efficient worker assignment algorithm whose worst-case time complexity equal to the total number of workers can be developed. Simulation results validate the advantage of the proposed GSC over four existing schemes, including entangled polynomial codes (EP), generalized cross-subspace alignment (GCSA), Lagrange coded computing (LCC) codes and factored Luby transform (FLT) codes at all sparsity levels. As a potential application of the proposed GSC, the problem of computing a batch of matrix multiplications with similarity is discussed.
Original language | English |
---|---|
Pages (from-to) | 633-647 |
Number of pages | 15 |
Journal | IEEE Transactions on Communications |
Volume | 72 |
Issue number | 2 |
DOIs | |
State | Published - 1 Feb 2024 |
Keywords
- Distributed system
- coded computing
- matrix-matrix multiplication
- similar matrices
- sparse matrices
- stragglers problem