AKSEL: Fast Byzantine SGD


Publication Details

Output type: Conference proceeding

UM6P affiliated Publication?: Yes

Author list: Boussetta A., El-Mhamdi E.-M., Guerraoui R., Maurer A., Rouault S.

Publication year: 2021

Title of series: Leibniz International Proceedings in Informatics, LIPIcs

Volume number: 184

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85101726840&doi=10.4230%2fLIPIcs.OPODIS.2020.8&partnerID=40&md5=831b815804075a8339aca4fe64d76389

Languages: English (EN-GB)


Abstract

Modern machine learning architectures distinguish servers and workers. Typically, a d-dimensional model is hosted by a server and trained by n workers, using a distributed stochastic gradient descent (SGD) optimization scheme. At each SGD step, the goal is to estimate the gradient of a cost function. The simplest way to do this is to average the gradients estimated by the workers. However, averaging is not resilient to even a single Byzantine failure of a worker. Many alternative gradient aggregation rules (GARs) have recently been proposed to tolerate a maximum number f of Byzantine workers. These GARs differ according to (1) their computational time complexity, (2) the maximal number of Byzantine workers despite which convergence can still be ensured (breakdown point), and (3) their accuracy, which can be captured by (3.1) their angular error, namely the angle with the true gradient, as well as (3.2) their ability to aggregate full gradients. In particular, many GARs do not output full gradients because they operate on each dimension separately, which results in a coordinate-wise blended gradient and leads to low accuracy in practical situations where the number s of workers that are actually Byzantine in an execution is small (s << f).

© Amine Boussetta, El-Mahdi El-Mhamdi, Rachid Guerraoui, Alexandre Maurer, and Sébastien Rouault; licensed under Creative Commons License CC-BY. 24th International Conference on Principles of Distributed Systems (OPODIS 2020).
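The two aggregation behaviours described in the abstract can be illustrated with a small sketch. It is not the AKSEL rule from the paper. It only contrasts plain averaging with a generic coordinate-wise median, which stands in here for the class of GARs that operate on each dimension separately. The function names, the toy gradients, and the Byzantine vector are all assumptions made for illustration.

```python
from statistics import mean, median

def average_gar(gradients):
    """Plain averaging: the mean of the workers' gradients, per coordinate."""
    return [mean(coords) for coords in zip(*gradients)]

def coordinate_median_gar(gradients):
    """A generic coordinate-wise GAR (illustrative only, not AKSEL):
    each dimension is aggregated separately with a median."""
    return [median(coords) for coords in zip(*gradients)]

# Hypothetical data: four honest workers with noisy estimates of a
# 2-dimensional gradient, plus one Byzantine worker sending an arbitrary vector.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.2, -2.2]]
byzantine = [[1e6, 1e6]]
gradients = honest + byzantine

avg = average_gar(gradients)
med = coordinate_median_gar(gradients)

print("average:", avg)            # dominated by the single Byzantine vector
print("coordinate median:", med)  # stays close to the honest estimates

# The median output mixes coordinates taken from different workers, so it
# is a "blended" vector that no single worker actually submitted.
print("submitted by some worker:", med in gradients)
```

In this toy run a single Byzantine vector pulls the average far from the honest estimates, while the coordinate-wise median stays near them but returns a vector assembled from different workers' coordinates, which is the blending effect the abstract refers to.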


Last updated on 2021-11-21 at 23:16