Decentralized multi-agent multi-armed bandits for smart electric vehicles charging

Published on 31 December 2025 - Engineering Applications of Artificial Intelligence

Authors: Sharyal Zafar, Raphaël Féraud, Anne Blavette, Guy Camilleri, Hamid Ben Ahmed

Smart charging of electric vehicles can help avoid congestion and peak load demand in an electrical distribution network. On the consumer's side, the advantage lies in minimizing the daily charging cost. Consumers may also benefit from cheap photovoltaic electricity from local sources and thereby reduce their environmental impact. However, this cheaper electricity is variable and uncertain. In many research works, smart charging has been formulated and solved as a centralized or hierarchical optimization problem. However, such systems may suffer from a lack of scalability, single points of failure, and privacy breaches. We propose a fully decentralized and fair multi-agent system combined with reinforcement learning, called "Decentralised multi-armed bandit (2-armed bandit) based on Thompson sampling" (D-MAB2AB-TS), to control the charging of electric vehicles under uncertainty. The problem is formulated as a two-armed bandit (charging or not) at each time instant. The proposed algorithm, based on Thompson sampling, takes into account the uncertainty in the combination of arms chosen by the other players. It finds the best combination of arms to play with a computational complexity of O(m), linear in the number of arms. The proposed system is also model-free: it does not assume that the model of the environment is perfectly known, which is a common assumption in many existing centralized optimization strategies for smart charging.
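
To make the two-armed formulation concrete, the following is a minimal single-agent sketch of Beta-Bernoulli Thompson sampling over the two arms (charge or not). It is not the authors' D-MAB2AB-TS algorithm: it does not model the arm choices of other players, and the class name, the binary reward, and the success probabilities (0.3 and 0.7) are hypothetical choices made only for illustration.

```python
import random


class TwoArmedThompsonSampler:
    """Beta-Bernoulli Thompson sampling over two arms: 0 = do not charge, 1 = charge."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        # Per-arm counts; the posterior for each arm is Beta(successes + 1, failures + 1),
        # i.e. a uniform prior updated with the observed binary rewards.
        self.successes = [0, 0]
        self.failures = [0, 0]

    def select_arm(self):
        # Draw one sample from each arm's posterior and play the arm with the larger draw.
        samples = [
            self.rng.betavariate(self.successes[a] + 1, self.failures[a] + 1)
            for a in (0, 1)
        ]
        return 0 if samples[0] > samples[1] else 1

    def update(self, arm, reward):
        # reward is a boolean: True counts as a success for the played arm.
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(success_prob=(0.3, 0.7), steps=2000, seed=0):
    """Run the sampler against hypothetical Bernoulli rewards; return pulls per arm."""
    rng = random.Random(seed)
    agent = TwoArmedThompsonSampler(rng)
    pulls = [0, 0]
    for _ in range(steps):
        arm = agent.select_arm()
        reward = rng.random() < success_prob[arm]  # assumed reward model, illustration only
        agent.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(simulate())
```

In this sketch the agent keeps only two counters per arm, so each decision costs a constant number of posterior draws. In the decentralized setting described above, each vehicle would presumably run its own sampler and additionally account for the uncertain choices of the other players, which this example leaves out.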
