Artificial Intelligence

Decentralized Smart Charging of Large-Scale EV Fleets using Adaptive Multi-Agent Multi-Armed Bandits

Published in: PGMO Days 2023

Authors: Sharyal Zafar, Raphaël Féraud, Anne Blavette, Guy Camilleri, H. Ben Ahmed

As solar photovoltaics (PVs) and electric vehicles (EVs) integrate into electrical grids, challenges like network congestion and peak demand emerge. Traditional solutions require costly infrastructure investments. However, smart EV charging offers an elegant alternative. Existing solutions include centralized, hierarchical, and decentralized systems. However, scalability and real-time operation may pose challenges for centralized and hierarchical architectures. To address these issues comprehensively, we propose a fully decentralized smart charging system, ensuring scalability, real-time operation, and data privacy. In recent years, decentralized smart charging solutions utilizing standard reinforcement learning (RL) algorithms have gained prominence. However, the integration of multi-level constraints and objectives, encompassing both prosumer-specific local considerations and broader network-global considerations, remains a complex challenge. This complexity arises from the absence of predefined reward signals in the majority of smart-grid applications. Furthermore, standard RL algorithms employing deep learning-based approximations may exhibit slower convergence when compared to the multi-armed bandit class of RL algorithms, especially in scenarios where a perfect oracle is unavailable. Thus, we introduce a novel adaptive multi-agent system employing combinatorial 2-armed bandits with Thompson Sampling to minimize EV charging costs, considering variable electricity prices and PV energy production uncertainty, while ensuring fairness among prosumers. Validation includes a large-scale simulation with over 10,000 EVs, comparing our approach to "dumb" charging and centralized mixed-integer linear programming optimization (computed only at a small scale). Evaluation criteria encompass cost optimization, constraint satisfaction, fairness, and solution speed. Our results demonstrate the effectiveness of our decentralized approach.
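To make the core building block concrete, the following is a minimal sketch of a single 2-armed bandit trained with Thompson Sampling, written in Python. It is not the authors' system: the abstract does not specify the reward design, the combinatorial coupling between agents, or the fairness mechanism, so none of these are modeled. The arm labels (0 = idle, 1 = charge), the Bernoulli reward, the Beta(1, 1) prior, and the names `TwoArmedThompsonAgent`, `simulate`, and `success_prob` are all assumptions introduced purely for illustration.

```python
import random


class TwoArmedThompsonAgent:
    """Beta-Bernoulli Thompson Sampling over two arms.

    Arm labels are an illustrative assumption: 0 = idle, 1 = charge.
    """

    def __init__(self, rng):
        self.rng = rng
        # Beta(1, 1) (uniform) prior on each arm's success probability.
        self.successes = [1.0, 1.0]
        self.failures = [1.0, 1.0]

    def select_arm(self):
        # Draw one sample per arm from its Beta posterior, play the larger draw.
        samples = [
            self.rng.betavariate(self.successes[arm], self.failures[arm])
            for arm in range(2)
        ]
        return 0 if samples[0] >= samples[1] else 1

    def update(self, arm, reward):
        # reward is 1 (success) or 0 (failure); update the played arm only.
        if reward:
            self.successes[arm] += 1.0
        else:
            self.failures[arm] += 1.0


def simulate(success_prob, rounds=2000, seed=0):
    """Run one agent against hypothetical Bernoulli rewards.

    success_prob[arm] is an assumed, fixed probability that playing `arm`
    yields a reward of 1. Returns how often each arm was played.
    """
    rng = random.Random(seed)
    agent = TwoArmedThompsonAgent(rng)
    pulls = [0, 0]
    for _ in range(rounds):
        arm = agent.select_arm()
        reward = 1 if rng.random() < success_prob[arm] else 0
        agent.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical setting in which charging is usually the better action.
    print(simulate(success_prob=(0.3, 0.8)))
```

In this toy setting the agent concentrates its plays on the arm with the higher assumed success probability as its posteriors sharpen. In a multi-agent setting, each EV would presumably run its own such learner, with the reward derived from local and network-level signals rather than from a fixed probability.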