Electric power

Multi-Agent Contextual Combinatorial Multi-Armed Bandits with Linear Structured Super Arm: application to energy management optimization in Smart Grids

Published on - PGMO Days

Authors: Eloann Le Guern-Dall, Raphaël Féraud, Guy Camilleri, Patrick Maillé, H. Ben Ahmed, Juan J Cuenca, Riadh Zorgati, Fabien Petit, Anne Blavette

The integration of electric vehicles (EVs) and renewable energy sources (RES) into future electrical grids presents both significant challenges and opportunities for energy management. While RES production creates an incentive to increase demand at specific times of the day (e.g., around noon for photovoltaic production), the distribution system operator (DSO) must also prevent grid congestion. Finding the optimal strategy for integrating each EV under the DSO constraints while maximizing the local use of RES is an NP-hard problem, as it requires solving a mixed-integer linear programming (MILP) problem [1]. Moreover, because of weather and human behavior, this optimization must be carried out under uncertainty. In particular, learning algorithms from the bandit family have been studied in [2], [3] to handle this uncertainty. Building on the decentralized algorithms developed in [2], we propose a Multi-Agent Contextual Combinatorial Multi-Armed Bandit (MA-CC-MAB) approach with linearly structured super arms. By leveraging contextual information such as time-varying renewable energy generation, grid load conditions, and day of the week, each agent selfishly and dynamically schedules its charging intervals while accounting for grid constraints. The linearly structured super arms enable efficient exploration and exploitation in a high-dimensional combinatorial action space, while decentralization addresses the scalability issues inherent to large systems. Contextual information allows the agents to adapt to different environmental conditions, supporting efficient decision-making under uncertainty. We demonstrate the performance of this algorithm on the IEEE LVTF network model (55 agents) with PV production, using the pandapower Python library for the load flow simulation.
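To make the idea of a linearly structured super arm more concrete, here is a minimal Python sketch. It is not the MA-CC-MAB algorithm of the abstract: it swaps in a generic LinUCB-style index with semi-bandit feedback, where each agent picks the k hourly slots with the highest upper confidence bound and the super-arm reward is assumed to be the sum of per-slot linear rewards. Everything named here (`slot_contexts`, `LinUCBAgent`, `simulate`, `TRUE_THETA`, the PV and load curves, the noise level) is a hypothetical choice made for illustration, and the coupling between agents through grid constraints is left out entirely.

```python
import math
import random

N_SLOTS = 24                       # hourly charging slots in one day
TRUE_THETA = [0.0, 1.0, -0.5]      # hypothetical weights: bias, PV output, grid load


def slot_contexts(pv_scale):
    """Toy per-slot context [bias, PV output, base grid load]; all values made up."""
    ctx = []
    for t in range(N_SLOTS):
        pv = pv_scale * max(0.0, math.cos((t - 12) * math.pi / 12))  # peaks at noon
        load = 0.5 + 0.5 * math.exp(-((t - 19) ** 2) / 8.0)          # evening peak
        ctx.append([1.0, pv, load])
    return ctx


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class LinUCBAgent:
    """One EV agent running ridge-regression LinUCB on slot features."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha
        # Inverse of (ridge * I + sum of x x^T), maintained incrementally.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def select(self, contexts, k):
        """Super arm = the k slots with the largest upper confidence bound."""
        theta = [dot(row, self.b) for row in self.a_inv]

        def index(x):
            a_inv_x = [dot(row, x) for row in self.a_inv]
            return dot(theta, x) + self.alpha * math.sqrt(max(dot(x, a_inv_x), 0.0))

        ranked = sorted(range(len(contexts)),
                        key=lambda t: index(contexts[t]), reverse=True)
        return sorted(ranked[:k])

    def update(self, x, reward):
        """Sherman-Morrison rank-one update after observing one slot's reward."""
        a_inv_x = [dot(row, x) for row in self.a_inv]
        denom = 1.0 + dot(x, a_inv_x)
        for i, row in enumerate(self.a_inv):
            for j in range(len(row)):
                row[j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def simulate(days=300, n_agents=3, k=4, seed=0):
    """Run independent agents; return the slots each one picks on a sunny test day."""
    rng = random.Random(seed)
    agents = [LinUCBAgent(dim=3) for _ in range(n_agents)]
    for _ in range(days):
        contexts = slot_contexts(pv_scale=rng.uniform(0.3, 1.0))  # shared daily context
        for agent in agents:
            for t in agent.select(contexts, k):
                # Semi-bandit feedback: a noisy reward for each chosen slot.
                reward = dot(TRUE_THETA, contexts[t]) + rng.gauss(0.0, 0.05)
                agent.update(contexts[t], reward)
    return [agent.select(slot_contexts(1.0), k) for agent in agents]
```

Calling `simulate()` returns one list of slot indices per agent. In this toy setting the picks would be expected to drift toward midday, where the made-up PV curve peaks. Ranking slots individually and keeping the top k is only valid because the assumed super-arm reward is additive over slots; a real grid constraint would couple the agents' choices and would need a load flow check, which this sketch does not attempt.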