TY - JOUR
T1 - Efficient optimal power flow learning
T2 - A deep reinforcement learning with physics-driven critic model
AU - Sayed, Ahmed
AU - Al Jaafari, Khaled
AU - Zhang, Xian
AU - Zeineldin, Hatem
AU - Al-Durra, Ahmed
AU - Wang, Guibin
AU - Elsaadany, Ehab
N1 - Publisher Copyright:
© 2025
PY - 2025/6
Y1 - 2025/6
AB - The transition to decarbonized energy systems presents significant operational challenges due to increased uncertainties and complex dynamics. Deep reinforcement learning (DRL) has emerged as a powerful tool for optimizing power system operations. However, most existing DRL approaches rely on approximate, data-driven critic networks, which require numerous risky interactions to explore the environment and often suffer from estimation errors. To address these limitations, this paper proposes an efficient DRL algorithm with a physics-driven critic model, namely a differentiable holomorphic embedding load flow model (D-HELM). This approach enables accurate policy gradient computation through a differentiable loss function evaluated at the system states under realized uncertainties, simplifying both the replay buffer and the learning process. By leveraging continuation power flow principles, D-HELM ensures operable, feasible solutions while accelerating gradient steps through simple matrix operations. Simulation results across various test systems demonstrate the computational superiority of the proposed approach, which outperforms state-of-the-art DRL algorithms during training and model-based solvers in online operation. This work represents a potential breakthrough in real-time energy system operations, with extensions to security-constrained decision-making, voltage control, unit commitment, and multi-energy systems.
KW - Deep reinforcement learning
KW - Holomorphic embedding
KW - Operable power flow
KW - Physics-driven policy gradient
KW - Real-time economic control
UR - https://www.scopus.com/pages/publications/105001042713
DO - 10.1016/j.ijepes.2025.110621
M3 - Article
AN - SCOPUS:105001042713
SN - 0142-0615
VL - 167
JO - International Journal of Electrical Power and Energy Systems
JF - International Journal of Electrical Power and Energy Systems
M1 - 110621
ER -