Date of Award
12-2021
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
Committee Chair/Advisor
Dr. Nina Christine Hubig
Committee Member
Dr. Vidya (Seyedehzahra) Samadi
Committee Member
Dr. Feng Luo
Abstract
Changes in demand, various hydrological inputs, and environmental stressors are among the issues that water managers and policymakers face on a regular basis. These concerns have sparked interest in applying different techniques to determine reservoir operation policy and improve reservoir release decisions. As the resolution of the analysis rises, it becomes more difficult to effectively represent a real-world system using traditional approaches for determining the best reservoir operation policy. One of the challenges is the “curse of dimensionality,” which occurs when the discretization of the state and action spaces becomes finer or when more state or action variables are taken into account. Because of this curse, the number of state and action variables that can be handled is limited, rendering Dynamic Programming (DP) and Stochastic Dynamic Programming (SDP) ineffective for complex reservoir optimization problems. Deep Reinforcement Learning (DRL) is an intelligent approach for overcoming this curse in the stochastic optimization of reservoir system planning. This study examined various novel DRL continuous-action policy gradient methods (PGMs), including Deep Deterministic Policy Gradients (DDPG), Twin Delayed DDPG (TD3), and two different versions of Soft Actor-Critic (SAC18 and SAC19), to identify an optimal reservoir operation policy for the Folsom Reservoir in California, US. The Folsom Reservoir supplies agricultural and municipal water, hydropower, environmental flows, and flood protection to the City of Sacramento. We evaluated the DRL methods' release decisions with respect to these demands and by comparing the results to the standard operating policy (SOP) and base conditions using different performance criteria and sustainability indices. The TD3 and SAC methods showed promising performance in providing an optimal operation policy.
Experiments on continuous-action spaces of reservoir operation policy decisions demonstrated that the DRL techniques could efficiently learn strategic policies in such spaces and overcome the curses of dimensionality and modeling.
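As a rough illustration of the curse of dimensionality mentioned in the abstract, the sketch below counts the entries a tabular DP or SDP method would need when every state and action variable is discretized into the same number of levels. The function name, the variable counts, and the level counts are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
def table_size(n_state_vars: int, n_action_vars: int, levels: int) -> int:
    """Entries in a tabular state-action value function when each state
    and action variable is discretized into `levels` bins (assumed equal
    for all variables, purely for illustration)."""
    return levels ** (n_state_vars + n_action_vars)


if __name__ == "__main__":
    # Hypothetical settings: one release (action) variable, a growing
    # number of state variables, and coarse vs. fine discretization.
    for levels in (10, 100):
        for n_state_vars in (1, 2, 4):
            size = table_size(n_state_vars, 1, levels)
            print(f"levels={levels:>3}  state vars={n_state_vars}  entries={size:,}")
```

Under these assumptions, one state variable and one action variable at 10 levels need 100 entries, while four state variables at 100 levels need 10^10. Growth is exponential in the number of variables, which is why continuous-action policy gradient methods that avoid explicit discretization are attractive.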
Recommended Citation
Sadeghi Tabas, Sadegh, "Reinforcement Learning Policy Gradient Methods for Reservoir Operation Management and Control" (2021). All Theses. 3670.
https://open.clemson.edu/all_theses/3670
Author ORCID Identifier
https://orcid.org/0000-0001-9157-3397
Included in
Artificial Intelligence and Robotics Commons, Hydrology Commons, Sustainability Commons, Water Resource Management Commons