Date of Award

12-2021

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Committee Chair/Advisor

Dr. Nina Christine Hubig

Committee Member

Dr. Vidya (Seyedehzahra) Samadi

Committee Member

Dr. Feng Luo

Abstract

Water managers and policymakers regularly face changes in demand, various hydrological inputs, and environmental stressors. These concerns have sparked interest in applying different techniques to determine reservoir operation policy and improve reservoir release decisions. As the resolution of the analysis rises, it becomes more difficult to effectively represent a real-world system using traditional approaches for determining the best reservoir operation policy. One of the challenges is the “curse of dimensionality,” which occurs when the discretization of the state and action spaces becomes finer or when more state or action variables are taken into account. Because of this curse, the number of state and action variables that can be handled is limited, rendering Dynamic Programming (DP) and Stochastic Dynamic Programming (SDP) ineffective for complex reservoir optimization problems. Deep Reinforcement Learning (DRL) is an intelligent approach to overcoming such curses in the stochastic optimization of reservoir system planning. This study examined various novel DRL continuous-action policy gradient methods (PGMs), including Deep Deterministic Policy Gradients (DDPG), Twin Delayed DDPG (TD3), and two different versions of Soft Actor-Critic (SAC18 and SAC19), to identify an optimal reservoir operation policy for the Folsom Reservoir in California, US. The Folsom Reservoir supplies agricultural and municipal water, hydropower, environmental flows, and flood protection to the City of Sacramento. We evaluated the release decisions of the DRL methods with respect to these demands and by comparing the results to the standard operating policy (SOP) and base conditions using different performance criteria and sustainability indices. The TD3 and SAC methods showed promising performance in providing an optimal operation policy.
Experiments on continuous-action spaces of reservoir operation policy decisions demonstrated that the DRL techniques could efficiently learn strategic policies in spaces subject to the curses of dimensionality and modeling.

Recommended Citation

Sadeghi Tabas, Sadegh, "Reinforcement Learning Policy Gradient Methods for Reservoir Operation Management and Control" (2021). All Theses. 3670.
https://open.clemson.edu/all_theses/3670

Author ORCID Identifier

https://orcid.org/0000-0001-9157-3397

Included in

Artificial Intelligence and Robotics Commons, Hydrology Commons, Sustainability Commons, Water Resource Management Commons
