Accelerated zeroth-order and first-order momentum methods from mini to minimax optimization
In the paper, we propose a class of accelerated zeroth-order and first-order momentum
methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we …
Examining average and discounted reward optimality criteria in reinforcement learning
V Dewanto, M Gallagher - Australasian Joint Conference on Artificial …, 2022 - Springer
In reinforcement learning (RL), the goal is to obtain an optimal policy, for which the optimality
criterion is fundamentally important. Two major optimality criteria are average and …
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
We study the problem of computing deterministic optimal policies for constrained Markov
decision processes (MDPs) with continuous state and action spaces, which are widely …
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
Bi-level optimization (BO) has become a fundamental mathematical framework for
addressing hierarchical machine learning problems. As deep learning models continue to …
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Policy gradient (PG) methods are successful approaches to deal with continuous
reinforcement learning (RL) problems. They learn stochastic parametric (hyper) policies by …
Distributed cooperative multi-agent reinforcement learning with directed coordination graph
Existing distributed cooperative multi-agent reinforcement learning (MARL) frameworks
usually assume undirected coordination graphs and communication graphs, while …
Model-free learning of optimal deterministic resource allocations in wireless systems via action-space exploration
H Hashmi, DS Kalogerias - 2021 IEEE 31st International …, 2021 - ieeexplore.ieee.org
Wireless systems resource allocation refers to perpetual and challenging nonconvex
constrained optimization tasks, which are especially timely in modern communications and …
Unmanned Vehicles in 6G Networks: A Unifying Treatment of Problems, Formulations, and Tools
Unmanned Vehicles (UVs) functioning as autonomous agents are anticipated to play a
crucial role in the 6th Generation of wireless networks. Their seamless integration, cost …
Model-Free Learning of Two-Stage Beamformers for Passive IRS-Aided Network Design
H Hashmi, S Pougkakiotis… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Electronically tunable metasurfaces, or Intelligent Reflecting Surfaces (IRSs), are a popular
technology for achieving high spectral efficiency in modern wireless systems by shaping …
Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions
Achieving distributed reinforcement learning (RL) for large-scale cooperative multi-agent
systems (MASs) is challenging because: (i) each agent has access to only limited …