Deep multi-user reinforcement learning for distributed dynamic spectrum access
O Naparstek, K Cohen - IEEE transactions on wireless …, 2018 - ieeexplore.ieee.org
We consider the problem of dynamic spectrum access for network utility maximization in
multichannel wireless networks. The shared bandwidth is divided into K orthogonal …
Markovian restless bandits and index policies: A review
J Niño-Mora - Mathematics, 2023 - mdpi.com
The restless multi-armed bandit problem is a paradigmatic modeling framework for optimal
dynamic priority allocation in stochastic models of wide-ranging applications that has been …
Q-learning Lagrange policies for multi-action restless bandits
Multi-action restless multi-armed bandits (RMABs) are a powerful framework for constrained
resource allocation in which N independent processes are managed. However, previous …
Distributed learning over Markovian fading channels for stable spectrum access
We consider the problem of multi-user spectrum access in wireless networks. The bandwidth
is divided into orthogonal channels, and users aim to access the spectrum. Each user …
Non-stationary representation learning in sequential linear bandits
In this paper, we study representation learning for multi-task decision-making in
non-stationary environments. We consider the framework of sequential linear bandits, where the …
On learning Whittle index policy for restless bandits with scalable regret
N Akbarzadeh, A Mahajan - IEEE Transactions on Control of …, 2023 - ieeexplore.ieee.org
Reinforcement learning is an attractive approach to learn good resource allocation and
scheduling policies based on data when the system model is unknown. However, the …
Deep reinforcement learning for simultaneous sensing and channel access in cognitive networks
We consider the problem of dynamic spectrum access (DSA) in cognitive wireless networks,
consisting of primary users (PUs) and secondary users (SUs), where only partial …
Learning in restless bandits under exogenous global Markov process
We consider an extension to the restless multi-armed bandit (RMAB) problem with unknown
arm dynamics, where an unknown exogenous global Markov process governs the rewards …
Client selection for generalization in accelerated federated learning: A multi-armed bandit approach
Federated learning (FL) is an emerging machine learning (ML) paradigm used to train
models across multiple nodes (i.e., clients) holding local data sets, without explicitly …
Causal inference machine learning leads original experimental discovery in CdSe/CdS core/shell nanoparticles
The synthesis of CdSe/CdS core/shell nanoparticles was revisited with the help of a causal
inference machine learning framework. The tadpole morphology with 1–2 tails was …