A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning
IAM Huijben, W Kool, MB Paulus… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …
DIFUSCO: Graph-based diffusion solvers for combinatorial optimization
Neural network-based Combinatorial Optimization (CO) methods have shown
promising results in solving various NP-complete (NPC) problems without relying on hand …
POMO: Policy optimization with multiple optima for reinforcement learning
In neural combinatorial optimization (CO), reinforcement learning (RL) can turn a deep
neural net into a fast, powerful heuristic solver of NP-hard problems. This approach has a …
DIMES: A differentiable meta solver for combinatorial optimization problems
Recently, deep reinforcement learning (DRL) models have shown promising results in
solving NP-hard Combinatorial Optimization (CO) problems. However, most DRL solvers …
RMM: Reinforced memory management for class-incremental learning
Class-Incremental Learning (CIL) [38] trains classifiers under a strict memory
budget: in each incremental phase, learning is done for new data, most of which is …
Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning
In this work, we introduce Graph Pointer Networks (GPNs) trained using reinforcement
learning (RL) for tackling the traveling salesman problem (TSP). GPNs build upon Pointer …
A-NeSI: A scalable approximate method for probabilistic neurosymbolic inference
E van Krieken, T Thanapalasingam… - Advances in …, 2023 - proceedings.neurips.cc
We study the problem of combining neural networks with symbolic reasoning. Recently
introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as …
RLHF can speak many languages: Unlocking multilingual preference optimization for LLMs
Preference optimization techniques have become a standard final stage for training state-of-
art large language models (LLMs). However, despite widespread adoption, the vast majority …
Learn to design the heuristics for vehicle routing problem
This paper presents an approach to learn the local-search heuristics that iteratively improves
the solution of Vehicle Routing Problem (VRP). A local-search heuristics is composed of a …
A reinforcement learning approach to the orienteering problem with time windows
R Gama, HL Fernandes - Computers & Operations Research, 2021 - Elsevier
The Orienteering Problem with Time Windows (OPTW) is a combinatorial
optimization problem where the goal is to maximize the total score collected from different …