Recent advances in reinforcement learning in finance
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …
Goals, usefulness and abstraction in value-based choice
B De Martino, A Cortese - Trends in Cognitive Sciences, 2023 - cell.com
Abstract Colombian drug lord Pablo Escobar, while on the run, purportedly burned two
million dollars in banknotes to keep his daughter warm. A stark reminder that, in life …
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …
A practical guide to multi-objective reinforcement learning and planning
Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …
Multi-objective GFlowNets
M Jain, SC Raparthy… - International …, 2023 - proceedings.mlr.press
We study the problem of generating diverse candidates in the context of Multi-Objective
Optimization. In many applications of machine learning such as drug discovery and material …
Pareto set learning for expensive multi-objective optimization
Expensive multi-objective optimization problems can be found in many real-world
applications, where their objective function evaluations involve expensive computations or …
Prediction-guided multi-objective reinforcement learning for continuous robot control
Many real-world control problems involve conflicting objectives where we desire a dense
and high-quality set of control policies that are optimal for different objective preferences …
Personalized soups: Personalized large language model alignment via post-hoc parameter merging
While Reinforcement Learning from Human Feedback (RLHF) aligns Large Language
Models (LLMs) with general, aggregate human preferences, it is suboptimal for learning …
Effective diversity in population based reinforcement learning
J Parker-Holder, A Pacchiano… - Advances in …, 2020 - proceedings.neurips.cc
Exploration is a key problem in reinforcement learning, since agents can only learn from
data they acquire in the environment. With that in mind, maintaining a population of agents is …
Toward Pareto efficient fairness-utility trade-off in recommendation through reinforcement learning
The issue of fairness in recommendation is becoming increasingly essential as
Recommender Systems (RS) touch and influence more and more people in their daily lives …