MaxMin-RLHF: Towards equitable alignment of large language models with diverse human preferences S Chakraborty, J Qiu, H Yuan, A Koppel, F Huang, D Manocha, AS Bedi, ... arXiv preprint arXiv:2402.08925, 2024 | 33 | 2024 |
Reward-directed conditional diffusion: Provable distribution estimation and reward improvement H Yuan, K Huang, C Ni, M Chen, M Wang Advances in Neural Information Processing Systems 36, 2024 | 28 | 2024 |
Diffusion model for data-driven black-box optimization Z Li, H Yuan, K Huang, C Ni, Y Ye, M Chen, M Wang arXiv preprint arXiv:2403.13219, 2024 | 7 | 2024 |
Neural network is heterogeneous: Phase matters more Y Nie, H Yuan arXiv preprint arXiv:2111.02014, 2021 | 7 | 2021 |
Gradient Guidance for Diffusion Models: An Optimization Perspective Y Guo, H Yuan, Y Yang, M Chen, M Wang arXiv preprint arXiv:2404.14743, 2024 | 6 | 2024 |
Learning entangled single-sample distributions via iterative trimming H Yuan, Y Liang International Conference on Artificial Intelligence and Statistics, 2666-2676, 2020 | 6 | 2020 |
Learning entangled single-sample Gaussians in the subset-of-signals model Y Liang, H Yuan Conference on Learning Theory, 2712-2737, 2020 | 4 | 2020 |
MaxMin-RLHF: Alignment with Diverse Human Preferences S Chakraborty, J Qiu, H Yuan, A Koppel, D Manocha, F Huang, A Bedi, ... Forty-first International Conference on Machine Learning | 3 | |
Unified off-policy learning to rank: a reinforcement learning perspective Z Zhang, Y Su, H Yuan, Y Wu, R Balasubramanian, Q Wu, H Wang, ... Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Adversarial attacks on online learning to rank with stochastic click models Z Wang, R Balasubramanian, H Yuan, C Song, M Wang, H Wang arXiv preprint arXiv:2305.19218, 2023 | 2 | 2023 |
Bandit theory and Thompson sampling-guided directed evolution for sequence optimization H Yuan, C Ni, H Wang, X Zhang, L Cong, C Szepesvári, M Wang Advances in Neural Information Processing Systems 35, 38291-38304, 2022 | 2 | 2022 |
Uniform joint screening for ultra-high dimensional graphical models Z Zheng, H Shi, Y Li, H Yuan Journal of Multivariate Analysis 179, 104645, 2020 | 1 | 2020 |
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement H Yuan, Y Zeng, Y Wu, H Wang, M Wang, L Leqi arXiv preprint arXiv:2410.13828, 2024 | | 2024 |
Conversational Dueling Bandits in Generalized Linear Models S Yang, H Yuan, X Zhang, M Wang, H Zhang, H Wang Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | | 2024 |
Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization J Qiu, H Yuan, J Zhang, W Chen, H Wang, M Wang Proceedings of the AAAI Conference on Artificial Intelligence 38 (13), 14686 …, 2024 | | 2024 |