Comparison-based conversational recommender system with relative bandit feedback Z Xie, T Yu, C Zhao, S Li Proceedings of the 44th International ACM SIGIR Conference on Research and …, 2021 | 41 | 2021 |
Knowledge-aware conversational preference elicitation with bandit feedback C Zhao, T Yu, Z Xie, S Li Proceedings of the ACM Web Conference 2022, 483-492, 2022 | 24 | 2022 |
Clustering of conversational bandits for user preference learning and elicitation J Wu, C Zhao, T Yu, J Li, S Li Proceedings of the 30th ACM International Conference on Information …, 2021 | 21 | 2021 |
Best-of-three-worlds analysis for linear bandits with follow-the-regularized-leader algorithm F Kong, C Zhao, S Li The Thirty Sixth Annual Conference on Learning Theory, 657-673, 2023 | 9 | 2023 |
Learning adversarial linear mixture Markov decision processes with bandit feedback and unknown transition C Zhao, R Yang, B Wang, S Li The Eleventh International Conference on Learning Representations, 2023 | 8 | 2023 |
Simultaneously learning stochastic and adversarial bandits under the position-based model C Chen, C Zhao, S Li Proceedings of the AAAI Conference on Artificial Intelligence 36 (6), 6202-6210, 2022 | 5 | 2022 |
Conservative contextual combinatorial cascading bandit K Wang IEEE Access 9, 151434-151443, 2021 | 4 | 2021 |
Clustering of conversational bandits with posterior sampling for user preference learning and elicitation Q Li, C Zhao, T Yu, J Wu, S Li User Modeling and User-Adapted Interaction 33 (5), 1065-1112, 2023 | 3 | 2023 |
Learning adversarial low-rank Markov decision processes with unknown transition and full-information feedback C Zhao, R Yang, B Wang, X Zhang, S Li Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization C Zhao, Y Ze, J Dong, B Wang, S Li Proceedings of the Sixteenth ACM International Conference on Web Search and …, 2023 | 2 | 2023 |
DPMAC: differentially private communication for cooperative multi-agent reinforcement learning C Zhao, Y Ze, J Dong, B Wang, S Li arXiv preprint arXiv:2308.09902, 2023 | 1 | 2023 |
Toward joint utilization of absolute and relative bandit feedback for conversational recommendation Y Xia, Z Xie, T Yu, C Zhao, S Li User Modeling and User-Adapted Interaction, 1-38, 2024 | | 2024 |
Towards Provably Efficient Learning of Extensive-Form Games with Imperfect Information and Linear Function Approximation C Zhao, S Chen, W Liu, H Fu, Q Fu, S Li | | |
Learning Adversarial Low-rank MDPs with Unknown Transition and Full-information Feedback C Zhao, R Yang, B Wang, X Zhang, S Li | | |