Phi-3 technical report: A highly capable language model locally on your phone M Abdin, SA Jacobs, AA Awan, J Aneja, A Awadallah, H Awadalla, ... arXiv preprint arXiv:2404.14219, 2024 | 150 | 2024 |
Model-free reinforcement learning in infinite-horizon average-reward Markov decision processes CY Wei, MJ Jahromi, H Luo, H Sharma, R Jain International Conference on Machine Learning, 10170-10180, 2020 | 102 | 2020 |
Evaluating cognitive maps and planning in large language models with CogEval I Momennejad, H Hasanbeig, F Vieira Frujeri, H Sharma, N Jojic, ... Advances in Neural Information Processing Systems 36, 2024 | 33 | 2024 |
Fine-tuning language models with advantage-induced policy alignment B Zhu, H Sharma, FV Frujeri, S Dong, C Zhu, MI Jordan, J Jiao arXiv preprint arXiv:2306.02231, 2023 | 27 | 2023 |
A universal empirical dynamic programming algorithm for continuous state MDPs WB Haskell, R Jain, H Sharma, P Yu IEEE Transactions on Automatic Control 65 (1), 115-129, 2019 | 20 | 2019 |
Approximate relative value learning for average-reward continuous state MDPs H Sharma, M Jafarnia-Jahromi, R Jain Uncertainty in Artificial Intelligence, 956-964, 2020 | 16 | 2020 |
An empirical relative value learning algorithm for non-parametric MDPs with continuous state space H Sharma, R Jain, A Gupta 2019 18th European Control Conference (ECC), 1368-1373, 2019 | 13 | 2019 |
Language models can be logical solvers J Feng, R Xu, J Hao, H Sharma, Y Shen, D Zhao, W Chen arXiv preprint arXiv:2311.06158, 2023 | 9 | 2023 |
Evaluating cognitive maps in large language models with CogEval: No emergent planning I Momennejad, H Hasanbeig, FV Frujeri, H Sharma, RO Ness, N Jojic, ... Advances in Neural Information Processing Systems 37, 2023 | 9 | 2023 |
Randomized function fitting-based empirical value iteration WB Haskell, P Yu, H Sharma, R Jain 2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2467-2472, 2017 | 9 | 2017 |
ALLURE: A systematic protocol for auditing and improving LLM-based evaluation of text using iterative in-context-learning H Hasanbeig, H Sharma, L Betthauser, FV Frujeri, I Momennejad arXiv preprint arXiv:2309.13701, 2023 | 8 | 2023 |
An empirical dynamic programming algorithm for continuous MDPs WB Haskell, R Jain, H Sharma, P Yu arXiv preprint arXiv:1709.07506, 2017 | 8 | 2017 |
An approximately optimal relative value learning algorithm for averaged MDPs with continuous states and actions H Sharma, R Jain 2019 57th Annual Allerton Conference on Communication, Control, and …, 2019 | 7 | 2019 |
Optimal spectrum sensing for cognitive radio with imperfect detector H Sharma, A Patel, SN Merchant, UB Desai 2014 IEEE 79th Vehicular Technology Conference (VTC Spring), 1-5, 2014 | 4 | 2014 |
ALLURE: Auditing and improving LLM-based evaluation of text using iterative in-context-learning H Hasanbeig, H Sharma, L Betthauser, F Vieira Frujeri, I Momennejad arXiv e-prints, arXiv:2309.13701, 2023 | 3 | 2023 |
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment S Zhang, D Yu, H Sharma, Z Yang, S Wang, H Hassan, Z Wang arXiv preprint arXiv:2405.19332, 2024 | 2 | 2024 |
Language Models can be Deductive Solvers J Feng, R Xu, J Hao, H Sharma, Y Shen, D Zhao, W Chen Findings of the Association for Computational Linguistics: NAACL 2024, 4026-4042, 2024 | 1 | 2024 |
Finite Time Guarantees for Continuous State MDPs with Generative Model H Sharma, R Jain 2020 59th IEEE Conference on Decision and Control (CDC), 3617-3622, 2020 | 1 | 2020 |
Randomized Policy Learning for Continuous State and Action MDPs H Sharma, R Jain arXiv preprint arXiv:2006.04331, 2020 | 1 | 2020 |
Empirical algorithms for general stochastic systems with continuous states and actions H Sharma, R Jain, W Haskell 2019 IEEE 58th Conference on Decision and Control (CDC), 6344-6349, 2019 | 1 | 2019 |