Chat generative pretrained transformer fails the multiple-choice American College of Gastroentero...

Q Wei, Z Yao, Y Cui, B Wei, Z Jin, X Xu - Journal of Biomedical Informatics, 2024 - Elsevier

Objective: Large language models (LLMs) such as ChatGPT are increasingly
explored in medical domains. However, the absence of standard guidelines for performance …

Cited by 32 - Related articles - All 5 versions

Large language models in healthcare: from a systematic review on medical examinations to a comparative analysis on fundamentals of robotic surgery online test

A Moglia, K Georgiou, P Cerveri, L Mainardi… - Artificial Intelligence …, 2024 - Springer

Large language models (LLMs) have the intrinsic potential to acquire medical knowledge.
Several studies assessing LLMs on medical examinations have been published. However …

Cited by 6 - Related articles - All 8 versions

Artificial intelligence in ophthalmology: a comparative analysis of GPT-3.5, GPT-4, and human expertise in answering StatPearls questions

M Moshirfar, AW Altaf, IM Stoakes, JJ Tuttle… - Cureus, 2023 - ncbi.nlm.nih.gov

Purpose: This study evaluates the performance of two ChatGPT models (GPT-3.5 and GPT-4)
and human professionals in answering ophthalmology questions from the StatPearls …

Cited by 85 - Related articles - All 9 versions

Performance of ChatGPT on nephrology test questions

J Miao, C Thongprayoon, OAG Valencia… - Clinical Journal of the …, 2023 - journals.lww.com

Background: ChatGPT is a novel tool that allows people to engage in conversations with an
advanced machine learning model. ChatGPT's performance in the United States Medical …

Cited by 43 - Related articles - All 4 versions

Benchmarking open-source large language models, GPT-4 and Claude 2 on multiple-choice questions in nephrology

S Wu, M Koo, L Blum, A Black, L Kao, Z Fei, F Scalzo… - NEJM AI, 2024 - ai.nejm.org

Background: In recent years, significant breakthroughs have been made in the field of natural
language processing, particularly with the development of large language models (LLMs) …

Cited by 24 - Related articles - All 2 versions

Evaluating the efficacy of ChatGPT in navigating the Spanish medical residency entrance examination (MIR): Promising horizons for AI in clinical medicine

F Guillen-Grima, S Guillen-Aguinaga… - Clinics and …, 2023 - mdpi.com

The rapid progress in artificial intelligence, machine learning, and natural language
processing has led to increasingly sophisticated large language models (LLMs) for use in …

Cited by 34 - Related articles - All 8 versions

Evaluation of the reliability and readability of ChatGPT-4 responses regarding hypothyroidism during pregnancy

CE Onder, G Koc, P Gokbulut, I Taskaldiran… - Scientific reports, 2024 - nature.com

Hypothyroidism is characterized by thyroid hormone deficiency and has adverse effects on
both pregnancy and fetal health. Chat Generative Pre-trained Transformer (ChatGPT) is a …

Cited by 29 - Related articles - All 7 versions

Accuracy of ChatGPT in common gastrointestinal diseases: impact for patients and providers

A Kerbage, J Kassab, J El Dahdah… - Clinical …, 2024 - cghjournal.org

Since its release in 2022, Chat Generative Pre-Trained Transformer (ChatGPT) became the
most rapidly expanding consumer software application in history, and its role in medicine …

Cited by 23 - Related articles - All 2 versions

Performance of ChatGPT-3.5 and GPT-4 in national licensing examinations for medicine, pharmacy, dentistry, and nursing: a systematic review and meta-analysis

HK Jin, HE Lee, EY Kim - BMC Medical Education, 2024 - Springer

Background: ChatGPT, a recently developed artificial intelligence (AI) chatbot, has
demonstrated improved performance in examinations in the medical field. However, thus far …

Cited by 5 - Related articles - All 8 versions

A comparative study of open-source large language models, GPT-4 and Claude 2: Multiple-choice test taking in nephrology

S Wu, M Koo, L Blum, A Black, L Kao, F Scalzo… - arXiv preprint arXiv …, 2023 - arxiv.org

In recent years, there have been significant breakthroughs in the field of natural language
processing, particularly with the development of large language models (LLMs). These …

Cited by 36 - Related articles - All 2 versions