

PUBMED FOR HANDHELDS



  • Title: Do ChatGPT and Google differ in answers to commonly asked patient questions regarding total shoulder and total elbow arthroplasty?
    Authors: Tharakan S, Klein B, Bartlett L, Atlas A, Parada SA, Cohn RM.
    Journal: J Shoulder Elbow Surg; 2024 Aug; 33(8):e429-e437. PubMed ID: 38182023.
    Abstract:
    BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) offer a potential new resource for patient education. Answers given by Chat Generative Pre-Trained Transformer (ChatGPT), an LLM-based AI chatbot, to frequently asked questions (FAQs) were compared with answers from a contemporary Google search to determine the reliability of the information these sources provide for patient education in upper extremity arthroplasty.

    METHODS: "Total shoulder arthroplasty" (TSA) and "total elbow arthroplasty" (TEA) were entered into Google Search and ChatGPT 3.0 to identify the ten most frequently asked questions for each procedure. On Google, the FAQs were obtained from the "people also ask" section, while ChatGPT was prompted to provide the ten most frequently asked questions. Each question, answer, and cited reference was recorded. A modified version of the Rothwell system was used to categorize questions into 10 subtopics: special activities, timeline of recovery, restrictions, technical details, cost, indications/management, risks and complications, pain, longevity, and evaluation of surgery. Each reference was categorized as commercial, academic, medical practice, single-surgeon personal, or social media. Questions for TSA and TEA were pooled for analysis, and proportions were compared between Google and ChatGPT with a two-sample Z-test.

    RESULTS: Overall, most questions related to procedural indications or management (17.5%). Question categories did not differ significantly between Google and ChatGPT. The majority of references were from academic websites (65%). ChatGPT produced a greater proportion of academic references than Google (80% vs. 50%; P = .047), while Google more commonly provided medical practice references (25% vs. 0%; P = .017).

    CONCLUSION: In conjunction with patient-physician discussions, AI LLMs may provide a reliable resource for patients. By providing information grounded in academic references, these tools have the potential to improve health literacy and shared decision making for patients seeking information about TSA and TEA.

    CLINICAL SIGNIFICANCE: With the rising prevalence of AI programs, it is essential to understand how these applications affect patient education in medicine.
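
    As an illustration of the statistical comparison described in the METHODS, the sketch below computes a two-sided two-sample Z-test for proportions in Python. The abstract does not report raw reference counts, so the 16/20 (80%) and 10/20 (50%) counts and the helper name two_sample_proportion_ztest are assumptions for illustration; with these assumed counts the calculation reproduces the reported P = .047 for academic references.

        from math import sqrt
        from statistics import NormalDist

        def two_sample_proportion_ztest(x1, n1, x2, n2):
            """Two-sided two-sample Z-test for equality of proportions (pooled)."""
            p1, p2 = x1 / n1, x2 / n2
            p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under the null hypothesis
            se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
            z = (p1 - p2) / se
            p_value = 2 * (1 - NormalDist().cdf(abs(z)))
            return z, p_value

        # Hypothetical counts: 16/20 (80%) academic references for ChatGPT vs.
        # 10/20 (50%) for Google; the sample sizes are assumed, not stated in the abstract.
        z, p = two_sample_proportion_ztest(16, 20, 10, 20)
        print(f"z = {z:.3f}, two-sided P = {p:.3f}")  # prints P = 0.047, matching the abstract

    Note that the normal approximation underlying this test is unreliable when a cell count is zero or very small (as in the 25% vs. 0% medical practice comparison), where an exact test would typically be preferred.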