65 research outputs found

    Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards human-level medical question answering

    Background: Evidence on the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model (LLM), in the ophthalmology question-answering domain is needed. // Methods: We tested GPT-4 on two 260-question multiple choice question sets from the Basic and Clinical Science Course (BCSC) Self-Assessment Program and the OphthoQuestions question banks. We compared the accuracy of GPT-4 models with varying temperatures (creativity setting) and evaluated their responses in a subset of questions. We also compared the best-performing GPT-4 model to GPT-3.5 and to historical human performance. // Results: GPT-4-0.3 (GPT-4 with a temperature of 0.3) achieved the highest accuracy among GPT-4 models, with 75.8% on the BCSC set and 70.0% on the OphthoQuestions set. The combined accuracy was 72.9%, which represents an 18.3% raw improvement in accuracy compared with GPT-3.5 (p<0.001). Human graders preferred responses from models with a temperature higher than 0 (more creative). Exam section, question difficulty and cognitive level were all predictive of GPT-4-0.3 answer accuracy. GPT-4-0.3’s performance was numerically superior to human performance on the BCSC (75.8% vs 73.3%) and OphthoQuestions (70.0% vs 63.0%), but the difference was not statistically significant (p=0.55 and p=0.09). // Conclusion: GPT-4, an LLM trained on non-ophthalmology-specific data, performs significantly better than its predecessor on simulated ophthalmology board-style exams. Remarkably, its performance tended to be superior to historical human performance, but that difference was not statistically significant in our study
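    The combined accuracy quoted in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the two sets are pooled question by question (260 each, as stated) and that the per-set percentages are rounded from whole numbers of correct answers. The function name and the implied GPT-3.5 figure are not from the abstract; the latter is inferred by subtracting the reported 18.3-point raw improvement.

```python
def combined_accuracy(accuracies_pct, n_questions):
    """Pool per-set accuracies (in percent) into one overall percentage."""
    # Recover the number of correct answers per set (assumed to be whole numbers).
    correct = [round(a / 100 * n) for a, n in zip(accuracies_pct, n_questions)]
    return 100 * sum(correct) / sum(n_questions)


# Reported GPT-4-0.3 accuracies on the BCSC and OphthoQuestions sets.
combined = combined_accuracy([75.8, 70.0], [260, 260])

# Inferred, not reported: GPT-3.5 accuracy implied by an 18.3-point raw improvement.
implied_gpt35 = combined - 18.3

print(f"combined: {combined:.1f}%")
print(f"implied GPT-3.5: {implied_gpt35:.1f}%")
```

    Under these assumptions the pooled figure rounds to the 72.9% given in the abstract.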

    “It’s hard to tell”. The challenges of scoring patients on standardised outcome measures by multidisciplinary teams: a case study of Neurorehabilitation

    Background: Interest is increasing in the application of standardised outcome measures in clinical practice. Measures designed for use in research may not be sufficiently precise to be used in monitoring individual patients. However, little is known about how clinicians, and in particular multidisciplinary teams, score patients using these measures. This paper explores the challenges faced by multidisciplinary teams in allocating scores on standardised outcome measures in clinical practice. Methods: Qualitative case study of an inpatient neurorehabilitation team who routinely collected standardised outcome measures on their patients. Data were collected using non-participant observation, fieldnotes and tape recordings of 16 multidisciplinary team meetings during which the measures were recited and scored. Eleven clinicians from a range of different professions were also interviewed. Data were analysed using grounded theory techniques. Results: We identified a number of instances where scoring the patient was 'problematic'. In 'problematic' scoring, the scores were uncertain and subject to revision and adjustment. They sometimes required negotiation to agree on a shared understanding of concepts to be measured and the guidelines for scoring. Several factors gave rise to this problematic scoring. Team members' knowledge about patients' problems changed over time so that initial scores had to be revised or dismissed, creating an impression of deterioration when none had occurred. Patients had complex problems which could not easily be distinguished from each other, and patients themselves varied in their ability to perform tasks over time and across different settings. Team members from different professions worked with patients in different ways and had different perspectives on patients' problems. This was particularly an issue in the scoring of concepts such as anxiety, depression, orientation, social integration and cognitive problems. Conclusion: From a psychometric perspective these problems would raise questions about the validity, reliability and responsiveness of the scores. However, from a clinical perspective, such characteristics are an inherent part of clinical judgement and reasoning. It is important to highlight the challenges faced by multidisciplinary teams in scoring patients on standardised outcome measures, but it would be unwarranted to conclude that such challenges imply that these measures should not be used in clinical practice for decision making about individual patients. However, our findings do raise some concerns about the use of such measures for performance management

    Management of hepatitis C virus genotype 4: recommendations of an international expert panel.

    HCV has been classified into no fewer than six major genotypes and a series of subtypes. Each HCV genotype is unique with respect to its nucleotide sequence, geographic distribution, and response to therapy. Genotypes 1, 2, and 3 are common throughout North America and Europe. HCV genotype 4 (HCV-4) is common in the Middle East and in Africa, where it is responsible for more than 80% of HCV infections. It has recently spread to several European countries. HCV-4 is considered a major cause of chronic hepatitis, cirrhosis, hepatocellular carcinoma, and liver transplantation in these regions. Although HCV-4 is the cause of approximately 20% of the 170 million cases of chronic hepatitis C in the world, it has not been the subject of widespread research. Therefore, this document, drafted by a panel of international experts, aimed to review current knowledge on the epidemiology, natural history, clinical, histological features, and treatment of HCV-4 infections

    Aphrodisiac activity of 50% ethanolic extracts of Myristica fragrans Houtt. (nutmeg) and Syzygium aromaticum (L) Merr. & Perry. (clove) in male mice: a comparative study

    BACKGROUND: Spices are considered as sexual invigorators in the Unani System of Medicine. In order to explore the sexual function improving effect of Myristica fragrans Houtt. (nutmeg) and Syzygium aromaticum (L) Merr. & Perry. (clove) an experimental study was conducted in normal male mice. METHODS: The extracts (50% ethanolic) of nutmeg and clove were administered (500 mg/kg; p.o.) to different groups of male Swiss mice. Mounting behaviour, mating performance, and general short term toxicity of the test drugs were determined and compared with the standard drug Penegra (Sildenafil citrate). RESULTS: The extracts of the nutmeg and clove were found to stimulate the mounting behaviour of male mice, and also to significantly increase their mating performance. The drugs were devoid of any conspicuous general short term toxicity. CONCLUSION: The extracts (50% ethanolic) of nutmeg and clove enhanced the sexual behaviour of male mice

    The Evolution of the Major Hepatitis C Genotypes Correlates with Clinical Response to Interferon Therapy

    Patients chronically infected with hepatitis C virus (HCV) require significantly different durations of therapy and achieve substantially different sustained virologic response rates to interferon-based therapies, depending on the HCV genotype with which they are infected. There currently exists no systematic framework that explains these genotype-specific response rates. Since humans are the only known natural hosts for HCV (a virus that is at least hundreds of years old), one possibility is that over the time frame of this relationship, HCV accumulated adaptive mutations that confer increasing resistance to the human immune system. Given that interferon therapy functions by triggering an immune response, we hypothesized that clinical response rates are a reflection of viral evolutionary adaptations to the immune system. We have performed the first phylogenetic analysis to include all available full-length HCV genomic sequences (n = 345). This resulted in a new cladogram of HCV. This tree establishes for the first time the relative evolutionary ages of the major HCV genotypes. The outcome data from prospective clinical trials that studied interferon and ribavirin therapy were then mapped onto this new tree. This mapping revealed a correlation between genotype-specific responses to therapy and respective genotype age. This correlation allows us to predict that genotypes 5 and 6, for which there currently are no published prospective trials, will likely have intermediate response rates, similar to genotype 3. Ancestral protein sequence reconstruction was also performed, which identified the HCV proteins E2 and NS5A as potential determinants of genotype-specific clinical outcome. Biochemical studies have independently identified these same two proteins as having genotype-specific abilities to inhibit the innate immune factor double-stranded RNA-dependent protein kinase (PKR). An evolutionary analysis of all available HCV genomes supports the hypothesis that immune selection was a significant driving force in the divergence of the major HCV genotypes and that viral factors that acquired the ability to inhibit the immune response may play a role in determining genotype-specific response rates to interferon therapy