35 research outputs found

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

Author: Carignan Dean
Edgar Richard
Fusi Nicolo
Horvitz Eric
King Nicholas
Larson Jonathan
Lee Yin Tat
Li Yuanzhi
Liu Weishung
Luo Renqian
McKinney Scott Mayer
Ness Robert Osazuwa
Nori Harsha
Poon Hoifung
Qin Tao
Usuyama Naoto
White Chris
Zhang Sheng
Publication date: 27/11/2023

Generalist foundation models such as GPT-4 have displayed surprising capabilities in a wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot match specialist capabilities of fine-tuned models. For example, most explorations to date on medical competency benchmarks have leveraged domain-specific training, as exemplified by efforts on BioGPT and Med-PaLM. We build on a prior study of GPT-4's capabilities on medical challenge benchmarks in the absence of special training. Rather than using simple prompting to highlight the model's out-of-the-box capabilities, we perform a systematic exploration of prompt engineering. We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks. The prompting methods we explore are general purpose, and make no specific use of domain expertise, removing the need for expert-curated content. Our experimental design carefully controls for overfitting during the prompt engineering process. We introduce Medprompt, based on a composition of several prompting strategies. With Medprompt, GPT-4 achieves state-of-the-art results on all nine of the benchmark datasets in the MultiMedQA suite. The method outperforms leading specialist models such as Med-PaLM 2 by a significant margin with an order of magnitude fewer calls to the model. Steering GPT-4 with Medprompt achieves a 27% reduction in error rate on the MedQA dataset over the best methods to date achieved with specialist models and surpasses a score of 90% for the first time. Beyond medical problems, we show the power of Medprompt to generalize to other domains and provide evidence for the broad applicability of the approach via studies of the strategy on exams in electrical engineering, machine learning, philosophy, accounting, law, nursing, and clinical psychology.

Comment: 21 pages, 7 figures

arXiv.org e-Print Archive

International evaluation of an AI system for breast cancer screening.

Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful [1]. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives [2]. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening.

Professor Fiona Gilbert receives funding from the National Institute for Health Research (Senior Investigator award).

Apollo (Cambridge)

Not all green space is created equal: biodiversity predicts psychological restorative benefits from urban green space

Author: Akinwande
Alvarsson
Anguelovski
Annerstedt
Annerstedt
Aronson
Aspinall
Barton
Barton
Bates
Berman
Capaldi
Carvell
Clayton
Cohen-Shacham
Cox
Cox
Dadvand
Dallimer
Davis
Dean
Dearborn
Dickerson
Ekkel
Fox
Francis
Frantz
Fuller
Gascon
Gidlow
Gidlow
Gobster
Goddard
Gonzalez
Grueber
Hand
Harpe
Hartig
Hartig
Herzele
Honold
Jorm
Kaplan
Kazmierczak
Keesstra
Keniger
Krekel
Lee
Lee
Lepczyk
Linton
Lovell
Luck
Margaritis
Marselle
Matthies
Mayer
Mayer
Mceachan
Mceachan
Mckinney
Nassauer
Ngiam
Nisbet
Nordh
Nordh
Norton
Nowak
Palliwoda
Pollard
Randrup
Roberts
Roe
Roe
Rook
Roszak
Sandifer
Schebella
Schroeder
Schultz
Scott
Shanahan
Southon
Steptoe
Stiglitz
Taylor
Tremblay
Triguero-Mas
Van Den Berg
Van Den Berg
Vieira
Wilson
Young
Žlender
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Contemporary epidemiological methods testing the associations between green space and psychological well-being treat all vegetation cover as equal. However, there is very good reason to expect that variations in ecological "quality" (number of species, integrity of ecological processes) may influence the link between access to green space and benefits to human health and well-being. We test the relationship between green space quality and restorative benefit in an inner city urban population in Bradford, UK. We selected 12 urban parks for study where we carried out botanical and faunal surveys to quantify biodiversity and assessed the site facilities of the green space (cleanliness, provision of amenities). We also conducted 128 surveys with park users to quantify psychological restoration based on four self-reported measures of general restoration, attention-grabbing distractions, being away from everyday life, and site preference. We present three key results. First, there is a positive association between site facilities and biodiversity. Second, restorative benefit is predicted by biodiversity, which explained 43% of the variance in restorative benefit across the parks, with minimal input from other variables. Third, the benefits accrued through access to green space were unrelated to age, gender, and ethnic background. The results add to a small but growing body of evidence that emphasises the role of nature in contributing to the well-being of urban populations and, hence, the need to consider biodiversity in the design of landscapes that enhance multiple ecosystem services.

Crossref

LSHTM Research Online

Directory of Open Access Journals

Frontiers - Publisher Connector

White Rose Research Online

FigShare

Kinetics of DNA Refolding from Longitudinal Exchange NMR Spectroscopy

Author: Ahmadian
Antao
Antao
Di Leva
Flamm
Fürtig
Fürtig
Hermann
Hwang
Höbartner
Hüttenhofer
Karymov
Kellenbach
Kloiber
Kreutz
Lim
Mayer
McKinney
Micura
Nakano
Overmars
Patel
Puffer
Rieder
Scott
Serganov
Shajani
Stoltenburg
Thiel
Tollinger
Washietl
Wenter
Wenter
Wenter
Woodson
Zhang
Zuker
Publication venue: 'Wiley'

Crossref

Robust and Efficient Medical Imaging with Self-Supervision

Recent progress in Medical Artificial Intelligence (AI) has delivered systems that can reach clinical expert level performance. However, such systems tend to demonstrate sub-optimal "out-of-distribution" performance when evaluated in clinical settings different from the training environment. A common mitigation strategy is to develop separate systems for each clinical setting using site-specific data [1]. However, this quickly becomes impractical as medical data is time-consuming to acquire and expensive to annotate [2]. Thus, the problem of "data-efficient generalization" presents an ongoing difficulty for Medical AI development. Although progress in representation learning shows promise, its benefits have not been rigorously studied, specifically for out-of-distribution settings. To meet these challenges, we present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI. REMEDIS uses a generic combination of large-scale supervised transfer learning with self-supervised learning and requires little task-specific customization. We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data. REMEDIS exhibits significantly improved in-distribution performance with up to 11.5% relative improvement in diagnostic accuracy over a strong supervised baseline. More importantly, our strategy leads to strong data-efficient generalization of medical imaging AI, matching strong supervised baselines using between 1% and 33% of retraining data across tasks. These results suggest that REMEDIS can significantly accelerate the life-cycle of medical imaging AI development, thereby presenting an important step forward for medical imaging AI to deliver broad impact.

arXiv.org e-Print Archive

Serotonin gene variants are unlikely to play a significant role in the pathogenesis of the sudden infant death syndrome

Author: Ackerman
Alm
Ambler
Anderson
Antila
Arens
Arneil
Arnestad
Arnestad
Bach
Baker-Herman
Barnes
Battersby
Beal
Beal
Beal
Bennett
Berner
Biondo
Blair
Blair
Blair
Blakely
Broadbelt
Broadbelt
Broadbelt
Cheng
Comet
Courts
Dashash
David S. Paterson
Deckert
Dergacheva
Duncan
Duncan
Dwyer
Edner
Engelberts
Fellermann
Ferrante
Ferrante
Ferrante
Fifer
Filiano
Filiano
Filonzi
Fiskerstrand
Fleming
Fleming
Franco
Franco
Froggatt
Getahun
Gillis
Gonzalez
Greenberg
Haas
Haglund
Harper
Harper
Hauck
Hauck
Heils
Heils
Hendricks
Hendricks
Hoffman
Hranilovic
Hunt
Hunt
Irgens
Irgens
Iyasu
Kattwinkel
Kelly
Kelly
Kibel
King
Kinney
Kinney
Kinney
Kinney
Kinney
Kitsantas
Klintschar
Klug
Korachi
Krous
Lalley
Lalley
Ledwidge
Lee
Lesch
MacDorman
Machaalani
MacKenzie
Maher
Maurer
McKinney
Millar
Mitchell
Mitchell
Mitchell
Mitchell
Moon
Morley
Morrow
Moscovis
Moscovis
Narita
Nonnis Marzano
Ogilvie
Opdal
Opdal
Opdal
Opdal
Opdal
Opdal
Opdal
Otagiri
Oyen
Ozawa
Ozawa
O’Kusky
O’Kusky
O’Leary
O’Mara
Panigrahy
Paterson
Paterson
Paterson
Paterson
Paterson
Pelayo
Pena
Peterson
Pfaar
Pharoah
Pincus
Ponsonby
Prandota
Ramanathan
Ramirez
Rand
Rand
Rantonen
Raul
Sabol
Scherer
Schoendorf
Schwartz
Scott
Scragg
Sebat
Seneviratne
Shannon
Shannon
Shen
Shih
Shih
Stewart
Summers
Takashima
Tappin
Tappin
Toruner
Trachtenberg
Tryba
Vege
Vennemann
Villalon
Walsh
Walther
Waters
Waters
Weese-Mayer
Weese-Mayer
Weese-Mayer
Willinger
Wisborg
Zhang
Publication venue: 'Elsevier BV'

Crossref

A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis

Author: A Bernardo
A Scott
A Scott
Alessandro Vespignani
Alexander Hanna
Alexander Hanna
Anders Olof Larsson
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andreas Jungherr
Andy Guess
Axel Bruns
Bernhard Rieder
Claudio Cioffi-Revilla
Claudio Cioffi-Revilla
Clay Shirky
Cornelius Puschmann
Cristian Vaccari
Damian Trilling
Daniel Gayo-Avello
Daniel Kreiss
David A Shamma
David A Shamma
David Easley
David Lazer
David Lazer
De Roure
Deen Freelon
Deen Freelon
Deen Freelon
Derek Hansen
Derek Ruths
E J Mark
Elizabeth Dubois
Eni Panagiotis Takis Metaxas
Eric D Kolaczyk
Eric D Kolaczyk
Erik Borra
Fernando Diaz
Fred Morstatter
Fred Morstatter
Gary King
Grant Allen
H Kay
H Robert
Hadley Wickham
Hadley Wickham
Hadley Wickham
Homero Gil De Zúñiga
J Peter
J Rob
J Rob
James Howison
Janet M Box-Steffensmeier
Jay A Kreibich
John W Tukey
Kate Crawford
Keith Bradnam
Kevin Makice
Malcolm R Parks
Marco T Bastos
Mark Edward Huberty
Markus Strohmaier
Matthew A Russell
Maurice Vergeer
Merja Mahrt
Michael D Conover
Morgan R Frank
Nick Anstead
Nigel Gilbert
Norman Matloff
Pablo Barberá
Pablo Barberá
Pablo Barberá
Pascal Jürgens
Pascal Jürgens
Pascal Jürgens
Pascal Jürgens
Pascal Jürgens
Paul Teetor
R Core Team
R Peter
Richard Rogers
Richard Rogers
Robert I Kabacoff
Roger D Peng
Rolfe Daus Peterson
Rosaria Conte
Sandra González
Sandra González-Bailón
Sandra González-Bailón
Sandra González-Bailón
Sarah J Jackson
Seth Myers
Shamanth Kumar
Sharad Goel
Simon Munzert
Stanley Wasserman
Tarleton Gillespie
Thomas Zeitzoff
Toby Segaran
Todd Graham
Tony Ojeda
V Dhavan
Viktor Mayer
W Lance Bennett
Wes Mckinney
William Roberts
Winston Chang
Yannis Theocharis
Yu-Ru Lin
Yu-Ru Lin
Zed A Shaw
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Crossref