
    Estimating the number needed to treat from continuous outcomes in randomised controlled trials: methodological challenges and worked example using data from the UK Back Pain Exercise and Manipulation (BEAM) trial

    Background Reporting numbers needed to treat (NNT) improves the interpretability of trial results. It is unusual for continuous outcomes to be converted to numbers of individual responders to treatment (i.e., those who reach a particular threshold of change), and deteriorations prevented are only rarely considered. We consider how numbers needed to treat can be derived from continuous outcomes, illustrated with a worked example showing the methods and challenges. Methods We used data from the UK BEAM trial (n = 1,334) of physical treatments for back pain, originally reported as showing, at best, small to moderate benefits. Participants were randomised to receive 'best care' in general practice (the comparator treatment) or one of three manual and/or exercise treatments: 'best care' plus manipulation, exercise, or manipulation followed by exercise. We used established consensus thresholds for improvement in Roland-Morris disability questionnaire scores at three and twelve months to derive NNTs for improvements and for benefits (improvements gained + deteriorations prevented). Results At three months, NNT estimates ranged from 5.1 (95% CI 3.4 to 10.7) to 9.0 (5.0 to 45.5) for exercise, 5.0 (3.4 to 9.8) to 5.4 (3.8 to 9.9) for manipulation, and 3.3 (2.5 to 4.9) to 4.8 (3.5 to 7.8) for manipulation followed by exercise. Corresponding between-group mean differences in the Roland-Morris disability questionnaire were 1.6 (0.8 to 2.3), 1.4 (0.6 to 2.1), and 1.9 (1.2 to 2.6) points. Conclusion In contrast to the small mean differences originally reported, NNTs were small and could be attractive to clinicians, patients, and purchasers. NNTs can aid the interpretation of results of trials using continuous outcomes. Where possible, they should be reported alongside mean differences. Challenges remain in calculating NNTs for some continuous outcomes.
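    The NNT described above is the reciprocal of the difference in responder proportions between trial arms. A minimal sketch of that calculation, with a 95% CI obtained by inverting a Wald interval for the risk difference; the responder rates and arm sizes below are hypothetical, not the BEAM trial's actual figures:

```python
import math

def nnt_from_responders(p_treat, p_control, n_treat, n_control):
    """NNT from responder proportions, with an approximate 95% CI
    obtained by inverting the Wald interval for the risk difference."""
    rd = p_treat - p_control                      # absolute benefit
    se = math.sqrt(p_treat * (1 - p_treat) / n_treat
                   + p_control * (1 - p_control) / n_control)
    lo, hi = rd - 1.96 * se, rd + 1.96 * se
    # NNT is the reciprocal of the risk difference; CI limits swap on inversion
    return 1 / rd, 1 / hi, 1 / lo

# hypothetical responder rates (not the trial's reported figures)
nnt, ci_low, ci_high = nnt_from_responders(0.50, 0.30, 330, 330)
```

    Note that when the risk-difference CI crosses zero, the inverted NNT interval is discontinuous, which is one of the reporting challenges the abstract alludes to.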

    STARD for Abstracts: Essential items for reporting diagnostic accuracy studies in journal or conference abstracts

    Many abstracts of diagnostic accuracy studies are currently insufficiently informative. We extended the STARD (Standards for Reporting Diagnostic Accuracy) statement by developing a list of essential items that authors should consider when reporting diagnostic accuracy studies in journal or conference abstracts. After a literature review of published guidance for reporting biomedical studies, we identified 39 items potentially relevant to report in an abstract. We then selected essential items through a two-round web-based survey among the 85 members of the STARD Group, followed by discussions within an executive committee. Seventy-three STARD Group members responded (86%), with a 100% completion rate. STARD for Abstracts is a list of 11 essential items to be reported in every abstract of a diagnostic accuracy study. We provide examples of complete reporting and have developed template text for writing informative abstracts.

    Smallest detectable change in volume differs between mass flow sensor and pneumotachograph

    Background To assess pulmonary function change over time, the mass flow sensor and the pneumotachograph are widely used in commercially available instruments. However, the smallest detectable change for the two devices has never been compared. Therefore, the aim of this study was to determine the smallest detectable change in vital capacity (VC) and single-breath diffusion parameters measured by mass flow sensor or pneumotachograph. Method In 28 healthy pulmonary function technicians, VC, transfer factor for carbon monoxide (DLCO) and alveolar volume (VA) were measured repeatedly (10 times). The smallest detectable change was calculated as 1.96 × standard error of measurement × √2. Findings The mean (range) of the smallest detectable change measured by mass flow sensor and pneumotachograph, respectively, was: for VC (in litres) 0.53 (0.46-0.65) vs 0.25 (0.17-0.36) (p = 0.04); for DLCO (in mmol·kPa⁻¹·min⁻¹) 1.53 (1.26-1.7) vs 1.18 (0.84-1.39) (p = 0.07); for VA (in litres) 0.66 (0.53-0.82) vs 0.43 (0.34-0.53) (p = 0.04); and for DLCO/VA (in mmol·kPa⁻¹·min⁻¹·L⁻¹) 0.22 (0.19-0.28) vs 0.19 (0.14-0.22) (p = 0.79). Conclusions The smallest detectable changes in VC and VA measured by pneumotachograph are smaller than those measured by mass flow sensor. Therefore, the pneumotachograph is the preferred instrument to estimate lung volume change over time in individual patients.
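    The smallest detectable change formula quoted above (1.96 × SEM × √2) can be sketched directly, with the SEM estimated from the within-subject variance of the repeated measurements. The vital-capacity readings below are synthetic, not the study's data:

```python
import math
from statistics import mean, variance

def sem_from_repeats(subjects):
    """SEM as the square root of the mean within-subject variance
    over repeated measurements (one list of repeats per subject)."""
    return math.sqrt(mean(variance(reps) for reps in subjects))

def sdc(sem):
    """Smallest detectable change: 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem * math.sqrt(2)

# synthetic VC readings (litres): ten repeats for three technicians
vc = [
    [4.10, 4.20, 4.00, 4.15, 4.10, 4.05, 4.20, 4.10, 4.00, 4.10],
    [3.60, 3.70, 3.65, 3.60, 3.75, 3.70, 3.60, 3.65, 3.70, 3.60],
    [5.00, 5.10, 4.95, 5.05, 5.00, 5.10, 4.90, 5.00, 5.05, 5.00],
]
sdc_vc = sdc(sem_from_repeats(vc))
```

    The √2 factor reflects that a change score involves two measurements, each carrying its own measurement error.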

    Understanding Ferguson's delta: time to say good-bye?

    A critique of Hankins, M: 'How discriminating are discriminative instruments?' Health and Quality of Life Outcomes 2008, 6:3

    Reproducibility of goniometric measurement of the knee in the in-hospital phase following total knee arthroplasty

    Background The objective of the present study was to assess the interobserver reproducibility (in terms of reliability and agreement) of active and passive measurements of knee range of motion (RoM) using a long-arm goniometer, performed by trained physical therapists in a clinical setting in total knee arthroplasty patients within the first four days after surgery. Methods Test-retest analysis. Setting: university hospital departments of orthopaedics and physical therapy. Participants: two experienced physical therapists assessed 30 patients three days after total knee arthroplasty. Main outcome measure: RoM measurement using a long-arm (50 cm) goniometer. Agreement was calculated as the mean difference between observers ± 95% CI of this mean difference. The intraclass correlation coefficient (ICC) was calculated as a measure of reliability, based on two-way random effects analysis of variance. Results The lowest level of agreement was that for measurement of passive flexion with the patient in supine position (mean difference 1.4°; limits of agreement -16.2° to 19.0° for the difference between the two observers). The highest levels of agreement were found for measurement of passive flexion with the patient in sitting position and for measurement of passive extension (mean difference 2.7°; limits of agreement -6.7° to 12.1°, and mean difference 2.2°; limits of agreement -6.2° to 10.6°, respectively). The ability to differentiate between subjects ranged from 0.62 for measurement of passive extension to 0.89 for measurement of active flexion (ICC values). Conclusion Interobserver agreement for flexion as well as extension was only fair. When two different observers assess the same patients in the acute phase after total knee arthroplasty using a long-arm goniometer, differences in RoM of less than eight degrees cannot be distinguished from measurement error. Reliability was found to be acceptable for comparisons at group level, but poor for individual comparisons over time.
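    The agreement figures above are Bland-Altman 95% limits of agreement: the mean inter-observer difference ± 1.96 × the standard deviation of the paired differences. A minimal sketch, with illustrative paired flexion readings rather than the study's data:

```python
from statistics import mean, stdev

def limits_of_agreement(obs1, obs2):
    """Bland-Altman 95% limits of agreement between two observers:
    mean difference +/- 1.96 * SD of the paired differences."""
    diffs = [a - b for a, b in zip(obs1, obs2)]
    m, s = mean(diffs), stdev(diffs)
    return m, m - 1.96 * s, m + 1.96 * s

# illustrative paired flexion readings (degrees), not the study's data
m, lo, hi = limits_of_agreement([95.0, 102.0, 88.0, 110.0],
                                [93.0, 104.0, 90.0, 105.0])
```

    A small mean difference with wide limits, as in the supine-flexion result above, signals low bias but poor precision between observers.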

    The FDA guidance for industry on PROs: the point of view of a pharmaceutical company

    The importance of patients' point of view on their health status is widely recognised. Patient-reported outcomes is a broad term encompassing a large variety of health data reported by patients, such as symptoms, functional status, Quality of Life and Health-Related Quality of Life. Measurements of Health-Related Quality of Life have been developed over many years of research, and many validated questionnaires exist. However, few attempts have been made to standardise the evaluation of instrument characteristics, and no recommendations exist on the interpretation of Health-Related Quality of Life results, especially regarding the clinical significance of a change that should guide a therapeutic approach. Moreover, the true value of Health-Related Quality of Life evaluations in clinical trials has not yet been completely defined. An important step towards a more structured and frequent use of Patient-Reported Outcomes in drug development is the FDA Guidance issued in February 2006. In this paper we report some considerations on this Guidance. Our comments focus especially on the characteristics of the instruments to be used, the Minimal Important Difference, and the methods to calculate it. Furthermore, we present the advantages and opportunities of using Patient-Reported Outcomes in drug development, as seen by a pharmaceutical company. Patient-Reported Outcomes can provide additional data to make a drug more competitive than others of the same pharmacological class, and a well-demonstrated positive impact on patients' health status and daily life might allow a higher price and/or inclusion in a reimbursement list. Applying the FDA Guidance extensively in future trials could lead to a wider culture of subjective measurement and to greater consideration of patients' opinions on their care. Moreover, prescribing doctors and payers could benefit from subjective information to better define the value of drugs.

    Reproducibility and responsiveness of the Symptom Severity Scale and the hand and finger function subscale of the Dutch arthritis impact measurement scales (Dutch-AIMS2-HFF) in primary care patients with wrist or hand problems

    BACKGROUND: To determine the clinimetric properties of two questionnaires assessing symptoms (Symptom Severity Scale) and physical functioning (hand and finger function subscale of the AIMS2) in a Dutch primary care population. METHODS: The first 84 participants in a 1-year follow-up study on the diagnosis and prognosis of hand and wrist problems completed the Symptom Severity Scale and the hand and finger function subscale of the Dutch-AIMS2 twice within 1 to 2 weeks. The data were used to assess test-retest reliability (ICC) and smallest detectable change (SDC, based on the standard error of measurement (SEM)). To assess responsiveness, changes in scores between baseline and the 3-month follow-up were related to an external criterion to estimate the minimal important change (MIC). We calculated the group size needed to detect the MIC beyond measurement error. RESULTS: The ICC for the Symptom Severity Scale was 0.68 (95% CI: 0.54–0.78). The SDC was 1.00 at individual level and 0.11 at group level, both on a 5-point scale. The MIC was 0.23, exceeding the SDC at group level. The group size required to detect the MIC beyond measurement error was 19 for the Symptom Severity Scale. The ICC for the hand and finger function subscale of the Dutch-AIMS2 was 0.62 (95% CI: 0.47–0.74). The SDC was 3.80 at individual level and 0.42 at group level, both on an 11-point scale. The MIC was 0.31, which was less than the SDC at group level. The group size required to detect the MIC beyond measurement error was 150. CONCLUSION: In our heterogeneous primary care population the Symptom Severity Scale was found to be a suitable instrument to assess the severity of symptoms, whereas the hand and finger function subscale of the Dutch-AIMS2 was less suitable for the measurement of physical functioning in patients with hand and wrist problems.
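    The group-size calculation above follows from the group-level SDC shrinking with √n: the required n is the first integer for which SDC_individual/√n falls below the MIC, i.e. the next integer above (SDC_individual/MIC)². A sketch using the Symptom Severity Scale figures quoted in the abstract:

```python
import math

def group_size_for_mic(sdc_individual, mic):
    """Smallest group size n at which the group-level SDC
    (SDC_individual / sqrt(n)) falls below the minimal important
    change, i.e. the first integer above (SDC/MIC)**2."""
    return math.ceil((sdc_individual / mic) ** 2)

# Symptom Severity Scale figures from the abstract: SDC 1.00, MIC 0.23
n_sss = group_size_for_mic(1.00, 0.23)  # 19, matching the abstract
```

    The same formula applied to the Dutch-AIMS2 subscale (SDC 3.80, MIC 0.31) yields a group size of roughly 150, consistent with the abstract's figure.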

    The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: A clarification of its content

    Background The COSMIN checklist (COnsensus-based Standards for the selection of health status Measurement INstruments) was developed in an international Delphi study to evaluate the methodological quality of studies on the measurement properties of health-related patient-reported outcomes (HR-PROs). In this paper, we explain our choices for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion. Methods The issues described in this paper are a reflection of the Delphi process, in which 43 panel members participated. Results The topics discussed are internal consistency (relevance for reflective and formative models, and distinction from unidimensionality), content validity (judging relevance and comprehensiveness), hypotheses testing as an aspect of construct validity (specificity of hypotheses), criterion validity (relevance for PROs), and responsiveness (concept and relation to validity, and (in)appropriate measures). Conclusions We expect that this paper will contribute to a better understanding of the rationale behind the items, thereby enhancing the acceptance and use of the COSMIN checklist.

    Inexperienced clinicians can extract pathoanatomic information from MRI narrative reports with high reproducibility for use in research/quality assurance

    Background Although reproducibility in reading MRI images amongst radiologists and clinicians has been studied previously, no studies have examined the reproducibility of inexperienced clinicians in extracting pathoanatomic information from magnetic resonance imaging (MRI) narrative reports and transforming that information into quantitative data. However, this process is frequently required in research and quality assurance contexts. The purpose of this study was to examine inter-rater reproducibility (agreement and reliability) among an inexperienced group of clinicians in extracting spinal pathoanatomic information from radiologist-generated MRI narrative reports. Methods Twenty MRI narrative reports were randomly extracted from an institutional database. A group of three physiotherapy students independently reviewed the reports and coded the presence of 14 common pathoanatomic findings using a categorical electronic coding matrix. Decision rules were developed after initial coding in an effort to resolve ambiguities in narrative reports. This process was repeated a further three times using separate samples of 20 MRI reports until no further ambiguities were identified (total n = 80). Reproducibility between trainee clinicians and two highly trained raters was examined in an arbitrary coding round, with agreement measured using percentage agreement and reliability measured using unweighted kappa (k). Reproducibility was then examined in another group of three trainee clinicians who had not participated in the production of the decision rules, using another sample of 20 MRI reports. Results The mean percentage agreement for paired comparisons between the initial trainee clinicians improved over the four coding rounds (97.9-99.4%), although the greatest improvement was observed after the first introduction of coding rules. High inter-rater reproducibility was observed between trainee clinicians across 14 pathoanatomic categories over the four coding rounds (agreement range: 80.8-100%; reliability range k = 0.63-1.00). Concurrent validity was high in paired comparisons between trainee clinicians and highly trained raters (agreement 97.8-98.1%, reliability k = 0.83-0.91). Reproducibility was also high in the second sample of trainee clinicians (inter-rater agreement 96.7-100.0% and reliability k = 0.76-1.00; intra-rater agreement 94.3-100.0% and reliability k = 0.61-1.00). Conclusions A high level of radiological training is not required in order to transform MRI-derived pathoanatomic information from a narrative format to a quantitative format with high reproducibility for research or quality assurance purposes.
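    The unweighted kappa used above corrects the observed percentage agreement for the agreement expected by chance from each rater's marginal frequencies. A minimal sketch with hypothetical coding data for a single pathoanatomic finding:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa: chance-corrected agreement between
    two raters' categorical codes."""
    n = len(rater1)
    p_obs = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # chance agreement from the product of each category's marginals
    p_exp = sum(c1[cat] * c2[cat] for cat in c1) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# hypothetical presence/absence codes for one finding across 4 reports
k = cohens_kappa(["yes", "yes", "no", "no"], ["yes", "no", "no", "no"])
```

    Because kappa discounts chance agreement, high percentage agreement (as in the 97.9-99.4% above) can coexist with more modest kappa values when one category dominates.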