    Locating bugs without looking back

    Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus requiring neither past code nor past reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve on 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history.
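
    For intuition, here is a minimal sketch of the underlying IR idea: score each current source file directly against the bug report, here with plain TF-IDF cosine similarity. The paper's hand-tuned heuristics are not spelled out in the abstract, so this is a generic baseline under assumed inputs, and rank_files is an illustrative helper rather than the authors' implementation.

        # Generic IR baseline for bug localisation: rank source files by
        # TF-IDF cosine similarity to the bug report (not the paper's
        # heuristic scoring, which the abstract does not specify).
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def rank_files(bug_report: str, files: dict[str, str], top_n: int = 10):
            """Return the top_n (path, score) pairs most similar to the report."""
            paths = list(files)
            vectorizer = TfidfVectorizer(stop_words="english")
            # Fit on the files plus the report so all share one vocabulary.
            matrix = vectorizer.fit_transform([files[p] for p in paths] + [bug_report])
            scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
            return sorted(zip(paths, scores), key=lambda x: x[1], reverse=True)[:top_n]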

    PubMed related articles: a probabilistic topic-based model for content similarity

    Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. Results: The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Conclusion: Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search.
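
    As a toy illustration of the Poisson machinery, the sketch below computes the probability that a document is about a term's topic from the term's frequency under a two-Poisson ("elite"/"non-elite") mixture. The rates lam and mu and the prior eta are made-up illustrative values, not the parameters the paper estimates from MeSH.

        # Two-Poisson eliteness sketch: P(topic | term count k).
        # lam = term rate when the document is about the topic (elite),
        # mu = rate otherwise, eta = prior probability of eliteness.
        from math import exp, factorial

        def poisson(k: int, rate: float) -> float:
            return rate ** k * exp(-rate) / factorial(k)

        def p_about_topic(k: int, lam: float = 3.0, mu: float = 0.5,
                          eta: float = 0.1) -> float:
            elite = eta * poisson(k, lam)
            non_elite = (1 - eta) * poisson(k, mu)
            return elite / (elite + non_elite)

        # Higher term counts make "aboutness" more likely:
        for k in range(6):
            print(k, round(p_about_topic(k), 3))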

    Evaluation of a Bayesian inference network for ligand-based virtual screening

    Background: Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity. Results: Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially outperforms a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much lower when structurally heterogeneous sets of actives are being sought. Conclusion: A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening.
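
    For reference, the conventional baseline named in the abstract is easy to sketch: Tanimoto similarity over binary molecular fingerprints, represented here as sets of on-bit indices. The helpers below are illustrative, not code from the study.

        # Tanimoto (Jaccard) similarity over binary fingerprints.
        def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
            """|A & B| / |A | B| for fingerprints as sets of on-bit indices."""
            if not fp_a and not fp_b:
                return 0.0
            return len(fp_a & fp_b) / len(fp_a | fp_b)

        # Rank a database in decreasing order of similarity to a query molecule.
        def screen(query: set[int], database: dict[str, set[int]]):
            return sorted(database.items(),
                          key=lambda item: tanimoto(query, item[1]),
                          reverse=True)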

    Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

    Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text REtrieval Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents each (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Tracks, and observe significant correlations between the performance rankings generated by our approach and by TREC. Spearman's correlation coefficients in the range 0.79–0.92 are observed when comparing bpref measured with NT Evaluation and with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 are observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and data fusion evaluation protocols introduced recently. Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine whether NT Evaluation or variants of these protocols can fully substitute for human evaluations.
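
    A minimal sketch of the focused-search protocol as described: each title becomes a query whose only relevant document is its own title-stripped abstract, so no human judgments are needed. The paper reports bpref; the sketch substitutes mean reciprocal rank for brevity, and the search callable is a placeholder for the engine under evaluation.

        # Focused-search NT Evaluation sketch, scored with mean reciprocal
        # rank (the paper uses bpref; MRR is substituted here for brevity).
        from typing import Callable

        def nt_focused_eval(records: list[tuple[str, str]],
                            search: Callable[[str], list[str]]) -> float:
            """records: (doc_id, title) pairs; search(query) returns doc_ids
            ranked over the collection of title-stripped abstracts."""
            total = 0.0
            for doc_id, title in records:
                ranking = search(title)  # issue the title as the query
                if doc_id in ranking:
                    total += 1.0 / (ranking.index(doc_id) + 1)
            return total / len(records)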

    Relevance similarity: an alternative means to monitor information retrieval systems

    BACKGROUND: Relevance assessment is a major problem in the evaluation of information retrieval systems. The work presented here introduces a new parameter, "Relevance Similarity", for the measurement of variation in relevance assessment. In a situation where individual assessments can be compared with a gold standard, this parameter is used to study the effect of such variation on the performance of a medical information retrieval system. In such a setting, Relevance Similarity is the ratio of assessors who rank a given document the same as the gold standard over the total number of assessors in the group. METHODS: The study was carried out on a collection of Critically Appraised Topics (CATs). Twelve volunteers were divided into two groups according to their domain knowledge. They assessed the relevance of retrieved topics obtained by querying a meta-search engine with ten keywords related to medical science. Their assessments were compared to the gold standard assessment, and Relevance Similarities were calculated as the ratio of positive concordance with the gold standard for each topic. RESULTS: The similarity comparison among groups showed that a higher degree of agreement exists among evaluators with more subject knowledge. The performance of the retrieval system was not significantly different as a result of the variations in relevance assessment in this particular query set. CONCLUSION: In assessment situations where evaluators can be compared to a gold standard, Relevance Similarity provides an alternative evaluation technique to the commonly used kappa scores, which may give paradoxically low scores in highly biased situations such as document repositories containing large quantities of relevant data.
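
    The parameter itself is a one-line computation; the sketch below transcribes the definition directly (the function name is illustrative).

        # Relevance Similarity: the fraction of assessors whose judgment of
        # a document matches the gold standard.
        def relevance_similarity(judgments: list[bool], gold: bool) -> float:
            agree = sum(1 for j in judgments if j == gold)
            return agree / len(judgments)

        # Example: 9 of 12 assessors agree with the gold standard -> 0.75.
        print(relevance_similarity([True] * 9 + [False] * 3, True))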

    Rethinking the test collection methodology for personal self-tracking data

    While vast volumes of personal data are being gathered daily by individuals, the MMM community has not really tackled the challenge of developing novel retrieval algorithms for this data, owing to the difficulty of getting access to the data in the first place. While initial efforts have taken place on a small scale, it is our conjecture that a new evaluation paradigm is required in order to make progress in analysing, modelling and retrieving from personal data archives. In this position paper, we propose a new model of Evaluation-as-a-Service that re-imagines the test collection methodology for personal multimedia data in order to address the many challenges of releasing test collections of personal multimedia data.

    The DAC system and associations with multiple myeloma

    Despite the clear progress achieved in recent years in the treatment of MM, most patients eventually relapse, and novel therapeutic options are therefore still necessary for these patients. In this regard, several drugs that target specific mechanisms of the tumor cells are currently being explored in the preclinical and clinical setting. This manuscript offers a review of the rationale and current status of the antimyeloma activity of one of the most relevant examples of these targeted drugs: deacetylase inhibitors (DACi). Several studies have demonstrated the pro-oncogenic activity of deacetylases (DACs) through the targeting not only of histones but also of non-histone proteins relevant to tumor progression, such as p53, E2F family members, Bcl-6, Hsp90, HIF-1α or Nur77. This fact, together with the DAC overexpression present in several tumors, has prompted the development of some DACi with potential antitumor effect. This situation is also evident in the case of MM, as two mechanisms of DACi provide the rationale for the exploration of the potential antimyeloma activity of these compounds: the inhibition of the epigenetic inactivation of p53, and the blockade of the unfolded protein response through the inhibition of aggresome formation (by targeting DAC6) and the inactivation of the chaperone system (by acetylating Hsp90). Several DACi with different chemical structures and different selectivity for the DAC families have been tested in MM. Their preclinical activity in monotherapy has been quite exciting and has been described to be mediated by various mechanisms: the induction of apoptosis and cell cycle arrest, mainly by the upregulation of p21; and interference with the interaction between plasma cells and the microenvironment, by reducing the expression and signalling of several cytokines or by inhibiting angiogenesis. They also have a role in protecting murine models from myeloma bone disease. Nevertheless, the clinical activity in monotherapy of these drugs in relapsed/refractory MM patients has been very modest. This has prompted the development of combinations, such as those with bortezomib or with lenalidomide and dexamethasone, which have already been taken into the clinic with positive preliminary results.

    Cost-effectiveness of nurse-led self-help for recurrent depression in the primary care setting: design of a pragmatic randomized trial

    Background: Major Depressive Disorder is a leading cause of disability, tends to run a recurrent course and is associated with substantial economic costs due to increased healthcare utilization and productivity losses. Interventions aimed at the prevention of recurrences may reduce patients' suffering and costs. Besides antidepressants, several psychological treatments such as preventive cognitive therapy (PCT) are effective in the prevention of recurrences of depression. Yet many patients find long-term use of antidepressants unattractive, do not want to engage in therapy sessions, and psychologists are often not available in the primary care setting. It is therefore important to study whether PCT can be used in a nurse-led self-help format in primary care. This study sets out to test the hypothesis that usual care plus nurse-led self-help for recurrent depression in primary care is feasible, acceptable and cost-effective compared to usual care only. Design: Patients are randomly assigned to 'nurse-led self-help treatment plus usual care' (134 participants) or 'usual care' (134 participants). Randomisation is stratified according to the number of previous episodes (2 or 3 previous episodes versus 4 or more). The primary clinical outcome is the cumulative recurrence rate of depression meeting DSM-IV criteria, as assessed with the Structured Clinical Interview for DSM-IV Disorders at one year after completion of the intervention. Secondary clinical outcomes are quality of life, severity of depressive symptoms, co-morbid psychopathology and self-efficacy. As putative effect-moderators, demographic characteristics, number of previous episodes, type of treatment during previous episodes, age of onset, self-efficacy, and symptoms of pain and fatigue are assessed. Cumulative recurrence rate ratios are obtained under a Poisson regression model. The number needed to treat is calculated as the inverse of the risk difference. The economic evaluation is conducted from a societal perspective, both as a cost-effectiveness analysis (costs per depression-free survival year) and as a cost-utility analysis (costs per quality-adjusted life-year). Discussion: The purpose of this paper is to outline the rationale and design of a nurse-led, cognitive-therapy-based self-help intervention aimed at preventing recurrence of depression in a primary care setting. Only a few studies have focused on psychological self-help interventions aimed at the prevention of recurrences in primary care patients. Trial registration: NTR3001 (http://www.trialregister.nl).
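
    The number-needed-to-treat calculation mentioned in the design is simple arithmetic; the sketch below illustrates it with hypothetical recurrence risks, not trial results.

        # Number needed to treat = 1 / absolute risk difference between arms.
        def number_needed_to_treat(risk_control: float, risk_treated: float) -> float:
            risk_difference = risk_control - risk_treated
            if risk_difference <= 0:
                raise ValueError("treatment shows no risk reduction")
            return 1.0 / risk_difference

        # Hypothetical: recurrence risks of 0.60 (usual care) vs 0.45 (usual
        # care plus self-help) give NNT = 1 / 0.15, i.e. about 7 patients
        # treated per recurrence prevented.
        print(number_needed_to_treat(0.60, 0.45))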