
Open Access LMU ( Ludwig-Maximilians-Univ. München)

Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models

Author: A Seelig
ET Morgan
G Dutton
H Van de Waterbeemd
I Poggesi
IM Kapetanovic
James Bulgarelli
JE Penzotti
Kevin Rissolo
Leonard Tini
N Mizuno
Norman C LeDonne
R Guha
T Hou
T Kohonen
WJ Conover
WJ Egan
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract. Background: Standard approaches to assessing the performance of predictive models, which apply common statistical measurements to the entire data set, provide an overview of the average performance of the models across the entire predictive space but give little insight into the applicability of the model across that space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI) curves via the SALI curve integral (SCI) as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. Results: The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. Conclusion: The results indicate that the choice of training sets used to build the prediction models is of great importance to the resulting model quality, and that the SCI values calculated for these models were very similar to their Kendall τ values. This leads to our suggestion of an approach that uses the SALI/SCI paradigm to evaluate predictive model performance and so allows more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.
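The abstract above compares SCI values with Kendall τ, a rank statistic that is itself built from pairwise ordering accuracy. The following is a minimal sketch of Kendall τ (the tau-a variant, which ignores ties) for observed versus predicted values. It is not the SALI or SCI computation from the paper, and the function name `kendall_tau_a` and the example data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(observed, predicted):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.

    A pair of compounds is concordant when the model ranks the two
    compounds in the same order as the measurements do, and discordant
    when it ranks them in the opposite order. Tied pairs count as neither.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    if n < 2:
        raise ValueError("at least two items are needed to form a pair")

    concordant = 0
    discordant = 0
    for (o1, p1), (o2, p2) in combinations(zip(observed, predicted), 2):
        sign = (o1 - o2) * (p1 - p2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs
```

A value of 1 means every pair is ordered correctly and -1 means every pair is reversed, so the statistic summarizes pairwise prediction accuracy over all pairs at once rather than as a function of a threshold.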

Springer - Publisher Connector

Relative Effects of Juvenile and Adult Environmental Factors on Mate Attraction and Recognition in the Cricket, Allonemobius socius

Author: Alexander E. Olvido
Andersson MB
Boldman KG
Brooks MW
Conover WJ
Dingle H
Dobzhansky Th
Etges WJ
Ferreira GB
Grace JL
Howard DJ
Marshall JL
Mullen SP
Pearl R. Fernandes
Timothy A. Mousseau
Wagner WE
Walker TJ
Publication venue: University of Wisconsin Library
Publication date
Field of study

Finding a mate is a fundamental aspect of sexual reproduction. To this end, specific-mate recognition systems (SMRS) have evolved that facilitate copulation between producers of the mating signal and their opposite-sex responders. Environmental variation, however, may compromise the efficiency with which SMRS operate. In this study, the degree to which seasonal climate experienced during juvenile and adult life-cycle stages affects the SMRS of a cricket, Allonemobius socius (Scudder) (Orthoptera: Gryllidae), was assessed. Results from two-choice behavioral trials suggest that adult ambient temperature, along with population and family origins, mediates variation in male mating call and, to a lesser extent, the directional response of females to those calls. Restricted maximum-likelihood estimates of heritability for male mating call components and for female response to mating call appeared statistically nonsignificant. However, appreciable “maternal genetic effects” suggest that maternal egg provisioning and other indirect maternal determinants of the embryonic environment significantly contributed to variation in male mating call and female response to mating calls. Thus, environmental factors can generate substantial variation in A. socius mating call, and, more importantly, their marginal effect on female responses to either fast-chirp or long-chirp mating calls suggests negative fitness consequences to males producing alternative types of calls. Future studies of sexual selection and SMRS evolution, particularly those focused on hybrid zone dynamics, should take explicit account of the loose concordance between signal producers and responders suggested by the current findings.

Asymptotic behaviour and optimal word size for exact and approximate word matches between random sequences

Author: A Barbour
A Christoffels
CJ Burden
Conrad J Burden
J Burke
JE Carpenter
L Florea
M Kimura
Miriam R Kantorovitz
MR Kantorovitz
MS Waterman
OM Melko
RA Lippert
S Vinga
SF Altschul
Sylvain Forêt
TJ Wu
W Hide
WJ Conover
WJ Kent
WR Pearson
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2006
Field of study

BACKGROUND: The number of k-words shared between two sequences is a simple and efficient alignment-free sequence comparison method. This statistic, D(2), has been used for the clustering of EST sequences. Sequence comparison based on D(2) is extremely fast: its runtime is proportional to the size of the sequences under scrutiny, whereas alignment-based comparisons have a worst-case run time proportional to the square of the size. Recent studies have tackled the rigorous study of the statistical distribution of D(2), and asymptotic regimes have been derived. The distribution of approximate k-word matches has also been studied. RESULTS: We have computed the D(2) optimal word size for various sequence lengths, and for both perfect and approximate word matches. Kolmogorov-Smirnov tests show D(2) to have a compound Poisson distribution at the optimal word size for small sequence lengths (below 400 letters) and a normal distribution at the optimal word size for large sequence lengths (above 1600 letters). We find that the D(2) statistic outperforms BLAST in the comparison of artificially evolved sequences, and performs similarly to other methods based on exact word matches. These results obtained with randomly generated sequences are also valid for sequences derived from human genomic DNA. CONCLUSION: We have characterized the distribution of the D(2) statistic at optimal word sizes. We find that the best trade-off between computational efficiency and accuracy is obtained with exact word matches. Given that our numerical tests have not included sequence shuffling, transposition or splicing, the improvements over existing methods reported here underestimate those expected in real sequences. Because of the linear run time and of the known normal asymptotic behavior, D(2)-based methods are most appropriate for large genomic sequences.
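The exact-match D(2) statistic described in this abstract can be sketched in a few lines. The sketch assumes the usual definition of D(2) as the number of matching k-word position pairs between two sequences, which equals the sum over words of the product of their counts in each sequence. The function names `kword_counts` and `d2` are illustrative and are not taken from the paper.

```python
from collections import Counter


def kword_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Number of exact k-word matches between two sequences.

    Summing count_a[w] * count_b[w] over the shared words counts every
    pair of positions (i, j) at which seq_a and seq_b carry the same
    k-word, without comparing the positions one by one.
    """
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    return sum(n * counts_b[w] for w, n in counts_a.items() if w in counts_b)
```

Each sequence is scanned once to build its word counts, which is consistent with the runtime proportional to sequence size noted in the abstract. Approximate word matches, which the abstract also covers, are not handled by this sketch.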

Springer - Publisher Connector

The Australian National University

Using a formative simulated patient exercise for curriculum evaluation

Author: AM O'Conner
Carole W Keefe
CH Braddock
CW Keefe
David J Solomon
GE Miller
Heather S Laird-Fick
M Heisler
Margaret E Thompson
Mary Margaret Noel
RJ Adams
RS Gotler
SH Kaplan
SL Sheridan
WJ Conover
Publication venue: BioMed Central
Publication date: 01/01/2004
Field of study

BACKGROUND: It is not clear that teaching specific history taking, physical examination and patient teaching techniques to medical students results in durable behavioural changes. We used a quasi-experimental design that approximated a randomized double blinded trial to examine whether a Participatory Decision-Making (PDM) educational module taught in a clerkship improves performance on a Simulated Patient Exercise (SPE) in another clerkship, and how this is influenced by the time between training and assessment. METHODS: Third year medical students in an internal medicine clerkship were assessed on their use of PDM skills in an SPE conducted in the second week of the clerkship. The rotational structure of the third year clerkships formed a pseudo-randomized design where students had 1) completed the family practice clerkship containing a training module on PDM skills approximately four weeks prior to the SPE, 2) completed the family medicine clerkship and the training module approximately 12 weeks prior to the SPE or 3) had not completed the family medicine clerkship and the PDM training module at the time they were assessed via the SPE. RESULTS: Based on limited pilot data there were statistically significant differences between students who received PDM training approximately four weeks prior to the SPE and students who received training approximately 12 weeks prior to the SPE. Students who received training 12 weeks prior to the SPE performed better than those who received training four weeks prior to the SPE. In a second comparison students who received training four weeks prior to the SPE performed better than those who did not receive training but the differences narrowly missed statistical significance (P < 0.05). CONCLUSION: This pilot study demonstrated the feasibility of a methodology for conducting rigorous curricular evaluations using natural experiments based on the structure of clinical rotations. 
In addition, it provided preliminary data suggesting that targeted educational interventions can result in marked improvements in the clinical skills spontaneously exhibited by physician trainees in a setting different from the one in which the skills were taught.

Springer - Publisher Connector

LIPS vs MOSA: a Replicated Empirical Study on Automated Test Case Generation

Author: A Panichella
A Vargha
F Shull
G Fraser
K Deb
NJ Juzgado
P McMinn
RD Baker
S Scalabrino
WJ Conover
Publication venue
Publication date: 09/09/2017
Field of study

Replication is a fundamental pillar in the construction of scientific knowledge. Test data generation for procedural programs can be tackled using a single-target or a many-objective approach. The proponents of LIPS, a novel single-target test generator, conducted a preliminary empirical study to compare their approach with MOSA, an alternative many-objective test generator. However, their empirical investigation suffers from several external and internal validity threats, does not consider complex programs with many branches and does not include any qualitative analysis to interpret the results. In this paper, we report the results of a replication of the original study designed to address its major limitations and threats to validity. The new findings draw a completely different picture on the pros and cons of single-target vs many-objective approaches to test case generation

Archivio della ricerca - Fondazione Bruno Kessler

Open Repository and Bibliography - Luxembourg

Can sacrificial feeding areas protect aquatic plants from herbivore grazing? Using behavioural ecology to inform wildlife management

Author: A Jozkowicz
A Sih
AJ McLane
AK Pandit
BA Nolet
C Bech
CD Ankney
CJ Spray
D van Vuren
DJ Decker
EC Rees
Francis Daunt
G Gayet
G Perry
GN Robb
GV Watola
H Blokpoel
HV McKay
I Gordon
J Sahlsten
JA Estes
JA Vickery
JE Gross
JV López-Bao
KA Wood
Kevin A. Wood
KH Hodder
KM Ringelman
KS Tatu
LM Gosling
LP Hansen
M Kersten
M Owen
Matthew T. O’Hare
Maura (Gee) Geraldine Chapman
MJ Heydon
MR Conover
MR van Eerden
MT O’Hare
PV Rattray
RA Stillman
Richard A. Stillman
RJ Greenwood
RJ Orr
S Boutin
S Takatsuki
SM Cooper
SM Percival
SM Redpath
T Amano
TE Martin
W Meissner
WJ Sutherland
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 31/07/2014
Field of study

Effective wildlife management is needed for conservation, economic and human well-being objectives. However, traditional population control methods are frequently ineffective, unpopular with stakeholders, may affect non-target species, and can be both expensive and impractical to implement. New methods which address these issues and offer effective wildlife management are required. We used an individual-based model to predict the efficacy of a sacrificial feeding area in preventing grazing damage by mute swans (Cygnus olor) to adjacent river vegetation of high conservation and economic value. The accuracy of model predictions was assessed by a comparison with observed field data, whilst prediction robustness was evaluated using a sensitivity analysis. We used repeated simulations to evaluate how the efficacy of the sacrificial feeding area was regulated by (i) food quantity, (ii) food quality, and (iii) the functional response of the forager. Our model gave accurate predictions of aquatic plant biomass, carrying capacity, swan mortality, swan foraging effort, and river use. Our model predicted that increased sacrificial feeding area food quantity and quality would prevent the depletion of aquatic plant biomass by swans. When the functional response for vegetation in the sacrificial feeding area was increased, the food quantity and quality in the sacrificial feeding area required to protect adjacent aquatic plants were reduced. Our study demonstrates how the insights of behavioural ecology can be used to inform wildlife management. The principles that underpin our model predictions are likely to be valid across a range of different resource-consumer interactions, emphasising the generality of our approach to the evaluation of strategies for resolving wildlife management problems

Public Library of Science (PLOS)

Bournemouth University Research Online

NERC Open Research Archive

A phase I and pharmacokinetic study of MAG-CPT, a water-soluble polymer conjugate of camptothecin

Author: C Pellizzoni
C van Kesteren
CD Conover
CG Moertel
D Fraier
DF Kehrer
FM Muggia
GG Chabot
H Rosing
J H Beijnen
J H M Schellens
J Lieverst
JM Meerum Terwogt
LP Rivory
LW Seymour
M Breda
M Grazia Porro
M Swart
ME Wall
N E Schoemaker
NE Schoemaker
PA Vasey
PJ Houghton
R Simon
R Spinelli
RS Lott
S Jansen
SL Traub
VR Caiolfa
W W ten Bokkel Huinink
WD Kingsbury
WJ Slichenmyer
Publication venue: Nature Publishing Group
Publication date
Field of study

Polymeric drug conjugates are a new and experimental class of drug delivery systems with pharmacokinetic promise. The antineoplastic drug camptothecin was linked to a water-soluble polymeric backbone (MAG-CPT) and administered as a 30 min infusion over 3 consecutive days every 4 weeks to patients with malignant solid tumours. The objectives of our study were to determine the maximal tolerated dose, the dose-limiting toxicities, and the plasma and urine pharmacokinetics of MAG-CPT, and to document anti-tumour activity. The starting dose was 17 mg m−2 day−1. Sixteen patients received 39 courses at seven dose levels. The maximal tolerated dose was 68 mg m−2 day−1, and the dose-limiting toxicity was cumulative bladder toxicity. MAG-CPT and free camptothecin accumulated during days 1–3, and considerable amounts of MAG-CPT could still be retrieved in plasma and urine after 4–5 weeks. The half-lives of bound and free camptothecin were equal, indicating that the kinetics of free camptothecin were release-rate dependent. In summary, the pharmacokinetics of camptothecin were dramatically changed, showing controlled prolonged exposure to camptothecin. Haematological toxicity was relatively mild, but serious bladder toxicity was encountered, which is typical for camptothecin and was found to be dose limiting.

Male reproductive health and environmental xenoestrogens

Author: A Giwercman
Ansell PE
Avellan L
B Jégou
Baird DD
Baumrucker GO
Berkowitz GD
Berkowitz GS
Bibbo M
Bigler WJ
Bj N
Blom K
Bostofte E
Bradbury RB
Bromage NR
Brown LM
Buemann B
Campbell DM
Cassidy A
Chung CS
Clemens MJ
Conover MR
Copeland PA
Czeizel A
Depue RH
Driscoll SG
E Rajpert-De Meyts
Forman D
Fox GA
Freeman HC
Frst P
Fry DM
Gray LEJ
Greco TL
H Leffers
Hakulinen T
Halevi HS
Harris LE
Hautau ER
Henderson BE
Hirasing RA
Hohlbein R
Hsieh J-T
Ikeda Y
J A McLachlan
J C Larsen
J Müller
J Sumpter
J Toppari
Jennings ML
Jost A
Kallen B
Korach KS
Kupfer D
L J Guillette
Leatherland JF
McDonnell DP
McIntosh R
N E Skakkebaek
N Keiding
Newbold RR
O Meyer
P Christiansen
P Grandjean
P Jouannet
Panayotou PC
Pearce N
Peterson RE
Pohjanvirta R
R Sharpe
Rey RA
Schumacher GFB
Scully RE
Sell S
Soto AM
Sweet RA
T K Jensen
T Scheike
Thomas KB
Thorup J
Tryphonas L
Turscott B
Vessey MP
Wallace HM
Ward B
Wilkinson TJ
Publication venue: National Institute of Environmental Health Science
Publication date: 01/01/1996
Field of study

EHP is a publication of the U.S. government. Publication of EHP lies in the public domain and is therefore without copyright. Research articles from EHP may be used freely; however, articles from the News section of EHP may contain photographs or figures copyrighted by other commercial organizations and individuals that may not be used without obtaining prior approval from both the EHP editors and the holder of the copyright. Use of any materials published in EHP should be acknowledged (for example, "Reproduced with permission from Environmental Health Perspectives") and a reference provided for the article from which the material was reproduced. Male reproductive health has deteriorated in many countries during the last few decades. In the 1990s, declining semen quality has been reported from Belgium, Denmark, France, and Great Britain. The incidence of testicular cancer has increased during the same time, and incidences of hypospadias and cryptorchidism also appear to be increasing. Similar reproductive problems occur in many wildlife species. There are marked geographic differences in the prevalence of male reproductive disorders. While the reasons for these differences are currently unknown, both clinical and laboratory research suggest that the adverse changes may be inter-related and have a common origin in fetal life or childhood. Exposure of the male fetus to supranormal levels of estrogens, such as diethylstilbestrol, can result in the above-mentioned reproductive defects. The growing number of reports demonstrating that common environmental contaminants and natural factors possess estrogenic activity presents the working hypothesis that the adverse trends in male reproductive health may be, at least in part, associated with exposure to estrogenic or other hormonally active (e.g., antiandrogenic) environmental chemicals during fetal and childhood development. 
An extensive research program is needed to understand the extent of the problem and its underlying etiology, and to develop a strategy for prevention and intervention. Supported by EU Contract BMH4-CT96-0314

Copenhagen University Research Information System

Edinburgh Research Explorer

Online Research Database In Technology

Brunel University Research Archive

A robustness study of parametric and non-parametric tests in model-based multifactor dimensionality reduction for epistasis detection

Author: A Tomarken
D Freedman
D Zimmerman
DC Howell
DM Evans
DW Zimmerman
Elena S Gusareva
ES Pearson
François Van Lishout
H Jin
HB Mann
HJ Keselman
J Gibbons
J Pratt
Jestinah M Mahachie John
JH McDonald
JM Mahachie John
JV Bradley
K Van Steen
K Yang
Kristel Van Steen
L Goh
M Pett
M Weber
MD Ritchie
MDRA Jeanmougin
MH Kutner
ML Calle
MS Bartlett
PH Westfall
R Mani
R Wolfe
S Dudoit
SIB-W Szymczak
SS Sawilowsky
T Cattaert
VN Danh
W Conover
WJ Conover
X Wang
XY Lou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

BACKGROUND: Applying a statistical method implies identifying underlying (model) assumptions and checking their validity in the particular context. One of these contexts is association modeling for epistasis detection. Here, depending on the technique used, violation of model assumptions may result in increased type I error, power loss, or biased parameter estimates. Remedial measures for violated underlying conditions or assumptions include data transformation or selecting a more relaxed modeling or testing strategy. Model-Based Multifactor Dimensionality Reduction (MB-MDR) for epistasis detection relies on association testing between a trait and a factor consisting of multilocus genotype information. For quantitative traits, the framework is essentially Analysis of Variance (ANOVA) that decomposes the variability in the trait amongst the different factors. In this study, we assess through simulations, the cumulative effect of deviations from normality and homoscedasticity on the overall performance of quantitative Model-Based Multifactor Dimensionality Reduction (MB-MDR) to detect 2-locus epistasis signals in the absence of main effects. METHODOLOGY: Our simulation study focuses on pure epistasis models with varying degrees of genetic influence on a quantitative trait. Conditional on a multilocus genotype, we consider quantitative trait distributions that are normal, chi-square or Student’s t with constant or non-constant phenotypic variances. All data are analyzed with MB-MDR using the built-in Student’s t-test for association, as well as a novel MB-MDR implementation based on Welch’s t-test. Traits are either left untransformed or are transformed into new traits via logarithmic, standardization or rank-based transformations, prior to MB-MDR modeling. RESULTS: Our simulation results show that MB-MDR controls type I error and false positive rates irrespective of the association test considered. 
Empirically-based MB-MDR power estimates for MB-MDR with Welch’s t-tests are generally lower than those for MB-MDR with Student’s t-tests. Trait transformations involving ranks tend to lead to increased power compared to the other considered data transformations. CONCLUSIONS: When performing MB-MDR screening for gene-gene interactions with quantitative traits, we recommend first rank-transforming traits to normality and then applying MB-MDR modeling with Student’s t-tests as internal tests for association.

Ghent University Academic Bibliography