Novel statistical approaches for non-normal censored immunological data: analysis of cytokine and gene expression data
Background: For several immune-mediated diseases, immunological analysis will become more complex in the future with datasets in which cytokine and gene expression data play a major role. These data have certain characteristics that require sophisticated statistical analysis such as strategies for non-normal distribution and censoring. Additionally, complex and multiple immunological relationships need to be adjusted for potential confounding and interaction effects.
Objective: We aimed to introduce and apply different methods for statistical analysis of non-normal censored cytokine and gene expression data. Furthermore, we assessed the performance and accuracy of a novel regression approach in order to allow adjusting for covariates and potential confounding.
Methods: For non-normally distributed censored data, traditional methods such as the Kaplan-Meier method and the generalized Wilcoxon test are described. To adjust for covariates, a novel approach named Tobit regression on ranks was introduced. Its performance and accuracy for the analysis of non-normal censored cytokine/gene expression data were evaluated by a simulation study and a statistical experiment applying permutation and bootstrapping.
Results: If adjustment for covariates is not necessary, traditional statistical methods are adequate for non-normal censored data. Tobit regression on ranks is comparable with these methods and is a valid choice when additional adjustment is required. Its power, type-I error rate and accuracy were comparable to those of classical Tobit regression.
Conclusion: Non-normally distributed censored immunological data require appropriate statistical methods. Tobit regression on ranks meets these requirements and can be used to adjust for covariates and potential confounding in large and complex immunological datasets.
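As a rough illustration of the classical Tobit model that the abstract uses as its benchmark, the sketch below evaluates the log-likelihood for left-censored data with a single covariate. The function name, the argument layout and the single-covariate restriction are assumptions made for this example. The abstract does not specify how the rank-based variant is constructed, so it is not reproduced here.

```python
from math import log
from statistics import NormalDist


def tobit_loglik(y, x, censored, intercept, slope, sigma):
    """Log-likelihood of a left-censored Tobit model with one covariate.

    y[i] is the measured value, or the detection limit when censored[i] is True.
    All names here are hypothetical and chosen for illustration only.
    """
    std_normal = NormalDist()
    total = 0.0
    for yi, xi, is_censored in zip(y, x, censored):
        z = (yi - (intercept + slope * xi)) / sigma
        if is_censored:
            # Censored observation: the latent value lies at or below the limit.
            total += log(std_normal.cdf(z))
        else:
            # Fully observed value: normal density of the scaled residual.
            total += log(std_normal.pdf(z)) - log(sigma)
    return total


# One observed value and one value censored at a detection limit of 0.
print(tobit_loglik([0.0, 0.0], [0.0, 0.0], [False, True], 0.0, 0.0, 1.0))
```

In practice the parameters would be found by maximizing this function with a numerical optimizer. That step is omitted to keep the sketch dependency-free.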
Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models
Background: Standard approaches to assessing the performance of predictive models, which apply common statistical measurements to the entire data set, provide an overview of the average performance of the models across the entire predictive space, but give little insight into the applicability of the model across the prediction space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI) curves via the SALI curve integral (SCI) as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. Results: The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. Conclusion: The results indicate that the choice of training sets used to build the prediction models is of great importance to the resulting model quality, and that the SCI values calculated for these models were very similar to their Kendall τ values. This leads us to suggest an approach that uses the SALI/SCI paradigm to evaluate predictive model performance and that will allow more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.
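The pairwise evaluation described above can be sketched as follows. The example assumes the usual SALI definition, the absolute activity difference of a compound pair divided by one minus its structural similarity, and scores each retained pair +1 when the model orders it correctly and -1 when it does not. The function names, the toy data and the handling of identical compounds are assumptions for this illustration, not the authors' implementation.

```python
from itertools import combinations


def sali(activity_i, activity_j, similarity):
    """SALI for one compound pair: activity gap scaled by structural dissimilarity."""
    return abs(activity_i - activity_j) / (1.0 - similarity)


def ordering_score(observed, predicted, similarity, threshold=0.0):
    """Mean of +1 / -1 / 0 over compound pairs whose SALI is at least `threshold`."""
    scores = []
    for i, j in combinations(range(len(observed)), 2):
        sim = similarity[i][j]
        if sim >= 1.0 or observed[i] == observed[j]:
            continue  # SALI undefined, or no observed ordering to predict
        if sali(observed[i], observed[j], sim) < threshold:
            continue
        product = (observed[i] - observed[j]) * (predicted[i] - predicted[j])
        # +1 if the predicted order matches, -1 if reversed, 0 if predicted equal.
        scores.append((product > 0) - (product < 0))
    return sum(scores) / len(scores) if scores else float("nan")


# Toy data: three compounds, all pairwise similarities set to 0.5.
observed = [1.0, 2.0, 4.0]
predicted = [1.1, 3.0, 2.5]
similarity = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
for cutoff in (0.0, 3.0, 5.0):
    print(cutoff, ordering_score(observed, predicted, similarity, cutoff))
```

With a threshold of zero and no tied activities, this score counts concordant minus discordant pairs over all pairs, which is the Kendall τ of the predictions. That is consistent with the similarity between SCI and Kendall τ values noted in the abstract.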
Relative Effects of Juvenile and Adult Environmental Factors on Mate Attraction and Recognition in the Cricket, Allonemobius socius
Finding a mate is a fundamental aspect of sexual reproduction. To this end, specific-mate recognition systems (SMRS) have evolved that facilitate copulation between producers of the mating signal and their opposite-sex responders. Environmental variation, however, may compromise the efficiency with which SMRS operate. In this study, the degree to which seasonal climate experienced during juvenile and adult life-cycle stages affects the SMRS of a cricket, Allonemobius socius (Scudder) (Orthoptera: Gryllidae) was assessed. Results from two-choice behavioral trials suggest that adult ambient temperature, along with population and family origins, mediate variation in male mating call, and to a lesser extent directional response of females for those calls. Restricted maximum-likelihood estimates of heritability for male mating call components and for female response to mating call appeared statistically nonsignificant. However, appreciable “maternal genetic effects” suggest that maternal egg provisioning and other indirect maternal determinants of the embryonic environment significantly contributed to variation in male mating call and female response to mating calls. Thus, environmental factors can generate substantial variation in A. socius mating call, and, more importantly, their marginal effect on female responses to either fast-chirp or long-chirp mating calls suggest negative fitness consequences to males producing alternative types of calls. Future studies of sexual selection and SMRS evolution, particularly those focused on hybrid zone dynamics, should take explicit account of the loose concordance between signal producers and responders suggested by the current findings
Asymptotic behaviour and optimal word size for exact and approximate word matches between random sequences
BACKGROUND: The number of k-words shared between two sequences is a simple and efficient alignment-free sequence comparison method. This statistic, D(2), has been used for the clustering of EST sequences. Sequence comparison based on D(2) is extremely fast: its run time is proportional to the size of the sequences under scrutiny, whereas alignment-based comparisons have a worst-case run time proportional to the square of the size. Recent studies have tackled the rigorous study of the statistical distribution of D(2), and asymptotic regimes have been derived. The distribution of approximate k-word matches has also been studied. RESULTS: We have computed the D(2) optimal word size for various sequence lengths, and for both perfect and approximate word matches. Kolmogorov-Smirnov tests show D(2) to have a compound Poisson distribution at the optimal word size for small sequence lengths (below 400 letters) and a normal distribution at the optimal word size for large sequence lengths (above 1600 letters). We find that the D(2) statistic outperforms BLAST in the comparison of artificially evolved sequences, and performs similarly to other methods based on exact word matches. These results obtained with randomly generated sequences are also valid for sequences derived from human genomic DNA. CONCLUSION: We have characterized the distribution of the D(2) statistic at optimal word sizes. We find that the best trade-off between computational efficiency and accuracy is obtained with exact word matches. Given that our numerical tests have not included sequence shuffling, transposition or splicing, the improvements over existing methods reported here underestimate those expected in real sequences. Because of their linear run time and known normal asymptotic behavior, D(2)-based methods are most appropriate for large genomic sequences.
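A minimal sketch of the exact-match statistic described above: D(2) is taken here as the number of matching k-word pairs between two sequences, computed from k-word counts. The function names are chosen for this illustration, and approximate (mismatch-tolerant) word matches are not covered.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-word in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of exact k-word matches between positions of the two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Each word contributes (occurrences in A) * (occurrences in B) matches.
    return sum(n * counts_b[word] for word, n in counts_a.items())


print(d2("ACGT", "ACGA", 2))  # shared words: AC and CG
print(d2("AAAA", "AAA", 2))   # AA occurs 3 times and 2 times
```

Each sequence is scanned once and the counts are combined through hash lookups, which reflects the linear run time that the abstract contrasts with quadratic alignment.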
Using a formative simulated patient exercise for curriculum evaluation
BACKGROUND: It is not clear that teaching specific history taking, physical examination and patient teaching techniques to medical students results in durable behavioural changes. We used a quasi-experimental design that approximated a randomized double blinded trial to examine whether a Participatory Decision-Making (PDM) educational module taught in a clerkship improves performance on a Simulated Patient Exercise (SPE) in another clerkship, and how this is influenced by the time between training and assessment. METHODS: Third year medical students in an internal medicine clerkship were assessed on their use of PDM skills in an SPE conducted in the second week of the clerkship. The rotational structure of the third year clerkships formed a pseudo-randomized design where students had 1) completed the family practice clerkship containing a training module on PDM skills approximately four weeks prior to the SPE, 2) completed the family medicine clerkship and the training module approximately 12 weeks prior to the SPE or 3) had not completed the family medicine clerkship and the PDM training module at the time they were assessed via the SPE. RESULTS: Based on limited pilot data there were statistically significant differences between students who received PDM training approximately four weeks prior to the SPE and students who received training approximately 12 weeks prior to the SPE. Students who received training 12 weeks prior to the SPE performed better than those who received training four weeks prior to the SPE. In a second comparison students who received training four weeks prior to the SPE performed better than those who did not receive training but the differences narrowly missed statistical significance (P < 0.05). CONCLUSION: This pilot study demonstrated the feasibility of a methodology for conducting rigorous curricular evaluations using natural experiments based on the structure of clinical rotations. 
In addition, it provided preliminary data suggesting targeted educational interventions can result in marked improvements in the clinical skills spontaneously exhibited by physician trainees in a setting different from which the skills were taught
LIPS vs MOSA: a Replicated Empirical Study on Automated Test Case Generation
Replication is a fundamental pillar in the construction of scientific knowledge. Test data generation for procedural programs can be tackled using a single-target or a many-objective approach. The proponents of LIPS, a novel single-target test generator, conducted a preliminary empirical study to compare their approach with MOSA, an alternative many-objective test generator. However, their empirical investigation suffers from several external and internal validity threats, does not consider complex programs with many branches and does not include any qualitative analysis to interpret the results. In this paper, we report the results of a replication of the original study designed to address its major limitations and threats to validity. The new findings draw a completely different picture on the pros and cons of single-target vs many-objective approaches to test case generation
Can sacrificial feeding areas protect aquatic plants from herbivore grazing? Using behavioural ecology to inform wildlife management
Effective wildlife management is needed for conservation, economic and human well-being objectives. However, traditional population control methods are frequently ineffective, unpopular with stakeholders, may affect non-target species, and can be both expensive and impractical to implement. New methods which address these issues and offer effective wildlife management are required. We used an individual-based model to predict the efficacy of a sacrificial feeding area in preventing grazing damage by mute swans (Cygnus olor) to adjacent river vegetation of high conservation and economic value. The accuracy of model predictions was assessed by a comparison with observed field data, whilst prediction robustness was evaluated using a sensitivity analysis. We used repeated simulations to evaluate how the efficacy of the sacrificial feeding area was regulated by (i) food quantity, (ii) food quality, and (iii) the functional response of the forager. Our model gave accurate predictions of aquatic plant biomass, carrying capacity, swan mortality, swan foraging effort, and river use. Our model predicted that increased sacrificial feeding area food quantity and quality would prevent the depletion of aquatic plant biomass by swans. When the functional response for vegetation in the sacrificial feeding area was increased, the food quantity and quality in the sacrificial feeding area required to protect adjacent aquatic plants were reduced. Our study demonstrates how the insights of behavioural ecology can be used to inform wildlife management. The principles that underpin our model predictions are likely to be valid across a range of different resource-consumer interactions, emphasising the generality of our approach to the evaluation of strategies for resolving wildlife management problems
A phase I and pharmacokinetic study of MAG-CPT, a water-soluble polymer conjugate of camptothecin
Polymeric drug conjugates are a new and experimental class of drug delivery systems with pharmacokinetic promise. The antineoplastic drug camptothecin was linked to a water-soluble polymeric backbone (MAG-CPT) and administered as a 30 min infusion over 3 consecutive days every 4 weeks to patients with malignant solid tumours. The objectives of our study were to determine the maximal tolerated dose, the dose-limiting toxicities, and the plasma and urine pharmacokinetics of MAG-CPT, and to document anti-tumour activity. The starting dose was 17 mg m−2 day−1. Sixteen patients received 39 courses at seven dose levels. The maximal tolerated dose was 68 mg m−2 day−1, and the dose-limiting toxicity was cumulative bladder toxicity. MAG-CPT and free camptothecin accumulated during days 1–3, and considerable amounts of MAG-CPT could still be retrieved in plasma and urine after 4–5 weeks. The half-lives of bound and free camptothecin were equal, indicating that the kinetics of free camptothecin were release-rate dependent. In summary, the pharmacokinetics of camptothecin were dramatically changed, showing controlled prolonged exposure to camptothecin. Haematological toxicity was relatively mild, but serious bladder toxicity, which is typical for camptothecin, was encountered and found to be dose limiting.
Male reproductive health and environmental xenoestrogens
Male reproductive health has deteriorated in many countries during the last few decades. In the 1990s, declining semen quality has been reported from Belgium, Denmark, France, and Great Britain. The incidence of testicular cancer has increased during the same time; incidences of hypospadias and cryptorchidism also appear to be increasing. Similar reproductive problems occur in many wildlife species. There are marked geographic differences in the prevalence of male reproductive disorders. While the reasons for these differences are currently unknown, both clinical and laboratory research suggest that the adverse changes may be inter-related and have a common origin in fetal life or childhood. Exposure of the male fetus to supranormal levels of estrogens, such as diethylstilbestrol, can result in the above-mentioned reproductive defects. The growing number of reports demonstrating that common environmental contaminants and natural factors possess estrogenic activity presents the working hypothesis that the adverse trends in male reproductive health may be, at least in part, associated with exposure to estrogenic or other hormonally active (e.g., antiandrogenic) environmental chemicals during fetal and childhood development. An extensive research program is needed to understand the extent of the problem and its underlying etiology, and to develop a strategy for prevention and intervention. Supported by EU Contract BMH4-CT96-0314.
A robustness study of parametric and non-parametric tests in model-based multifactor dimensionality reduction for epistasis detection
BACKGROUND: Applying a statistical method implies identifying underlying (model) assumptions and checking their validity in the particular context. One of these contexts is association modeling for epistasis detection. Here, depending on the technique used, violation of model assumptions may result in increased type I error, power loss, or biased parameter estimates. Remedial measures for violated underlying conditions or assumptions include data transformation or selecting a more relaxed modeling or testing strategy. Model-Based Multifactor Dimensionality Reduction (MB-MDR) for epistasis detection relies on association testing between a trait and a factor consisting of multilocus genotype information. For quantitative traits, the framework is essentially Analysis of Variance (ANOVA) that decomposes the variability in the trait amongst the different factors. In this study, we assess through simulations, the cumulative effect of deviations from normality and homoscedasticity on the overall performance of quantitative Model-Based Multifactor Dimensionality Reduction (MB-MDR) to detect 2-locus epistasis signals in the absence of main effects. METHODOLOGY: Our simulation study focuses on pure epistasis models with varying degrees of genetic influence on a quantitative trait. Conditional on a multilocus genotype, we consider quantitative trait distributions that are normal, chi-square or Student’s t with constant or non-constant phenotypic variances. All data are analyzed with MB-MDR using the built-in Student’s t-test for association, as well as a novel MB-MDR implementation based on Welch’s t-test. Traits are either left untransformed or are transformed into new traits via logarithmic, standardization or rank-based transformations, prior to MB-MDR modeling. RESULTS: Our simulation results show that MB-MDR controls type I error and false positive rates irrespective of the association test considered. 
Empirically-based power estimates for MB-MDR with Welch’s t-tests are generally lower than those for MB-MDR with Student’s t-tests. Trait transformations involving ranks tend to lead to increased power compared to the other considered data transformations. CONCLUSIONS: When performing MB-MDR screening for gene-gene interactions with quantitative traits, we recommend first rank-transforming traits to normality and then applying MB-MDR modeling with Student’s t-tests as internal tests for association.
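The recommendation above combines a rank-based transformation to normality with a two-sample t-test. The sketch below shows one common form of that transformation next to the Student and Welch test statistics. The (rank − 0.5)/n offset, the function names and the toy data are assumptions for this illustration; this is not the MB-MDR implementation.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def rank_to_normal(values):
    """Rank-based inverse normal transform; tied values share their mid-rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mid_rank = (start + end) / 2 + 1  # average 1-based rank of the tie group
        for pos in range(start, end + 1):
            ranks[order[pos]] = mid_rank
        start = end + 1
    std_normal = NormalDist()
    # The 0.5 offset keeps every quantile strictly inside (0, 1); it is an assumed choice.
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


def student_t(a, b):
    """Two-sample t statistic with a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def welch_t(a, b):
    """Two-sample t statistic without the equal-variance assumption."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


trait = [0.2, 5.1, 0.9, 40.0, 1.3, 2.2]  # skewed toy trait values
scores = rank_to_normal(trait)
print(student_t(scores[:3], scores[3:]), welch_t(scores[:3], scores[3:]))
```

With equal group sizes the two statistics coincide and only their degrees of freedom differ. They diverge when group sizes and variances are both unequal.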