Epidemiology of and surveillance for postpartum infections.
We screened automated ambulatory medical records, hospital and emergency room claims, and pharmacy records of 2,826 health maintenance organization (HMO) members who gave birth over a 30-month period. Full-text ambulatory records were reviewed for the 30-day postpartum period to confirm infection status for a weighted sample of cases. The overall postpartum infection rate was 6.0%, with rates of 7.4% following cesarean section and 5.5% following vaginal delivery. Rehospitalization; cesarean delivery; antistaphylococcal antibiotics; diagnosis codes for mastitis, endometritis, and wound infection; and ambulatory blood or wound cultures were important predictors of infection. Use of automated information routinely collected by HMOs and insurers allows efficient identification of postpartum infections not detected by conventional surveillance.
A joint individual-based model coupling growth and mortality reveals that tree vigor is a key component of tropical forest dynamics
Tree vigor is often used as a covariate when tree mortality is predicted from tree growth in tropical forest dynamic models, but it is rarely explicitly accounted for in a coherent modeling framework. We quantify tree vigor at the individual tree level, based on the difference between expected and observed growth. The available methods to join nonlinear tree growth and mortality processes are not commonly used by forest ecologists, so we develop an inference methodology based on an MCMC approach, allowing us to sample the parameters of the growth and mortality model according to their posterior distribution using the joint model likelihood. We apply our framework to a set of data on the 20-year dynamics of a forest in Paracou, French Guiana, taking advantage of functional trait-based growth and mortality models already developed independently. Our results showed that growth and mortality are intimately linked and that the vigor estimator is an essential predictor of mortality, highlighting that trees growing more than expected have a far lower probability of dying. Our joint model methodology is sufficiently generic to be used to join two longitudinal and punctual linked processes and thus may be applied to a wide range of growth and mortality models. In the context of global changes, such joint models are urgently needed in tropical forests to analyze, and then predict, the effects of the ongoing changes on the tree dynamics in hyperdiverse tropical forests.
Application of multiple statistical tests to enhance mass spectrometry-based biomarker discovery
Background: Mass spectrometry-based biomarker discovery has long been hampered by the difficulty in reconciling lists of discriminatory peaks identified by different laboratories for the same diseases studied. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each to analyze the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states.
Results: The proposed methodology was applied to a pilot narcolepsy study using logistic regression, hierarchical clustering, t-test, and CART. Consensus, differential mass peaks with high predictive power were identified across three of the four statistical platforms. Based on the diagnostic accuracy measures investigated, the performance of the consensus-peak model was a compromise between logistic regression and CART, which produced better models than hierarchical clustering and t-test. However, consensus peaks confer a higher level of confidence in their ability to distinguish between disease states since they do not represent peaks that are a result of biases to a particular statistical algorithm. Instead, they were selected as differential across differing data distribution assumptions, demonstrating their true discriminatory potential.
Conclusion: The methodology described here is applicable to any high-resolution MALDI mass spectrometry-derived data set with minimal mass drift which is essential for peak-to-peak comparison studies. Four statistical approaches with differing data distribution assumptions were applied to the same raw data set to obtain consensus peaks that were found to be statistically differential between the two groups compared. These consensus peaks demonstrated high diagnostic accuracy when used to form a predictive model as evaluated by receiver operating characteristics curve analysis. They should demonstrate a higher discriminatory ability as they are not biased to a particular algorithm. Thus, they are prime candidates for downstream identification and validation efforts.
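The consensus step described in this abstract can be illustrated with a minimal sketch. It assumes each statistical method has already produced a list of differential peaks on a shared m/z grid (the abstract notes that minimal mass drift is essential for peak-to-peak comparison). The function name `consensus_peaks`, the `min_methods` threshold, and all m/z values are hypothetical and chosen for illustration only; they are not taken from the study.

```python
from collections import Counter


def consensus_peaks(selections, min_methods):
    """Return peaks flagged as differential by at least `min_methods` methods.

    `selections` maps a method name to the peaks that method selected.
    Peaks are compared by exact equality, so they are assumed to be
    already aligned to a common m/z grid.
    """
    counts = Counter()
    for peaks in selections.values():
        counts.update(set(peaks))  # each method votes at most once per peak
    return sorted(peak for peak, n in counts.items() if n >= min_methods)


# Hypothetical m/z values, for illustration only (not data from the study).
selections = {
    "logistic_regression": [1203.5, 1456.2, 2310.8],
    "hierarchical_clustering": [1203.5, 1987.4],
    "t_test": [1203.5, 1456.2, 1987.4],
    "cart": [1456.2, 2310.8, 1203.5],
}

# Keep peaks selected by at least three of the four methods.
consensus = consensus_peaks(selections, min_methods=3)
print(consensus)
```

Requiring agreement across methods with different distributional assumptions trades sensitivity for robustness: a peak favored by only one algorithm is dropped even if that algorithm ranks it highly.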
Structure of shocks in Burgers turbulence with Lévy noise initial data
We study the structure of the shocks for the inviscid Burgers equation in dimension 1 when the initial velocity is given by Lévy noise, or equivalently when the initial potential is a two-sided Lévy process. When this process is abrupt in the sense of Vigon, or has bounded variation with […], we prove that the set of points with zero velocity is regenerative, and that in the latter case this set is equal to the set of Lagrangian regular points, which is non-empty. When the process is abrupt we show that the shock structure is discrete. When the process is eroded we show that there are no rarefaction intervals.
Regression with Empirical Variable Selection: Description of a New Method and Application to Ecological Datasets
Despite recent papers on problems associated with full-model and stepwise regression, their use is still common throughout ecological and environmental disciplines. Alternative approaches, including generating multiple models and comparing them post-hoc using techniques such as Akaike's Information Criterion (AIC), are becoming more popular. However, these are problematic when there are numerous independent variables, and interpretation is often difficult when competing models contain many different variables and combinations of variables. Here, we detail a new approach, REVS (Regression with Empirical Variable Selection), which uses all-subsets regression to quantify empirical support for every independent variable. A series of models is created; the first containing the variable with most empirical support, the second containing the first variable and the next most-supported, and so on. The comparatively small number of resultant models (n = the number of predictor variables) means that post-hoc comparison is comparatively quick and easy. When tested on a real dataset – habitat and offspring quality in the great tit (Parus major) – the optimal REVS model explained more variance (higher R²), was more parsimonious (lower AIC), and had greater significance (lower P values), than full, stepwise or all-subsets models; it also had higher predictive accuracy based on split-sample validation. Testing REVS on ten further datasets suggested that this is typical, with R² values being higher than full or stepwise models (mean improvement = 31% and 7%, respectively). Results are ecologically intuitive as even when there are several competing models, they share a set of “core” variables and differ only in presence/absence of one or two additional variables. We conclude that REVS is useful for analysing complex datasets, including those in ecology and environmental disciplines.
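The two stages described in this abstract (rank variables by empirical support from all-subsets regression, then build a nested series of models) can be sketched roughly as follows. This is not the authors' implementation. The abstract does not say how empirical support is computed, so the sketch assumes it is the summed Akaike weight of every subset model containing the variable. The function names and the synthetic data are likewise hypothetical.

```python
from itertools import combinations

import numpy as np


def ols_aic(X, y):
    """AIC of an ordinary least-squares fit with intercept (Gaussian errors)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1] + 1  # regression coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k


def variable_support(X, y):
    """Assumed support measure: summed Akaike weight per predictor."""
    p = X.shape[1]
    subsets = [s for r in range(1, p + 1) for s in combinations(range(p), r)]
    aics = np.array([ols_aic(X[:, list(s)], y) for s in subsets])
    weights = np.exp(-0.5 * (aics - aics.min()))
    weights /= weights.sum()
    support = np.zeros(p)
    for s, w in zip(subsets, weights):
        support[list(s)] += w
    return support


def revs_models(X, y):
    """Nested models: best-supported variable, then the top two, and so on."""
    order = np.argsort(-variable_support(X, y))
    models = []
    for m in range(1, X.shape[1] + 1):
        cols = [int(i) for i in order[:m]]
        models.append((tuple(cols), ols_aic(X[:, cols], y)))
    return models


# Synthetic data for illustration: only predictors 0 and 2 affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=200)

models = revs_models(X, y)
best = min(models, key=lambda item: item[1])
print(best[0])
```

The all-subsets stage fits 2^p - 1 models, so it is only practical for a modest number of predictors; the nested stage then leaves just p candidate models to compare, matching the count given in the abstract.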
Linking genes to literature: text mining, information extraction, and retrieval applications for biology
Efficient access to information contained in online scientific literature collections is essential for life science research, playing a crucial role from the initial stage of experiment planning to the final interpretation and communication of the results. The biological literature also constitutes the main information source for manual literature curation used by expert-curated databases. Following the increasing popularity of web-based applications for analyzing biological data, new text-mining and information extraction strategies are being implemented. These systems exploit existing regularities in natural language to extract biologically relevant information from electronic texts automatically. The aim of the BioCreative challenge is to promote the development of such tools and to provide insight into their performance. This review presents a general introduction to the main characteristics and applications of currently available text-mining systems for life sciences in terms of the following: the type of biological information demands being addressed; the level of information granularity of both user queries and results; and the features and methods commonly exploited by these applications. The current trend in biomedical text mining points toward an increasing diversification in terms of application types and techniques, together with integration of domain-specific resources such as ontologies. Additional descriptions of some of the systems discussed here are available on the internet.