69 research outputs found
Stability of Random Forests and Coverage of Random-Forest Prediction Intervals
We establish stability of random forests under the mild condition that the
squared response ($Y^2$) does not have a heavy tail. In particular, our
analysis holds for the practical version of random forests that is implemented
in popular packages like \texttt{randomForest} in \texttt{R}. Empirical results
show that stability may persist even beyond our assumption and hold for
heavy-tailed $Y^2$. Using the stability property, we prove a non-asymptotic
lower bound for the coverage probability of prediction intervals constructed
from the out-of-bag error of random forests. With another mild condition that
is typically satisfied when $Y$ is continuous, we also establish a
complementary upper bound, which can be similarly established for the jackknife
prediction interval constructed from an arbitrary stable algorithm. We also
discuss the asymptotic coverage probability under assumptions weaker than those
considered in previous literature. Our work implies that random forests, with
their stability property, are an effective machine learning method that can
provide not only satisfactory point prediction but also justified interval
prediction at almost no extra computational cost. Comment: NeurIPS 202
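The jackknife prediction interval mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's random-forest procedure: it swaps in one-dimensional least squares as the base algorithm, and the names `fit_line` and `jackknife_interval` are hypothetical. The interval is centered at the full-data prediction, and its half-width is an empirical quantile of the leave-one-out absolute residuals.

```python
import math


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return a predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def jackknife_interval(xs, ys, x_new, alpha=0.1, fit=fit_line):
    """Jackknife prediction interval at x_new with nominal level 1 - alpha."""
    n = len(xs)
    residuals = []
    for i in range(n):
        # Refit without observation i, then score the held-out point.
        model_i = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        residuals.append(abs(ys[i] - model_i(xs[i])))
    residuals.sort()
    # ceil((1 - alpha) * (n + 1))-th smallest residual; capped at n here
    # (an assumption of this sketch; strictly the quantile is infinite then).
    k = min(n, math.ceil((1 - alpha) * (n + 1)))
    q = residuals[k - 1]
    center = fit(xs, ys)(x_new)
    return center - q, center + q
```

For a random forest, the out-of-bag residuals play the role of the leave-one-out residuals, which is why the interval comes at almost no extra cost beyond fitting the forest once.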
Analysis of Body Weight and Feed Intake Curves in Selection Lines for Residual Feed Intake in Pigs
A selection experiment for reducing residual feed intake (RFI = feed consumed over and above expected requirements for production and maintenance) in Yorkshire pigs consists of a line selected for lower RFI (LRFI) and a random control line (CTRL). Using 64 LRFI and 87 CTRL boars from generation 5 of the selection experiment, cubic polynomial random regression models with heterogeneous residual variance for daily feed intake (DFI) and with homogeneous residual variance for bi-weekly body weight (BW) were identified as the best linear mixed models to describe feed intake and body weight curves. Based on the Gompertz model, significant differences in the decay parameter for DFI and in mature body weight and the inflection point for BW were observed between the lines. In conclusion, selection for lower RFI has resulted in a lower feed intake curve toward maturity, lower mature body weight, and earlier inflection points for growth.
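To make the Gompertz parameters concrete, here is a small sketch of one common parameterization of the curve. The abstract does not state which parameterization the authors used, so the form below, the function name `gompertz`, and the numbers in the usage note are assumptions for illustration only.

```python
import math


def gompertz(t, mature, rate, t_inflection):
    """Gompertz curve: mature * exp(-exp(-rate * (t - t_inflection))).

    mature        asymptotic (mature) value, e.g. mature body weight
    rate          decay parameter controlling how quickly growth slows
    t_inflection  age at which the growth rate is highest
    """
    return mature * math.exp(-math.exp(-rate * (t - t_inflection)))
```

In this form the curve passes through `mature / e` at the inflection point, so a lower `mature` value and a smaller `t_inflection` correspond to the lower mature body weight and earlier inflection reported for the LRFI line. For example, `gompertz(150, 250.0, 0.015, 150)` evaluates the curve at a hypothetical inflection age of 150 days.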
Understanding and Addressing the Unbounded “Likelihood” Problem
The joint probability density function, evaluated at the observed data, is commonly used as the likelihood function to compute maximum likelihood estimates. For some models, however, there exist paths in the parameter space along which this density-approximation likelihood goes to infinity and maximum likelihood estimation breaks down. In all applications, however, observed data are really discrete due to the round-off or grouping error of measurements. The “correct likelihood” based on interval censoring can eliminate the problem of an unbounded likelihood. This article categorizes the models leading to unbounded likelihoods into three groups and illustrates the density-approximation breakdown with specific examples. Although it is usually possible to infer how given data were rounded, when this is not possible, one must choose the width for interval censoring, so we study the effect of the round-off on estimation. We also give sufficient conditions for the joint density to provide the same maximum likelihood estimate as the correct likelihood, as the round-off error goes to zero
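The breakdown described above can be sketched with a two-component normal mixture, one of the standard examples of an unbounded density likelihood. The specific model (equal weights, second component fixed at N(0, 1)), the function names, and the round-off half-width `delta` are assumptions chosen for illustration, not the article's examples. When the first component is centered on a data point and its standard deviation shrinks, the density-based log-likelihood grows without bound, while the interval-censored version is a sum of log-probabilities and can never exceed zero.

```python
import math


def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def density_loglik(data, mu1, sigma1):
    """Log-likelihood using the density of 0.5*N(mu1, sigma1^2) + 0.5*N(0, 1)."""
    return sum(
        math.log(0.5 * norm_pdf(x, mu1, sigma1) + 0.5 * norm_pdf(x, 0.0, 1.0))
        for x in data
    )


def interval_loglik(data, mu1, sigma1, delta):
    """Log-likelihood treating each x as censored into [x - delta, x + delta]."""
    total = 0.0
    for x in data:
        lo, hi = x - delta, x + delta
        p = 0.5 * (norm_cdf(hi, mu1, sigma1) - norm_cdf(lo, mu1, sigma1))
        p += 0.5 * (norm_cdf(hi, 0.0, 1.0) - norm_cdf(lo, 0.0, 1.0))
        total += math.log(p)
    return total
```

Evaluating both functions with `mu1` set to an observed value and `sigma1` taken through 1e-2, 1e-4, 1e-8 shows the density version increasing steadily while the interval version stays at or below zero.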
Impact of High-Intensity Ultrasound on Strength of Surgical Mesh when Treating Biofilm Infections
The use of cavitation-based ultrasound histotripsy to treat infections on surgical mesh has shown great potential. However, any impact of the therapy on the mesh must be assessed before the therapy can be applied in the clinic. The goal of this study was to determine if the cavitation-based therapy would reduce the strength of the mesh thus compromising the functionality of the mesh. First, S. aureus biofilms were grown on surgical mesh samples and exposed to high-intensity ultrasound pulses. For each exposure, the effectiveness of the therapy was confirmed by counting the number of colony forming units (CFUs) on the mesh. Most of the exposed meshes had no CFUs with an average reduction of 5.4-log10 relative to the sham exposures. To quantify the impact of the exposure on mesh strength, the force required to tear the mesh and the maximum mesh expansion before damage were quantified for control, sham, and exposed mesh samples. There was no statistical difference between the exposed and sham/control mesh samples in terms of ultimate tensile strength and corresponding mesh expansion. The only statistical difference was with respect to mesh orientation relative to the applied load. The tensile strength increased by 1.36 N while the expansion was reduced by 1.33 mm between the different mesh orientations
High glucose upregulates connective tissue growth factor expression in human vascular smooth muscle cells
BACKGROUND: Connective tissue growth factor (CTGF) is a potent profibrotic factor, which is implicated in fibroblast proliferation, angiogenesis and extracellular matrix (ECM) synthesis. It is a downstream mediator of some of the effects of transforming growth factor β (TGFβ) and is potentially induced by hyperglycemia in human renal mesangial cells. However, whether high glucose could induce CTGF expression in vascular smooth muscle cells (VSMCs) remains unknown. Therefore, this study was designed to test whether high glucose could regulate CTGF expression in human VSMCs. The effect of modulating CTGF expression on VSMC proliferation and migration was further investigated. RESULTS: Expression of CTGF mRNA was up-regulated as early as 6 hours after exposure of cultured human VSMCs to high-glucose conditions, followed by accumulation of ECM components (collagen type I and fibronectin). The upregulation of CTGF mRNA appears to be TGFβ-dependent, since anti-TGFβ antibody blocks the effect of high glucose on CTGF gene expression. A small interference RNA (siRNA) targeting CTGF mRNA (CTGF-siRNA) effectively suppressed the CTGF up-regulation stimulated by high glucose, with up to 79% inhibition. As a consequence of decreased expression of the CTGF gene, the deposition of ECM proteins in the VSMCs also declined. Moreover, a CTGF-siRNA-expressing vector partially inhibited the high glucose-induced VSMC proliferation and migration. CONCLUSION: Our data suggest that in the development of macrovascular complications in diabetes, CTGF might be an important factor involved in the patho-physiological responses to high glucose in human VSMCs. In addition, the modulatory effects of CTGF-siRNA during this process suggest that specifically targeting CTGF by RNA interference could be useful in preventing intimal hyperplasia in diabetic macrovascular complications.
Ozonation Efficacy in the Treatment of Soil-Borne Phytophthora sojae in Cultivating Soybeans
Ozonation was studied for inactivating Phytophthora sojae, a predominant soybean pathogen that causes root and stem rot, and pre- and post-emergence soybean damping-off. Typically, fungicides are used to treat soils to control the damage from P. sojae to soybean production. An environmentally friendly method of ozonation was studied for inactivating P. sojae, a model Phytophthora pathogen that affects a wide range of high-value crops. Artificially inoculated soil samples with P. sojae were treated with different doses of gaseous ozone. This study showed that a dosage of 0.47 g kg⁻¹ O₃ in the soil totally prevented root and stem-rot disease incidence by P. sojae. The findings of this research clearly indicate that ozonation is an efficient alternative to chemical fungicides in the inhibition of Phytophthora diseases in the soil, hence a balancing feedback loop reinforcing the soil system as natural capital.
Soil Ozonation for Nematode Disinfestation as an Alternative to Methyl Bromide and Nematicides
Phytoparasitic nematodes are important pests that cause severe crop yield losses. In the past, methyl bromide and other proprietary nematicides have been used as management practices, but these practices are unsustainable and lead to atmospheric pollution and ozone layer destruction. Ozonation was studied as an alternative management practice since it is highly effective against microorganisms and degenerates quickly to oxygen. Soil samples that were naturally infested with nematodes were treated with different levels of gaseous ozone at 21 °C and 5 °C. Regression analysis results show that a medium level of ozonation (2.1 g O₃ kg⁻¹ for 15 min at an ozonation rate of 0.14 g O₃ kg⁻¹ min⁻¹) and low temperature (5 °C) resulted in 94% mean nematode inhibition.
Scan Parameter Optimization for Histotripsy Treatment of S. Aureus Biofilms on Surgical Mesh
There is a critical need to develop new noninvasive therapies to treat bacterial biofilms. Previous studies have demonstrated the effectiveness of cavitation-based ultrasound histotripsy to destroy these biofilms. In this study, the dependence of biofilm destruction on multiple scan parameters was assessed by conducting exposures at different scan speeds (0.3-1.4 beam widths/sec), step sizes (0.25-0.5 beam widths), and numbers of passes of the focus across the mesh (2-6). For each of the exposure conditions, the number of colony forming units (CFUs) remaining on the mesh was quantified. A regression analysis was then conducted, revealing that scan speed was the most critical parameter for biofilm destruction. Reducing the number of passes and the scan speed should allow for more efficient biofilm destruction in the future, reducing the treatment time.
Elementary Statistical Methods and Measurement Error
How the sources of physical variation interact with a data collection plan determines what can be learned from the resulting dataset, and in particular, how measurement error is reflected in the dataset. The implications of this fact are rarely given much attention in most statistics courses. Even the most elementary statistical methods have their practical effectiveness limited by measurement variation; and understanding how measurement variation interacts with data collection and the methods is helpful in quantifying the nature of measurement error. We illustrate how simple one- and two-sample statistical methods can be effectively used in introducing important concepts of metrology and the implications of those concepts when drawing conclusions from data