1,109 research outputs found

    UTOPIA: Universally Trainable Optimal Prediction Intervals Aggregation

    Uncertainty quantification for prediction is an intriguing problem with significant applications in various fields, such as biomedical science, economic studies, and weather forecasting. Numerous methods are available for constructing prediction intervals, such as quantile regression and conformal prediction. Nevertheless, model misspecification (especially in high dimensions) or sub-optimal constructions can frequently result in biased or unnecessarily wide prediction intervals. In this paper, we propose a novel and widely applicable technique, called Universally Trainable Optimal Predictive Intervals Aggregation (UTOPIA), for aggregating multiple prediction intervals so as to minimize the average width of the prediction band while guaranteeing coverage. The method also allows us to directly construct predictive bands based on elementary basis functions. Our approach is based on linear or convex programming, which is easy to implement. All of our proposed methodologies are supported by theoretical guarantees on the coverage probability and optimal average length, which are detailed in this paper. The effectiveness of our approach is demonstrated by applying it to synthetic data and two real datasets on finance and macroeconomics.
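
The idea of aggregating intervals by linear programming can be illustrated with a minimal sketch. It is not the paper's exact procedure: it assumes a known band center, represents each candidate interval by a non-negative half-width, and requires the aggregated band to cover every training point rather than reproducing the paper's coverage guarantee. The function name `aggregate_half_widths`, the two candidate widths, and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def aggregate_half_widths(abs_residuals, candidate_widths):
    """Find weights a >= 0 that minimize the mean aggregated half-width
    subject to covering every training residual (an assumed simplification)."""
    r = np.asarray(abs_residuals, dtype=float)
    W = np.asarray(candidate_widths, dtype=float)  # shape (n, K)
    cost = W.mean(axis=0)  # average width contributed by each candidate
    # Coverage constraint W @ a >= r, written as -W @ a <= -r for linprog.
    result = linprog(cost, A_ub=-W, b_ub=-r,
                     bounds=[(0, None)] * W.shape[1], method="highs")
    if not result.success:
        raise ValueError(f"LP did not solve: {result.message}")
    return result.x


# Synthetic heteroscedastic data (hypothetical example).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, size=n)
y = x + (0.2 + x) * rng.standard_normal(n)

center = x  # band center, assumed known for this sketch
abs_residuals = np.abs(y - center)

# Two candidate half-widths: a constant band and one that grows with x.
candidate_widths = np.column_stack([np.ones(n), x])

weights = aggregate_half_widths(abs_residuals, candidate_widths)
half_width = candidate_widths @ weights  # aggregated band: center +/- half_width
```

Because the constant band scaled by the largest residual is always feasible, the aggregated band is never wider on average than that naive choice.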

    Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift

    A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization: generalizing to target data whose distribution differs from that of source data. Despite its significant importance, the fundamental question of "what are the most effective algorithms for OOD generalization" remains open even under the standard setting of covariate shift. This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying the claim that MLE is all you need. Our result holds for a very rich class of parametric models, and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it in three concrete examples: linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice, whereas the Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios.
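
The two estimators named in the abstract can be sketched for the linear regression example. This is an illustration under stated assumptions, not the paper's analysis: Gaussian noise (so MLE reduces to ordinary least squares), source covariates drawn from N(0, I), target covariates drawn from N(shift, I), and a density ratio known in closed form. The helper names `fit_mle` and `fit_mwle` and all numeric values are hypothetical.

```python
import numpy as np


def fit_mle(X, y):
    """Gaussian-noise MLE for linear regression: ordinary least squares
    on source data only, with no reweighting."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def fit_mwle(X, y, weights):
    """Weighted-likelihood counterpart: least squares with each source
    sample weighted by the target/source density ratio."""
    s = np.sqrt(weights)
    theta, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return theta


rng = np.random.default_rng(0)
n, d = 2000, 3
theta_true = np.array([1.0, -2.0, 0.5])
shift = np.array([1.0, 0.0, 0.0])  # assumed mean shift of the target covariates

# Well-specified source data: the linear model is the true model.
X_source = rng.standard_normal((n, d))
y_source = X_source @ theta_true + 0.1 * rng.standard_normal(n)

# Density ratio N(shift, I) / N(0, I) evaluated at the source points.
density_ratio = np.exp(X_source @ shift - 0.5 * shift @ shift)

theta_mle = fit_mle(X_source, y_source)
theta_mwle = fit_mwle(X_source, y_source, density_ratio)
```

In this well-specified setup both estimators recover `theta_true` closely; the sketch only shows how they differ in construction and does not demonstrate the minimax results.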

    Diversified pattern of the human colorectal cancer microbiome

    BACKGROUND: The aim of this study is to expand existing knowledge about the CRC-associated microbiome among Han Chinese, and to further discover the variation pattern of the human CRC microbiome across all populations. FINDINGS: Using pyrosequencing-based molecular monitoring of the bacterial 16S rRNA gene in eight tumor/normal tissue pairs from eight Chinese CRC patients, we analyzed and characterized the basic features of the CRC-associated microbiome. Firstly, we discovered increased diversity among tumor-associated bacterial communities. Secondly, in 50% of Chinese CRC patients, we found a significant increase of Roseburia (P = 0.017), and a concurrent decrease of both Microbacterium (P = 0.009) and Anoxybacillus (P = 0.009) in tumor tissue. CONCLUSIONS: We discovered a novel CRC microbiome pattern in Chinese patients. The over-representation of Roseburia at tumor sites and the over-representation of Microbacterium and Anoxybacillus away from tumor sites were closely related in Chinese CRC patients. Across several populations reported in this study and previously, we observed both common and distinctive patterns in the association of the human CRC microbiome with a high risk of CRC.

    Residual Denoising Diffusion Models

    We propose residual denoising diffusion models (RDDM), a novel dual diffusion process that decouples the traditional single denoising diffusion process into residual diffusion and noise diffusion. This dual diffusion framework expands the denoising-based diffusion models, initially uninterpretable for image restoration, into a unified and interpretable model for both image generation and restoration by introducing residuals. Specifically, our residual diffusion represents directional diffusion from the target image to the degraded input image and explicitly guides the reverse generation process for image restoration, while noise diffusion represents random perturbations in the diffusion process. The residual prioritizes certainty, while the noise emphasizes diversity, enabling RDDM to effectively unify tasks with varying certainty or diversity requirements, such as image generation and restoration. We demonstrate that our sampling process is consistent with that of DDPM and DDIM through coefficient transformation, and propose a partially path-independent generation process to better understand the reverse process. Notably, our RDDM enables a generic UNet, trained with only an $\ell_1$ loss and a batch size of 1, to compete with state-of-the-art image restoration methods. We provide code and pre-trained models to encourage further exploration, application, and development of our innovative framework (https://github.com/nachifur/RDDM).
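
One way to picture the dual diffusion described above is a forward step that adds a scaled residual (degraded input minus target) and, separately, scaled Gaussian noise. The sketch below is an assumption-laden illustration, not the authors' released code: the update form, the function name `dual_diffusion_sample`, the coefficient names `alpha_bar` and `beta_bar`, and the toy arrays are all hypothetical.

```python
import numpy as np


def dual_diffusion_sample(target, degraded, alpha_bar, beta_bar, rng):
    """Return a noisy state that moves from the target toward the degraded
    input (residual diffusion) and adds random noise (noise diffusion).

    alpha_bar in [0, 1] controls how much of the residual is applied;
    beta_bar >= 0 controls the noise level. Both schedules are assumed."""
    residual = degraded - target              # directional component
    noise = rng.standard_normal(target.shape)  # random perturbation
    return target + alpha_bar * residual + beta_bar * noise


rng = np.random.default_rng(0)
target = rng.uniform(0.0, 1.0, size=(8, 8))  # toy "clean" image
degraded = 0.5 * target                       # toy degradation (darkening)

start_state = dual_diffusion_sample(target, degraded, 0.0, 0.0, rng)
end_state_no_noise = dual_diffusion_sample(target, degraded, 1.0, 0.0, rng)
mid_state = dual_diffusion_sample(target, degraded, 0.5, 0.1, rng)
```

With no residual and no noise applied the state equals the target, and with the full residual and no noise it equals the degraded input, which matches the "target image to degraded input image" direction stated in the abstract.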