UTOPIA: Universally Trainable Optimal Prediction Intervals Aggregation
Uncertainty quantification for prediction is an intriguing problem with
significant applications in various fields, such as biomedical science,
economic studies, and weather forecasts. Numerous methods are available for
constructing prediction intervals, such as quantile regression and conformal
predictions, among others. Nevertheless, model misspecification (especially in
high dimensions) or sub-optimal constructions can frequently result in biased or
unnecessarily wide prediction intervals. In this paper, we propose Universally
Trainable Optimal Prediction Intervals Aggregation (UTOPIA), a novel and widely
applicable technique for aggregating multiple prediction intervals to minimize
the average width of the prediction band while guaranteeing coverage. The
method also allows us to directly construct
predictive bands based on elementary basis functions. Our approach is based on
linear or convex programming, which is easy to implement. All of our proposed
methodologies are supported by theoretical guarantees on the coverage
probability and optimal average length, which are detailed in this paper. The
effectiveness of our approach is convincingly demonstrated by applying it to
synthetic data and two real datasets on finance and macroeconomics.
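As a rough illustration of the aggregation idea in the UTOPIA abstract above, the following sketch combines two candidate interval half-widths with nonnegative weights so that every training residual is covered while the mean width is as small as possible. The toy data, the two candidate widths, the function name `aggregate_half_widths`, and the grid search (standing in for a linear-programming solver) are all assumptions made for illustration, not the paper's formulation.

```python
import random

def aggregate_half_widths(residuals, w1, w2, grid=101):
    """Pick nonnegative weights (a1, a2) so that a1*w1[i] + a2*w2[i] covers
    |residuals[i]| for every i, with the smallest mean half-width found on
    the grid. Assumes all entries of w1 and w2 are strictly positive."""
    best = None
    for j in range(grid):
        lam = j / (grid - 1)                      # mixing proportion in [0, 1]
        h = [lam * u + (1 - lam) * v for u, v in zip(w1, w2)]
        # Smallest scale c such that c * h[i] >= |residuals[i]| for all i.
        c = max(abs(r) / hi for r, hi in zip(residuals, h))
        mean_half_width = c * sum(h) / len(h)
        if best is None or mean_half_width < best[0]:
            best = (mean_half_width, c * lam, c * (1 - lam))
    return best

# Hypothetical toy data: noise whose spread grows with x.
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
residuals = [rng.gauss(0.0, 0.2 + x) for x in xs]   # y minus a fitted centre
w1 = [1.0 for _ in xs]                              # candidate 1: constant width
w2 = [0.2 + x for x in xs]                          # candidate 2: width grows with x

mean_half_width, a1, a2 = aggregate_half_widths(residuals, w1, w2)
covered = all(abs(r) <= a1 * u + a2 * v + 1e-9
              for r, u, v in zip(residuals, w1, w2))
print(mean_half_width, a1, a2, covered)
```

Because the grid includes the two endpoints, the aggregated band is never wider on average than the better of the two rescaled candidates on the training data. Coverage here is enforced on the training points only; a guarantee for new data needs the kind of theory the abstract refers to.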
Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
A key challenge of modern machine learning systems is to achieve
Out-of-Distribution (OOD) generalization -- generalizing to target data whose
distribution differs from that of source data. Despite its significant
importance, the fundamental question of "what are the most effective
algorithms for OOD generalization" remains open even under the standard
setting of covariate shift. This paper addresses this fundamental question by
proving that, surprisingly, classical Maximum Likelihood Estimation (MLE)
purely using source data (without any modification) achieves the minimax
optimality for covariate shift under the well-specified setting. That is, no
algorithm performs better than MLE in this setting (up to a constant factor),
justifying the claim that MLE is all you need. Our result holds for a very rich class of
parametric models, and does not require any boundedness condition on the
density ratio. We illustrate the wide applicability of our framework by
instantiating it to three concrete examples -- linear regression, logistic
regression, and phase retrieval. This paper further complements the study by
proving that, under the misspecified setting, MLE is no longer the optimal
choice, whereas the Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax
optimal in certain scenarios.
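To make the covariate-shift setting in the abstract above concrete, the sketch below fits a well-specified linear model by least squares (the MLE under Gaussian noise) on source data only, then evaluates its excess risk under a shifted target covariate distribution. The coefficients, sample size, and the two Gaussian covariate distributions are hypothetical choices, and the example illustrates the setting rather than the paper's minimax result.

```python
import random

def fit_ols(xs, ys):
    """Least-squares fit of y = a + b*x; this is the MLE under Gaussian noise."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def target_excess_risk(a_hat, b_hat, a, b, mu, sigma):
    """Expected squared prediction error beyond the noise level when
    x ~ N(mu, sigma^2): E[(da + db*x)^2] = (da + db*mu)^2 + (db*sigma)^2."""
    da, db = a_hat - a, b_hat - b
    return (da + db * mu) ** 2 + (db * sigma) ** 2

rng = random.Random(0)
a_true, b_true = 1.0, 2.0                      # hypothetical true parameters

# Source covariates are centred at 0; the target is centred at 3.
source_x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
source_y = [a_true + b_true * x + rng.gauss(0.0, 1.0) for x in source_x]

# Fit on source data only, with no reweighting by a density ratio.
a_hat, b_hat = fit_ols(source_x, source_y)
risk = target_excess_risk(a_hat, b_hat, a_true, b_true, mu=3.0, sigma=1.0)
print(a_hat, b_hat, risk)
```

Since the model is well specified, the source-only fit recovers the parameters that also govern the target distribution, so the target excess risk stays small even though few source points fall near the target region.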
Diversified pattern of the human colorectal cancer microbiome
BACKGROUND: The aim of this study is to expand existing knowledge about the CRC-associated microbiome among Han Chinese, and to further discover the variation pattern of the human CRC microbiome across populations. FINDINGS: Using pyrosequencing-based molecular monitoring of the bacterial 16S rRNA gene in eight tumor/normal tissue pairs from eight Chinese CRC patients, we analyzed and characterized the basic features of the CRC-associated microbiome. First, we found increased diversity among tumor-associated bacterial communities. Second, in 50% of Chinese CRC patients, we found a significant increase of Roseburia (P = 0.017) and a concurrent decrease of both Microbacterium (P = 0.009) and Anoxybacillus (P = 0.009) in tumor tissue. CONCLUSIONS: We discovered a novel CRC microbiome pattern in Chinese patients. The over-representation of Roseburia at tumor sites and the over-representation of Microbacterium and Anoxybacillus away from tumor sites were closely related in Chinese CRC patients. Across several populations reported in this study and previously, we observed both common and distinctive patterns in the association of the human CRC microbiome with a high risk of CRC.
Residual Denoising Diffusion Models
We propose residual denoising diffusion models (RDDM), a novel dual diffusion
process that decouples the traditional single denoising diffusion process into
residual diffusion and noise diffusion. This dual diffusion framework expands
the denoising-based diffusion models, initially uninterpretable for image
restoration, into a unified and interpretable model for both image generation
and restoration by introducing residuals. Specifically, our residual diffusion
represents directional diffusion from the target image to the degraded input
image and explicitly guides the reverse generation process for image
restoration, while noise diffusion represents random perturbations in the
diffusion process. The residual prioritizes certainty, while the noise
emphasizes diversity, enabling RDDM to effectively unify tasks with varying
certainty or diversity requirements, such as image generation and restoration.
We demonstrate that our sampling process is consistent with that of DDPM and
DDIM through coefficient transformation, and propose a partially
path-independent generation process to better understand the reverse process.
Notably, our RDDM enables a generic UNet, trained with only an $\ell_1$ loss
and a batch size of 1, to compete with state-of-the-art image restoration
methods. We provide code and pre-trained models to encourage further
exploration, application, and development of our innovative framework
(https://github.com/nachifur/RDDM).
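The split into residual diffusion and noise diffusion described above can be pictured with a toy forward step. The abstract does not state the update rule, so the closed form used here, x_t = x_0 + alpha_bar * (x_in - x_0) + beta_bar * eps, along with the names `forward_sample`, `alpha_bar`, and `beta_bar`, is an assumption for illustration; the linked repository holds the authors' actual implementation.

```python
import random

def forward_sample(target, degraded, alpha_bar, beta_bar, rng):
    """Assumed forward step: move from the target toward the degraded input
    by a fraction alpha_bar of the residual, then add Gaussian noise scaled
    by beta_bar."""
    out = []
    for x0, x_in in zip(target, degraded):
        residual = x_in - x0                 # directional (residual) term
        noise = rng.gauss(0.0, 1.0)          # random perturbation term
        out.append(x0 + alpha_bar * residual + beta_bar * noise)
    return out

rng = random.Random(0)
target = [0.1, 0.5, 0.9, 0.3]        # hypothetical clean "image" (flattened)
degraded = [0.4, 0.2, 0.7, 0.6]      # hypothetical degraded input

start = forward_sample(target, degraded, 0.0, 0.0, rng)        # no diffusion yet
restore_end = forward_sample(target, degraded, 1.0, 0.0, rng)  # residual only
noisy_end = forward_sample(target, degraded, 1.0, 0.5, rng)    # residual + noise
print(start, restore_end, noisy_end)
```

With `beta_bar` set to zero the path from target to degraded input is deterministic, which is one way to read the abstract's remark that the residual carries certainty while the noise carries diversity.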