Towards Accelerated Model Training via Bayesian Data Selection
Mislabeled, duplicated, or biased data in real-world scenarios can lead to
prolonged training and even hinder model convergence. Traditional solutions
prioritizing easy or hard samples lack the flexibility to handle such varied issues
simultaneously. Recent work has proposed a more reasonable data selection
principle by examining the data's impact on the model's generalization loss.
However, its practical adoption relies on less principled approximations and
additional clean holdout data. This work solves these problems by leveraging a
lightweight Bayesian treatment and incorporating off-the-shelf zero-shot
predictors built on large-scale pre-trained models. The resulting algorithm is
efficient and easy to implement. We perform extensive empirical studies on
challenging benchmarks with considerable data noise and imbalance in the online
batch selection scenario, and observe superior training efficiency over
competitive baselines. Notably, on the challenging WebVision benchmark, our
method can achieve similar predictive performance with significantly fewer
training iterations than leading data selection methods.
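The online batch selection setting referred to above can be sketched in a few lines. The code below is a minimal illustration, not the paper's algorithm: the per-sample score is a plain high-loss heuristic standing in for the paper's Bayesian estimate of each sample's impact on generalization loss, and `select_and_train_step` is a hypothetical helper name.

```python
# Minimal sketch of online batch selection: score a large candidate batch,
# keep the top fraction, and take a gradient step on the kept samples only.
import torch
import torch.nn.functional as F

def select_and_train_step(model, optimizer, xb, yb, keep_ratio=0.1):
    """Score a candidate batch, keep the top-k samples, train on them."""
    model.eval()
    with torch.no_grad():
        # Hypothetical per-sample score: plain loss. The paper instead uses
        # a lightweight Bayesian estimate of the impact on generalization
        # loss, aided by off-the-shelf zero-shot predictors.
        scores = F.cross_entropy(model(xb), yb, reduction="none")
        k = max(1, int(keep_ratio * xb.size(0)))
        idx = scores.topk(k).indices

    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(xb[idx]), yb[idx])
    loss.backward()
    optimizer.step()
    return loss.item()
```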
Heat Transfer in Conduction: Report on Heat Sink Design
The purpose of this project is to design a heat sink from limited given information and to verify that it meets certain
requirements. The design and optimization process is carried out first; 2-D analytical and 2-D numerical solutions are then generated
and used to check the result. A flow simulation is also performed using SOLIDWORKS. Finally, the results are compared and the
differences between them are analyzed.
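As a rough illustration of the kind of 2-D numerical conduction solution described above, the sketch below solves steady-state conduction on a square plate with a finite-difference Jacobi iteration. The grid size, boundary temperatures, and convergence tolerance are illustrative assumptions, not values from the report.

```python
# Steady-state 2-D heat conduction (Laplace equation) on a square plate,
# solved with Jacobi iteration on a uniform finite-difference grid.
import numpy as np

def solve_laplace_2d(n=50, t_base=100.0, t_ambient=25.0, tol=1e-5):
    """Iterate the 5-point stencil until the temperature field converges."""
    T = np.full((n, n), t_ambient)
    T[0, :] = t_base  # heated base edge; remaining edges held at ambient
    while True:
        T_new = T.copy()
        # Each interior node relaxes to the average of its four neighbors.
        T_new[1:-1, 1:-1] = 0.25 * (
            T[:-2, 1:-1] + T[2:, 1:-1] + T[1:-1, :-2] + T[1:-1, 2:]
        )
        if np.max(np.abs(T_new - T)) < tol:
            return T_new
        T = T_new

T = solve_laplace_2d()
print(f"Max interior temperature: {T[1:-1, 1:-1].max():.2f} °C")
```

In a heat sink study, a grid solution like this serves as the numerical cross-check against the 2-D analytical solution before moving to a full flow simulation.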
Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models
We propose the first unsupervised and learning-based method to identify
interpretable directions in h-space of pre-trained diffusion models. Our method
is derived from an existing technique that operates on the GAN latent space.
Specifically, we employ a shift control module that works on h-space of
pre-trained diffusion models to manipulate a sample into a shifted version of
itself, followed by a reconstructor to reproduce both the type and the strength
of the manipulation. By jointly optimizing them, the model will spontaneously
discover disentangled and interpretable directions. To prevent the discovery of
meaningless and destructive directions, we employ a discriminator to maintain
the fidelity of the shifted sample. Due to the iterative generative process of
diffusion models, our training requires a substantial amount of GPU VRAM to
store numerous intermediate tensors for back-propagating gradients. To address
this issue, we propose a general VRAM-efficient training algorithm based on
the gradient checkpointing technique that back-propagates any gradient through
the whole generative process, with acceptable VRAM occupancy and an acceptable
sacrifice of training efficiency. Compared with existing work on diffusion models,
our method inherently identifies global and scalable directions, without
necessitating any other complicated procedures. Extensive experiments on
various datasets demonstrate the effectiveness of our method.
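The VRAM-saving idea can be illustrated with the generic gradient checkpointing machinery it builds on. The sketch below is an illustration under stated assumptions, not the paper's algorithm: `denoise_step` is a hypothetical stand-in for one step of a pre-trained diffusion sampler, and `torch.utils.checkpoint` recomputes each step's activations during the backward pass instead of storing them.

```python
# Back-propagating through an iterative generative process with gradient
# checkpointing: intermediate activations of each step are discarded after
# the forward pass and recomputed when gradients flow back, trading a second
# forward pass for a large reduction in stored tensors.
import torch
from torch.utils.checkpoint import checkpoint

def sample_with_checkpointing(denoise_step, x_T, num_steps):
    """Run the sampler; each step is checkpointed rather than stored."""
    x = x_T
    for t in reversed(range(num_steps)):
        t_tensor = torch.tensor(float(t))
        # checkpoint() wraps one sampler step; its internal activations are
        # recomputed on the backward pass instead of occupying VRAM.
        x = checkpoint(denoise_step, x, t_tensor, use_reentrant=False)
    return x
```

Checkpointing every step makes the stored-activation cost roughly constant in the number of sampling steps, at the price of one extra forward computation per step, which matches the abstract's trade-off of VRAM for training efficiency.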