NCL++: Nested Collaborative Learning for Long-Tailed Visual Recognition
Long-tailed visual recognition has received increasing attention in recent
years. Due to the extremely imbalanced data distribution in long-tailed
learning, the learning process exhibits great uncertainty. For example, the
predictions of different experts on the same image vary remarkably despite
identical training settings. To alleviate this uncertainty, we propose Nested
Collaborative Learning (NCL++), which tackles the long-tailed learning problem
through collaborative learning. Specifically, the collaborative learning
consists of two parts, namely inter-expert collaborative learning (InterCL) and
intra-expert collaborative learning (IntraCL). InterCL trains multiple experts
collaboratively and concurrently, aiming to transfer knowledge among different
experts. IntraCL is similar to InterCL, but conducts the collaborative learning
on multiple augmented copies of the same image within a single expert. To
achieve collaborative learning in the long-tailed setting, balanced online
distillation is proposed to enforce consistent predictions among the different
experts and augmented copies, which reduces the learning uncertainty. Moreover,
to improve the fine-grained distinguishing ability on confusing categories, we
further propose Hard Category Mining (HCM), which selects the negative
categories with high predicted scores as hard categories. The collaborative
learning is then formulated in a nested way, in which learning is conducted not
only on all categories from a full perspective but also on the hard categories
from a partial perspective. Extensive experiments demonstrate the superiority
of our method, which outperforms the state of the art whether using a single
model or an ensemble. The code will be publicly released.
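As a rough illustration of the mechanism sketched in this abstract, the following Python/PyTorch snippet shows one way balanced online distillation with Hard Category Mining could be written. It is a hedged sketch, not the released NCL++ code: the logit balancing by class priors, the KL-based distillation loss, and the top-k selection of hard categories are assumptions made here for concreteness.

    # Hedged sketch of collaborative distillation with hard-category mining;
    # not the authors' implementation.
    import torch
    import torch.nn.functional as F

    def balanced_logits(logits, class_prior):
        # Assumed balancing: shift logits by the log class prior so that the
        # softmax is less dominated by head classes.
        return logits + torch.log(class_prior + 1e-12)

    def collaborative_loss(expert_logits, class_prior, k_hard=5):
        """expert_logits: list of [B, C] logits, one per expert or augmented copy."""
        probs = [F.softmax(balanced_logits(z, class_prior), dim=1) for z in expert_logits]
        target = torch.stack(probs).mean(dim=0).detach()      # ensemble as soft target
        full = sum(F.kl_div(p.log(), target, reduction="batchmean") for p in probs)

        # Hard Category Mining (simplified): keep the k categories with the highest
        # ensemble score and distill on that partial view as well.
        hard_idx = target.topk(k_hard, dim=1).indices
        partial = 0.0
        for p in probs:
            p_hard = torch.gather(p, 1, hard_idx)
            t_hard = torch.gather(target, 1, hard_idx)
            p_hard = p_hard / p_hard.sum(dim=1, keepdim=True)
            t_hard = t_hard / t_hard.sum(dim=1, keepdim=True)
            partial = partial + F.kl_div(p_hard.log(), t_hard, reduction="batchmean")
        return full + partial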
Flew Over Learning Trap: Learn Unlearnable Samples by Progressive Staged Training
Unlearning techniques have been proposed to prevent third parties from
exploiting unauthorized data: they generate unlearnable samples by adding
imperceptible perturbations to data before public release. These unlearnable
samples effectively misguide model training into learning perturbation features
while ignoring image semantic features. We conduct an in-depth analysis and
observe that models can learn both image features and perturbation features of
unlearnable samples at an early stage, but quickly enter an overfitting stage
because the shallow layers tend to overfit to the perturbation features. Based
on these observations, we propose Progressive Staged Training to effectively
prevent models from overfitting to the perturbation features. We evaluate our
method on multiple model architectures over diverse datasets, e.g., CIFAR-10,
CIFAR-100, and ImageNet-mini. Our method circumvents the unlearnability of all
state-of-the-art methods in the literature and provides a reliable baseline for
further evaluation of unlearnable techniques.
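The abstract does not spell out the staging schedule, so the snippet below is only one plausible reading, offered as a hedged sketch rather than the paper's algorithm: training proceeds in stages, and the shallow blocks named by the caller are frozen before later stages so they cannot keep overfitting to perturbation features. The stage boundaries, the freezing choice, and the optimizer settings are assumptions.

    # Illustrative staged-training loop; not the paper's implementation.
    import torch

    def set_requires_grad(module, flag):
        for p in module.parameters():
            p.requires_grad = flag

    def staged_training(model, stages, train_one_epoch):
        """stages: list of (num_epochs, modules_to_freeze_before_this_stage)."""
        for num_epochs, frozen_modules in stages:
            for m in frozen_modules:
                set_requires_grad(m, False)          # lock shallow blocks already trained
            optimizer = torch.optim.SGD(
                (p for p in model.parameters() if p.requires_grad),
                lr=0.1, momentum=0.9,
            )
            for _ in range(num_epochs):
                train_one_epoch(model, optimizer)    # user-supplied standard epoch loop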
Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation
Diffusion models have demonstrated remarkable performance in image generation
tasks, paving the way for powerful AIGC applications. However, these
widely used generative models can also raise security and privacy concerns,
such as copyright infringement and sensitive data leakage. To tackle these
issues, we propose a method, Unlearnable Diffusion Perturbation, to safeguard
images from unauthorized exploitation. Our approach involves designing an
algorithm to generate sample-wise perturbation noise for each image to be
protected. This imperceptible protective noise makes the data almost
unlearnable for diffusion models, i.e., diffusion models trained or fine-tuned
on the protected data cannot generate high-quality and diverse images related
to the protected training data. Theoretically, we frame this as a max-min
optimization problem and introduce EUDP, a noise scheduler-based method to
enhance the effectiveness of the protective noise. We evaluate our methods on
both Denoising Diffusion Probabilistic Model and Latent Diffusion Models,
demonstrating that training diffusion models on the protected data leads to a
significant reduction in the quality of the generated images. In particular,
the experimental results on Stable Diffusion demonstrate that our method
effectively safeguards images from being used to train diffusion models on
various tasks, such as learning specific objects and styles. This achievement
holds significant importance in real-world scenarios, as it contributes to the
protection of privacy and copyright against AI-generated content.
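To make the max-min formulation mentioned above concrete, here is a hedged sketch of one alternating-update approximation: an inner step lets a surrogate diffusion model minimize its denoising loss on the perturbed images, and an outer projected-gradient step increases that loss with respect to a bounded per-image perturbation. The surrogate model, the loss callable, the L-infinity budget, and the step sizes are assumptions, and the noise-scheduler weighting of EUDP is omitted.

    # Hedged sketch of an alternating max-min update for protective noise.
    import torch

    def protect_images(images, model, diffusion_loss, steps=50, eps=8 / 255, alpha=1 / 255):
        """images: [N, C, H, W] in [0, 1]; diffusion_loss(model, x) -> scalar loss."""
        delta = torch.zeros_like(images, requires_grad=True)
        opt_theta = torch.optim.Adam(model.parameters(), lr=1e-4)
        for _ in range(steps):
            # Inner minimization: let the surrogate model adapt to the current noise.
            loss = diffusion_loss(model, (images + delta).clamp(0, 1))
            opt_theta.zero_grad()
            loss.backward()
            opt_theta.step()

            # Outer maximization: push the perturbation toward higher denoising loss,
            # then project back into the L-infinity ball of radius eps.
            loss = diffusion_loss(model, (images + delta).clamp(0, 1))
            grad = torch.autograd.grad(loss, delta)[0]
            with torch.no_grad():
                delta += alpha * grad.sign()
                delta.clamp_(-eps, eps)
        return (images + delta).clamp(0, 1).detach()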
Rethinking the Open-Loop Evaluation of End-to-End Autonomous Driving in nuScenes
Modern autonomous driving systems are typically divided into three main
tasks: perception, prediction, and planning. The planning task involves
predicting the trajectory of the ego vehicle based on inputs from both internal
intention and the external environment, and manipulating the vehicle
accordingly. Most existing works evaluate their performance on the nuScenes
dataset using the L2 error and collision rate between the predicted
trajectories and the ground truth. In this paper, we reevaluate these existing
evaluation metrics and explore whether they accurately measure the superiority
of different methods. Specifically, we design an MLP-based method that takes
raw sensor data (e.g., past trajectory, velocity, etc.) as input and directly
outputs the future trajectory of the ego vehicle, without using any perception
or prediction information such as camera images or LiDAR. Our simple method
achieves end-to-end planning performance on the nuScenes dataset comparable to
that of other perception-based methods, reducing the average L2 error by about
20%. Meanwhile, the perception-based methods retain an advantage in terms of
collision rate. We further conduct an in-depth analysis and provide new
insights into the factors that are critical for the success of the planning
task on the nuScenes dataset. Our observations also indicate that the current
open-loop evaluation scheme of end-to-end autonomous driving in nuScenes needs
to be rethought. Code is available at https://github.com/E2E-AD/AD-MLP.
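For readers who want a concrete picture of the baseline described above, the snippet below sketches an MLP that maps past ego states to a future trajectory and an average L2 metric. The feature layout, horizon lengths, and layer sizes are assumptions, not the configuration released at https://github.com/E2E-AD/AD-MLP.

    # Hedged sketch of an MLP planning baseline and its L2 metric.
    import torch
    import torch.nn as nn

    class EgoMLP(nn.Module):
        def __init__(self, past_steps=4, state_dim=5, future_steps=6, hidden=512):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(past_steps * state_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, future_steps * 2),        # (x, y) per future step
            )
            self.future_steps = future_steps

        def forward(self, past_states):                     # [B, past_steps, state_dim]
            out = self.net(past_states.flatten(1))
            return out.view(-1, self.future_steps, 2)       # predicted ego trajectory

    def average_l2_error(pred, gt):
        # Mean Euclidean distance between predicted and ground-truth waypoints.
        return (pred - gt).norm(dim=-1).mean()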
Online identification of lithium-ion battery model parameters with initial value uncertainty and measurement noise
Online parameter identification is essential for the accuracy of the battery equivalent circuit model (ECM). The traditional recursive least squares (RLS) method is easily biased by noise disturbances from sensors, which degrades the modeling accuracy in practice. Meanwhile, the recursive total least squares (RTLS) method can deal with the noise interference, but its parameters converge slowly to the reference values under initial value uncertainty. To alleviate these issues, this paper proposes a co-estimation framework that exploits the complementary advantages of RLS and RTLS for higher parameter identification performance of the battery ECM. RLS converges quickly by updating the parameters along the gradient of the cost function, and RTLS is applied to attenuate the noise effect once the parameters have converged. Both simulation and experimental results prove that the proposed method offers good accuracy, a fast convergence rate, and robustness against noise corruption.
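As a concrete reference for the RLS half of the co-estimation framework described above, the snippet below implements a standard recursive least squares update with a forgetting factor; the regressor construction from voltage and current samples, the forgetting factor value, and the switch-over criterion to RTLS are assumptions and are not taken from the paper.

    # Standard recursive least squares with forgetting factor (illustrative).
    import numpy as np

    class RecursiveLeastSquares:
        def __init__(self, n_params, forgetting=0.99):
            self.theta = np.zeros(n_params)            # parameter estimate
            self.P = np.eye(n_params) * 1e3            # covariance (large: uncertain init)
            self.lam = forgetting

        def update(self, phi, y):
            """phi: regressor vector [n_params]; y: measured output (scalar)."""
            phi = np.asarray(phi, dtype=float)
            k = self.P @ phi / (self.lam + phi @ self.P @ phi)   # gain vector
            err = y - phi @ self.theta                           # prediction error
            self.theta = self.theta + k * err
            self.P = (self.P - np.outer(k, phi @ self.P)) / self.lam
            return self.theta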
A Survey of Large Language Models
Language is essentially a complex, intricate system of human expressions
governed by grammatical rules. It poses a significant challenge to develop
capable AI algorithms for comprehending and grasping a language. As a major
approach, language modeling has been widely studied for language understanding
and generation in the past two decades, evolving from statistical language
models to neural language models. Recently, pre-trained language models (PLMs)
have been proposed by pre-training Transformer models over large-scale corpora,
showing strong capabilities in solving various NLP tasks. Since researchers
have found that model scaling can lead to performance improvement, they further
study the scaling effect by increasing the model size to an even larger size.
Interestingly, when the parameter scale exceeds a certain level, these enlarged
language models not only achieve a significant performance improvement but also
show some special abilities that are not present in small-scale language
models. To discriminate the difference in parameter scale, the research
community has coined the term large language models (LLMs) for PLMs of
significant size. Recently, research on LLMs has been advanced substantially by
both academia and industry, and a remarkable milestone is the launch of
ChatGPT, which has attracted widespread attention from society. The technical
evolution of LLMs has been making an important impact on the entire AI
community and would revolutionize the way we develop and use AI algorithms. In this
survey, we review the recent advances of LLMs by introducing the background,
key findings, and mainstream techniques. In particular, we focus on four major
aspects of LLMs, namely pre-training, adaptation tuning, utilization, and
capacity evaluation. Besides, we also summarize the available resources for
developing LLMs and discuss the remaining issues for future directions.
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
A core-collapse supernova (CCSN) is one of the most energetic astrophysical
events in the Universe. The early and prompt detection of neutrinos before
(pre-SN) and during the SN burst is a unique opportunity to realize the
multi-messenger observation of the CCSN events. In this work, we describe the
monitoring concept and present the sensitivity of the system to the pre-SN and
SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is
a 20 kton liquid scintillator detector under construction in South China. The
real-time monitoring system is designed with both the prompt monitors on the
electronic board and online monitors at the data acquisition stage, in order to
ensure both the alert speed and alert coverage of progenitor stars. By assuming
a false alert rate of 1 per year, this monitoring system can be sensitive to
the pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and to SN
neutrinos up to about 370 (360) kpc for a progenitor mass of 30 solar masses in
the case of normal (inverted) mass ordering. The pointing ability for the CCSN
is evaluated by using the accumulated event anisotropy of the inverse beta
decay interactions from pre-SN or SN neutrinos, which, along with the early
alert, can play an important role in the follow-up multi-messenger observations
of the next Galactic or nearby extragalactic CCSN.
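The pointing idea mentioned above, accumulating the anisotropy of inverse beta decay events, can be illustrated with a simple estimator: the delayed neutron capture is on average slightly displaced from the prompt positron along the neutrino direction, so averaging the per-event displacement unit vectors yields a direction estimate. The snippet below is a hedged illustration under that assumption, not JUNO's reconstruction code.

    # Illustrative pointing estimate from IBD event anisotropy.
    import numpy as np

    def point_to_supernova(positron_vertices, neutron_vertices):
        """Both inputs: [N, 3] reconstructed vertices in detector coordinates."""
        d = neutron_vertices - positron_vertices
        d = d / np.linalg.norm(d, axis=1, keepdims=True)    # unit displacement vectors
        mean_dir = d.mean(axis=0)
        return mean_dir / np.linalg.norm(mean_dir)           # estimated SN direction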
Sensorless Temperature Estimation of Lithium-Ion Battery Based on Broadband Impedance Measurements
Temperature monitoring is of paramount importance for guaranteeing the safety and proper operation of lithium-ion batteries. Traditional temperature sensors suffer from heat transfer delay, and the internal battery temperature cannot be measured directly. Motivated by this, this letter proposes a novel sensorless temperature estimation method based on broadband impedance spectroscopy. A pseudorandom sequence (PRS) with finite signal levels is utilized for the impedance measurements. The measured impedance information is merged into an impedance-temperature model, which cooperates with a specially designed least-squares method for temperature estimation. The proposed framework is robust against interference while remaining simple enough for online implementation. Experimental results suggest excellent estimation accuracy of the proposed method under different circumstances.
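To illustrate the final estimation step described above, the snippet below fits an impedance-temperature model by least squares and inverts it to recover temperature from a new impedance sample. The Arrhenius-style model form ln|Z| = a + b/T and the single-frequency reduction are assumptions made here for brevity, not the letter's actual model or PRS processing.

    # Hedged sketch: least-squares impedance-temperature model and its inversion.
    import numpy as np

    def fit_impedance_temperature_model(z_mag, temps_kelvin):
        """Fit ln|Z| = a + b / T to calibration data by ordinary least squares."""
        A = np.column_stack([np.ones_like(temps_kelvin), 1.0 / temps_kelvin])
        coeffs, *_ = np.linalg.lstsq(A, np.log(z_mag), rcond=None)
        return coeffs                                     # (a, b)

    def estimate_temperature(z_mag_measured, coeffs):
        """Invert the fitted model to recover temperature from a new |Z| sample."""
        a, b = coeffs
        return b / (np.log(z_mag_measured) - a)

    # Example: synthetic calibration points, then estimation at an unseen |Z|.
    temps = np.array([283.0, 293.0, 303.0, 313.0])        # K
    z_cal = np.exp(-2.0 + 900.0 / temps)                  # synthetic |Z| in ohms
    coeffs = fit_impedance_temperature_model(z_cal, temps)
    print(estimate_temperature(np.exp(-2.0 + 900.0 / 298.0), coeffs))  # approx. 298 K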