NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement
The goal of speech enhancement (SE) is to eliminate the background
interference from the noisy speech signal. Generative models such as diffusion
models (DMs) have been applied to SE because of their better generalization to
unseen noisy scenes. Technical routes for DM-based SE methods can be summarized
into three types: task-adapted diffusion process formulation,
generator-plus-conditioner (GPC) structures, and multi-stage
frameworks. We focus on the first two approaches, which are constructed under
the GPC architecture and use the task-adapted diffusion process to better deal
with the real noise. However, the performance of these SE models is limited by
the following issues: (a) Non-Gaussian noise estimation in the task-adapted
diffusion process. (b) Conditional domain bias caused by the weak conditioner
design in the GPC structure. (c) Large amount of residual noise caused by
unreasonable interpolation operations during inference. To solve the above
problems, we propose a noise-aware diffusion-based SE model (NADiffuSE) to
boost SE performance, where a noise representation is extracted from the
noisy speech signal and introduced as global conditional information for
estimating the non-Gaussian components. Furthermore, an anchor-based inference
algorithm is employed to achieve a compromise between speech distortion and
noise residual. In order to mitigate the performance degradation caused by the
conditional domain bias in the GPC framework, we investigate three model
variants, all of which can be viewed as multi-stage SE based on the
preprocessing networks for Mel spectrograms. Experimental results show that
NADiffuSE outperforms other DM-based SE models under the GPC infrastructure.
Audio samples are available at: https://square-of-w.github.io/NADiffuSE-demo/
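The task-adapted diffusion process mentioned above is typically formulated so that the forward-process mean drifts from the clean signal toward the noisy one. The abstract does not give the exact equations, so the following is only an illustrative numpy sketch of one common conditional-diffusion formulation; the function and variable names (x0, y, alpha_bar_t, m_t) are assumptions, not the paper's notation.

```python
import numpy as np

def task_adapted_forward(x0, y, alpha_bar_t, m_t, rng):
    """Illustrative task-adapted forward diffusion step: the mean
    interpolates from clean speech x0 toward noisy speech y as the
    schedule m_t goes 0 -> 1, with Gaussian noise scaled by the usual
    sqrt(1 - alpha_bar_t) factor. Names are hypothetical."""
    eps = rng.standard_normal(x0.shape)
    mean = (1.0 - m_t) * x0 + m_t * y
    return np.sqrt(alpha_bar_t) * mean + np.sqrt(1.0 - alpha_bar_t) * eps
```

Under this reading, the diffused sample contains a growing non-Gaussian component (the real noise in y), which is what motivates conditioning the denoiser on a noise representation.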
Model and Data Agreement for Learning with Noisy Labels
Learning with noisy labels is a vital topic for practical deep learning as
models should be robust to noisy open-world datasets in the wild. The
state-of-the-art noisy label learning approach JoCoR fails when faced with a
large ratio of noisy labels. Moreover, selecting small-loss samples can also
cause error accumulation: once noisy samples are mistakenly selected as
small-loss samples, they are more likely to be selected again. In this paper,
we try to deal with error accumulation in noisy label learning from both model
and data perspectives. We introduce mean point ensemble to utilize a more
robust loss function and more information from unselected samples to reduce
error accumulation from the model perspective. Furthermore, since flipped images
have the same semantic meaning as the original images, we select small-loss
samples according to the loss values of the flipped images instead of the original
ones to reduce error accumulation from the data perspective. Extensive
experiments on CIFAR-10, CIFAR-100, and large-scale Clothing1M show that our
method outperforms state-of-the-art noisy label learning methods with different
levels of label noise. Our method can also be seamlessly combined with other
noisy label learning methods to further improve their performance and
generalize well to other tasks. The code is available in
https://github.com/zyh-uaiaaaa/MDA-noisy-label-learning.
Comment: Accepted by AAAI 2023 Workshop
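The data-perspective idea above — ranking samples by the loss of their flipped versions and keeping the smallest-loss fraction — can be sketched in a few lines. This is a minimal numpy illustration, not the released implementation; keep_ratio and the loss values are assumed inputs (losses would come from the model evaluated on horizontally flipped images).

```python
import numpy as np

def select_small_loss(losses_on_flipped, keep_ratio):
    """Return indices of the keep_ratio fraction of samples with the
    smallest loss, where each loss was computed on the flipped image
    rather than the original one (illustrative sketch)."""
    n_keep = int(len(losses_on_flipped) * keep_ratio)
    return np.argsort(losses_on_flipped)[:n_keep]
```

Because the selection signal comes from a semantically equivalent but different view, a noisy sample that happened to get a small loss on the original image is less likely to be re-selected round after round.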
Gradient Attention Balance Network: Mitigating Face Recognition Racial Bias via Gradient Attention
Although face recognition has made impressive progress in recent years, the
racial bias of recognition systems is often ignored in the pursuit of high
accuracy. Previous work found that for different races, face recognition
networks focus on different facial regions, and the sensitive regions of
darker-skinned people are much smaller. Based on this discovery, we propose a
new de-bias method based on gradient attention, called Gradient Attention
Balance Network (GABN). Specifically, we use the gradient attention map (GAM)
of the face recognition network to track the sensitive facial regions and make
the GAMs of different races tend to be consistent through adversarial learning.
This method mitigates the bias by making the network focus on similar facial
regions. In addition, we also use masks to erase the Top-N sensitive facial
regions, forcing the network to allocate its attention to a larger facial
region. This method expands the sensitive region of darker-skinned people and
further reduces the gap between the GAMs of darker-skinned people and those of
Caucasians. Extensive experiments show that GABN successfully mitigates racial
bias in face recognition and learns more balanced performance for people of
different races.
Comment: Accepted by CVPR 2023 Workshop
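The gradient attention map (GAM) used to track sensitive facial regions is described only at a high level; a Grad-CAM-style construction (channel weights from spatially averaged gradients, a weighted channel sum, then ReLU) is a plausible reading and is sketched below in numpy. The exact GAM definition in the paper may differ.

```python
import numpy as np

def gradient_attention_map(features, grads):
    """Grad-CAM-style attention sketch. features and grads both have
    shape (C, H, W): per-channel weights are the spatial mean of the
    gradients, the map is the weighted sum over channels, rectified."""
    weights = grads.mean(axis=(1, 2))               # (C,)
    gam = np.einsum('c,chw->hw', weights, features)  # (H, W)
    return np.maximum(gam, 0.0)                      # keep positive evidence
```

With maps of this form, "making the GAMs of different races tend to be consistent" amounts to penalizing the discrepancy between such maps across demographic groups during adversarial training.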
Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation
Although recent works on knowledge distillation (KD) have achieved further
improvements by elaborately modeling the decision boundary as posterior
knowledge, their performance still depends on the hypothesis that the target
network has a powerful capacity (representation ability). In this paper, we
propose a knowledge representing (KR) framework that mainly focuses on modeling
the parameter distribution as prior knowledge. First, we suggest a knowledge
aggregation scheme to answer how to represent the prior knowledge from the
teacher network. By aggregating the parameter distributions of the teacher
network at a more abstract level, the scheme alleviates the phenomenon of
residual accumulation in the deeper layers. Second, to address the critical
issue of which prior knowledge matters most for distillation, we design a
sparse recoding penalty that constrains the student network to learn with
penalized gradients. With the proposed penalty, the
student network can effectively avoid over-regularization during knowledge
distillation and converge faster. Quantitative experiments show that the
proposed framework achieves state-of-the-art performance even when the target
network does not have the expected capacity. Moreover, the framework is
flexible enough to combine with other KD methods based on posterior
knowledge.
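The abstract does not give the sparse recoding penalty's exact form. Purely as an illustration of how a sparsity-inducing term can be added to a distillation objective, the sketch below uses a plain L1 penalty on student features; the name, the L1 choice, and the weighting are all assumptions, not the paper's definition.

```python
import numpy as np

def sparse_recoding_penalty(student_feats, lam=1e-3):
    """Illustrative L1 sparsity term added to the distillation loss;
    gradients through this term shrink small student responses toward
    zero. The paper's actual penalty may take a different form."""
    return lam * np.abs(student_feats).sum()
```

In training, such a term would be summed with the usual distillation loss, so its gradient penalizes dense, low-magnitude activations without requiring extra teacher supervision.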
Improving Autism Spectrum Disorder Prediction by Fusion of Multiple Measures of Resting-State Functional MRI Data
Autism spectrum disorder (ASD) is a lifelong neurodevelopmental condition characterized by impairments in social communication, language, and behavior. Leveraging deep learning to automatically predict ASD has attracted increasing attention in the medical and machine learning communities. However, how to select effective measure signals for deep learning prediction is still a challenging problem. In this paper, we studied two kinds of measure signals, i.e., regional homogeneity (ReHo) and Craddock 200 (CC200), both of which represent homogeneous functional activity, in a deep learning framework, and designed a new mechanism to effectively fuse them for deep learning based ASD prediction. Extensive experiments on the ABIDE dataset provide empirical evidence for the effectiveness of our method. In particular, we obtained 79% accuracy by effectively fusing these two kinds of signals, much better than any single-measure model (ReHo SM-model: ∼69%; CC200 SM-model: ∼70%). These results suggest that leveraging multiple measure signals together is effective for ASD prediction.
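The fusion mechanism itself is not specified in the abstract. One simple way to joint two measures for a shared classifier is late fusion by concatenating per-subject embeddings from each measure branch; the sketch below shows only that step, with hypothetical names, and is not the paper's actual mechanism.

```python
import numpy as np

def fuse_measures(reho_embed, cc200_embed):
    """Late-fusion sketch: concatenate per-subject embeddings produced
    by the ReHo and CC200 branches along the feature axis, so a joint
    classifier head can see both measures. Names are illustrative."""
    return np.concatenate([reho_embed, cc200_embed], axis=-1)
```

A joint head trained on the concatenated vector can then weight whichever measure is more informative per subject, which is one plausible route to the multi-measure gain reported above.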
Study of the Hypoglycemic Activity of Derivatives of Isoflavones from Cicer arietinum L.
The chickpea, a food and medicine used by the people of Xinjiang, has a beneficial hypoglycemic effect. To better utilize this national resource and develop hypoglycemic agents from components of the chickpea, a series of new derivatives of isoflavone compounds from the chickpea were synthesized. An insulin-resistant (IR) HepG2 cell model was used to screen the hypoglycemic activities of these compounds, and their structure-activity relationships were explored. Additionally, several combinations of these compounds displayed higher hypoglycemic activity than any single compound, with activity similar to that of the positive control group (p>0.05). In addition, combination 3 and combination 6 exerted different effects on the insulin sensitivity of H4IIE cells stimulated with resistin, and the results indicated that combination 3 would have higher hypoglycemic activity. These findings demonstrate the multi-component, multi-target characteristics of Chinese herbal medicine and may provide new ideas for the development of hypoglycemic drugs.