14 research outputs found

    NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

    Full text link
    The goal of speech enhancement (SE) is to remove background interference from a noisy speech signal. Generative models such as diffusion models (DMs) have been applied to SE because of their better generalization to unseen noisy scenes. Technical routes for DM-based SE methods can be summarized into three types: task-adapted diffusion process formulations, generator-plus-conditioner (GPC) structures, and multi-stage frameworks. We focus on the first two approaches, which are constructed under the GPC architecture and use the task-adapted diffusion process to better deal with real noise. However, the performance of these SE models is limited by the following issues: (a) non-Gaussian noise estimation in the task-adapted diffusion process; (b) conditional domain bias caused by the weak conditioner design in the GPC structure; (c) a large amount of residual noise caused by unreasonable interpolation operations during inference. To solve these problems, we propose a noise-aware diffusion-based SE model (NADiffuSE) to boost SE performance, where a noise representation is extracted from the noisy speech signal and introduced as global conditional information for estimating the non-Gaussian components. Furthermore, an anchor-based inference algorithm is employed to strike a compromise between speech distortion and residual noise. To mitigate the performance degradation caused by the conditional domain bias in the GPC framework, we investigate three model variants, all of which can be viewed as multi-stage SE based on preprocessing networks for Mel spectrograms. Experimental results show that NADiffuSE outperforms other DM-based SE models under the GPC infrastructure. Audio samples are available at: https://square-of-w.github.io/NADiffuSE-demo/
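
    As a concrete illustration of the two ideas named above, here is a minimal PyTorch sketch of (a) a global noise embedding extracted from the noisy input and injected as a condition, and (b) anchor-based inference that starts the reverse process from an interpolation between the noisy signal and Gaussian noise. The module names (NoiseEncoder, Denoiser), the FiLM-style conditioning, the shapes, and the crude reverse step are all illustrative assumptions, not the authors' architecture.

    import torch
    import torch.nn as nn

    class NoiseEncoder(nn.Module):
        """Pools a noisy Mel spectrogram into one global noise embedding."""
        def __init__(self, n_mels=80, d=128):
            super().__init__()
            self.proj = nn.Sequential(nn.Linear(n_mels, d), nn.ReLU(), nn.Linear(d, d))

        def forward(self, noisy_mel):                 # (B, T, n_mels)
            return self.proj(noisy_mel).mean(dim=1)   # (B, d) global condition

    class Denoiser(nn.Module):
        """Toy denoiser: predicts the clean Mel from x_t, the timestep and the conditions."""
        def __init__(self, n_mels=80, d=128):
            super().__init__()
            self.film = nn.Linear(d, 2 * n_mels)      # FiLM-style use of the noise embedding
            self.net = nn.Sequential(nn.Linear(2 * n_mels + 1, 256), nn.ReLU(),
                                     nn.Linear(256, n_mels))

        def forward(self, x_t, t, noisy_mel, z_noise):            # z_noise: (B, d)
            scale, shift = self.film(z_noise).chunk(2, dim=-1)
            x_t = x_t * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
            t_feat = t.view(-1, 1, 1).expand(-1, x_t.size(1), 1).float()
            return self.net(torch.cat([x_t, noisy_mel, t_feat], dim=-1))

    @torch.no_grad()
    def anchor_inference(denoiser, encoder, noisy_mel, alphas_bar, anchor=0.5):
        """Start the reverse process from a mix of the noisy input and Gaussian noise
        (the anchor), trading speech distortion against residual noise."""
        z = encoder(noisy_mel)
        x = anchor * noisy_mel + (1 - anchor) * torch.randn_like(noisy_mel)
        for t in reversed(range(len(alphas_bar))):
            t_b = torch.full((noisy_mel.size(0),), t, device=noisy_mel.device)
            x0_hat = denoiser(x, t_b, noisy_mel, z)
            a = alphas_bar[t]
            x = a.sqrt() * x0_hat + (1 - a).sqrt() * torch.randn_like(x)  # crude re-noising step
        return x0_hat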

    Model and Data Agreement for Learning with Noisy Labels

    Full text link
    Learning with noisy labels is a vital topic for practical deep learning, as models should be robust to noisy open-world datasets in the wild. The state-of-the-art noisy label learning approach JoCoR fails when faced with a large ratio of noisy labels. Moreover, selecting small-loss samples can also cause error accumulation: once noisy samples are mistakenly selected as small-loss samples, they are more likely to be selected again. In this paper, we address error accumulation in noisy label learning from both the model and data perspectives. From the model perspective, we introduce a mean point ensemble to use a more robust loss function and exploit information from unselected samples, reducing error accumulation. From the data perspective, since flipped images have the same semantic meaning as the original images, we select small-loss samples according to the loss values of the flipped images instead of the originals, further reducing error accumulation. Extensive experiments on CIFAR-10, CIFAR-100, and the large-scale Clothing1M dataset show that our method outperforms state-of-the-art noisy label learning methods at different levels of label noise. Our method can also be seamlessly combined with other noisy label learning methods to further improve their performance, and it generalizes well to other tasks. The code is available at https://github.com/zyh-uaiaaaa/MDA-noisy-label-learning. Comment: Accepted by AAAI 2023 Workshop.
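
    As a concrete illustration of the flip-based selection described above, the following PyTorch sketch ranks samples by the loss of their horizontally flipped views and trains only on the lowest-loss fraction. The function names and the keep_ratio value are illustrative assumptions; the mean point ensemble and the combination with JoCoR-style co-training are omitted.

    import torch
    import torch.nn.functional as F

    def select_by_flip_loss(model, images, labels, keep_ratio=0.7):
        """Return indices of the keep_ratio fraction with the smallest loss,
        computed on flipped images (same semantics, less memorized noise)."""
        model.eval()
        with torch.no_grad():
            flipped = torch.flip(images, dims=[3])     # horizontal flip of (B, C, H, W)
            losses = F.cross_entropy(model(flipped), labels, reduction="none")
        n_keep = max(1, int(keep_ratio * images.size(0)))
        return torch.topk(-losses, n_keep).indices     # indices of the smallest losses

    def train_step(model, optimizer, images, labels, keep_ratio=0.7):
        """One update using only the samples selected via their flipped views."""
        idx = select_by_flip_loss(model, images, labels, keep_ratio)
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(images[idx]), labels[idx])
        loss.backward()
        optimizer.step()
        return loss.item()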

    Gradient Attention Balance Network: Mitigating Face Recognition Racial Bias via Gradient Attention

    Full text link
    Although face recognition has made impressive progress in recent years, the racial bias of recognition systems is often ignored in the pursuit of high accuracy. Previous work found that face recognition networks focus on different facial regions for different races, and that the sensitive regions of darker-skinned people are much smaller. Based on this finding, we propose a new debiasing method based on gradient attention, called the Gradient Attention Balance Network (GABN). Specifically, we use the gradient attention map (GAM) of the face recognition network to track the sensitive facial regions and make the GAMs of different races consistent through adversarial learning. This mitigates the bias by making the network focus on similar facial regions across races. In addition, we use masks to erase the Top-N sensitive facial regions, forcing the network to allocate its attention to a larger facial area. This expands the sensitive region of darker-skinned people and further reduces the gap between the GAMs of darker-skinned people and those of Caucasians. Extensive experiments show that GABN successfully mitigates racial bias in face recognition and achieves more balanced performance across races. Comment: Accepted by CVPR 2023 Workshop.
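
    To make the mechanism more tangible, here is a hedged PyTorch sketch of a gradient attention map (GAM) and of Top-N erasure. The score used for the gradients (e.g., a cosine logit) and the cell-level masking granularity are assumptions; the race-adversarial branch that aligns GAMs across races is omitted.

    import torch
    import torch.nn.functional as F

    def gradient_attention_map(features, score):
        """Grad-CAM-style map: weight feature channels by d(score)/d(features).
        `features` must be an intermediate activation that requires grad."""
        grads, = torch.autograd.grad(score.sum(), features, retain_graph=True)
        weights = grads.mean(dim=(2, 3), keepdim=True)       # (B, C, 1, 1)
        gam = F.relu((weights * features).sum(dim=1))        # (B, H', W')
        return gam / (gam.amax(dim=(1, 2), keepdim=True) + 1e-6)

    def erase_top_n(images, gam, n=10):
        """Zero out the N most sensitive spatial cells (upsampled to image size),
        forcing the network to spread attention over a larger facial region."""
        B, _, H, W = images.shape
        mask = torch.ones_like(gam).flatten(1)
        top = gam.flatten(1).topk(n, dim=1).indices
        mask.scatter_(1, top, 0.0)
        mask = mask.view(B, 1, *gam.shape[1:])
        mask = F.interpolate(mask, size=(H, W), mode="nearest")
        return images * mask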

    Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation

    Full text link
    Although recent works on knowledge distillation (KD) have achieved further improvements by elaborately modeling the decision boundary as posterior knowledge, their performance still depends on the assumption that the target network has sufficient capacity (representation ability). In this paper, we propose a knowledge representing (KR) framework that mainly focuses on modeling the parameter distribution as prior knowledge. First, we suggest a knowledge aggregation scheme to answer how the prior knowledge from the teacher network should be represented. By aggregating the parameter distribution of the teacher network to a more abstract level, the scheme alleviates residual accumulation in the deeper layers. Second, to address the critical issue of which prior knowledge matters most for distillation, we design a sparse recoding penalty that constrains the student network to learn with penalized gradients. With the proposed penalty, the student network avoids over-regularization during distillation and converges faster. Quantitative experiments show that the proposed framework achieves state-of-the-art performance even when the target network does not have the expected capacity. Moreover, the framework is flexible enough to be combined with other KD methods based on posterior knowledge.
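
    Since the abstract does not give the exact form of the sparse recoding penalty, the following PyTorch sketch only illustrates the general shape of such an objective: a standard distillation loss plus an assumed L1-style sparsity term on an intermediate student representation. The temperature, weights, and the penalty itself are placeholders, not the authors' formulation.

    import torch
    import torch.nn.functional as F

    def kd_loss_with_sparse_penalty(student_logits, teacher_logits, labels,
                                    student_features, T=4.0, alpha=0.7, lam=1e-4):
        """Cross-entropy + temperature-scaled KL to the teacher + sparsity penalty."""
        ce = F.cross_entropy(student_logits, labels)
        kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                      F.softmax(teacher_logits / T, dim=1),
                      reduction="batchmean") * (T * T)
        sparse = student_features.abs().mean()   # assumed L1-style recoding penalty
        return (1 - alpha) * ce + alpha * kl + lam * sparse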

    Improving Autism Spectrum Disorder Prediction by Fusion of Multiple Measures of Resting-State Functional MRI Data

    No full text
    Autism spectrum disorder (ASD) is a lifelong neurodevelopmental condition characterized by impairments in social communication, language, and behavior. Leveraging deep learning to automatically predict ASD has attracted increasing attention in the medical and machine learning communities. However, how to select effective measure signals for deep learning prediction is still a challenging problem. In this paper, we studied two kinds of measure signals, i.e., regional homogeneity (ReHo) and Craddock 200 (CC200), both of which represent homogeneous functional activity, in a deep learning framework, and designed a new mechanism to effectively fuse them for deep-learning-based ASD prediction. Extensive experiments on the ABIDE dataset provide empirical evidence for the effectiveness of our method. In particular, we obtained 79% accuracy by effectively fusing these two kinds of signals, much better than either single-measure model (ReHo SM-model: ~69%; CC200 SM-model: ~70%). These results suggest that leveraging multiple measure signals together is effective for ASD prediction.
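
    As a minimal sketch of the fusion idea, the PyTorch model below processes the ReHo and CC200 features in separate branches and concatenates them before a small classifier. The feature dimensions, hidden size, and concatenation-based fusion are assumptions; the abstract does not specify the authors' architecture.

    import torch
    import torch.nn as nn

    class TwoMeasureFusion(nn.Module):
        """Two-branch fusion of ReHo and CC200 feature vectors for ASD vs. control."""
        def __init__(self, d_reho=1000, d_cc200=19900, hidden=256):
            super().__init__()
            self.reho_branch = nn.Sequential(nn.Linear(d_reho, hidden), nn.ReLU())
            self.cc200_branch = nn.Sequential(nn.Linear(d_cc200, hidden), nn.ReLU())
            self.classifier = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                            nn.Linear(hidden, 2))

        def forward(self, reho, cc200):
            z = torch.cat([self.reho_branch(reho), self.cc200_branch(cc200)], dim=1)
            return self.classifier(z)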

    Study of the Hypoglycemic Activity of Derivatives of Isoflavones from Cicer arietinum L.

    No full text
    The chickpea, a food and medicine used by the people of Xinjiang, has a beneficial hypoglycemic effect. To better utilize this national resource and develop hypoglycemic agents from components of the chickpea, a series of new derivatives of isoflavone compounds from the chickpea were synthesized. An insulin-resistant (IR) HepG2 cell model was used to screen the hypoglycemic activities of these compounds, and their structure-activity relationships were explored. Additionally, several combinations of these compounds displayed higher hypoglycemic activity than any single compound and had hypoglycemic activity similar to that of the positive control group (p > 0.05). In addition, combinations 3 and 6 exerted different effects on the insulin sensitivity of H4IIE cells stimulated with resistin, and the results indicated that combination 3 has higher hypoglycemic activity. These findings demonstrate the multi-component, multi-target characteristics of Chinese herbal medicine and may provide new ideas for the development of hypoglycemic drugs.