Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Recently, massive architectures based on Convolutional Neural Network (CNN)
and self-attention mechanisms have become necessary for audio classification.
While these techniques are state-of-the-art, their effectiveness can
only be guaranteed with huge computational costs and parameter counts, large amounts
of data augmentation, transfer from large datasets, and some other tricks. By
utilizing the lightweight nature of audio, we propose an efficient network
structure called Paired Inverse Pyramid Structure (PIP) and a network called
Paired Inverse Pyramid Structure MLP Network (PIPMN). The PIPMN reaches 96%
Environmental Sound Classification (ESC) accuracy on the UrbanSound8K dataset
and 93.2% Music Genre Classification (MGC) accuracy on the GTZAN dataset, with only
1 million parameters. Both results are achieved without data
augmentation or model transfer. Public code is available at:
https://github.com/JNAIC/PIPM
Spatial Parameter Identification for MIMO Systems in the Presence of Non-Gaussian Interference
Reliable identification of spatial parameters for multiple-input multiple-output (MIMO) systems, such as the number of transmit antennas (NTA) and the direction of arrival (DOA), is a prerequisite for MIMO signal separation and detection. Most existing parameter estimation methods for MIMO systems only consider a single parameter in Gaussian noise. This paper develops a reliable identification scheme based on generalized multi-antenna time-frequency distribution (GMTFD) for MIMO systems with non-Gaussian interference and Gaussian noise. First, a new generalized correlation matrix is introduced to construct a generalized MTFD matrix. Then, the covariance matrix based on time-frequency distribution (CM-TF) is characterized by using the diagonal entries from the auto-source signal components and the non-diagonal entries from the cross-source signal components in the generalized MTFD matrix. Finally, by making use of the CM-TF, the Gerschgorin disk criterion is modified to estimate the NTA, and multiple signal classification (MUSIC) is exploited to estimate the DOA for MIMO systems. Simulation results indicate that the proposed scheme based on GMTFD is robust to non-Gaussian interference without prior information and that it can achieve high estimation accuracy and resolution at low and medium signal-to-noise ratios (SNRs).
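As background for the Gerschgorin disk criterion mentioned above, the following minimal sketch computes the Gerschgorin disks of a small Hermitian matrix: each disk is centered on a diagonal entry, with radius equal to the sum of the magnitudes of the off-diagonal entries in that row. The matrix `R` and the function name `gerschgorin_disks` are hypothetical and chosen only for illustration; this is the plain circle theorem, not the modified criterion or the CM-TF construction of the paper.

```python
def gerschgorin_disks(matrix):
    """Return a (center, radius) pair for each row of a square matrix."""
    disks = []
    for i, row in enumerate(matrix):
        center = row[i]
        # Radius: sum of magnitudes of the off-diagonal entries in row i.
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        disks.append((center, radius))
    return disks


# Hypothetical 3x3 Hermitian matrix standing in for a covariance estimate.
R = [
    [4.0, 1.0 + 0.5j, 0.2],
    [1.0 - 0.5j, 3.0, 0.1j],
    [0.2, -0.1j, 0.5],
]

disks = gerschgorin_disks(R)
for center, radius in disks:
    print(f"center={center}, radius={radius:.3f}")
```

Every eigenvalue of `R` lies in the union of these disks, which is the property that disk-based estimators of the source number build on.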
Provable Probabilistic Imaging using Score-Based Generative Priors
Estimating high-quality images while also quantifying their uncertainty are
two desired features in an image reconstruction algorithm for solving ill-posed
inverse problems. In this paper, we propose plug-and-play Monte Carlo (PMC) as
a principled framework for characterizing the space of possible solutions to a
general inverse problem. PMC is able to incorporate expressive score-based
generative priors for high-quality image reconstruction while also performing
uncertainty quantification via posterior sampling. In particular, we introduce
two PMC algorithms which can be viewed as the sampling analogues of the
traditional plug-and-play priors (PnP) and regularization by denoising (RED)
algorithms. We also establish a theoretical analysis for characterizing the
convergence of the PMC algorithms. Our analysis provides non-asymptotic
stationarity guarantees for both algorithms, even in the presence of
non-log-concave likelihoods and imperfect score networks. We demonstrate the
performance of the PMC algorithms on multiple representative inverse problems
with both linear and nonlinear forward models. Experimental results show that
PMC significantly improves reconstruction quality and enables high-fidelity
uncertainty quantification.
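The idea of posterior sampling with a score-based prior can be illustrated with a toy one-dimensional sketch of unadjusted Langevin dynamics. It assumes a scalar measurement y = x + noise, and it replaces the learned score network with the analytic score of a standard Gaussian prior. The function names, step size, and problem setup are assumptions made for illustration; this is generic Langevin sampling, not the PMC algorithms themselves.

```python
import math
import random


def langevin_posterior_samples(y, noise_var, prior_score, step, n_steps, seed=0):
    """Unadjusted Langevin sampling of p(x | y) for the model y = x + noise."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        # Gradient of the Gaussian log-likelihood log p(y | x).
        likelihood_grad = (y - x) / noise_var
        # Posterior score = likelihood gradient + prior score.
        drift = likelihood_grad + prior_score(x)
        x = x + step * drift + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def gaussian_prior_score(x):
    """Analytic score of a standard normal prior (stand-in for a score network)."""
    return -x


samples = langevin_posterior_samples(
    y=1.5, noise_var=0.5, prior_score=gaussian_prior_score,
    step=0.05, n_steps=20000,
)
kept = samples[1000:]  # discard burn-in
post_mean = sum(kept) / len(kept)
post_var = sum((s - post_mean) ** 2 for s in kept) / len(kept)
print(f"estimated posterior mean={post_mean:.3f}, variance={post_var:.3f}")
```

For this toy model the exact posterior is Gaussian with mean 1.0 and variance 1/3, so the sample mean gives the reconstruction and the sample variance gives the uncertainty estimate. The finite step size biases the variance slightly upward.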
Data Augmentation for Environmental Sound Classification Using Diffusion Probabilistic Model with Top-k Selection Discriminator
Despite consistent advancement in powerful deep learning techniques in recent
years, large amounts of training data are still necessary for the models to
avoid overfitting. Synthetic datasets using generative adversarial networks
(GAN) have recently been generated to overcome this problem. Nevertheless,
despite advancements, GAN-based methods are usually hard to train or fail to
generate high-quality data samples. In this paper, we propose an environmental
sound classification augmentation technique based on the diffusion
probabilistic model with DPM-Solver for fast sampling. In addition, to
ensure the quality of the generated spectrograms, we train a top-k selection
discriminator on the dataset. According to the experimental results, the
synthesized spectrograms have similar features to the original dataset and can
significantly increase the classification accuracy of different
state-of-the-art models compared with traditional data augmentation techniques.
The public code is available at
https://github.com/JNAIC/DPMs-for-Audio-Data-Augmentation
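The abstract does not spell out the selection rule, so the following sketch shows only one plausible reading of top-k selection. It assumes that a discriminator assigns each generated spectrogram a scalar quality score and that the k highest-scoring samples are kept for augmentation. The candidate records, the `score` field, and the function `select_top_k` are hypothetical.

```python
import heapq


def select_top_k(candidates, score_fn, k):
    """Keep the k candidates with the highest discriminator score."""
    return heapq.nlargest(k, candidates, key=score_fn)


# Hypothetical generated samples with precomputed discriminator scores.
candidates = [
    {"id": "a", "score": 0.91},
    {"id": "b", "score": 0.12},
    {"id": "c", "score": 0.77},
    {"id": "d", "score": 0.45},
]

selected = select_top_k(candidates, score_fn=lambda c: c["score"], k=2)
selected_ids = [c["id"] for c in selected]
print(selected_ids)
```

In practice `score_fn` would wrap a trained model; here it simply reads a stored number so the example stays self-contained.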
Family history of cancer is a prognostic factor for better survival in operable esophageal squamous cell carcinoma: A propensity score matching analysis
Lay summary: Patients with a family history of cancer, especially digestive tract cancer and esophageal cancer, a family history of cancer in the first degree, and more than one relative affected by cancer were associated with favorable survival when compared to those without a family history of cancer.
Precis for use in the Table of Contents: A family history of cancer is a favorable independent prognostic factor in ESCC. Patients with a family history of cancer, especially digestive tract cancer and esophageal cancer, a family history of cancer in the first degree, and more than one relative affected by cancer were associated with favorable survival when compared to those without a family history of cancer.
Background: A family history of cancer (FH) is closely associated with the risk and survival of many cancers. However, the effect of FH on the prognosis of patients with esophageal squamous cell carcinoma (ESCC) remains unclear. We performed a large cohort study in the Chinese population to obtain insight into the prognostic value of FH in patients with operable ESCC.
Methods: A total of 1,322 consecutive patients with thoracic ESCC who had undergone esophagectomy between January 1997 and December 2013 were included. The FH group included patients with any degree of FH, while the non-FH group included patients without any degree of FH. In total, 215 patients with FH and 215 without FH were matched using the propensity score matching analysis method to adjust for differences in baseline variables between the two groups. The impact of FH on disease-free survival (DFS) and overall survival (OS) was estimated using the Kaplan–Meier method and Cox’s proportional hazards models.
Results: Before matching, 280 (21.2%) patients were included in the FH group and 1,042 (78.8%) in the non-FH group. FH was associated with early pathological T stage (p = 0.001), lymph node-negative status (p = 0.022), and early pathological stage (p = 0.006). After matching, FH was an independent prognostic factor for DFS and OS in ESCC patients. Patients with FH had 35% lower risk of disease progression (hazard ratio [HR] = 0.65, 95% CI: 0.51–0.84, p = 0.001) and 34% lower risk of death (HR = 0.66, 95% CI: 0.51–0.86, p = 0.002) than those without FH. Patients with a family history of digestive tract cancer (FH-DC), a family history of esophageal cancer (FH-EC), FH in first-degree relatives (FH-FD), and more than one relative affected by cancer were associated with favorable DFS and OS as compared to those without FH.
Conclusion: FH is a favorable independent prognostic factor in ESCC. Patients with FH, especially those with FH-DC, FH-EC, FH-FD, and more than one relative affected by cancer, had improved survival.
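The matching step in the Methods can be sketched as follows, under stated assumptions. The abstract does not say which matching algorithm was used, so this example assumes greedy 1:1 nearest-neighbor matching without replacement, with a caliper on precomputed propensity scores. The patient identifiers, scores, caliper value, and the function `greedy_match` are hypothetical.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, controls: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Hypothetical propensity scores for patients with and without FH.
fh = {"p1": 0.30, "p2": 0.62, "p3": 0.90}
non_fh = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

pairs = greedy_match(fh, non_fh)
print(pairs)
```

Patients with no control inside the caliper (here `p3`) stay unmatched, which is one way a matched cohort can end up smaller than the original FH group.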
LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors
Prompt-tuning has emerged as an attractive paradigm for deploying large-scale
language models due to its strong downstream task performance and efficient
multitask serving ability. Despite its wide adoption, we empirically show that
prompt-tuning is vulnerable to downstream task-agnostic backdoors, which reside
in the pretrained models and can affect arbitrary downstream tasks. The
state-of-the-art backdoor detection approaches cannot defend against
task-agnostic backdoors since they hardly converge in reversing the backdoor
triggers. To address this issue, we propose LMSanitator, a novel approach for
detecting and removing task-agnostic backdoors on Transformer models. Instead
of directly inverting the triggers, LMSanitator aims to invert the predefined
attack vectors (pretrained models' output when the input is embedded with
triggers) of the task-agnostic backdoors, which achieves much better
convergence performance and backdoor detection accuracy. LMSanitator further
leverages prompt-tuning's property of freezing the pretrained model to perform
accurate and fast output monitoring and input purging during the inference
phase. Extensive experiments on multiple language models and NLP tasks
illustrate the effectiveness of LMSanitator. For instance, LMSanitator achieves
92.8% backdoor detection accuracy on 960 models and decreases the attack
success rate to less than 1% in most scenarios.
Comment: To Appear in the Network and Distributed System Security (NDSS)
Symposium 2024, 26 February - 1 March 2024, San Diego, CA, USA; typos
corrected
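The output-monitoring idea in the LMSanitator abstract can be sketched under stated assumptions: the frozen pretrained model's output for an input is compared against the previously inverted attack vectors, and the input is flagged when the similarity is high. The use of cosine similarity, the threshold of 0.9, the three-dimensional vectors, and the function names are assumptions made for illustration, not details confirmed by the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_suspicious(output_vec, attack_vectors, threshold=0.9):
    """Flag an output that closely matches any inverted attack vector."""
    return any(cosine(output_vec, av) >= threshold for av in attack_vectors)


# Hypothetical inverted attack vectors and model outputs.
attack_vectors = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]]
clean_output = [0.3, 0.4, 0.5]
triggered_output = [0.98, 0.05, -0.02]

print(is_suspicious(clean_output, attack_vectors))
print(is_suspicious(triggered_output, attack_vectors))
```

Because prompt-tuning keeps the pretrained model frozen, such a check would run on the same outputs at inference time as were analyzed during detection.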