Adaptive Growth: Real-time CNN Layer Expansion
Deep Neural Networks (DNNs) have shown unparalleled achievements in numerous
applications, reflecting their proficiency in managing vast data sets. Yet,
their static structure limits their adaptability in ever-changing environments.
This research presents a new algorithm that allows the convolutional layer of a
Convolutional Neural Network (CNN) to dynamically evolve based on data input,
while still being seamlessly integrated into existing DNNs. Instead of a rigid
architecture, our approach iteratively introduces kernels to the convolutional
layer, gauging its real-time response to varying data. This process is refined
by evaluating the layer's capacity to discern image features, guiding its
growth. Remarkably, our unsupervised method has outstripped its supervised
counterparts across diverse datasets like MNIST, Fashion-MNIST, CIFAR-10, and
CIFAR-100. It also showcases enhanced adaptability in transfer learning
scenarios. By introducing a data-driven model scalability strategy, we are
filling a void in deep learning, leading to more flexible and efficient DNNs
suited for dynamic settings.
Code: https://github.com/YunjieZhu/Extensible-Convolutional-Layer-git-version
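The abstract does not spell out the growth rule, so the following is only a minimal sketch of the general idea of iteratively adding kernels in response to data. It assumes a simple novelty criterion (a flattened input patch becomes a new kernel when its cosine similarity to every existing kernel falls below a threshold). The names `grow_kernels`, `threshold`, and `max_kernels` are hypothetical and are not taken from the authors' code.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def grow_kernels(patches, threshold=0.9, max_kernels=64):
    """Greedy growth (assumed criterion): a patch becomes a new kernel
    when no existing kernel responds to it strongly enough."""
    kernels = []
    for patch in patches:
        if not any(patch):
            continue  # an all-zero patch carries no feature to learn
        best = max((cosine(patch, k) for k in kernels), default=-1.0)
        if best < threshold and len(kernels) < max_kernels:
            kernels.append(list(patch))
    return kernels

# Four flattened 2x2 patches: two near-duplicates of each of two patterns.
patches = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.8, 0.1],
]
kernels = grow_kernels(patches)
print(len(kernels))
```

In this toy run the layer stops growing once both patterns are covered, which is the data-driven scaling behaviour the abstract describes; the actual method evaluates the layer's feature-discerning capacity in a way this sketch does not attempt to reproduce.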
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Recently, massive architectures based on Convolutional Neural Networks (CNNs)
and self-attention mechanisms have become necessary for audio classification.
While these techniques are state-of-the-art, their effectiveness can
only be guaranteed with huge computational costs and parameter counts, large amounts
of data augmentation, transfer from large datasets, and other tricks. By
utilizing the lightweight nature of audio, we propose an efficient network
structure called Paired Inverse Pyramid Structure (PIP) and a network called
Paired Inverse Pyramid Structure MLP Network (PIPMN). The PIPMN reaches 96%
accuracy for Environmental Sound Classification (ESC) on the UrbanSound8K dataset
and 93.2% for Music Genre Classification (MGC) on the GTZAN dataset, with only
1 million parameters. Both results are achieved without data
augmentation or model transfer. Public code is available at:
https://github.com/JNAIC/PIPM
MagicNet: Semi-Supervised Multi-Organ Segmentation via Magic-Cube Partition and Recovery
We propose a novel teacher-student model for semi-supervised multi-organ
segmentation. In a teacher-student model, data augmentation is usually applied to
unlabeled data to regularize the consistent training between teacher and
student. We start from a key perspective that fixed relative locations and
variable sizes of different organs can provide distribution information where a
multi-organ CT scan is drawn. Thus, we treat the prior anatomy as a strong tool
to guide the data augmentation and reduce the mismatch between labeled and
unlabeled images for semi-supervised learning. More specifically, we propose a
data augmentation strategy based on partitioning and recovering N cubes across
and within labeled and unlabeled images. Our strategy encourages unlabeled
images to learn organ semantics in relative locations from the labeled images
(cross-branch) and enhances the learning ability for small organs
(within-branch). For within-branch, we further propose to refine the quality of
pseudo labels by blending the learned representations from small cubes to
incorporate local attributes. Our method is termed MagicNet because it treats
the CT volume as a magic cube, and the N-cube partition-and-recovery process
mirrors the rules of playing with one. Extensive experiments on two
public CT multi-organ datasets demonstrate the effectiveness of MagicNet, which
noticeably outperforms state-of-the-art semi-supervised medical image
segmentation approaches, with a +7% DSC improvement on the MACT dataset with 10%
labeled images. Code is available at
https://github.com/DeepMed-Lab-ECNU/MagicNet.
Comment: Accepted by CVPR 202
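The partition-and-recovery idea can be illustrated with a small sketch. It assumes a cubic grid with `n` pieces per side (so `n**3` cubes), nested lists in place of CT arrays, and hypothetical helper names (`partition`, `recover`, `shuffle_within`, `unshuffle`, `swap_cross`); none of these are taken from the MagicNet code. In the method itself a network would presumably process the mixed or shuffled cubes before recovery; here recovery is applied directly to the data only to show that both operations are invertible.

```python
import random

def partition(volume, n):
    """Split a D x H x W nested-list volume into n**3 cubes in (z, y, x) raster order."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    assert d % n == 0 and h % n == 0 and w % n == 0, "each side must be divisible by n"
    sd, sh, sw = d // n, h // n, w // n
    cubes = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                cubes.append([
                    [row[k * sw:(k + 1) * sw] for row in plane[j * sh:(j + 1) * sh]]
                    for plane in volume[i * sd:(i + 1) * sd]
                ])
    return cubes

def recover(cubes, n):
    """Reassemble cubes given in raster order; inverse of partition."""
    sd, sh, sw = len(cubes[0]), len(cubes[0][0]), len(cubes[0][0][0])
    volume = [[[None] * (n * sw) for _ in range(n * sh)] for _ in range(n * sd)]
    for idx, cube in enumerate(cubes):
        i, rem = divmod(idx, n * n)
        j, k = divmod(rem, n)
        for z in range(sd):
            for y in range(sh):
                volume[i * sd + z][j * sh + y][k * sw:(k + 1) * sw] = cube[z][y]
    return volume

def shuffle_within(cubes, rng):
    """Within-image branch: permute cube positions inside one volume."""
    perm = list(range(len(cubes)))
    rng.shuffle(perm)
    return [cubes[p] for p in perm], perm

def unshuffle(cubes, perm):
    """Undo shuffle_within using the recorded permutation."""
    restored = [None] * len(cubes)
    for new_pos, old_pos in enumerate(perm):
        restored[old_pos] = cubes[new_pos]
    return restored

def swap_cross(cubes_a, cubes_b, mask):
    """Cross-image branch: exchange cubes at the same grid position where mask is
    True, so relative locations are preserved. Applying it twice undoes it."""
    mixed_a = [b if m else a for a, b, m in zip(cubes_a, cubes_b, mask)]
    mixed_b = [a if m else b for a, b, m in zip(cubes_a, cubes_b, mask)]
    return mixed_a, mixed_b

def make_volume(size, offset):
    """Toy volume whose voxel values encode their own position."""
    return [[[offset + (z * size + y) * size + x for x in range(size)]
             for y in range(size)] for z in range(size)]

rng = random.Random(0)
n = 2
labeled, unlabeled = make_volume(4, 0), make_volume(4, 1000)
cubes_l, cubes_u = partition(labeled, n), partition(unlabeled, n)

shuffled, perm = shuffle_within(cubes_u, rng)
restored_within = recover(unshuffle(shuffled, perm), n)

mask = [rng.random() < 0.5 for _ in cubes_l]
mixed_l, mixed_u = swap_cross(cubes_l, cubes_u, mask)
back_l, back_u = swap_cross(mixed_l, mixed_u, mask)
restored_cross = recover(back_l, n)
```

Keeping the swap position-aligned in `swap_cross` is what lets a mixed volume retain plausible organ layout, while `shuffle_within` deliberately breaks the layout so that small local regions must be recognised on their own.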
Data Augmentation for Environmental Sound Classification Using Diffusion Probabilistic Model with Top-k Selection Discriminator
Despite consistent advancement in powerful deep learning techniques in recent
years, large amounts of training data are still necessary for the models to
avoid overfitting. Synthetic datasets generated with generative adversarial networks
(GANs) have recently been used to overcome this problem. Nevertheless,
despite these advancements, GAN-based methods are usually hard to train or fail to
generate high-quality data samples. In this paper, we propose an environmental
sound classification augmentation technique based on the diffusion
probabilistic model with DPM-Solver for fast sampling. In addition, to
ensure the quality of the generated spectrograms, we train a top-k selection
discriminator on the dataset. According to the experimental results, the
synthesized spectrograms have similar features to the original dataset and can
significantly increase the classification accuracy of different
state-of-the-art models compared with traditional data augmentation techniques.
The public code is available at
https://github.com/JNAIC/DPMs-for-Audio-Data-Augmentation
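The abstract does not define the selection rule, so the sketch below shows one plausible reading of top-k selection, stated as an assumption: a generated spectrogram is kept only if the class it was generated for appears among the discriminator's k highest-scoring classes. The sample dictionaries, the score values, and the function names are hypothetical and stand in for real generated spectrograms and a trained discriminator.

```python
def top_k_labels(scores, k):
    """Indices of the k highest scores (ties broken in favour of the lower index)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]

def select_top_k(samples, k):
    """Keep samples whose intended label is among the discriminator's top-k classes
    (assumed filtering rule)."""
    return [s for s in samples if s["target"] in top_k_labels(s["scores"], k)]

# Each entry: the class the sample was generated for, plus made-up per-class scores.
generated = [
    {"id": "a", "target": 0, "scores": [0.7, 0.2, 0.1]},
    {"id": "b", "target": 2, "scores": [0.5, 0.3, 0.2]},
    {"id": "c", "target": 1, "scores": [0.1, 0.3, 0.6]},
]
kept_top1 = [s["id"] for s in select_top_k(generated, k=1)]
kept_top2 = [s["id"] for s in select_top_k(generated, k=2)]
print(kept_top1, kept_top2)
```

A larger k admits more synthetic samples at the cost of letting through ones the discriminator finds less convincing, so k acts as a quality-versus-quantity knob for the augmented set.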
Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities
The vast proliferation of sensor devices and the Internet of Things enables
applications of sensor-based activity recognition. However, there exist
substantial challenges that could influence the performance of the recognition
system in practical scenarios. Recently, as deep learning has demonstrated its
effectiveness in many areas, plenty of deep methods have been investigated to
address the challenges in activity recognition. In this study, we present a
survey of the state-of-the-art deep learning methods for sensor-based human
activity recognition. We first introduce the multi-modality of the sensory data
and provide information for public datasets that can be used for evaluation in
different challenge tasks. We then propose a new taxonomy to structure the deep
methods by challenges. Challenges and challenge-related deep methods are
summarized and analyzed to form an overview of the current research progress.
At the end of this work, we discuss the open issues and provide some insights
for future directions.