Noisy-ArcMix: Additive Noisy Angular Margin Loss Combined With Mixup Anomalous Sound Detection
Unsupervised anomalous sound detection (ASD) aims to identify anomalous
sounds by learning the features of normal operational sounds and sensing their
deviations. Recent approaches have focused on the self-supervised task
utilizing the classification of normal data, and advanced models have shown
that securing representation space for anomalous data is important through
representation learning yielding compact intra-class and well-separated
inter-class distributions. However, we show that conventional approaches often
fail to ensure sufficient intra-class compactness and exhibit angular disparity
between samples and their corresponding centers. In this paper, we propose a
training technique aimed at ensuring intra-class compactness and increasing the
angle gap between normal and abnormal samples. Furthermore, we present an
architecture that extracts features for important temporal regions, enabling
the model to learn which time frames should be emphasized or suppressed.
Experimental results demonstrate that the proposed method achieves the best
performance, with improvements of 0.90%, 0.83%, and 2.16% in AUC, pAUC,
and mAUC, respectively, over the state-of-the-art method on the DCASE 2020
Challenge Task 2 dataset.
Comment: Submitted to ICASSP 202
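The abstract above names an additive angular margin loss combined with mixup but does not give its formula. As a rough illustration only, the sketch below shows a generic ArcFace-style additive angular margin applied to the target class, with the loss for a mixed sample weighted by the mixup coefficient. This is not the authors' Noisy-ArcMix formulation; the function names, the default `margin` and `scale` values, and the toy cosine inputs are all assumptions made for this example.

```python
import math

def arc_margin_logits(cosines, target, margin=0.5, scale=30.0):
    """Add an angular margin to the target-class angle, then scale every logit.

    `cosines` holds cos(theta_k) between one embedding and each class center.
    Assumption: theta_target + margin stays below pi (no special-case handling).
    """
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if k == target:
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def mixup_margin_loss(cosines, target_a, target_b, lam, margin=0.5, scale=30.0):
    """Weight the margin loss for both source labels of a mixed sample by lam."""
    loss_a = cross_entropy(arc_margin_logits(cosines, target_a, margin, scale), target_a)
    loss_b = cross_entropy(arc_margin_logits(cosines, target_b, margin, scale), target_b)
    return lam * loss_a + (1.0 - lam) * loss_b

# Toy cosines for a sample mixed from classes 0 and 1 (hypothetical values).
example_loss = mixup_margin_loss([0.8, 0.1, -0.2], target_a=0, target_b=1, lam=0.7)
```

With a positive margin, the target logit shrinks, so the same embedding incurs a larger loss than it would without the margin; this is the general mechanism by which margin losses push samples closer to their class centers.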
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement
In this study, we propose a dense frequency-time attentive network (DeFT-AN)
for multichannel speech enhancement. DeFT-AN is a mask estimation network that
predicts a complex spectral masking pattern for suppressing the noise and
reverberation embedded in the short-time Fourier transform (STFT) of an input
signal. The proposed mask estimation network incorporates three different types
of blocks for aggregating information in the spatial, spectral, and temporal
dimensions. It utilizes a spectral transformer with a modified feed-forward
network and a temporal conformer with sequential dilated convolutions. The use
of dense blocks and transformers dedicated to the three different
characteristics of audio signals enables more comprehensive enhancement in
noisy and reverberant environments. The remarkable performance of DeFT-AN over
state-of-the-art multichannel models is demonstrated based on two popular noisy
and reverberant datasets in terms of various metrics for speech quality and
intelligibility.
Comment: 5 pages, 2 figures, 3 tables
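To make the idea of a complex spectral mask concrete, the sketch below applies a mask to an STFT by element-wise complex multiplication, which can change both the magnitude and the phase of each time-frequency bin. It covers only the masking step for a single reference channel; how DeFT-AN estimates the mask is not described here, and the function name and toy values are assumptions for illustration.

```python
def apply_complex_mask(stft, mask):
    """Multiply each time-frequency bin of `stft` by the matching mask entry.

    Both arguments are lists of rows (frequency bins) of complex numbers
    (time frames) and must have the same shape.
    """
    if len(stft) != len(mask) or any(len(x) != len(m) for x, m in zip(stft, mask)):
        raise ValueError("stft and mask must have the same shape")
    return [[m * x for m, x in zip(mask_row, stft_row)]
            for mask_row, stft_row in zip(mask, stft)]

# Toy 2x2 example: a complex mask can rotate phase as well as scale magnitude.
noisy = [[1 + 1j, 2 + 0j],
         [0 + 1j, 1 - 1j]]
mask = [[0.5 - 0.5j, 1 + 0j],
        [0 + 0j, 0.5 + 0j]]
enhanced = apply_complex_mask(noisy, mask)
```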
RGI-Net: 3D Room Geometry Inference from Room Impulse Responses in the Absence of First-order Echoes
Room geometry is important prior information for implementing realistic 3D
audio rendering. For this reason, various room geometry inference (RGI) methods
have been developed by utilizing the time of arrival (TOA) or time difference
of arrival (TDOA) information in room impulse responses. However,
conventional RGI techniques rely on several assumptions, such as convex room
shapes, a number of walls known a priori, and the visibility of first-order
reflections. In this work, we introduce a deep neural network (DNN), RGI-Net,
which can estimate room geometries without the aforementioned assumptions.
RGI-Net learns and exploits complex relationships between high-order
reflections in room impulse responses (RIRs) and, thus, can estimate room
shapes even when the shape is non-convex or first-order reflections are missing
in the RIRs. The network takes RIRs measured from a compact audio device
equipped with a circular microphone array and a single loudspeaker, which
greatly improves its practical applicability. RGI-Net includes an evaluation
network that separately evaluates the presence probability of walls, so
geometry inference is possible without prior knowledge of the number of walls.
Comment: 5 pages, 3 figures, 3 tables
Statistical Analysis of the Metropolitan Seoul Subway System: Network Structure and Passenger Flows
The Metropolitan Seoul Subway system, consisting of 380 stations, provides
the major transportation mode in the metropolitan Seoul area. Focusing on the
network structure, we analyze statistical properties and topological
consequences of the subway system. We further study the passenger flows on the
system, and find that the flow weight distribution exhibits a power-law
behavior. In addition, the degree distribution of the spanning tree of the
flows also follows a power law.
Comment: 10 pages, 4 figures
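As a hedged illustration of how a power-law claim of this kind can be checked numerically, the sketch below estimates the exponent of a continuous power law with the standard maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)). The abstract does not say how the authors fitted their distributions, so this is not their procedure; the function names, the cutoff `x_min`, and the synthetic data are assumptions for this example.

```python
import math

def power_law_exponent(values, x_min):
    """Maximum-likelihood exponent of a continuous power law p(x) ~ x**(-alpha).

    Only values >= x_min are used; x_min is chosen by the caller.
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need values above x_min to estimate the exponent")
    return 1.0 + len(tail) / log_sum

def synthetic_power_law(alpha, x_min, n):
    """Deterministic samples via the inverse CDF on an evenly spaced grid."""
    return [x_min * (1.0 - (i + 0.5) / n) ** (-1.0 / (alpha - 1.0))
            for i in range(n)]

# Synthetic "flow weights" with a known exponent, then recover that exponent.
weights = synthetic_power_law(alpha=2.5, x_min=1.0, n=10000)
alpha_hat = power_law_exponent(weights, x_min=1.0)
```

On real data the choice of `x_min` matters a great deal, and a good fit on a log-log plot alone does not establish power-law behavior.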
Sleepless in Seoul: `The Ant and the Metrohopper'
One of Aesop's (La Fontaine's) famous fables, `The Ant and the Grasshopper', is
widely known to give a moral lesson through comparison between the hard working
ant and the party-loving grasshopper. Here we show a slightly different version
of this fable, namely, "The Ant and the Metrohopper," which describes human
mobility patterns in modern urban life. Numerous real transportation networks
and trajectory data sets have been studied in order to understand mobility
patterns. We study trajectories of commuters on the public transportation of
Metropolitan Seoul, Korea. Smart cards (Integrated Circuit Cards; ICCs) are
used in the public transportation system, which allows the collection of transit
transaction data, including departure and arrival stations and times. This
empirical analysis reveals human mobility patterns, which inform traffic
forecasting and transportation optimization, as well as urban planning.
Comment: To appear in Journal of the Korean Physical Society