159 research outputs found
Extracting an Informative Latent Representation of High-Dimensional Galaxy Spectra
To understand the fundamental parameters of galaxy evolution, we investigated
the minimum set of parameters that explain the observed galaxy spectra in the
local Universe. We identified four latent variables that efficiently represent
the diversity of high-dimensional galaxy spectral energy distributions (SEDs)
observed by the Sloan Digital Sky Survey. Additionally, we constructed
meaningful latent representations using conditional variational autoencoders
trained with different permutations of galaxy physical properties, which helped
us quantify how much information these traditionally used properties contribute
to the reconstruction of galaxy spectra. The small number of required parameters
suggests that complex SED population models with a very large number of parameters
will be difficult to constrain even with spectroscopic galaxy data. Through an
Explainable AI (XAI) method, we found that the region below 5000 Å
and prominent emission lines ([O II], [O III], and H) are particularly
informative for predicting the latent variables. Our findings suggest that
these latent variables provide a more efficient and fundamental representation
of galaxy spectra than the conventionally considered galaxy physical properties.
Comment: 5 pages, 6 figures, accepted by NeurIPS 202
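To make the conditioning idea concrete, here is a minimal sketch of a conditional variational autoencoder with a four-dimensional latent space, assuming PyTorch. The spectrum dimensionality, layer sizes, and the size of the conditioning vector of physical properties are illustrative placeholders, not the architecture used in the paper.

```python
# Minimal conditional VAE sketch (PyTorch); all dimensions are illustrative
# placeholders, not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

SPEC_DIM = 1000   # hypothetical number of spectral bins
COND_DIM = 3      # hypothetical number of conditioning physical properties
LATENT_DIM = 4    # four latent variables, as in the abstract

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(SPEC_DIM + COND_DIM, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
        )
        self.mu = nn.Linear(64, LATENT_DIM)
        self.logvar = nn.Linear(64, LATENT_DIM)
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM + COND_DIM, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, SPEC_DIM),
        )

    def forward(self, spec, cond):
        h = self.encoder(torch.cat([spec, cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        return recon, mu, logvar

def vae_loss(recon, spec, mu, logvar):
    # Reconstruction error plus KL divergence to a standard normal prior.
    rec = F.mse_loss(recon, spec, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl
```

Training such a model with different subsets of physical properties in `cond`, and comparing the reconstruction quality, is one way to probe how much each property contributes beyond the four latent variables.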
Statistics of seismic cluster durations
Using the standard ETAS model of triggered seismicity, we present a rigorous
theoretical analysis of the main statistical properties of temporal clusters,
defined as the group of events triggered by a given main shock of fixed
magnitude m that occurred at the origin of time, at times larger than some
present time t. Using the technology of the generating probability function (GPF),
we derive explicit expressions for the GPF of the number of future
offspring in a given temporal seismic cluster, defining, in particular, the
statistics of the cluster's duration and of the maximal magnitudes of the
cluster's offspring. We find the remarkable result that the magnitude difference between
the largest and second largest event in the future temporal cluster is
distributed according to the regular Gutenberg-Richter law that controls the
unconditional distribution of earthquake magnitudes. For earthquakes obeying
the Omori-Utsu law for the distribution of waiting times between triggering and
triggered events, we show that the distribution of the durations of temporal
clusters of events of magnitudes above some detection threshold ν has a power
law tail that is fatter in the non-critical regime than in the critical
case n=1. This paradoxical behavior can be rationalised from the fact that
generations of all orders cascade very fast in the critical regime and
accelerate the temporal decay of the cluster dynamics.
Comment: 45 pages, 15 figures
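The paper's results are analytical, obtained via generating probability functions, but the quantities it studies can be illustrated with a toy Monte Carlo sketch of ETAS-type triggering. All parameter values below (productivity, Gutenberg-Richter and Omori-Utsu exponents, main-shock magnitude) are illustrative assumptions chosen to keep the branching ratio subcritical.

```python
# Toy ETAS-type cluster simulation: direct offspring counts are Poisson with
# exponential productivity in the parent magnitude, offspring magnitudes follow
# the Gutenberg-Richter (exponential) law, and waiting times follow an
# Omori-Utsu density ~ 1/(t + c)^(1 + theta).  Illustrative parameters only.
import numpy as np

rng = np.random.default_rng(0)

BETA = np.log(10)         # Gutenberg-Richter exponent (b = 1)
ALPHA = 0.8 * np.log(10)  # productivity exponent
K, M0 = 0.05, 2.0         # productivity constant and reference magnitude
C, THETA = 0.01, 0.2      # Omori-Utsu parameters

def omori_times(n):
    # Inverse-CDF sampling of the normalized Omori-Utsu waiting-time density.
    u = rng.random(n)
    return C * ((1.0 - u) ** (-1.0 / THETA) - 1.0)

def simulate_cluster(m_main, max_events=10_000):
    """Return the (time, magnitude) pairs of all offspring of a main shock."""
    events, queue = [], [(0.0, m_main)]
    while queue and len(events) < max_events:
        t_par, m_par = queue.pop()
        n_child = rng.poisson(K * np.exp(ALPHA * (m_par - M0)))
        t_child = t_par + omori_times(n_child)
        m_child = M0 + rng.exponential(1.0 / BETA, n_child)
        for t, m in zip(t_child, m_child):
            events.append((t, m))
            queue.append((t, m))   # offspring trigger their own offspring
    return events

gaps, durations = [], []
for _ in range(2000):
    ev = simulate_cluster(m_main=6.0)
    if len(ev) >= 2:
        mags = np.sort([m for _, m in ev])[::-1]
        gaps.append(mags[0] - mags[1])      # largest minus second-largest offspring
        durations.append(max(t for t, _ in ev))

print("mean magnitude gap between two largest offspring:", np.mean(gaps))
print("median cluster duration:", np.median(durations))
```

Histogramming `gaps` and `durations` from such a simulation gives empirical counterparts to the distributions the paper characterizes analytically.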
Deep sound-field denoiser: optically-measured sound-field denoising using deep neural network
This paper proposes a deep sound-field denoiser, a deep neural network (DNN)-based
method for denoising optically measured sound-field images. Sound-field imaging
using optical methods has gained considerable attention due to its ability to
achieve high-spatial-resolution imaging of acoustic phenomena that conventional
acoustic sensors cannot accomplish. However, the optically measured sound-field
images are often heavily contaminated by noise because of the low sensitivity
of optical interferometric measurements to airborne sound. Here, we propose a
DNN-based sound-field denoising method. Time-varying sound-field image
sequences are decomposed into harmonic complex-amplitude images by using a
time-directional Fourier transform. The complex images are converted into
two-channel images consisting of real and imaginary parts and denoised by a
nonlinear-activation-free network. The network is trained on a sound-field
dataset obtained from numerical acoustic simulations with randomized
parameters. We compared the method with conventional ones, such as image
filters and a spatiotemporal filter, on numerical and experimental data. The
experimental data were measured by parallel phase-shifting interferometry and
holographic speckle interferometry. The proposed deep sound-field denoiser
significantly outperformed the conventional methods on both the numerical and
experimental data.
Comment: 13 pages, 8 figures, 2 tables
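The pre-processing step described above (time-directional Fourier transform, then stacking real and imaginary parts as two channels) can be sketched in a few lines. This is only an illustration of that step, not the authors' full pipeline; the array shapes and the choice of frequency bin are assumptions.

```python
# Sketch of the described pre-processing: a time-directional Fourier transform
# converts a sound-field image sequence into a harmonic complex-amplitude image,
# stored as a two-channel (real, imaginary) array ready for a denoising DNN.
import numpy as np

def to_complex_amplitude(frames, freq_bin):
    """frames: (T, H, W) real-valued sound-field images sampled in time.
    Returns a (2, H, W) array with the real and imaginary parts of the
    complex amplitude at the selected temporal frequency bin."""
    spectrum = np.fft.rfft(frames, axis=0)           # FFT along the time axis
    harmonic = spectrum[freq_bin]                    # (H, W) complex image
    return np.stack([harmonic.real, harmonic.imag])  # two-channel DNN input

# Example with synthetic data: a 64-frame sequence of 128x128 images.
frames = np.random.randn(64, 128, 128)
x = to_complex_amplitude(frames, freq_bin=5)
print(x.shape)  # (2, 128, 128)
```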
Selective Inference for Changepoint detection by Recurrent Neural Network
In this study, we investigate the quantification of the statistical
reliability of detected change points (CPs) in time series using a Recurrent
Neural Network (RNN). Thanks to its flexibility, an RNN has the potential to
effectively identify CPs in time series characterized by complex dynamics.
However, there is an increased risk of erroneously detecting random noise
fluctuations as CPs. The primary goal of this study is to rigorously control
the risk of false detections by providing theoretically valid p-values to the
CPs detected by RNN. To achieve this, we introduce a novel method based on the
framework of Selective Inference (SI). SI enables valid inferences by
conditioning on the event of hypothesis selection, thus mitigating selection
bias. In this study, we apply the SI framework to RNN-based CP detection, where
characterizing the complex process of RNN selecting CPs is our main technical
challenge. We demonstrate the validity and effectiveness of the proposed method
through artificial and real data experiments.
Comment: 41 pages, 16 figures
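The core idea of selective inference, conditioning the null distribution on the event that the detector selects the particular changepoint being tested, can be illustrated with a toy Monte Carlo sketch. This is not the paper's exact procedure (which characterizes the RNN selection event analytically); a simple cumulative-sum detector stands in for the RNN, and the conditional null is approximated by rejection sampling.

```python
# Toy illustration of selective inference for changepoint detection: the null
# distribution of the test statistic is built only from noise-only replicates
# in which the detector selects the same changepoint.  Everything here is an
# illustrative assumption, not the paper's method.
import numpy as np

rng = np.random.default_rng(1)

def detect_cp(x):
    """Return the split point maximizing the absolute mean difference."""
    n = len(x)
    csum = np.cumsum(x)
    k = np.arange(2, n - 1)
    left = csum[k - 1] / k
    right = (csum[-1] - csum[k - 1]) / (n - k)
    return int(k[np.argmax(np.abs(left - right))])

def statistic(x, k):
    return abs(x[:k].mean() - x[k:].mean())

# Synthetic series with a mean shift halfway through.
x = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(0.8, 1.0, 50)])
k_hat = detect_cp(x)
obs = statistic(x, k_hat)

# Selective null: keep only noise-only replicates where the detector picks k_hat.
null_stats, trials = [], 0
while len(null_stats) < 500 and trials < 200_000:
    trials += 1
    z = rng.normal(0.0, 1.0, len(x))
    if detect_cp(z) == k_hat:
        null_stats.append(statistic(z, k_hat))

p_selective = float(np.mean(np.array(null_stats) >= obs))
print(f"detected CP at {k_hat}, selective p-value ~ {p_selective:.3f}")
```

Comparing this conditional p-value with a naive one computed from all noise replicates shows how conditioning on the selection event removes the optimism introduced by testing a data-chosen changepoint.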
Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation
General-purpose audio representations learned through self-supervised learning have
demonstrated high performance in a variety of tasks. Although they can be
adapted to an application by fine-tuning, even higher performance can be
expected if the pre-training itself is specialized for the application. This paper
explores the challenges and solutions in specializing general-purpose audio
representations for a specific application using speech, a highly demanding
field, as an example. We enhance Masked Modeling Duo (M2D), a general-purpose
model, to close the performance gap with state-of-the-art (SOTA) speech models.
To do so, we propose a new task, denoising distillation, to learn from
fine-grained clustered features, and M2D for Speech (M2D-S), which jointly
learns the denoising distillation task and M2D masked prediction task.
Experimental results show that M2D-S performs comparably to or outperforms SOTA
speech models on the SUPERB benchmark, demonstrating that M2D can specialize in
a demanding field. Our code is available at:
https://github.com/nttcslab/m2d/tree/master/speech
Comment: Interspeech 2023; 5 pages, 2 figures, 6 tables, Code:
https://github.com/nttcslab/m2d/tree/master/speech
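The joint training described above, a masked-prediction objective combined with a denoising-distillation objective, can be sketched schematically as a weighted sum of two losses. The tensor shapes, the use of mean-squared error for both terms, and the weighting are placeholders; the actual M2D-S objective and models are in the linked repository.

```python
# Schematic sketch of jointly optimizing a masked-prediction loss and a
# distillation loss, in the spirit of the description above.  Shapes, losses,
# and weighting are illustrative assumptions, not the M2D-S implementation.
import torch
import torch.nn.functional as F

def combined_loss(pred_masked, target_masked, pred_distill, target_distill,
                  distill_weight=1.0):
    """pred_masked/target_masked: predictions and targets for masked patches
    (the M2D masked-prediction task).  pred_distill/target_distill: student
    features and teacher features (e.g. derived from clustered speech features)
    for the denoising-distillation task."""
    loss_m2d = F.mse_loss(pred_masked, target_masked)        # masked prediction
    loss_distill = F.mse_loss(pred_distill, target_distill)  # distillation
    return loss_m2d + distill_weight * loss_distill

# Example with random tensors standing in for model outputs.
loss = combined_loss(torch.randn(8, 96, 768), torch.randn(8, 96, 768),
                     torch.randn(8, 96, 768), torch.randn(8, 96, 768))
print(loss.item())
```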