99 research outputs found

    Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex

    This paper presents a new deep clustering (DC) method called manifold-aware DC (M-DC) that utilizes the embedding hyperspace more effectively than the original DC. The original DC is limited in that the embeddings of any two speakers must be orthogonal, because its loss function is based on one-hot vectors. Our method instead derives a unique loss function that maximizes the target angle in the hyperspace based on the properties of a regular simplex. The proposed loss imposes a higher penalty than the original DC when a speaker is assigned incorrectly. The change from DC to M-DC is achieved by rewriting just one term in the DC loss function, without any other modification to the network architecture or model parameters. The method is therefore highly practical, because it does not affect the original inference procedure. The experimental results show that the proposed method improves the performance of the original DC and of its expansion method. Comment: Accepted by Interspeech 202
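
    The contrast between one-hot targets and regular-simplex targets can be sketched as follows. This is a minimal illustration and not the authors' implementation: the helper names (`simplex_vertices`, `affinity_loss`) are hypothetical, and the loss is assumed to be the usual squared Frobenius distance between affinity matrices used in deep clustering. With C speakers, the simplex vertices have pairwise cosine -1/(C-1) rather than 0, so a wrong assignment is penalized more heavily.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def simplex_vertices(num_speakers):
    """Unit vectors pointing at the vertices of a regular simplex centred at the origin."""
    c = num_speakers
    vertices = []
    for i in range(c):
        # Centre the one-hot vector by subtracting the mean 1/c from every coordinate.
        v = [(1.0 if j == i else 0.0) - 1.0 / c for j in range(c)]
        norm = math.sqrt(dot(v, v))
        vertices.append([x / norm for x in v])
    return vertices


def one_hot_vertices(num_speakers):
    """One-hot target vectors, as in the original DC loss."""
    c = num_speakers
    return [[1.0 if j == i else 0.0 for j in range(c)] for i in range(c)]


def affinity_loss(embeddings, targets):
    """Squared Frobenius distance between the embedding and target affinity matrices."""
    total = 0.0
    for e_i, t_i in zip(embeddings, targets):
        for e_j, t_j in zip(embeddings, targets):
            total += (dot(e_i, e_j) - dot(t_i, t_j)) ** 2
    return total


# Two time-frequency bins that belong to different speakers but were
# (wrongly) mapped to the same embedding vector.
wrong_embeddings = [[1.0, 0.0], [1.0, 0.0]]
labels = [0, 1]

one_hot_targets = [one_hot_vertices(2)[k] for k in labels]
simplex_targets = [simplex_vertices(2)[k] for k in labels]

loss_one_hot = affinity_loss(wrong_embeddings, one_hot_targets)
loss_simplex = affinity_loss(wrong_embeddings, simplex_targets)
```

    In this two-speaker toy case the simplex targets sit at 180 degrees from each other instead of 90 degrees, so `loss_simplex` comes out larger than `loss_one_hot` for the same mistake.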

    An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification

    Although music is typically multi-label, many works have studied hierarchical music tagging in simplified settings such as single-label data. Moreover, a framework for describing the various joint training methods in the multi-label setting is lacking. To address these issues, we introduce the hierarchical multi-label music instrument classification task. The task provides a realistic setting in which multi-instrument real music data is assumed. Various hierarchical methods that jointly train a DNN are summarized and explored in the context of fusing deep learning with conventional techniques. For effective joint training in the multi-label setting, we propose two methods to model the connection between fine- and coarse-level tags: one uses rule-based grouped max-pooling, and the other uses an attention mechanism learned in a data-driven manner. Our evaluation reveals that the proposed methods have advantages over the method without joint training. In addition, the decision procedure within the proposed methods can be interpreted by visualizing attention maps or by referring to the fixed rules. Comment: To appear at ICASSP 202
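
    The rule-based grouped max-pooling mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the instrument names, the grouping, and the function name `grouped_max_pool` are hypothetical and not taken from the paper. Each coarse-level score is taken as the maximum of the fine-level scores in its group, so a coarse tag is active whenever at least one of its fine tags is.

```python
def grouped_max_pool(fine_probs, groups):
    """Derive coarse-level scores by max-pooling fine-level scores within fixed groups.

    fine_probs: dict mapping a fine tag to its predicted probability.
    groups: dict mapping a coarse tag to the list of fine tags it contains.
    """
    return {
        coarse: max(fine_probs[tag] for tag in members)
        for coarse, members in groups.items()
    }


# Hypothetical hierarchy and predictions, for illustration only.
groups = {
    "strings": ["violin", "cello"],
    "brass": ["trumpet"],
}
fine_probs = {"violin": 0.9, "cello": 0.2, "trumpet": 0.1}

coarse_probs = grouped_max_pool(fine_probs, groups)
```

    Because the grouping is a fixed rule, a coarse-level decision can be traced back to the fine tag that produced the maximum, which is one way to read the interpretability claim in the abstract.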

    Heterogeneous Impacts of Grazing Animals and Vegetational Change in Japanese Native Pastures

    Defoliation, defecation and trampling are the major ways in which grazing animals affect vegetation. Because these behaviors are unevenly distributed, they can have profound effects on vegetation. For extensive grazing systems in native pastures, an understanding of plant-animal interactions is vital for adequate control of vegetation and animal conditions and for sustainable use of natural resources. This paper reviews recent studies of grazing impacts on vegetation in Japanese native pastures. Most of the studies were carried out at the Kawatabi Field Science Center (Kawatabi FSC), Tohoku University. 1. Native pastures in the Kawatabi FSC are composed of 61-155 plant species, of which cattle graze upon 26-76 species. Among these species, Miscanthus sinensis (Japanese plume-grass) was the most frequently grazed by cattle. The spatial distribution of available forage is a major factor affecting diet selection and consumption by cattle. Such selective grazing results in a significant reduction of M. sinensis in native pastures. 2. Seed dispersal through defecation by grazing animals can also result in significant vegetational change. Recent studies have shown that Carex spp. is the major plant whose seeds are dispersed through defecation by animals rotationally grazed in a native and a sown pasture. The mechanisms of this seed dispersal and its possible effects on vegetational succession are discussed. 3. Heavy trampling is known to degrade vegetative ground cover. Our research has shown that trampling by cattle promotes the invasion of a shrub, Weigela hortensis, into Miscanthus-dominant pastures. Because the seeds of W. hortensis are light-sensitive germinators, trampling promotes their germination by removing ground cover. These findings provide new perspectives on plant-animal interactions in Japanese native pastures and help estimate the impact of animals on plant succession. They also contribute to efforts to ensure sustainable grazing use of pastures.

    Zero- and Few-shot Sound Event Localization and Detection

    Sound event localization and detection (SELD) systems estimate direction-of-arrival (DOA) and temporal activation for sets of target classes. Neural network (NN)-based SELD systems have performed well on various sets of target classes, but they only output the DOA and temporal activation of preset classes that are trained before inference. To customize target classes after training, we tackle zero- and few-shot SELD tasks, in which new classes are set with a text sample or a few audio samples. While zero-shot sound classification tasks are achievable with embeddings from contrastive language-audio pretraining (CLAP), zero-shot SELD tasks require assigning an activity and a DOA to each embedding, especially in overlapping cases. To tackle the assignment problem in overlapping cases, we propose an embed-ACCDOA model, which is trained to output a track-wise CLAP embedding and the corresponding activity-coupled Cartesian direction-of-arrival (ACCDOA). In our experimental evaluations on zero- and few-shot SELD tasks, the embed-ACCDOA model showed better location-dependent scores than a straightforward combination of the CLAP audio encoder and a DOA estimation model. Moreover, the proposed combination of the embed-ACCDOA model and the CLAP audio encoder with zero- or few-shot samples performed comparably to an official baseline system trained with complete training data in an evaluation dataset. Comment: 5 pages, 4 figure
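
    How an ACCDOA-style output couples activity and direction can be sketched as follows. This is a minimal sketch, not the paper's code: it assumes a single Cartesian vector (x, y, z) per track whose length encodes activity and whose direction encodes the DOA, and the function name `decode_accdoa` and the threshold of 0.5 are assumptions made for illustration.

```python
import math


def decode_accdoa(vector, threshold=0.5):
    """Decode one activity-coupled Cartesian DOA vector.

    Returns None if the vector length is at or below the (assumed) activity
    threshold, otherwise a tuple (azimuth_deg, elevation_deg).
    """
    x, y, z = vector
    activity = math.sqrt(x * x + y * y + z * z)
    if activity <= threshold:
        return None  # the track is treated as inactive
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation


# A long vector pointing along +y is an active source at 90 degrees azimuth.
active = decode_accdoa((0.0, 0.8, 0.0))
# A short vector is treated as an inactive track regardless of its direction.
inactive = decode_accdoa((0.1, 0.0, 0.0))
```

    In the embed-ACCDOA setting described above, each track would additionally carry an embedding to be compared with the CLAP embedding of the requested class; that matching step is not shown here.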

    STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

    This report presents the Sony-TAu Realistic Spatial Soundscapes 2022 (STARSS22) dataset for sound event localization and detection, comprising spatial recordings of real scenes collected in various interiors at two different sites. The dataset is captured with a high-resolution spherical microphone array and delivered in two 4-channel formats: first-order Ambisonics and tetrahedral microphone array. Sound events in the dataset belonging to 13 target sound classes are annotated both temporally and spatially through a combination of human annotation and optical tracking. The dataset serves as the development and evaluation dataset for Task 3 of the DCASE2022 Challenge on Sound Event Localization and Detection and introduces significant new challenges for the task compared to previous iterations, which were based on synthetic spatialized sound scene recordings. Dataset specifications are detailed, including the recording and annotation process, the target classes and their presence, and the development and evaluation splits. Additionally, the report presents the baseline system that accompanies the dataset in the challenge, with emphasis on the differences from the baseline of the previous iterations: namely, the introduction of the multi-ACCDOA representation to handle multiple simultaneous occurrences of events of the same class, and support for additional improved input features for the microphone array format. Results of the baseline indicate that, with a suitable training strategy, reasonable detection and localization performance can be achieved on real sound scene recordings. The dataset is available at https://zenodo.org/record/6387880

    A self-consistent first-principles calculation scheme for correlated electron systems

    A self-consistent calculation scheme for correlated electron systems is developed based on density-functional theory (DFT). Our scheme is a multi-reference DFT (MR-DFT) calculation in which the electron charge density is reproduced by an auxiliary interacting fermion system. A short-range Hubbard-type interaction is introduced in a rigorous manner, with a residual term for the exchange-correlation energy. The Hubbard term is determined uniquely by referencing the density fluctuation at a selected localized orbital. This strategy for extending the Kohn-Sham scheme provides a self-consistent electronic structure calculation for materials design. Introducing an approximation for the residual exchange-correlation energy functional, we obtain the LDA+U energy functional. Practical self-consistent calculations are exemplified by simulations of hydrogen systems, i.e. a molecule and a periodic one-dimensional array, which serve as a proof of the existence of the interaction strength U as a continuous function of the local fluctuation and the structural parameters of the system. Comment: 23 pages, 8 figures, to appear in J. Phys. Condens. Matte