329 research outputs found

    Weakly Labelled AudioSet Tagging with Attention Neural Networks

    Audio tagging is the task of predicting the presence or absence of sound classes within an audio clip. Previous work in audio tagging focused on relatively small datasets limited to recognising a small number of sound classes. We investigate audio tagging on AudioSet, which is a dataset consisting of over 2 million audio clips and 527 classes. AudioSet is weakly labelled, in that only the presence or absence of sound classes is known for each clip, while the onset and offset times are unknown. To address the weakly-labelled audio tagging problem, we propose attention neural networks as a way to attend to the most salient parts of an audio clip. We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging. We investigate attention neural networks modeled by different functions, depths and widths. Experiments on AudioSet show that the feature-level attention neural network achieves a state-of-the-art mean average precision (mAP) of 0.369, outperforming the best MIL method (0.317) and Google's deep neural network baseline (0.314). In addition, we discover that the audio tagging performance on AudioSet embedding features has a weak correlation with the number of training samples and the quality of labels of each sound class.
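
    The decision-level attention idea in the abstract above can be illustrated with a minimal sketch. It assumes one common formulation, in which each segment of a clip yields a per-class probability and a non-negative per-class attention score, and the clip-level prediction is the attention-weighted average over segments. The function name `attention_pool` and the toy numbers are hypothetical; this is not the authors' implementation.

```python
def attention_pool(segment_probs, attention_scores):
    """Aggregate per-segment class probabilities into clip-level probabilities.

    segment_probs[t][k]    : probability of class k in segment t, in [0, 1]
    attention_scores[t][k] : non-negative, unnormalised attention for class k
                             in segment t (assumed to sum to > 0 per class)
    """
    num_classes = len(segment_probs[0])
    clip_probs = []
    for k in range(num_classes):
        # Normalise attention over segments so the weights sum to 1 per class.
        total = sum(scores[k] for scores in attention_scores)
        weighted = sum(p[k] * a[k]
                       for p, a in zip(segment_probs, attention_scores))
        clip_probs.append(weighted / total)
    return clip_probs


# Toy clip: 3 segments, 2 classes (illustrative values only).
probs = [[0.9, 0.1], [0.2, 0.1], [0.1, 0.8]]
scores = [[3.0, 1.0], [0.5, 1.0], [0.5, 2.0]]
clip = attention_pool(probs, scores)
```

    With uniform attention scores the function reduces to plain average pooling over segments, which is one way to see the link to multiple instance learning that the abstract mentions.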

    Temporal Patterns in Fine Particulate Matter Time Series in Beijing: A Calendar View


    A comparison of hole-filling methods in 3D

    This paper presents a review of the most relevant current techniques that deal with hole-filling in 3D models. Contrary to earlier reports, which approach mesh repairing in a sparse and global manner, the objective of this review is twofold. First, a specific and comprehensive review of hole-filling techniques (as a relevant part in the field of mesh repairing) is carried out. We present a brief summary of each technique with attention paid to its algorithmic essence, main contributions and limitations. Second, a solid comparison between 34 methods is established. To do this, we define 19 possible meaningful features and properties that can be found in a generic hole-filling process. Then, we use these features to assess the virtues and deficiencies of each method and to build comparative tables. The purpose of this review is to make a comparative hole-filling state-of-the-art available to researchers, showing pros and cons in a common framework.
    Funding: Ministerio de Economía y Competitividad, project DPI2013-43344-R (I+D+i); Gobierno de Castilla-La Mancha, project PEII-2014-017-P.

    Planar hexagonal meshing for architecture


    Visual and Camera Sensors

    This book includes 13 papers published in the Special Issue "Visual and Camera Sensors" of the journal Sensors. The goal of this Special Issue was to invite high-quality, state-of-the-art research papers dealing with challenging issues in visual and camera sensors.

    From Capture to Display: A Survey on Volumetric Video

    Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity compared to traditional videos. Despite their potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework of volumetric video services, followed by a discussion of prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. Then we delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition.

    3D Facial Similarity Measure Based on Geodesic Network and Curvatures

    Automated 3D facial similarity measurement is a challenging and valuable research topic in anthropology and computer graphics. It is widely used in various fields, such as criminal investigation, kinship confirmation, and face recognition. This paper proposes a 3D facial similarity measure based on a combination of geodesic and curvature features. First, a geodesic network is generated for each face from its geodesics and iso-geodesics, and the network points serve as correspondences across face models. Then, four curvature-related metrics, namely the mean curvature, Gaussian curvature, shape index, and curvedness, are computed for each network point by using a weighted average of its neighborhood points. Finally, correlation coefficients for each of these metrics are computed as the similarity measures between two 3D face models. Experiments on 3D facial models of different persons and on different 3D facial models of the same person are conducted and compared with a subjective face similarity study. The results show that the geodesic network plays an important role in 3D facial similarity measurement. The similarity measure defined by the shape index is broadly consistent with human subjective evaluation, and it measures 3D face similarity more objectively than the other indices.
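
    As a rough illustration of the curvature metrics named in the abstract above, the sketch below computes them from the two principal curvatures at a point and then compares two faces with a Pearson correlation over corresponding points. It assumes Koenderink's shape index convention with range [-1, 1] (the sign depends on normal orientation), assumes principal curvatures are already available, and omits the neighborhood-weighted averaging the abstract describes. All function names and sample values are hypothetical.

```python
import math


def mean_curvature(k1, k2):
    return (k1 + k2) / 2.0


def gaussian_curvature(k1, k2):
    return k1 * k2


def shape_index(k1, k2):
    """Shape index in [-1, 1]; atan2 handles umbilic points (k1 == k2).

    Undefined for planar points (k1 == k2 == 0); this sketch returns 0 there.
    """
    hi, lo = max(k1, k2), min(k1, k2)
    return (2.0 / math.pi) * math.atan2(hi + lo, hi - lo)


def curvedness(k1, k2):
    return math.sqrt((k1 * k1 + k2 * k2) / 2.0)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Principal curvatures at corresponding network points on two faces
# (made-up values purely for illustration).
face_a = [(1.0, 0.5), (0.2, -0.3), (0.8, 0.8), (-0.4, -0.9)]
face_b = [(0.9, 0.6), (0.3, -0.2), (0.7, 0.7), (-0.5, -0.8)]
si_a = [shape_index(k1, k2) for k1, k2 in face_a]
si_b = [shape_index(k1, k2) for k1, k2 in face_b]
similarity = pearson(si_a, si_b)
```

    The same `pearson` call can be applied to the mean curvature, Gaussian curvature, or curvedness values to obtain the other three per-metric similarity scores.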