94 research outputs found
Polyphonic Sound Event Detection by using Capsule Neural Networks
Artificial sound event detection (SED) has the aim to mimic the human ability
to perceive and understand what is happening in the surroundings. Nowadays,
Deep Learning offers valuable techniques for this goal such as Convolutional
Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has
been recently introduced in the image processing field with the intent to
overcome some of the known limitations of CNNs, specifically regarding the
scarce robustness to affine transformations (i.e., perspective, size,
orientation) and the detection of overlapped images. This motivated the authors
to employ CapsNets to deal with the polyphonic-SED task, in which multiple
sound events occur simultaneously. Specifically, we propose to exploit the
capsule units to represent a set of distinctive properties for each individual
sound event. Capsule units are connected through a so-called "dynamic routing"
that encourages learning part-whole relationships and improves the detection
performance in a polyphonic context. This paper reports extensive evaluations
carried out on three publicly available datasets, showing how the CapsNet-based
algorithm not only outperforms standard CNNs but also achieves the best
results with respect to state-of-the-art algorithms.
Multi-household energy management in a smart neighborhood in the presence of uncertainties and electric vehicles
The pathway toward the reduction of greenhouse gas emissions depends on increasing Renewable Energy Sources (RESs), demand response, and the electrification of public and private transportation. Energy management techniques are necessary to coordinate operation in this complex scenario, and in recent years several works on this topic have appeared in the literature. This paper presents a study on multi-household energy management for Smart Neighborhoods integrating RESs and electric vehicles participating in Vehicle-to-Home (V2H) and Vehicle-to-Neighborhood (V2N) programs. The Smart Neighborhood comprises multiple households, a parking lot with public charging stations, and an aggregator that coordinates energy transactions using a Multi-Household Energy Manager (MH-EM). The MH-EM jointly maximizes the profits of the aggregator and the households by using the augmented ɛ-constraint approach. The generated Pareto optimal solutions allow for different decision policies to balance the aggregator’s and households’ profits, prioritizing one of them or the RES energy usage within the Smart Neighborhood. The experiments have been conducted over an entire year considering uncertainties related to the energy price, electric vehicle usage, energy production of RESs, and energy demand of the households. The results show that the MH-EM optimizes the Smart Neighborhood operation and that the solution that maximizes the RES energy usage also provides the greatest benefits in terms of peak-shaving and valley-filling capability of the energy demand. Authors: Luca Serafini, Emanuele Principi, Susanna Spinsante, Stefano Squartini.
Automatic detection of cry sounds in neonatal intensive care units by using deep learning and acoustic scene simulation
Cry detection is an important facility in both residential and public environments, where it can meet the different needs of private and professional users. In this paper, we investigate the problem of cry detection in professional environments, such as Neonatal Intensive Care Units (NICUs). The aim of our work is to propose a cry detection method based on deep neural networks (DNNs) and to evaluate whether a properly designed synthetic dataset can replace on-field acquired data for training the DNN-based cry detector. In this way, a massive data collection campaign in NICUs can be avoided, and the cry detector can be easily retargeted to different NICUs. The paper presents different solutions based on single-channel and multi-channel DNNs. The experimental evaluation is conducted on a synthetic dataset created by simulating the acoustic scene of a real NICU, and on a real dataset containing audio acquired in the same NICU. The evaluation revealed that using real data in the training phase achieves the overall highest performance, with an Area Under Precision-Recall Curve (PRC-AUC) equal to 87.28%, when signals are processed with a beamformer and a post-filter and a single-channel DNN is used. The performance of the same method, however, drops to 70.61% when training is performed on the synthetic dataset. In contrast, under the same conditions, the new single-channel architecture introduced in this paper achieves the highest performance, with a PRC-AUC equal to 80.48%, thus showing that the acoustic scene simulation strategy can be used to train a cry detection method with positive results.
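As an illustration of the PRC-AUC metric quoted above, the following is a minimal sketch that approximates the area under the precision-recall curve by average precision (a step-wise sum). The function name `pr_auc` and the toy labels and scores are hypothetical, and the paper may compute the area differently (for example, by trapezoidal integration).

```python
def pr_auc(labels, scores):
    """Average precision: a step-wise estimate of the area under the
    precision-recall curve. labels are 0/1, scores are detector outputs.
    Illustrative only; tied scores are not treated specially."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank examples from the most to the least confident detection.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            # Precision at the rank where recall increases by 1/total_pos.
            area += true_pos / rank
    return area / total_pos


# Toy example: 1 = cry frame, 0 = non-cry frame (made-up values).
example = pr_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

A perfect ranking (all positives scored above all negatives) gives 1.0; the metric falls as negatives are ranked above positives.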
Improving knowledge distillation for non-intrusive load monitoring through explainability guided learning
Knowledge distillation (KD) is a machine learning technique widely used in recent years for the tasks of domain adaptation and complexity reduction. It relies on a Student-Teacher mechanism to transfer the knowledge of a large and complex Teacher network into a smaller Student model. Given the inherent complexity of large Deep Neural Network (DNN) models, and the need for deployment on edge devices with limited resources, complexity reduction techniques have become a hot topic in the Non-intrusive Load Monitoring (NILM) community. Recent literature in NILM has devoted increased effort to domain adaptation and architecture reduction via KD. However, the mechanism behind the transfer of knowledge from the Teacher to the Student is not clearly understood. In this work, we aim to address this issue by placing the KD NILM approach in a framework of explainable AI (XAI). We identify the main inconsistency in the transfer of explainable knowledge, and exploit this information to propose a method for improving KD through explainability guided learning. We evaluate our approach on a variety of appliances and domain adaptation scenarios and demonstrate that resolving inconsistencies in the transfer of explainable knowledge can lead to improved predictive performance.
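The Student-Teacher transfer described above can be sketched with the classic soft-target distillation loss. This is a generic formulation, not necessarily the loss used in the paper; the function names, the temperature, and the weighting factor `alpha` are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_class,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Soft term: KL(teacher || student) on temperature-softened outputs,
    scaled by temperature**2 as in the classic formulation.
    Hard term: cross-entropy of the student with the true class.
    All hyperparameter values here are illustrative assumptions.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[true_class])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


# A Student that already matches the Teacher incurs no soft-target penalty.
matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 2, alpha=1.0)
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], 2, alpha=1.0)
```

With `alpha=1.0` only the soft-target term remains, so the loss is zero when the Student reproduces the Teacher's logits and positive otherwise.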
Knowledge distillation for scalable non-intrusive load monitoring
Smart meters allow the grid to interface with individual buildings and extract detailed consumption information using Non-Intrusive Load Monitoring (NILM) algorithms applied to the acquired data. Deep Neural Networks, which represent the state of the art for NILM, are affected by scalability issues, since they require high computational and memory resources, and by reduced performance when training and target domains are mismatched. This paper proposes a knowledge distillation approach for NILM, in particular for multi-label appliance classification, to reduce model complexity and improve generalisation on unseen data domains. The approach uses weak supervision to reduce labelling effort, which is useful in practical scenarios. Experiments, conducted on the UK-DALE and REFIT datasets, demonstrated that a low-complexity network can be obtained for deployment on edge devices while maintaining high performance on unseen data domains. The proposed approach outperformed benchmark methods in unseen target domains, achieving an F1-score 0.14 higher than a benchmark model 78 times more complex.
A weakly supervised active learning framework for non-intrusive load monitoring
Energy efficiency is now at a critical point, with rising energy prices and decarbonisation of the residential sector to meet the global NetZero agenda. Non-Intrusive Load Monitoring is a software-based technique to monitor individual appliances inside a building from a single aggregate meter reading; recent approaches are based on supervised deep learning. Such approaches are affected by practical constraints related to labelled data collection, particularly when a pre-trained model is deployed in an unknown target environment and needs to be adapted to the new data domain. In this case, transfer learning is usually adopted and the end user is directly involved in the labelling process. Unlike previous literature, we propose a combined weakly supervised and active learning approach to reduce the quantity of data to be labelled and the end user's effort in providing the labels. We demonstrate the efficacy of our method by comparing it with a transfer learning approach based on weak supervision. Our method reduces the quantity of weakly annotated data required by up to 82.6–98.5% in four target domains while improving the appliance classification performance.
Optical constants modelling in silicon nitride membrane transiently excited by EUV radiation
We report on a set of transient optical reflectivity and transmissivity measurements performed on thin silicon nitride membranes excited by extreme ultraviolet (EUV) radiation from a free electron laser (FEL). Experimental data were acquired as a function of the membrane thickness, FEL fluence and probe polarization. The time dependence of the refractive index, retrieved using the Jones matrix formalism, encodes the dynamics of electron and lattice excitation following the FEL interaction. The observed dynamics are interpreted in the framework of a two-temperature model, which allows the relevant time scales and magnitudes of the processes to be extracted. We also found that, in order to explain the experimental data, thermo-optical effects and inter-band filling must be phenomenologically added to the model.
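For orientation, the textbook form of a two-temperature model, neglecting spatial diffusion, is sketched below. The symbols are assumptions introduced here and are not taken from the paper: $T_e$ and $T_l$ are the electron and lattice temperatures, $C_e$ and $C_l$ the corresponding heat capacities, $G$ the electron-lattice coupling constant, and $S(t)$ the energy deposited by the excitation pulse. The model actually fitted in the paper may include additional terms.

```latex
\[
\begin{aligned}
C_e \frac{dT_e}{dt} &= -G\,(T_e - T_l) + S(t),\\
C_l \frac{dT_l}{dt} &= G\,(T_e - T_l).
\end{aligned}
\]
```

In this form the electrons are heated by the pulse and then relax toward the lattice temperature on a time scale set by the ratio of the heat capacities to $G$.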
Task-Aware Separation for the DCASE 2020 Task 4 Sound Event Detection and Separation Challenge
Source Separation is often used as a pre-processing step in many signal-processing tasks. In this work we propose a novel approach for combined Source Separation and Sound Event Detection in which a Source Separation algorithm is used to enhance the performance of the Sound Event Detection back-end. In particular, we present a permutation-invariant training scheme for optimizing the Source Separation system directly with the back-end Sound Event Detection objective without requiring joint training or fine-tuning of the two systems. We show that such an approach has significant advantages over the more standard approach of training the Source Separation system separately using only a Source Separation based objective such as Scale-Invariant Signal-To-Distortion Ratio. On the 2020 Detection and Classification of Acoustic Scenes and Events Task 4 Challenge, our proposed approach outperforms the baseline source separation system by more than one percent in event-based macro F1 score on the development set, with significantly lower computational requirements.
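To make the two ingredients named above concrete, the following is a minimal sketch of the Scale-Invariant Signal-To-Distortion Ratio and of a permutation-invariant assignment between estimated and reference sources. Note the substitution: here SI-SDR serves as the score, which corresponds to the standard baseline objective, whereas the paper optimizes the Sound Event Detection back-end objective instead. Function names and the toy signals are hypothetical.

```python
import itertools
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length signals."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    scale = dot / (ref_energy + eps)
    # Project the estimate onto the reference to get the target component.
    target = [scale * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def best_permutation(estimates, references, score=si_sdr):
    """Try every estimate-to-reference assignment and keep the one with
    the highest mean score (the core idea of permutation invariance)."""
    best_perm, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(references))):
        mean = sum(score(estimates[i], references[j])
                   for i, j in enumerate(perm)) / len(perm)
        if mean > best_score:
            best_perm, best_score = perm, mean
    return best_perm, best_score


# Toy references and estimates; the estimates come out in swapped order.
refs = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
ests = [[0.1, 1.0, 0.0, -1.0], [1.0, 0.1, -1.0, 0.0]]
perm, mean_score = best_permutation(ests, refs)
```

Here `perm[i]` is the index of the reference matched to estimate `i`. The exhaustive search grows factorially with the number of sources, which is acceptable for the small source counts typical of separation front-ends.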
- …