Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices
Running machine learning (ML) algorithms on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, the limited resources available on the embedded platform, and the application budget (i.e., real-time requirements, power constraints, etc.). This has required the development of dedicated solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements.
This dissertation contributes to the construction of the Edge Learning Machine (ELM) environment, a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present a performance analysis of state-of-the-art shallow ML algorithms, including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows performance comparable to a desktop machine and highlights the impact of these factors on the overall result. Second, despite the common assumption that the scarcity of resources limits TinyML to model inference only, we go a step further and enable self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves accuracy comparable to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural network (CNN) layers that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy or latency. Moreover, e-skin systems share the main requirements of the TinyML field: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions.
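The abstract does not detail the memory allocation strategy itself. As a hedged illustration of the general idea behind activation-buffer planning for sequential CNNs (an assumed, textbook-style scheme, not the dissertation's actual algorithm, and with illustrative buffer sizes), note that when layers execute one at a time only the current layer's input and output buffers must be live:

```python
# Hedged sketch: in a purely sequential CNN, only the input and output
# buffers of the layer currently executing must be live, so peak RAM is
# max(in_i + out_i) over layers rather than the sum of all activations.
# Buffer sizes below are illustrative, not taken from the dissertation.

def naive_footprint(acts):
    # keep every activation buffer alive simultaneously
    return sum(acts)

def reuse_footprint(acts):
    # keep only the consecutive input/output pair of the running layer
    return max(a + b for a, b in zip(acts, acts[1:]))

activations = [3072, 8192, 4096, 2048, 512, 10]  # bytes per layer output
print(naive_footprint(activations))   # 17930
print(reuse_footprint(activations))   # 12288
```

Under this assumption, the peak footprint is set by the single worst adjacent pair, which is why reordering or tiling the widest layers pays off disproportionately on microcontrollers.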
A major contribution of the thesis is CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state-of-the-art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, whose results are reported and discussed. The ELM framework is open source, and this work is becoming a useful, versatile toolkit for the IoT and TinyML research and development community.
Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on applications that combine synthetic aperture radar (SAR) and deep learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar is an important active microwave imaging sensor, whose all-day and all-weather operating capability gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is therefore valuable and meaningful to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has driven significant progress in the computer vision community, e.g., in face recognition, driverless vehicles, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations at multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to tackle the above challenges and present their innovative and cutting-edge research results when applying deep learning to SAR, in various manuscript types, e.g., articles, letters, reviews, and technical reports.
A reduced order modeling methodology for the parametric estimation and optimization of aviation noise
The successful mitigation of aviation noise is one of the key enablers of sustainable aviation growth. Technological improvements for noise reduction at the source have been countered by an increasing number of operations at most airports. Aviation noise has several consequences, including direct health effects, effects on human and non-human environments, and economic costs. Several mitigation strategies exist, including reduction of noise at the source, land-use planning and management, noise abatement operational procedures, and operating restrictions. Most noise management programs at airports use a combination of such measures. To assess the efficacy of noise mitigation measures, a robust modeling and simulation capability is required. Due to the large number of factors that can influence aviation noise metrics, current state-of-the-art tools rely on physics-based and semi-empirical models. These models accurately predict noise metrics in a wide range of scenarios; however, they are computationally expensive to evaluate. Therefore, current noise mitigation studies are limited to singular applications such as annual average day noise quantification. Many-query applications such as parametric trade-off analyses and optimization remain elusive with the current generation of tools and methods.
Several efforts documented in the literature attempt to speed up the process using surrogate models. Techniques include the use of pre-computed noise grids with calibration models for non-standard conditions. These techniques are typically predicated on simplifying assumptions that greatly limit the applicability of the resulting models; the assumptions are needed to reduce the number of influencing factors to be modeled and make the problem tractable. Existing efforts also suffer from the inclusion of categorical variables for operational profiles, which are not conducive to surrogate modeling.
In this research, a methodology is developed to address the inherent complexities of the noise quantification process and thus enable rapid noise modeling capabilities that can facilitate parametric trade-off analysis and optimization efforts. To achieve this objective, a research plan is developed and executed to address two major gaps in the literature. First, a parametric representation of operational profiles is proposed to replace existing categorical descriptions. A technique is developed to allow real-world flight data to be efficiently mapped onto this parametric definition. A trajectory clustering method is used to group similar flights, and representative flights are parametrized using an inverse map of an aircraft performance model. Next, a field surrogate modeling method is developed based on Model Order Reduction techniques to reduce the high dimensionality of the computed noise metric results. This greatly reduces the complexity of the data to be modeled and thus enables rapid noise quantification. With these two gaps addressed, the overall methodology is developed for rapid noise quantification and optimization. The methodology is demonstrated on a case study in which a large number of real-world flight trajectories are efficiently modeled for their noise results. As each such flight trajectory has a unique representation and typically lacks thrust information, such noise modeling is not computationally feasible with existing methods and tools. The developed parametric representations and field surrogate modeling capabilities enable such an application.
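The Model Order Reduction step described above is commonly realized with proper orthogonal decomposition (POD). As a hedged sketch under that assumption (synthetic data, not the dissertation's noise fields or actual method), snapshot noise fields are stacked as columns, a truncated SVD extracts a low-dimensional basis, and only the few modal coefficients per flight need to be surrogate-modeled:

```python
import numpy as np

# Hedged POD sketch: stack precomputed "noise field" snapshots as columns,
# take a truncated SVD, and represent each snapshot by r modal coefficients
# instead of n_grid values. Snapshots here are synthetic, constructed to lie
# (almost) in a 3-dimensional subspace.
rng = np.random.default_rng(0)
n_grid, n_snapshots, r = 2000, 50, 3

basis = rng.standard_normal((n_grid, r))
coeffs = rng.standard_normal((r, n_snapshots))
snapshots = basis @ coeffs + 1e-6 * rng.standard_normal((n_grid, n_snapshots))

U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
U_r = U[:, :r]                        # reduced basis (POD modes)
reduced = U_r.T @ snapshots           # r coefficients per snapshot
reconstruction = U_r @ reduced
rel_err = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
print(rel_err < 1e-4)                 # True: 3 modes capture the fields
```

The surrogate then maps profile parameters to the `reduced` coefficients, which is a far easier regression target than the full grid.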
Mesoscopic Physics of Quantum Systems and Neural Networks
We study three different kinds of mesoscopic systems – systems in the intermediate region between macroscopic and microscopic scales, consisting of many interacting constituents:
We consider particle entanglement in one-dimensional chains of interacting fermions. Employing a field-theoretical bosonization calculation, we obtain the one-particle entanglement entropy in the ground state and its time evolution after an interaction quantum quench, which causes relaxation towards non-equilibrium steady states. By pushing the boundaries of numerical exact diagonalization and density matrix renormalization group computations, we are able to accurately scale to the thermodynamic limit, where we make contact with the analytic field theory model. This allows us to fix an interaction cutoff required in the continuum bosonization calculation to account for the short-range interaction of the lattice model, such that the bosonization result provides accurate predictions for the one-body reduced density matrix in the Luttinger liquid phase.
Establishing a better understanding of how to control entanglement in mesoscopic systems is also crucial for building qubits for a quantum computer. We further study a popular scalable qubit architecture that is based on Majorana zero modes (MZMs) in topological superconductors. The two major challenges with realizing Majorana qubits currently lie in trivial pseudo-Majorana states that mimic signatures of the topological bound states, and in strong disorder in the proposed topological hybrid systems that destroys the topological phase. We study coherent transport through interferometers with a Majorana wire embedded in one arm.
By combining analytical and numerical considerations, we explain the occurrence of an amplitude maximum as a function of the Zeeman field at the onset of the topological phase – a signature unique to MZMs – which has recently been measured experimentally [Whiticar et al., Nature Communications, 11(1):3212, 2020]. By placing an array of gates in proximity to the nanowire, we make a fruitful connection to the field of machine learning, using the CMA-ES algorithm to tune the gate voltages so as to maximize the amplitude of coherent transmission. We find that the algorithm is capable of learning disorder profiles and can even restore Majorana modes that were fully destroyed by strong disorder, by optimizing a feasible number of gates.
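The gate-tuning loop described above can be sketched as black-box maximization. The toy below uses a simplified (mu, lambda) evolution strategy with an isotropic Gaussian rather than full CMA-ES (which additionally adapts a covariance matrix and step size), and the objective is an illustrative stand-in, not the physical interferometer model:

```python
import numpy as np

# Hedged sketch: maximize a toy "transmission amplitude" over 8 gate
# voltages with a simplified (mu, lambda) evolution strategy. The real
# work uses CMA-ES and a physical transport simulation; both are replaced
# by simple stand-ins here.
rng = np.random.default_rng(1)
target = rng.uniform(-1.0, 1.0, size=8)       # unknown "optimal" voltages

def amplitude(v):
    # toy objective: amplitude peaks when voltages compensate the disorder
    return np.exp(-np.sum((v - target) ** 2))

mean, sigma, lam, mu = np.zeros(8), 0.5, 16, 4
for _ in range(200):
    pop = mean + sigma * rng.standard_normal((lam, 8))   # sample candidates
    fitness = np.array([amplitude(v) for v in pop])
    elite = pop[np.argsort(fitness)[-mu:]]               # best mu candidates
    mean = elite.mean(axis=0)                            # recombination
    sigma *= 0.97                                        # simple step-size decay

print(amplitude(mean))    # close to 1: near-unit amplitude recovered
```

Each "fitness evaluation" corresponds to one simulated (or measured) transmission amplitude at a given gate configuration, which is what makes sample-efficient strategies like CMA-ES attractive for a feasible number of gates.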
Deep neural networks are another popular machine learning approach, one which not only has many direct applications to physical systems but also behaves similarly to physical mesoscopic systems. To understand the effects of the complex training dynamics, we employ Random Matrix Theory (RMT) as a zero-information hypothesis: before training, the weights are randomly initialized and therefore perfectly described by RMT; after training, we attribute deviations from the RMT predictions to the information learned in the weight matrices.
Conducting a careful numerical analysis, we verify that the spectra of weight matrices consist of a random bulk plus a few large singular values whose corresponding vectors carry almost all of the learned information. By adding label noise to the training data, we further find that additional singular values in the intermediate part of the spectrum contribute by fitting the randomly labeled images. Based on these observations, we propose a noise filtering algorithm that both removes the singular values storing the noise and reverts the level repulsion that the random bulk exerts on the large singular values.
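The core of this picture can be sketched with plain singular-value truncation (a hedged simplification: the actual algorithm also corrects the level repulsion of the retained singular values, which is omitted here, and the matrix is synthetic rather than a trained network's weights):

```python
import numpy as np

# Hedged sketch: model a trained weight matrix as a few information-carrying
# singular values on top of a random bulk; zeroing the bulk keeps the
# learned structure. Synthetic rank-1 "signal" + Gaussian bulk.
rng = np.random.default_rng(0)
n = 256
signal = 30.0 * np.outer(rng.standard_normal(n), rng.standard_normal(n)) / n
bulk = rng.standard_normal((n, n)) / np.sqrt(n)   # bulk singular values <~ 2
W = signal + bulk

U, s, Vt = np.linalg.svd(W)
k = int(np.sum(s > 2.5))            # outliers above the bulk edge
W_filtered = (U[:, :k] * s[:k]) @ Vt[:k]

err_before = np.linalg.norm(W - signal)
err_after = np.linalg.norm(W_filtered - signal)
print(k)                            # 1 outlier singular value
print(err_after < err_before)      # True: filtered matrix is closer to signal
```

The threshold 2.5 plays the role of the Marchenko–Pastur bulk edge (≈2 for this normalization); singular values above it are treated as learned information, the rest as noise.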
Proceedings of the 8th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2023)
This volume gathers the papers presented at the Detection and Classification of Acoustic Scenes and Events 2023 Workshop (DCASE2023), held in Tampere, Finland, on 21–22 September 2023.
LIPIcs, Volume 261, ICALP 2023, Complete Volume
SuperCDMS HVeV Run 2 Low-Mass Dark Matter Search, Highly Multiplexed Phonon-Mediated Particle Detector with Kinetic Inductance Detector, and the Blackbody Radiation in Cryogenic Experiments
There is ample evidence of dark matter (DM), a phenomenon responsible for ≈ 85% of the matter content of the Universe that cannot be explained by the Standard Model (SM). One of the most compelling hypotheses is that DM consists of beyond-SM particle(s) that are nonluminous and nonbaryonic. So far, numerous efforts have been made to search for particle DM, and yet none has yielded an unambiguous observation of DM particles.
We present in Chapter 2 the SuperCDMS HVeV Run 2 experiment, in which we search for DM in the mass ranges of 0.5–10⁴ MeV/c² for electron-recoil DM and 1.2–50 eV/c² for the dark photon and the axion-like particle (ALP). SuperCDMS utilizes cryogenic crystals as detectors to search for DM interactions with the crystal atoms. The interaction is detected in the form of recoil energy mediated by phonons. In the HVeV project, we look for electron recoils, enhancing the signal via the Neganov-Trofimov-Luke effect under high-voltage biases. This technique enabled us to detect quantized e⁻h⁺ creation at a 3% ionization energy resolution. Our work is the first DM search analysis to consider charge trapping and impact ionization effects in solid-state detectors. We report our results as upper limits for the assumed particle models as functions of DM mass: they exclude DM-electron scattering cross sections, dark photon kinetic mixing parameters, and ALP axioelectric couplings above 8.4 × 10⁻³⁴ cm², 3.3 × 10⁻¹⁴, and 1.0 × 10⁻⁹, respectively.
Currently, every SuperCDMS detector is equipped with a few phonon sensors based on transition-edge sensor (TES) technology. To improve the background rejection performance of phonon-mediated particle detectors, we are developing highly multiplexed detectors that use kinetic inductance detectors (KIDs) as phonon sensors. This work is detailed in Chapters 3 and 4. We have improved our previous KID and readout line designs, which enabled us to produce our first ø3" detector with 80 phonon sensors. The detector yielded a frequency placement accuracy of 0.07%, indicating our capability of implementing hundreds of phonon sensors in a typical SuperCDMS-style detector. We detail our fabrication technique for simultaneously employing Al and Nb in the KID circuit. We explain our signal model, which covers extracting the RF signal, calibrating it into pair-breaking energy, and detecting pulses. We summarize our noise conditions and develop models for the different noise sources. We combine the signal and noise models into an energy resolution model for KID-based phonon-mediated detectors. From this model, we propose strategies to further improve the energy resolution of future detectors and introduce our ongoing implementations.
Blackbody (BB) radiation is one of the plausible background sources responsible for the low-energy background currently preventing low-threshold DM experiments from searching lower DM mass ranges. In Chapter 5, we present our study of this background for cryogenic experiments. We have developed physical models and, based on them, simulation tools for BB radiation propagation as photons or waves. We have also developed a theoretical model for the interaction of BB photons with semiconductor impurities, one of the possible channels generating the leakage current background in SuperCDMS-style detectors. We have planned an experiment to calibrate our simulation and leakage current generation models. For this experiment, we have developed a specialized ``mesh TES'' photon detector inspired by cosmic microwave background experiments. We present its sensitivity model, the radiation source developed for the calibration, and the general plan of the experiment.
Elastic shape analysis of geometric objects with complex structures and partial correspondences
In this dissertation, we address the development of elastic shape analysis frameworks for the registration, comparison and statistical shape analysis of geometric objects with complex topological structures and partial correspondences. In particular, we introduce a variational framework and several numerical algorithms for the estimation of geodesics and distances induced by higher-order elastic Sobolev metrics on the space of parametrized and unparametrized curves and surfaces. We extend our framework to the setting of shape graphs (i.e., geometric objects with branching structures where each branch is a curve) and surfaces with complex topological structures and partial correspondences. To do so, we leverage the flexibility of varifold fidelity metrics in order to augment our geometric objects with a spatially-varying weight function, which in turn enables us to indirectly model topological changes and handle partial matching constraints via the estimation of vanishing weights within the registration process. In the setting of shape graphs, we prove the existence of solutions to the relaxed registration problem with weights, which is the main theoretical contribution of this thesis. In the setting of surfaces, we leverage our surface matching algorithms to develop a comprehensive collection of numerical routines for the statistical shape analysis of sets of 3D surfaces, which includes algorithms to compute Karcher means, perform dimensionality reduction via multidimensional scaling and tangent principal component analysis, and estimate parallel transport across surfaces (possibly with partial matching constraints).
Moreover, we also address the development of numerical shape analysis pipelines for large-scale data-driven applications with geometric objects. Towards this end, we introduce a supervised deep learning framework to compute the square-root velocity (SRV) distance for curves. Our trained network provides fast and accurate estimates of the SRV distance between pairs of geometric curves, without the need to find optimal reparametrizations. As a proof of concept for the suitability of such approaches in practical contexts, we use it to perform optical character recognition (OCR), achieving comparable performance in terms of computational speed and accuracy to other existing OCR methods.
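For reference, the quantity the network is trained to approximate can be computed directly for fixed parametrizations. The sketch below implements the square-root velocity transform q(t) = c'(t)/√|c'(t)| and the L2 distance between two discretized planar curves; the expensive part that the deep learning framework sidesteps, optimizing over reparametrizations, is deliberately omitted (curves and discretization are illustrative):

```python
import numpy as np

# Hedged sketch: SRV transform and the (reparametrization-free) L2 distance
# between two fixed parametrizations of planar curves.
def srv(curve, t):
    d = np.gradient(curve, t, axis=0)                  # c'(t)
    speed = np.linalg.norm(d, axis=1, keepdims=True)
    return d / np.sqrt(np.maximum(speed, 1e-12))       # guard zero speed

def srv_distance(c1, c2, t):
    diff = srv(c1, t) - srv(c2, t)
    dt = t[1] - t[0]
    return np.sqrt(np.sum(diff ** 2) * dt)             # discrete L2 norm

t = np.linspace(0.0, 1.0, 200)
line = np.stack([t, np.zeros_like(t)], axis=1)                      # segment
arc = np.stack([np.cos(np.pi * t), np.sin(np.pi * t)], axis=1)      # half circle

print(srv_distance(line, line, t))   # 0.0: identical curves
print(srv_distance(line, arc, t) > 0.0)   # True
```

The full SRV distance additionally minimizes over reparametrizations of one curve, which is the combinatorially expensive step a trained network can bypass.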
Lastly, we address the difficulty of extracting high-quality shape structures from imaging data in the field of astronomy. To do so, we present a state-of-the-art expectation-maximization approach for the challenging task of multi-frame astronomical image deconvolution and super-resolution. We leverage this approach to obtain a high-fidelity reconstruction of the night sky, from which high-quality shape data can be extracted using appropriate segmentation and photometric techniques.
Eddy current defect response analysis using sum of Gaussian methods
This dissertation studies methods to automatically detect and approximate eddy current differential coil defect signatures as a summed collection of Gaussian functions (SoG). Datasets with varying material, defect size, inspection frequency, and coil diameter were investigated. Dimensionally reduced representations of the defect responses were obtained using common existing reduction methods and novel SoG-based enhancements to them. The efficacy of the SoG-enhanced representations was studied using common interpretable machine learning (ML) classifier designs, with the SoG representations showing significant improvement on common analysis metrics.
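A hedged sketch of the SoG idea: if the Gaussian centers and widths are fixed on a grid, fitting the amplitudes reduces to linear least squares (the dissertation fits full SoG parameters to real coil responses; the two-lobe "signature" below is synthetic and the grid-of-centers simplification is an assumption):

```python
import numpy as np

# Hedged sketch: approximate a differential-coil-like two-lobe signature
# as a sum of Gaussians with fixed centers/width, solving only for
# amplitudes via linear least squares. Signal is synthetic.
x = np.linspace(0.0, 1.0, 400)
signature = (np.exp(-((x - 0.35) / 0.05) ** 2)
             - 0.8 * np.exp(-((x - 0.6) / 0.07) ** 2))

centers = np.linspace(0.0, 1.0, 25)
width = 0.05
design = np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)  # basis matrix

amps, *_ = np.linalg.lstsq(design, signature, rcond=None)
residual = np.linalg.norm(design @ amps - signature) / np.linalg.norm(signature)
print(residual < 0.05)   # True: 25 Gaussians reproduce the two-lobe signature
```

The 25 amplitudes then serve as a compact, dimensionally reduced representation of the 400-sample response, which is the kind of feature vector the classifiers operate on.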
Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions. It can be tackled with multiple hypotheses frameworks, but these are difficult to combine efficiently in a learning model. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. The predictors are regression models of any type that can form centroidal Voronoi tessellations, which are a function of their losses during training. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution, and that this is equivalent to interpolating the meta-loss of the predictors, the loss being a zero set of the interpolation error. The model has a fixed-point iteration algorithm between the predictors and the centers of the basis functions. Diversity in learning can be controlled parametrically by truncating the tessellation formation with the losses of individual predictors. A closed-form least-squares solution is presented which, to the authors' knowledge, is the fastest solution in the literature for multiple hypotheses and structured predictions. Superior generalization performance and computational efficiency are achieved using only two-layer neural networks as predictors, with controlled diversity as a key component of success. A gradient-descent approach is introduced that is loss-agnostic with respect to the predictors. The expected value of the loss of the structured model with Gaussian basis functions is computed, showing that correlation between predictors is not an appropriate tool for diversification. Experiments show that the model outperforms the top competitors in the literature.
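The closed-form step alone can be sketched as follows. This is the standard least-squares fit of an RBF network's output weights given fixed Gaussian centers, offered as a hedged baseline; the paper's structured model (Voronoi tessellation of hypotheses, meta-loss interpolation, fixed-point iteration) is not reproduced, and the data are synthetic:

```python
import numpy as np

# Hedged sketch: closed-form (least-squares) output weights of a Gaussian
# RBF network with fixed centers, fit to noisy samples of sin(x).
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 300)[:, None]
y = np.sin(x[:, 0]) + 0.01 * rng.standard_normal(300)

centers = np.linspace(-3.0, 3.0, 20)[None, :]
phi = np.exp(-((x - centers) ** 2) / (2 * 0.4 ** 2))   # Gaussian feature matrix
w, *_ = np.linalg.lstsq(phi, y, rcond=None)            # closed-form solution

rmse = np.sqrt(np.mean((phi @ w - y) ** 2))
print(rmse < 0.05)   # True: sine recovered from noisy samples
```

In the structured model, each basis function would additionally be tied to a hypothesis predictor and its loss; here the single least-squares solve illustrates why the closed form is cheap: it is one linear system in the number of centers.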