Probabilistic Point Cloud Modeling via Self-Organizing Gaussian Mixture Models
This letter presents a continuous probabilistic modeling methodology for
spatial point cloud data using finite Gaussian mixture models (GMMs), where the
number of components is adapted to the complexity of the scene. Few
hierarchical or adaptive methods have been proposed to address the challenge
of balancing model fidelity with size; instead, state-of-the-art mapping
approaches require tuning parameters for specific use cases and do not
generalize across diverse environments. To address this gap, we utilize a
self-organizing principle from information-theoretic learning to automatically
adapt the complexity of the GMM based on the relevant information in the
sensor data. The approach is evaluated against existing point cloud modeling
techniques on real-world data with varying degrees of scene complexity.
Comment: 8 pages, 6 figures, to appear in IEEE Robotics and Automation Letters
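The adaptive component count described above can be illustrated generically. The snippet below is not the paper's information-theoretic self-organizing rule; it uses an ordinary BIC criterion over a hand-rolled 1-D EM (the synthetic data and all names are illustrative) to show how the data themselves can select the number of Gaussian components.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "scene": two well-separated spatial clusters along one axis.
data = np.concatenate([rng.normal(-3.0, 0.5, 200), rng.normal(2.0, 0.8, 200)])

def fit_gmm_1d(x, k, iters=50):
    """Minimal EM for a 1-D GMM; returns weights, means, variances, log-likelihood."""
    n = x.size
    means = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: per-point responsibilities of each component.
        dens = w * np.exp(-0.5 * (x[:, None] - means) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances.
        nk = r.sum(axis=0)
        w = nk / n
        means = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    ll = np.log(dens.sum(axis=1)).sum()
    return w, means, var, ll

def bic(ll, k, n):
    # A 1-D GMM with k components has 3k - 1 free parameters.
    return (3 * k - 1) * np.log(n) - 2 * ll

scores = {k: bic(fit_gmm_1d(data, k)[3], k, data.size) for k in range(1, 5)}
best_k = min(scores, key=scores.get)  # model complexity chosen by the data
```

Here BIC picks two components for the two-cluster data; the letter's self-organizing criterion plays an analogous role without sweeping over candidate model sizes.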
Incremental Multimodal Surface Mapping via Self-Organizing Gaussian Mixture Models
This letter describes an incremental multimodal surface mapping methodology,
which represents the environment as a continuous probabilistic model. This
model enables high-resolution reconstruction while simultaneously compressing
spatial and intensity point cloud data. The strategy employed in this work
utilizes Gaussian mixture models (GMMs) to represent the environment. While
prior GMM-based mapping works have developed methodologies to determine the
number of mixture components using information-theoretic techniques, these
approaches either operate on individual sensor observations, making them
unsuitable for incremental mapping, or are not real-time viable, especially for
applications where high-fidelity modeling is required. To bridge this gap, this
letter introduces a spatial hash map for rapid GMM submap extraction combined
with an approach to determine relevant and redundant data in a point cloud.
These contributions increase computational speed by an order of magnitude
compared to state-of-the-art incremental GMM-based mapping. In addition, the
proposed approach yields a superior tradeoff in map accuracy and size when
compared to state-of-the-art mapping methodologies (both GMM- and not
GMM-based). Evaluations are conducted using both simulated and real-world data.
The software is released open-source to benefit the robotics community.
Comment: 7 pages, 7 figures, under review at IEEE Robotics and Automation Letters
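The spatial hash map mentioned for rapid submap extraction can be sketched in a few lines. This toy version (the cell size, class, and method names are illustrative, not from the released software) maps quantized 3-D coordinates to the set of submaps covering each cell, giving constant-time lookup of the submaps relevant to an incoming sensor point.

```python
import math
from collections import defaultdict

CELL = 2.0  # metres per hash cell (illustrative choice)

def cell_key(p):
    """Quantize a 3-D point into the integer grid cell used as the hash key."""
    return tuple(math.floor(c / CELL) for c in p)

class SubmapHash:
    """Toy spatial hash: each cell stores the ids of the submaps that cover it."""
    def __init__(self):
        self.cells = defaultdict(set)

    def insert(self, submap_id, points):
        for p in points:
            self.cells[cell_key(p)].add(submap_id)

    def query(self, p):
        # O(1) lookup of the submaps relevant to a point.
        return self.cells.get(cell_key(p), set())

h = SubmapHash()
h.insert(0, [(0.1, 0.2, 0.3), (1.9, 0.0, 0.0)])
h.insert(1, [(4.5, 0.0, 0.0)])
hits = h.query((0.4, 0.1, 0.2))  # falls in the cell covered by submap 0
```

Only the submaps returned by `query` need to be updated for a new observation, which is the kind of pruning that makes incremental GMM fusion fast.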
Context-Sensitive Model Hierarchies for Quantifying Higher-Dimensional Uncertainty
We formulate four novel context-aware algorithms based on model hierarchies, aimed at enabling efficient quantification of uncertainty in complex, computationally expensive problems such as fluid-structure interaction and plasma microinstability simulations. Our results show that our algorithms are more efficient than standard approaches and that they are able to cope with the challenges of quantifying uncertainty in higher-dimensional, complex problems.
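The efficiency argument behind model hierarchies can be illustrated with the simplest two-level case: many evaluations of a cheap low-fidelity model, corrected by a few high-fidelity runs. This is a generic control-variate sketch (the models and sample sizes are invented for illustration), not one of the four context-aware algorithms from the thesis.

```python
import random

random.seed(0)

def f_hi(x):
    """'Expensive' high-fidelity model (stand-in for a full simulation)."""
    return x ** 3 + 0.5 * x

def f_lo(x):
    """Cheap low-fidelity surrogate, strongly correlated with f_hi."""
    return x ** 3

N_LO, N_HI = 20000, 200  # many cheap runs, few expensive ones

xs_lo = [random.uniform(0, 1) for _ in range(N_LO)]
xs_hi = [random.uniform(0, 1) for _ in range(N_HI)]

mean_lo = sum(f_lo(x) for x in xs_lo) / N_LO         # cheap Monte Carlo estimate
corr = sum(f_hi(x) - f_lo(x) for x in xs_hi) / N_HI  # high-fidelity correction
estimate = mean_lo + corr

# True mean of x^3 + 0.5x on [0, 1] is 1/4 + 1/4 = 0.5.
```

Because the correction term has much smaller variance than the quantity itself, 200 expensive runs suffice where a single-fidelity estimator would need thousands.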
GMMap: Memory-Efficient Continuous Occupancy Map Using Gaussian Mixture Model
Energy consumption of memory accesses dominates the compute energy in
energy-constrained robots, which require a compact 3D map of the environment to
achieve autonomy. Recent mapping frameworks have focused only on reducing the map
size while incurring significant memory usage during map construction due to
multi-pass processing of each depth image. In this work, we present a
memory-efficient continuous occupancy map, named GMMap, that accurately models
the 3D environment using a Gaussian Mixture Model (GMM). Memory-efficient GMMap
construction is enabled by the single-pass compression of depth images into
local GMMs which are directly fused together into a globally-consistent map. By
extending Gaussian Mixture Regression to model unexplored regions, occupancy
probability is directly computed from Gaussians. Using a low-power ARM Cortex
A57 CPU, GMMap can be constructed in real-time at up to 60 images per second.
Compared with prior works, GMMap maintains high accuracy while reducing the map
size by at least 56%, memory overhead by at least 88%, DRAM access by at least
78%, and energy consumption by at least 69%. Thus, GMMap enables real-time 3D
mapping on energy-constrained robots.
Comment: 15 pages, 9 figures
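How an occupancy probability can be read directly off a Gaussian mixture is easy to sketch in 1-D. The free-space density and the toy mixture below are illustrative assumptions, not GMMap's actual Gaussian Mixture Regression formulation.

```python
import math

# Toy 1-D "map": occupied surfaces modelled by two Gaussians (weight, mean, std).
occupied = [(0.5, 2.0, 0.1), (0.5, 5.0, 0.2)]
FREE_DENSITY = 0.05  # uniform free-space density (an illustrative prior)

def gauss(x, mu, sigma):
    """Univariate normal density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def occupancy(x):
    """Posterior probability that x is occupied, from the mixture vs. a free prior."""
    d_occ = sum(w * gauss(x, mu, s) for w, mu, s in occupied)
    return d_occ / (d_occ + FREE_DENSITY)

p_surface = occupancy(2.0)  # right on a modelled surface -> high
p_free = occupancy(3.5)     # far from both surfaces -> low
```

The key point is that no voxel grid is stored: occupancy anywhere is computed on demand from a handful of Gaussian parameters, which is what makes the representation memory-efficient.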
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
By combining metal nodes with organic linkers, we can potentially synthesize
millions of possible metal-organic frameworks (MOFs). At present, we have
libraries of over ten thousand synthesized materials and millions of in-silico
predicted materials. The fact that we have so many materials opens many
exciting avenues to tailor make a material that is optimal for a given
application. However, from an experimental and computational point of view we
simply have too many materials to screen using brute-force techniques. In this
review, we show that having so many materials allows us to use big-data methods
as a powerful technique to study these materials and to discover complex
correlations. The first part of the review gives an introduction to the
principles of big-data science. We emphasize the importance of data collection,
methods to augment small data sets, and how to select appropriate training sets.
An important part of this review is the different approaches that are used to
represent these materials in feature space. The review also includes a general
overview of the different ML techniques, but as most applications in porous
materials use supervised ML our review is focused on the different approaches
for supervised ML. In particular, we review the different methods to optimize
the ML process and how to quantify the performance of the different methods. In
the second part, we review how the different approaches of ML have been applied
to porous materials. In particular, we discuss applications in the field of gas
storage and separation, the stability of these materials, their electronic
properties, and their synthesis. The range of topics illustrates the large
variety of topics that can be studied with big-data science. Given the
increasing interest of the scientific community in ML, we expect this list to
rapidly expand in the coming years.
Comment: Editorial changes (typos fixed, minor adjustments to figures)
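The supervised-ML workflow the review emphasises (descriptors, a train/test split, and a quantified performance metric) can be condensed into a minimal sketch. The descriptors, target, and linear baseline below are synthetic stand-ins, not real MOF data or a recommended model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy screening set: descriptors are (pore size, surface area); target is gas uptake.
X = rng.uniform(0, 1, size=(300, 2))
y = 2.0 * X[:, 1] + 0.5 * X[:, 0] + rng.normal(0, 0.05, 300)

# Hold out a test set, as the review recommends for honest performance estimates.
X_tr, X_te, y_tr, y_te = X[:240], X[240:], y[:240], y[240:]

# Ordinary least squares with a bias column: a deliberately simple baseline.
A = np.c_[X_tr, np.ones(len(X_tr))]
coef, *_ = np.linalg.lstsq(A, y_tr, rcond=None)

pred = np.c_[X_te, np.ones(len(X_te))] @ coef
rmse = float(np.sqrt(np.mean((pred - y_te) ** 2)))  # performance on unseen materials
```

Real applications replace the linear model with kernel or tree ensembles and the two descriptors with the structured feature-space representations the review surveys, but the split-train-quantify skeleton is the same.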
2022 Review of Data-Driven Plasma Science
Data-driven science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS), i.e., plasma science whose progress is driven strongly by data and data analyses. Plasma is considered to be the most ubiquitous form of observable matter in the universe. Data associated with plasmas can, therefore, cover extremely large spatial and temporal scales, and often provide essential information for other scientific disciplines. Thanks to the latest technological developments, plasma experiments, observations, and computation now produce a large amount of data that can no longer be analyzed or interpreted manually. This trend now necessitates a highly sophisticated use of high-performance computers for data analyses, making artificial intelligence and machine learning vital components of DDPS. This article contains seven primary sections, in addition to the introduction and summary. Following an overview of fundamental data-driven science, five other sections cover widely studied topics of plasma science and technologies, i.e., basic plasma physics and laboratory experiments, magnetic confinement fusion, inertial confinement fusion and high-energy-density physics, space and astronomical plasmas, and plasma technologies for industrial and other applications. The final section before the summary discusses plasma-related databases that could significantly contribute to DDPS. Each primary section starts with a brief introduction to the topic, discusses the state-of-the-art developments in the use of data and/or data-scientific approaches, and presents the summary and outlook. Despite the recent impressive signs of progress, DDPS is still in its infancy. This article attempts to offer a broad perspective on the development of this field and identify where further innovations are required.
General-purpose Information-theoretical Bayesian Optimisation: A thesis by acronyms
Bayesian optimisation (BO) is an increasingly popular strategy for optimising functions with substantial query costs. By sequentially focusing evaluation resources into promising areas of the search space, BO is able to find reasonable solutions within heavily restricted evaluation budgets. Consequently, BO has become the de-facto approach for fine-tuning the hyper-parameters of machine learning models and has had numerous successful applications in industry and across the experimental sciences. This thesis seeks to increase the scope of information-theoretic BO, a popular class of search strategies that regularly achieves state-of-the-art optimisation. Unfortunately, current information-theoretic BO routines require sophisticated approximation schemes that incur substantially large computational overheads and are, therefore, applicable only to optimisation problems defined over low-dimensional and Euclidean search spaces. This thesis proposes information-theoretic approximations that extend the Max-value Entropy Search of Wang and Jegelka (2017) to a much wider class of optimisation tasks, including noisy, batch and multi-fidelity optimisation across both Euclidean and highly-structured discrete spaces. To comprehensively test our proposed search strategies, we construct novel frameworks for performing BO over the highly-structured string spaces that arise in synthetic gene design and molecular search problems, as well as for objective functions with controllable observation noise. Finally, we demonstrate the real-world applicability of BO as part of a sophisticated machine learning pipeline for fine-tuning multi-speaker text-to-speech models.
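The sequential evaluate-then-refocus loop that defines BO can be sketched with a small Gaussian-process surrogate. Note the acquisition used here is a plain upper confidence bound, not the Max-value Entropy Search extensions developed in the thesis; the objective, kernel, length-scale, and budget are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    """Expensive black-box to maximise (hidden from the optimiser)."""
    return -(x - 0.3) ** 2

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(x_tr, y_tr, x_q, noise=1e-6):
    """GP posterior mean/std at query points (zero-mean prior, RBF kernel)."""
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = rbf(x_q, x_tr)
    mu = Ks @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

grid = np.linspace(0, 1, 201)
xs = list(rng.uniform(0, 1, 3))  # small initial design
ys = [objective(x) for x in xs]

for _ in range(10):  # sequential BO loop
    mu, sd = gp_posterior(np.array(xs), np.array(ys), grid)
    ucb = mu + 2.0 * sd  # upper-confidence-bound acquisition
    x_next = grid[int(np.argmax(ucb))]
    xs.append(x_next)
    ys.append(objective(x_next))

best_x = xs[int(np.argmax(ys))]
```

Information-theoretic acquisitions replace the UCB line with a score measuring how much an evaluation would reduce uncertainty about the optimum, which is where the approximation schemes the thesis discusses become necessary.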
Sensitivity Analysis of Parametric Uncertainties in Geotechnical Hazard Assessments
Epistemic uncertainty can be reduced via additional laboratory or in situ measurements or additional numerical simulations. We focus here on parameter uncertainty: this corresponds to incomplete knowledge of the correct settings of the input parameters (such as values of soil properties) of the model supporting the geo-hazard assessment. A possible option to manage it is sensitivity analysis, which aims at identifying the contribution (i.e., the importance) of the different input parameters to the uncertainty in the final hazard outcome. For this purpose, advanced techniques exist, namely variance-based global sensitivity analysis. Yet, their practical implementation faces three major limitations related to the specificities of the geo-hazard domain: 1. the large computation time (several hours if not days) of numerical models; 2. the parameters are complex functions of time and space; 3. data are often scarce, limited, if not vague. In the present PhD thesis, statistical approaches were developed, tested and adapted to overcome those limits. Special attention was paid to testing the feasibility of those statistical tools by confronting them with real cases (natural hazards related to earthquakes, cavities and landslides).
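The variance-based global sensitivity analysis mentioned above can be sketched with a pick-freeze Monte Carlo estimator of first-order Sobol indices. The two-input "hazard model" is a toy stand-in; for such a linear additive model, the indices should recover each input's share of the output variance.

```python
import random

random.seed(0)

def model(x1, x2):
    """Toy hazard model: soil property x1 dominates the output variance."""
    return 4.0 * x1 + 0.5 * x2

N = 20000
# Two independent sample matrices, as required by pick-freeze estimators.
A = [(random.random(), random.random()) for _ in range(N)]
B = [(random.random(), random.random()) for _ in range(N)]
yA = [model(*a) for a in A]
yB = [model(*b) for b in B]
mean = sum(yA) / N
var = sum((y - mean) ** 2 for y in yA) / N

def first_order(i):
    """Pick-freeze Monte Carlo estimate of the first-order Sobol index S_i."""
    # Re-evaluate B with its i-th input frozen to the value from A.
    yBA = [model(*[a[j] if j == i else b[j] for j in range(2)])
           for a, b in zip(A, B)]
    return sum(ya * (yba - yb) for ya, yba, yb in zip(yA, yBA, yB)) / N / var

S1, S2 = first_order(0), first_order(1)  # shares of variance due to x1, x2
```

Each index needs N extra model runs, which is exactly the computational-cost bottleneck the thesis addresses when a single run takes hours: in practice the expensive simulator is replaced by a statistical surrogate before such estimators are applied.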