
    Distributing Recognition in Computational Paralinguistics


    Sound Representation and Classification Benchmark for Domestic Robots

    We address the problem of sound representation and classification and present results of a comparative study in the context of a domestic robotic scenario. A dataset of sounds was recorded in realistic conditions (background noise, presence of several sound sources, reverberations, etc.) using the humanoid robot NAO. An extended benchmark is carried out to test a variety of representations combined with several classifiers. We provide results obtained with the annotated dataset and we assess the methods quantitatively on the basis of their classification scores, computation times, and memory requirements. The annotated dataset is publicly available at https://team.inria.fr/perception/nard/
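    As a rough illustration of the benchmarking loop such a study implies, here is a minimal sketch pairing a feature matrix with off-the-shelf classifiers while recording accuracy and runtime. The synthetic features stand in for the NAO recordings, and the classifier choices are assumptions, not the paper's actual pipeline:

```python
import time
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical stand-ins: X holds one feature vector per recorded sound
# (e.g., averaged spectral features), y the annotated class labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
y = rng.integers(0, 8, 500)

for name, clf in [("kNN", KNeighborsClassifier(5)), ("SVM", SVC())]:
    t0 = time.perf_counter()
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold classification score
    print(f"{name}: accuracy {scores.mean():.3f}, "
          f"time {time.perf_counter() - t0:.2f} s")
```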

    Complex queries and complex data

    With the widespread availability of wearable computers, equipped with sensors such as GPS or cameras, and with the ubiquitous presence of micro-blogging platforms, social media sites, and digital marketplaces, data can be collected and shared on a massive scale. A necessary building block for taking advantage of this vast amount of information is efficient and effective similarity search algorithms that find objects in a database which are similar to a query object. Owing to the general applicability of similarity search across different data types and applications, the formalization of this concept and the development of strategies for evaluating similarity queries have evolved into an important field of research in the database and spatio-temporal database communities, as well as in others such as information retrieval and computer vision. This thesis concentrates on a special instance of similarity queries, namely k-Nearest Neighbor (kNN) queries and their close relative, Reverse k-Nearest Neighbor (RkNN) queries. As a first contribution, we provide an in-depth analysis of the RkNN join. While the problem of reverse nearest neighbor queries has received a vast amount of research interest, the problem of performing such queries in bulk has not seen an in-depth analysis so far. We first formalize the RkNN join, identifying its monochromatic and bichromatic versions and their self-join variants. After pinpointing the monochromatic RkNN join as an important and interesting instance, we develop solutions for this class, including a self-pruning and a mutual pruning algorithm; a brute-force baseline for the query predicate is sketched below. We then evaluate these algorithms extensively on a variety of synthetic and real datasets. From this starting point of similarity queries on certain data, we shift our focus to uncertain data, addressing nearest neighbor queries in uncertain spatio-temporal databases. Starting from the traditional definition of nearest neighbor queries and a data model for uncertain spatio-temporal data, we develop efficient query mechanisms that consider temporal dependencies during query evaluation. We define intuitive query semantics, aiming not only at returning the objects closest to the query but also their probability of being a nearest neighbor. After theoretically evaluating these query predicates, we develop efficient querying algorithms for them. Given the findings of this research on nearest neighbor queries, we extend these results to reverse nearest neighbor queries. Finally, we address the problem of querying large datasets containing set-based objects, namely image databases, where images are represented by (multi-)sets of vectors and additional metadata describing the position of features in the image. We aim at reducing the number of kNN queries performed during query processing and evaluate a modified pipeline that optimizes query accuracy with a small number of kNN queries. Additionally, as feature representations in object recognition are moving more and more from the real-valued domain to the binary domain, we evaluate efficient indexing techniques for binary feature vectors.
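    To make the query predicate concrete: in the monochromatic self-join case, the reverse k-nearest neighbors of a point q are exactly those points that count q among their own k nearest neighbors. The following Python sketch (a naive baseline with hypothetical names, not the pruning algorithms developed in the thesis) computes the RkNN self-join by brute force; the self-pruning and mutual pruning algorithms accelerate precisely this computation.

```python
import numpy as np

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding itself."""
    d = np.linalg.norm(points - points[i], axis=1)
    d[i] = np.inf                      # a point is not its own neighbor
    return np.argsort(d)[:k]

def rknn_self_join(points, k):
    """Naive monochromatic RkNN self-join, O(n^2 log n):
    maps each index q to the points that have q among their k nearest
    neighbors, by inverting one kNN query per point."""
    result = {q: [] for q in range(len(points))}
    for p in range(len(points)):
        for q in knn_indices(points, p, k):
            result[q].append(p)        # p is a reverse kNN of q
    return result

rng = np.random.default_rng(0)
rknn = rknn_self_join(rng.random((200, 2)), k=3)
```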

    FEATURE EXTRACTION AND CLASSIFICATION THROUGH ENTROPY MEASURES

    Entropy is a universal concept that represents the uncertainty of a series of random events. The notion of "entropy" is understood differently in different disciplines. In physics, it is a thermodynamic state variable; in statistics, it measures the degree of disorder; in computer science, it is used as a powerful tool for measuring the regularity (or complexity) of signals or time series. In this work, we have studied entropy-based features in the context of signal processing. The purpose of feature extraction is to select the relevant features from an entity; the type of features depends on the signal characteristics and the classification purpose. Many real-world signals are nonlinear and nonstationary, and they contain information that cannot be described by time- and frequency-domain parameters but may be described well by entropy. In practice, however, the estimation of entropy suffers from some limitations and is highly dependent on the series length. To reduce this dependence, we have proposed parametric estimation of various entropy indices and have derived analytical expressions where possible. We have then studied the feasibility of parametric estimation of entropy measures on both synthetic and real signals. The entropy-based features have finally been employed for classification problems related to clinical applications, activity recognition, and handwritten character recognition. From a methodological point of view, our study thus deals with feature extraction, machine learning, and classification methods. Different versions of entropy measures are found in the literature on signal analysis. Among them, approximate entropy (ApEn) and sample entropy (SampEn), followed by corrected conditional entropy (CcEn), are mostly used for physiological signal analysis; recently, entropy features have also been used for image segmentation. A related measure is Lempel-Ziv complexity (LZC), which measures the complexity of a time series, signal, or sequence, and whose estimation also depends on the series length. In particular, in this study, analytical expressions have been derived for the ApEn, SampEn, and CcEn of auto-regressive (AR) models. It should be mentioned that AR models have been employed for maximum entropy spectral estimation for many years. The feasibility of parametric estimates of these entropy measures has been studied on both synthetic series and real data. In the feasibility study, we observed the agreement between numerical estimates of entropy and estimates obtained through a number of realizations of the AR model using Monte Carlo simulations. This agreement or disagreement provides information about nonlinearity, nonstationarity, or non-Gaussianity present in the series; in some classification problems, the probability of agreement or disagreement has proved to be one of the most relevant features. After the feasibility study of the parametric entropy estimates, entropy and related measures have been applied to heart rate and arterial blood pressure variability analysis. The use of entropy and related features has proved especially relevant in developing sleep classification, handwritten character recognition, and physical activity recognition systems. The novel feature extraction methods researched in this thesis give good classification or recognition accuracy, in many cases superior to the features reported in the literature of the application domains concerned, even at lower computational cost.
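    To illustrate the series-length dependence discussed above, here is a minimal numpy sketch of sample entropy in its common textbook form. This is not the parametric AR-based estimator proposed in the thesis, and the tolerance heuristic r = 0.2·SD is an assumption:

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) = -ln(A/B): B counts pairs of length-m templates
    matching within tolerance r (Chebyshev distance), A does the same
    for length m+1; self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()              # common heuristic tolerance

    def count_matches(mm):
        # N - m templates for both lengths, as in the standard definition.
        templates = np.array([x[i:i + mm] for i in range(len(x) - m)])
        count = 0
        for i in range(len(templates) - 1):
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count

    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

# The estimate drifts with series length, the dependence the thesis targets:
rng = np.random.default_rng(1)
for n in (100, 500, 2000):
    print(n, round(sample_entropy(rng.standard_normal(n)), 3))
```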

    Algorithms and Systems for IoT and Edge Computing

    The idea of distributing the signal processing along the path that starts with acquisition and ends with the final application has given rise to the Internet of Things and Edge Computing, which have demonstrated several advantages in terms of scalability, cost, and reliability. In this dissertation, we focus on designing and implementing algorithms and systems that allow complex tasks to be performed on devices with limited resources. Firstly, we assess the trade-off between compression and anomaly detection from both a theoretical and a practical point of view. Information theory provides the rate-distortion analysis, which is extended to consider how information content is processed for detection purposes. Considering an actual Structural Health Monitoring application, two corner cases are analysed: detection under high distortion based on a feature extraction method, and detection under low distortion based on Principal Component Analysis. Secondly, we focus on streaming methods for subspace analysis. In this context, we revise and study state-of-the-art methods to target devices with limited computational resources. We also consider a real deployment of an algorithm for streaming Principal Component Analysis for signal compression in a Structural Health Monitoring application, discussing the trade-off between the possible implementation strategies. Finally, we focus on an alternative compression framework suited for low-end devices, namely Compressed Sensing. We propose a different decoding approach that splits the recovery problem into two stages and effectively adopts a deep neural network and basic linear algebra to reconstruct biomedical signals. This novel approach outperforms the state of the art in terms of reconstruction quality and requires lower computational resources.
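    For a flavour of the streaming subspace methods mentioned here, consider Oja's rule, a classical algorithm that tracks the leading principal component one sample at a time and so fits devices that cannot hold the full data matrix. The sketch below is an illustrative baseline on toy data, not necessarily the specific algorithm deployed in the dissertation:

```python
import numpy as np

def oja_update(w, x, lr=0.01):
    """One step of Oja's rule: pull w toward the leading principal
    component of the input stream while keeping it normalized."""
    y = w @ x
    w = w + lr * y * (x - y * w)
    return w / np.linalg.norm(w)

# Toy stream: 8-sensor snapshots dominated by one spatial mode.
rng = np.random.default_rng(0)
true_mode = rng.standard_normal(8)
true_mode /= np.linalg.norm(true_mode)

w = rng.standard_normal(8)
w /= np.linalg.norm(w)
for _ in range(5000):
    x = rng.standard_normal() * true_mode + 0.1 * rng.standard_normal(8)
    w = oja_update(w, x)

# Each snapshot compresses to the single coefficient y = w @ x and
# reconstructs as y * w; alignment with the true mode:
print(abs(w @ true_mode))  # close to 1 once the component is learned
```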

    Polarization and Spatial Coupling: Two Techniques to Boost Performance

    During the last two decades we have witnessed considerable activity in building bridges between the fields of information theory/communications, computer science, and statistical physics. This is due to the realization that many fundamental concepts and notions in these fields are in fact related and that each field can benefit from the insights and techniques developed in the others. For instance, the notion of channel capacity in information theory, threshold phenomena in computer science, and phase transitions in statistical physics are all expressions of the same concept. It would therefore be beneficial to develop a common framework that unifies these notions and helps to leverage knowledge in one field to make progress in the others. A particularly striking example is the celebrated belief propagation algorithm, which was invented independently in each of these fields for very different purposes; the realization of this commonality has benefited each of the areas. We investigate polarization and spatial coupling: two techniques that were originally invented in the context of channel coding (communications), resulting for the first time in efficient capacity-achieving codes for a wide range of channels. As we will discuss, both techniques also play a fundamental role in computer science and statistical physics, so they can be seen as further fundamental building blocks uniting all three areas. We demonstrate applications of these techniques, as well as the fundamental phenomena they give rise to. In more detail, this thesis consists of two parts. In the first part, we consider the technique of polarization and the class of channel codes it yields, called polar codes. Our main focus is the analysis and improvement of the behavior of polarization with respect to the most significant aspects of modern channel-coding theory: scaling laws, universality, and complexity (quantization). For each of these aspects, we derive fundamental laws that govern the behavior of polarization and polar codes. Even though we concentrate on applications in communications, the analysis we provide is general and carries over to applications of polarization in computer science and statistical physics. As we will show, our investigations confirm some of the inherent strengths of polar codes, such as their robustness with respect to quantization, but they also make clear in which respects polar codes need further improvement. For example, the scaling behavior of polar codes is quite slow compared to the optimal one, and further research is required to bring it closer to optimality. In the second part of this thesis, we investigate spatial coupling. Since a considerable literature already exists on spatial coupling in the realm of information theory and communications, we investigate mainly its impact on statistical physics and computer science. We consider two well-known models. The first is the Curie-Weiss model, which provides the simplest setting for understanding the mechanism of spatial coupling from the perspective of statistical physics; many fundamental features of spatial coupling can be explained simply here. In particular, we show how the well-known Maxwell construction in statistical physics manifests itself through spatial coupling. We then focus on a much richer class of graphical models called constraint satisfaction problems (CSPs), e.g., K-SAT and Q-COL, which are central to computer science. We follow a general framework: first, we introduce interpolation procedures for proving that the coupled and standard (uncoupled) models are fundamentally related, in that their static properties (such as their SAT/UNSAT threshold) are the same. We then use tools from spin-glass theory (the cavity method) to demonstrate the phenomenon of threshold saturation in these coupled models. Finally, we present the algorithmic implications and argue that all these features provide a new avenue for obtaining better, provable algorithmic lower bounds on the static thresholds of the individual standard CSP models. We consider simple decimation algorithms (e.g., the unit clause propagation algorithm) for the coupled CSP models and provide machinery to analyze them. These analyses show that the algorithmic thresholds on the coupled model improve significantly over the standard model; for some models (e.g., 3-SAT, 3-COL), the coupled algorithmic thresholds surpass the best lower bounds on the SAT/UNSAT threshold in the literature and thus provide a new lower bound. We conclude by pointing out that although we only considered some specific graphical models, our results are of a general nature and hence applicable to a broad set of models. In particular, a main contribution of this thesis is to firmly establish both polarization and spatial coupling in the common toolbox of information theory/communications, statistical physics, and computer science.
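    To make the polarization construction concrete: a polar codeword is obtained by applying the 2x2 kernel F = [[1, 0], [1, 1]] recursively over GF(2) and freezing the input positions that correspond to unreliable synthetic channels. Here is a minimal Python sketch; the frozen set below is hypothetical, as real constructions compute the reliability order for the channel at hand:

```python
import numpy as np

def polar_transform(u):
    """x = u . F^{(x)n} over GF(2), kernel F = [[1, 0], [1, 1]].
    Uses the identity u.(F (x) G) = ((a XOR b).G, b.G) for u = (a, b)."""
    n = len(u)
    if n == 1:
        return u
    a, b = u[:n // 2], u[n // 2:]
    return np.concatenate([polar_transform(a ^ b), polar_transform(b)])

# Toy encoder: freeze the unreliable positions to 0 and put the
# message on the remaining ones (reliability order assumed given).
n = 8
frozen = np.array([0, 1, 2, 4])        # hypothetical unreliable positions
message = np.array([1, 0, 1, 1])
u = np.zeros(n, dtype=int)
u[np.setdiff1d(np.arange(n), frozen)] = message
x = polar_transform(u)                  # codeword to transmit
```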

    Model-based Analysis and Processing of Speech and Audio Signals
