127 research outputs found

    A Survey on Metric Learning for Feature Vectors and Structured Data

    The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.
    Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and added a new method.
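    Since the survey centers on Mahalanobis distance metric learning, the core idea can be made concrete with a minimal sketch (illustrative only, not any specific method from the survey): a learned matrix M = LᵀL is positive semi-definite by construction, so it defines a valid pseudo-metric, equivalent to Euclidean distance after the linear projection x ↦ Lx.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance d_M(x, y)^2 = (x - y)^T M (x - y)."""
    diff = x - y
    return float(diff @ M @ diff)

# Parameterizing M = L^T L guarantees positive semi-definiteness; here L is
# a random stand-in for a learned projection.
rng = np.random.default_rng(0)
L = rng.standard_normal((2, 5))
M = L.T @ L

x, y = rng.standard_normal(5), rng.standard_normal(5)
# The distance under M equals the Euclidean distance after projecting by L.
assert np.isclose(mahalanobis_sq(x, y, M),
                  float(np.sum((L @ x - L @ y) ** 2)))
```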

    Review on Human Re-identification with Multiple Cameras

    Human re-identification is a core task in most surveillance systems: it aims at matching pairs of people across different non-overlapping cameras. Several challenging issues need to be overcome to achieve re-identification, such as variations in viewpoint, pose, image resolution, illumination and occlusion. In this study, we review existing work on the human re-identification task and discuss the advantages and limitations of recent approaches. Finally, the paper suggests some future research directions for human re-identification
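    As context for the matching step these works share, here is a minimal, hypothetical sketch of cross-camera matching as nearest-neighbor search over appearance descriptors; the descriptor extraction itself, which is the hard part the review surveys, is assumed given, and all names are illustrative.

```python
import numpy as np

def match_across_cameras(gallery, probes):
    """Rank gallery identities for each probe by cosine similarity.

    gallery: (n, d) descriptors from camera A; probes: (m, d) from camera B.
    Returns the index of the best-matching gallery entry for each probe.
    """
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    p = probes / np.linalg.norm(probes, axis=1, keepdims=True)
    similarity = p @ g.T            # (m, n) matrix of cosine similarities
    return similarity.argmax(axis=1)
```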

    Development of a machine learning based methodology for bridge health monitoring

    Thesis presented as a compendium of publications. In recent years the scientific community has been developing new structural health monitoring (SHM) techniques to identify damage in civil structures, especially bridges. Bridge health monitoring (BHM) systems serve to reduce overall life-cycle maintenance costs, since their main objective is to prevent catastrophic failures and damage. In BHM based on dynamic data, several problems arise in the post-processing of vibration signals: (i) modal-based dynamic features such as natural frequencies, mode shapes and damping are limited with respect to damage localization, since they reflect the global response of the structure; (ii) noise is present in the measured vibration responses; (iii) existing algorithms for damage feature extraction are often applied inadequately because the non-linearity and non-stationarity of the recorded signals are neglected; (iv) environmental and operational conditions can generate false damage detections in bridges; and (v) traditional algorithms have drawbacks when processing the large amounts of data obtained from BHM. This thesis proposes new vibration-based parameters and methods focused on damage detection, localization and quantification, within a robust mixed methodology that combines signal processing and machine learning to solve the identified problems.
    The increasing volume of bridge monitoring data makes it worthwhile to study the ability of advanced tools and systems to extract useful information from dynamic and static variables. In the fields of Machine Learning (ML) and Artificial Intelligence (AI), powerful algorithms have been developed to face problems where the amount of data is much larger (big data). The possibilities of unsupervised ML techniques were analyzed here for bridges, taking into account both operational and environmental conditions. A critical literature review was performed, together with an in-depth study of the accuracy and performance of a set of damage-detection algorithms on three real bridges and one numerical model. The review of vibration-based damage detection showed that several state-of-the-art methods do not consider the nature of the data, the characteristics of the applied excitation (possible non-linearity, non-stationarity, presence or absence of environmental and/or operational effects) or the noise level of the sensors. Moreover, most studies use modal-based damage features, which have known limitations; many methods normalize the data poorly and do not properly account for operational and environmental variability. Likewise, the huge amount of recorded data requires automatic procedures with a proven capacity to reduce the possibility of false alarms. Many investigations are further limited because only numerical or laboratory cases are studied. A methodology combining several algorithms is therefore proposed to avoid these shortcomings.
    The conclusions show a robust methodology based on ML algorithms that is capable of detecting, localizing and quantifying damage, allowing engineers to verify bridges and anticipate significant structural damage before it occurs. Moreover, the proposed non-modal parameters prove feasible as damage features under both ambient and forced vibrations. The Hilbert-Huang Transform (HHT), in conjunction with the Marginal Hilbert Spectrum and the Instantaneous Phase Difference, shows a strong capability to analyze the nonlinear and nonstationary response signals for damage identification under operational conditions. The proposed strategy combines signal-processing algorithms (ICEEMDAN and HHT) with ML (k-means) to perform damage detection and localization in bridges using traffic-induced vibration data in real-time operation
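    As an illustration of the unsupervised ML step in such a pipeline, the sketch below clusters damage-sensitive features from the healthy (baseline) state and flags new measurements that lie far from every baseline cluster. It assumes scikit-learn's KMeans, precomputed features (e.g. from an ICEEMDAN/HHT front end), and a hypothetical 99th-percentile alarm threshold; it is a simplified stand-in, not the thesis's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_baseline(healthy_features, n_clusters=3):
    """Cluster damage-sensitive features extracted from the healthy state."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    km.fit(healthy_features)
    # Distance of each baseline sample to its nearest cluster center; the
    # 99th percentile serves as an (assumed) alarm threshold.
    d = np.min(km.transform(healthy_features), axis=1)
    return km, np.percentile(d, 99)

def flag_damage(km, threshold, new_features):
    """Flag measurements farther than the threshold from all baseline clusters."""
    d = np.min(km.transform(new_features), axis=1)
    return d > threshold
```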

    Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model, so the selected features might not be the best for a non-linear classifier. This is especially crucial for tasks in which performance depends heavily on the feature selection technique, such as the diagnosis of neurodegenerative diseases. Parkinson's disease (PD) is one of the most common neurodegenerative disorders; it progresses slowly while affecting quality of life dramatically. In this paper, we use multi-modal neuroimaging data to diagnose PD by investigating the brain regions known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods
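    The idea of scoring features by their benefit in kernel space rather than in the original input space can be sketched as follows. This uses a generic kernel-target-alignment criterion as a stand-in, not the paper's actual joint objective, and all names and parameters are illustrative.

```python
import numpy as np

def rbf_kernel_1d(feature_column, gamma=1.0):
    """RBF kernel matrix computed from a single feature."""
    diff = feature_column[:, None] - feature_column[None, :]
    return np.exp(-gamma * diff ** 2)

def kernel_alignment(K, y):
    """Alignment between a kernel matrix and the ideal label kernel yy^T."""
    Y = np.outer(y, y)               # y is a vector of labels in {-1, +1}
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def rank_features(X, y, gamma=1.0):
    """Rank features by how well their individual kernels align with the labels."""
    scores = [kernel_alignment(rbf_kernel_1d(X[:, j], gamma), y)
              for j in range(X.shape[1])]
    return np.argsort(scores)[::-1]  # best-aligned features first
```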

    Indexing and knowledge discovery of Gaussian mixture models and multiple-instance learning

    Due to the increasing quantity and variety of generated and stored data, manual and automatic analysis becomes a more and more challenging task in many modern applications, such as biometric identification and content-based image retrieval. In this thesis, we consider two very typical, related inherent structures of objects: Multiple-Instance (MI) objects and Gaussian Mixture Models (GMM). In both approaches, each object is represented by a set. For MI, each object is a set of vectors from a multi-dimensional space. For GMM, each object is a set of multivariate Gaussian distribution functions, providing the ability to approximate arbitrary distributions in a concise way. Both approaches are powerful and natural, as they allow one to express (1) that an object is additively composed of several components, or (2) that an object may have several different, alternative kinds of behavior. Thus we can model, e.g., an image that depicts a set of different things (1), or a sports player who has performed differently at different games (2). We can use GMM to approximate MI objects and vice versa; both directions of approximation can be appealing because GMM are more concise, whereas for MI objects the single components are less complex. A similarity measure quantifies how much alike two objects are. On this basis, indexing and similarity search play essential roles in data mining, providing efficient and often indispensable support for a variety of algorithms such as classification and clustering. This thesis aims to solve challenges in the indexing and knowledge discovery of complex data using MI objects and GMM.
    For the indexing of GMM, several techniques are available, including universal index structures and GMM-specific methods, but the well-known approaches either suffer from poor performance or have too many limitations. To make use of the parameterized properties of GMM and tackle the problem of potentially unequal component lengths, we propose the Gaussian Components based Index (GCI) for efficient queries on GMM. GCI decomposes GMM into their components and stores n-lets of Gaussian components, whose parameter vectors have uniform length, in traditional index structures. We introduce an efficient pruning strategy to filter out unqualified GMM, using the so-called Matching Probability (MP) as the similarity measure; MP integrates the joint probability of two objects over the whole space. GCI achieves better performance than its competitors on both synthetic and real-world data. To further increase its efficiency, we propose a strategy to store GMM components in a normalized way, which improves the ability to filter out unqualified GMM. Based on the normalized transformation, we derive a set of novel similarity measures for GMM. Since MP is not a metric (i.e., a symmetric, positive definite distance function guaranteeing the triangle inequality), which would be essential for the application of various analysis techniques, we introduce the Infinite Euclidean Distance (IED) for probability distribution functions, a metric with a closed-form expression for GMM. IED allows us to store GMM in well-known metric trees like the Vantage-Point tree or the M-tree, which facilitate similarity search in sublinear time by exploiting the triangle inequality. Moreover, analysis techniques that require the properties of a metric (e.g., Multidimensional Scaling) can be applied to GMM with IED. For MI objects that are not well approximated by GMM, we introduce potential densities of instances as a representation of MI objects; based on these, two joint-Gaussian-based measures are proposed for MI objects, and we extend GCI to MI objects for efficient queries as well. To sum up, this thesis proposes a number of novel similarity measures and indexing techniques for GMM and MI objects, enabling efficient queries and knowledge discovery on complex data. In a thorough theoretical analysis as well as extensive experiments, we demonstrate the superiority of our approaches over the state of the art with respect to run-time efficiency and the quality of the results
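    The Matching Probability has a closed form for two GMMs, because the integral of a product of two Gaussian densities is itself a Gaussian density: MP(f, g) = Σᵢ Σⱼ wᵢ vⱼ N(μᵢ; νⱼ, Σᵢ + Λⱼ). A minimal sketch of this computation, assuming SciPy and with illustrative variable names:

```python
import numpy as np
from scipy.stats import multivariate_normal

def matching_probability(weights_f, means_f, covs_f,
                         weights_g, means_g, covs_g):
    """MP(f, g) = integral of f(x) * g(x) dx for two Gaussian mixtures.

    Uses the identity  int N(x; m1, S1) N(x; m2, S2) dx = N(m1; m2, S1 + S2),
    so the double integral reduces to a double sum over component pairs.
    """
    mp = 0.0
    for wi, mi, ci in zip(weights_f, means_f, covs_f):
        for vj, nj, cj in zip(weights_g, means_g, covs_g):
            mp += wi * vj * multivariate_normal.pdf(mi, mean=nj, cov=ci + cj)
    return mp
```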

    Co-skeletons: Consistent curve skeletons for shape families

    We present co-skeletons, a new method that computes consistent curve skeletons for 3D shapes from a given family. We compute co-skeletons in terms of sampling density and semantic relevance, while preserving the desired characteristics of traditional, per-shape curve skeletonization approaches. We take the curve skeletons extracted by traditional approaches for all shapes from a family as input, and compute semantic correlation information of individual skeleton branches to guide an edge-pruning process via skeleton-based descriptors, clustering, and a voting algorithm. Our approach achieves more concise and family-consistent skeletons when compared to traditional per-shape methods. We show the utility of our method by using co-skeletons for shape segmentation and shape blending on real-world data
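    The voting stage can be illustrated with a small sketch: once skeleton branches have been grouped across the family by descriptor similarity, a branch cluster is kept only if enough shapes support it. This is a simplified stand-in for the paper's pruning, with a hypothetical support threshold.

```python
from collections import Counter

def prune_by_family_vote(branch_clusters_per_shape, min_support=0.5):
    """Keep branch clusters supported by at least `min_support` of the family.

    branch_clusters_per_shape: list with one entry per shape, each a set of
    cluster ids for the branches present in that shape's curve skeleton.
    """
    n_shapes = len(branch_clusters_per_shape)
    votes = Counter(c for clusters in branch_clusters_per_shape
                      for c in clusters)
    return {c for c, v in votes.items() if v / n_shapes >= min_support}
```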

    Matching sets of features for efficient retrieval and recognition

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 145-153). In numerous domains it is useful to represent a single example by the collection of local features or parts that comprise it. In computer vision in particular, local image features are a powerful way to describe images of objects and scenes. Their stability under variable image conditions is critical for success in a wide range of recognition and retrieval applications. However, many conventional similarity measures and machine learning algorithms assume vector inputs. Comparing and learning from images represented by sets of local features is therefore challenging, since each set may vary in cardinality and its elements lack a meaningful ordering. In this thesis I present computationally efficient techniques to handle comparisons, learning, and indexing with examples represented by sets of features. The primary goal of this research is to design and demonstrate algorithms that can effectively accommodate this useful representation in a way that scales with both the representation size and the number of images available for indexing or learning. I introduce the pyramid match algorithm, which efficiently forms an implicit partial matching between two sets of feature vectors. The matching has linear time complexity, naturally forms a Mercer kernel, and is robust to clutter or outlier features, a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. I provide bounds on the expected error relative to the optimal partial matching. For very large databases, even extremely efficient pairwise comparisons may not offer adequately responsive query times. I show how to perform sub-linear time retrievals under the matching measure with randomized hashing techniques, even when input sets have varying numbers of features. My results are focused on several important vision tasks, including applications to content-based image retrieval, discriminative classification for object recognition, kernel regression, and unsupervised learning of categories. I show how the dramatic increase in performance enables accurate and flexible image comparisons to be made on large-scale data sets, and removes the need to artificially limit the number of local descriptions used per image when learning visual categories. By Kristen Lorraine Grauman. Ph.D.
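    The pyramid match idea can be sketched for 1-D point sets (the thesis handles d-dimensional feature sets): histograms are intersected at increasingly coarse resolutions, and matches first formed at level l receive weight 1/2^l, so finer-grained matches count more. A minimal, illustrative sketch under these simplifying assumptions:

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Number of points matched at this resolution."""
    return np.minimum(h1, h2).sum()

def pyramid_match_kernel(x, y, n_levels=5, lo=0.0, hi=1.0):
    """Pyramid match between two 1-D point sets with values in [lo, hi].

    Bin counts halve from level to level, so bins nest; new matches found
    at level l are weighted by 1/2^l, giving finer matches more weight.
    """
    score, prev_intersection = 0.0, 0.0
    for level in range(n_levels):
        n_bins = max(1, 2 ** (n_levels - 1 - level))   # finest level first
        hx, _ = np.histogram(x, bins=n_bins, range=(lo, hi))
        hy, _ = np.histogram(y, bins=n_bins, range=(lo, hi))
        i_l = histogram_intersection(hx, hy)
        score += (i_l - prev_intersection) / (2 ** level)
        prev_intersection = i_l
    return score
```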