
    A Survey on Metric Learning for Feature Vectors and Structured Data

    The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.
    Comment: Technical report, 59 pages.
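
    To make the Mahalanobis framework above concrete, the following minimal sketch (illustrative only, not taken from the survey) parameterizes the metric as M = L^T L, which keeps M positive semidefinite by construction, and performs a toy gradient step that pulls same-class pairs together and pushes different-class pairs apart:

        import numpy as np

        def mahalanobis_dist(x, y, L):
            # d_M(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L.
            diff = L @ (x - y)
            return np.sqrt(diff @ diff)

        def metric_step(L, x1, x2, same_class, lr=0.01):
            # Gradient of d_M^2 = ||L (x1 - x2)||^2 with respect to L.
            diff = x1 - x2
            grad = 2 * np.outer(L @ diff, diff)
            return L - lr * grad if same_class else L + lr * grad

        rng = np.random.default_rng(0)
        x_a, x_b = rng.normal(size=3), rng.normal(size=3)
        L = np.eye(3)
        print(mahalanobis_dist(x_a, x_b, L))           # Euclidean at initialization
        L = metric_step(L, x_a, x_b, same_class=True)
        print(mahalanobis_dist(x_a, x_b, L))           # smaller after one "pull" step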

    Visual Knowledge Discovery with General Line Coordinates

    Understanding black-box Machine Learning methods on multidimensional data is a key challenge in Machine Learning. While many powerful Machine Learning methods already exist, these methods are often unexplainable or perform poorly on complex data. This paper proposes visual knowledge discovery approaches based on several forms of lossless General Line Coordinates. These expand the previously introduced General Line Coordinates Linear and Dynamic Scaffolding Coordinates to produce, explain, and visualize non-linear classifiers with explanation rules. To ensure these non-linear models and rules are accurate, new interactive visual knowledge discovery algorithms for finding worst-case validation splits were also developed. These expansions are General Line Coordinates non-linear, interactive rules linear, hyperblock rules linear, and worst-case linear. Experiments across multiple benchmark datasets show that this visual knowledge discovery method can compete with other visual and computational Machine Learning algorithms while improving both interpretability and accuracy in linear and non-linear classifications. Major benefits of these expansions are the ability to build accurate and highly interpretable models and rules from hyperblocks, the ability to analyze interpretability weaknesses in a model, and the incorporation of expert knowledge through interactive and human-guided visual knowledge discovery methods.
    Comment: 44 pages, 26 figures, 3 tables
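
    For readers unfamiliar with the term, a hyperblock in this line of work is an axis-aligned n-dimensional box, so a hyperblock rule classifies a point by simple interval containment. A minimal sketch (the bounds and label below are hypothetical):

        import numpy as np

        class HyperblockRule:
            # Classify x as `label` when lo[i] <= x[i] <= hi[i] for every feature i.
            def __init__(self, lo, hi, label):
                self.lo, self.hi, self.label = np.asarray(lo), np.asarray(hi), label

            def covers(self, x):
                x = np.asarray(x)
                return bool(np.all((self.lo <= x) & (x <= self.hi)))

        rule = HyperblockRule(lo=[4.5, 2.0], hi=[5.8, 4.4], label="class A")
        print(rule.covers([5.1, 3.5]))  # True: inside the box
        print(rule.covers([6.7, 3.0]))  # False: first feature out of bounds

    Such rules are directly readable by a domain expert, which is what gives hyperblock models their interpretability.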

    Unsupervised neural networks as a support tool for pathology diagnosis in MALDI-MSI experiments: A case study on thyroid biopsies

    Artificial intelligence is getting a foothold in medicine for disease screening and diagnosis. While typical machine learning methods require large labeled datasets for training and validation, their application is limited in clinical fields since ground truth information can hardly be obtained on a sizeable cohort of patients. Unsupervised neural networks - such as Self-Organizing Maps (SOMs) - represent an alternative approach to identifying hidden patterns in biomedical data. Here we investigate the feasibility of SOMs for the identification of malignant and non-malignant regions in liquid biopsies of thyroid nodules, on a patient-specific basis. MALDI-ToF (Matrix-Assisted Laser Desorption Ionization - Time of Flight) mass spectrometry imaging (MSI) was used to measure the spectral profile of bioptic samples. SOMs were then applied for the analysis of MALDI-MSI data of individual patients' samples, also testing various pre-processing and agglomerative clustering methods to investigate their impact on SOMs' discrimination efficacy. The final clustering was compared against the sample's probability of being malignant, hyperplastic, or related to Hashimoto's thyroiditis, as quantified by multinomial regression with LASSO. Our results show that SOMs are effective in separating the areas of a sample containing benign cells from those containing malignant cells. Moreover, they make it possible to overlay the different areas of cytological glass slides with the corresponding proteomic profile image, and to inspect the specific weight of every cellular component in bioptic samples. We envision that this approach could represent an effective means to assist pathologists in diagnostic tasks, avoiding the need to manually annotate cytological images and the effort of creating labeled datasets.
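
    As a rough sketch of such a pipeline (the study's exact pre-processing and SOM configuration are not given in this abstract; the data below are synthetic stand-ins for MALDI-MSI spectra), each pixel's spectrum can be mapped to its best-matching SOM unit, and the unit prototypes can then be merged by agglomerative clustering into candidate tissue regions:

        import numpy as np
        from minisom import MiniSom  # pip install minisom

        rng = np.random.default_rng(0)
        spectra = rng.random((500, 100))               # 500 pixels, 100 m/z bins
        spectra /= spectra.sum(axis=1, keepdims=True)  # TIC normalization (one common choice)

        som = MiniSom(8, 8, input_len=spectra.shape[1], sigma=1.5,
                      learning_rate=0.5, random_seed=0)
        som.train_random(spectra, num_iteration=5000)

        # Best-matching unit per pixel; som.get_weights() yields the 8x8 unit
        # prototypes that agglomerative clustering could merge into regions.
        units = np.array([som.winner(s) for s in spectra])
        print(units[:5])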

    Improving Human Face Recognition Using Deep Learning Based Image Registration And Multi-Classifier Approaches

    Face detection, registration, and recognition have become a fascinating field for researchers. The motivation behind the enormous interest in the topic is the need to improve the accuracy of many real-time applications. Countless methodologies have been acknowledged and presented in the past years. The visual complexity of the human face and the significant changes it undergoes under different conditions make it challenging to design and implement a powerful computational system for object recognition in general and human face recognition in particular. Using supervised learning often requires extensive training for the computer, which results in high execution times. Applying strong preprocessing approaches such as face registration is an essential step in face recognition to achieve a high recognition accuracy rate. Although there exist approaches that do both detection and recognition, we believe the absence of a complete end-to-end system capable of performing recognition from an arbitrary scene is in large part due to the difficulty of alignment. Often, face registration is ignored, with the assumption that the detector will perform a rough alignment, leading to suboptimal recognition performance. In this research, we present an enhanced approach to improve human face recognition using a back-propagation neural network (BPNN) and feature extraction based on the correlation between the training images. A key contribution of this work is the generation of a new set called the T-Dataset from the original training data set, which is used to train the BPNN. We generate the T-Dataset using the correlation between the training images, without using a common technique of image density. The correlated T-Dataset provides a high distinction layer between the training images, which helps the BPNN to converge faster and achieve better accuracy. Data and feature reduction are essential in the face recognition process, and researchers have recently focused on modern neural networks. We therefore also used classical Principal Component Analysis (PCA) and Local Binary Patterns (LBP) to show that there is potential for improvement even with traditional methods. We applied five distance measurement algorithms and combined them to obtain the T-Dataset, which we fed into the BPNN. We achieved higher face recognition accuracy with less computational cost than the current approaches by using reduced image features. We tested the proposed framework on two small data sets, the YALE and AT&T data sets, as the ground truth, and achieved very high accuracy. Furthermore, we evaluated our method on one of the state-of-the-art benchmark data sets, Labeled Faces in the Wild (LFW), where we produced a competitive face recognition performance. In addition, we present an enhanced framework to improve face registration using a deep learning model. We used deep architectures such as VGG16 and VGG19 to train our method, training the model to learn the transformation parameters (rotation, scaling, and shifting). By learning the transformation parameters, we are able to transform an image back to the frontal domain. We used the LFW dataset to evaluate this method and achieved high accuracy.
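
    The abstract does not spell out how the T-Dataset is built, but a plausible minimal reading (a hypothetical reconstruction on toy data, not the thesis code) is to represent each training image by its vector of correlations with all training images:

        import numpy as np

        def build_t_dataset(images):
            # Row i holds the Pearson correlations of image i with every training
            # image, so images of the same subject get similar feature vectors.
            flat = np.asarray([img.ravel() for img in images], dtype=float)
            return np.corrcoef(flat)

        rng = np.random.default_rng(0)
        images = rng.random((10, 32, 32))   # 10 toy "face" images
        t_dataset = build_t_dataset(images)
        print(t_dataset.shape)              # (10, 10): one correlation row per image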

    Semantic Segmentation of Remote-Sensing Images Through Fully Convolutional Neural Networks and Hierarchical Probabilistic Graphical Models

    Deep learning (DL) is currently the dominant approach to image classification and segmentation, but the performances of DL methods are remarkably influenced by the quantity and quality of the ground truth (GT) used for training. In this article, a DL method is presented to deal with the semantic segmentation of very-high-resolution (VHR) remote-sensing data in the case of scarce GT. The main idea is to combine a specific type of deep convolutional neural networks (CNNs), namely fully convolutional networks (FCNs), with probabilistic graphical models (PGMs). Our method takes advantage of the intrinsic multiscale behavior of FCNs to deal with multiscale data representations and to connect them to a hierarchical Markov model (e.g., making use of a quadtree). As a consequence, the spatial information present in the data is better exploited, reducing the sensitivity to GT incompleteness. The marginal posterior mode (MPM) criterion is used for inference in the proposed framework. To assess the capabilities of the proposed method, the experimental validation is conducted with the ISPRS 2D Semantic Labeling Challenge datasets on the cities of Vaihingen and Potsdam, with some modifications to simulate the spatially sparse GTs that are common in real remote-sensing applications. The results are quite significant, as the proposed approach exhibits a higher producer accuracy than the standard FCNs considered and especially mitigates the impact of scarce GTs on minority classes and small spatial details.
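
    For reference, the MPM criterion used for inference above is the standard per-site rule

        \hat{x}_s = \arg\max_{x_s} P(x_s \mid \mathbf{y})   for every site (pixel or quadtree node) s,

    i.e., each label maximizes its posterior marginal given all observations \mathbf{y}. Unlike MAP, which maximizes the joint posterior, MPM penalizes errors site by site, which is why it pairs naturally with the upward-downward recursions available on a quadtree.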

    Self-Supervised Visual Representation Learning with Semantic Grouping

    In this paper, we tackle the problem of learning visual representations from unlabeled scene-centric data. Existing works have demonstrated the potential of utilizing the underlying complex structure within scene-centric data; still, they commonly rely on hand-crafted objectness priors or specialized pretext tasks to build a learning framework, which may harm generalizability. Instead, we propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning. The semantic grouping is performed by assigning pixels to a set of learnable prototypes, which can adapt to each sample by attentive pooling over the feature and form new slots. Based on the learned data-dependent slots, a contrastive objective is employed for representation learning, which enhances the discriminability of features, and conversely facilitates grouping semantically coherent pixels together. Compared with previous efforts, by simultaneously optimizing the two coupled objectives of semantic grouping and contrastive learning, our approach bypasses the disadvantages of hand-crafted priors and is able to learn object/group-level representations from scene-centric images. Experiments show our approach effectively decomposes complex scenes into semantic groups for feature learning and significantly benefits downstream tasks, including object detection, instance segmentation, and semantic segmentation. Code is available at: https://github.com/CVMI-Lab/SlotCon.
    Comment: Accepted at NeurIPS 2022
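
    A compact sketch of the grouping step as described (shapes and names are illustrative; the authors' actual implementation is in the repository linked above): pixel features are softly assigned to prototypes, and slots are formed by attentive pooling:

        import torch
        import torch.nn.functional as F

        def semantic_grouping(features, prototypes, temperature=0.1):
            # features: (N, D) pixel embeddings; prototypes: (K, D) learnable vectors.
            logits = F.normalize(features, dim=1) @ F.normalize(prototypes, dim=1).T
            assign = F.softmax(logits / temperature, dim=1)  # soft pixel-to-slot assignment
            slots = assign.T @ features                      # (K, D) attentive pooling
            return slots / (assign.sum(dim=0, keepdim=True).T + 1e-6)

        features = torch.randn(64 * 64, 256)                  # toy 64x64 feature map
        prototypes = torch.randn(32, 256, requires_grad=True)
        print(semantic_grouping(features, prototypes).shape)  # torch.Size([32, 256])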

    Fast DD-classification of functional data

    A fast nonparametric procedure for classifying functional data is introduced. It consists of a two-step transformation of the original data plus a classifier operating on a low-dimensional hypercube. The functional data are first mapped into a finite-dimensional location-slope space and then transformed by a multivariate depth function into the DD-plot, which is a subset of the unit hypercube. This transformation yields a new notion of depth for functional data. Three alternative depth functions are employed for this, as well as two rules for the final classification on [0,1]^q. The resulting classifier has to be cross-validated over a small range of parameters only, which is restricted by a Vapnik-Chervonenkis bound. The entire methodology does not involve smoothing techniques, is completely nonparametric and allows Bayes optimality to be achieved under standard distributional settings. It is robust, efficiently computable, and has been implemented in an R environment. Applicability of the new approach is demonstrated by simulations as well as a benchmark study.
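
    A minimal illustration of the two-step transformation (the depth choice and toy curves are ours; the paper employs three alternative depth functions): map each curve to a location-slope pair, then compute its depth within a class to obtain one DD-plot coordinate:

        import numpy as np

        def location_slope(curve, t):
            # Map a functional observation to R^2: average level and overall slope.
            return np.array([curve.mean(), (curve[-1] - curve[0]) / (t[-1] - t[0])])

        def mahalanobis_depth(x, sample):
            # One possible depth: D(x) = 1 / (1 + (x - mu)^T S^{-1} (x - mu)).
            mu, S = sample.mean(axis=0), np.cov(sample.T)
            d = x - mu
            return 1.0 / (1.0 + d @ np.linalg.solve(S, d))

        rng = np.random.default_rng(0)
        t = np.linspace(0.0, 1.0, 50)
        curves = np.array([np.sin(2 * np.pi * t) + rng.normal(0, 0.1, 50)
                           for _ in range(20)])
        points = np.array([location_slope(c, t) for c in curves])
        print(mahalanobis_depth(points[0], points))  # depth of one curve in its class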

    Deep Hashing Based on Class-Discriminated Neighborhood Embedding

    Deep-hashing methods have drawn significant attention during the past years in the field of remote sensing (RS) owing to their prominent capabilities for capturing the semantics from complex RS scenes and generating the associated hash codes in an end-to-end manner. Most existing deep-hashing methods exploit pairwise and triplet losses to learn hash codes while preserving semantic similarities, which requires the construction of image pairs and triplets based on supervised information (e.g., class labels). However, the Hamming spaces learned with these losses may not be optimal, due to an insufficient sampling of image pairs and triplets for scalable RS archives. To overcome this limitation, we propose a new deep-hashing technique based on class-discriminated neighborhood embedding, which can properly capture the locality structures among the RS scenes and distinguish images class-wise in the Hamming space. Extensive experiments have been conducted to validate the effectiveness of the proposed method by comparing it with several state-of-the-art conventional and deep-hashing methods. The related codes of this article will be made publicly available for reproducible research by the community.
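
    As background for readers new to deep hashing (a generic sketch, not the proposed class-discriminated neighborhood embedding): hash layers are commonly trained with a tanh relaxation of the binary constraint and binarized with sign at retrieval time:

        import torch
        import torch.nn as nn

        class DeepHash(nn.Module):
            # Toy hash head on top of precomputed image features.
            def __init__(self, in_dim=512, n_bits=32):
                super().__init__()
                self.hash_layer = nn.Sequential(nn.Linear(in_dim, n_bits), nn.Tanh())

            def forward(self, features):
                return self.hash_layer(features)   # relaxed codes in (-1, 1)

            def codes(self, features):
                return torch.sign(self(features))  # binary codes in {-1, +1}

        model = DeepHash()
        feats = torch.randn(4, 512)  # e.g., CNN features of four RS scenes
        print(model.codes(feats)[0])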

    Implementation and Evaluation of Acoustic Distance Measures for Syllables

    Munier C. Implementation and Evaluation of Acoustic Distance Measures for Syllables. Bielefeld (Germany): Bielefeld University; 2011.
    In this work, several acoustic similarity measures for syllables are motivated and successively evaluated. The Mahalanobis distance, used as the local distance in a dynamic time warping approach to measuring acoustic distances, is able to discriminate syllables and thus allows for syllable classification with an accuracy that is typical for the classification of small acoustic units (60 percent for a nearest-neighbor classification of a set of ten syllables using samples of a single speaker). This measure can be improved by several techniques, which however impair the execution speed of the distance measure (using more mixture components for the estimation of covariances from a Gaussian mixture model, using full covariance matrices instead of diagonal covariance matrices). Experimental evaluation makes it evident that a well-working syllable segmentation algorithm allowing for accurate syllable border estimates is essential to the correct computation of acoustic distances by the similarity measures developed in this work. Further similarity measures, motivated by their use in timbre classification of music pieces, do not show adequate syllable discrimination abilities.
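
    A minimal sketch of the core measure described above, dynamic time warping with a Mahalanobis local distance (the feature frames and covariance below are synthetic stand-ins for MFCC-like frames):

        import numpy as np

        def dtw(seq_a, seq_b, local_dist):
            # Classic DTW; local_dist compares one frame of each sequence.
            n, m = len(seq_a), len(seq_b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    c = local_dist(seq_a[i - 1], seq_b[j - 1])
                    D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        def make_mahalanobis(cov):
            inv = np.linalg.inv(cov)
            return lambda u, v: float(np.sqrt((u - v) @ inv @ (u - v)))

        rng = np.random.default_rng(0)
        syl_a, syl_b = rng.normal(size=(20, 13)), rng.normal(size=(25, 13))
        local = make_mahalanobis(np.diag(rng.random(13) + 0.5))  # diagonal-covariance variant
        print(dtw(syl_a, syl_b, local))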