1,011 research outputs found

    Neural Networks for Complex Data

    Artificial neural networks are simple and efficient machine learning tools. Originally defined for the traditional setting of simple vector data, neural network models have evolved to address an increasing range of difficulties posed by complex real-world problems, from time-evolving data to sophisticated data structures such as graphs and functions. This paper summarizes advances on those themes from the last decade, with a focus on results obtained by members of the SAMM team of Université Paris

    Median topographic maps for biomedical data sets

    Median clustering extends popular neural data analysis methods such as the self-organizing map or neural gas to general data structures given by a dissimilarity matrix only. This offers flexible and robust global data inspection methods which are particularly well suited to the variety of data that occurs in biomedical domains. In this chapter, we give an overview of median clustering, its properties, and its extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
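
    The defining idea, prototypes restricted to data points and updated via a generalized median over a dissimilarity matrix, can be sketched in its simplest (k-medoids-like) form; a median self-organizing map or neural gas would additionally weight the cost by a neighborhood function. The sketch below is illustrative only and is not the chapter's implementation; all names are chosen for the example.

import numpy as np

def median_clustering(D, n_clusters, n_iter=50, seed=0):
    """Median (k-medoids-style) clustering from a dissimilarity matrix alone.

    D is an (n, n) symmetric matrix of pairwise dissimilarities; prototypes
    are restricted to data points, so no vector representation is needed.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    prototypes = rng.choice(n, size=n_clusters, replace=False)
    for _ in range(n_iter):
        # assign every point to its closest prototype
        assignment = np.argmin(D[:, prototypes], axis=1)
        new_prototypes = prototypes.copy()
        for k in range(n_clusters):
            members = np.where(assignment == k)[0]
            if len(members) == 0:
                continue
            # generalized median: the member minimizing the summed
            # dissimilarity to all other members of its cluster
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_prototypes[k] = members[np.argmin(costs)]
        if np.array_equal(new_prototypes, prototypes):
            break
        prototypes = new_prototypes
    # final assignment under the converged prototypes
    assignment = np.argmin(D[:, prototypes], axis=1)
    return prototypes, assignment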

    A Hybrid Template-Based Composite Classification System

    An automatic target classification system contains a classifier that takes a feature as input and outputs a class label. Typically, the feature is a vector of real numbers; other features can be non-numeric, such as a string of symbols. One method of improving the performance of an automatic classification system is to combine two or more independent classifiers that are complementary in nature. Complementary classifiers can be obtained by finding an optimal method for partitioning the problem space; for example, the individual classifiers may operate to identify specific objects, or they may operate on different features. We propose a design for a hybrid composite classification system that exploits both real-numbered and non-numeric features with a template-matching classification scheme. This composite classification system is made up of two independent classification systems, which receive input from two separate sensors and are then combined using various fusion methods for the purpose of target identification. By using these two separate classifiers, we explore conditions that allow the two techniques to be complementary in nature, thus improving the overall performance of the classification system. We examine various fusion techniques in search of the one that generates the best results, and we investigate different parameter spaces and fusion rules on example problems to demonstrate our classification system. Our examples consider various application areas to help further demonstrate the utility of our classifier. Optimal classifier performance is obtained using a mathematical framework that takes into account decision variables based on decision-maker preferences and/or engineering specifications, depending upon the classification problem at hand.
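
    Decision-level fusion of two independent classifiers can be illustrated with standard combination rules (product, sum, max) applied to per-class scores. The sketch below uses generic posteriors and rule names chosen for the example; it does not reproduce the specific fusion methods evaluated in the work itself.

import numpy as np

def fuse_posteriors(p1, p2, rule="product"):
    """Decision-level fusion of two classifiers' class-posterior vectors.

    p1, p2: 1-D arrays of per-class scores from two independent classifiers
    (e.g., one driven by real-valued features, one by symbolic template
    matching). Returns the fused class label and normalized fused scores.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    if rule == "product":
        fused = p1 * p2          # assumes (conditional) independence
    elif rule == "sum":
        fused = 0.5 * (p1 + p2)  # averaging is robust to noisy estimates
    elif rule == "max":
        fused = np.maximum(p1, p2)
    else:
        raise ValueError(f"unknown fusion rule: {rule}")
    fused = fused / fused.sum()
    return int(np.argmax(fused)), fused

# Example: two three-class classifiers that disagree on the top class.
label, scores = fuse_posteriors([0.6, 0.3, 0.1], [0.2, 0.5, 0.3], rule="product")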

    Advances in pre-processing and model generation for mass spectrometric data analysis

    The analysis of complex signals such as those obtained by mass spectrometric measurements is difficult and requires an appropriate representation of the data; the kind of preprocessing, the feature extraction, and the similarity measure used are of particular importance. When the focus is on biomarker analysis and the functional nature of the data is taken into account, the task becomes even more complicated. A new preprocessing scheme tailored to mass spectrometry data is presented, discussed, and analyzed in a clinical proteome study, in comparison with a standard setting.
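
    A typical preprocessing chain for a one-dimensional spectrum involves smoothing, baseline removal, and normalization before any feature extraction or similarity computation. The sketch below shows such a generic chain with illustrative window sizes; it is not the tailored preprocessing proposed in the paper.

import numpy as np

def preprocess_spectrum(intensities, smooth_window=7, baseline_window=101):
    """Minimal preprocessing sketch for a 1-D mass spectrum.

    Steps: moving-average smoothing, crude baseline removal via a running
    minimum, and total-ion-count normalization. Window sizes are illustrative.
    """
    x = np.asarray(intensities, dtype=float)
    # moving-average smoothing to suppress high-frequency noise
    kernel = np.ones(smooth_window) / smooth_window
    x = np.convolve(x, kernel, mode="same")
    # crude baseline estimate: running minimum over a wide window
    pad = baseline_window // 2
    padded = np.pad(x, pad, mode="edge")
    baseline = np.array([padded[i:i + baseline_window].min() for i in range(len(x))])
    x = np.clip(x - baseline, 0.0, None)
    # normalize so spectra from different runs become comparable
    total = x.sum()
    return x / total if total > 0 else x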

    On the Significance of Distance in Machine Learning

    The notion of distance is fundamental in machine learning. The choice of distance matters, but it is often challenging to find an appropriate distance. Metric learning can be used for learning distance(-like) functions. Common deep learning models are vulnerable to adversarial modification of inputs. Devising adversarially robust models is of immense importance for the wide deployment of machine learning models, and distance can be used to study adversarial robustness. Often, hierarchical relationships exist among classes, and these relationships can be represented by the hierarchical distance of classes. For classification problems that must take these class relationships into account, hierarchy-informed classification can be used. I propose a distance-ratio-based (DR) formulation for metric learning. In contrast to the commonly used formulation, the DR formulation has two favorable properties: first, it is invariant to the scale of the embedding; secondly, it has optimal class confidence values at the class representatives. For a large perturbation budget, standard adversarial accuracy (SAA) allows natural data points to be considered as adversarial examples, which could be one reason for the tradeoff between accuracy and SAA. To resolve this issue, I propose a new definition of adversarial accuracy named Voronoi-epsilon adversarial accuracy (VAA), which extends the study of local robustness to global robustness. Class-hierarchical information is not available for all datasets; to handle this challenge, I investigate whether classification-based metric learning models can be used to infer class hierarchy. Furthermore, I explore the possible effects of adversarial robustness on the feature space, and find that the distance structure of a robustly trained feature space resembles that of the input space to a greater extent than does that of a feature space trained without robustness.
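
    The two properties claimed for the DR formulation can be illustrated with a toy comparison against the usual softmax-over-negative-distances confidence. The inverse-power form and the exponent used below are assumptions made for illustration only and may differ from the exact formulation in the thesis.

import numpy as np

def softmax_confidence(dists):
    """Commonly used formulation: softmax over negative distances.

    Rescaling the embedding (and hence all distances) changes these values,
    and the confidence at a class representative (distance 0) stays below 1.
    """
    logits = -np.asarray(dists, dtype=float)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def distance_ratio_confidence(dists, rho=2.0, eps=1e-12):
    """Sketch of a distance-ratio-based confidence (illustrative form).

    Confidence for class c is proportional to 1 / d_c**rho, so only the
    ratios between distances matter: rescaling the embedding leaves the
    values unchanged, and the confidence at a class representative
    (d_c -> 0) approaches 1. The exponent rho is an illustrative choice.
    """
    d = np.asarray(dists, dtype=float) + eps
    w = d ** (-rho)
    return w / w.sum()

# Scale invariance: doubling every distance changes the softmax confidences
# but leaves the distance-ratio confidences identical.
d = np.array([0.5, 1.0, 2.0])
print(softmax_confidence(d), softmax_confidence(2 * d))
print(distance_ratio_confidence(d), distance_ratio_confidence(2 * d))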

    Latent variable methods for visualization through time
