48,041 research outputs found

    A data-driven functional projection approach for the selection of feature ranges in spectra with ICA or cluster analysis

    Get PDF
    Prediction problems from spectra are largely encountered in chemometry. In addition to accurate predictions, it is often needed to extract information about which wavelengths in the spectra contribute in an effective way to the quality of the prediction. This implies to select wavelengths (or wavelength intervals), a problem associated to variable selection. In this paper, it is shown how this problem may be tackled in the specific case of smooth (for example infrared) spectra. The functional character of the spectra (their smoothness) is taken into account through a functional variable projection procedure. Contrarily to standard approaches, the projection is performed on a basis that is driven by the spectra themselves, in order to best fit their characteristics. The methodology is illustrated by two examples of functional projection, using Independent Component Analysis and functional variable clustering, respectively. The performances on two standard infrared spectra benchmarks are illustrated.Comment: A paraitr

    Nonparametric Hierarchical Clustering of Functional Data

    Full text link
    In this paper, we deal with the problem of curves clustering. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data-grid which is obtained using a Bayesian model selection approach while making no assumptions regarding the curves. Finally, a post-processing technique, aiming at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It consists in optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly this measure is none other than the sum of the Kullback-Leibler divergences between clusters distributions before and after the merges. The practical interest of the approach for functional data exploratory analysis is presented and compared with an alternative approach on an artificial and a real world data set

    Cluster-based reduced-order modelling of a mixing layer

    Full text link
    We propose a novel cluster-based reduced-order modelling (CROM) strategy of unsteady flows. CROM combines the cluster analysis pioneered in Gunzburger's group (Burkardt et al. 2006) and and transition matrix models introduced in fluid dynamics in Eckhardt's group (Schneider et al. 2007). CROM constitutes a potential alternative to POD models and generalises the Ulam-Galerkin method classically used in dynamical systems to determine a finite-rank approximation of the Perron-Frobenius operator. The proposed strategy processes a time-resolved sequence of flow snapshots in two steps. First, the snapshot data are clustered into a small number of representative states, called centroids, in the state space. These centroids partition the state space in complementary non-overlapping regions (centroidal Voronoi cells). Departing from the standard algorithm, the probabilities of the clusters are determined, and the states are sorted by analysis of the transition matrix. Secondly, the transitions between the states are dynamically modelled using a Markov process. Physical mechanisms are then distilled by a refined analysis of the Markov process, e.g. using finite-time Lyapunov exponent and entropic methods. This CROM framework is applied to the Lorenz attractor (as illustrative example), to velocity fields of the spatially evolving incompressible mixing layer and the three-dimensional turbulent wake of a bluff body. For these examples, CROM is shown to identify non-trivial quasi-attractors and transition processes in an unsupervised manner. CROM has numerous potential applications for the systematic identification of physical mechanisms of complex dynamics, for comparison of flow evolution models, for the identification of precursors to desirable and undesirable events, and for flow control applications exploiting nonlinear actuation dynamics.Comment: 48 pages, 30 figures. Revised version with additional material. Accepted for publication in Journal of Fluid Mechanic

    On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors

    Full text link
    Deep learning based medical image classifiers have shown remarkable prowess in various application areas like ophthalmology, dermatology, pathology, and radiology. However, the acceptance of these Computer-Aided Diagnosis (CAD) systems in real clinical setups is severely limited primarily because their decision-making process remains largely obscure. This work aims at elucidating a deep learning based medical image classifier by verifying that the model learns and utilizes similar disease-related concepts as described and employed by dermatologists. We used a well-trained and high performing neural network developed by REasoning for COmplex Data (RECOD) Lab for classification of three skin tumours, i.e. Melanocytic Naevi, Melanoma and Seborrheic Keratosis and performed a detailed analysis on its latent space. Two well established and publicly available skin disease datasets, PH2 and derm7pt, are used for experimentation. Human understandable concepts are mapped to RECOD image classification model with the help of Concept Activation Vectors (CAVs), introducing a novel training and significance testing paradigm for CAVs. Our results on an independent evaluation set clearly shows that the classifier learns and encodes human understandable concepts in its latent representation. Additionally, TCAV scores (Testing with CAVs) suggest that the neural network indeed makes use of disease-related concepts in the correct way when making predictions. We anticipate that this work can not only increase confidence of medical practitioners on CAD but also serve as a stepping stone for further development of CAV-based neural network interpretation methods.Comment: Accepted for the IEEE International Joint Conference on Neural Networks (IJCNN) 202

    Explicit diversification of event aspects for temporal summarization

    Get PDF
    During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but are semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent works in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the amount of redundant and off-topic snippets returned, while also increasing summary timeliness

    Genes Suggest Ancestral Colour Polymorphisms Are Shared across Morphologically Cryptic Species in Arctic Bumblebees

    Get PDF
    email Suzanne orcd idCopyright: © 2015 Williams et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
    • …
    corecore