22 research outputs found

    Improved data visualisation through multiple dissimilarity modelling

    Popular dimension reduction and visualisation algorithms, for instance Metric Multidimensional Scaling, t-distributed Stochastic Neighbour Embedding and the Gaussian Process Latent Variable Model, rely on the assumption that input dissimilarities are Euclidean. It is well known that this assumption does not hold for most datasets, and high-dimensional data often sits upon a manifold of unknown global geometry. We present a method for improving the manifold charting process, coupled with Elastic MDS, such that we no longer assume that the manifold is Euclidean, or of any particular structure. We draw on the benefits of different dissimilarity measures, allowing their relative responsibilities, under a linear combination, to drive the visualisation process.
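The idea of a linear combination of dissimilarity measures feeding a visualisation can be illustrated with a minimal sketch. It is not the authors' method: the weights are fixed by hand rather than learned, classical (Torgerson) MDS stands in for Elastic MDS, and all function names (`pairwise_euclidean`, `pairwise_manhattan`, `combine_dissimilarities`, `classical_mds`) are hypothetical.

```python
import numpy as np

def pairwise_euclidean(X):
    # Euclidean distance between every pair of rows of X.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pairwise_manhattan(X):
    # Manhattan (L1) distance between every pair of rows of X.
    diff = X[:, None, :] - X[None, :, :]
    return np.abs(diff).sum(axis=-1)

def combine_dissimilarities(matrices, weights):
    # Weighted sum of dissimilarity matrices; weights are normalised to sum to 1.
    # Assumption: weights are chosen by hand here, not learned from data.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * D for w, D in zip(weights, matrices))

def classical_mds(D, n_components=2):
    # Classical MDS: double-centre the squared dissimilarities and
    # embed using the leading eigenvectors.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)  # guard against negative eigenvalues
    return eigvecs[:, order] * np.sqrt(lam)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))  # synthetic data, for illustration only
D = combine_dissimilarities(
    [pairwise_euclidean(X), pairwise_manhattan(X)], weights=[0.7, 0.3]
)
Y = classical_mds(D)  # 2-D coordinates for plotting
```

Changing the weights shifts how much each measure drives the layout, which is the role the abstract assigns to the "relative responsibilities".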

    Improved data visualisation through nonlinear dissimilarity modelling

    Inherent to state-of-the-art dimension reduction algorithms is the assumption that global distances between observations are Euclidean, despite the potential for altogether non-Euclidean data manifolds. We demonstrate that a non-Euclidean manifold chart can be approximated by implementing a universal approximator over a dictionary of dissimilarity measures, building on recent developments in the field. This approach is transferable across domains, such that observations can be, for instance, vectors, distributions, graphs and time series. Our novel dissimilarity learning method is illustrated with four standard visualisation datasets, showing the benefits over the linear dissimilarity learning approach.

    Estimation and Visualization of Digital Library Content Similarities

    We present a semantic similarity-based recommender service. Our experimental application and validation domain consists of K-12 engineering learning resources. Given a learning resource, we must determine which educational standards it addresses and, vice versa, find resources that align with a given standard. One approach to this problem suggests transitively inferring standard alignment from the semantic similarity of other, previously aligned resources. We investigate a bigram-based similarity estimator and a Sammon map-based user interface for visualizing the resulting similarity space. Validation was performed using resources in TeachEngineering.org, a K-12 STEM digital library. Target classifications were derived from author-generated tables of contents for these resources. Testing shows good performance of the similarity measure, both in its correspondence to the collection’s table of contents and in the form of a two-dimensional Sammon map. The results provide evidence for the feasibility and practicality of using automated similarity measures in standards alignment and similar problems.
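The abstract does not specify how its bigram-based estimator is defined. As a rough illustration only, the sketch below assumes word bigrams compared with the Jaccard index; the function names `word_bigrams` and `bigram_similarity` are hypothetical and the paper's estimator may differ.

```python
def word_bigrams(text):
    # Set of adjacent word pairs in the text (lower-cased, whitespace-tokenised).
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))

def bigram_similarity(a, b):
    # Jaccard index over word bigrams: |A ∩ B| / |A ∪ B|.
    # Assumption: this is one plausible bigram similarity, not the paper's.
    A, B = word_bigrams(a), word_bigrams(b)
    if not A and not B:
        return 0.0  # no bigrams in either text: treat as dissimilar
    return len(A & B) / len(A | B)

score = bigram_similarity("the quick brown fox", "the quick red fox")
```

A matrix of such scores, converted to dissimilarities (for example `1 - score`), is the kind of input a Sammon map would then project to two dimensions.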

    Bearing fault diagnosis by EXIN CCA

    EXIN CCA is an extension of Curvilinear Component Analysis (CCA) that solves for the noninvariant CCA projection and allows representing data drawn under different operating conditions. It can be applied to data visualization, interpretation (as a kind of sensor of the underlying physical phenomenon) and classification for real-time industrial applications. Here an example is given for bearing fault diagnostics in an electromechanical device. Peer reviewed. Postprint (published version).

    Reservoir computing and data visualisation

    We consider the problem of visualisation of high-dimensional multivariate time series. A data analyst creating a two-dimensional projection of such a time series might hope to gain some intuition into the structure of the original high-dimensional data set. We review a method for visualising time series data using an extension of Echo State Networks (ESNs). The method uses the multidimensional scaling criterion to create a visualisation of the time series after its representation in the reservoir of the ESN. We illustrate the method with two-dimensional maps of a financial time series. The method is then compared with a mapping which uses a fixed latent space and a novel objective function.
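The pipeline described here (drive a reservoir with the time series, then judge a two-dimensional map of the reservoir states by an MDS criterion) can be sketched as follows. This is an illustrative assumption-laden example, not the reviewed method: the reservoir is a plain random tanh network, the 2-D map is a PCA projection used as a stand-in for an optimised mapping, the criterion is raw (unnormalised) stress, and the names `reservoir_states`, `pairwise_distances` and `mds_stress` are hypothetical.

```python
import numpy as np

def reservoir_states(U, n_reservoir=50, spectral_radius=0.9, seed=0):
    # Drive a fixed random reservoir with the input series U (time x inputs)
    # and return the reservoir state at every time step.
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, U.shape[1]))
    W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    # Rescale recurrent weights to the requested spectral radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    x = np.zeros(n_reservoir)
    states = np.empty((U.shape[0], n_reservoir))
    for t, u in enumerate(U):
        x = np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

def pairwise_distances(Z):
    # Euclidean distance between every pair of rows of Z.
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mds_stress(D_target, Y):
    # Raw MDS stress: squared mismatch between target distances and
    # distances in the map Y, summed over distinct pairs.
    D_low = pairwise_distances(Y)
    iu = np.triu_indices_from(D_target, k=1)
    return float(((D_target[iu] - D_low[iu]) ** 2).sum())

# Synthetic three-channel series, for illustration only.
t = np.linspace(0, 8 * np.pi, 200)
U = np.column_stack([np.sin(t), np.cos(0.5 * t), np.sin(0.3 * t)])

S = reservoir_states(U)
S_centered = S - S.mean(axis=0)
# Assumption: PCA projection as a simple stand-in for the learned 2-D mapping.
_, _, Vt = np.linalg.svd(S_centered, full_matrices=False)
Y = S_centered @ Vt[:2].T
stress = mds_stress(pairwise_distances(S), Y)
```

A method of the kind reviewed would adjust the mapping to reduce this stress rather than accept a fixed projection.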