31 research outputs found

    Multi-perspective embedding for non-metric time series classification

    Interest in time series analysis is rapidly increasing, posing new challenges for machine learning. For decades, Dynamic Time Warping (DTW) has been the de facto standard distance measure for time series and the tool of choice when analyzing such data. Nevertheless, DTW has two major drawbacks: (a) it is non-metric and therefore hard to handle with standard machine learning techniques, and (b) it is not well suited for multi-dimensional time series. To address this, we propose a multi-perspective embedding of the time series into a complex-valued vector space, evaluated with a model that can handle complex-valued data. The approach is evaluated on various multi-dimensional time series data and with different classifier techniques.
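
    Since the abstract builds on DTW as the reference distance, the following minimal sketch of the classic DTW recurrence may be useful; it is not the paper's embedding, and the function name and toy series are illustrative only.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic dynamic-programming DTW between two 1-D time series."""
    n, m = len(a), len(b)
    # cost[i, j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])             # local distance
            cost[i, j] = d + min(cost[i - 1, j],     # insertion
                                 cost[i, j - 1],     # deletion
                                 cost[i - 1, j - 1]) # match
    return cost[n, m]

x = np.array([0.0, 1.0, 2.0, 1.0])
y = np.array([0.0, 2.0, 1.0])
print(dtw_distance(x, y))  # warping absorbs the length difference
```

    Because the optimal warping path is chosen separately for each pair of series, DTW can violate the triangle inequality, which is precisely why it is non-metric.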

    Complex-valued embeddings of generic proximity data

    Proximities are at the heart of almost all machine learning methods. If the input data are given as numerical vectors of equal length, the Euclidean distance or a Hilbertian inner product is frequently used in modeling algorithms. In a more generic view, objects are compared by a (symmetric) similarity or dissimilarity measure, which may not obey particular mathematical properties. This renders many machine learning methods invalid, leading to convergence problems and the loss of guarantees such as generalization bounds. In many cases the preferred dissimilarity measure is not metric, like the earth mover's distance, or the similarity measure may not be a simple inner product in a Hilbert space but rather in its generalization, a Krein space. If the input data are non-vectorial, like text sequences, proximity-based learning or n-gram embedding techniques can be applied. Standard embeddings lead to the desired fixed-length vector encoding but are costly and have substantial limitations in preserving the original data's full information. As an information-preserving alternative, we propose a complex-valued vector embedding of proximity data. This allows suitable machine learning algorithms to use these fixed-length, complex-valued vectors for further processing. The complex-valued data can serve as input to complex-valued machine learning algorithms. In particular, we address supervised learning and use extensions of prototype-based learning. The proposed approach is evaluated on a variety of standard benchmarks and shows strong performance compared to traditional techniques in processing non-metric or non-PSD proximity data.
    Comment: proximity learning, embedding, complex values, complex-valued embedding, learning vector quantization
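
    The abstract does not spell out the construction, but one standard way to obtain such a complex-valued embedding is through the eigendecomposition of the symmetric proximity matrix, where negative eigenvalues contribute purely imaginary components. The sketch below rests on that assumption; the function name and the toy matrix are illustrative, not taken from the paper.

```python
import numpy as np

def complex_embedding(S: np.ndarray) -> np.ndarray:
    """Embed a symmetric, possibly indefinite similarity matrix S into
    fixed-length complex-valued vectors (one row per object)."""
    eigvals, eigvecs = np.linalg.eigh(S)          # S = U diag(lam) U^T
    sqrt_vals = np.sqrt(eigvals.astype(complex))  # negative lam -> imaginary part
    return eigvecs * sqrt_vals                    # X with S == X @ X.T

# Indefinite (non-PSD) toy similarity matrix: one eigenvalue is negative.
S = np.array([[ 1.0,  0.9, -0.4],
              [ 0.9,  1.0,  0.2],
              [-0.4,  0.2,  1.0]])
X = complex_embedding(S)
# The plain (non-conjugate) transpose reproduces S exactly, so the
# embedding preserves the full proximity information.
assert np.allclose(X @ X.T, S)
```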

    Data-Driven Supervised Learning for Life Science Data

    Life science data are often encoded in a non-standard way by means of alpha-numeric sequences, graph representations, numerical vectors of variable length, or other formats. Domain-specific or data-driven similarity measures like alignment functions have been employed with great success. The vast majority of more complex data analysis algorithms require fixed-length vectorial input data, demanding substantial preprocessing of life science data. Data-driven measures are widely ignored in favor of simple encodings. These preprocessing steps are neither always easy to perform nor particularly effective, and they risk a loss of information and interpretability. We present strategies and concepts for employing data-driven similarity measures in the life science context and other complex biological systems. In particular, we show how to use data-driven similarity measures effectively in standard learning algorithms.
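
    As one way to make the last claim concrete, the sketch below plugs a precomputed, data-driven dissimilarity over variable-length sequences into an off-the-shelf nearest-neighbor classifier, which requires neither metric properties nor a PSD kernel. The toy sequences and the dissimilarity itself are invented for illustration; a real pipeline would use an alignment score instead.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

seqs = ["ACGT", "ACGA", "TTGA", "ACG", "TTG"]  # variable-length "sequences"
labels = [0, 0, 1, 0, 1]

def dissim(a: str, b: str) -> float:
    """Toy data-driven dissimilarity: mismatches plus length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

# Precompute all pairwise dissimilarities once ...
D = np.array([[dissim(a, b) for b in seqs] for a in seqs])

# ... then hand them to a standard learner that accepts precomputed
# dissimilarities directly.
clf = KNeighborsClassifier(n_neighbors=1, metric="precomputed")
clf.fit(D, labels)
print(clf.predict(D[:1]))  # row of dissimilarities from a query to the training set
```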

    Structure Preserving Encoding of Non-euclidean Similarity Data

    Domain-specific proximity measures, like divergence measures in signal processing or alignment scores in bioinformatics, often lead to non-metric, indefinite similarities or dissimilarities. However, many classical learning algorithms like kernel machines assume metric properties and struggle with such metric violations. For example, the classical support vector machine is no longer able to converge to an optimum. One possible direction to solve the indefiniteness problem is to transform the non-metric (dis-)similarity data into positive (semi-)definite matrices. For this purpose, many approaches have been proposed that adapt the eigenspectrum of the given data such that positive definiteness is ensured. Unfortunately, most of these approaches modify the eigenspectrum so strongly that valuable information is removed or noise is added to the data. In particular, the shift operation has attracted a lot of interest in the past few years despite its frequently recurring disadvantages. In this work, we propose a modified advanced shift correction method that preserves the eigenspectrum structure of the data by means of a low-rank approximated nullspace correction. We compare our advanced shift to classical eigenvalue corrections like eigenvalue clipping, flipping, squaring, and shifting on several benchmark data sets. The impact of a low-rank approximation on the data's eigenspectrum is analyzed.
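
    The classical baselines named in the abstract (clipping, flipping, squaring, shifting) all operate on the eigenspectrum and can be sketched compactly; the proposed advanced shift with its low-rank approximated nullspace correction is deliberately not reproduced here. The function name and toy matrix below are illustrative assumptions.

```python
import numpy as np

def correct_eigenspectrum(S: np.ndarray, mode: str) -> np.ndarray:
    """Classical eigenvalue corrections turning an indefinite symmetric
    similarity matrix S into a positive semi-definite one."""
    lam, U = np.linalg.eigh(S)
    if mode == "clip":      # zero out negative eigenvalues
        lam = np.maximum(lam, 0.0)
    elif mode == "flip":    # take absolute values
        lam = np.abs(lam)
    elif mode == "square":  # square all eigenvalues
        lam = lam ** 2
    elif mode == "shift":   # raise the whole spectrum; alters every eigenvalue
        lam = lam - min(lam.min(), 0.0)
    return (U * lam) @ U.T  # reassemble the corrected matrix

S = np.array([[ 1.0,  0.9, -0.4],
              [ 0.9,  1.0,  0.2],
              [-0.4,  0.2,  1.0]])
for mode in ("clip", "flip", "square", "shift"):
    assert np.linalg.eigvalsh(correct_eigenspectrum(S, mode)).min() >= -1e-8
```

    Note how the shift changes every eigenvalue at once, illustrating the abstract's point that it can distort the spectrum's structure, whereas clipping and flipping modify only the negative part.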

    OJS Software Workshop Report

    This report summarizes the achievements of OJS community members from Germany and Switzerland at the OJS Workshop held at Heidelberg University Library, Germany, on February 20 and 21, 2020. The main goal of the workshop was to share knowledge and challenges, to conceptualize and document proposed solutions, and to collectively develop software in and around OJS. Participants worked on a variety of subjects including data import/export plugins, search functionality, containerization, long-term archiving, and XML workflows in and around OJS and OMP. The workshop continues a series of fruitful meetings within the German OJS user and developer community under the auspices of the OJS-de.net network.

    Blockbuster Middle Ages. Proceedings of the Postgraduate Conference Bamberg 2015

    This volume, which grew out of a postgraduate conference held in Bamberg in 2015, centers on the reception of the Middle Ages in big-budget film and serial productions. The notion of the blockbuster plays an important role here in that it considers the media commodity 'Middle Ages' from both the production side and the audience side, both of which are involved in creating the 'blockbuster' phenomenon. Since the emergence of such financially lavish productions built on spectacular visual display in the 1950s and 1960s, hardly a year has passed in which the major studios did not release a Hollywood-brand medieval film; this points to the enduring economic potential of productions dealing with the Middle Ages and thus to their stable popularity with audiences. Following the laws of supply and demand, the global cinema audience thus helps shape the film market on the one hand, while at the same time the resulting films, distributed internationally with great reach, strongly shape, transmit, and perpetuate their viewers' image of the Middle Ages. The 21 contributions of the conference volume engage with the phenomenon of 'the Middle Ages in blockbuster cinema' in terms of content, but also address the need to reflect on it and the conditions for its productive use in university teaching and the classroom.