4 research outputs found

    DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines

    Integrated data analysis (IDA) pipelines, which combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring, are becoming increasingly common in practice. Interestingly, systems in these areas share many compilation and runtime techniques, and the increasingly heterogeneous hardware infrastructure they use is converging as well. Yet the programming paradigms, cluster resource management, data formats and representations, and execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage, aimed at increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines and describe the overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, and local and distributed operations. Preliminary experiments that compare DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.

    Multi-sensor real-time data fusion on embedded computing platforms

    The key question addressed in this thesis is how to select and implement extended techniques for acoustic feature extraction and data fusion for real-time vehicle classification on embedded platforms. The objectives are two-fold. The first is the selection and implementation of appropriate algorithms for feature extraction and data fusion. The selection criteria are based on class-specific measures that maximize the between-class variance and minimize the within-class variability of the features. Furthermore, these features satisfy predefined constraints on available resources and execution performance. Second, the acoustic classifiers are integrated into a multi-sensor self-training framework, with the goal of significantly reducing the effort of manually labeling training data, and into an audio-visual on-line co-training framework that supports autonomous learning and scene adaptation. Additional high-level fusion techniques substantially increase the overall classification performance of the system. An embedded multi-sensor data fusion platform prototype has been developed to serve as an evaluation platform. Numerous evaluation experiments have been performed on this prototype to show the feasibility of the acoustic real-time vehicle classification approach under embedded constraints. Moreover, the evaluation of the self-training and audio-visual co-training frameworks shows that the acoustic classifiers can support the autonomous on-line learning of visual as well as acoustic classifiers and collaborative audio-video classification.

    Andreas Starzacher. Title differs according to the author's translation. Abstract in German. Klagenfurt, Alpen-Adria-Univ., Diss., 2010. KB2010 26. OeBB. (VLID)241010
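The selection criterion described in the abstract above (high between-class variance, low within-class variability) can be illustrated with a Fisher-style score per feature. This is only a minimal sketch under stated assumptions: the thesis abstract does not say which exact measure was used, and the function names `fisher_score` and `rank_features`, as well as the toy data, are hypothetical.

```python
from statistics import fmean


def fisher_score(values, labels):
    """Ratio of between-class to within-class scatter for a single feature.

    Assumption: a Fisher-style ratio stands in for the unspecified
    class-specific measures mentioned in the abstract.
    """
    overall_mean = fmean(values)

    # Group the feature values by class label.
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)

    between = 0.0
    within = 0.0
    for members in by_class.values():
        class_mean = fmean(members)
        # Between-class scatter: class means spread around the overall mean.
        between += len(members) * (class_mean - overall_mean) ** 2
        # Within-class scatter: samples spread around their own class mean.
        within += sum((v - class_mean) ** 2 for v in members)

    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return (feature indices from most to least discriminative, scores)."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
rows = [(1.0, 5.0), (1.2, 3.0), (3.0, 4.0), (3.2, 4.5)]
labels = ["car", "car", "truck", "truck"]
order, scores = rank_features(rows, labels)
print(order)
```

On an embedded platform, a ranking like this would typically be combined with a cost budget (memory, cycles per frame) so that only the highest-scoring features that fit the budget are kept; that step is omitted here.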
