    Data semantic enrichment for complex event processing over IoT Data Streams

    This thesis generalizes techniques for processing IoT data streams, semantically enriching data with contextual information, and performing complex event processing in IoT applications. A case study on ECG anomaly detection and signal classification was conducted to validate the knowledge foundation.
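    As a concrete illustration of the two ideas the abstract combines (not the thesis's actual pipeline), the following Python sketch enriches raw sensor readings with contextual metadata and then applies a simple complex-event pattern over the enriched stream; the names (Enriched, CONTEXT, the 120-unit threshold) are hypothetical.

        from dataclasses import dataclass

        @dataclass
        class Enriched:
            device_id: str
            value: float
            location: str  # contextual information joined onto the raw reading

        # Hypothetical context store mapping device IDs to deployment metadata.
        CONTEXT = {"ecg-01": "ward-3"}

        def enrich(device_id: str, value: float) -> Enriched:
            return Enriched(device_id, value, CONTEXT.get(device_id, "unknown"))

        def sustained_high(window, threshold=120.0, k=3):
            # Toy CEP pattern: k consecutive enriched readings above the threshold.
            return len(window) >= k and all(e.value > threshold for e in window[-k:])

        stream = [enrich("ecg-01", v) for v in (80.0, 125.0, 130.0, 128.0)]
        if sustained_high(stream):
            print(f"complex event in {stream[-1].location}: sustained high readings")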

    Structure and representation of ecological data to support knowledge discovery: A case study with bioacoustic data

    Bird communities have long been surveyed as key indicators of ecosystem health and biodiversity. Adoption of Autonomous Recording Units (ARUs) to perform avian surveys has shifted the burden of species recognition from "birders" in the field to "listeners" who review the ARU recordings at a later time. The number of recordings ARUs can produce has created a need to process large amounts of data. Although much research is devoted to fully automating the recognition process, expert humans are still required when entire bird communities must be identified. A framework for a Decision Support System (DSS) is presented which would assist listeners by suggesting likely species. A unique feature of the DSS is the consideration of the recording "context" of time, location, and habitat, as well as the bioacoustic features, to match unknown vocalizations with reference species. In this thesis a data warehouse was built for an existing set of bioacoustic research data as a first step towards creating the DSS. The data set was from ARU deployments in the Lower Athabasca Region of Alberta, Canada. The Knowledge Discovery in Databases (KDD) and Dimensional Design Process protocols were used as guides to build a Kimball-style data warehouse. Data housed in the data warehouse included field data, data derived from GIS analysis, fuzzy logic memberships, and symbolic representations of bioacoustic recordings using the Piecewise Aggregate Approximation and Symbolic Aggregate approXimation (PAA/SAX). Examples of how missing and erroneous data were detected and processed are given. The sources of uncertainty inherent in ecological data are discussed, and fuzzy logic is demonstrated as a soft-computing technique to accommodate such data. Data warehouses are commonly used for business applications but are equally applicable to ecological data. As most guidance on building data warehouses targets business data, this thesis is offered as an example for ecologists interested in moving their data to a data warehouse. This thesis presents a case study of how a data warehouse can be constructed for existing ecological data, whether as part of a DSS or as a tool for viewing research data. Keywords: Symbolic aggregate approximation; Bioacoustics; Decision support system; Data warehouse; Fuzzy logic; Birds; Autonomous recording units; Piecewise aggregate approximation.
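    Since the abstract leans on PAA/SAX for the symbolic representation of recordings, a minimal Python sketch of the two steps may help; the segment count, alphabet size, and breakpoints below are standard textbook choices, not necessarily those used in the thesis.

        import numpy as np

        # Breakpoints cutting N(0,1) into four equal-probability regions (alphabet size 4).
        BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])

        def paa(series, n_segments):
            # Piecewise Aggregate Approximation: mean of each of n_segments chunks.
            chunks = np.array_split(np.asarray(series, dtype=float), n_segments)
            return np.array([c.mean() for c in chunks])

        def sax(series, n_segments=8, alphabet="abcd"):
            # Symbolic Aggregate approXimation: z-normalise, reduce with PAA,
            # then map each segment mean to a symbol via the breakpoints.
            x = np.asarray(series, dtype=float)
            x = (x - x.mean()) / (x.std() + 1e-12)
            return "".join(alphabet[i] for i in np.digitize(paa(x, n_segments), BREAKPOINTS))

        # e.g. sax(np.sin(np.linspace(0, 2 * np.pi, 256))) yields an 8-letter word.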

    High-Level Facade Image Interpretation using Marked Point Processes

    In this thesis, we address facade image interpretation as one essential ingredient for the generation of highly detailed, semantically meaningful, three-dimensional city models. Given a single rectified facade image, we detect relevant facade objects such as windows, entrances, and balconies, yielding a description of the image in terms of the accurate position and size of these objects. Urban digital three-dimensional reconstruction and documentation is an active area of research with several potential applications, e.g., in digital mapping for navigation, urban planning, emergency management, disaster control, or the entertainment industry. A detailed building model which is not just a geometric object enriched with texture allows for semantic queries such as the number of floors or the location of balconies and entrances. Facade image interpretation is one essential step towards such models. In this thesis, we propose interpreting facade images by combining evidence for the occurrence of individual object classes, which we derive from data, with prior knowledge that guides the image interpretation in its entirety. We present a three-step procedure which generates features suited to describing the relevant objects, learns a representation suited to object detection, and enables image interpretation using the detection results while incorporating prior knowledge about typical configurations of facade objects, which we learn from training data. According to these three sub-tasks, our major achievements are as follows. We propose a novel method for facade image interpretation based on a marked point process. To this end, we develop a model for the description of typical configurations of facade objects and propose an image interpretation system which combines evidence derived from data with this prior knowledge. In order to generate evidence from data, we propose a feature type which we call shapelets; they are scale invariant and highly distinctive for facade objects. Segments of lines, arcs, and ellipses serve as basic features for the generation of shapelets. To extract these, we propose a novel line simplification approach which approximates a given pixel chain by a sequence of straight line segments and circular and elliptical arcs. Among other components, it is based on an adaptation of the Douglas-Peucker algorithm that uses arc segments instead of straight line segments as basic geometric elements. We evaluate each step separately. We show the effects of polyline segmentation and simplification on several images, with results comparable to or better than a state-of-the-art algorithm. Using shapelets, we obtain reasonable classification performance on a challenging dataset featuring intra-class variation, clutter, and scale changes, which demonstrates their distinctiveness for facade objects. Finally, we show promising results for the facade interpretation system on several datasets and provide a qualitative evaluation which demonstrates its capability for complete and accurate detection of facade objects.
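    The line simplification described above builds on the Douglas-Peucker algorithm; as background, here is a minimal Python sketch of the classic straight-line variant (the thesis's adaptation to circular and elliptical arcs is not reproduced here).

        import numpy as np

        def point_line_distance(p, a, b):
            # Perpendicular distance of point p from the line through a and b.
            d, q = b - a, p - a
            n = np.hypot(d[0], d[1])
            return np.hypot(q[0], q[1]) if n == 0.0 else abs(d[0] * q[1] - d[1] * q[0]) / n

        def douglas_peucker(points, tol):
            # Recursively keep the point farthest from the chord if it exceeds tol.
            if len(points) < 3:
                return list(points)
            dists = [point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
            i = int(np.argmax(dists)) + 1
            if dists[i - 1] > tol:
                left = douglas_peucker(points[: i + 1], tol)
                return left[:-1] + douglas_peucker(points[i:], tol)
            return [points[0], points[-1]]

        # pixel_chain = [np.array([x, y]), ...]; douglas_peucker(pixel_chain, tol=1.5)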

    Machine learning techniques for identification using mobile and social media data

    Networked access and mobile devices provide near-constant data generation and collection. Users, environments, and applications each generate different types of data; from the voluntarily provided data posted on social networks to data collected by sensors on mobile devices, it is becoming trivial to access big data caches. Processing sufficiently large amounts of data yields inferences that can be characterized as privacy invasive. In order to address privacy risks, we must understand the limits of the data by exploring relationships between variables and how the user is reflected in them. In this dissertation we look at data collected from social networks and sensors to identify some aspect of the user or their surroundings. In particular, we find that from social media metadata we can identify individual user accounts, and from magnetic field readings we can identify both the (unique) cellphone device owned by the user and their coarse-grained location. In each project we collect real-world datasets and apply supervised learning techniques, particularly multi-class classification algorithms, to test our hypotheses. We use both leave-one-out and k-fold cross validation to reduce any bias in the results. Throughout the dissertation we find that unprotected data reveals sensitive information about users. Each chapter also contains a discussion of possible obfuscation techniques or countermeasures and their effectiveness with regard to the conclusions we present. Overall, our results show that deriving information about users is attainable and, with each of these results, users would have limited, if any, indication that any type of analysis was taking place.
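    The evaluation protocol the abstract mentions (multi-class classification with leave-one-out and k-fold cross validation) can be sketched generically with scikit-learn; the synthetic data and random forest below are stand-ins, not the dissertation's actual features or models.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

        # Synthetic stand-in for sensor/metadata features labelled by user identity.
        X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                                   n_classes=5, random_state=0)
        clf = RandomForestClassifier(n_estimators=100, random_state=0)

        # k-fold cross validation (here k=10) averages accuracy over disjoint folds.
        kfold = KFold(n_splits=10, shuffle=True, random_state=0)
        print("10-fold accuracy:", cross_val_score(clf, X, y, cv=kfold).mean())

        # Leave-one-out uses one fold per sample, removing fold-assignment bias.
        print("LOO accuracy:", cross_val_score(clf, X, y, cv=LeaveOneOut()).mean())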

    Cognition-enabled robotic wiping: Representation, planning, execution, and interpretation

    Advanced cognitive capabilities enable humans to solve even complex tasks by representing and processing internal models of manipulation actions and their effects. Consequently, humans are able to plan the effect of their motions before execution and validate the performance afterwards. In this work, we derive an analogous approach for robotic wiping actions, which are fundamental for some of the most frequent household chores, including vacuuming the floor, sweeping dust, and cleaning windows. We describe wiping actions and their effects based on a qualitative particle distribution model. This representation enables a robot to plan goal-oriented wiping motions for the prototypical wiping actions of absorbing, collecting, and skimming. The particle representation is used to simulate the task outcome before execution and to infer the real performance afterwards based on haptic perception. This way, the robot is able to estimate the task performance and schedule additional motions if necessary. We evaluate our methods in simulated scenarios as well as in real experiments with the humanoid service robot Rollin' Justin.
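    To make the qualitative particle distribution model concrete, here is a deliberately simple Python toy (not the authors' representation): dirt is a per-cell particle count, an absorbing wipe removes a fraction of the particles along a stroke, and a planner can pick strokes and predict their effect before execution.

        import numpy as np

        rng = np.random.default_rng(0)
        surface = rng.integers(0, 5, size=(10, 10)).astype(float)  # particles per cell

        def wipe_row(surface, row, absorb=0.8):
            # Absorbing wipe along one row: the tool soaks up a fraction of particles.
            removed = surface[row] * absorb
            surface[row] -= removed
            return removed.sum()

        # Greedy plan: wipe the dirtiest row, then re-estimate the simulated state.
        for _ in range(3):
            dirtiest = int(surface.sum(axis=1).argmax())
            print(f"row {dirtiest}: absorbed {wipe_row(surface, dirtiest):.1f} particles")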

    FLAGS: a methodology for adaptive anomaly detection and root cause analysis on sensor data streams by fusing expert knowledge with machine learning

    Anomalies and faults can be detected, and their causes verified, using both data-driven and knowledge-driven techniques. Data-driven techniques can adapt their internal functioning based on the raw input data but fail to explain the manifestation of any detection. Knowledge-driven techniques inherently deliver the cause of the faults that were detected but require too much human effort to set up. In this paper, we introduce FLAGS, the Fused-AI interpretabLe Anomaly Generation System, which combines both techniques in one methodology to overcome their limitations and optimizes them based on limited user feedback. Semantic knowledge is incorporated into a machine learning technique to enhance expressivity. At the same time, feedback about the faults and anomalies that occurred is provided as input to increase adaptiveness using semantic rule mining methods. The methodology is evaluated on a predictive maintenance case for trains. We show that our method reduces train downtime and provides more insight into frequently occurring problems.
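    A minimal Python illustration of the fusion idea (data-driven detection plus knowledge-driven cause assignment) is given below; IsolationForest and the two rules are generic stand-ins, not FLAGS's actual components.

        import numpy as np
        from sklearn.ensemble import IsolationForest

        rng = np.random.default_rng(1)
        # Sensor readings [temperature, vibration]; inject one hot, vibrating outlier.
        X = rng.normal([60.0, 0.2], [2.0, 0.05], size=(200, 2))
        X[-1] = [95.0, 0.9]

        # Data-driven layer: flags anomalies but cannot explain them.
        flagged = X[IsolationForest(random_state=0).fit(X).predict(X) == -1]

        # Knowledge-driven layer: expert rules attach a cause to each detection.
        RULES = [(lambda t, v: t > 90.0, "overheating"),
                 (lambda t, v: v > 0.6, "excessive vibration")]
        for t, v in flagged:
            causes = [name for rule, name in RULES if rule(t, v)] or ["unknown"]
            print(f"anomaly (T={t:.1f}, vib={v:.2f}): {', '.join(causes)}")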

    Time series classification through transformation and ensembles

    The problem of time series classification (TSC), where we consider any real-valued ordered data a time series, offers a specific challenge: unlike traditional classification problems, the ordering of attributes is often crucial for identifying discriminatory features between classes. TSC problems arise across a diverse range of domains, and this variety has meant that no single approach outperforms all others. The general consensus is that the benchmark for TSC is nearest neighbour (NN) classification using Euclidean distance or Dynamic Time Warping (DTW). Though conceptually simple, NN classifiers have been widely reported as very difficult to beat, and new work is often compared against them. The majority of approaches have focused on classification in the time domain, typically proposing alternative elastic similarity measures for NN classification. Other work has investigated more specialised approaches, such as building support vector machines on variable intervals and creating tree-based ensembles with summary measures. We wish to answer a specific research question: given a new TSC problem without any prior, specialised knowledge, what is the best way to approach the problem? Our thesis is that the best methodology is to first transform the data into alternative representations where discriminatory features are more easily detected, and then build ensemble classifiers on each representation. In support of our thesis, we propose an elastic ensemble classifier that we believe is the first ever to significantly outperform DTW on the widely used UCR datasets. Next, we propose the shapelet transform, a new data transformation that allows complex classifiers to be coupled with shapelets; it outperforms the original algorithm and is competitive with DTW. Finally, we combine these two contributions with heterogeneous ensembles built on autocorrelation- and spectral-transformed data to propose a collective of transformation-based ensembles (COTE). The results of COTE are, we believe, the best ever published on the UCR datasets.
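    The DTW-based nearest neighbour benchmark that the abstract treats as the baseline can be sketched in a few lines of Python; this is the standard textbook formulation, with no warping window or other refinements.

        import numpy as np

        def dtw_distance(a, b):
            # Dynamic Time Warping: cost of the best monotone alignment of a and b.
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = (a[i - 1] - b[j - 1]) ** 2
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return np.sqrt(D[n, m])

        def nn_classify(query, train_X, train_y):
            # 1-NN under DTW: the common benchmark new TSC work is compared against.
            return train_y[int(np.argmin([dtw_distance(query, x) for x in train_X]))]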