3,896 research outputs found

    Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)

    Real-time analytics that requires integration and aggregation of heterogeneous and distributed streaming and static data is a typical task in many industrial scenarios, such as diagnostics of turbines at Siemens. The ontology-based data access (OBDA) approach has great potential to facilitate such tasks; however, it has a number of limitations in dealing with analytics that restrict its use in important industrial applications. Based on our experience with Siemens, we argue that in order to overcome those limitations OBDA should be extended to become analytics, source, and cost aware. In this work we propose such an extension. In particular, we propose an ontology, mapping, and query language for OBDA in which aggregate and other analytical functions are first-class citizens. Moreover, we develop query optimisation techniques that allow analytical tasks over static and streaming data to be processed efficiently. We implement our approach in a system and evaluate it with Siemens turbine data.

    Preventing Location-Based Identity Inference in Anonymous Spatial Queries

    The increasing trend of embedding positioning capabilities (for example, GPS) in mobile devices facilitates the widespread use of Location-Based Services. For such applications to succeed, privacy and confidentiality are essential. Existing privacy-enhancing techniques rely on encryption to safeguard communication channels, and on pseudonyms to protect user identities. Nevertheless, the query contents may disclose the physical location of the user. In this paper, we present a framework for preventing location-based identity inference of users who issue spatial queries to Location-Based Services. We propose transformations based on the well-established K-anonymity concept to compute exact answers for range and nearest neighbor search, without revealing the query source. Our methods optimize the entire process of anonymizing the requests and processing the transformed spatial queries. Extensive experimental studies suggest that the proposed techniques are applicable to real-life scenarios with numerous mobile users.

    Log-based Evaluation of Label Splits for Process Models

    Process mining techniques aim to extract insights into processes from event logs. One of the challenges in process mining is identifying interesting and meaningful event labels that contribute to a better understanding of the process. Our application area is mining data from smart homes for the elderly, where the ultimate goal is to signal deviations from usual behavior and provide timely recommendations in order to extend the period of independent living. Extracting individual process models showing user behavior is an important instrument in achieving this goal. However, the interpretation of sensor data at an appropriate abstraction level is not straightforward. For example, a motion sensor in a bedroom can be triggered by tossing and turning in bed or by getting up. We try to derive the actual activity depending on the context (time, previous events, etc.). In this paper we introduce the notion of label refinements, which link more abstract event descriptions with their more refined counterparts. We present a statistical evaluation method to determine the usefulness of a label refinement for a given event log from a process perspective. Based on data from smart homes, we show how our statistical evaluation method for label refinements can be used in practice. Our method was able to select two label refinements out of a set of candidate label refinements that both had a positive effect on model precision. Comment: Paper accepted at the 20th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, to appear in Procedia Computer Science.

    Modelling Non-residential Real Estate Prices and Land Use Development in Windsor with Potential Impacts from the Windsor-Essex Parkway

    A study of non-residential land use in the Windsor, Ontario CMA was undertaken to examine possible local implications of the construction of the Windsor-Essex Parkway. Two distinct model types were employed. The first consisted of price regressions for industrial, vacant, commercial, office, retail, restaurant, and plaza properties. The second studied the discrete choice of land use types within commercial and industrial zoning. The commercial logit model had four alternatives: office, retail, restaurant, and other. The industrial logit model had three alternatives: warehouse, factory, and other. The results obtained from these models provide a useful account of interacting land use processes that can inform future transportation and land use policies. Moreover, the empirical analysis is particularly valuable given that far more research exists on residential land use than on non-residential land use. Finally, the models may be useful in the future as part of a more complex integrated urban model.

    A probabilistic approach for modeling and real-time filtering of freeway detector data

    Traffic surveillance systems are a key component for providing information on traffic conditions and supporting traffic management functions. A large amount of data is currently collected from inductive loop detector systems in the form of three macroscopic traffic parameters (speed, volume, and occupancy). Such information is vital to the successful implementation of transportation data warehouses and decision support systems. The quality of the data is, however, affected by erroneous observations that result from malfunctioning or miscalibrated detectors. The open literature shows that little effort has been made to establish procedures for screening traffic observations in real time. This study presents a probabilistic approach for modeling and real-time screening of freeway traffic data. The study proposes a simple methodology to capture the probabilistic and dynamic relationships between the three traffic parameters using historical data collected from the I-4 corridor in Orlando, Florida. The developed models are then used to identify the probability that each traffic observation is partially or fully invalid.