546 research outputs found

    Explora : interactive querying of multidimensional data in the context of smart cities

    Citizen engagement is one of the key factors for smart city initiatives to remain sustainable over time. This in turn entails providing citizens and other relevant stakeholders with up-to-date data and tools that enable them to derive insights that add value to their day-to-day life. The massive volume of data constantly produced in these smart city environments makes satisfying this requirement particularly challenging. This paper introduces Explora, a generic framework for serving the interactive, low-latency requests typical of visual exploratory applications on spatiotemporal data. Explora leverages stream processing to derive, at ingestion time, synopsis data structures that concisely capture the spatial and temporal trends and dynamics of the sensed variables and serve as compacted data sets providing fast (approximate) answers to visual queries on smart city data. The experimental evaluation, conducted on proof-of-concept implementations of Explora based on traditional database and distributed data processing setups, shows a decrease of up to two orders of magnitude in query latency compared to queries running on the raw base data, at the expense of less than 10% in query accuracy and 30% in data footprint. The implementation of the framework on real smart city data, along with the experimental results obtained, demonstrates the feasibility of the proposed approach.
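    The synopsis idea can be pictured with a minimal sketch: readings are folded, at ingestion time, into per-grid-cell, per-time-bucket summaries, and aggregate queries over a spatial extent and time window are then answered from those summaries instead of the raw data. The grid and bucket sizes, the summary fields, and the query shape below are assumptions for illustration, not Explora's actual data structures or API.

```python
# Minimal sketch (assumed, not Explora's implementation): a synopsis keyed by
# (grid cell, time bucket) that is updated at ingestion time and later used to
# answer approximate spatial aggregate queries without touching raw data.
from collections import defaultdict
from dataclasses import dataclass

CELL_DEG = 0.01          # assumed spatial resolution (~1 km grid)
BUCKET_SECONDS = 3600    # assumed temporal resolution (hourly buckets)

@dataclass
class Summary:
    count: int = 0
    total: float = 0.0

synopsis = defaultdict(Summary)  # (cell_x, cell_y, bucket) -> Summary

def ingest(lat: float, lon: float, ts: int, value: float) -> None:
    """Fold one sensor reading into the synopsis at ingestion time."""
    key = (int(lat / CELL_DEG), int(lon / CELL_DEG), ts // BUCKET_SECONDS)
    s = synopsis[key]
    s.count += 1
    s.total += value

def approx_mean(lat_range, lon_range, t_range):
    """Approximate mean over a bounding box and time window, served from the synopsis."""
    x_lo, x_hi = (int(v / CELL_DEG) for v in lat_range)
    y_lo, y_hi = (int(v / CELL_DEG) for v in lon_range)
    b_lo, b_hi = (int(t // BUCKET_SECONDS) for t in t_range)
    count, total = 0, 0.0
    for (x, y, b), s in synopsis.items():
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi and b_lo <= b <= b_hi:
            count += s.count
            total += s.total
    return total / count if count else None
```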

    IRS-III: A broker-based approach to semantic Web services

    A factor limiting the take-up of Web services is that all tasks associated with the creation of an application, for example finding, composing, and resolving mismatches between Web services, have to be carried out by a software developer. Semantic Web services, a combination of semantic Web and Web service technologies, promise to alleviate these problems. In this paper we describe IRS-III, a framework for creating and executing semantic Web services, which takes a semantic, broker-based approach to mediating between service requesters and service providers. We describe the overall approach and the components of IRS-III from an ontological and architectural viewpoint, and then illustrate our approach through an application in the eGovernment domain.
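    As a rough illustration of the broker-based mediation described above, the sketch below shows providers registering semantic capability descriptions (a goal plus required inputs) and a requester asking the broker to achieve a goal. The class names, registration call, and the change-of-address example are hypothetical and are not the IRS-III ontology or API.

```python
# Illustrative broker sketch (not the IRS-III API): providers register capability
# descriptions; the broker mediates by matching a requester's goal against those
# descriptions and invoking the selected provider.
from typing import Any, Callable, Dict

class Broker:
    def __init__(self) -> None:
        self._providers: Dict[str, tuple] = {}

    def register(self, goal: str, inputs: set, endpoint: Callable[..., Any]) -> None:
        """A service provider publishes the goal it achieves and the inputs it needs."""
        self._providers[goal] = (inputs, endpoint)

    def achieve_goal(self, goal: str, **kwargs) -> Any:
        """A requester states a goal; the broker checks inputs and invokes a provider."""
        if goal not in self._providers:
            raise LookupError(f"no provider advertises goal '{goal}'")
        required, endpoint = self._providers[goal]
        missing = required - kwargs.keys()
        if missing:
            raise ValueError(f"goal '{goal}' needs inputs: {sorted(missing)}")
        return endpoint(**{k: kwargs[k] for k in required})

# Hypothetical usage: an eGovernment change-of-address service.
broker = Broker()
broker.register("change-of-address", {"citizen_id", "new_address"},
                lambda citizen_id, new_address: f"address of {citizen_id} set to {new_address}")
print(broker.achieve_goal("change-of-address", citizen_id="C42", new_address="1 High St"))
```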

    Uncertainty analysis in the Model Web

    This thesis provides a set of tools for managing uncertainty in Web-based models and workflows. To support the use of these tools, the thesis first provides a framework for exposing models through Web services. An introduction to uncertainty management, Web service interfaces, and workflow standards and technologies is given, with a particular focus on the geospatial domain. An existing specification for exposing geospatial models and processes, the Web Processing Service (WPS), is critically reviewed. A processing service framework is presented as a solution to usability issues with the WPS standard. The framework implements support for the Simple Object Access Protocol (SOAP), the Web Service Description Language (WSDL), and JavaScript Object Notation (JSON), allowing models to be consumed by a variety of tools and software. Strategies for communicating with models from Web service interfaces are discussed, demonstrating the difficulty of exposing existing models on the Web. The thesis then reviews existing mechanisms for uncertainty management, with an emphasis on emulator methods for building efficient statistical surrogate models. A tool is developed to solve accessibility issues with such methods by providing a Web-based user interface and backend that ease the process of building and integrating emulators. These tools, together with the processing service framework, are applied to a real case study as part of the UncertWeb project. The usability of the framework is demonstrated through the implementation of a Web-based workflow for predicting future crop yields in the UK, which also shows the capabilities of the tools for emulator building and integration. Future directions for the development of the tools are discussed.
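    The emulator approach can be sketched as follows: run an expensive model at a handful of design points, fit a cheap statistical surrogate to the results, and use the surrogate's predictive mean and standard deviation in place of the model. The use of a scikit-learn Gaussian process and the toy model below are assumptions for illustration; the thesis' own emulator tooling may differ.

```python
# Hedged sketch of the emulator idea: fit a statistical surrogate (here a
# Gaussian process via scikit-learn, an assumed choice) to a small number of
# runs of an expensive model, then use its predictive mean and standard
# deviation as an uncertainty-aware stand-in for the model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_model(x: np.ndarray) -> np.ndarray:
    """Stand-in for a slow simulation (e.g. a crop-yield model); hypothetical."""
    return np.sin(3 * x).ravel() + 0.1 * x.ravel() ** 2

# A handful of design points is all we can afford to run the real model on.
X_design = np.linspace(0.0, 3.0, 8).reshape(-1, 1)
y_design = expensive_model(X_design)

emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
emulator.fit(X_design, y_design)

# The emulator answers new queries instantly, with a standard deviation that
# quantifies how uncertain the surrogate is away from the design points.
X_new = np.array([[1.7], [2.9]])
mean, std = emulator.predict(X_new, return_std=True)
for x, m, s in zip(X_new.ravel(), mean, std):
    print(f"x={x:.2f}: predicted {m:.3f} +/- {s:.3f}")
```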

    Methods to Improve and Evaluate Spatial Data Infrastructures

    This thesis mainly focuses on methods for improving and evaluating Spatial Data Infrastructures (SDIs). The aim has been threefold: to develop a framework for the management and evaluation of an SDI, to improve the accessibility of spatial data in an SDI, and to improve the cartography in the view services of an SDI. A Spatial Data Infrastructure serves as an umbrella covering spatial data handling procedures. The long-term implementation of an SDI increases the need for short- and medium-term feedback from different perspectives; thus, a precise strategic plan and accurate objectives have to be defined to implement an efficient environment for spatial data collection and exchange in a region. In this thesis, a comprehensive study was conducted to review current methods in the business management literature and work towards an integrated framework for the implementation and evaluation of SDIs. In this context, four techniques are described and the usability of each technique for several aspects of SDI implementation is discussed. SDI evaluation has been considered one of the main challenges in recent years, and the lack of a general, goal-oriented framework for assessing an SDI from different perspectives was one of the main concerns of this thesis. Among the current methods in this research area, we focused on the Balanced Scorecard (BSC) as a general evaluation framework covering all perspectives of an SDI. The assessment study revealed a number of important issues ranging from the technical to the cartographic aspects of spatial data exchange in an SDI. To access the required datasets in an SDI, clearinghouse networks have been developed as gateways to the data repositories; however, traditional clearinghouse networks do not satisfy end-user requirements. By adding a number of functionalities, we propose a methodology that increases the rate at which the required data can be accessed. These methods are based on predefined rules and additional procedures, built on Web processing services and service composition, to develop expert-system-based clearinghouses. From the cartographic viewpoint, current methods for spatial data presentation do not satisfy user requirements in an SDI environment; the main presentation problems occur when spatial data are integrated from different sources. For appropriate cartography, we propose a number of methods, such as the polygon overlay method, an icon placement approach that emphasizes the more important layers, and the color saturation method, which decreases the color saturation of unimportant layers and emphasizes the foreground layer according to the visual hierarchy concept. Another cartographic challenge is the geometrical and topological conflicts in the data shown in view services. Geometrical inconsistency is due to the artificial discrepancies that occur when displaying connected information from different sources, caused by inaccuracies and differing levels of detail in the datasets, while semantic conflict relates to the definition of the features involved, i.e., to the information models of the datasets. To overcome and fix these topological and geometrical conflicts, we use a semantics-based expert system built around an automatic cartography core containing a semantic, rule-based component. We propose a system architecture with an OWL (Web Ontology Language) based expert system that improves the cartography by adjusting and resolving topological and geometrical conflicts in geoportals.
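    The color saturation method lends itself to a small sketch: every layer except the designated foreground layer has its HSV saturation scaled down, keeping the foreground highest in the visual hierarchy. The layer names, colors, and the 0.35 desaturation factor below are hypothetical; this is not the thesis' implementation.

```python
# Minimal sketch (assumed) of the colour-saturation idea: background layers
# pulled from other sources are desaturated so that the foreground layer stays
# highest in the visual hierarchy.
import colorsys

def desaturate(rgb, factor: float):
    """Scale the HSV saturation of an RGB colour (components in 0..1) by `factor`."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, s * factor, v)

def style_layers(layers, foreground: str, factor: float = 0.35):
    """Return layer styles with every layer except `foreground` desaturated."""
    styled = []
    for layer in layers:
        colour = layer["colour"]
        if layer["name"] != foreground:
            colour = desaturate(colour, factor)
        styled.append({**layer, "colour": colour})
    return styled

# Hypothetical layers from different SDI sources; the roads layer is emphasised.
layers = [
    {"name": "landuse", "colour": (0.20, 0.80, 0.20)},
    {"name": "hydrology", "colour": (0.10, 0.40, 0.90)},
    {"name": "roads", "colour": (0.90, 0.10, 0.10)},
]
for layer in style_layers(layers, foreground="roads"):
    print(layer["name"], tuple(round(c, 2) for c in layer["colour"]))
```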

    Quality of Service Aware Data Stream Processing for Highly Dynamic and Scalable Applications

    Huge amounts of georeferenced data streams arrive daily at data stream management systems that are deployed to serve highly scalable and dynamic applications. There are innumerable ways in which these loads can be exploited to gain deep insights in various domains, and decision makers require interactive visualization of such data in the form of maps and dashboards for decision making and strategic planning. Data streams normally exhibit fluctuation and oscillation in arrival rates and skewness; these are the two predominant factors that greatly impact the overall quality of service. Data stream management systems must therefore be attuned to these factors, in addition to the spatial shape of the data, which may exaggerate their negative impact. Current systems do not natively support services with quality guarantees for dynamic scenarios, leaving the handling of those logistics to the user, which is challenging and cumbersome. Three workloads are predominant for any data stream: batch processing, scalable storage, and stream processing. In this thesis, we have designed a quality-of-service-aware system, SpatialDSMS, that comprises several subsystems covering those workloads and any mixed workload that results from combining them. Most importantly, we have natively incorporated quality of service optimizations for processing avalanches of georeferenced data streams in highly dynamic application scenarios. This has been achieved transparently on top of the codebases of emerging de facto standard, best-in-class representatives, relieving users in the presentation layer from having to reason about those services. Instead, users express their queries with quality goals, and our system optimizer compiles them down into query plans with embedded quality guarantees, leaving the logistics to the underlying layers. We have developed standards-compliant prototypes for all the subsystems that constitute SpatialDSMS.
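    To make the idea of quality goals concrete, the sketch below shows a query annotated with a latency bound and an accuracy floor, and a toy optimizer that picks a sampling rate whose estimated latency and accuracy satisfy both. The data classes, candidate sampling rates, and the cost and accuracy models are invented for illustration and are not the SpatialDSMS interface.

```python
# Illustrative sketch (not the SpatialDSMS API): a streaming query expressed
# together with quality goals, and a toy "optimizer" that picks a plan
# parameter (a sampling rate) to keep estimated latency under the stated bound.
from dataclasses import dataclass

@dataclass
class QualityGoals:
    max_latency_ms: float      # end-to-end latency bound the user asks for
    min_accuracy: float        # acceptable accuracy, as a fraction in (0, 1]

@dataclass
class QueryPlan:
    sample_rate: float         # fraction of tuples actually processed
    expected_latency_ms: float
    expected_accuracy: float

def compile_plan(arrival_rate: float, cost_ms_per_tuple: float,
                 goals: QualityGoals) -> QueryPlan:
    """Pick the highest sampling rate whose estimated latency and accuracy meet the goals."""
    for rate in (1.0, 0.5, 0.25, 0.1, 0.05):
        latency = arrival_rate * rate * cost_ms_per_tuple
        accuracy = rate ** 0.5          # toy model: accuracy degrades with sampling
        if latency <= goals.max_latency_ms and accuracy >= goals.min_accuracy:
            return QueryPlan(rate, latency, accuracy)
    raise ValueError("quality goals are not satisfiable with the available plans")

plan = compile_plan(arrival_rate=20_000, cost_ms_per_tuple=0.002,
                    goals=QualityGoals(max_latency_ms=25, min_accuracy=0.6))
print(plan)
```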

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the nucleus of the European data community, bringing together businesses and leading researchers to harness the value of data for the benefit of society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; and second, practitioners and industry experts engaged in data-driven systems, software design, and deployment projects who are interested in employing these advanced methods to address real-world problems.

    Workflow repository for providing configurable workflow in ERP

    Workflows in an ERP system that covers a large functional domain are prone to duplication. This work builds a workflow repository that stores the workflow variants of ERP business processes and can be used to compose new workflows according to the needs of a new tenant. The proposed approach consists of two stages: preprocessing and processing. The preprocessing stage aims to find the common workflow and sub-variants among the existing workflow variants; the variants stored by users are Procure-to-Pay workflows. The variants are first selected by similarity using a similarity filtering method and then merged to identify the common workflow and its sub-variants, which are stored as metadata mapped onto a relational database. The detection of common and sub-variant workflows achieves 92% accuracy; the common workflow consists of 3 common workflows derived from 8 workflow variants and has 10% lower complexity than the previous model. The processing stage provides the configurable workflow: the user submits a query model to find the desired workflow, and similarity filtering retrieves the possible common workflows and/or sub-variants, which the user can recompose through a workflow designer. The provisioning of configurable workflows by the ERP reaches 100%, meaning that whatever the user requires can be provided, either as a ready workflow or as a basis for composing another one. Based on the experiments, the workflow repository can be built with the proposed architecture and is able to store and provide workflows, to detect common and sub-variant workflows, and to provide configurable workflows in which users employ the common and sub-variant workflows as the basis for composing the workflows they need.
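    The similarity filtering step can be sketched as follows: each workflow variant is reduced to its set of consecutive activity transitions, and variants are compared to a reference with Jaccard similarity before being considered for merging into a common workflow. The transition-set representation, the 0.5 threshold, and the Procure-to-Pay activity names below are assumptions for illustration, not the thesis' actual algorithm.

```python
# Minimal sketch (assumed) of similarity filtering between workflow variants:
# variants are represented by their sets of directed activity transitions and
# compared with Jaccard similarity before merging.
def transitions(workflow):
    """Represent a workflow variant as the set of consecutive activity pairs."""
    return set(zip(workflow, workflow[1:]))

def similarity(a, b) -> float:
    """Jaccard similarity of the transition sets of two variants."""
    ta, tb = transitions(a), transitions(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def filter_similar(reference, variants, threshold: float = 0.5):
    """Keep only the variants similar enough to the reference to be merged."""
    return [v for v in variants if similarity(reference, v) >= threshold]

# Hypothetical Procure-to-Pay variants expressed as activity sequences.
p2p = ["request", "approve", "order", "receive", "invoice", "pay"]
variants = [
    ["request", "approve", "order", "receive", "pay"],
    ["request", "order", "receive", "invoice", "pay"],
    ["hire", "onboard", "train"],
]
for v in filter_similar(p2p, variants):
    print(" -> ".join(v))
```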

    Low-latency, query-driven analytics over voluminous multidimensional, spatiotemporal datasets

    Ubiquitous data collection from sources such as remote sensing equipment, networked observational devices, location-based services, and sales tracking has led to the accumulation of voluminous datasets; IDC projects that by 2020 we will generate 40 zettabytes of data per year, while Gartner and ABI estimate that 20-35 billion new devices will be connected to the Internet in the same time frame. The storage and processing requirements of these datasets far exceed the capabilities of modern computing hardware, which has led to the development of distributed storage frameworks that can scale out by assimilating more computing resources as necessary. While challenging in its own right, storing and managing voluminous datasets is only the precursor to a broader field of study: extracting knowledge, insights, and relationships from the underlying datasets. The basic building block of this knowledge discovery process is analytic queries, encompassing both query instrumentation and evaluation. This dissertation is centered around query-driven exploratory and predictive analytics over voluminous, multidimensional datasets. Both of these types of analysis represent a higher-level abstraction over classical query models; rather than indexing every discrete value for subsequent retrieval, our framework autonomously learns the relationships and interactions between dimensions in the dataset (including time series and geospatial aspects) and makes the information readily available to users. This functionality includes statistical synopses, correlation analysis, hypothesis testing, probabilistic structures, and predictive models that not only enable the discovery of nuanced relationships between dimensions, but also allow future events and trends to be predicted. This requires specialized data structures and partitioning algorithms, along with adaptive reductions in the search space and management of the inherent trade-off between timeliness and accuracy. The algorithms presented in this dissertation were evaluated empirically on real-world geospatial time-series datasets in a production environment, and are broadly applicable across other storage frameworks.
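    One way to picture the statistical synopses mentioned above is a running summary per spatiotemporal partition, updated online so that exploratory queries can be served from the summaries rather than from every indexed value. The partitioning by geohash prefix and month, and the use of Welford's online mean/variance update, are illustrative assumptions, not the dissertation's actual structures.

```python
# Hedged sketch of a per-partition statistical synopsis: Welford's online
# algorithm maintains running mean and variance for each (region, month)
# partition, so summaries can answer exploratory queries without raw scans.
from collections import defaultdict
from dataclasses import dataclass, field
from math import sqrt

@dataclass
class RunningStats:
    n: int = 0
    mean: float = 0.0
    m2: float = field(default=0.0, repr=False)   # sum of squared deviations

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def stddev(self) -> float:
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

synopses = defaultdict(RunningStats)  # (geohash prefix, month) -> RunningStats

def observe(geohash: str, month: str, value: float) -> None:
    """Fold one observation into the synopsis of its (region, month) partition."""
    synopses[(geohash[:4], month)].update(value)

# Hypothetical readings streamed into the framework.
for value in (14.2, 15.1, 13.8, 16.0):
    observe("9xj5smb", "2016-07", value)
stats = synopses[("9xj5", "2016-07")]
print(f"n={stats.n}, mean={stats.mean:.2f}, stddev={stats.stddev:.2f}")
```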
