805 research outputs found

    Pervasive brain monitoring and data sharing based on multi-tier distributed computing and linked data technology

    Get PDF
    EEG-based Brain-computer interfaces (BCI) are facing grant challenges in their real-world applications. The technical difficulties in developing truly wearable multi-modal BCI systems that are capable of making reliable real-time prediction of users’ cognitive states under dynamic real-life situations may appear at times almost insurmountable. Fortunately, recent advances in miniature sensors, wireless communication and distributed computing technologies offered promising ways to bridge these chasms. In this paper, we report our attempt to develop a pervasive on-line BCI system by employing state-of-art technologies such as multi-tier fog and cloud computing, semantic Linked Data search and adaptive prediction/classification models. To verify our approach, we implement a pilot system using wireless dry-electrode EEG headsets and MEMS motion sensors as the front-end devices, Android mobile phones as the personal user interfaces, compact personal computers as the near-end fog servers and the computer clusters hosted by the Taiwan National Center for High-performance Computing (NCHC) as the far-end cloud servers. We succeeded in conducting synchronous multi-modal global data streaming in March and then running a multi-player on-line BCI game in September, 2013. We are currently working with the ARL Translational Neuroscience Branch and the UCSD Movement Disorder Center to use our system in real-life personal stress and in-home Parkinson’s disease patient monitoring experiments. We shall proceed to develop a necessary BCI ontology and add automatic semantic annotation and progressive model refinement capability to our system

    Medical Big Data Analysis in Hospital Information System

    Get PDF
    The rapidly increasing medical data generated from hospital information system (HIS) signifies the era of Big Data in the healthcare domain. These data hold great value to the workflow management, patient care and treatment, scientific research, and education in the healthcare industry. However, the complex, distributed, and highly interdisciplinary nature of medical data has underscored the limitations of traditional data analysis capabilities of data accessing, storage, processing, analyzing, distributing, and sharing. New and efficient technologies are becoming necessary to obtain the wealth of information and knowledge underlying medical Big Data. This chapter discusses medical Big Data analysis in HIS, including an introduction to the fundamental concepts, related platforms and technologies of medical Big Data processing, and advanced Big Data processing technologies

    NORA: Scalable OWL reasoner based on NoSQL databasesand Apache Spark

    Get PDF
    Reasoning is the process of inferring new knowledge and identifying inconsis-tencies within ontologies. Traditional techniques often prove inadequate whenreasoning over large Knowledge Bases containing millions or billions of facts.This article introduces NORA, a persistent and scalable OWL reasoner built ontop of Apache Spark, designed to address the challenges of reasoning over exten-sive and complex ontologies. NORA exploits the scalability of NoSQL databasesto effectively apply inference rules to Big Data ontologies with large ABoxes. Tofacilitatescalablereasoning,OWLdata,includingclassandpropertyhierarchiesand instances, are materialized in the Apache Cassandra database. Spark pro-grams are then evaluated iteratively, uncovering new implicit knowledge fromthe dataset and leading to enhanced performance and more efficient reasoningover large-scale ontologies. NORA has undergone a thorough evaluation withdifferent benchmarking ontologies of varying sizes to assess the scalability of thedeveloped solution.Funding for open access charge: Universidad de Málaga / CBUA This work has been partially funded by grant (funded by MCIN/AEI/10.13039/501100011033/) PID2020-112540RB-C41,AETHER-UMA (A smart data holistic approach for context-aware data analytics: semantics and context exploita-tion). Antonio Benítez-Hidalgo is supported by Grant PRE2018-084280 (Spanish Ministry of Science, Innovation andUniversities)

    Data semantic enrichment for complex event processing over IoT Data Streams

    Get PDF
    This thesis generalizes techniques for processing IoT data streams, semantically enrich data with contextual information, as well as complex event processing in IoT applications. A case study for ECG anomaly detection and signal classification was conducted to validate the knowledge foundation

    Scalable big data systems: Architectures and optimizations

    Get PDF
    Big data analytics has become not just a popular buzzword but also a strategic direction in information technology for many enterprises and government organizations. Even though many new computing and storage systems have been developed for big data analytics, scalable big data processing has become more and more challenging as a result of the huge and rapidly growing size of real-world data. Dedicated to the development of architectures and optimization techniques for scaling big data processing systems, especially in the era of cloud computing, this dissertation makes three unique contributions. First, it introduces a suite of graph partitioning algorithms that can run much faster than existing data distribution methods and inherently scale to the growth of big data. The main idea of these approaches is to partition a big graph by preserving the core computational data structure as much as possible to maximize intra-server computation and minimize inter-server communication. In addition, it proposes a distributed iterative graph computation framework that effectively utilizes secondary storage to maximize access locality and speed up distributed iterative graph computations. The framework not only considerably reduces memory requirements for iterative graph algorithms but also significantly improves the performance of iterative graph computations. Last but not the least, it establishes a suite of optimization techniques for scalable spatial data processing along with three orthogonal dimensions: (i) scalable processing of spatial alarms for mobile users traveling on road networks, (ii) scalable location tagging for improving the quality of Twitter data analytics and prediction accuracy, and (iii) lightweight spatial indexing for enhancing the performance of big spatial data queries.Ph.D

    Polyflow: a Polystore-compliant mechanism to provide interoperability to heterogeneous provenance graphs

    Get PDF
    Many scientific experiments are modeled as workflows. Workflows usually output massive amounts of data. To guarantee the reproducibility of workflows, they are usually orchestrated by Workflow Management Systems (WfMS), that capture provenance data. Provenance represents the lineage of a data fragment throughout its transformations by activities in a workflow. Provenance traces are usually represented as graphs. These graphs allows scientists to analyze and evaluate results produced by a workflow. However, each WfMS has a proprietary format for provenance and do it in different granularity levels. Therefore, in more complex scenarios in which the scientist needs to interpret provenance graphs generated by multiple WfMSs and workflows, a challenge arises. To first understand the research landscape, we conduct a Systematic Literature Mapping, assessing existing solutions under several different lenses. With a clearer understanding of the state of the art, we propose a tool called Polyflow, which is based on the concept of Polystore systems, integrating several databases of heterogeneous origin by adopting a global ProvONE schema. Polyflow allows scientists to query multiple provenance graphs in an integrated way. Polyflow was evaluated by experts using provenance data collected from real experiments that generate phylogenetic trees through workflows. The experiment results suggest that Polyflow is a viable solution for interoperating heterogeneous provenance data generated by different WfMSs, from both a usability and performance standpoint.Muitos experimentos científicos são modelados como workflows (fluxos de trabalho). Workflows produzem comumente um grande volume de dados. De forma a garantir a reprodutibilidade desses workflows, estes geralmente são orquestrados por Sistemas de Gerência de Workflows (SGWfs), garantindo que dados de proveniência sejam capturados. Dados de proveniência representam o histórico de derivação de um dado ao longo da execução do workflow. Assim, o histórico de derivação dos dados pode ser representado por meio de um grafo de proveniência. Este grafo possibilita aos cientistas analisarem e avaliarem resultados produzidos por um workflow. Todavia, cada SGWf tem seu formato proprietário de representação para dados de proveniência, e os armazenam em diferentes granularidades. Consequentemente, em cenários mais complexos em que um cientista precisa analisar de forma integrada grafos de proveniência gerados por múltiplos workflows, isso se torna desafiador. Primeiramente, para entender o campo de pesquisa, realizamos um Mapeamento Sistemático da Literatura, avaliando soluções existentes sob diferentes lentes. Com uma compreensão mais clara do atual estado da arte, propomos uma ferramenta chamada Polyflow, inspirada em conceitos de sistemas Polystore, possibilitando a integração de várias bases de dados heterogêneas por meio de uma interface de consulta única que utiliza o ProvONE como schema global. Polyflow permite que cientistas submetam consultas em múltiplos grafos de proveniência de maneira integrada. Polyflow foi avaliado em conjunto com especialistas usando dados de proveniência coletados de workflows reais que apoiam o estudo de geração de árvores filogenéticas. O resultado da avaliação mostrou a viabilidade do Polyflow para interoperar semanticamente dados de proveniência gerado por distintos SGWfs, tanto do ponto de vista de desempenho quanto de usabilidade

    Streaming the Web: Reasoning over dynamic data.

    Get PDF
    In the last few years a new research area, called stream reasoning, emerged to bridge the gap between reasoning and stream processing. While current reasoning approaches are designed to work on mainly static data, the Web is, on the other hand, extremely dynamic: information is frequently changed and updated, and new data is continuously generated from a huge number of sources, often at high rate. In other words, fresh information is constantly made available in the form of streams of new data and updates. Despite some promising investigations in the area, stream reasoning is still in its infancy, both from the perspective of models and theories development, and from the perspective of systems and tools design and implementation. The aim of this paper is threefold: (i) we identify the requirements coming from different application scenarios, and we isolate the problems they pose; (ii) we survey existing approaches and proposals in the area of stream reasoning, highlighting their strengths and limitations; (iii) we draw a research agenda to guide the future research and development of stream reasoning. In doing so, we also analyze related research fields to extract algorithms, models, techniques, and solutions that could be useful in the area of stream reasoning. © 2014 Elsevier B.V. All rights reserved
    • …
    corecore