4 research outputs found

    Semantic reasoning for context-aware internet of things applications

    Abstract: Acquiring knowledge from continuous and heterogeneous data streams is a prerequisite for Internet of Things (IoT) applications. Semantic technologies provide comprehensive tools and applicable methods for representing, integrating, and acquiring knowledge. However, resource constraints, dynamics, mobility, scalability, and real-time requirements introduce challenges for applying these methods in IoT environments. We study how to use semantic IoT data to infer actionable knowledge by applying state-of-the-art semantic technologies. For these studies, we have developed a semantic reasoning system operating in a realistic IoT environment. We evaluate the scalability of different reasoning approaches, including a single reasoner, distributed reasoners, mobile reasoners, and a hybrid of them. We evaluate the reasoning latencies introduced by different semantic data formats. We verify the capabilities of promising semantic technologies for IoT applications by comparing the scalability and real-time response of different reasoning approaches with various semantic data formats. Moreover, we evaluate different data aggregation strategies for integrating distributed IoT data for reasoning processes.

    Scalable Reference Genome Assembly from Compressed Pan-Genome Index with Spark

    High-throughput sequencing (HTS) technologies have enabled rapid sequencing of genomes and large-scale genome analytics with massive data sets. Traditionally, genetic variation analyses have been based on the human reference genome assembled from a relatively small human population. However, genetic variation could be discovered more comprehensively by using a collection of genomes, i.e., a pan-genome, as a reference. Pan-genomic references can be assembled from larger populations or from a specific population under study. Moreover, exploiting pan-genomic references with current bioinformatics tools requires efficient compression and indexing methods. To leverage the accumulating genomic data, the power of distributed and parallel computing has to be harnessed for new genome analysis pipelines. We propose a scalable distributed pipeline, PanGenSpark, for compressing and indexing pan-genomes and assembling a reference genome from the pan-genomic index. We experimentally show the scalability of PanGenSpark with human pan-genomes in a distributed Spark cluster comprising 448 cores distributed across 26 computing nodes. Assembling a consensus genome of a pan-genome including 50 human individuals took 215 min, and with 500 human individuals 1468 min. The index of a 1.41 TB pan-genome was compressed to 164.5 GB in our experiments. Peer reviewed.

    Privacy as a service: protecting the individual in healthcare data processing

    Abstract: Health applications involve many data sources, individuals, and services that work against guarantees that an individual's personal data will not be used without consent. The proposed privacy-centered architecture integrates data security and semantic descriptions into a trust-query framework, enabling the provision of user consent as a service.