1,204,798 research outputs found

    On-Demand Big Data Integration: A Hybrid ETL Approach for Reproducible Scientific Research

    Scientific research requires access to, analysis of, and sharing of data that is distributed across various heterogeneous data sources at the scale of the Internet. An eager ETL process constructs an integrated data repository as its first step, integrating and loading data in its entirety from the data sources. Bootstrapping this process is inefficient for scientific research that requires access to data from very large and typically numerous distributed data sources. A lazy ETL process loads only the metadata, but still eagerly. Lazy ETL bootstraps faster. However, queries on the integrated data repository of eager ETL run faster, because the entire data set is available beforehand. In this paper, we propose a novel ETL approach for scientific data integration, as a hybrid of the eager and lazy ETL approaches, applied to both data and metadata. This way, hybrid ETL supports incremental integration and loading of metadata and data from the data sources. We incorporate a human-in-the-loop approach to enhance the hybrid ETL, with selective data integration driven by user queries and sharing of integrated data between users. We implement our hybrid ETL approach in a prototype platform, Obidos, and evaluate it in the context of data sharing for medical research. Obidos outperforms both the eager ETL and lazy ETL approaches for scientific research data integration and sharing, through its selective loading of data and metadata, while storing the integrated data in a scalable integrated data repository. Comment: Pre-print submitted to the DMAH Special Issue of the Springer DAPD Journal
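The contrast between eager, lazy, and query-driven loading can be sketched in a few lines. This is a toy illustration only, not the Obidos implementation: the class `HybridLoader`, its fields, and the in-memory "sources" are all hypothetical names chosen for the example. It assumes each source can be fetched by calling a function, and it loads a source's metadata and data only when a query first touches that source, then keeps the result in a repository shared by later queries.

```python
from typing import Callable, Dict, List


class HybridLoader:
    """Toy query-driven loader (hypothetical; not the Obidos design)."""

    def __init__(self, sources: Dict[str, Callable[[], List[dict]]]):
        # source name -> function that fetches all records of that source
        self.sources = sources
        # metadata is filled in per source, on first use (not at startup)
        self.metadata: Dict[str, dict] = {}
        # integrated data, shared by every user of this loader
        self.repository: Dict[str, List[dict]] = {}
        # number of times a remote source was actually fetched
        self.fetch_count = 0

    def query(self, source: str, predicate: Callable[[dict], bool]) -> List[dict]:
        # Load the source only if no earlier query has integrated it.
        if source not in self.repository:
            records = self.sources[source]()
            self.fetch_count += 1
            self.metadata[source] = {
                "fields": sorted({key for rec in records for key in rec}),
                "rows": len(records),
            }
            self.repository[source] = records
        return [rec for rec in self.repository[source] if predicate(rec)]
```

An eager process would instead fetch every entry of `sources` in the constructor, and a lazy process would fill `metadata` for every source up front while deferring the data; here both are deferred until a query asks for them.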

    Building and Analyzing a Corpus of Contextualized Traces Collected during a Technology Enhanced Teaching Module

    International audience. Sharing and analyzing data collected within Technology Enhanced Learning environments is an interesting issue for researchers to validate their models and systems. In this paper we present a corpus we built and analyzed in order to validate our proposed "Proxy approach" as an approach for sharing and analyzing learning data corpora

    Sharing of Unlicensed Spectrum by Strategic Operators

    Facing the challenge of meeting ever-increasing demand for wireless data, the industry is striving to exploit large swaths of spectrum which anyone can use for free without having to obtain a license. Major standards bodies are currently considering a proposal to retool and deploy Long Term Evolution (LTE) technologies in unlicensed bands below 6 GHz. This paper studies the fundamental questions of whether and how the unlicensed spectrum can be shared by intrinsically strategic operators without suffering from the tragedy of the commons. A class of general utility functions is considered. The spectrum sharing problem is formulated as a repeated game over a sequence of time slots. It is first shown that a simple static sharing scheme allows a given set of operators to reach a subgame perfect Nash equilibrium for mutually beneficial sharing. The question of how many operators will choose to enter the market is also addressed by studying an entry game. A sharing scheme which allows dynamic spectrum borrowing and lending between operators is then proposed to address time-varying traffic and proved to achieve perfect Bayesian equilibrium. Numerical results show that the proposed dynamic sharing scheme outperforms static sharing, which in turn achieves much higher revenue than uncoordinated full-spectrum sharing. Implications of the results to the standardization and deployment of LTE in unlicensed bands (LTE-U) are also discussed. Comment: To appear in the IEEE Journal on Selected Areas in Communications, Special Issue on Game Theory for Network

    How to Share Data Online (fast) – A Taxonomy of Data Sharing Business Models

    Data is an integral part of almost every business. Sharing data enables new opportunities to generate value or enrich the existing data repository, opening up new potentials for optimization and business models. However, these opportunities are still untapped, as sharing data comes with many challenges. First and foremost, aspects such as trust in partners, transparency, and the desire for security are issues that need to be addressed. Only then can data sharing be used efficiently in business models. The paper addresses this issue and generates guidance for the data-sharing business model (DSBM) design in the form of a taxonomy. The taxonomy is built on the empirical analysis of 80 DSBMs. With this, the primary contributions are structuring the field of an emerging phenomenon and outlining design options for these types of business models

    Conditionals in Homomorphic Encryption and Machine Learning Applications

    Homomorphic encryption aims at allowing computations on encrypted data without decryption other than that of the final result. This could provide an elegant solution to the issue of privacy preservation in data-based applications, such as those using machine learning, but several open issues hamper this plan. In this work we assess the possibility for homomorphic encryption to fully implement its program without relying on other techniques, such as secure multiparty computation (SMPC), which may be impossible in many use cases (for instance due to the high level of communication required). We proceed in two steps: i) on the basis of the structured program theorem (Böhm-Jacopini theorem) we identify the relevant minimal set of operations homomorphic encryption must be able to perform to implement any algorithm; and ii) we analyse the possibility of solving, and propose an implementation for, the most fundamentally relevant issue as it emerges from our analysis, that is, the implementation of conditionals (requiring comparison and selection/jump operations). We show how this issue clashes with the fundamental requirements of homomorphic encryption and could represent a drawback for its use as a complete solution for privacy preservation in data-based applications, in particular machine learning ones. Our approach for comparisons is novel and entirely embedded in homomorphic encryption, while previous studies relied on other techniques, such as SMPC, demanding a high level of communication among parties and decryption of intermediate results by data owners. Our protocol is also provably safe (sharing the same safety as the homomorphic encryption schemes), unlike other techniques such as Order-Preserving/Revealing Encryption (OPE/ORE). Comment: 14 pages, 1 figure, corrected typos, added introductory pedagogical section on polynomial approximation
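Why conditionals are awkward under homomorphic encryption can be illustrated on plain numbers. The sketch below is not the protocol of the paper above; it is a generic illustration that uses only additions and multiplications, the operations a homomorphic scheme typically supports. The function names (`approx_sign`, `compare`, `select`, `oblivious_max`) are made up for this example, and it assumes inputs in [0, 1] that are not too close to each other. The comparison is approximated by iterating the polynomial (3x - x^3)/2, which pushes any nonzero x in [-1, 1] toward -1 or +1, and the branch is replaced by an arithmetic blend.

```python
def approx_sign(x: float, iterations: int = 12) -> float:
    """Approximate sign(x) for x in [-1, 1] with polynomial steps only."""
    for _ in range(iterations):
        # f(x) = (3x - x^3) / 2 has attracting fixed points at -1 and +1
        x = 1.5 * x - 0.5 * x * x * x
    return x


def compare(a: float, b: float) -> float:
    """Return roughly 1.0 if a > b and roughly 0.0 if a < b (a, b in [0, 1])."""
    return (approx_sign(a - b) + 1.0) / 2.0


def select(c: float, x: float, y: float) -> float:
    """Branch-free 'x if c else y' for a selector c close to 0 or 1."""
    return c * x + (1.0 - c) * y


def oblivious_max(a: float, b: float) -> float:
    """max(a, b) without any data-dependent branch."""
    return select(compare(a, b), a, b)
```

Both candidate results are always computed and then blended, so the cost does not depend on which branch is "taken". The closer the two inputs are, the more iterations the sign approximation needs, which hints at the depth cost such comparisons carry in an encrypted setting.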

    A Decentralized Personal Data Store based on Ethereum: Towards GDPR Compliance

    Sharing personal data with service providers is a fundamental resource for the times we live in. But data sharing raises unavoidable issues: improper data treatment, users' lack of awareness of whom they are sharing with, and wrong or excessive data sharing by end users who do not realize they are exposing personal information. The problem becomes even more complicated if we consider the devices around us: how to share devices we own, so that we can receive pervasive services based on our contexts and device functionalities. The European Authority has provided the General Data Protection Regulation (GDPR) in order to implement protection of sensitive data in each EU member state, through certification mechanisms (according to Art. 42 GDPR). The certification assures compliance with the regulation, which represents a mandatory requirement for any service that may come in contact with sensitive data. Still, the certification is an open process and not constrained by strict rules. In this paper we describe our decentralized approach to sharing personal data in the era of smart devices, which are themselves considered sensitive data. Having in mind the centrality of users in the ownership of the data, we have proposed a decentralized Personal Data Store prototype, which stands as a unique data sharing endpoint for third-party services. Even if blockchain technologies may seem fit to solve the issue of data protection, because of the absence of a central authority, they lead to additional concerns, especially when relating such technologies to the specifications described in the regulation. The current work contributes to the advancement of personal data sharing management systems in a distributed environment by presenting a real prototype and an architectural blueprint, which advances the state of the art in order to meet the GDPR regulation. Addressing the issues that arise, from a technological perspective, stands as an important challenge in order to empower end users to truly own their personal data

    Is a Semantic Web Agent a Knowledge-Savvy Agent?

    The issue of knowledge sharing has permeated the field of distributed AI and in particular, its successor, multiagent systems. Through the years, many research and engineering efforts have tackled the problem of encoding and sharing knowledge without the need for a single, centralized knowledge base. However, the emergence of modern computing paradigms such as distributed, open systems has highlighted the importance of sharing distributed and heterogeneous knowledge at a larger scale, possibly at the scale of the Internet. The very characteristics that define the Semantic Web (that is, dynamic, distributed, incomplete, and uncertain knowledge) suggest the need for autonomy in distributed software systems. Semantic Web research promises more than mere management of ontologies and data through the definition of machine-understandable languages. The openness and decentralization introduced by multiagent systems and service-oriented architectures give rise to new knowledge management models, for which we can’t make a priori assumptions about the type of interaction an agent or a service may be engaged in, and likewise about the message protocols and vocabulary used. We therefore discuss the problem of knowledge management for open multi-agent systems, and highlight a number of challenges relating to the exchange and evolution of knowledge in open environments, which are pertinent to both the Semantic Web and Multi Agent System communities

    Data Science in Healthcare

    Data science is an interdisciplinary field that applies numerous techniques, such as machine learning, neural networks, and deep learning, to create value based on extracting knowledge and insights from available data. Advances in data science have a significant impact on healthcare. While advances in the sharing of medical information result in better and earlier diagnoses as well as more patient-tailored treatments, information management is also affected by trends such as increased patient centricity (with shared decision making), self-care (e.g., using wearables), and integrated care delivery. The delivery of health services is being revolutionized through the sharing and integration of health data across organizational boundaries. Via data science, researchers can deliver new approaches to merge, analyze, and process complex data and gain more actionable insights, understanding, and knowledge at the individual and population levels. This Special Issue focuses on how data science is used in healthcare (e.g., through predictive modeling) and on related topics, such as data sharing and data management

    HOW TO INCREASE EMPLOYEE’S DISCIPLINARY IN FACULTY MEDICINE OF DIPONEGORO UNIVERSITY

    Human capital plays an important role in an organization. It is the heart of the organization's strategy, and many factors are embedded in it. Public Service employees, rather than contract employees, constitute the government's human capital. The quality of Public Service employees has recently become a major issue. It is widely known that Public Service employees lack discipline, and the most discussed issue in Public Service employee discipline is absenteeism. This study presents factors that influence the discipline of Public Service employees in the Faculty of Medicine of Diponegoro University. The aim of this research is to support the decision-making process for improving employee discipline in the Faculty of Medicine of Diponegoro University; to that end, the study proposes several models analyzed with structural equation modeling (SEM). The study population is the administration staff of the Faculty of Medicine of Diponegoro University, both Public Service employees and contract employees. The respondents are 120 employees who were given questionnaires related to the study. The results of the data analysis show that human capital is influenced by knowledge sharing, empowerment, and workplace environment, while human capital in turn influences employee discipline positively

    Verticalization of data sharing and the difficult path to eunnovation

    Data sharing has been offered as a useful tool to open up impregnable markets to competition. EU law has a rich tradition in enabling business-to-business data sharing in a sector-specific (or vertical) fashion, which has formed the basis of the quest for an internal market where data flows freely. Two recent legislative instruments, the Digital Markets Act and the Data Act, contain industry- and actor-specific data sharing provisions. By unleashing troves of data hoarded by large incumbents, the Acts aspire to empower small and medium-sized enterprises, unlocking organic innovation. Notwithstanding the normative desirability of such a goal, it is unclear whether verticalized rules on data sharing can foster innovation by entrants and smaller undertakings. This Article legally and economically appraises the Acts to shed light on this issue. Read together, the data sharing provisions under the Digital Markets Act and the Data Act pursue the common aim of spurring disruptive (market creating) and complementary innovation. However, the Acts suffer from legal uncertainty and are liable to produce unintended economic consequences, such as diminishing the ability of complementors to satisfy consumers whilst simultaneously strengthening incumbent platform operators. The conclusions cast doubt on whether the vertical data sharing rules of the Acts can achieve their intended objectives, that is, ensuring the contestability of digital markets by promoting organic innovation by smaller scale firms