267,231 research outputs found

    Homo Datumicus: correcting the market for identity data

    Get PDF
    Effective digital identity systems offer great economic and civic potential. However, unlocking this potential requires dealing with social, behavioural, and structural challenges to efficient market formation. We propose that a marketplace for identity data can be formed more efficiently with an infrastructure that provides a more adequate representation of individuals online. This paper therefore introduces the ontological concept of Homo Datumicus: individuals as data subjects transformed by HAT Microservers, with the axiomatic computational capabilities to transact with their own data at scale. Adoption of this paradigm would lower the social risks of identity orientation, enable privacy-preserving transactions by default, and mitigate the risks of power imbalances in digital identity systems and markets.

    Modelling long term digital preservation costs: a scientific data case study

    Get PDF
    In recent years there has been increasing UK Government pressure on publicly funded researchers to plan the preservation and ensure the accessibility of their data for the long term. A critical challenge in implementing a digital preservation strategy is the estimation of such a programme’s cost. This paper presents a case study based on the cost estimation of preserving scientific data produced in the ISIS facility at the Science and Technology Facilities Council (STFC) Rutherford Appleton Laboratory, UK. The model for cost estimation for long-term digital preservation is presented along with an outline of the development and validation activities undertaken as part of this project. The framework and methodology from this research provide an insight into the task of costing long-term digital preservation processes, and can potentially be adapted to deliver benefits to other organisations.
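The kind of cost model the abstract describes can be sketched in a few lines: a one-off ingest cost plus recurring storage and staff costs, discounted to present value. This is a minimal illustration only; the activity names, rates, and discount rate below are hypothetical stand-ins, not the STFC/ISIS model from the paper.

```python
def preservation_cost(volume_tb, years, ingest_per_tb=500.0,
                      storage_per_tb_year=100.0, staff_per_year=20000.0,
                      discount_rate=0.035):
    """Net present cost of preserving `volume_tb` of data for `years` years.

    All rates are illustrative placeholders, not figures from the study.
    """
    total = volume_tb * ingest_per_tb            # one-off ingest in year 0
    for y in range(1, years + 1):
        annual = volume_tb * storage_per_tb_year + staff_per_year
        total += annual / (1 + discount_rate) ** y   # discount to present value
    return total

print(round(preservation_cost(10, 20), 2))
```

A real model of this kind also breaks costs down by preservation activity (format migration, integrity checking, metadata curation) rather than lumping them into a single staff line.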

    A framework for privacy preserving digital trace data collection through data donation

    Get PDF
    A potentially powerful method of social-scientific data collection and investigation has been created by an unexpected institution: the law. Article 15 of the EU’s 2018 General Data Protection Regulation (GDPR) mandates that individuals have electronic access to a copy of their personal data, and all major digital platforms now comply with this law by providing users with “data download packages” (DDPs). Through voluntary donation of DDPs, all data collected by public and private entities during the course of citizens’ digital life can be obtained and analyzed to answer social-scientific questions – with consent. Thus, consented DDPs open the way for vast new research opportunities. However, while this entirely new method of data collection will undoubtedly gain popularity in the coming years, it also comes with its own questions of representativeness and measurement quality, which are often evaluated systematically by means of an error framework. Therefore, in this paper we provide a blueprint for digital trace data collection using DDPs, and devise a “total error framework” for such projects. Our error framework for digital trace data collection through data donation is intended to facilitate high-quality social-scientific investigations using DDPs while critically reflecting on its unique methodological challenges and sources of error. In addition, we provide a quality control checklist to guide researchers in leveraging the vast opportunities afforded by this new mode of investigation.
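One privacy-preserving step such a data-donation pipeline typically includes is local filtering of the DDP before it leaves the participant's device: keep only consented fields and pseudonymize identifiers. The sketch below assumes a simple JSON-formatted DDP; the field names, the `prepare_donation` helper, and the salting scheme are hypothetical illustrations, not the paper's blueprint.

```python
import hashlib
import json

def prepare_donation(ddp_json, consented_fields, salt="study-salt"):
    """Keep only consented fields; replace identifiers with salted hashes."""
    records = json.loads(ddp_json)
    out = []
    for rec in records:
        kept = {k: v for k, v in rec.items() if k in consented_fields}
        if "user_id" in rec:  # pseudonymize rather than drop, so records still link
            digest = hashlib.sha256((salt + str(rec["user_id"])).encode()).hexdigest()
            kept["user_id"] = digest[:12]
        out.append(kept)
    return out

raw = json.dumps([{"user_id": 42, "timestamp": "2021-01-01",
                   "url": "example.com", "password": "x"}])
print(prepare_donation(raw, {"timestamp", "url"}))
```

Filtering before donation, rather than after collection, is what keeps the unconsented fields from ever reaching the researcher.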

    Synthetic Observational Health Data with GANs: from slow adoption to a boom in medical research and ultimately digital twins?

    Full text link
    After being collected for patient care, Observational Health Data (OHD) can further benefit patient well-being by sustaining the development of health informatics and medical research. Vast potential remains unexploited because of the fiercely private nature of patient-related data and the regulations that protect it. Generative Adversarial Networks (GANs) have recently emerged as a groundbreaking way to learn generative models that produce realistic synthetic data. They have revolutionized practices in multiple domains such as self-driving cars, fraud detection, digital twin simulations in industrial sectors, and medical imaging. The digital twin concept could readily apply to modelling and quantifying disease progression. In addition, GANs possess many capabilities relevant to common problems in healthcare: lack of data, class imbalance, rare diseases, and preserving privacy. Unlocking open access to privacy-preserving OHD could be transformative for scientific research. In the midst of COVID-19, the healthcare system is facing unprecedented challenges, many of which are data-related for the reasons stated above. Considering these facts, publications concerning GANs applied to OHD seemed to be severely lacking. To uncover the reasons for this slow adoption, we broadly reviewed the published literature on the subject. Our findings show that the properties of OHD were initially challenging for existing GAN algorithms (unlike medical imaging, for which state-of-the-art models were directly transferable) and that the evaluation of synthetic data lacked clear metrics. We find more publications on the subject than expected, starting slowly in 2017 and appearing at an increasing rate since then. The difficulties of OHD remain, and we discuss issues relating to evaluation, consistency, benchmarking, data modelling, and reproducibility.
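The adversarial objective at the heart of a GAN can be illustrated without any deep-learning machinery: a discriminator scores samples as real or fake, and the two losses pull in opposite directions. The toy below uses a logistic discriminator on 1-D data purely for illustration; real OHD work uses deep networks, and the distributions and parameters here are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w, b):
    """Logistic 'real vs fake' score for a 1-D toy discriminator."""
    return 1 / (1 + np.exp(-(w * x + b)))

def gan_losses(real, fake, w, b):
    d_real = discriminator(real, w, b)
    d_fake = discriminator(fake, w, b)
    # Discriminator wants d_real -> 1 and d_fake -> 0.
    d_loss = -np.mean(np.log(d_real)) - np.mean(np.log(1 - d_fake))
    # Non-saturating generator loss: generator wants d_fake -> 1.
    g_loss = -np.mean(np.log(d_fake))
    return d_loss, g_loss

real = rng.normal(2.0, 1.0, 1000)   # stand-in for a real-data feature
fake = rng.normal(0.0, 1.0, 1000)   # output of an untrained generator
d_loss, g_loss = gan_losses(real, fake, w=1.0, b=-1.0)
print(round(d_loss, 3), round(g_loss, 3))
```

Training alternates gradient steps on the two losses; the difficulty the review identifies is that OHD is mixed-type and sparse, which this continuous toy deliberately sidesteps.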

    Preserving the impossible: conservation of soft-sediment hominin footprint sites and strategies for three-dimensional digital data capture.

    Get PDF
    Human footprints provide some of the most publicly emotive and tangible evidence of our ancestors. To the scientific community they provide evidence of stature, presence, and behaviour, and in the case of early hominins, potential evidence with respect to the evolution of gait. While rare in the geological record, the number of footprint sites has increased in recent years, along with the analytical tools available for their study. Many of these sites are at risk from rapid erosion, including the Ileret footprints in northern Kenya, which are second in age only to those at Laetoli (Tanzania). Unlithified, soft-sediment footprint sites such as these pose a significant geoconservation challenge. In the first part of this paper, conservation and preservation options are explored, leading to the conclusion that to 'record and digitally rescue' is the only viable approach. Key to such strategies is the increasing availability of three-dimensional data capture, via either optical laser scanning or digital photogrammetry. Within the discipline there is a developing schism between those who favour one approach over the other, and geoconservationists and the scientific community require some form of objective appraisal of these alternatives. Consequently, in the second part of this paper we evaluate these alternative approaches and the role they can play in a 'record and digitally rescue' conservation strategy. Using modern footprint data, digital models created via optical laser scanning are compared to those generated by state-of-the-art photogrammetry. Both methods give comparable, although subtly different, results. These data are evaluated alongside a review of field deployment issues to provide guidance to the community on the factors that need to be considered in the digital conservation of human/hominin footprints.
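Comparing a laser-scanned and a photogrammetric model of the same footprint typically comes down to per-cell deviation between two aligned elevation grids. The sketch below uses tiny synthetic grids and a hypothetical `surface_deviation` helper; it is not the paper's comparison methodology, just the basic arithmetic such a comparison rests on.

```python
import numpy as np

def surface_deviation(dem_a, dem_b):
    """Mean and max absolute height difference between two aligned DEMs."""
    diff = np.abs(np.asarray(dem_a) - np.asarray(dem_b))
    return float(diff.mean()), float(diff.max())

# Toy 4x4 grids: a 5 mm deep "footprint" and a noisy second capture of it.
scan = np.zeros((4, 4))
scan[1:3, 1:3] = -5.0
photo = scan + np.random.default_rng(1).normal(0, 0.2, scan.shape)

mean_d, max_d = surface_deviation(scan, photo)
print(mean_d, max_d)
```

In practice the two models must first be co-registered (aligned and scaled), which is where most of the real disagreement between the techniques enters.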

    Data hiding in color images using perceptual models

    Get PDF
    One of the problems arising from the use of digital media is the ease of making identical copies of digital images or audio files, allowing manipulation and unauthorized use. Copyright is an effective tool for preserving the intellectual property of those documents, but authors and publishers need effective techniques to prevent copyright modification, given the straightforward access to multimedia applications and the wider use of digital publications on the web. These techniques are generally called watermarking and allow the introduction of side information (e.g. author identification, copyrights, dates). This work concentrates on the problem of embedding and optimum blind detection of data in color images through the use of spread spectrum techniques, both in space (Direct Sequence Spread Spectrum, DSSS) and in frequency (Frequency Hopping). It is applied to RGB and opponent color component representations. Perceptual information is considered in both color systems. Some tests are performed in order to ensure imperceptibility and to assess the detection quality of the optimum color detectors.
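The core DSSS idea is simple enough to sketch on a single channel: spread one bit over the whole image with a keyed pseudo-noise sequence, then detect blindly by correlating against the same sequence. This is a bare-bones illustration only; the paper's method additionally uses perceptual color models to shape the watermark strength, which is not modelled here, and the embedding strength `alpha` below is an arbitrary choice.

```python
import numpy as np

def pn_sequence(key, shape):
    """Keyed +/-1 pseudo-noise spreading sequence."""
    return np.random.default_rng(key).choice([-1.0, 1.0], size=shape)

def embed(channel, bit, key, alpha=10.0):
    """Blind DSSS embedding: add +/- alpha * PN depending on the bit."""
    pn = pn_sequence(key, channel.shape)
    return channel + (alpha if bit else -alpha) * pn

def detect(channel, key):
    """Blind correlation detector: sign of correlation with the PN sequence."""
    pn = pn_sequence(key, channel.shape)
    return float(np.mean(channel * pn)) > 0

img = np.random.default_rng(7).uniform(0, 255, (64, 64))
marked = embed(img, bit=1, key=42)
print(detect(marked, key=42))
```

A real embedder would also clip and quantize the marked channel back to valid pixel values; `alpha` trades imperceptibility against detection robustness, which is exactly where the perceptual models come in.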

    Migration on request, a practical technique for preservation

    Get PDF
    Maintaining a digital object in a usable state over time is a crucial aspect of digital preservation. Existing preservation methods have many drawbacks. This paper describes advanced data migration techniques which can be used to support preservation more accurately and cost-effectively. To ensure that preserved works can be rendered on current computer systems over time, “traditional migration” has been used to convert data into current formats; as each new format becomes obsolete, another conversion is performed, and so on. Traditional migration has an inherent problem: errors introduced during one transformation propagate through all future transformations. CAMiLEON’s software longevity principles can be applied to a migration strategy, offering improvements over traditional migration. This new approach is named “Migration on Request.” Migration on Request shifts the burden of preservation onto a single tool, which is maintained over time. Always returning to the original format enables potential errors to be significantly reduced.
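The contrast with traditional migration can be sketched in a few lines: the archive never rewrites the stored bytes, and a single maintained converter always runs from the original format to whatever format is current at access time. The format names and converter here are hypothetical stand-ins, not the CAMiLEON tooling.

```python
# One maintained tool: converts the original format directly to the current one.
CONVERTERS = {
    ("v1", "v3"): lambda data: data.upper(),  # placeholder transformation
}

class Archive:
    def __init__(self):
        self.store = {}  # object id -> (original format, original bytes)

    def ingest(self, oid, fmt, data):
        self.store[oid] = (fmt, data)  # the original is never rewritten

    def render(self, oid, current_fmt="v3"):
        fmt, data = self.store[oid]
        if fmt == current_fmt:
            return data
        # Always migrate from the original, so conversion errors never compound.
        return CONVERTERS[(fmt, current_fmt)](data)

a = Archive()
a.ingest("doc1", "v1", "hello")
print(a.render("doc1"))  # converted on request; the stored copy stays "v1"
```

Under traditional migration the stored copy itself would be rewritten at each format change, so an error in any one conversion is inherited by every later one; here only the single converter needs maintaining.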