462,896 research outputs found

    Informed Consent to Address Trust, Control, and Privacy Concerns in User Profiling

    More and more, services and products are being personalised or tailored, based on user-related data stored in so-called user profiles or user models. Although user profiling offers great benefits for both organisations and users, there are several psychological factors hindering the potential success of user profiling. The most important factors are trust, control and privacy concerns. This paper presents informed consent as a means to address the hurdles that trust, control, and privacy concerns pose to user profiling.

    Towards information profiling: data lake content metadata management

    There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, commonly called Data Lakes (DL). These BD require new techniques of data integration and schema alignment in order to make the data usable by its consumers and to discover the relationships linking their content. This can be provided by metadata services which discover and describe their content. However, there is currently no systematic approach to this kind of metadata discovery and management. Thus, we propose a framework for the profiling of informational content stored in the DL, which we call information profiling. The profiles are stored as metadata to support data analysis. We formally define a metadata management process which identifies the key activities required to handle this effectively. We demonstrate the alternative techniques and performance of our process using a prototype implementation handling a real-life case study from the OpenML DL, which showcases the value and feasibility of our approach. Peer reviewed. Postprint (author's final draft).

    Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review

    A variety of genome-wide profiling techniques are available to probe complementary aspects of genome structure and function. Integrative analysis of heterogeneous data sources can reveal higher-level interactions that cannot be detected based on individual observations. A standard integration task in cancer studies is to identify altered genomic regions that induce changes in the expression of the associated genes based on joint analysis of genome-wide gene expression and copy number profiling measurements. In this review, we provide a comparison among various modeling procedures for integrating genome-wide profiling data of gene copy number and transcriptional alterations and highlight common approaches to genomic data integration. A transparent benchmarking procedure is introduced to quantitatively compare the cancer gene prioritization performance of the alternative methods. The benchmarking algorithms and data sets are available at http://intcomp.r-forge.r-project.org. Comment: PDF file including supplementary material. 9 pages. Preprint.

    Epitope profiling via mixture modeling of ranked data

    We propose the use of probability models for ranked data as a useful alternative to a quantitative data analysis to investigate the outcome of bioassay experiments, when the preliminary choice of an appropriate normalization method for the raw numerical responses is difficult or subject to criticism. We review standard distance-based and multistage ranking models and, in this last context, we propose an original generalization of the Plackett-Luce model to account for the order of the ranking elicitation process. The usefulness of the novel model is illustrated with its maximum likelihood estimation for a real data set. Specifically, we address the heterogeneous nature of experimental units via model-based clustering and detail the necessary steps for a successful likelihood maximization through a hybrid version of the Expectation-Maximization algorithm. The performance of the mixture model using the new distribution as mixture components is compared with that of alternative mixture models for random rankings. A discussion on the interpretation of the identified clusters and a comparison with more standard quantitative approaches are finally provided. Comment: (revised to properly include references)

    Coordinating views for data visualisation and algorithmic profiling

    A number of researchers have designed visualisation systems that consist of multiple components, through which data and interaction commands flow. Such multistage (hybrid) models can be used to reduce algorithmic complexity, and to open up intermediate stages of algorithms for inspection and steering. In this paper, we present work on aiding the developer and the user of such algorithms through the application of interactive visualisation techniques. We present a set of tools designed to profile the performance of other visualisation components, and provide further functionality for the exploration of high-dimensional data sets. Case studies are provided, illustrating the application of the profiling modules to a number of data sets. Through this work we are exploring ways in which techniques traditionally used to prepare for visualisation runs, and to retrospectively analyse them, can find new uses within the context of a multi-component visualisation system.

    Data base for the Colorado profiling network

    The Colorado profiling system developed by the Wave Propagation Laboratory (WPL) includes five (soon to be six) Doppler radar wind Profilers: four operate at 49 MHz (6 m) and are located at Platteville, Fleming, Lay Creek, and Cahone, and one operates at 915 MHz (33 cm) and is located at Denver. The sixth radar, now under construction, will operate at 405 MHz (UHF) and will be located at Boulder. Microwave radiometers and surface meteorological stations are at some of the radar sites. The data base for the wind Profilers is discussed.