462,896 research outputs found

PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures.

Author: Aviran Sharon
Ledda Mirko
Publication venue: eScholarship, University of California
Publication date: 01/03/2018

Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.

eScholarship - University of California

Informed Consent to Address Trust, Control, and Privacy Concerns in User Profiling

Author: Geest Thea van der
Pieterson Willem
Vries Peter de
Publication date: 01/01/2005

More and more, services and products are being personalised or tailored, based on user-related data stored in so-called user profiles or user models. Although user profiling offers great benefits for both organisations and users, there are several psychological factors hindering the potential success of user profiling. The most important factors are trust, control and privacy concerns. This paper presents informed consent as a means to address the hurdles that trust, control, and privacy concerns pose to user profiling.

University of Twente Research Information

Towards information profiling: data lake content metadata management

Author: Abelló Gamazo Alberto
Al-serafi Ayman Mounir Mohamed
Calders Toon
Romero Moral Óscar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, commonly called Data Lakes (DL). These BD require new techniques of data integration and schema alignment in order to make the data usable by its consumers and to discover the relationships linking their content. This can be provided by metadata services which discover and describe their content. However, there is currently no systematic approach to this kind of metadata discovery and management. Thus, we propose a framework for the profiling of informational content stored in the DL, which we call information profiling. The profiles are stored as metadata to support data analysis. We formally define a metadata management process which identifies the key activities required to effectively handle this. We demonstrate the alternative techniques and performance of our process using a prototype implementation handling a real-life case-study from the OpenML DL, which showcases the value and feasibility of our approach.

UPCommons. Portal del coneixement obert de la UPC

Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review

Author: Akavia
Andrews
Baasiri
Chin
Dai
De Bie
Futreal
H.-U. Klein
Haverty
Hawkins
Hyman
Johnson
Kao
L. Lahti
M. Dugas
M. Schafer
McLendon
Menezes
Mullighan
Mullighan
Myllykangas
Olshen
Ortiz-Estevez
Phillips
Qin
S. Bicciato
Solvang
Soneson
Stranger
van Wieringen
van Wieringen
Publication venue: 'Oxford University Press (OUP)'
Publication date: 20/11/2011

A variety of genome-wide profiling techniques are available to probe complementary aspects of genome structure and function. Integrative analysis of heterogeneous data sources can reveal higher-level interactions that cannot be detected based on individual observations. A standard integration task in cancer studies is to identify altered genomic regions that induce changes in the expression of the associated genes based on joint analysis of genome-wide gene expression and copy number profiling measurements. In this review, we provide a comparison among various modeling procedures for integrating genome-wide profiling data of gene copy number and transcriptional alterations and highlight common approaches to genomic data integration. A transparent benchmarking procedure is introduced to quantitatively compare the cancer gene prioritization performance of the alternative methods. The benchmarking algorithms and data sets are available at http://intcomp.r-forge.r-project.org

arXiv.org e-Print Archive

Crossref

PubMed Central

Wageningen University & Research Publications

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Epitope profiling via mixture modeling of ranked data

Author: Benter
Busse
Caron
Critchlow
Critchlow
Croon
Croon
Dempster
Diaconis
Fligner
Fligner
Fraley
Gabrielli
Gormley
Gormley
Gormley
Gormley
Guiver
Hunter
Hunter
Lange
Lee
Luce
Marden
McLachlan
McLachlan
McNicholas
Murphy
Plackett
Schwarz
Vaida
Publication date: 01/01/2014

We propose the use of probability models for ranked data as a useful alternative to a quantitative data analysis to investigate the outcome of bioassay experiments, when the preliminary choice of an appropriate normalization method for the raw numerical responses is difficult or subject to criticism. We review standard distance-based and multistage ranking models and, in the latter context, we propose an original generalization of the Plackett-Luce model to account for the order of the ranking elicitation process. The usefulness of the novel model is illustrated with its maximum likelihood estimation for a real data set. Specifically, we address the heterogeneous nature of experimental units via model-based clustering and detail the necessary steps for a successful likelihood maximization through a hybrid version of the Expectation-Maximization algorithm. The performance of the mixture model using the new distribution as mixture components is compared with that of alternative mixture models for random rankings. A discussion on the interpretation of the identified clusters and a comparison with more standard quantitative approaches are finally provided.

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Coordinating views for data visualisation and algorithmic profiling

Author: Chalmers M.
Morrison A.
Ross G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

A number of researchers have designed visualisation systems that consist of multiple components, through which data and interaction commands flow. Such multistage (hybrid) models can be used to reduce algorithmic complexity, and to open up intermediate stages of algorithms for inspection and steering. In this paper, we present work on aiding the developer and the user of such algorithms through the application of interactive visualisation techniques. We present a set of tools designed to profile the performance of other visualisation components, and provide further functionality for the exploration of high-dimensional data sets. Case studies are provided, illustrating the application of the profiling modules to a number of data sets. Through this work we are exploring ways in which techniques traditionally used to prepare for visualisation runs, and to retrospectively analyse them, can find new uses within the context of a multi-component visualisation system.

CiteSeerX

Enlighten

Data base for the Colorado profiling network

Author: Merritt D. A.

The Colorado profiling system developed by the Wave Propagation Laboratory (WPL) includes five (soon to be six) Doppler radar wind Profilers; four operate at 49 MHz (6 m) and are located at Platteville, Fleming, Lay Creek, and Cahone, and one operates at 915 MHz (33 cm) and is located at Denver. The sixth radar, now under construction, will operate at 405 MHz (UHF) and will be located at Boulder. Microwave radiometers and surface meteorological stations are at some of the radar sites. The data base for the wind Profilers is discussed.

NASA Technical Reports Server