
12,194 research outputs found

Non-linear mapping for exploratory data analysis in functional genomics

Author: Azuaje Francisco
Chesneau Alban
Wang Haiying
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Several supervised and unsupervised learning tools are available to classify functional genomics data. However, relatively little attention has been given to exploratory, visualisation-driven approaches. Such approaches should satisfy the following requirements: support for intuitive cluster visualisation, user-friendly and robust application, computational efficiency and generation of biologically meaningful outcomes. This research assesses a relaxation method for non-linear mapping that addresses these concerns. Its applications to gene expression and protein-protein interaction data analyses are investigated. RESULTS: Publicly available expression data originating from leukaemia, round blue-cell tumours and Parkinson disease studies were analysed. The method distinguished relevant clusters and critical analysis areas. The system does not require assumptions about the inherent class structure of the data, its mapping process is controlled by only one parameter and the resulting transformations offer intuitive, meaningful visual displays. Comparisons with traditional mapping models are presented. To promote potential alternative applications of the methodology, an example of exploratory data analysis of interactome networks is illustrated. Data from the C. elegans interactome were analysed. Results suggest that this method might represent an effective solution for detecting key network hubs and for clustering biologically meaningful groups of proteins. CONCLUSION: A relaxation method for non-linear mapping provided the basis for visualisation-driven analyses using different types of data. This study indicates that such a system may represent a user-friendly and robust approach to exploratory data analysis. It may allow users to gain better insights into the underlying data structure, detect potential outliers and assess assumptions about the cluster composition of the data.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

VizRank: Data Visualization Guided by Machine Learning

Author: Bratko Ivan
Leban Gregor
Vidmar Gaj
Zupan Blaz
Publication date: 01/01/2006

Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections showing different attribute subsets that must be evaluated by the data analyst. In this paper, we introduce a method called VizRank, which is applied to classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
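The ranking idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a "projection" is a plain scatterplot of two attributes, that accuracy is estimated by leave-one-out evaluation, and that k = 3. The function names `knn_loo_accuracy` and `rank_projections` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN classifier on 2-D points (assumed scoring scheme)."""
    correct = 0
    for i, p in enumerate(points):
        # Distances from point i to every other projected point, nearest first.
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        )[:k]
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in neighbours)
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(points)


def rank_projections(rows, labels, k=3):
    """Score every attribute pair as a scatterplot projection; best score first."""
    n_attrs = len(rows[0])
    scored = []
    for a, b in combinations(range(n_attrs), 2):
        points = [(row[a], row[b]) for row in rows]
        scored.append((knn_loo_accuracy(points, labels, k), (a, b)))
    return sorted(scored, reverse=True)


# Toy data (hypothetical): attributes 0 and 1 separate the classes, attribute 2 does not.
rows = [
    (0.0, 0.1, 5.0),
    (0.1, 0.0, 1.0),
    (0.2, 0.2, 3.0),
    (1.0, 1.1, 5.1),
    (1.1, 1.0, 1.1),
    (1.2, 1.2, 3.1),
]
labels = ["a", "a", "a", "b", "b", "b"]

ranking = rank_projections(rows, labels, k=3)
for score, (a, b) in ranking:
    print(f"attributes ({a}, {b}): accuracy {score:.2f}")
```

On this toy data the pair (0, 1) ranks first, because it is the only projection in which each point's nearest neighbours share its class. The abstract states only that predictive accuracy of a k-nearest neighbor classifier is used, so the evaluation protocol and the choice of k here are placeholders.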

ePrints.FRI

Association of the IL-10 gene family locus on chromosome 1 with juvenile idiopathic arthritis (JIA)

Author: Bryant A
BSPAR study group
Childhood Arthritis Prospective Study (CAPS)
Childhood Arthritis Response to Medication Study (CHARMS)
Forabosco P
Hamaoui R
Hinks A
Lewis CM
Omoyinmi E
Thomson W
Ursu S
Wedderburn LR
Woo P
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

The cytokine IL-10 and its family members have been implicated in autoimmune diseases, and we have previously reported that genetic variants in IL-10 were associated with a rare group of diseases called juvenile idiopathic arthritis (JIA). The aim of this study was to fine map genetic variants within the IL-10 cytokine family cluster on chromosome 1 using a linkage disequilibrium (LD)-tagging single nucleotide polymorphism (tSNP) approach with imputation and conditional analysis to test for disease associations.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Enlighten

The University of Manchester - Institutional Repository

King's Research Portal

FigShare

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Inferring gene regulatory networks using ensembles of feature selection techniques

Author: Demeester Piet
Dhaene Tom
Geurts Pierre
Huynh-Thu Vân Anh
Ruyssinck Joeri
Saeys Yvan
Publication date: 01/01/2012

Ghent University Academic Bibliography

Of mice and men: Sparse statistical modeling in cardiovascular genomics

Author: Goldschmidt-Clermont Pascal J.
Seo David M.
West Mike
Publication venue: Institute of Mathematical Statistics
Publication date: 01/06/2007

In high-throughput genomics, large-scale designed experiments are becoming common, and analysis approaches based on highly multivariate regression and ANOVA concepts are key tools. Shrinkage models of one form or another can provide comprehensive approaches to the problems of simultaneous inference that involve implicit multiple comparisons over the many parameters representing effects of design factors and covariates. We use such approaches here in a study of cardiovascular genomics. The primary experimental context concerns a carefully designed, and rich, gene expression study focused on gene-environment interactions, with the goals of identifying genes implicated in connection with disease states and known risk factors, and of generating expression signatures as proxies for such risk factors. A coupled exploratory analysis investigates cross-species extrapolation of gene expression signatures--how these mouse-model signatures translate to humans. The latter involves exploration of sparse latent factor analysis of human observational data and of how it relates to projected risk signatures derived in the animal models. The study also highlights a range of applied statistical and genomic data analysis issues, including model specification, computational questions and model-based correction of experimental artifacts in DNA microarray data. Comment: Published at http://dx.doi.org/10.1214/07-AOAS110 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

University of Miami: Scholarship Miami


International meta-analysis of PTSD genome-wide association studies identifies sex- and ancestry-specific genetic risk loci.

Author: Aiello Allison E
Almli Lynn M
Amstadter Ananda B
Andersen Søren B
Andreassen Ole A
Arbisi Paul A
Ashley-Koch Allison E
Atkinson Elizabeth G
Austin S Bryn
Avdibegovic Esmina
Babić Dragan
Baker Dewleen G
Beckham Jean C
Bierut Laura J
Bisson Jonathan I
Boks Marco P
Bolger Elizabeth A
Bradley Bekh
Brashear Megan
Breen Gerome
Bryant Richard A
Bustamante Angela C
Bybjerg-Grauholm Jonas
Bækvad-Hansen Marie
Børglum Anders D
Calabrese Joseph R
Caldas-de-Almeida José M
Chen Chia-Yen
Choi Karmel W
Coleman Jonathan RI
Dale Anders M
Dalvie Shareefa
Daly Mark J
Daskalakis Nikolaos P
Deckert Jürgen
Delahanty Douglas L
Dennis Michelle F
Disner Seth G
Domschke Katharina
Duncan Laramie E
Dzubur-Kulenovic Alma
Erbes Christopher R
Evans Alexandra
Farrer Lindsay A
Feeny Norah C
Flory Janine D
Forbes David
Franz Carol E
Galea Sandro
Garrett Melanie E
Gelaye Bizu
Gelernter Joel
Geuze Elbert
Gillespie Charles
Gordon Scott D
Guffanti Guia
Hammamieh Rasha
Harnal Supriya
Hauser Michael A
Heath Andrew C
Hemmings Sian MJ
Hougaard David Michael
Jakovljevic Miro
Jett Marti
Johnson Eric Otto
Jones Ian
Jovanovic Tanja
Junglen Angela G
Karstoft Karen-Inge
Kaufman Milissa L
Kessler Ronald C
Khan Alaptagin
Kimbrel Nathan A
King Anthony P
Klengel Torsten
Koen Nastassja
Kranzler Henry R
Kremen William S
Lawford Bruce R
Lebois Lauren AM
Levey Daniel F
Lewis Catrin E
Linnstaedt Sarah D
Logue Mark W
Lori Adriana
Lugonja Bozo
Luykx Jurjen J
Lyons Michael J
Maihofer Adam X
Maples-Keller Jessica
Marmar Charles
Martin Alicia R
Nievergelt Caroline M
Polimanti Renato
Provost Allison C
Qin Xue-Jun
Ratanatharathorn Andrew
Stein Murray B
Torres Katy
Uka Aferdita Goci
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

The risk of posttraumatic stress disorder (PTSD) following trauma is heritable, but robust common variants have yet to be identified. In a multi-ethnic cohort including over 30,000 PTSD cases and 170,000 controls we conduct a genome-wide association study of PTSD. We demonstrate SNP-based heritability estimates of 5-20%, varying by sex. Three genome-wide significant loci are identified, 2 in European and 1 in African-ancestry analyses. Analyses stratified by sex implicate 3 additional loci in men. Along with other novel genes and non-coding RNAs, a Parkinson's disease gene involved in dopamine regulation, PARK2, is associated with PTSD. Finally, we demonstrate that polygenic risk for PTSD is significantly predictive of re-experiencing symptoms in the Million Veteran Program dataset, although specific loci did not replicate. These results demonstrate the role of genetic variation in the biology of risk for PTSD and highlight the necessity of conducting sex-stratified analyses and expanding GWAS beyond European ancestry populations.

eScholarship - University of California

Wolf outside, dog inside? The genomic make-up of the Czechoslovakian Wolfdog

Author: Bolfíková Barbora Černá
Camatta Alessio
Caniglia Romolo
Carnier Paolo
Dykyy Ihor
Fabbri Elena
Galaverni Marco
Hulva Pavel
Jindřichová Milena
Randi Ettore
Stronen Astrid Vik
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Background: Genomic methods can provide extraordinary tools to explore the genetic background of wild species and domestic breeds, optimize breeding practices, monitor and limit the spread of recessive diseases, and discourage illegal crossings. In this study we analysed a panel of 170k single nucleotide polymorphisms with a combination of multivariate, Bayesian and outlier gene approaches to examine the genome-wide diversity and inbreeding levels in a recent wolf x dog cross-breed, the Czechoslovakian Wolfdog, which is becoming increasingly popular across Europe.
Results: Pairwise FST values, multivariate and assignment procedures indicated that the Czechoslovakian Wolfdog was significantly differentiated from all the other analysed breeds and also well distinguished from both parental populations (Carpathian wolves and German Shepherds). Consistent with the low number of founders involved in the breed selection, the individual inbreeding levels calculated from homozygosity regions were relatively high and comparable with those derived from the pedigree data. In contrast, the coefficient of relatedness between individuals estimated from the pedigrees often underestimated the identity-by-descent scores determined using genetic profiles. The timing of the admixture and the effective population size trends estimated from the LD patterns reflected the documented history of the breed. Ancestry reconstruction methods identified more than 300 genes with an excess of wolf ancestry compared to random expectations, mainly related to key morphological features, and more than 2000 genes with an excess of dog ancestry, playing important roles in lipid metabolism, in the regulation of circadian rhythms, in learning and memory processes, and in sociability, such as the COMT gene, which has been described as a candidate gene for the latter trait in dogs.
Conclusions: In this study we successfully applied genome-wide procedures to reconstruct the history of the Czechoslovakian Wolfdog, assess individual wolf ancestry proportions and, thanks to the availability of a well-annotated reference genome, identify possible candidate genes for wolf-like and dog-like phenotypic traits typical of this breed, including commonly inherited disorders. Moreover, through the identification of ancestry-informative markers, these genomic approaches could provide tools for forensic applications to unmask illegal crossings with wolves and uncontrolled trades of recent and undeclared wolfdog hybrids.

Directory of Open Access Journals

VBN

Archivio istituzionale della ricerca - Università di Padova