Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data

Author: A Beyer
AL Barabási
BH Good
BW Kernighan
CO Daub
D Duewer
D Marbach
DFT Veiga
E Bonnet
E Ravasz
E Segal
EH Davidson
F Luo
G Balázsi
G Getz
G Palla
H Zare
HW Ma
J Chen
J Duch
J Hubble
J Lemke
J Reichardt
JJ Faith
JN Weinstein
K Baggerly
Kevin E. Bassler
KY Yeung
M Blatt
M Riley
MB Eisen
MEJ Newman
MF Traxler
MM Barker
N Friedman
O Alter
PD Karp
Q Lu
R Guimerà
RA Irizarry
S Fortunato
S Gama-Castro
S Raychaudhuri
S Tavazoie
Santiago Treviño
Satoru Miyano
SB Seidman
SP Borgatii
TF Cooper
Tim F. Cooper
TS Gardner
U Brandes
UN Raghavan
X Wen
Y Benjamini
Y Sun
Yudong Sun
Z Shi
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/01/2012

Determining the functional structure of biological networks is a central goal of systems biology. One approach is to analyze gene expression data to infer a network of gene interactions on the basis of their correlated responses to environmental and genetic perturbations. The inferred network can then be analyzed to identify functional communities. However, commonly used algorithms can yield unreliable results due to experimental noise, algorithmic stochasticity, and the influence of arbitrarily chosen parameter values. Furthermore, the results obtained typically provide only a simplistic view of the network partitioned into disjoint communities and provide no information about the relationships between communities. Here, we present methods to robustly detect coregulated and functionally enriched gene communities and demonstrate their application and validity for Escherichia coli gene expression data. Applying a recently developed community detection algorithm to the network of interactions identified with the context likelihood of relatedness (CLR) method, we show that a hierarchy of network communities can be identified. These communities significantly enrich for gene ontology (GO) terms, consistent with them representing biologically meaningful groups. Further, analysis of the most significantly enriched communities identified several candidate new regulatory interactions. The robustness of our methods is demonstrated by showing that a core set of functional communities is reliably found when artificial noise, modeling experimental noise, is added to the data. We find that noise mainly acts conservatively, increasing the relatedness required for a network link to be reliably assigned and decreasing the size of the core communities, rather than causing association of genes into new communities.
Comment: Due to appear in PLoS Computational Biology. Supplementary Figure S1 was not uploaded but is available by contacting the author. 27 pages, 5 figures, 15 supplementary files
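
The abstract above builds its gene network with the context likelihood of relatedness (CLR) method. As a rough illustration of the idea only, the sketch below scores each gene pair by how far its mutual information stands out from the background of both genes. It assumes a precomputed symmetric mutual-information matrix, and the function name `clr_scores` is hypothetical; it is a simplified sketch, not the authors' pipeline.

```python
import numpy as np

def clr_scores(mi):
    """Background-corrected relatedness from a symmetric mutual-information matrix.

    Assumption: `mi` is an n x n symmetric matrix of pairwise mutual
    information values that has already been estimated elsewhere.
    """
    mi = np.asarray(mi, dtype=float)
    n = mi.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)

    # Background of each gene: mean and standard deviation of its mutual
    # information with every other gene (self-pairs are ignored).
    masked = np.where(off_diagonal, mi, np.nan)
    mean = np.nanmean(masked, axis=1)
    std = np.nanstd(masked, axis=1)
    std[std == 0] = 1.0  # avoid division by zero for flat backgrounds

    # z[i, j] is the pair (i, j) seen against gene i's background;
    # negative values are clipped to zero.
    z = np.maximum(0.0, (mi - mean[:, None]) / std[:, None])

    # Combine both genes' views of the same pair.
    scores = np.sqrt(z**2 + z.T**2)
    np.fill_diagonal(scores, 0.0)
    return scores
```

A network would then be obtained by keeping the pairs whose score exceeds a chosen threshold; the threshold is one of the parameter choices whose influence the abstract warns about.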

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

A Sparse Graph-Structured Lasso Mixed Model for Genetic Association with Confounding Correction

Author: Liu Xiang
Wang Haohan
Xing Eric P.
Ye Wenting
Publication date: 11/11/2017

While the linear mixed model (LMM) has shown competitive performance in correcting spurious associations arising from population stratification, family structures, and cryptic relatedness, challenges remain regarding the complex structure of genotypic and phenotypic data. For example, geneticists have discovered that some clusters of phenotypes are more co-expressed than others. Hence, a joint analysis that can utilize such relatedness information in a heterogeneous data set is crucial for genetic modeling. We propose the sparse graph-structured linear mixed model (sGLMM), which can incorporate the relatedness information from traits in a dataset with confounding correction. Our method is capable of uncovering the genetic associations of a large number of phenotypes together while considering the relatedness of these phenotypes. Through extensive simulation experiments, we show that the proposed model outperforms other existing approaches and can model correlation from both population structure and shared signals. Further, we validate the effectiveness of sGLMM on real-world genomic datasets from two different species, one plant and one human. In Arabidopsis thaliana data, sGLMM performs better than all other baseline models for 63.4% of traits. We also discuss the potentially causal genetic variation for human Alzheimer's disease discovered by our model and justify some of the most important genetic loci.
Comment: Code available at https://github.com/YeWenting/sGLM

arXiv.org e-Print Archive

A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes

Author: Auburn Sarah
Berriman Matthew
Böhme Ulrike
Gao Qi
Hostetler Jessica
Newbold Chris I
Nosten Francois
Otto Thomas D.
Price Ric N
Sanders Mandy
Steinbiss Sascha
Trimarsanto Hidayat
Publication venue: 'F1000 Research Ltd'
Publication date: 01/01/2016

Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and the Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented, with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is greatly improved over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01, compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and the PvC01 and PvT01 draft assemblies are important new resources for studying vivax malaria. PvP01 is maintained at GeneDB, and ongoing curation will ensure continual improvements in assembly and annotation quality.

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Enlighten

What traits are carried on mobile genetic elements, and why?

Author: A Gardner
A Mochizuki
A Norman
A Philppon
A Wagner
AD O’Brien
AS Griffin
B Sanchez
BJ Crespi
BK Kozlowicz
BMM Ahmer
BR Levin
C Buchrieser
CE White
CF Amabile-Cuevas
CG Kurland
CM Thomas
CM Yates
CT Bergstrom
D Gevers
D J Rankin
DJ Rankin
DM Livermore
E Cascales
E Lerat
E P C Rocha
E Polzleitner
ES Egan
F Bushman
F de la Cruz
F Delsuc
F Dionisio
F Harrison
F Rousset
FM Stewart
FR Slater
G Gil
G Hardin
GA Dykes
H Knothe
H Ochman
I Chen
I Kobayashi
I Rasched
J Hacker
J Lederberg
J Paulsson
J Smith
J Zupan
JB Ferdy
JB Xavier
JC Diaz Ricci
JE Bouma
JL Martinez
JL Martínez
JL Sachs
JR van der Ploeg
K Lee
KM Oliver
KR Foster
L Buts
L Chao
L Keller
L Lehmann
L Riboli-Sasco
L Van Melderen
LE Orgel
LJ Johnson
LN Lili
LS Frost
M Achtman
M Ackermann
M Touchon
MA Riley
MI Bahl
MK Waldor
MP Nuti
N Goldenfeld
ND Zinder
PB Rainey
PE Turner
R Barrangou
R Korona
RD Magnuson
RE Fox
RJ Ellis
RJF Haft
RM Anderson
RM May
S Alizon
S P Brown
SA West
SC Slater
SJ Sorensen
SM Faruque
SP Brown
SR Bordenstein
T Nogueira
TF Cooper
VL Arcus
WD Hamilton
WG Eberhard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Although prokaryotes, like any other organism, can transfer genes vertically from mother cell to daughter cell, they can also exchange certain genes horizontally. Genes can move within and between genomes at fast rates because of mobile genetic elements (MGEs). Although mobile elements are fundamentally self-interested entities, and thus replicate for their own gain, they frequently carry genes beneficial for their hosts and/or the neighbours of their hosts. Many genes that are carried by mobile elements code for traits that are expressed outside of the cell. Such traits are involved in bacterial sociality, such as the production of public goods, which benefit a cell's neighbours, or the production of bacteriocins, which harm a cell's neighbours. In this study we review the patterns that are emerging in the types of genes carried by mobile elements, and discuss the evolutionary and ecological conditions under which mobile elements evolve to carry their peculiar mix of parasitic, beneficial and cooperative genes.

Crossref

PubMed Central

Edinburgh Research Explorer

ZORA

Finding co-solvers on Twitter, with a little help from Linked Data

Author: A. Burton-Jones
B.F. Jones
C. Macdonald
C. Wagner
C.-N. Ziegler
H. Luo
H. Ziaimatin
J. Letierce
K. Balog
K. Lakiotaki
P. Kazienko
P. Serdyukov
P.J. Hinds
Q. Li
R.L. Cilibrasi
S. Matos
S. Siersdorfer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

In this paper we propose a method for suggesting potential collaborators for solving innovation challenges online, based on their competence, similarity of interests and social proximity with the user. We rely on Linked Data to derive a measure of semantic relatedness that we use to enrich both user profiles and innovation problems with additional relevant topics, thereby improving the performance of co-solver recommendation. We evaluate this approach against state-of-the-art methods for query enrichment based on the distribution of topics in user profiles, and demonstrate its usefulness in recommending collaborators that are both complementary in competence and compatible with the user. Our experiments are grounded using data from the social networking service Twitter.com.

Crossref

Open Research Online (The Open University)

Lancaster E-Prints

Statistical Modeling of Epistasis and Linkage Decay using Logic Regression

Author: Jean-Luc Jannink
John A. Henning
Peter Szucs
Thomas B. Parker
Walt F. Mahaffee
Publication date: 18/11/2008

Logic regression has been recognized as a tool that can identify and model non-additive genetic interactions using Boolean logic groups. Logic regression, TASSEL-GLM and SAS-GLM were compared for analytical precision using a previously characterized model system to identify the best genetic model explaining epistatic interaction of vernalization-sensitivity in barley. A genetic model containing two molecular markers identified in vernalization response in barley was selected using logic regression, while both TASSEL-GLM and SAS-GLM included spurious associations in their models. The results also suggest that logic regression can be used to identify dominant/recessive relationships between epistatic alleles through its use of conjugate operators.

Crossref

Nature Precedings

Analysis of somatic mutations across the kinome reveals loss-of-function mutations in multiple cancer types

Author: Bose Ron
Kumar Runjun D
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Digital Commons@Becker

Analysis of the human diseasome reveals phenotype modules across common, genetic, and infectious diseases

Author: Gkoutos Georgios V
Hoehndorf Robert
Schofield Paul N
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2014

Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 8,000 diseases. We demonstrate that our method generates phenotypes that correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that share signs and symptoms cluster together, and we use this network to identify phenotypic disease modules.
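
The abstract above does not say which phenotypic similarity measure is used. Purely to illustrate how shared signs and symptoms can be turned into a disease network, the sketch below substitutes plain Jaccard similarity between phenotype sets and keeps pairs above a threshold. The disease names, phenotype terms, threshold, and function names are all made-up placeholders, not data or methods from the paper.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two phenotype sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_edges(phenotypes, threshold):
    """Return (disease_1, disease_2, similarity) for pairs at or above threshold."""
    edges = []
    for d1, d2 in combinations(sorted(phenotypes), 2):
        score = jaccard(phenotypes[d1], phenotypes[d2])
        if score >= threshold:
            edges.append((d1, d2, score))
    return edges

# Hypothetical toy input: each disease maps to its set of signs and symptoms.
toy_phenotypes = {
    "disease_A": {"fever", "cough", "fatigue"},
    "disease_B": {"fever", "cough", "headache"},
    "disease_C": {"rash", "joint pain"},
}

edges = similarity_edges(toy_phenotypes, threshold=0.4)
```

With this toy input only disease_A and disease_B are linked, since they share two of four distinct phenotypes. Modules could then be sought with any community detection method applied to the resulting edges.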

arXiv.org e-Print Archive

University of Birmingham Research Portal

PubMed Central