Queen Mary Research Online

Bern Open Repository and Information System (BORIS)

MPG.PuRe

FigShare

Genome Majority Vote Improves Gene Predictions

Author: A Pallejà
A Pati
AE Tenney
AL Delcher
Christos A. Ouzounis
D Hyatt
D Vallenet
DP Herlemann
G Parra
I Korf
J Besemer
J Dunbar
John Dunbar
Judith D. Cohn
KE Rudd
M Alexandersson
M Dai
M Riley
M Walker
Michael E. Wall
MR Brent
MS Poptsova
P Flicek
R Guigó
RC Edgar
RG Skophammer
RK Aziz
SF Altschul
Sindhu Raghavan
SS Gross
WJ Bruno
Publication venue: Public Library of Science
Publication date: 01/01/2011

Recent studies have noted extensive inconsistencies in gene start sites among orthologous genes in related microbial genomes. Here we provide the first documented evidence that imposing gene start consistency improves the accuracy of gene start-site prediction. We applied an algorithm using a genome majority vote (GMV) scheme to increase the consistency of gene starts among orthologs. We used a set of validated Escherichia coli genes as a standard to quantify accuracy. Results showed that the GMV algorithm can correct hundreds of gene prediction errors in sets of five or ten genomes while introducing few errors. Using a conservative calculation, we project that GMV would resolve many inconsistencies and errors in publicly available microbial gene maps. Our simple and logical solution provides a notable advance toward accurate gene maps.
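The abstract above does not spell out the GMV procedure, so the following is only a minimal sketch of the general idea of a majority vote over ortholog start sites. It assumes, hypothetically, that each genome's predicted start has already been mapped to a shared alignment coordinate; the function name `majority_vote_starts` and the `min_fraction` threshold are illustrative and not taken from the paper. A real implementation would also need to check that a valid start codon exists at the consensus position in each genome.

```python
from collections import Counter


def majority_vote_starts(starts, min_fraction=0.5):
    """Return start positions after a simple majority vote.

    starts: dict mapping genome name -> predicted start position, expressed
            in a shared alignment coordinate (a simplifying assumption).
    min_fraction: a start is imposed on all orthologs only if strictly more
            than this fraction of genomes agree on it (hypothetical rule).
    """
    counts = Counter(starts.values())
    consensus, votes = counts.most_common(1)[0]
    if votes / len(starts) <= min_fraction:
        # No strict majority: leave every prediction unchanged.
        return dict(starts)
    # Strict majority: make all orthologs consistent with the consensus.
    return {genome: consensus for genome in starts}


if __name__ == "__main__":
    predicted = {"g1": 12, "g2": 12, "g3": 12, "g4": 30, "g5": 12}
    print(majority_vote_starts(predicted))
```

In this toy example four of five genomes agree on position 12, so the outlier in `g4` is moved to the consensus; when no position has a strict majority, the sketch deliberately changes nothing.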

CiteSeerX

Public Library of Science (PLOS)

Texas ScholarWorks

A Meta-Analysis of Microarray Gene Expression in Mouse Stem Cells: Redefining Stemness

Author: A Sandelin
A Smith
AI Su
BE Bernstein
D Baek
David T. Jones
H Parkinson
JL Attema
JR Landry
K Kimura
K Tsuritani
Kevin Bryson
LO Barrera
M Ashburner
M Buszczak
M Gardiner-Garden
M Grskovic
M Ramalho-Santos
NB Ivanova
NO Fortunel
P Carninci
P Flicek
P Rice
PA Jones
RA Irizarry
RC Gentleman
RH Waterston
S Falcon
S Fukada
S Prabhakar
T Barrett
TA Venezia
TB Miranda
TJ Hubbard
Winston Hide
Yvonne J. K. Edwards
Publication venue: Public Library of Science
Publication date: 01/07/2008

While much progress has been made in understanding stem cell (SC) function, a complete description of the molecular mechanisms regulating SCs is not yet established. This lack of knowledge is a major barrier holding back the discovery of therapeutic uses of SCs. We investigated the value of a novel meta-analysis of microarray gene expression in mouse SCs to aid the elucidation of regulatory mechanisms common to SCs and particular SC types. We added value to previously published microarray gene expression data by characterizing the promoter type likely to regulate transcription. Promoters of up-regulated genes in SCs were characterized in terms of alternative promoter (AP) usage and CpG-richness, with the aim of correlating features known to affect transcriptional control with SC function. We found that SCs have a higher proportion of up-regulated genes using CpG-rich promoters compared with the negative controls. Comparing subsets of SC type with the controls, a slightly different story unfolds: the differences between the proliferating adult SCs and the embryonic SCs versus the negative controls are statistically significant, whilst the difference between the quiescent adult SCs and the negative controls is not. On examination of AP usage, no difference was observed between SCs and the controls. However, comparing the subsets of SC type with the controls, the quiescent adult SCs are found to up-regulate a larger proportion of genes that have APs compared to the controls, and the converse is true for the proliferating adult SCs and the embryonic SCs. These findings suggest that looking at features associated with control of transcription is a promising future approach for characterizing “stemness” and that further investigations of stemness could benefit from separate considerations of different SC states. For example, “proliferating-stemness” is shown here, in terms of promoter usage, to be distinct from “quiescent-stemness”.

UCL Discovery

Enlighten

Ortho2ExpressMatrix—a web server that interprets cross-species gene expression data by gene family information

Author: A Krause
A Valencia
AC Berglund
AJ Enright
AJ Vilella
Andreas H Ludewig
BY Liao
C Frech
EL Sonnhammer
EV Koonin
G Ostlund
H Edwards
H Parkinson
HS Le
I Rivals
J Michaud
KI Goh
L Huminiecki
M Kanehisa
M Kapushesky
M Pellegrini
M Remm
Michal R Schweiger
P Flicek
Ralf Herwig
Ramu Chenna
RC Friedman
RD Finn
RL Tatusov
S Abhiman
S Griffiths-Jones
S Haider
SF Altschul
Sylvia Krobitsch
T Barrett
T Domazet-Loso
T Meinel
Thomas Meinel
TJ Hubbard
TW Harris
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The study of gene families is pivotal for the understanding of gene evolution across different organisms, and such phylogenetic background is often used to infer biochemical functions of genes. Modern high-throughput experiments offer the possibility to analyze the entire transcriptome of an organism; however, it is often difficult to deduce functional information from that data. Results: To improve functional interpretation of gene expression we introduce Ortho2ExpressMatrix, a novel tool that integrates complex gene family information, computed from sequence similarity, with comparative gene expression profiles of two pre-selected biological objects: gene families are displayed with two-dimensional matrices. Parameters of the tool are object type (two organisms, two individuals, two tissues, etc.), type of computational gene family inference, experimental meta-data, microarray platform, gene annotation level and genome build. Family information in Ortho2ExpressMatrix is based on computationally different protein family approaches such as EnsemblCompara, InParanoid, SYSTERS and Ensembl Family. Currently, respective all-against-all associations are available for five species: human, mouse, worm, fruit fly and yeast. Additionally, microRNA expression can be examined with respect to miRBase or TargetScan families. The visualization, which is typical for Ortho2ExpressMatrix, is performed as a matrix view that displays functional traits of genes (differential expression) as well as sequence similarity of protein family members (BLAST e-values) in colour codes. Such translations are intended to facilitate the user's perception of the research object. Conclusions: Ortho2ExpressMatrix integrates gene family information with genome-wide expression data in order to enhance functional interpretation of high-throughput analyses on diseases, environmental factors, or genetic modification or compound treatment experiments. The tool explores differential gene expression in the light of orthology, paralogy and structure of gene families up to the point of ambiguity analyses. Results can be used for filtering and prioritization in functional genomic, biomedical and systems biology applications. The web server is freely accessible at http://bioinf-data.charite.de/o2em/cgi-bin/o2em.pl.

Springer - Publisher Connector

MPG.PuRe

BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources

Author: AI Su
AM Jenkinson
Andrew I Su
AR Pico
B Mons
C Goble
C Wu
Camilo Orozco
Christopher L Hodge
Chunlei Wu
CJ Bult
D Maglott
ES Lein
James Goodale
James Haase
Jason Boyer
JE Lattin
Jeff Janes
Jon W Huss
JR Walker
JW Huss
L Wang
LJ Jensen
M Kanehisa
Marc Leglise
MD Brazas
MM Dix
MY Galperin
P Flicek
P Yue
R Hoffmann
RC Friedman
S Xu
Serge Batalov
SN Twigger
X Zhang
Publication venue: BioMed Central
Publication date

BioGPS is a community-based, customisable gene annotation portal that brings gene annotation resources together on a single platform.

pubmed2ensembl: A Resource for Mining the Biological Literature on Genes

Author: A Doms
AA Morgan
AM Jenkinson
B Giardine
BA Eckman
C Plake
Casey M. Bergman
D Hull
D Maglott
D Smedley
E Ryder
EM Zdobnov
G Zhou
Goran Nenadic
H Miller
H Parkinson
J Hakenberg
J Hirschman
J Tamames
JM Fernandez
Joachim Baran
L Chen
L Hirschman
M Ashburner
M Gerner
M Haeussler
M Huang
M Krallinger
Martin Gerner
Maximilian Haeussler
P Flicek
P Kersey
PA Fujita
R Drysdale
R Hoffmann
R Leinonen
R Lyne
RC Gentleman
S Matos
SM Gallo
SP Shah
SS Dwight
Stein Aerts
T Imanishi
TJ Lee
U Mudunuri
W Xuan
Y Makita
Y Yoshida
Z Lu
Publication venue: Public Library of Science
Publication date: 29/09/2011

The last two decades have witnessed a dramatic acceleration in the production of genomic sequence information and publication of biomedical articles. Despite the fact that genome sequence data and publications are two of the most heavily relied-upon sources of information for many biologists, very little effort has been made to systematically integrate data from genomic sequences directly with the biological literature. For a limited number of model organisms, dedicated teams manually curate publications about genes; however, for species with no such dedicated staff, many thousands of articles are never mapped to genes or genomic regions. To overcome the lack of integration between genomic data and biological literature, we have developed pubmed2ensembl (http://www.pubmed2ensembl.org), an extension to the BioMart system that links over 2,000,000 articles in PubMed to nearly 150,000 genes in Ensembl from 50 species. We use several curated (e.g., Entrez Gene) and automatically generated (e.g., gene names extracted through text-mining on MEDLINE records) sources of gene-publication links, allowing users to filter and combine different data sources to suit their individual needs for information extraction and biological discovery. In addition to extending the Ensembl BioMart database to include published information on genes, we also implemented a scripting language for automated BioMart construction and a novel BioMart interface that allows text-based queries to be performed against PubMed and PubMed Central documents in conjunction with constraints on genomic features. Finally, we illustrate the potential of pubmed2ensembl through typical use cases that involve integrated queries across the biomedical literature and genomic data. By allowing biologists to find the relevant literature on specific genomic regions or sets of functionally related genes more easily, pubmed2ensembl offers a much-needed genome informatics inspired solution to accessing the ever-increasing biomedical literature.

The University of Manchester - Institutional Repository

The malignant phenotype in breast cancer is driven by eIF4A1-mediated changes in the translational landscape

Author: A De Benedetti
A Keller
A Lazaris-Karatzas
A Subramanian
AE Teschendorff
AG Hinnebusch
AL Wolfe
AP Jansen
B Langmead
B Lankat-Buttgereit
B Schwanhausser
BP Lewis
C Jin
C Vogel
D Shahbazian
E Horvilleur
E Turro
EM Azzato
F Lesueur
F Meric-Bernstam
FM Blows
GK Smyth
H Liu
HA Meijer
HR Ali
HS Yang
IL Hofacker
JM Cairns
JP Le Quesne
JR Babendure
K Feoktistova
L Boussemart
LJ Coleman
LM McShane
M Bohm
ME Bordeleau
MJ Dunning
P Flicek
R Cencic
RC Gentleman
RJ Crowder
RJ Jackson
T Tsumuraya
TL Bailey
VK Mootha
Y Benjamini
YH Wen
Publication venue: Springer Science and Business Media LLC
Publication date: 22/01/2015

Human mRNA DeXD/H-box helicases are ubiquitous molecular motors that are required for the majority of cellular processes that involve RNA metabolism. One of the most abundant is eIF4A, which is required during the initiation phase of protein synthesis to unwind regions of highly structured mRNA that would otherwise impede the scanning ribosome. Dysregulation of protein synthesis is associated with tumorigenesis, but little is known about the detailed relationships between RNA helicase function and the malignant phenotype in solid malignancies. Therefore, immunohistochemical analysis was performed on over 3000 breast tumors to investigate the relationship among expression of eIF4A1, the helicase-modulating proteins eIF4B, eIF4E and PDCD4, and clinical outcome. We found eIF4A1, eIF4B and eIF4E to be independent predictors of poor outcome in ER-negative disease, while in contrast, the eIF4A1 inhibitor PDCD4 was related to improved outcome in ER-positive breast cancer. Consistent with these data, modulation of eIF4A1, eIF4B and PDCD4 expression in cultured MCF7 cells all restricted breast cancer cell growth and cycling. The eIF4A1-dependent translatome of MCF7 cells was defined by polysome profiling, and was shown to be highly enriched for several classes of oncogenic genes, including G-protein constituents, cyclins and protein kinases, and for mRNAs with G/C-rich 5′UTRs with potential to form G-quadruplexes and with 3′UTRs containing microRNA target sites. Overall, our data show that dysregulation of mRNA unwinding contributes to the malignant phenotype in breast cancer via preferential translation of a class of genes involved in pro-oncogenic signaling at numerous levels. Furthermore, immunohistochemical tests are promising biomarkers for tumors sensitive to anti-helicase therapies.

Nottingham ePrints

Nottingham eTheses

Repository@Nottingham

The Tetraodon nigroviridis reference transcriptome: Developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome

Author: A Kapusta
A Necsulea
A Pauli
A Stabenau
AJ Vilella
AR Quinlan
B Maher
C Nepal
C Trapnell
C Weaver
CA Watson
CM Smith
D Kim
DR Kelley
F Pelegri
G St. Laurent
GT Williams
H Aanes
H Hezroni
H Roest Crollius
H Tilgner
I Ulitsky
J Harrow
J Kim
J Ponjavic
J Ruiz-Orera
J-W Nam
JB Brown
M Blanchette
M Chorev
M Lohse
MD Robinson
MN Cabili
NT Ingolia
O Jaillon
P Flicek
P Heyn
P Miura
R Arrial
RC Gentleman
S Aparicio
S Basu
S Brenner
S Durinck
S Mathavan
SA Harvey
SS Paranjpe
T Derrien
T Kino
TR Dreszer
V Haberle
W Tadros
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during maternal to zygotic transition and annotated 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates, and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleost and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies.

University of Birmingham Research Portal

KITopen

Public Library of Science (PLOS)

Sissa Digital Library

Identifying Consensus Disease Pathways in Parkinson's Disease Using an Integrative Systems Biology Approach

Parkinson's disease (PD) has had six genome-wide association studies (GWAS) conducted as well as several gene expression studies. However, only variants in MAPT and SNCA have been consistently replicated. To improve the utility of these approaches, we applied pathway analyses integrating both GWAS and gene expression. The top 5000 SNPs (p<0.01) from a joint analysis of three existing PD GWAS were identified and each assigned to a gene. For gene expression, rather than the traditional comparison of one anatomical region between sets of patients and controls, we identified differentially expressed genes between adjacent Braak regions in each individual and adjusted using average control expression profiles. Over-represented pathways were calculated using a hypergeometric statistical comparison. An integrated, systems meta-analysis of the over-represented pathways combined the expression and GWAS results using Fisher's combined probability test. Four of the top seven pathways from each approach were identical. The top three pathways in the meta-analysis, with their corrected p-values, were axonal guidance (p = 2.8E-07), focal adhesion (p = 7.7E-06) and calcium signaling (p = 2.9E-05). These results support the view that a systems biology (pathway) approach will provide additional insight into the genetic etiology of PD and that these pathways have both biological and statistical support to be important in PD.
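The two statistical steps named in this abstract, a hypergeometric over-representation test and Fisher's combined probability test, can be illustrated with a short standard-library sketch. This is not the authors' pipeline: the function names and the example numbers are hypothetical, and the multiple-testing correction mentioned in the abstract is omitted. The combined p-value uses the closed form of the chi-square survival function for an even number of degrees of freedom (2k for k p-values).

```python
import math


def hypergeom_sf(k, total_genes, pathway_genes, selected_genes):
    """P(X >= k): chance of seeing at least k pathway genes among the
    selected genes when drawing without replacement from total_genes."""
    upper = min(pathway_genes, selected_genes)
    denom = math.comb(total_genes, selected_genes)
    numer = sum(
        math.comb(pathway_genes, i)
        * math.comb(total_genes - pathway_genes, selected_genes - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p_i) follows chi-square with 2k df
    under the null. Returns (statistic, combined p-value)."""
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Survival function for even df 2k: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical numbers: 3 pathway genes among 10, 2 selected, >= 1 hit.
    print(hypergeom_sf(1, 10, 3, 2))
    # Combine one expression-based and one GWAS-based pathway p-value.
    print(fisher_combined([0.05, 0.05]))
```

With two inputs, as when one expression result and one GWAS result are combined per pathway, the combined p-value reduces to `q * (1 - ln q)` where `q` is the product of the two p-values.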