532 research outputs found

GIVE: portable genome browsers for personal websites.

Author: Alvin Zheng
B Sridhar
B Sridhar
C Tyner
D Barrios
D Comer
E Lieberman-Aiden
E Sharma
F Ozsolak
F Yue
FH Biase
JD Buenrostro
JG Aw
JT Robinson
LD Stein
ME Skinner
MJ Fullwood
Qiuyang Wu
R Bayer
R Li
R Mourad
S Carrere
Sheng Zhong
TC Nguyen
The ENCODE Project Consortium
VW Zhou
WJ Kent
X Li
X Zhou
Xiaoyi Cao
Z Lu
Zhangming Yan
Publication venue: eScholarship, University of California
Publication date: 01/07/2018

The growing popularity and diversity of genomic data demand portable and versatile genome browsers. Here, we present an open-source programming library called GIVE that facilitates the creation of personalized genome browsers without requiring a system administrator. By inserting HTML tags, one can add to a personal webpage interactive visualizations of multiple types of genomics data, including genome annotation, "linear" quantitative data, and genome interaction data. GIVE includes a graphical interface called HUG (HTML Universal Generator) that automatically generates HTML code for displaying user-chosen data, which can be copied and pasted into a user's personal website or saved and shared with collaborators. GIVE is available at: https://www.givengine.org/

Crossref

Directory of Open Access Journals

eScholarship - University of California

Revealing mammalian evolutionary relationships by comparative analysis of gene clusters

Author: Abi-Rached
Akahoshi
Bailey
Benjamin Dickins
Birney
Cadavid
Cathy Riemer
Chen
Chih-Hao Hsu
Chiu
Colobran
Datta
Degenhardt
Dewey
Dufayard
Edwards
Eric D. Green
Fitch
Fitch
Fitch
Giltae Song
Gish
Gonzalez
Goodstadt
Graef
Guethlein
Guethlein
Han
Hardies
Hardison
Hardison
Hardison
Harris
Hie Lim Kim
Hoffmann
Hou
Hou
Hsu
Hsu
Hu
Huerta-Cepas
Jensen
Johnson
Kim
Kristensen
Lee
Levy
Li
Li
Lopez-Vazquez
Louxin Zhang
Margulies
Martin
Matsuya
Mi
Miyata
Muller
Murphy
NISC Comparative Sequencing Program
Opazo
Opazo
Ostlund
Ouzounis
Parham
Pianezza
Rajalingam
Ross C. Hardison
Sambrook
Shilling
Siepel
Smit
Song
Song
Song
Sonnhammer
Su
Tatusov
The ENCODE Project Consortium
Uchiyama
van der Heijden
Vilella
Wang
Wapinski
Waterhouse
Webb Miller
Wilson
Wilson
Woelk
Yu Zhang
Zhang
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2012

Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events

Crossref

Nottingham Trent Institutional Repository (IRep)

PubMed Central

ScholarBank@NUS

The Escherichia coli transcriptome mostly consists of independently regulated modules

Author: A Anand
A Biton
A Delorme
A Frigyesi
A Hyvärinen
A Santos-Zavaleta
A-M Martoglio
AE Teschendorff
B Dalrymple
B Langmead
B-K Cho
B-K Cho
BM Bolstad
C Vijayendran
CL Turnbough Jr
D Kim
D Marbach
D Risso
D-S Huang
DS Latchman
E Nudler
EJ O’Brien
ENCODE Project Consortium.
ER Gansner
F Pedregosa
GI Guzmán
GI Guzmán
H Zou
HS Rhee
I Kristoficova
IM Keseler
J Pouyssegur
J Utrilla
JE Galagan
JJ Faith
JM Buescher
JM Engreitz
JM Monk
JT Leek
K Valgepea
K-K Yan
KF Jensen
KJ Karczewski
L Wang
M Ester
M Kim
M Lawrence
M Moretto
M Scott
M Scott
MB Gerstein
MI Love
NE Lewis
O Alter
P Chiappetta
P Comon
PR Subbarayan
PV Phaneuf
R De Smet
R Kolter
RA LaCroix
RB D’agostino
S Gama-Castro
S Lin
SJ Larsen
SW Seo
T Baba
T Barrett
TM Henkin
W Kong
W Liebermeister
W Saelens
X Zhang
Xin Fang
XW Zhang
Y Gao
Y Yamanaka
Z Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome

Crossref

ScholarWorks@UNIST

eScholarship - University of California

Online Research Database In Technology

Law of Genome Evolution Direction : Coding Information Quantity Grows

Author: A. F. A. Smit
A. G. Matera
A. Mira
B. Charlesworth
C. L. Organ
C. Nusbaum
D. A. Petrov
D. L. Marais Des
D. R. Scannell
E. Schrodinger
E. T. Dermitzakis
F. Clark
G. Bejerano
G. Liu
G. Storz
H. H. Chou
H. H. Kazazian
H. Ozkan
H. Winter
I. J. Leitch
I. Wapinski
I. Wickelgren
International Human Genome Sequencing Consortium
J. Filkowski
J.M. Aury
K. M. Devos
L. F. Luo
L. F. Luo
L. F. Luo
L. He
L. Patthy
L. R. Zhang
Liao-fu Luo
R. J. Taft
R. P. Bininda-Edmonds
S. E. Peters
T. C. Stadtman
T. Kouzarides
T. R. Gregory
The ENCODE Project Consortium
W. Deng
W. Enard
W. H. Li
W. Makalowski
X. Xu
Publication venue: Springer Science and Business Media LLC
Publication date: 25/08/2008

The problem of the directionality of genome evolution is studied. Based on an analysis of the C-value paradox and the evolution of genome size, we propose that the function-coding information quantity of a genome always grows in the course of evolution through sequence duplication, expansion of code, and gene transfer from outside. The function-coding information quantity of a genome consists of two parts: p-coding information quantity, which encodes functional proteins, and n-coding information quantity, which encodes functional elements other than amino acid sequences. Evidence for this evolutionary law of function-coding information quantity is listed. The need for function is the motive force for the expansion of coding information quantity, and this expansion is the way a species achieves functional innovation and extension. Thus, the increase in the coding information quantity of a genome is a measure of newly acquired function, and it determines the directionality of genome evolution.

arXiv.org e-Print Archive

Crossref

Genetic determinants of co-accessible chromatin regions in activated T cells across humans.

Author: A Barrie
A Battle
A Franke
AA Shabalin
AM Klein
AR Quinlan
Atsede Siba
Aviv Regev
Aviva P. Aiden
B Li
BE Stranger
C Hou
Christine S. Cheng
Christophe Benoist
Chun J. Ye
CJ Ye
CK Stroud
D Hnisz
D Lee
D Sakata
DE Speiser
Dmytro Lituiev
E Elinav
E Splinter
EM Schmidt
Erez Lieberman Aiden
EZ Macosko
G Jun
G McVicker
H Kilpinen
H Li
H Li
HK Finucane
HM Kang
Howard Y. Chang
Ido Machol
Ivo Wortman
J Yang
JD Buenrostro
JD Buenrostro
JD Storey
JE Phillips
JF Degner
JN Hirschhorn
JS Delisle
K Enjyoji
Kendrick L. Hougen
KK Farh
L Chen
L Plesner
M Feuerer
M Ghandi
M Kasowski
M Kronenberg
M Kurachi
M. Grace Gordon
Marcin Tabaka
MB Gerstein
Meena Subramaniam
MI Love
MI McCarthy
Michael A. Beer
MN Lee
MT Maurano
Muhammad Shamim
MY Donath
N Kumasaka
NC Durand
Neva C. Durand
NP Restifo
P Cauchy
P Li
PC Hollenhorst
Philip L. De Jager
PM Visscher
PS Ohashi
R Satija
Rachel E. Gate
RE Thurman
RM Samstein
Roadmap Epigenomics Consortium
S Deaglio
S Heinz
S Neph
SM Waszak
SS Rao
Su-Chen Huang
T Lappalainen
T Raj
The ENCODE Project Consortium.
Ting Feng
TL Murphy
UM Marigorta
WA Whyte
WJ Astle
X Chen
X Sun
Y Belkaid
Y Zhang
YY Fan
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression

Crossref

eScholarship - University of California

Modeling associations between genetic markers using Bayesian networks

Author: Altshuler
Browning
C. D. Maciel
E. Villanueva
Liu
Mueller
Nothnagel
Pritchard
Scheet
The ENCODE Project Consortium
Thomas
Thomas
Tishkoff
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Understanding the patterns of association between polymorphisms at different loci in a population (linkage disequilibrium, LD) is of fundamental importance in various genetic studies. Many coefficients have been proposed for measuring the degree of LD, but they provide only a static view of the current LD structure. Generative models (GMs) were proposed to go beyond these measures, giving not only a description of the actual LD structure but also a tool to help understand the process that generated such structure. GMs based on coalescent theory have been the most appealing because they link LD to evolutionary factors. Nevertheless, the inference and parameter estimation of such models is still computationally challenging

Crossref

PubMed Central

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Repositório da Produção USP (Univ. de São Paulo)

Determinants of protein evolutionary rates in light of ENCODE functional genomics

Author: AM Larracuente
C Pál
ENCODE Project Consortium
Marc Robinson-Rechavi
Nadezda Kryuchkova
Publication venue: Springer Science and Business Media LLC

Crossref

A systematic, large-scale comparison of transcription factor binding site models

Background: The modelling of gene regulation is a major challenge in biomedical research. This process is dominated by transcription factors (TFs), and mutations in their binding sites (TFBSs) may cause the misregulation of genes, eventually leading to disease. The consequences of DNA variants on TF binding are modelled in silico using binding matrices, but it remains unclear whether these are capable of accurately representing in vivo binding. In this study, we present a systematic comparison of binding models for 82 human TFs from three freely available sources: JASPAR matrices, HT-SELEX-generated models and matrices derived from protein binding microarrays (PBMs). We determined their ability to detect experimentally verified “real” in vivo TFBSs derived from ENCODE ChIP-seq data. As negative controls we chose random downstream exonic sequences, which are unlikely to harbour TFBSs. All models were assessed by receiver operating characteristic (ROC) analysis. Results: While the area under the curve was low for most of the tested models, with only 47% reaching a score of 0.7 or higher, we noticed strong differences between the various position-specific scoring matrices, with JASPAR and HT-SELEX models showing higher success rates than PBM-derived models. In addition, we found that while TFBS sequences showed a higher degree of conservation than randomly chosen sequences, there was high variability between individual TFBSs. Conclusions: Our results show that only a few of the matrix-based models used to predict potential TFBSs are able to reliably detect experimentally confirmed TFBSs. We compiled our findings in a freely accessible web application called ePOSSUM (http://mutationtaster.charite.de/ePOSSUM/), which uses a Bayes classifier to assess the impact of genetic alterations on TF binding in user-defined sequences. Additionally, ePOSSUM provides information on the reliability of the prediction using our test set of experimentally confirmed binding sites

Institutional Repository of the Freie Universität Berlin

Crossref

Springer - Publisher Connector

PubMed Central

Chromatin loop anchors are associated with genome instability in cancer and recombination hotspots in the germline

Author: A Auton
A Canela
A Gardini
A Liaw
A Losada
AL Valton
AR Quinlan
AS Kudlicki
B Charlesworth
B Gel
B Schuster-Bockler
BJ Taylor
BL Moore
C Bertoli
C Grey
CE Grant
CJ Lord
CM Carvalho
CM Manville
Colin A. Semple
CS Walsh
CT Ong
CY McLean
D Hnisz
D Perera
DG Lupianez
DR Zerbino
E Guillou
E Hatchi
E Splinter
Encode Project Consortium
EP Nora
F Baudat
F McNicoll
F Pratto
G Coop
G Fudenberg
G Fudenberg
G McVicker
GA McVean
J Feichtinger
J MacArthur
J Weischenfeldt
JA Rosenfeld
JA Stamatoyannopoulos
JHI Haarhuis
JM Engreitz
JN Strathern
JR Dixon
JS Gehring
K Brick
K Hilmi
L Uuskula-Reimand
LJ Valentijn
M Peifer
MA Reijns
MH Nichols
PA Northcott
R Hänsel-Hertsch
R Katainen
R Sabarinathan
S Besenbacher
S Courbet
S Groschel
S Morganella
S Myers
S Myers
S Nik-Zainal
SS Rao
SV Lensing
TJ Hudson
TL Bailey
TW Glover
TW Glover
V Dileep
VB Kaiser
Vera B. Kaiser
W Winckler
WA Flavahan
WM Hicks
Y Drier
Y Liu
Y Zhang
Z Tang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2018

Background: Chromatin loops form a basic unit of interphase nuclear organization, with chromatin loop anchor points providing contacts between regulatory regions and promoters. However, the mutational landscape at these anchor points remains under-studied. Here, we describe the unusual patterns of somatic mutations and germline variation associated with loop anchor points and explore the underlying features influencing these patterns. Results: Analyses of whole genome sequencing datasets reveal that anchor points are strongly depleted for single nucleotide variants (SNVs) in tumours. Despite low SNV rates in their genomic neighbourhood, anchor points emerge as sites of evolutionary innovation, showing enrichment for structural variant (SV) breakpoints and a peak of SNVs at focal CTCF sites within the anchor points. Both CTCF-bound and non-CTCF anchor points harbour an excess of SV breakpoints in multiple tumour types and are prone to double-strand breaks in cell lines. Common fragile sites, which are hotspots for genome instability, also show elevated numbers of intersecting loop anchor points. Recurrently disrupted anchor points are enriched for genes with functions in cell cycle transitions and regions associated with predisposition to cancer. We also discover a novel class of CTCF-bound anchor points which overlap meiotic recombination hotspots and are enriched for the core PRDM9 binding motif, suggesting that the anchor points have been foci for diversity generated during recent human evolution. Conclusions: We suggest that the unusual chromatin environment at loop anchor points underlies the elevated rates of variation observed, marking them as sites of regulatory importance but also genomic fragility

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer