242 research outputs found

Content-based microarray search using differential expression profiles

Author: Altman Russ B
Butte Atul J
Chen Rong
Dudley Joel T
Engreitz Jesse M
Morgan Alexander A
Thathoo Rahul
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: With the expansion of public repositories such as the Gene Expression Omnibus (GEO), we are rapidly cataloging cellular transcriptional responses to diverse experimental conditions. Methods that query these repositories based on gene expression content, rather than textual annotations, may enable more effective experiment retrieval as well as the discovery of novel associations between drugs, diseases, and other perturbations. Results: We develop methods to retrieve gene expression experiments that differentially express the same transcriptional programs as a query experiment. Avoiding thresholds, we generate differential expression profiles that include a score for each gene measured in an experiment. We use existing and novel dimension reduction and correlation measures to rank relevant experiments in an entirely data-driven manner, allowing emergent features of the data to drive the results. A combination of matrix decomposition and p-weighted Pearson correlation proves the most suitable for comparing differential expression profiles. We apply this method to index all GEO DataSets, and demonstrate the utility of our approach by identifying pathways and conditions relevant to transcription factors Nanog and FoxO3. Conclusions: Content-based gene expression search generates relevant hypotheses for biological inquiry. Experiments across platforms, tissue types, and protocols inform the analysis of new datasets.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Deep-coverage whole genome sequences and blood lipids among 16,324 individuals.

Author: Abecasis Goncalo
Alver Maris
Bloom Jonathan M
Chaffin Mark
Correa Adolfo
Cupples L Adrienne
Engreitz Jesse M
Ernst Jason
Esko Tonu
Ganna Andrea
Johnson W Craig
Kathiresan Sekar
Kellis Manolis
Khera Amit V
Lander Eric S
Manichaikul Ani
Mitchell Braxton
Montasser May
Natarajan Pradeep
Neale Benjamin M
NHLBI TOPMed Lipids Working Group
O'Connell Jeffrey R
Peloso Gina M
Perry James A
Poterba Timothy
Rich Stephen S
Ripatti Samuli
Rotter Jerome I
Ruotsalainen Sanni E
Salomaa Veikko
Seed Cotton
Surakka Ida L
Vasan Ramachandran S
Willer Cristen J
Wilson James G
Zekavat Seyedeh Maryam
Zhou Wei
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

Large-scale deep-coverage whole-genome sequencing (WGS) is now feasible and offers potential advantages for locus discovery. We perform WGS in 16,324 participants from four ancestries at mean depth >29X and analyze genotypes with four quantitative traits: plasma total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. Common variant association yields known loci except for a few variants previously poorly imputed. Rare coding variant association yields known Mendelian dyslipidemia genes, but rare non-coding variant association detects no signals. A high 2M-SNP LDL-C polygenic score (top 5th percentile) confers similar effect size to a monogenic mutation (~30 mg/dl higher for each); however, among those with severe hypercholesterolemia, 23% have a high polygenic score and only 2% carry a monogenic mutation. At these sample sizes and for these phenotypes, the incremental value of WGS for discovery is limited, but WGS permits simultaneous assessment of monogenic and polygenic models of severe hypercholesterolemia.

DSpace@MIT

Directory of Open Access Journals

eScholarship - University of California

George Washington University: Health Sciences Research Commons (HSRC)

Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries.

Author: Alver Maris
Bloom Jonathan
Budoff Matthew
Chaffin Mark
Correa Adolfo
Cupples L Adrienne
Daly Mark J
Engreitz Jesse
Ernst Jason
Esko Tonu
Fu Mao
Ganna Andrea
Handsaker Robert E
Johnson W Craig
Kathiresan Sekar
Kellis Manolis
Manichaikul Ani
McCarroll Steven
Metspalu Andres
Mitchell Braxton D
Natarajan Pradeep
Neale Benjamin M
NHLBI TOPMed Lipids Working Group
Peloso Gina M
Post Wendy
Poterba Timothy
Rich Stephen S
Ripatti Samuli
Rotter Jerome I
Ruotsalainen Sanni
Ryan Kathleen A
Salomaa Veikko
Seed Cotton
Surakka Ida
Tsai Michael
Vasan Ramachandran S
Wilson James G
Yang Chaojie
Zekavat Seyedeh M
Publication venue: eScholarship, University of California
Publication date: 01/07/2018

Lipoprotein(a), Lp(a), is a modified low-density lipoprotein particle that contains apolipoprotein(a), encoded by LPA, and is a highly heritable, causal risk factor for cardiovascular diseases that varies in concentration across ancestries. Here, we use deep-coverage whole genome sequencing in 8392 individuals of European and African ancestry to discover and interpret both single-nucleotide variants and copy number (CN) variation associated with Lp(a). We observe that Europeans and Africans have several unique genetic determinants. The common variant rs12740374 associated with Lp(a) cholesterol is an eQTL for SORT1 and independent of LDL cholesterol. Observed associations of aggregates of rare non-coding variants are largely explained by LPA structural variation, namely the LPA kringle IV 2 (KIV2)-CN. Finally, we find that LPA risk genotypes confer greater relative risk for incident atherosclerotic cardiovascular diseases compared to directly measured Lp(a), and are significantly associated with measures of subclinical atherosclerosis in African Americans.

eScholarship - University of California

George Washington University: Health Sciences Research Commons (HSRC)

Publisher Correction: Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries.

Author: Alver Maris
Bloom Jonathan
Budoff Matthew
Chaffin Mark
Correa Adolfo
Cupples L Adrienne
Daly Mark J
Engreitz Jesse
Ernst Jason
Esko Tonu
Fu Mao
Ganna Andrea
Handsaker Robert E
Johnson W Craig
Kathiresan Sekar
Kellis Manolis
Manichaikul Ani
McCarroll Steven
Metspalu Andres
Mitchell Braxton D
Natarajan Pradeep
Neale Benjamin M
NHLBI TOPMed Lipids Working Group
Peloso Gina M
Post Wendy
Poterba Timothy
Rich Stephen S
Ripatti Samuli
Rotter Jerome I
Ruotsalainen Sanni
Ryan Kathleen A
Salomaa Veikko
Seed Cotton
Surakka Ida
Tsai Michael
Vasan Ramachandran S
Wilson James G
Yang Chaojie
Zekavat Seyedeh M
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

The original version of this article contained an error in the name of the author Ramachandran S. Vasan, which was incorrectly given as Vasan S. Ramachandran. This has now been corrected in both the PDF and HTML versions of the article.

Crossref

eScholarship - University of California

George Washington University: Health Sciences Research Commons (HSRC)

The Escherichia coli transcriptome mostly consists of independently regulated modules

Author: A Anand
A Biton
A Delorme
A Frigyesi
A Hyvärinen
A Santos-Zavaleta
A-M Martoglio
AE Teschendorff
B Dalrymple
B Langmead
B-K Cho
BM Bolstad
C Vijayendran
CL Turnbough Jr
D Kim
D Marbach
D Risso
D-S Huang
DS Latchman
E Nudler
EJ O’Brien
ENCODE Project Consortium
ER Gansner
F Pedregosa
GI Guzmán
H Zou
HS Rhee
I Kristoficova
IM Keseler
J Pouyssegur
J Utrilla
JE Galagan
JJ Faith
JM Buescher
JM Engreitz
JM Monk
JT Leek
K Valgepea
K-K Yan
KF Jensen
KJ Karczewski
L Wang
M Ester
M Kim
M Lawrence
M Moretto
M Scott
MB Gerstein
MI Love
NE Lewis
O Alter
P Chiappetta
P Comon
PR Subbarayan
PV Phaneuf
R De Smet
R Kolter
RA LaCroix
RB D’Agostino
S Gama-Castro
S Lin
SJ Larsen
SW Seo
T Baba
T Barrett
TM Henkin
W Kong
W Liebermeister
W Saelens
X Zhang
Xin Fang
XW Zhang
Y Gao
Y Yamanaka
Z Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome.

Crossref

ScholarWorks@UNIST

eScholarship - University of California

Online Research Database In Technology

Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis

Author: A Kendall
A Mackay
A Naderi
A Tichopad
AA Shabalin
AC Culhane
AE Teschendorff
AH Sims
AL Oberg
Alexey A Larionov
Andrew H Sims
Arran K Turnbull
C Desmedt
CY Lin
GC Tseng
GW Snedecor
HS Leong
J Michael Dixon
J Neter
J Rudy
JM Engreitz
JS Parker
JT Leek
KR Ong
L Ein-Dor
L Shi
Lorna Renshaw
M Barnes
M Benito
M Dai
MB Eisen
MJ Okoniewski
ML Lindstrom
MN McCall
NL Barbosa-Morais
NM Laird
R Clarke
R Sandberg
R Shen
RC Gentleman
Robert R Kitchen
RR Kitchen
VG Tusher
VS Sabine
WE Johnson
WR Miller
X Fan
X Lu
Z Hu
Z Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Background: Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately, many of these studies are underpowered, and it is desirable to improve power by combining data from more than one study; we sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis. Results: Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets, we evaluated several approaches to directly integrate intensity level expression data from the two platforms. After mapping probe sequences to Ensembl genes, we demonstrate that ComBat and cross-platform normalisation (XPN) significantly outperform mean-centering and distance-weighted discrimination (DWD) in terms of minimising inter-platform variance. In particular, we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially reducing legitimate biological differences from integrated datasets. Conclusion: Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Genome-wide enhancer maps link risk variants to disease genes

Author: Bergman DT
Collins RL
Cui A
Daly MJ
Dey K
Doughty BR
Eisenhaure TM
Engreitz JM
Epstein CB
Finucane HK
Fulco CP
Guckelberger P
Hacohen N
Huang HL
Jones TR
Kane M
Kang HY
Kundaje A
Lander ES
Lekschas F
Mualim K
Munson G
Nasser J
Natri HM
Nguyen TH
Patwardhan TA
Pfister H
Price AL
Ray JP
Ulirsch JC
Weeks EM
Xavier RJ
Publication date: 13/05/2021

Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease(1). Many of the underlying causal variants may affect enhancers(2,3), but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types(4). Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.

Helsingin yliopiston digitaalinen arkisto

Long non-coding RNAs: spatial amplifiers that control nuclear structure and gene expression

Author: A Beletskii
A Di Ruscio
A Kanhere
A Minajigi
A Monfort
A Mortazavi
A Necsulea
A Rego
A Wutz
AA Hyman
AC Ayupe
AI Lamond
AM Keohane
B Burke
B Bánfai
B Moindrot
BA Boggs
C Chu
C Davidovich
C Maison
C Mayer
C Naughton
C Trapnell
CA McHugh
CJ Brown
CM Clemson
D Bernard
D Smeets
DC He
DC Zappulla
DL Spector
DM Figueroa
DM Shechner
DP Patil
DW Dodd
E Hacisuleyman
E Lieberman-Aiden
E Splinter
EM Darrow
EPC Rocha
F Lai
F Mohammad
F Ramírez
FO Fackelmayer
G Biamonti
G Brawerman
G Csankovszki
G-L Chew
GD Penny
H Hezroni
H Maamar
H Sunwoo
H Tani
HJ Worman
I Gonzalez
I Kwon
J Chen
J Cheng
J Dekker
J Grant
J Kind
J Silva
J Zhao
JA Martens
JA Nickerson
JA West
Jesse M. Engreitz
JH Bergmann
JJ Quinn
JL Rinn
JM Engreitz
JR Alvarez-Dominguez
JR Chubb
JR Dixon
JR Prensner
JR Sanford
JT Lee
K Imamura
K Plath
KC Wang
KL Yap
KV Prasanth
L Bintu
L Giorgetti
L Sun
LL Chen
LL Hall
M Amendola
M Caudron-Herger
M Denholtz
M Ebisuya
M Feric
M Guttman
M Kato
M Melé
M-L Änkö
MB Clark
MC Tsai
MD Jacob
MD Simon
MG Guenther
Mitchell Guttman
MJ Wakefield
MML Soruco
MN Cabili
MS Bartolomei
N Brockdorff
N Khanna
NB Leontis
Noah Ollikainen
P Kapranov
PA Latos
PG Maass
Q-F Yin
R Galupa
R Helbig
R Miyagawa
RA Flynn
RB Lanz
RR Pandey
RS Nozawa
S Huang
S Kalantry
S Kaneko
S Loewer
S Schoeftner
S Schoenfelder
S Washietl
S Xiang
SF Banani
SH You
SP Shevtsov
SS Rao
ST da Rocha
SY Ng
T Cheutin
T Cremer
T Derrien
T Kino
T Melese
T Mondal
T Nagano
TS Mikkelsen
V Tripathi
VH Meller
WF Marshall
X Deng
X Wang
Y Gruenbaum
Y Hasegawa
Y Jeon
Y Marahrens
Y Shi
YS Mao
YW Yang
YY Shevelyov
Z Lu
Publication venue: Nature Publishing Group
Publication date: 01/12/2016

Over the past decade, it has become clear that mammalian genomes encode thousands of long non-coding RNAs (lncRNAs), many of which are now implicated in diverse biological processes. Recent work studying the molecular mechanisms of several key examples (including Xist, which orchestrates X chromosome inactivation) has provided new insights into how lncRNAs can control cellular functions by acting in the nucleus. Here we discuss emerging mechanistic insights into how lncRNAs can regulate gene expression by coordinating regulatory proteins, localizing to target loci and shaping three-dimensional (3D) nuclear organization. We explore these principles to highlight biological challenges in gene regulation, in which lncRNAs are well-suited to perform roles that cannot be carried out by DNA elements or protein regulators alone, such as acting as spatial amplifiers of regulatory signals in the nucleus.

Crossref

Caltech Authors

Publisher Correction: Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries.

Author: Alver Maris
Bloom Jonathan
Budoff Matthew
Chaffin Mark
Correa Adolfo
Cupples L Adrienne
Daly Mark J
Engreitz Jesse
Ernst Jason
Esko Tonu
Fu Mao
Ganna Andrea
Handsaker Robert E
Johnson W Craig
Kathiresan Sekar
Kellis Manolis
Manichaikul Ani
McCarroll Steven
Metspalu Andres
Mitchell Braxton D
Natarajan Pradeep
Neale Benjamin M
NHLBI TOPMed Lipids Working Group
Peloso Gina M
Post Wendy
Poterba Timothy
Rich Stephen S
Ripatti Samuli
Rotter Jerome I
Ruotsalainen Sanni
Ryan Kathleen A
Salomaa Veikko
Seed Cotton
Surakka Ida
Tsai Michael
Vasan Ramachandran S
Wilson James G
Yang Chaojie
Zekavat Seyedeh M
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

eScholarship - University of California

Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome

Author: A Esteve-Codina
A Li
A Necsulea
A Pauli
A Theocharidis
AE Kornienko
AI Su
BE Suzek
BJ Haas
BS Gloss
BT Roux
C Billerey
C Camacho
C Jiang
CC Hon
Charity Muriuki
CK Tuggle
CP Ponting
D Kim
David A. Hume
DM Bickhart
E Boutet
EL Clark
EL Gautier
Emily L. Clark
EY Scott
F De Santa
G Natoli
GSC Slater
H van Bakel
H Chen
H Jia
H Takahashi
I Yanai
Iseabail L. Farquhar
J Bouckenheimer
J Chen
J Mistry
J Wang
J Xia
JJ Qiu
JJ Quinn
JL Rinn
JM Engreitz
JS Mattick
JT Kung
JW Fickett
K Zhang
L Andersson
L Huminiecki
L Kong
L Ma
L Sun
L Wang
L Yu
LA Goff
LC Tsoi
LM McIntyre
LT Koufariotis
M Pertea
M Uhlen
Mary E. B. McCulloch
MK Iyer
MN Cabili
MR Bakhtiarizadeh
N Maeda
NE Ilott
NL Bray
P Carninci
P Johnsson
P Kapranov
P Rice
PJ Balwierz
R Andersson
R Karlic
R Weikard
RA Chodroff
RD Finn
RJ Kinsella
S van Dongen
S Katayama
S Kumar
SJ Bush
SJ Liu
Stephen J. Bush
T Derrien
T Ravasi
T Sing
T Steijger
TC Freeman
TK Kim
TR Cech
TR Mercer
UniProt Consortium
VB Bajic
VE Villegas
W Li
W Wu
XC Quek
XF Liu
Y Zhang
Y Zhao
YT Sasaki
ZY Zhou
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Additional file 3. This file contains all supplementary tables relating to lncRNA identification via the conservation of synteny. Table S3. lncRNAs inferred in one species by the genomic alignment of a transcript assembled with the RNA-seq libraries from a related species. Table S12. Presence of intergenic lncRNAs both in sheep and cattle, in regions of conserved synteny. Table S13. Presence of intergenic lncRNAs both in sheep and goat, in regions of conserved synteny. Table S14. Presence of intergenic lncRNAs both in cattle and goat, in regions of conserved synteny. Table S15. Presence of intergenic lncRNAs both in sheep and humans, in regions of conserved synteny. Table S16. Presence of intergenic lncRNAs both in goat and humans, in regions of conserved synteny. Table S17. Presence of intergenic lncRNAs both in cattle and humans, in regions of conserved synteny. Table S18. High-confidence lncRNA pairs, those conserved across species both sequentially and positionally.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

Oxford University Research Archive

FigShare