16 research outputs found

Genome modeling system: A knowledge management platform for genomics

Author: Abbott Benjamin S
Abbott Travis E
Ainscough Benjamin J
Belter Edward A
Brummett Anthony M
Burnett Mark M
Callaway Matthew B
Carmichael Lynn K
Chen Ken
Clark Eric
Coffman Adam C
Das Indraniel
Dees Nathan D
Derickson Brian R
Ding Li
Dooling David J
Du Feiyu
Dukes Adam
Eldred James M
Fan Xian
Ferguson Ian T
Griffith Malachi
Griffith Obi L
Harris Christopher C
Hawkins Amy E
Hepler Todd G
Hundal Jasreet
Kandoth Cyriac
Kim Kyung H
Kiwala Michael J
Koboldt Daniel C
Larson David E
Leonard Shawn M
Lolofie Justin T
Long Robert L
Lu Charles
Magrini Vincent J
Maher Christopher A
Maher Nicole
Mardis Elaine R
McLellan Michael D
McMichael Joshua F
Miller Christopher A
Mooney Thomas P
Morton David L
Nutter Nathaniel G
Oberkfell Ben J
Peck Joshua B
Pohl Craig S
Ramu Avinash
Regier Allison A
Sanderson Gabriel E
Schierding William S
Schroeder William E
Shi Xiaoqi
Skidmore Zachary L
Smith Scott M
Stiehr Gary
Walker Jason R
Weible James V
Weil Matthew R
Wilson Richard K
Wohlstadter Richard W
Wylie Todd N
Publication venue: Digital Commons@Becker
Publication date: 01/01/2015

In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms

Crossref

Directory of Open Access Journals

Digital Commons@Becker

PubMed Central

FigShare

Medjugorje: Finding Peace at the Heart of Conflict

Author: Agnew J. A.
Buttimer A.
Connell J. T.
Eliade M.
Girard G.
Greeley A. M.
Helman C. G.
Hill J.
James M. Jurkovich
Johnson E. A.
Laurentin R.
Levine G. J.
Ley D.
Malcolm N.
Miravalle M.I.
Mol H.
Nolan M. L.
Pervan T.
Relph E.
Rodgers P.
Rooney L.
Rupčić L.
Silber L.
Sopher D. E.
Sumption J.
Turner V.
Turner V. W.
Vukonić B.
Weible W.
Weightman B.
Wilbert M. Gesler
Wilhelm A.J.
Young A.
Publication venue: 'Wiley'

Crossref

Models of local governing coalitions: city politics and policy effects in Spanish municipalities

Crossref

Inter-sectoral action to support healthy and environmentally sustainable food behaviours: a study of sectoral knowledge, governance and implementation opportunities

Author: A Jones
A McMichael
A Meybeck
AC Hoek
AD Jones
Annet C. Hoek
B Burlingame
B Crammond
B Seed
BE Millen
C Hawkes
C Mithril
C Richards
C Van Dooren
CM Weible
D Armstrong
D Mebratu
David Pearson
DEFRA
E Mertens
FJ He
G Jenkin
H Trevena
I Ayres
I Hawkins
J Buttriss
J Elkington
J Shill
J Wegener
JL Buttriss
KD Brownell
KJ Gile
L Barosh
L Reisch
L Wellesley
LL Sharma
M Garcia Martinez
MA Lawrence
Mark A. Lawrence
N Auestad
N Gunningham
NG Nylen
OB Carter
P Drahos
R MacRae
S Friel
S Galbraith-Emami
S James
Sarah W. James
Sharon Friel
T Lang
TN Basit
V Braithwaite
Publication venue: 'Springer Science and Business Media LLC'

Crossref

DGIdb: mining the druggable genome

Author: A Gaulton
Adam C Coffman
AL Hopkins
AP Davis
AP Orth
AP Russ
C Knox
Christopher A Miller
D Maglott
David E Larson
David J Dooling
E Lim
E Lounkine
Elaine R Mardis
EM McDonagh
F Zhu
G Manning
Indraniel Das
J Barretina
J Lamb
J von Eichborn
James Koval
James M Eldred
James V Weible
Janakiraman Subramanian
Jason R Walker
Josh F McMichael
Li Ding
M Ashburner
M Kuhn
M Punta
M Rask-Andersen
Malachi Griffith
Matthew B Callaway
N Hecker
N Somaiah
Nicholas C Spies
Obi L Griffith
P Flicek
P Yeh
PJ Stephens
R Bose
Ramaswamy Govindan
RD Kumar
Richard K Wilson
Ron Bose
Runjun D Kumar
S Banerji
S Hunter
S Preissner
Scott M Smith
SP Shah
T Liu
Timothy J Ley
W Yang
Y Wang
Z Gao
Z Kan
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Is Collaboration a Good Investment? Modeling the Impact of Government Support for Nonprofit Collaborative Watershed Management Councils.

Author: A Gelman
Andrew Gelman
Aurelie Cosandey-Godin
C Ansell
C M Weible
C W Thomas
Carol B Griffin
Cary Coglianese
Christian P Robert
Christopher K Wikle
Curtis G Cude
David B Dunson
E P Weber
Edzer J Pebesma
Finn Lindgren
G Carr
Geoffrey Habron
Gomez-Rubio
H P Huntington
Havard Rue
J D Donahue
J E Innes
J M Wondolleck
James Lesage
James S Clark
James S Hodges
Jens Newig
John A Mclaughlin
John Hoornbeek
John M Bryson
José M Bernardo
Judith A Layzer
Julien Beguin
K Emerson
Ke Xu
L M Salamon
Leonard Bickman
Lubell
M Lubell
M Schneider
M T Imperial
Mark Lubell
Marta Blangiardo
Maxine E Dakins
Michael Hibbard
Michael R Meador
Michela Cameletti
Nikolic
Noel Cressie
P A Sabatier
P J Johnes
R Agranoff
R D Margerum
R O'
R W Skaggs
Ramiro Berardo
Richard D Margerum
Robert J Hijmans
Roger S Bivand
Rue
S L Yaffee
Sara Martino
Scott D Hardy
Steve Brooks
Steven Smith
Susan Lurie
T M Koontz
Thomas C Beierle
Thomas I Gunton
Tong
Trevor J Ward
Tyler A Scott
Tyler Scott
V Ostrom
W D Leach
William D Leach
Xinhao Wang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Crossref

HCC1395 (“TST1”) example input, models, and outputs.

Author: Adam C. Coffman (767645)
Adam Dukes (767660)
Allison A. Regier (767646)
Amy E. Hawkins (767672)
Anthony M. Brummett (767643)
Avinash Ramu (392869)
Ben J. Oberkfell (767647)
Benjamin J. Ainscough (650163)
Benjamin S. Abbott (767668)
Brian R. Derickson (767662)
Charles Lu (33589)
Christopher A. Maher (228763)
Christopher A. Miller (231205)
Christopher C. Harris (553952)
Craig S. Pohl (767682)
Cyriac Kandoth (746224)
Daniel C. Koboldt (213793)
David E. Larson (246516)
David J. Dooling (767685)
David L. Morton (767656)
Edward A. Belter (767651)
Elaine R. Mardis (13400)
Eric Clark (767669)
Feiyu Du (767652)
Gabriel E. Sanderson (767648)
Gary Stiehr (767681)
Ian T. Ferguson (767655)
Indraniel Das (767670)
James M. Eldred (767684)
James V. Weible (767658)
Jason R. Walker (767683)
Jasreet Hundal (767663)
Joshua B. Peck (767659)
Joshua F. McMichael (659850)
Justin T. Lolofie (767661)
Ken Chen (61463)
Kyung H. Kim (188027)
Li Ding (16240)
Lynn K. Carmichael (767678)
Malachi Griffith (66161)
Mark M. Burnett (767657)
Matthew B. Callaway (767642)
Matthew R. Weil (767679)
Michael D. McLellan (25925)
Michael J. Kiwala (767644)
Nathan D. Dees (362863)
Nathaniel G. Nutter (767650)
Nicole Maher (767666)
Obi L. Griffith (63659)
Richard K. Wilson (13401)
Richard W. Wohlstadter (767680)
Robert L. Long (767653)
Scott M. Smith (268582)
Shawn M. Leonard (767675)
Thomas P. Mooney (767649)
Todd G. Hepler (767673)
Todd N. Wylie (767674)
Travis E. Abbott (767654)
Vincent J. Magrini (767667)
William E. Schroeder (767676)
William S. Schierding (767665)
Xian Fan (767671)
Xiaoqi Shi (767677)
Zachary L. Skidmore (767664)

A test dataset for the HCC1395 cell line is provided with the GMS software to allow testing of software installation, and facilitate further development. It is also used to illustrate much of the current functionality of the GMS. HCC1395 tumor and the corresponding HCC1395BL ‘normal’ cell line DNA and RNA samples were sequenced by whole genome, exome, and RNA-seq methods producing six sets of instrument data for input to various GMS pipelines. Additional required inputs for the pipelines include a reference genome (e.g., GRCh37), gene annotations (e.g., Ensembl 67_37l), and variant databases (e.g., dbSNP37). Different versions (processing profiles) of the reference alignment were used to align WGS and exome DNA reads to the reference genome. A separate RNA-seq pipeline similarly aligns RNA reads. Alternate versions of the somatic variation pipeline are used to call various types of variants from exome and WGS data by comparing tumor and normal reference alignments. A differential expression pipeline identifies significantly altered transcript expression levels by comparing the tumor and normal RNA-seq alignments. Finally, the MedSeq pipeline summarizes all upstream pipelines into a single convenient result set. This includes a multitude of reports and visualizations for single nucleotide variants (SNVs), Indels (insertions and deletions), SVs (structural variants), CNVs (copy number variations), transcript fusions, differentially expressed genes, alternatively expressed isoforms, and much more. Data types are further integrated to, for example, identify which variants at the DNA level are expressed at the RNA level and which events affect known cancer driver genes or druggable targets.
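The pipeline dependencies described above can be sketched as a small directed graph. This is only an illustration, not GMS code: the model names are hypothetical labels, and the assumption that MedSeq consumes the two somatic variation results and the differential expression result directly is an inference from the description. Python's standard `graphlib` module then yields one valid order in which the builds could run.

```python
from graphlib import TopologicalSorter

# Hypothetical model names; each key maps to the set of models it depends on.
# The six models with no dependencies correspond to the six sets of
# instrument data (WGS, exome, RNA-seq for tumor and normal).
dependencies = {
    "wgs_tumor_refalign": set(),
    "wgs_normal_refalign": set(),
    "exome_tumor_refalign": set(),
    "exome_normal_refalign": set(),
    "rnaseq_tumor": set(),
    "rnaseq_normal": set(),
    # Somatic variation compares tumor and normal reference alignments.
    "wgs_somatic_variation": {"wgs_tumor_refalign", "wgs_normal_refalign"},
    "exome_somatic_variation": {"exome_tumor_refalign", "exome_normal_refalign"},
    # Differential expression compares tumor and normal RNA-seq alignments.
    "differential_expression": {"rnaseq_tumor", "rnaseq_normal"},
    # Assumption: MedSeq summarizes the three downstream analyses.
    "medseq": {
        "wgs_somatic_variation",
        "exome_somatic_variation",
        "differential_expression",
    },
}

# static_order() returns the nodes so that every model appears after
# all of the models it depends on.
order = list(TopologicalSorter(dependencies).static_order())

for step, name in enumerate(order, start=1):
    print(step, name)
```

Because MedSeq depends, directly or transitively, on every other model in this sketch, it always comes last in the resulting order.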

FigShare

Terminology for the Genome Modeling System.

Author: Adam C. Coffman (767645)
Adam Dukes (767660)
Allison A. Regier (767646)
Amy E. Hawkins (767672)
Anthony M. Brummett (767643)
Avinash Ramu (392869)
Ben J. Oberkfell (767647)
Benjamin J. Ainscough (650163)
Benjamin S. Abbott (767668)
Brian R. Derickson (767662)
Charles Lu (33589)
Christopher A. Maher (228763)
Christopher A. Miller (231205)
Christopher C. Harris (553952)
Craig S. Pohl (767682)
Cyriac Kandoth (746224)
Daniel C. Koboldt (213793)
David E. Larson (246516)
David J. Dooling (767685)
David L. Morton (767656)
Edward A. Belter (767651)
Elaine R. Mardis (13400)
Eric Clark (767669)
Feiyu Du (767652)
Gabriel E. Sanderson (767648)
Gary Stiehr (767681)
Ian T. Ferguson (767655)
Indraniel Das (767670)
James M. Eldred (767684)
James V. Weible (767658)
Jason R. Walker (767683)
Jasreet Hundal (767663)
Joshua B. Peck (767659)
Joshua F. McMichael (659850)
Justin T. Lolofie (767661)
Ken Chen (61463)
Kyung H. Kim (188027)
Li Ding (16240)
Lynn K. Carmichael (767678)
Malachi Griffith (66161)
Mark M. Burnett (767657)
Matthew B. Callaway (767642)
Matthew R. Weil (767679)
Michael D. McLellan (25925)
Michael J. Kiwala (767644)
Nathan D. Dees (362863)
Nathaniel G. Nutter (767650)
Nicole Maher (767666)
Obi L. Griffith (63659)
Richard K. Wilson (13401)
Richard W. Wohlstadter (767680)
Robert L. Long (767653)
Scott M. Smith (268582)
Shawn M. Leonard (767675)
Thomas P. Mooney (767649)
Todd G. Hepler (767673)
Todd N. Wylie (767674)
Travis E. Abbott (767654)
Vincent J. Magrini (767667)
William E. Schroeder (767676)
William S. Schierding (767665)
Xian Fan (767671)
Xiaoqi Shi (767677)
Zachary L. Skidmore (767664)

Brief descriptions of critical objects in the Genome Modeling System.

FigShare

Key concepts of the GMS.

Author: Adam C. Coffman (767645)
Adam Dukes (767660)
Allison A. Regier (767646)
Amy E. Hawkins (767672)
Anthony M. Brummett (767643)
Avinash Ramu (392869)
Ben J. Oberkfell (767647)
Benjamin J. Ainscough (650163)
Benjamin S. Abbott (767668)
Brian R. Derickson (767662)
Charles Lu (33589)
Christopher A. Maher (228763)
Christopher A. Miller (231205)
Christopher C. Harris (553952)
Craig S. Pohl (767682)
Cyriac Kandoth (746224)
Daniel C. Koboldt (213793)
David E. Larson (246516)
David J. Dooling (767685)
David L. Morton (767656)
Edward A. Belter (767651)
Elaine R. Mardis (13400)
Eric Clark (767669)
Feiyu Du (767652)
Gabriel E. Sanderson (767648)
Gary Stiehr (767681)
Ian T. Ferguson (767655)
Indraniel Das (767670)
James M. Eldred (767684)
James V. Weible (767658)
Jason R. Walker (767683)
Jasreet Hundal (767663)
Joshua B. Peck (767659)
Joshua F. McMichael (659850)
Justin T. Lolofie (767661)
Ken Chen (61463)
Kyung H. Kim (188027)
Li Ding (16240)
Lynn K. Carmichael (767678)
Malachi Griffith (66161)
Mark M. Burnett (767657)
Matthew B. Callaway (767642)
Matthew R. Weil (767679)
Michael D. McLellan (25925)
Michael J. Kiwala (767644)
Nathan D. Dees (362863)
Nathaniel G. Nutter (767650)
Nicole Maher (767666)
Obi L. Griffith (63659)
Richard K. Wilson (13401)
Richard W. Wohlstadter (767680)
Robert L. Long (767653)
Scott M. Smith (268582)
Shawn M. Leonard (767675)
Thomas P. Mooney (767649)
Todd G. Hepler (767673)
Todd N. Wylie (767674)
Travis E. Abbott (767654)
Vincent J. Magrini (767667)
William E. Schroeder (767676)
William S. Schierding (767665)
Xian Fan (767671)
Xiaoqi Shi (767677)
Zachary L. Skidmore (767664)

The genome modeling system is architected around the idea of a ‘genome model’. The following vignettes illustrate key concepts integral to these models: (A) A subject can be modeled multiple times, possibly each with distinct ‘processing profiles’. For example, two different models can be defined for the HCC1395 genome using the ‘reference alignment’ pipeline. In Model 1, the processing profile specifies the use of BWA for alignment and Samtools for variant detection. In Model 2, Bowtie2 and GATK are used for these steps instead. (B) A given processing profile can be used across a group of models, ensuring, for instance, that all subjects in a cohort are processed in similar ways. In this example, two different cell line genomes (HCC1395 and XY2123) have models defined of the exact same type, using the processing profile with BWA/Samtools specified. (C) A model has no results until a build is generated. If the model is updated to have new inputs, a new build is required. Builds are immutable snapshots of modeling pipeline results. In this example, the HCC1395 genome has a reference alignment model again making use of the BWA/Samtools profile. However, as new instrument data becomes available, new builds are constructed to reflect the most complete data. (D) When models are used as inputs for other models, the last complete build for the input model is used as an input for the downstream build. In this example, both tumor and normal genomes are available for an individual (in this case HCC1395). Reference alignment models are built for each sample and then both are used as inputs for a third ‘somatic variation’ model. In reality, it is the underlying data in the reference alignment builds that are used to create a somatic variation build, identifying all variants that are thought to be tumor specific.
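The relationships in vignettes (A) to (D) can be sketched with a few small classes. This is a minimal illustration under stated assumptions, not the GMS implementation or its API: the class names, fields, the "Succeeded" status string, and the lane labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ProcessingProfile:
    """Hypothetical: names a pipeline and the tools it is configured with."""
    pipeline: str
    aligner: str = ""
    variant_caller: str = ""


@dataclass(frozen=True)
class Build:
    """Immutable snapshot of the inputs a model had when it was built."""
    model_name: str
    inputs: tuple
    status: str = "Succeeded"


@dataclass
class Model:
    name: str
    subject: str
    profile: ProcessingProfile
    inputs: list = field(default_factory=list)
    builds: list = field(default_factory=list)

    def last_complete_build(self) -> Optional[Build]:
        # (C) A model has no results until a build exists.
        for build in reversed(self.builds):
            if build.status == "Succeeded":
                return build
        return None

    def build(self) -> Build:
        resolved = []
        for item in self.inputs:
            if isinstance(item, Model):
                # (D) An input model contributes its last complete build.
                upstream = item.last_complete_build()
                if upstream is None:
                    raise ValueError(f"{item.name} has no complete build")
                resolved.append(upstream)
            else:
                resolved.append(item)
        snapshot = Build(self.name, tuple(resolved))
        self.builds.append(snapshot)
        return snapshot


# (B) One profile shared by several models.
bwa_samtools = ProcessingProfile("reference alignment", "BWA", "Samtools")
tumor = Model("tumor refalign", "HCC1395", bwa_samtools, ["lane1"])
normal = Model("normal refalign", "HCC1395BL", bwa_samtools, ["lane2"])

first = tumor.build()
tumor.inputs.append("lane3")   # new instrument data arrives
second = tumor.build()         # (C) so a new build is required
normal.build()

somatic = Model("somatic", "HCC1395",
                ProcessingProfile("somatic variation"), [tumor, normal])
somatic_build = somatic.build()
```

In this sketch the first tumor build keeps its original single input even after more data is added, and the somatic variation build records the second tumor build rather than the tumor model itself, mirroring the statement that the underlying builds, not the models, are what feed a downstream build.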

FigShare

Overview of the GMS.

Author: Adam C. Coffman (767645)
Adam Dukes (767660)
Allison A. Regier (767646)
Amy E. Hawkins (767672)
Anthony M. Brummett (767643)
Avinash Ramu (392869)
Ben J. Oberkfell (767647)
Benjamin J. Ainscough (650163)
Benjamin S. Abbott (767668)
Brian R. Derickson (767662)
Charles Lu (33589)
Christopher A. Maher (228763)
Christopher A. Miller (231205)
Christopher C. Harris (553952)
Craig S. Pohl (767682)
Cyriac Kandoth (746224)
Daniel C. Koboldt (213793)
David E. Larson (246516)
David J. Dooling (767685)
David L. Morton (767656)
Edward A. Belter (767651)
Elaine R. Mardis (13400)
Eric Clark (767669)
Feiyu Du (767652)
Gabriel E. Sanderson (767648)
Gary Stiehr (767681)
Ian T. Ferguson (767655)
Indraniel Das (767670)
James M. Eldred (767684)
James V. Weible (767658)
Jason R. Walker (767683)
Jasreet Hundal (767663)
Joshua B. Peck (767659)
Joshua F. McMichael (659850)
Justin T. Lolofie (767661)
Ken Chen (61463)
Kyung H. Kim (188027)
Li Ding (16240)
Lynn K. Carmichael (767678)
Malachi Griffith (66161)
Mark M. Burnett (767657)
Matthew B. Callaway (767642)
Matthew R. Weil (767679)
Michael D. McLellan (25925)
Michael J. Kiwala (767644)
Nathan D. Dees (362863)
Nathaniel G. Nutter (767650)
Nicole Maher (767666)
Obi L. Griffith (63659)
Richard K. Wilson (13401)
Richard W. Wohlstadter (767680)
Robert L. Long (767653)
Scott M. Smith (268582)
Shawn M. Leonard (767675)
Thomas P. Mooney (767649)
Todd G. Hepler (767673)
Todd N. Wylie (767674)
Travis E. Abbott (767654)
Vincent J. Magrini (767667)
William E. Schroeder (767676)
William S. Schierding (767665)
Xian Fan (767671)
Xiaoqi Shi (767677)
Zachary L. Skidmore (767664)

The genome modeling system (GMS) is implemented to use a federated disk SAN, with meta-data stored in a PostgreSQL relational database. Sample management tools allow the import of new samples and instrument data. Data are then processed through various analysis pipelines (e.g., reference alignment, somatic variation detection, etc.) that in turn are managed and monitored by a workflow system (Box 1: http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004274#box001). Stand-alone GMS tools, not part of automated pipelines, are available through a common tool tree. Most components of the system can be accessed through an Ubuntu Linux command-line interface or Ruby-on-Rails web interface.

FigShare