115 research outputs found

Ensembl Genomes 2016: more genomes, more complexity

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Edinburgh Research Explorer

PLAZA 4.0: an integrative resource for functional, evolutionary and comparative plant genomics

Author: Botzki Alexander
Coppens Frederik
Diels Tim
Kreft Lukasz
Van Bel Michiel
Van de Peer Yves
Vancaester Emmelien
Vandepoele Klaas
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

PLAZA (https://bioinformatics.psb.ugent.be/plaza) is a plant-oriented online resource for comparative, evolutionary and functional genomics. The PLAZA platform consists of multiple independent instances focusing on different plant clades, while also providing access to a consistent set of reference species. Each PLAZA instance contains structural and functional gene annotations, gene family data, phylogenetic trees and detailed gene colinearity information. A user-friendly web interface makes the necessary tools and visualizations accessible, specific for each data type. Here we present PLAZA 4.0, the latest iteration of the PLAZA framework. This version consists of two new instances (Dicots 4.0 and Monocots 4.0) providing a large increase in newly available species, and offers access to updated and newly implemented tools and visualizations, helping users with the ever-increasing demands for complex and in-depth analyses. The total number of species across both instances nearly doubles from 37 species in PLAZA 3.0 to 71 species in PLAZA 4.0, with a much broader coverage of crop species (e.g. wheat, palm oil) and species of evolutionary interest (e.g. spruce, Marchantia). The new PLAZA instances can also be accessed by a programming interface through a RESTful web service, thus allowing bioinformaticians to optimally leverage the power of the PLAZA platform.

Crossref

Ghent University Academic Bibliography

Archivsystem Ask23

UPSpace at the University of Pretoria

The ISB Cancer Genomics Cloud: A Flexible Cloud-Based Platform for Cancer Genomics Research.

Author: Backus Mark
Bingham Jonathan
Bookman Matthew
Deflaux Nicole
Dhankani Varsha
Gibbs David L
Hahn Abigail
Lee Phyliss
Leinonen Kalle
Longabaugh William J
Miller Michael
Paquette Suzanne M
Pihl Todd
Pot David
Reyes Madelyn
Reynolds Sheila M
Rodebaugh Zack
Shmulevich Ilya
Slagel Joseph
Publication venue: Providence St. Joseph Health Digital Commons
Publication date: 01/11/2017

The ISB Cancer Genomics Cloud (ISB-CGC) is one of three pilot projects funded by the National Cancer Institute to explore new approaches to computing on large cancer datasets in a cloud environment. With a focus on Data as a Service, the ISB-CGC offers multiple avenues for accessing and analyzing The Cancer Genome Atlas, TARGET, and other important references such as GENCODE and COSMIC using the Google Cloud Platform. The open approach allows researchers to choose approaches best suited to the task at hand: from analyzing terabytes of data using complex workflows to developing new analysis methods in common languages such as Python, R, and SQL; to using an interactive web application to create synthetic patient cohorts and to explore the wealth of available genomic data. Links to resources and documentation can be found at www.isb-cgc.or

Providence St. Joseph Health Digital Commons

Domestication of rice has reduced the occurrence of transposable elements within gene coding regions

Author: A Zuccolo
AD Ewing
AFA Smit
B McClintock
B Piegu
BF Zhu
BR Lu
C Trapnell
C Zhang
D Ellinghaus
D Lisch
DA Vaughan
DS Brar
E Wang
F Sabot
G Second
GS Khush
Guosheng Xie
H Morishima
HI Oka
J Jurka
J Wang
J Yu
J Zhang
JL Bennetzen
JS Ammiraju
JS Hawkins
Kai Guo
L Li
L Tang
Liangcai Peng
Lingqiang Wang
M Mitreva
M Sweeney
M Wang
MI Tenaillon
OF Linares
P Huang
P Viguier
Peng Chen
PJ Kersey
Q Zhu
QJ Zhang
RB Flavell
S Lockton
S Roffler
S Yan
SA Goff
SF Altschul
SL Chou
SR Wessler
Staffan Persson
T Sang
T Zhu
TT Chang
TT Hu
X Huang
X Wang
XH Zou
Xiaobo Zhu
Xukai Li
Y Wang
Y Xiong
Y-E Chu
Yanting Wang
Ying Li
Z Xu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

proGenomes2: an improved database for accurate and consistent habitat, taxonomic and functional annotations of prokaryotic genomes

Author: Bork P.
Coelho L.P.
Forslund S.K.
Hernández-Plaza A.
Huerta-Cepas J.
Letunic I.
Maistrenko O.M.
Mende D.R.
Milanese A.
Orakov A.N.
Paoli L.
Schmidt T.S.B.
Sunagawa S.
Zeller G.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/01/2020

Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/

MDC Repository

proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes

Author: Bork P.
Forslund K.
Huerta-Cepas J.
Letunic I.
Li S.S.
Mende D.R.
Sunagawa S.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/10/2016

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de

Repository for Publications and Research Data

PubMed Central

UNSWorks

MDC Repository

Online-Publikations-Server der Universität Würzburg

proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes

Author: Bork Peer
Ducarmon Quinten R
Fullam Anthony
Huerta-Cepas Jaime
Karcher Nicolai
Khedkar Supriya
Kuhn Michael
Larralde Martin
Letunic Ivica
Maistrenko Oleksandr M
Malfertheiner Lukas
Mende Daniel R
Milanese Alessio
Rodrigues Joao Frederico Matias
Sanchis-López Claudia
Schmidt Thomas S B
Schudoma Christian
Sunagawa Shinichi
Szklarczyk Damian
von Mering Christian
Zeller Georg
Publication venue: 'Oxford University Press (OUP)'
Publication date: 06/01/2023

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/

ZORA

CerealsDB 3.0: Expansion of resources and data integration

Author: Barker Gary L A
Bian Xingdong
Burridge Amanda
Caccamo Mario
Coghill Jane
Davey Robert
Edwards Keith
Przewieslik-Allen Sacha
Tyrrell Simon
Waterfall Christy
Wilkinson Paul A
Winfield Mark O
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/06/2016

BACKGROUND: The increase in human populations around the world has put pressure on resources, and as a consequence food security has become an important challenge for the 21st century. Wheat (Triticum aestivum) is one of the most important crops in human and livestock diets, and the development of wheat varieties that produce higher yields, combined with increased resistance to pests and resilience to changes in climate, has meant that wheat breeding has become an important focus of scientific research. In an attempt to facilitate these improvements in wheat, plant breeders have employed molecular tools to help them identify genes for important agronomic traits that can be bred into new varieties. Modern molecular techniques have ensured that the rapid and inexpensive characterisation of SNP markers and their validation with modern genotyping methods has produced a valuable resource that can be used in marker assisted selection. CerealsDB was created as a means of quickly disseminating this information to breeders and researchers around the globe. DESCRIPTION: CerealsDB version 3.0 is an online resource that contains a wide range of genomic datasets for wheat that will assist plant breeders and scientists to select the most appropriate markers for use in marker assisted selection. CerealsDB includes a database which currently contains in excess of a million putative varietal SNPs, of which several hundreds of thousands have been experimentally validated. In addition, CerealsDB also contains new data on functional SNPs predicted to have a major effect on protein function and we have constructed a web service to encourage data integration and high-throughput programmatic access. CONCLUSION: CerealsDB is an open access website that hosts information on SNPs that are considered useful for both plant breeders and research scientists. The recent inclusion of web services designed to federate genomic data resources allows the information on CerealsDB to be more fully integrated with the WheatIS network and other biological databases.

Crossref

PubMed Central

Explore Bristol Research

RNAcentral: a hub of information for non-coding RNA sequences

Author: Basu Siddhartha
Bateman Alex
Berardini Tanya Z
Billis Kostantinos
Blake Judith A
Boccaletto Pietro
Bruford Elspeth
Bujnicki Janusz M
Bult Carol J
Burkov Boris
Cannone Jamie J
Chan Patricia P
Chen Runsheng
Cherry J Michael
Cochrane Guy
Cole James
Davis Paul
Dinger Marcel
Emmert David
Engel Stacia R
Fey Petra
Finn Robert D
Frankish Adam
Gillespie Marc E
Gorodkin Jan
Griffiths-Jones Sam
Gutell Robin R
Hatzigeorgiou Artemis
He Shunmin
Howe Kevin
Huntley Rachael P
Kalvari Ioanna
Karagkouni Dimitra
Karlowski Wojciech M
Kay Simon
Kenmochi Naoya
Laulederkind Stanley J F
Lovering Ruth C
Lowe Todd M
Ma Lina
Marygold Steven J
Nawrocki Eric
Orlic-Milacic Marija
Paraskevopoulou Maria
Petrov Anton I
Rivas Elena
Rutherford Kim
Seal Ruth
Seemann Stefan E
Shimoyama Mary
Stadler Peter F
Sweeney Blake A
Szymanski Maciej
Team SILVA
Vandesompele Jo
Volders Pieter-Jan
Williams Kelly P
Wood Valerie
Yoshihama Maki
Zhang Zhang
Zhao Yi
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences, collating information on ncRNA sequences of all types from a broad range of organisms. We have recently added a new genome mapping pipeline that identifies genomic locations for ncRNA sequences in 296 species. We have also added several new types of functional annotations, such as tRNA secondary structures, Gene Ontology annotations, and miRNA-target interactions. A new quality control mechanism based on Rfam family assignments identifies potential contamination, incomplete sequences, and more. The RNAcentral database has become a vital component of many workflows in the RNA community, serving as both the primary source of sequence data for academic and commercial groups, as well as a source of stable accessions for the annotation of genomic and functional features. These examples are facilitated by an improved RNAcentral web interface, which features an updated genome browser, a new sequence feature viewer, and improved text search functionality. RNAcentral is freely available at https://rnacentral.org

Ghent University Academic Bibliography

eScholarship - University of California

UNSWorks