Data sharing and ontology use among agricultural genetics, genomics, and breeding databases and resources of the AgBioData Consortium

Author: Berardini Tanya Z.
Clarke Jennifer L.
Cooper Laurel D.
Elser Justin
Farmer Andrew D.
Ficklin Stephen
Kumari Sunita
Laporte Marie-Angélique
Nelson Rex T.
Poelchau Monica F.
Sadohara Rie
Selby Peter
Sen Taner Z.
Thessen Anne E.
Whitehead Brandon
Publication date: 17/07/2023

Over the last several decades, there has been rapid growth in the number and scope of agricultural genetics, genomics and breeding (GGB) databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing, along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, conducted a survey to assess the status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data sharing practices by AgBioData databases are in a healthy state, but it is not clear whether this is true for all metadata and data types across all databases; and that ontology use has not substantially changed since a similar survey was conducted in 2017. We recommend 1) providing training for database personnel in specific data sharing techniques, as well as in ontology use; 2) further study on what metadata is shared, and how well it is shared among databases; 3) promoting an understanding of data sharing and ontologies in the stakeholder community; 4) improving data sharing and ontologies for specific phenotypic data types and formats; and 5) lowering specific barriers to data sharing and ontology use, by identifying sustainability solutions, and the identification, promotion, or development of data standards. Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use, and data sharing via programmatic means. Comment: 17 pages, 8 figures

arXiv.org e-Print Archive

The Gene Ontology in 2010: extensions and refinements

Author: Abdulla Amina
Aslett Martin
Balakrishnan Rama
Basu Siddhartha
Berardini Tanya Z.
Binkley Gail
Blake Judith A.
Botstein David
Bridges Susan
Bult Carol
Burgess Shane
Bushmanova Yulia
Carbon Seth
Chan Juancarlos
Cherry J. Michael
Chibucos Marcus
Chisholm Rex L.
Christie Karen R.
Collmer Candace
Costanzo Maria C.
D'Eustachio Peter
Deegan Jennifer I.
Diehl Alexander D.
Dolan Mary
Dolinski Kara
Drabkin Harold
Eilbeck Karen
Engel Stacia R.
Eppig Janan T.
Feltrin Erika
Fey Petra
Fisk Dianna G.
Gaudet Pascale
Gene Ontology Consortium
Gwinn-Giglio Michelle
Hannick Linda
Harris Midori A.
Hill David P.
Hirschman Jodi E.
Hitz Benjamin C.
Hong Eurie L.
Hu James C.
Huala Eva
Ireland Amelia
Jaiswal Pankaj
Khodiyar Varsha K.
Kibbe Warren
Kishore Ranjana
Krieger Cynthia J.
Laulederkind Stan
Lewis Suzanna E.
Li Donghui
Livstone Michael S.
Lomax Jane
Lovering Ruth C.
Madupu Ramana
Matthews Lisa
McCarthy Fiona
McIntosh Brenley
Miyasato Stuart R.
Mungall Christopher J.
Nash Robert S.
Ni Li
Oughtred Rose
Park Julie
Renfro Daniel
Ringwald Martin
Shimoyama Mary
Siegele Deborah A.
Sitnikov Dmitry
Skrzypek Marek S.
Sternberg Paul
Talmud Philippa J.
Torto-Alalibo Trudy
Twigger Simon
Valle Giorgio
Van Auken Kimberly
Weng Shuai
Wong Edith D.
Wood Valerie
Wortman Jennifer
Zweifel Adrienne
Publication venue: 'Oxford University Press (OUP)'

The Gene Ontology (GO) Consortium (http://www.geneontology.org) (GOC) continues to develop, maintain and use a set of structured, controlled vocabularies for the annotation of genes, gene products and sequences. The GO ontologies are expanding both in content and in structure. Several new relationship types have been introduced and used, along with existing relationships, to create links between and within the GO domains. These improve the representation of biology, facilitate querying, and allow GO developers to systematically check for and correct inconsistencies within the GO. Gene product annotation using GO continues to increase both in the number of total annotations and in species coverage. GO tools, such as OBO-Edit, an ontology-editing tool, and AmiGO, the GOC ontology browser, have seen major improvements in functionality, speed and ease of use. This is the publisher’s final pdf. The published article is copyrighted by the author(s) and published by Oxford University Press. The published article can be found at: http://nar.oxfordjournals.org/

ScholarsArchive@OSU

The Gene Ontology (GO) Cellular Component Ontology: integration with SAO (Subcellular Anatomy Ontology) and other recent developments.

Author: Berardini Tanya Z
Drabkin Harold J
Foulger Rebecca E
Hill David P
Imam Fahim T
Lomax Jane
Martone Maryann E
Mungall Christopher J
Roncaglia Paola
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2013

BACKGROUND: The Gene Ontology (GO) (http://www.geneontology.org/) contains a set of terms for describing the activity and actions of gene products across all kingdoms of life. Each of these activities is executed in a location within a cell or in the vicinity of a cell. In order to capture this context, the GO includes a sub-ontology called the Cellular Component (CC) ontology (GO-CCO). The primary use of this ontology is for GO annotation, but it has also been used for phenotype annotation, and for the annotation of images. Another ontology with similar scope to the GO-CCO is the Subcellular Anatomy Ontology (SAO), part of the Neuroscience Information Framework Standard (NIFSTD) suite of ontologies. The SAO also covers cell components, but in the domain of neuroscience. DESCRIPTION: Recently, the GO-CCO was enriched in content and links to the Biological Process and Molecular Function branches of GO as well as to other ontologies. This was achieved in several ways. We carried out an amalgamation of SAO terms with GO-CCO ones; as a result, nearly 100 new neuroscience-related terms were added to the GO. The GO-CCO also contains relationships to GO Biological Process and Molecular Function terms, as well as connecting to external ontologies such as the Cell Ontology (CL). Terms representing protein complexes in the Protein Ontology (PRO) reference GO-CCO terms for their species-generic counterparts. GO-CCO terms can also be used to search a variety of databases. CONCLUSIONS: In this publication we provide an overview of the GO-CCO, its overall design, and some recent extensions that make use of additional spatial information. One of the most recent developments of the GO-CCO was the merging in of the SAO, resulting in a single unified ontology designed to serve the needs of GO annotators as well as the specific needs of the neuroscience community

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Springer - Publisher Connector

PubMed Central

UCL Discovery

eScholarship - University of California

Data sharing and ontology use among agricultural genetics, genomics, and breeding databases and resources of the AgBioData Consortium

Author: Berardini Tanya Z
Clarke Jennifer L
Cooper Laurel D
Elser Justin
Farmer Andrew D
Ficklin Stephen
Kumari Sunita
Laporte Marie-Angélique
Nelson Rex T
Poelchau Monica F
Sadohara Rie
Selby Peter
Sen Taner Z
Thessen Anne E
Whitehead Brandon
Publication venue: eScholarship, University of California
Publication date: 15/11/2023

Over the last couple of decades, there has been a rapid growth in the number and scope of agricultural genetics, genomics and breeding databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources (https://www.agbiodata.org/databases) covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing, along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, respectively, conducted a Consortium-wide survey to assess the current status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data-sharing practices by AgBioData databases are in a fairly healthy state, but it is not clear whether this is true for all metadata and data types across all databases; and that ontology use has not substantially changed since a similar survey was conducted in 2017. Based on our evaluation of the survey results, we recommend (i) providing training for database personnel in specific data-sharing techniques, as well as in ontology use; (ii) further study on what metadata is shared, and how well it is shared among databases; (iii) promoting an understanding of data sharing and ontologies in the stakeholder community; (iv) improving data sharing and ontologies for specific phenotypic data types and formats; and (v) lowering specific barriers to data sharing and ontology use, by identifying sustainability solutions, and the identification, promotion, or development of data standards. Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use, and data sharing via programmatic means. Database URL: https://www.agbiodata.org/databases

eScholarship - University of California

Crowdsourcing biocuration: The Community Assessment of Community Annotation with Ontologies (CACAO).

Experimental data about gene functions curated from the primary literature have enormous value for research scientists in understanding biology. Using the Gene Ontology (GO), manual curation by experts has provided an important resource for studying gene function, especially within model organisms. Unprecedented expansion of the scientific literature and validation of the predicted proteins have increased both data value and the challenges of keeping pace. Capturing literature-based functional annotations is limited by the ability of biocurators to handle the massive and rapidly growing scientific literature. Within the community-oriented wiki framework for GO annotation called the Gene Ontology Normal Usage Tracking System (GONUTS), we describe an approach to expand biocuration through crowdsourcing with undergraduates. This multiplies the number of high-quality annotations in international databases, enriches our coverage of the literature on normal gene function, and pushes the field in new directions. From an intercollegiate competition judged by experienced biocurators, Community Assessment of Community Annotation with Ontologies (CACAO), we have contributed nearly 5,000 literature-based annotations. Many of those annotations are to organisms not currently well-represented within GO. Over a 10-year history, our community contributors have spurred changes to the ontology not traditionally covered by professional biocurators. The CACAO principle of relying on community members to participate in and shape the future of biocuration in GO is a powerful and scalable model used to promote the scientific enterprise. It also provides undergraduate students with a unique and enriching introduction to critical reading of primary literature and acquisition of marketable skills

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

RNAcentral: a comprehensive database of non-coding RNA sequences

Author: Basu Siddhartha
Bateman Alex
Berardini Tanya Z.
Bruford Elspeth A.
Bujnicki Janusz M.
Cannone Jamie J.
Chai Benli
Chan Patricia P.
Chen Runsheng
Cherry J. Michael
Clark Michael
Cochrane Guy
Cole James R.
Dinger Marcel E.
Engel Stacia R.
Stadler Peter F.
Fey Petra
Finn Robert D.
Frankish Adam
Hatzigeorgiou Artemis G.
Gray Kristian A.
Griffiths-Jones Sam
Gutell Robin R.
Howe Kevin L.
Huala Eva
Kalvari Ioanna
Karlowski Wojciech M.
Kay Simon J. E.
Kenmochi Naoya
Kersey Paul J.
Kozomara Ana
Lau Britney Y.
Lowe Todd M.
Ma Lina
Machnicka Magdalena A.
McDonald Daniel
Mestdagh Pieter
Paraskevopoulou Maria D.
Petrov Anton I.
Putz Joern
Quek Xiu Cheng
Szymanski Maciej
Vlachos Ioannis S.
Volders Pieter-Jan
Williams Kelly P.
Wood Valerie
Wower Jacek
Yoshihama Maki
Zhang Zhang
Zhao Yi
Zhu Weimin
Zwieb Christian W.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/

univOAK