39 research outputs found
BrAPI - an application programming interface for plant breeding applications
Motivation: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. Results: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written, that take advantage of the emerging support of BrAPI by many databases
GridScore: a tool for accurate, cross-platform phenotypic data collection and visualization
Background: Plant breeding and crop research rely on experimental phenotyping trials. These trials generate data for large numbers of traits and plant varieties that need to be captured efficiently and accurately to support further research and downstream analysis. Traditionally scored by hand, phenotypic data is nowadays collected using spreadsheets or specialized apps. While many solutions exist that increase efficiency and reduce errors, none offers the same familiarity as printed field plans, which have been used for decades and offer an intuitive overview of the trial setup, previously recorded data and plots still requiring scoring. Results: We introduce GridScore, which utilizes cutting-edge web technologies to reproduce the familiarity of printed field plans while enhancing the phenotypic data collection process by adding advanced features like georeferencing, image tagging and speech recognition. GridScore is a cross-platform open-source plant phenotyping app that combines barcode-based systems with a guided data collection approach while offering a top-down view onto the data collected in a field layout. GridScore is compared to existing tools across a wide spectrum of criteria including support for barcodes, multiple platforms, and visualizations. Conclusion: Compared to its competition, GridScore shows strong performance across the board, offering a complete manual phenotyping experience.
From bits to bites: Advancement of the Germinate platform to support prebreeding informatics for crop wild relatives
Management and distribution of experimental data from prebreeding projects
is important to ensure uptake of germplasm into breeding and research programs.
Being able to access and share this data in standard formats is essential.
The adoption of a common informatics platform for crops that may have limited
resources brings economies of scale, allowing common informatics components
to be used across multiple species. The close integration of such a platform with
commonly used breeding software, visualization, and analysis tools reduces the
barrier for entry to researchers and provides a common framework to facilitate
collaborations and data sharing. This work presents significant updates to the
Germinate platform and highlights its value in distributing prebreeding data for
14 crops as part of the project "Adapting Agriculture to Climate Change: Collecting,
Protecting and Preparing Crop Wild Relatives" (hereafter Crop Trust Crop
Wild Relatives project) led by the Crop Trust (https://www.cwrdiversity.org). The
addition of data on these species complements data already publicly available in
Germinate. We present a suite of updated Germinate features using examples
from these crop species and their wild relatives. The use of Germinate within the
Crop Trust Crop Wild Relatives project demonstrates the usefulness of the system
and the benefits a shared informatics platform provides. These data resources
provide a foundation on which breeding and research communities can develop
additional online resources for their crops, harness new data as it becomes available,
and benefit collectively from future developments of the Germinate platform
Tripal, a community update after 10 years of supporting open source, standards-based genetic, genomic and breeding databases
Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here, we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles
Gigwa v2 - Extended and improved genotype investigator
The study of genetic variations is the basis of many research domains in biology. From genome structure to population dynamics, many applications involve the use of genetic variants. The advent of next-generation sequencing technologies led to such a flood of data that the daily work of scientists is often more focused on data management than data analysis. This mass of genotyping data poses several computational challenges in terms of storage, search, sharing, analysis, and visualization. While existing tools try to solve these challenges, few of them offer a comprehensive and scalable solution.
Gigwa v2 is an easy-to-use, species-agnostic web application for managing and exploring high-density genotyping data. It can handle multiple databases and may be installed on a local computer or deployed as an online data portal. It supports various standard import and export formats, provides advanced filtering options, and offers means to visualize density charts or push selected data into various stand-alone or online tools. It implements 2 standard RESTful application programming interfaces, GA4GH, which is health-oriented, and BrAPI, which is breeding-oriented, thus offering wide possibilities of interaction with third-party applications. The project home page provides a list of live instances allowing users to test the system on public data (or reasonably sized user-provided data).
This new version of Gigwa provides a more intuitive and more powerful way to explore large amounts of genotyping data by offering a scalable solution to search for genotype patterns, functional annotations, or more complex filtering. Furthermore, its user-friendliness and interoperability make it widely accessible to the life science community
Data management in multi-disciplinary African RTB crop breeding programs
Quality phenotype and genotype data are important for the success of a breeding program. Like most programs, African breeding programs generate large multi-disciplinary phenotypic and genotypic datasets from several locations that must be carefully managed through the use of an appropriate database management
system (DBMS) in order to generate reliable and accurate information for breeding decisions. A DBMS is essential in data collection, storage, retrieval, validation, curation and analysis in plant breeding programs to enhance the ultimate goal of increasing genetic gain. The International Institute of Tropical Agriculture (IITA),
working on roots, tubers and banana (RTB) crops such as cassava, yam, banana and plantain, has deployed a FAIR-compliant (Findable, Accessible, Interoperable, Reusable) database, BREEDBASE. The functionalities of this database in data management and analysis have been instrumental in achieving breeding goals. Standard
Operating Procedures (SOP) for each breeding process have been developed to allow a cognitive walkthrough for users. This has further helped to increase the usage and enhance the acceptability of the system. The wide acceptability gained among breeders in global cassava research programs has resulted in improvements in the precision and quality of genotype and phenotype data, and subsequent improvement in achievement of breeding program goals. Several innovative gender responsive approaches and initiatives have identified users and their preferences which have informed improved customer and product profiles. A remaining bottleneck is the effective linking of data on preferences and social information of crop users with technical breeding data to make this process more effective
Applying FAIR Principles to plant phenotypic data management in GnpIS
GnpIS is a data repository for plant phenomics that stores whole field and greenhouse experimental data including environment measures. It allows long-term access to datasets following the FAIR principles: Findable, Accessible, Interoperable, and Reusable, by using a flexible and original approach. It is based on a generic and ontology driven data model and an innovative software architecture that uncouples data integration, storage, and querying. It takes advantage of international standards including the Crop Ontology, MIAPPE, and the Breeding API. GnpIS allows handling data for a wide range of species and experiment types, including multiannual perennial plants experimental network or annual plant trials with either raw data, i.e., direct measures, or computed traits. It also ensures the integration and the interoperability among phenotyping datasets and with genotyping data. This is achieved through a careful curation and annotation of the key resources conducted in close collaboration with the communities providing data. Our repository follows the Open Science data publication principles by ensuring citability of each dataset. Finally, GnpIS compliance with international standards enables its interoperability with other data repositories hence allowing data links between phenotype and other data types. GnpIS can therefore contribute to emerging international federations of information systems
Introducing the Brassica Information Portal: Towards integrating genotypic and phenotypic Brassica crop data [version 1; referees: 2 approved]
The Brassica Information Portal (BIP) is a centralised repository for Brassica phenotypic data. Trait data associated with Brassica research and breeding experiments conducted on Brassica crops, used as vegetables, for livestock fodder and biofuels, is hosted on the site, together with information on the experimental plant materials used, as well as trial design. BIP is an open access and open source project, built on the schema of CropStoreDB, and as such can provide trait data management strategies for any crop data. A new user interface and programmatic submission/retrieval system helps to simplify data access for scientists and breeders. BIP opens up the opportunity to apply big data analyses to data generated by the Brassica Research Community. Here, we present a short description of the current status of the repository
Crop Ontology Governance and Stewardship Framework
A governance & stewardship framework for the Crop Ontology Project is required as this is a collaborative tool developed by a Community of Practice. Over the last 12 years of its existence, it has increased significantly in scope and use. Collecting and storing plant trait data and annotating the data with ontology terms is widely accepted by the crop science community to be critical to enable data interoperability and interexchange through tools such as the Breeding API (BrAPI). The Crop Ontology Community of Practice is organised around roles, curation principles and validation processes that require a formal description. A governance framework is defined by the various actors involved in the asset's design, development and maintenance. It is complemented by a quality assurance process to ensure that trust levels, value creation, and sustainability objectives meet appropriate quality levels. The general principles underlying data governance are integrity, transparency, accountability and ownership, stewardship, standardization, change management and a robust data audit
Data sharing and ontology use among agricultural genetics, genomics, and breeding databases and resources of the AgBioData Consortium
Over the last several decades, there has been rapid growth in the number and
scope of agricultural genetics, genomics and breeding (GGB) databases and
resources. The AgBioData Consortium (https://www.agbiodata.org/) currently
represents 44 databases and resources covering model or crop plant and animal
GGB data, ontologies, pathways, genetic variation and breeding platforms
(referred to as 'databases' throughout). One of the goals of the Consortium is
to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data
management and the integration of datasets which requires data sharing, along
with structured vocabularies and/or ontologies. Two AgBioData working groups,
focused on Data Sharing and Ontologies, conducted a survey to assess the status
and future needs of the members in those areas. A total of 33 researchers
responded to the survey, representing 37 databases. Results suggest that data
sharing practices by AgBioData databases are in a healthy state, but it is not
clear whether this is true for all metadata and data types across all
databases; and that ontology use has not substantially changed since a similar
survey was conducted in 2017. We recommend 1) providing training for database
personnel in specific data sharing techniques, as well as in ontology use; 2)
further study on what metadata is shared, and how well it is shared among
databases; 3) promoting an understanding of data sharing and ontologies in the
stakeholder community; 4) improving data sharing and ontologies for specific
phenotypic data types and formats; and 5) lowering specific barriers to data
sharing and ontology use, by identifying sustainability solutions, and the
identification, promotion, or development of data standards. Combined, these
improvements are likely to help AgBioData databases increase development
efforts towards improved ontology use, and data sharing via programmatic means. Comment: 17 pages, 8 figures