36 research outputs found

    The Agricultural Genome to Phenome Initiative (AG2PI): creating a shared vision across crop and livestock research communities

    Predicting phenotype from genotype is a central challenge in biology. By using genomic information to predict and improve traits, scientists can address the challenges and opportunities of achieving sustainable genetic improvement of complex, economically important traits in agriculturally relevant species. Converting the enormous, recent technical advances in all areas of genomics and phenomics into sustained and ecologically responsible improvements in food and fuel production is complex. It will require engaging agricultural genome to phenome (G2P) experts, drawing from a broad community, including crop and livestock scientists and essential integrative disciplines (e.g., engineers, economists, data and social scientists). To achieve this vision, the USDA NIFA-funded project inaugurating the Agricultural Genome to Phenome Initiative (AG2PI) is working to develop a cohesive vision for agricultural G2P research by identifying research gaps and opportunities, advancing community solutions to these challenges and gaps, and rapidly disseminating findings to the broader community. Towards these ends, this AG2PI project is organizing virtual field days, conferences, and training workshops, and awarding seed grants to conceive new insights (details at www.ag2pi.org). Since October 2020, more than 10,000 unique participants from every inhabited continent have engaged in these activities. To illustrate AG2PI’s scope, we present survey results on agricultural G2P research needs and opportunities, highlighting opinions and suggestions for the future. We invite stakeholders interested in this complex but critical effort to help create an optimal, sustainable food supply for society and challenge the community to add to our vision for future accomplishments by a fully actualized AG2PI enterprise.

    The effect of artificial selection on phenotypic plasticity in maize

    Remarkable productivity has been achieved in crop species through artificial selection and adaptation to modern agronomic practices. Whether intensive selection has changed the ability of improved cultivars to maintain high productivity across variable environments is unknown. Understanding the genetic control of phenotypic plasticity and genotype by environment (G × E) interaction will enhance crop performance predictions across diverse environments. Here we use data generated from the Genomes to Fields (G2F) Maize G × E project to assess the effect of selection on G × E variation and characterize polymorphisms associated with plasticity. Genomic regions putatively selected during modern temperate maize breeding explain less variability for yield G × E than unselected regions, indicating that improvement by breeding may have reduced G × E of modern temperate cultivars. Trends in genomic position of variants associated with stability reveal fewer genic associations and enrichment of variants 0–5000 base pairs upstream of genes, hypothetically due to control of plasticity by short-range regulatory elements.

    Maize Genomes to Fields: 2014 and 2015 field season genotype, phenotype, environment, and inbred ear image datasets

    Objectives: Crop improvement relies on analysis of phenotypic, genotypic, and environmental data. Given large, well-integrated, multi-year datasets, diverse queries can be made: Which lines perform best in hot, dry environments? Which alleles of specific genes are required for optimal performance in each environment? Such datasets also can be leveraged to predict cultivar performance, even in uncharacterized environments. The maize Genomes to Fields (G2F) Initiative is a multi-institutional organization of scientists working to generate and analyze such datasets from existing, publicly available inbred lines and hybrids. G2F’s genotype by environment project has released 2014 and 2015 datasets to the public, with 2016 and 2017 collected and soon to be made available. Data description: Datasets include DNA sequences; traditional phenotype descriptions, as well as detailed ear, cob, and kernel phenotypes quantified by image analysis; weather station measurements; and soil characterizations by site. Data are released as comma separated value spreadsheets accompanied by extensive README text descriptions. For genotypic and phenotypic data, both raw data and a version with outliers removed are reported. For weather data, two versions are reported: a full dataset calibrated against nearby National Weather Service sites and a second calibrated set with outliers and apparent artifacts removed.

    Integrated genomic characterization of oesophageal carcinoma

    Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

    Integrated genomic characterization of pancreatic ductal adenocarcinoma

    We performed integrated genomic, transcriptomic, and proteomic profiling of 150 pancreatic ductal adenocarcinoma (PDAC) specimens, including samples with characteristic low neoplastic cellularity. Deep whole-exome sequencing revealed recurrent somatic mutations in KRAS, TP53, CDKN2A, SMAD4, RNF43, ARID1A, TGFβR2, GNAS, RREB1, and PBRM1. KRAS wild-type tumors harbored alterations in other oncogenic drivers, including GNAS, BRAF, CTNNB1, and additional RAS pathway genes. A subset of tumors harbored multiple KRAS mutations, with some showing evidence of biallelic mutations. Protein profiling identified a favorable prognosis subset with low epithelial-mesenchymal transition and high MTOR pathway scores. Associations of non-coding RNAs with tumor-specific mRNA subtypes were also identified. Our integrated multi-platform analysis reveals a complex molecular landscape of PDAC and provides a roadmap for precision medicine.

    A hypothesis-driven approach to assessing significance of differences in RNA expression levels among specific groups of genes

    Genome-wide molecular gene expression studies generally compare expression values for each gene across multiple conditions, followed by cluster and gene set enrichment analysis to determine whether differentially expressed genes are enriched in specific biochemical pathways, cellular components, biological processes, and/or molecular functions. This approach to analyzing differences in gene expression enables discovery of gene function, but it is not useful for determining whether pre-defined groups of genes share or diverge in their expression patterns in response to treatments, or for assessing the correctness of pre-defined gene set groupings. Here we present a simple method that changes the dimension of comparison by treating genes as variable traits to directly assess significance of differences in expression levels among pre-defined gene groups. Because expression distributions are typically skewed (thus unfit for direct assessment using Gaussian statistical methods), our method involves transforming expression data to approximate a normal distribution, dividing the genes into groups, and then applying Gaussian parametric methods to assess significance of observed differences. This method enables the assessment of differences in gene expression distributions within and across samples, enabling hypothesis-based comparison among groups of genes. We demonstrate this method by assessing the significance of specific gene groups’ differential response to heat stress conditions in maize.
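
The transform-then-compare idea in this abstract can be sketched roughly as follows. The abstract does not name the transformation or the parametric test, so the log2 transform with a pseudocount and Welch's t statistic used here are assumptions chosen for illustration, and the gene groups and expression values are hypothetical.

```python
import math
import statistics


def log_transform(values, pseudocount=1.0):
    """Apply log2(x + pseudocount) to reduce right skew (assumed transform)."""
    return [math.log2(v + pseudocount) for v in values]


def welch_t(group_a, group_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    if se2_a + se2_b == 0:
        # Both groups are constant; report no difference for equal means.
        return 0.0, float(n_a + n_b - 2)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical expression values for two pre-defined gene groups in one sample.
heat_shock_genes = [820.0, 1500.0, 640.0, 2300.0, 990.0]
housekeeping_genes = [110.0, 95.0, 130.0, 150.0, 80.0]

t_stat, dof = welch_t(
    log_transform(heat_shock_genes), log_transform(housekeeping_genes)
)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Here the genes within each group play the role of observations, which is the change of dimension the abstract describes. A p-value would come from a t distribution with `dof` degrees of freedom, which needs a statistics library beyond the Python standard library.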

    Computing on Phenotypic Descriptions for Candidate Gene Discovery and Crop Improvement

    Many newly observed phenotypes are first described, then experimentally manipulated. These language-based descriptions appear in both the literature and in community datastores. To standardize phenotypic descriptions and enable simple data aggregation and analysis, controlled vocabularies and specific data architectures have been developed. Such simplified descriptions have several advantages over natural language: they can be rigorously defined for a particular context or problem, they can be assigned and interpreted programmatically, and they can be organized in a way that allows for semantic reasoning (inference of implicit facts). Because researchers generally report phenotypes in the literature using natural language, curators have been translating phenotypic descriptions into controlled vocabularies for decades to make the information computable. Unfortunately, this methodology is highly dependent on human curation, which does not scale to the scope of all publications available across all of plant biology. Simultaneously, researchers in other domains have been working to enable computation on natural language. This has resulted in new, automated methods for computing on language that are now available, with early analyses showing great promise. Natural language processing (NLP) coupled with machine learning (ML) allows for the use of unstructured language for direct analysis of phenotypic descriptions. Indeed, we have found that these automated methods can be used to create data structures that perform as well as or better than those generated by human curators on tasks such as predicting gene function and biochemical pathway membership. Here, we describe current and ongoing efforts to provide tools for the plant phenomics community to explore novel predictions that can be generated using these techniques. We also describe how these methods could be used along with mobile speech-to-text tools to collect and analyze in-field spoken phenotypic descriptions for association genetics and breeding applications.
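
As a rough illustration of computing directly on free-text phenotype descriptions, the sketch below scores pairs of descriptions by bag-of-words cosine similarity. This is a deliberately simple stand-in and not the NLP/ML pipeline the authors describe; the gene identifiers and description strings are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical free-text phenotype descriptions keyed by made-up gene IDs.
descriptions = {
    "gene_1": "dwarf plant with short internodes and dark green leaves",
    "gene_2": "short plant with compact internodes and dark leaves",
    "gene_3": "kernels are shrunken and pale yellow",
}

sim_12 = cosine_similarity(descriptions["gene_1"], descriptions["gene_2"])
sim_13 = cosine_similarity(descriptions["gene_1"], descriptions["gene_3"])
print(f"gene_1 vs gene_2: {sim_12:.2f}")
print(f"gene_1 vs gene_3: {sim_13:.2f}")
```

Genes whose descriptions score as similar could then be treated as candidates for shared function or pathway membership. A practical system would likely replace raw word counts with weighted or learned text representations.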