    Semantic Segmentation of Sorghum Using Hyperspectral Data Identifies Genetic Associations

    This study evaluates a range of approaches to semantic segmentation of hyperspectral images of sorghum plants, classifying each pixel as either nonplant or belonging to one of three organ types (leaf, stalk, panicle). While many current segmentation methods focus on separating plant pixels from background, organ-specific segmentation makes it feasible to measure a wider range of plant properties. Manually scored training data for a set of hyperspectral images collected from a sorghum association population were used to train and evaluate a set of supervised classification models. Many algorithms show acceptable accuracy for this classification task. Algorithms trained on sorghum data accurately classify maize leaves and stalks, but fail to accurately classify maize reproductive organs, which are not directly equivalent to sorghum panicles. Trait measurements extracted from semantic segmentation of sorghum organs can be used both to identify genes known to control variation in previously measured phenotypes (e.g., panicle size and plant height) and to identify signals for genes controlling traits not previously quantified in this population (e.g., stalk/leaf ratio). Organ-level semantic segmentation provides opportunities to identify genes controlling variation in a wide range of morphological phenotypes in sorghum, maize, and other related grain crops.

    USE OF CLUSTERING TECHNIQUES FOR PROTEIN DOMAIN ANALYSIS

    Next-generation sequencing has allowed many new protein sequences to be identified. However, the sheer volume of sequence data limits the ability to determine the structure and function of most of these newly identified proteins. Inferring the function of proteins and the relationships between them is possible with traditional alignment-based phylogeny, but this requires at least one shared subsequence. Without such a subsequence, no meaningful alignments between the protein sequences are possible. The entire protein set (or proteome) of an organism contains many unrelated proteins, so at this level the necessary similarity does not occur. An alternative method of understanding relationships within diverse sets of proteins is therefore needed. Related proteins generally share key conserved subsequences, called domains. Proteins that share several common domains can be inferred to have similar function. We refer to the set of all domains that a protein has as the protein’s domain architecture. We present a technique that clusters proteins sharing identical domain architecture. Each match of a domain to a protein carries a confidence estimate (e.g., the E-value), and this confidence varies widely. By setting a threshold for what is considered an acceptable match, domains with weak similarities can be ignored. By changing this E-value threshold, the clustering patterns and relationships between proteins can be analyzed. Clusters may merge or split as their domain architecture shifts based on this threshold. By studying the relationships between clusters from one iteration to the next as the threshold is made more stringent, phylogeny-like networks can be constructed. The technique thus clusters together proteins with identical domain architecture and also illustrates relationships among clusters with similar architecture. It was tested on the multi-domain Regulator of G-protein Signaling family, and the output is consistent with the known functional subdivisions of this protein family. The technique is also considerably faster than typical alignment-based phylogenetic reconstruction on this family. Use of the technique at the proteome level was also tested using bacterial proteome data from Bacillus subtilis. Advisors: Stephen Scott, Etsuko Moriyam
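
The thresholded clustering described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the protein names, domain names, E-values, and both function names are hypothetical, and the architecture is modeled as an unordered set of domain names, following the abstract's definition. Clusters found at a loose threshold are linked to the clusters their members fall into at a stricter one, which gives the edges of a small phylogeny-like network.

```python
from collections import defaultdict

# Hypothetical domain hits: protein -> list of (domain name, E-value).
HITS = {
    "P1": [("RGS", 1e-40), ("DEP", 1e-12)],
    "P2": [("RGS", 1e-35), ("DEP", 1e-3)],
    "P3": [("RGS", 1e-30)],
    "P4": [("PDZ", 1e-20), ("RGS", 1e-25)],
}


def cluster_by_architecture(hits, max_evalue):
    """Group proteins whose sets of accepted domains are identical.

    A domain is accepted when its E-value is at or below max_evalue.
    """
    clusters = defaultdict(list)
    for protein, domains in hits.items():
        architecture = frozenset(
            name for name, evalue in domains if evalue <= max_evalue
        )
        clusters[architecture].append(protein)
    return {arch: sorted(members) for arch, members in clusters.items()}


def link_clusters(loose, strict):
    """Connect each loose-threshold cluster to the strict-threshold
    clusters that contain at least one of its members."""
    edges = set()
    for loose_arch, loose_members in loose.items():
        for strict_arch, strict_members in strict.items():
            if set(loose_members) & set(strict_members):
                edges.add((loose_arch, strict_arch))
    return edges


loose = cluster_by_architecture(HITS, max_evalue=1e-2)
strict = cluster_by_architecture(HITS, max_evalue=1e-10)
edges = link_clusters(loose, strict)
```

With this toy data, tightening the threshold drops the weak DEP match on P2, so the RGS+DEP cluster splits and P2 joins P3 in the RGS-only cluster. Treating the architecture as a set ignores domain order and repeats; a tuple of domains in sequence order would be the natural alternative if those matter.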

    A UAV‐based high‐throughput phenotyping approach to assess time‐series nitrogen responses and identify trait‐associated genetic components in maize

    Advancements in the use of genome‐wide markers have provided unprecedented opportunities for dissecting the genetic components that control phenotypic trait variation. However, cost‐effectively characterizing agronomically important phenotypic traits on a large scale remains a bottleneck. Unmanned aerial vehicle (UAV)‐based high‐throughput phenotyping has recently become a prominent method, as it allows large numbers of plants to be analyzed in a time‐series manner. In this experiment, 233 inbred lines from the maize (Zea mays L.) diversity panel were grown in the field under different nitrogen treatments. UAV images were collected at different plant developmental stages throughout the growing season. A workflow was developed for extracting plot‐level images, filtering images to remove nonfoliage elements, and calculating canopy coverage and greenness ratings based on vegetation indices (VIs). Applying the workflow yielded about 100,000 plot‐level image clips for 12 different time points. High correlations were detected between VIs and ground‐truth physiological and yield‐related traits. A genome‐wide association study identified 29 unique genomic regions associated with image‐extracted traits from two or more of the 12 time points. A candidate gene, Zm00001d031997, a maize homolog of the Arabidopsis HCF244 (high chlorophyll fluorescence 244), located underneath the leading single nucleotide polymorphisms of the canopy coverage‐associated signals, was repeatedly detected under both nitrogen conditions. The plot‐level time‐series phenotypic data and the trait‐associated genes provide great opportunities to advance plant science and to facilitate plant breeding.
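
As a rough illustration of the canopy coverage step, the sketch below computes the fraction of foliage pixels in a plot clip from a vegetation index. The abstract does not say which indices were used, so the Excess Green index (ExG = 2g - r - b on chromatic coordinates), the 0.1 threshold, the sample pixel values, and the function names are all assumptions made for this example.

```python
def excess_green(r, g, b):
    """Excess Green index on chromatic coordinates: ExG = 2g - r - b,
    where r, g, b are each channel divided by the channel sum."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: treat as non-vegetation
    return (2 * g - r - b) / total


def canopy_coverage(pixels, threshold=0.1):
    """Fraction of pixels whose ExG exceeds the (assumed) threshold."""
    if not pixels:
        return 0.0
    foliage = sum(1 for r, g, b in pixels if excess_green(r, g, b) > threshold)
    return foliage / len(pixels)


# Hypothetical plot clip: two green foliage pixels, two brownish soil pixels.
PLOT = [(40, 120, 30), (50, 140, 45), (120, 100, 80), (130, 110, 90)]
coverage = canopy_coverage(PLOT)
```

Here the two soil pixels have an ExG of zero and fall below the threshold, so half of the clip counts as canopy. In practice the threshold would need tuning against ground-truth data for each imaging condition.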