106 research outputs found
New Approaches to Use Genomics, Field Traits, and High-throughput Phenotyping for Gene Discovery in Maize (Zea mays)
Maize is one of the major crop species worldwide. With abundant genetic resources and genomic tools, maize also serves as a model species for understanding genetic diversity, facilitating the development of trait extraction algorithms, and mapping candidate functional genes. Since the first version of the widely used B73 reference genome was released, independent research groups in the maize community have propagated seeds themselves for further research. However, unexpected or occasional contamination may occur during this process. The first study in this thesis used public RNA-seq data of B73 from 27 research groups across three countries to call single nucleotide polymorphisms (SNPs). These SNPs were used to investigate the distance of the 27 maize B73 samples from the reference genome, and three major clades were defined to determine their original sources. Maize is also a plant with a clear architecture. The second study employed high-throughput plant phenotyping to dissect plant phenotypes using computer vision methods. A total of 32 maize inbreds distributed by the Genomes to Fields project were imaged daily by four types of cameras (RGB, hyperspectral, fluorescence, and thermal-IR) for approximately one month. Differences between computer vision measurements and manual measurements of plant fresh biomass were evaluated. Broad-sense heritability was estimated for measurements extracted from images. The expanded range of plant phenotypes made available by imaging provides broader opportunities for connecting phenotypic variants with genetic variants. The third study used phenome-wide variation in the maize Goodman-Buckler 282 association panel to scan for associations with genetic variants of annotated genes along the maize genome. Genes detected by the proposed model, Genome-Phenome Wide Association Study (GPWAS), differ significantly from genes detected by conventional GWAS. GPWAS genes tend to be more functionally conserved and more similar to classical maize mutants with known functions. Results from these studies help answer questions about the genetic purity of a single maize genotype. Methods developed in this thesis can also provide a valuable reference for trait discovery from images and for candidate functional gene identification using a broad set of phenotypes.
Adviser: James C. Schnabl
RNA-Seq Based Analysis of Population Structure within the Maize Inbred B73
Recent reports have shown that many identically named genetic lines used in research around the world actually contain large amounts of uncharacterized genetic variation as a result of cross contamination of stocks, unintentional crossing, residual heterozygosity within original stocks, or de novo mutation. Twenty-seven public, large-scale RNA-seq datasets from 20 independent research groups around the world were used to assess variation within the maize (Zea mays ssp. mays) inbred B73, a four-decade-old variety which served as the reference genotype for the original maize genome sequencing project and is widely used in genetic, genomic, and phenotypic research. Several clearly distinct clades were identified among putatively B73 samples. A number of these clades were defined by the presence of clearly defined genomic blocks containing a haplotype which did not match the published B73 reference genome. The overall proportion of the maize genome where multiple distinct haplotypes were observed across different research groups was approximately 2.3%. In some cases the relationship among B73 samples generated by different research groups recapitulated mentor/mentee relationships within the maize genetics community
Boosting Conversational Question Answering with Fine-Grained Retrieval-Augmentation and Self-Check
Retrieval-Augmented Generation (RAG) aims to generate more reliable and
accurate responses by augmenting large language models (LLMs) with vast and
dynamic external knowledge. Most previous work focuses on using RAG
for single-round question answering, while how to adapt RAG to the complex
conversational setting wherein the question is interdependent on the preceding
context is not well studied. In this paper, we propose a conversation-level RAG
approach, which incorporates fine-grained retrieval augmentation and self-check
for conversational question answering (CQA). In particular, our approach
consists of three components, namely conversational question refiner,
fine-grained retriever and self-check based response generator, which work
collaboratively for question understanding and relevant information acquisition
in conversational settings. Extensive experiments demonstrate the great
advantages of our approach over the state-of-the-art baselines. Moreover, we
also release a Chinese CQA dataset with new features including reformulated
question, extracted keyword, retrieved paragraphs and their helpfulness, which
facilitates further research in RAG-enhanced CQA
Enhancing Hybrid Prediction in Pearl Millet Using Genomic and/or Multi-Environment Phenotypic Information of Inbreds
Genomic selection (GS) is an emerging methodology that helps select superior lines among experimental cultivars in plant breeding programs. It offers the opportunity to increase the productivity of cultivars by delivering increased genetic gains and reducing the breeding cycles. This methodology requires inexpensive and sufficiently dense marker information to be successful, and with whole genome sequencing, it has become an important tool in many crops. The recent assembly of the pearl millet genome has made it possible to employ GS models to improve the selection procedure in pearl millet breeding programs. Here, three GS models were implemented and compared using grain yield and dense molecular marker information of pearl millet obtained from two different genotyping platforms (C [conventional GBS RAD-seq] and T [tunable GBS tGBS]). The models were evaluated using three different cross-validation (CV) schemes mimicking real situations that breeders face in breeding programs: CV2 resembles an incomplete field trial, CV1 predicts the performance of untested hybrids, and CV0 predicts the performance of hybrids in unobserved environments. We found that (i) adding phenotypic information of parental inbreds to the calibration sets improved predictive ability, (ii) accounting for genotype-by-environment interaction also increased the performance of the models, and (iii) superior strategies should consider the use of the molecular markers derived from the T platform (tGBS)
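The three cross-validation schemes described above differ only in which records are withheld from the calibration set. A minimal sketch of the splits follows, assuming each record is a hypothetical `(hybrid, environment)` pair; the function names, the 20% test fraction, and the toy data are illustrative assumptions and are not taken from the study.

```python
import random

def cv2_split(records, test_fraction=0.2, seed=0):
    """CV2: withhold random hybrid-by-environment cells (incomplete field trial)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def cv1_split(records, test_fraction=0.2, seed=0):
    """CV1: withhold every record of some hybrids (untested hybrids)."""
    rng = random.Random(seed)
    hybrids = sorted({hybrid for hybrid, _ in records})
    rng.shuffle(hybrids)
    n_test = max(1, int(len(hybrids) * test_fraction))
    held_out = set(hybrids[:n_test])
    train = [r for r in records if r[0] not in held_out]
    test = [r for r in records if r[0] in held_out]
    return train, test

def cv0_split(records, test_env):
    """CV0: withhold every record from one environment (unobserved environment)."""
    train = [r for r in records if r[1] != test_env]
    test = [r for r in records if r[1] == test_env]
    return train, test

# Toy design (assumption): 10 hybrids, each observed in 3 environments.
records = [(f"H{i}", f"E{j}") for i in range(10) for j in range(3)]
```

Under CV1 no hybrid appears in both sets, and under CV0 no environment does, which is what makes these two schemes harder prediction problems than CV2.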
Fault Tolerant Free Gait and Footstep Planning for Hexapod Robot Based on Monte-Carlo Tree
Legged robots can pass through complex field environments by selecting gaits
and discrete footholds carefully. Traditional methods plan gait and foothold
separately and treat each as a single-step optimization. However, this
approach results in poor passability in sparse-foothold environments. This
paper proposes a coordinated planning method for hexapod robots that
regards the planning of gait and foothold as a sequence optimization problem
and treats the harshness of the environment as a leg fault. The Monte Carlo
tree search (MCTS) algorithm is used to optimize the entire sequence. Two
methods, FastMCTS and SlidingMCTS, are proposed to address some defects of
standard MCTS when applied to legged robot planning. The proposed planning
algorithm incorporates a fault-tolerant gait method to improve passability.
Finally, experiments on terrains with different densities of footholds and on
an artificially designed challenging terrain are carried out to verify our
methods and compare them with other planning methods. All results show that
the proposed method dramatically improves the hexapod robot's ability to pass
through environments with sparse footholds
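For readers unfamiliar with MCTS, the selection step of the standard algorithm is commonly implemented with the UCB1 rule, sketched below. This is the textbook rule only, not FastMCTS or SlidingMCTS, whose details the abstract does not give; the function names, the exploration constant, and the `(total_reward, visits)` layout are assumptions made for illustration.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean reward plus an exploration bonus for rarely tried children."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Return the key of the child with the highest UCB1 score.

    `children` maps a candidate action (for example, a hypothetical
    gait-and-foothold choice) to a (total_reward, visits) tuple.
    """
    return max(children, key=lambda name: ucb1(*children[name], parent_visits))
```

The constant `c` trades off exploiting actions that have scored well against exploring actions that have been tried only a few times.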
Non-homology-based prediction of gene functions in maize (Zea mays ssp. mays)
Advances in genome sequencing and annotation have eased the difficulty of identifying new gene sequences. Predicting the functions of these newly identified genes remains challenging. Genes descended from a common ancestral sequence are likely to have common functions. As a result, homology is widely used for gene function prediction. This means functional annotation errors also propagate from one species to another. Several approaches based on machine learning classification algorithms were evaluated for their ability to accurately predict gene function from non-homology gene features. Among the eight supervised classification algorithms evaluated, random forest-based prediction consistently provided the most accurate gene function prediction. Non-homology-based functional annotation provides complementary strengths to homology-based annotation, with higher average performance in Biological Process GO terms, the domain where homology-based functional annotation performs the worst, and weaker performance in Molecular Function GO terms, the domain where the accuracy of homology-based functional annotation is highest. GO prediction models trained with homology-based annotations were able to successfully predict annotations from a manually curated “gold standard” GO annotation set. Non-homology-based functional annotation based on machine learning may ultimately prove useful both as a method to assign predicted functions to orphan genes which lack functionally characterized homologs, and to identify and correct functional annotation errors which were propagated through homology-based functional annotations
Conventional and hyperspectral time-series imaging of maize lines widely used in field trials
Background: Maize (Zea mays ssp. mays) is 1 of 3 crops, along with rice and wheat, responsible for more than one-half of all calories consumed around the world. Increasing the yield and stress tolerance of these crops is essential to meet the growing need for food. The cost and speed of plant phenotyping are currently the largest constraints on plant breeding efforts. Datasets linking new types of high-throughput phenotyping data collected from plants to the performance of the same genotypes under agronomic conditions across a wide range of environments are essential for developing new statistical approaches and computer vision–based tools. Findings: A set of maize inbreds, primarily recently off-patent lines, were phenotyped using a high-throughput platform at University of Nebraska-Lincoln. These lines have been previously subjected to high-density genotyping and scored for a core set of 13 phenotypes in field trials across 13 North American states in 2 years by the Genomes 2 Fields Consortium. A total of 485 GB of image data including RGB, hyperspectral, fluorescence, and thermal infrared photos has been released. Conclusions: Correlations between image-based measurements and manual measurements demonstrated the feasibility of quantifying variation in plant architecture using image data. However, naive approaches to measuring traits such as biomass can introduce nonrandom measurement errors confounded with genotype variation. Analysis of hyperspectral image data demonstrated unique signatures from stem tissue. Integrating heritable phenotypes from high-throughput phenotyping data with field data from different environments can reveal previously unknown factors that influence yield plasticity
Mucinous intrahepatic cholangiocarcinoma: a distinct variant
Mucinous variant of intrahepatic cholangiocarcinoma (iCC) is rare, and its clinicopathological features and prognosis are far less clear. Six patients who had iCCs with more than 50% of mucinous component and 79 conventional iCCs were included in the study. The mean size of mucinous and conventional iCCs was 6.2 cm and 6.0 cm, respectively. The majority of patients (83%) with mucinous iCC presented at T3 stage or above, compared to 28% of the conventional group (p < 0.01). Three patients with mucinous iCC (50%) died within 1 year. The average survival time of patients with mucinous iCCs was significantly reduced compared to that of conventional group (9 months vs 2 years; P < .001). Immunohistochemistry was performed on 6 mucinous and 12 conventional iCCs with matched age, sex and stage, which revealed positive immunoreactivity in MUC1 (83% vs 58%), MUC2 (33% vs 17%), MUC5AC (100% vs 42%), MUC6 (50% vs 0), CK7 (83% vs 83%), CK20 (0 vs 17%), and CDX2 (17% vs 0) in mucinous and conventional iCCs, respectively. Molecular studies showed one mucinous iCC with KRAS G12C mutation and no BRAF or IDH1/2 mutations. Mucinous iCC is a unique variant that constitutes 7.2% of iCCs. It is more immunoreactive for MUC1, MUC2, MUC5AC and MUC6. Unlike adenocarcinomas of colorectal primary, mucinous iCCs are often CK7+/CK20-/CDX2- and microsatellite stable. Patients with mucinous iCC likely present at advanced stage upon diagnosis with shorter survival time compared to the conventional counterparts
Differentially Regulated Orthologs in Sorghum and the Subgenomes of Maize
Identifying interspecies changes in gene regulation, one of the two primary sources of phenotypic variation, is challenging on a genome-wide scale. The use of paired time-course data on cold-responsive gene expression in maize (Zea mays) and sorghum (Sorghum bicolor) allowed us to identify differentially regulated orthologs. While the majority of cold-responsive transcriptional regulation of conserved gene pairs is species specific, the initial transcriptional responses to cold appear to be more conserved than later responses. In maize, the promoters of genes with conserved transcriptional responses to cold tend to contain more micrococcal nuclease hypersensitive sites in their promoters, a proxy for open chromatin. Genes with conserved patterns of transcriptional regulation between the two species show lower ratios of nonsynonymous to synonymous substitutions. Genes involved in lipid metabolism, known to be involved in cold acclimation, tended to show consistent regulation in both species. Genes with species-specific cold responses did not cluster in particular pathways nor were they enriched in particular functional categories. We propose that cold-responsive transcriptional regulation in individual species may not be a reliable marker for function, while a core set of genes involved in perceiving and responding to cold stress are subject to functionally constrained cold-responsive regulation across the grass tribe Andropogoneae
Construction and immunological characterization of CD40L or GM-CSF incorporated Hantaan virus like particle
Infection with Hantaan virus (HTNV) usually causes hemorrhagic fever with renal syndrome (HFRS). China has the worst epidemic incidence of HFRS as well as high fatality. Inactivated whole virus has been used for HFRS vaccination; however, problems such as safety concerns remain. CD40 ligand (CD40L) and granulocyte macrophage colony-stimulating factor (GM-CSF) are well-known immune-stimulating molecules that can enhance antigen presentation and lymphocyte activation and maturation. Incorporation of CD40L and GM-CSF onto the surface of virus-like particles (VLPs) can greatly improve the vaccination effect. We constructed eukaryotic vectors expressing the HTNV M segment and S segment, as well as vectors expressing the HTNV M segment with CD40L or GM-CSF. Our results showed successful production of CD40L- or GM-CSF-incorporated HTNV VLPs. In vitro stimulation with CD40L- or GM-CSF-anchored HTNV VLPs showed enhanced activation of macrophages and DCs. CD40L/GM-CSF-incorporated VLPs can induce higher levels of HTNV-specific antibody and neutralizing antibody in mice. Splenocytes from immunized mice showed a higher ability to secrete IFN-γ and IL-2, as well as enhanced CTL activity. These results suggest that CD40L/GM-CSF-incorporated VLPs can serve as prospective vaccine candidates
- …