50 research outputs found

    Formal Model and Simulation of the Gene Assembly Process in Ciliates

    The construction process of the functional macronucleus in certain types of ciliates is known as the ciliate gene assembly process. It consists of a massive amount of DNA excision from the micronucleus and the rearrangement of the rest of the DNA sequences (in the case of stichotrichous ciliates). While several computational models have tried to represent certain parts of the gene assembly process, the real process is still not completely understood. In this research, a new formal model called the Computational 2JLP model is introduced, based on the recent biological 2JLP model. To justify the formal model, a simulation is created and tested with real data. Several parameters are introduced in the model that are used to test ambiguities or edge cases of the biological model. The parameters are tested systematically in the simulation to find their optimal values. Interestingly, a negative correlation is found between one parameter (which is used to filter out scnRNAs that are similar to IES-specific sequences from the macronucleus) and the outcome of the simulation. This indicates that if a scnRNA consists of both an MDS and an IES, then filtering out this scnRNA is desirable from the perspective of maximizing the outcome of the simulation. The simulator successfully performs the gene assembly process whether the inputs are scrambled or unscrambled DNA sequences. It is hoped that this model will serve as a foundation for future computational and mathematical study, and help inform and refine the biological model.

    Chromosome Descrambling Order Analysis in ciliates

    Ciliates are unicellular eukaryotic organisms that have two types of nuclei within each cell; one is called the macronucleus (MAC) and the other is known as the micronucleus (MIC). During mating, ciliates exchange their MIC, destroy their own MAC, and create a new MAC from the genetic material of their new MIC. The process of developing a new MAC from the exchanged new MIC is known as gene assembly in ciliates, and it consists of a massive amount of DNA excision from the micronucleus and the rearrangement of the rest of the DNA sequences. During the gene assembly process, the DNA segments that get eliminated are known as internal eliminated segments (IESs), and the remaining DNA segments, which are rearranged into the order that is correct for creating proteins, are called macronuclear destined segments (MDSs). A topic of interest is to predict the correct order to descramble a gene or chromosomal segment. A prediction can be made based on the principle of parsimony, whereby the length of the shortest sequence of operations is likely close to the actual number of operations that occurred. Interestingly, the order of MDSs in the newly assembled 22,354 Oxytricha trifallax MIC chromosome fragments provides evidence that multiple parallel recombinations occur, where the structure of the chromosomes allows for interleaving between two sections of the developing macronuclear chromosome in a manner that can be captured with a common string operation called the shuffle operation (the shuffle operation on two strings results in a new string by weaving together the first two, while preserving the order within each string). Thus, we studied four similar systems involving applications of shuffle to see how the minimum number of operations needed to assemble differs between the types. Two algorithms for each of the first two systems have been implemented that are both shown to be optimal.
For the third and fourth systems, four and two heuristic algorithms, respectively, have been implemented. The results from these algorithms revealed that, in most cases, the third system gives the minimum number of applications of shuffle to descramble, but whether the best implemented algorithm for the third system is optimal remains an open question. The best implemented algorithm for the third system showed that 96.63% of the scrambled micronuclear chromosome fragments of Oxytricha trifallax can be descrambled by only 1 or 2 applications of shuffle. This small number of steps lends theoretical evidence that some structural component is enforcing an alignment of segments in a shuffle-like fashion, and that parallel recombination then takes place to enable MDS rearrangement and IES elimination. Another problem of interest is to classify segments of the MIC into MDSs and IESs; this is the second topic of the thesis, and is a matter of determining the right "class label", i.e. MDS or IES, for each nucleotide. Thus, training data of labelled input sequences was used with hidden Markov models (HMMs), a well-known supervised machine learning classification method. HMMs of first, second, third, fourth, and fifth order have been implemented. The accuracy of the classification was verified through 10-fold cross validation. Results from this work show that an HMM is more likely to fail to classify micronuclear chromosomes accurately without some additional knowledge.
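The shuffle operation described in this abstract can be illustrated with a short sketch. The function below is not taken from the thesis; the name `is_shuffle` and the example inputs are assumptions chosen for illustration. It checks whether a sequence `w` can be obtained by weaving together two sequences `u` and `v` while preserving the order within each, using a standard dynamic-programming table.

```python
def is_shuffle(w, u, v):
    """Return True if w interleaves u and v, keeping the order within each.

    Illustrative helper (hypothetical name, not from the thesis).
    Works on strings or on lists such as MDS index sequences.
    """
    n, m = len(u), len(v)
    if len(w) != n + m:
        return False

    # reach[j] is True when w[:i + j] is a shuffle of u[:i] and v[:j];
    # the list is reused for each successive value of i.
    reach = [False] * (m + 1)
    reach[0] = True
    for j in range(1, m + 1):
        reach[j] = reach[j - 1] and v[j - 1] == w[j - 1]

    for i in range(1, n + 1):
        reach[0] = reach[0] and u[i - 1] == w[i - 1]
        for j in range(1, m + 1):
            # Last symbol of w[:i + j] came either from u or from v.
            from_u = reach[j] and u[i - 1] == w[i + j - 1]
            from_v = reach[j - 1] and v[j - 1] == w[i + j - 1]
            reach[j] = from_u or from_v

    return reach[m]


# Hypothetical example: odd- and even-numbered segments woven together.
print(is_shuffle([1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]))
```

The check runs in time proportional to the product of the two input lengths. It only tests a single application of shuffle; counting the minimum number of applications needed to descramble, as studied in the thesis, is a separate problem that this sketch does not address.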

    A Computational Study of Elongation Factor G (EFG) Duplicated Genes: Diverged Nature Underlying the Innovation on the Same Structural Template

    BACKGROUND: Elongation factor G (EFG) is a core translational protein that catalyzes the elongation and recycling phases of translation. A more complex picture of EFG's evolution and function than previously accepted is emerging from analyses of heterogeneous EFG family members. Whereas gene duplication is postulated to be a prominent factor creating functional novelty, the striking divergence between EFG paralogs can be interpreted in terms of innovation in gene function. METHODOLOGY/PRINCIPAL FINDINGS: We present a computational study of the EFG protein family to cover the role of gene duplication in the evolution of protein function. Using phylogenetic methods, genome context conservation and insertion/deletion (indel) analysis, we demonstrate that the EFG gene copies form four subfamilies: EFG I, spdEFG1, spdEFG2, and EFG II. These ancient gene families differ in their indispensability, degree of divergence and number of indels. We show the distribution of EFG subfamilies and describe evidence for lateral gene transfer and recent duplications. Extended studies of the EFG II subfamily concern its diverged nature. Remarkably, EFG II appears to be a widely distributed and much-diversified subfamily whose subdivisions correlate with phylum or class borders. The EFG II subfamily-specific characteristics are low conservation of the GTPase domain and of domains II and III; absence of the trGTPase-specific G2 consensus motif "RGITI"; and twelve conserved positions common to the whole subfamily. The EFG II-specific functional changes could be related to changes in the properties of nucleotide binding and hydrolysis and strengthened ionic interactions between EFG II and the ribosome, particularly between parts of the decoding site and loop I of domain IV. CONCLUSIONS/SIGNIFICANCE: Our work, for the first time, comprehensively identifies and describes EFG subfamilies and improves our understanding of the function and evolution of EFG duplicated genes.

    Women in Science 2014

    Women in Science 2014 summarizes research done by Smith College’s Summer Research Fellowship (SURF) Program participants. Ever since its 1967 start, SURF has been a cornerstone of Smith’s science education. In 2014, 150 students participated in SURF (141 hosted on campus and at nearby field sites), supervised by 61 faculty mentor-advisors drawn from the Clark Science Center and connected to its eighteen science, mathematics, and engineering departments and programs and associated centers and units. At summer’s end, SURF participants were asked to summarize their research experiences for this publication.

    Systematics of the Hedychieae (Zingiberaceae), with emphasis on Roscoea Sm.


    Defining the molecular, genetic and transcriptomic mechanisms underlying the variation in glycation gap between individuals

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. The discrepancy between HbA1c and fructosamine estimations in the assessment of glycaemia has frequently been observed and is referred to as the glycation gap (G-gap). This could be explained by the higher activity of the deglycating enzyme fructosamine-3-kinase (FN3K) in the negative G-gap group (patients with lower than predicted HbA1c for their mean glycaemia) as compared to the positive G-gap group. The G-gap is linked with differences in complications in patients with diabetes, potentially because of dissimilarities in deglycation. The difference in deglycation rate in turn leads to altered production of advanced glycation end products (AGEs). These AGEs are both receptor dependent and receptor independent. It was hypothesised that variations in the level of FN3K might result from known single nucleotide polymorphisms (SNPs): rs1056534, rs3848403 and rs1046896 in the FN3K gene, a SNP in the ferroportin1/SLC40A1 gene (rs11568350, linked with FN3K activity), differentially expressed genes (DEGs), differentially expressed transcripts or alternatively spliced transcript variants. Previous studies reported accelerated telomere length shortening in patients with diabetes. In this study, 184 patients with diabetes were included as dichotomised groups with either a strongly negative or a strongly positive G-gap. This study was conducted to analyse the differences in genotype frequency of specific SNPs via real time qPCR, determine the concentration of soluble receptors for AGE (sRAGE) via ELISA, find associations between sRAGE concentration and SNP genotype, and evaluate the relative average telomere length ratio via real time qPCR. 
This study also aimed to investigate the underlying mechanisms of the G-gap via a transcriptome study, to identify the DEGs and differentially expressed transcripts and consequently the pathways, biological processes and diseases in which the DEGs were enriched. The relative length of the telomere was normalised to the expression of a single copy gene (S). The chi-squared test was used for estimating the expected genotype frequencies in diabetic patients with negative and positive G-gap. Differences in the genotype frequencies of the FN3K SNPs (rs1056534, rs3848403 and rs1046896) and the SLC40A1/ferroportin1 SNP (rs11568350) between the studied groups were non-significant. With respect to genotypes, the rs1046896 genotype (CT) and rs11568350 genotype (AC) were only found in the heterozygous state in all the investigated cohorts. No association between sRAGE concentration and FN3K SNPs (rs3848403 and rs1056534) was observed, as the sRAGE concentration was also found not to differ between the groups. Similarly, the relative average telomere length did not differ between the two groups. Plasma sRAGE levels were not different in the cohort studied even though the Wolverhampton Diabetes Research Group (WDRG) previously reported that AGE is higher in positive G-gap. The latter is more likely a consequence of lower FN3K activity. In this study, it was found that SNPs in the FN3K/ferroportin1 genes are not responsible for the discrepancy in average glycaemia. The transcriptomic study via RNA-Seq mapped a total of 64451 gene transcripts to the human transcriptome. The DEGs and differentially expressed transcripts were 103 and 342 respectively (p 1.5). Of the 103 DEGs, 61 were downregulated and 42 were upregulated in positive G-gap individuals, while 14 genes produced alternatively spliced transcript variants. 
Bioinformatics analysis identified four pathways (Viral carcinogenesis, Ribosome, Phagosome and Dorso-ventral axis) in which the DEGs were enriched. These DEGs were also found to be associated with raised blood pressure and glycated haemoglobin (conditions that coexist with diabetes). Future analysis based on these results will be necessary to elucidate the significant drivers of gene expression leading to the G-gap in these patients.
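As a rough illustration of the chi-squared comparison of genotype frequencies mentioned in this abstract, the sketch below computes Pearson's chi-squared statistic for a contingency table of counts. It is not the thesis's analysis: the function name `chi_squared_statistic` and the counts in the example are hypothetical, and the sketch assumes every row and column total is positive.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table.

    `table` is a list of rows, e.g. one row per G-gap group and one column
    per genotype. Assumes all row and column totals are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and genotype were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts only (rows: negative / positive G-gap; columns: genotypes).
stat, dof = chi_squared_statistic([[30, 45, 17], [28, 47, 17]])
print(f"chi-squared = {stat:.3f}, degrees of freedom = {dof}")
```

Converting the statistic into a p-value requires the chi-squared distribution with `dof` degrees of freedom, which a statistics library would normally supply.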