
    Doctor of Philosophy

    Fungal polyketides are a complex class of natural products with diverse scaffolds. The biosynthesis of these molecules involves the iterative condensation of acetate units using a minimal set of domains on a single polypeptide. The different levels of reduction at each iterative step generate the observed structural diversity via the stuttering action of these domains. For some polyketides, further structural complexity is introduced via amidation by a nonribosomal peptide synthetase (NRPS) module fused to the polyketide synthase (PKS). The NRPS contains a condensation (C) domain that catalyzes the coupling of the polyketide to a specific amino acid. Studies by others have suggested that PKS function is independent of NRPS activity and thus that this system is amenable to combinatorial biosynthesis, generating analogues with different polyketide chains and amino acids through module swaps. Forging intermodular interactions and understanding C domain selectivity for substrates are key to the successful engineering of these analogues. Herein, I investigate the impact of these factors on the ability to synthesize unnatural products by this route. Studying these components required the overexpression of chimeric gene constructs, which was anticipated to result in low compound yields. Therefore, a novel platform to express fungal genes with yields exceeding 1 g per kg of media was developed and validated through the characterization of a silent pathway. Application to the PKS-NRPS combinatorial biosynthesis problem led to the discovery that C domains are highly selective for closely related substrates in the presence of favorable PKS/NRPS interactions.

    Construction and fine-scale analysis of a high-density, genome-wide linkage map to examine meiotic recombination in the honey bee, Apis mellifera

    The western honey bee, A. mellifera, is an important biological model organism for ecological and behavioral studies as well as molecular studies. Honey bees are also essential in nature for the reproduction and diversification of plants via pollination. A unique feature of honey bees is that they have the highest recombination rate of all metazoans. This gives rise to an important question: what causes honey bees to have such a high rate of recombination? The honey bee genome has already been sequenced, but the available linkage maps are not detailed enough to characterize individual recombination events at the genome level. High recombination rates in honey bees may be caused by abundant recombination hotspots found throughout the genome. Resequencing the honey bee genome with next-generation sequencing and using over 900,000 markers genome-wide to identify recombination events showed that the recombination rate in honey bees may be underestimated. This study calculated the average recombination rate to be 178.7 cM/Mb, as opposed to the second most recent average of 22 cM/Mb. The high recombination rates in this study could be explained by mistakes in the current assembly of the reference genome. Further analyses are necessary to verify proper assembly of the current reference genome, genome-wide recombination events, and recombination rates. Based on the verified data set, it will then be possible to confirm whether hotspots are present in honey bees and to correctly correlate recombination hotspots with sequence motifs.
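The cM/Mb unit quoted in the abstract above can be illustrated with a small arithmetic sketch. It assumes the standard definition that one centimorgan corresponds to an expected 0.01 crossovers per meiotic product, so genetic length in cM is 100 times the mean crossover count. The function name, argument names, and the example numbers are hypothetical and are not taken from the study, whose actual pipeline is not described in the abstract.

```python
from typing import Sequence


def recombination_rate_cm_per_mb(
    crossovers_per_offspring: Sequence[int],
    physical_length_mb: float,
) -> float:
    """Estimate an average recombination rate in cM/Mb.

    crossovers_per_offspring: crossover counts observed in each meiotic
        product (hypothetical input; one entry per sequenced offspring).
    physical_length_mb: physical length of the surveyed region in megabases.
    """
    if not crossovers_per_offspring:
        raise ValueError("need at least one offspring")
    if physical_length_mb <= 0:
        raise ValueError("physical length must be positive")

    # Mean number of crossovers per meiotic product = genetic length in Morgans.
    mean_crossovers = sum(crossovers_per_offspring) / len(crossovers_per_offspring)
    # 1 Morgan = 100 centimorgans.
    genetic_length_cm = 100.0 * mean_crossovers
    return genetic_length_cm / physical_length_mb


# Made-up illustration: four offspring, a 10 Mb region.
example_rate = recombination_rate_cm_per_mb([2, 1, 3, 2], 10.0)
print(f"{example_rate:.1f} cM/Mb")
```

Under this definition, misassembled reference regions that create spurious marker switches would inflate the crossover counts and therefore the estimated rate, which is the concern the abstract raises.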

    Path planning algorithms for atmospheric science applications of autonomous aircraft systems

    Among the current techniques used to assist the modelling of atmospheric processes is an approach involving the balloon or aircraft launching of radiosondes, which travel along uncontrolled trajectories dependent on wind speed. Radiosondes are launched daily from numerous worldwide locations, and the data collected is integral to numerical weather prediction. This thesis proposes an unmanned air system for atmospheric research, consisting of multiple, balloon-launched, autonomous gliders. The trajectories of the gliders are optimised for the uniform sampling of a volume of airspace and the efficient mapping of a particular physical or chemical measure. To accomplish this, we have developed a series of algorithms for path planning, driven by the dual objectives of uncertainty and information gain. Algorithms for centralised, discrete path planning, a centralised, continuous planner and finally a decentralised, real-time, asynchronous planner are presented. The continuous heuristics search a look-up table of plausible manoeuvres generated by way of an offline flight dynamics model, ensuring that the optimised trajectories are flyable. Further to this, a greedy heuristic for path growth is introduced alongside a control for search coarseness, establishing a sliding control for the level of allowed global exploration, local exploitation and computational complexity. The algorithm is also integrated with a flight dynamics model, and communications and flight systems hardware, enabling software and hardware-in-the-loop simulations. The algorithm outperforms random search in two and three dimensions. We also assess the applicability of the unmanned air system in ‘real’ environments, accounting for the presence of complicated flow fields and boundaries. A case study based on the island of South Georgia is presented and indicates good algorithm performance in strong, variable winds.
We also examine the impact of co-operation within this multi-agent system of decentralised, unmanned gliders, investigating the threshold for communication range that allows for optimal search whilst reducing both the cost of individual communication devices and the computational resources associated with processing the data received by each aircraft. Reductions in communication radius are found to have a significant, negative impact upon the resulting efficiency of the system. To recover some of these losses, we utilise a sorting algorithm that determines information priority between any two aircraft in range. Furthermore, negotiation between aircraft is introduced, allowing aircraft to resolve any possible conflicts between selected paths, which helps to counteract any latency in the search heuristic.

    EXPLORING FATC DOMAIN FUNCTION IN YEAST TRA1

    Tra1 is an essential yeast protein required for regulated transcription. Its human homolog TRRAP regulates factors important in oncogenesis. Mutation of leucine to alanine at position 3733 in the FATC domain (tra1LA) results in growth phenotypes including sensitivity to ethanol. My aim was to examine genetic interactions of the FATC domain of Tra1 to define its cellular role. I screened for extragenic suppressors of the ethanol sensitivity caused by tra1LA, identifying an opal mutation at tryptophan 165 of NAM7 as a suppressor. Deleting nam7, upf3, or nmd2 similarly suppressed tra1LA, thereby linking Tra1 to nonsense-mediated decay (NMD). I propose that Tra1 regulates transcription of genes also regulated by NMD. This work emphasizes the importance of NMD in gene regulation. Furthermore, the cross-regulation between Tra1 and NMD suggests that mutations in the human NMD machinery may provide a mechanism to alter pathways influenced by TRRAP in human disease.

    Minería de datos mediante programación automática con colonias de hormigas (Data mining using automatic programming with ant colonies)

    This doctoral thesis represents the first application of the ant programming metaheuristic to data mining. This automatic programming technique has demonstrated good results in optimization problems, but its application to data mining had not been explored until now. Specifically, this thesis deals with the classification and association rule mining tasks of data mining. For the former, three models for the induction of rule-based classifiers are presented. Two of them address the classification problem from the point of view of single-objective and multi-objective evaluation, respectively, while the third tackles the particular problem of imbalanced classification from a multi-objective perspective. For the task of association rule mining, two algorithms for extracting frequent patterns have been developed. The first evaluates the quality of individuals using a novel fitness function, while the second performs the evaluation from a Pareto dominance point of view. All the algorithms proposed in this thesis have been evaluated in a proper experimental framework, using a large number of data sets and comparing their performance against other published methods of proven quality. The results obtained have been verified by applying non-parametric statistical tests, demonstrating the benefits of using the ant programming metaheuristic to address these data mining tasks.

    Development of a hybrid genetic programming technique for computationally expensive optimisation problems

    The increasing computational power of modern computers has contributed to the advance of nature-inspired algorithms in the fields of optimisation and metamodelling. Genetic programming (GP) is a genetically-inspired technique that can be used for metamodelling purposes. GP's main strength is its ability to infer the mathematical structure of the best model fitting a given data set, relying exclusively on input data and on a set of mathematical functions given by the user. Model inference is based on an iterative or evolutionary process, which returns the model as a symbolic expression (text expression). As a result, model evaluation is inexpensive and the generated expressions can be easily deployed to other users. Although genetic programming has been used in many different branches of engineering, its diffusion on an industrial scale is still limited. The aims of this thesis are to investigate the intrinsic limitations of genetic programming, to provide a comprehensive review of how researchers have tackled genetic programming's main weaknesses and to improve genetic programming's ability to extract accurate models from data. In particular, research has followed three main directions. The first has been the development of regularisation techniques to improve the generalisation ability of a model of a given mathematical structure, based on the use of a specific tuning algorithm when sinusoidal functions are among the functions the model is composed of. The second has been the analysis of the influence that prior knowledge regarding the function to approximate may have on the genetic programming inference process. The study has led to the introduction of a strategy that allows prior knowledge to be used to improve model accuracy. Thirdly, the mathematical structure of the models returned by genetic programming has been systematically analysed, leading to the conclusion that the linear combination is the structure most often returned by genetic programming runs.
A strategy has been formulated to reduce the evolutionary advantage of linear combinations and to protect more complex classes of individuals throughout the evolution. The possibility of using genetic programming in industrial optimisation problems has also been assessed with the help of a new genetic programming implementation developed during the research activity. This implementation is an open source project and is freely downloadable from http://www.personal.leeds.ac.uk/~cnua/mypage.html

    Multiobjective genetic programming for financial portfolio management in dynamic environments

    Multiobjective (MO) optimisation is a useful technique for evolving portfolio optimisation solutions that span a range from high-return/high-risk to low-return/low-risk. The resulting Pareto front approximates the risk/reward Efficient Frontier [Mar52] and simplifies the choice of investment model for a given client’s attitude to risk. However, the financial market is continuously changing, and it is essential to ensure that MO solutions are capturing true relationships between financial factors and not merely overfitting the training data. Research on evolutionary algorithms in dynamic environments has been directed towards adapting the algorithm to improve its suitability for retraining whenever a change is detected. Little research has focused on how to assess and quantify the success of multiobjective solutions in unseen environments. The multiobjective nature of the problem adds a unique requirement for judging the robustness of solutions: in addition to examining whether solutions remain optimal in the new environment, we need to ensure that the solutions’ relative positions previously identified on the Pareto front are not altered. This thesis investigates the performance of Multiobjective Genetic Programming (MOGP) in the dynamic real-world problem of portfolio optimisation. The thesis provides new definitions and statistical metrics based on phenotypic cluster analysis to quantify the robustness of both the solutions and the Pareto front. Focusing on the critical period between an environment change and when retraining occurs, four techniques to improve the robustness of solutions are examined: the use of a validation data set; diversity preservation; a novel variation on mating restriction; and a combination of both diversity enhancement and mating restriction.
In addition, a preliminary investigation of using the robustness metrics to quantify the severity of change for optimum tracking in a dynamic portfolio optimisation problem is carried out. Results show that the techniques used offer statistically significant improvement in the solutions’ robustness, although not on all the robustness criteria simultaneously. Combining mating restriction with diversity enhancement provided the best robustness results while also greatly enhancing the quality of solutions.
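The Pareto front referred to in this abstract can be sketched in a few lines. The example below is a minimal illustration of Pareto dominance for two objectives (maximise return, minimise risk), not the thesis's MOGP implementation. The `Portfolio` tuple layout, the function names, and the candidate values are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (expected_return, risk).
Portfolio = Tuple[float, float]


def dominates(a: Portfolio, b: Portfolio) -> bool:
    """Return True if a is at least as good as b on both objectives
    (higher return, lower risk) and strictly better on at least one."""
    ret_a, risk_a = a
    ret_b, risk_b = b
    no_worse = ret_a >= ret_b and risk_a <= risk_b
    strictly_better = ret_a > ret_b or risk_a < risk_b
    return no_worse and strictly_better


def pareto_front(portfolios: List[Portfolio]) -> List[Portfolio]:
    """Keep only the portfolios that no other portfolio dominates."""
    return [
        p for p in portfolios
        if not any(dominates(q, p) for q in portfolios)
    ]


# Made-up candidates for illustration only.
candidates = [(0.12, 0.20), (0.08, 0.10), (0.07, 0.15), (0.04, 0.05), (0.12, 0.25)]
front = pareto_front(candidates)
print(front)
```

In this made-up set, the surviving points run from high-return/high-risk to low-return/low-risk, which is the spread of choices the abstract describes. The robustness question it raises is whether these points keep their dominance relations and relative order when re-evaluated on unseen market data.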

    RNA, the Epicenter of Genetic Information

    The origin story and emergence of molecular biology is muddled. The early triumphs in bacterial genetics and the complexity of animal and plant genomes complicate an intricate history. This book documents the many advances, as well as the prejudices and founder fallacies. It highlights the premature relegation of RNA to simply an intermediate between gene and protein, the underestimation of the amount of information required to program the development of multicellular organisms, and the dawning realization that RNA is the cornerstone of cell biology, development, brain function and probably evolution itself. Key personalities, their hubris as well as prescient predictions are richly illustrated with quotes, archival material, photographs, diagrams and references to bring the people, ideas and discoveries to life, from the conceptual cradles of molecular biology to the current revolution in the understanding of genetic information.
    Key Features:
    - Documents the confused early history of DNA, RNA and proteins - a transformative history of molecular biology like no other.
    - Integrates the influences of biochemistry and genetics on the landscape of molecular biology.
    - Chronicles the important discoveries, preconceptions and misconceptions that retarded or misdirected progress.
    - Highlights major pioneers and contributors to molecular biology, with a focus on RNA and noncoding DNA.
    - Summarizes the mounting evidence for the central roles of non-protein-coding RNA in cell and developmental biology.
    - Provides a thought-provoking retrospective and forward-looking perspective for advanced students and professional researchers.