98 research outputs found

    Optimização de genes para expressão heteróloga

    Master's in Computer and Telematics Engineering. As computers began assisting biology researchers with complex tasks, they proved a precious aid for achieving what lies beyond human capacity. For a biologist today, working with a computer is as routine as working with test tubes in the laboratory. The power of computational technology, together with the hundreds of software applications and tools that already exist, thus gives biology significant support for research and development. Molecular biology has seen growing use of these capabilities, especially in genome sequencing projects, which translate the genetic information of living beings into digital formats. These projects generate large volumes of data from various species and make them available, and analyzing that data is now the goal of many bioinformatics systems. New discoveries and advances, in turn, demand new tools and techniques. This thesis addresses the problem of gene redesign methodologies by studying and gathering the known characteristics of genes and their impact on protein production, from the perspective of sequence manipulation strategies. These characteristics and redesign algorithms should be assembled into a single tool that allows researchers to better study genes and the factors that influence their sequences. The thesis also studies how to combine those factors correctly and optimally in a single redesign process.

    Métodos computacionais para a caracterização de genes e extração de conhecimento genómico

    MAPi joint doctoral programme in Computer Science.
    Motivation: Medicine and the health sciences are shifting from the classical symptom-based paradigm to a more personalized, genetics-based one, with an invaluable impact on health care. While advances in genetics were already contributing significantly to knowledge of the human organism, the breakthroughs achieved by several recent initiatives provided a comprehensive characterization of human genetic differences, paving the way for a new era of medical diagnosis and personalized medicine. Data generated by these and subsequent experiments are now becoming available, but their volume is well beyond what humans can feasibly explore. It therefore falls to computer scientists to create the means of extracting the information and knowledge contained in that data. Within the available data, genetic structures contain significant amounts of encoded information that has been uncovered over the past decades. Finding, reading and interpreting that information are necessary steps towards building computational models of genetic entities, organisms and diseases, a goal that in due course leads to human benefits.
    Aims: Numerous patterns can be found within the human variome and exome. Exploring these patterns enables the computational analysis and manipulation of digital genomic data, but requires specialized algorithmic approaches. In this work we sought to create and explore efficient methodologies to compute and combine known biological patterns for various purposes, such as the in silico optimization of genetic structures, the analysis of human genes, and the prediction of pathogenicity from human genetic variants.
    Results: We devised several computational strategies to evaluate genes, explore genomes, manipulate sequences, and analyze patients' variomes. Using combinatorial and optimization techniques, we created and combined sequence redesign algorithms to control genetic structures; by combining access to several web services and external resources, we created tools to explore and analyze available genetic data and patient data; and using machine learning, we developed a workflow for analyzing human mutations and predicting their pathogenicity.

    GRAIL-genQuest: A comprehensive computational system for DNA sequence analysis. Final report, DOE SBIR Phase II

    Structure determination of membrane proteins by electron crystallography

    A fundamental principle of life is the separation of environments into different compartments. Prokaryotes shield their interior from the environment by a plasma membrane and in some cases also by a cell wall. Eukaryotes refine this compartmentalization by building different organelles for different parts of the cell metabolism. Nevertheless, these compartments depend on each other and are interconnected by membrane proteins that transport specific nutrients, hormones, ions, water and waste products across the membrane and facilitate signal transmission between compartments. Understanding the structure and function of membrane proteins can therefore offer enormous insight into the regulation of different metabolic pathways. The electron microscope (EM) has proved itself a great tool for studying membrane proteins, offering the unique opportunity to image them within a lipid bilayer, as close to natural conditions as possible. Processing images acquired by an electron microscope is a challenging task for both the scientist and the processing hardware. Newly developed and optimized algorithms are needed to improve image processing to a level that allows atomic resolution to be achieved regularly. Membrane proteins pose a difficult challenge for a structural biologist. Crystallizing membrane proteins into well-ordered two-dimensional (2D) or three-dimensional (3D) crystals is one of the most important prerequisites for structural analysis at the atomic level, yet membrane proteins are notoriously difficult to crystallize. One exception may be bacteriorhodopsin, which already forms near-perfect crystals in its native membrane. This may explain why the first 2D electron crystallographic structure, determined at 7 Å resolution by Henderson and Unwin[20][43] in 1975, was that of bacteriorhodopsin. 
In 1990 Henderson et al.[19] determined the structure of bacteriorhodopsin (Br) to atomic resolution, the first atomic structure of a membrane protein. The structure determination of Br was also the starting point for the MRC program suite, which is currently widely used in the (albeit small) 2D electron crystallography community. Using the MRC software, Kühlbrandt et al.[26] solved the structure of the light-harvesting chlorophyll a/b-protein complex in 1994. To record the images they used the spot-scan technique developed by Downing in 1991[9]. The first aquaporin water channel structure to be determined was that of aquaporin 1, resolved by Walz et al. in 1997[45] at 6 Å resolution and subsequently solved to atomic resolution by Murata et al. in 2000[29]. More recently, several further aquaporin structures were determined by 2D electron crystallographic methods: aquaporin-0 (AQP0) by Gonen et al. in 2004[14] at 3 Å and in 2005[13] at 1.9 Å, and aquaporin-4 (AQP4) by Hiroaki et al. in 2006[22]. Interestingly, AQP4 shows exactly the same monomer arrangement as SoPIP2;1. Recent publications show a trend away from recording images alone and towards recording diffraction data in combination with images, or even diffraction data exclusively, and then using methods developed for X-ray crystallography to obtain the phase information. Because the software available for processing 2D electron diffraction patterns is less evolved than that for processing images, and because diffraction patterns are increasingly used, it makes sense to focus on implementing new and improved programs for 2D electron diffraction processing. In this work I present the advances I achieved in the structure determination of aquaporin 2, as well as my contributions to other projects, in particular the structural investigations of SoPIP2;1 and KdgM. 
I also explain the modified sample preparation methods, which made data recording at high tilt angles more reliable and improved the resolution of the measured data. A second, equally important and detailed part of my thesis is the work invested in improving and extending the image processing software to the point where a user who is not adept at programming in several languages can use it and produce good results. To this end I improved functionality and performance at several points, with a strong emphasis on user friendliness and ease of maintenance.

    Ethical issues of synthetic biology: a personalist perspective

    The main objective of this thesis is to assess the bioethical issues raised by Synthetic Biology from a specific bioethical approach, personalism, and in particular ontological personalism, a philosophy that shows the objective value of the person on the basis of its ontological structure. The person, as a being endowed with reason, freedom and awareness, has a special value that is above that of other beings. Experimental Sciences. Official Doctoral Programme in Bioethics.

    Towards Personalized Medicine: Computational Approaches to Support Drug Design and Clinical Decision Making

    The future looks bright for a clinical practice that tailors the therapy with the best efficacy and highest safety to each patient. Substantial funding has led to technological advances in patient-centered data acquisition, particularly of genetic data. Yet the challenge of translating this data into clinical practice remains open. To support drug target characterization, we developed a global maximum entropy-based method that predicts protein-protein complexes, including the three-dimensional structure of their interface, from sequence data. To further speed up the drug development process, we present methods that reposition drugs with established safety profiles to new indications by leveraging paths in cellular interaction networks. We validated both methods on known data, demonstrating their ability to recapitulate known protein complexes and drug-indication pairs, respectively. After studying the extent and characteristics of genetic variation with a predicted impact on protein function across 60,607 individuals, we showed that most patients carry variants in drug-related genes. For the majority of variants, however, the impact on drug efficacy remains unknown. To inform personalized treatment decisions, it is thus crucial first to collate knowledge from open data sources about known variant effects and then to close the knowledge gaps for variants whose effect on drug binding has not yet been characterized. Here, we built an automated annotation pipeline for patient-specific variants, whose value we illustrate for a set of patients with hepatocellular carcinoma. We further developed a molecular modeling protocol to predict changes in binding affinity in proteins with genetic variants, which we evaluated for several clinically relevant protein kinases. Overall, we expect that each presented method has the potential to advance personalized medicine by closing knowledge gaps about protein interactions and genetic variation in drug-related genes. To reach clinical applicability, challenges with data availability need to be overcome and prediction performance should be validated experimentally.

    Faculty Publications & Presentations, 2006-2007
