98 research outputs found
Optimização de genes para expressão heteróloga
Master's in Computer and Telematics Engineering (Mestrado em Engenharia de Computadores e Telemática)
As computers started assisting biology researchers in complex tasks, their
potential arose as a precious aid to achieve what was beyond human capacity.
In modern times, for a biologist, dealing with a computer is as trivial
as working with test tubes in the laboratory. Thus, the power provided by
computational technology along with hundreds of software applications and
tools that already exist, grants biology significant support for research and
development.
Molecular biology has witnessed an increased use of these technological capabilities,
especially with the genome sequencing projects that translate the
genetic information from living beings into digital formats. Large volumes of
data from various species are, thus, generated and made available. Analyzing
that data is now the goal of many bioinformatics systems. Consequently,
new discoveries and advancements demand new tools and techniques.
This thesis addresses the problem of gene redesign methodologies, by studying
and gathering the known gene characteristics and their impact
on protein production, from the perspective of sequence manipulation
strategies. These characteristics and redesign algorithms should be assembled
into a single software tool, to allow researchers to better study genes
and all the factors that influence their sequences. Also a subject of study is
the capacity to correctly and optimally combine those factors into a single
redesign process.
Métodos computacionais para a caracterização de genes e extração de conhecimento genómico
Joint MAPi Doctorate in Computer Science (Doutoramento conjunto MAPi em Ciências da Computação)
Motivation: Medicine and health sciences are changing from the classical
symptom-based to a more personalized and genetics-based paradigm, with an
invaluable impact on health care. While advancements in genetics were already
contributing significantly to the knowledge of the human organism, the
breakthrough achieved by several recent initiatives provided a comprehensive
characterization of the human genetic differences, paving the way for a new era
of medical diagnosis and personalized medicine.
Data generated from these and subsequent experiments are now becoming
available, but their volume is well beyond what is humanly feasible to explore. It is
then the responsibility of computer scientists to create the means for extracting
the information and knowledge contained in that data.
Within the available data, genetic structures contain significant amounts of
encoded information that has been uncovered in the past decades. Finding,
reading and interpreting that information are necessary steps for building
computational models of genetic entities, organisms and diseases; a goal that
in due course leads to human benefits.
Aims: Numerous patterns can be found within the human variome and exome.
Exploring these patterns enables the computational analysis and manipulation
of digital genomic data, but requires specialized algorithmic approaches. In this
work we sought to create and explore efficient methodologies to
computationally calculate and combine known biological patterns for various
purposes, such as the in silico optimization of genetic structures, analysis of
human genes, and prediction of pathogenicity from human genetic variants.
Results: We devised several computational strategies to evaluate genes,
explore genomes, manipulate sequences, and analyze patients’ variomes. By
resorting to combinatorial and optimization techniques we were able to create
and combine sequence redesign algorithms to control genetic structures; by
combining the access to several web-services and external resources we
created tools to explore and analyze available genetic data and patient data;
and by using machine learning we developed a workflow for analyzing human
mutations and predicting their pathogenicity.
Structure determination of membrane proteins by electron crystallography
A fundamental principle of life is the separation of environments into different compartments.
Prokaryotes shield their interior from the environment by a plasma membrane
and in some cases also by a cell wall. Eukaryotes refine this compartmentalization
by building different organelles for different parts of the cell metabolism. Nevertheless,
these different compartments are dependent on each other and are interconnected
by membrane proteins that transport specific nutrients, hormones, ions, water and
waste products across the membrane and facilitate signal transmission between different
compartments. Understanding the structure and function of membrane proteins
can therefore provide considerable insight into the regulation of different metabolic pathways.
The electron microscope (EM) proved itself a great tool for studying membrane proteins,
offering the unique opportunity to image membrane proteins within a lipid bilayer
as close to the natural conditions as possible. Processing of images acquired by an electron
microscope poses a challenging task for both scientist and processing hardware.
Newly developed and optimized algorithms are needed to improve the image processing
to a level that allows atomic resolution to be achieved regularly.
Membrane proteins pose a difficult challenge for a structural biologist. Crystallizing
membrane proteins into well-ordered two-dimensional (2D) or three-dimensional (3D)
crystals is one of the most important prerequisites for structural analysis at the atomic
level, yet membrane proteins are notoriously difficult to crystallize.
One exception may be bacteriorhodopsin (Br), which forms near-perfect crystals
in its native membrane. This may explain why the first 2D electron crystallographic
structure, determined at 7 Å resolution by Henderson and Unwin[20][43] in
1975, was that of bacteriorhodopsin. In 1990 the structure of Br was determined
to atomic resolution by Henderson et al.[19], making it the first atomic structure of
a membrane protein. The structure determination of Br was also the starting point
for the MRC program suite, which is currently widely used in the, albeit small,
2D electron crystallography community. Using the MRC software, Kühlbrandt et al.[26]
solved the structure of the light-harvesting chlorophyll a/b-protein complex in 1994.
To record the images they used the spot scan technique developed by Downing in
1991[9].
The first aquaporin water channel structure determined was that of aquaporin 1, resolved by Walz et
al. in 1997[45] at 6 Å resolution, and subsequently solved to atomic resolution by
Murata et al. in 2000[29]. Recently, several more aquaporin structures were determined
by 2D electron crystallographic methods: aquaporin-0 (AQP0) by Gonen et al. in
2004[14] at 3 Å and in 2005[13] at 1.9 Å, and aquaporin-4 (AQP4) by Hiroaki et al.
in 2006[22]. Interestingly, AQP4 shows exactly the same monomer arrangement as
SoPIP2;1. The recent publications show a trend away from recording images
alone, towards recording diffraction data in combination with images or even
recording diffraction data exclusively, and then using methods developed for X-ray
crystallography to obtain the phase information.
Given that the software available for processing 2D electron diffraction patterns
is less mature than that for processing images, and given the
increasing use of diffraction patterns, it makes sense to focus on implementing
new and improved programs for 2D electron diffraction processing.
In this work I present the advances I achieved in the structure determination
of aquaporin 2, as well as my contributions to other projects, in particular the
structural investigations of SoPIP2;1 and KdgM. I also explain the modified sample
preparation methods, which made data recording at high tilt angles more reliable
and improved the resolution of the measured data.
A second, equally important and detailed part of my thesis is the work invested in
improving and extending the image processing to a point where a user who is not adept
at programming in several languages can use it and produce good results. To this end,
I improved the functionality and performance at several points, with a strong
emphasis on user friendliness and ease of maintenance.
Ethical issues of synthetic biology: a personalist perspective
The main objective of this thesis is to assess the bioethical issues raised by Synthetic Biology from a specific bioethical approach: personalism, and in particular ontological personalism, a philosophy that shows the objective value of the person on the basis of its ontological structure. The person, as a being endowed with reason, freedom and awareness, has a special value which is above that of other beings.
Experimental Sciences (Ciencias Experimentales)
Programa Oficial de Doctorado en Bioétic
Towards Personalized Medicine: Computational Approaches to Support Drug Design and Clinical Decision Making
The future looks bright for a clinical practice that tailors the
therapy with the best efficacy and highest safety to a patient. Substantial
amounts of funding have resulted in technological advances regarding
patient-centered data acquisition, particularly genetic data. Yet, the
challenge of translating this data into clinical practice remains open.
To support drug target characterization, we developed a global maximum
entropy-based method that predicts protein-protein complexes including the
three-dimensional structure of their interface from sequence data. To further
speed up the drug development process, we present methods to reposition drugs
with established safety profiles to new indications leveraging paths in
cellular interaction networks. We validated both methods on known data,
demonstrating their ability to recapitulate known protein complexes and
drug-indication pairs, respectively.
After studying the extent and characteristics of genetic variation with a
predicted impact on protein function across 60,607 individuals, we showed that
most patients carry variants in drug-related genes. However, for the majority
of variants, their impact on drug efficacy remains unknown. To inform
personalized treatment decisions, it is thus crucial to first collate knowledge
from open data sources about known variant effects and to then close the
knowledge gaps for variants whose effect on drug binding is still not
characterized. Here, we built an automated annotation pipeline for
patient-specific variants whose value we illustrate for a set of patients with
hepatocellular carcinoma. We further developed a molecular modeling protocol to
predict changes in binding affinity in proteins with genetic variants, which we
evaluated for several clinically relevant protein kinases.
Overall, we expect that each presented method has the potential to advance
personalized medicine by closing knowledge gaps about protein interactions and
genetic variation in drug-related genes. To reach clinical applicability,
challenges with data availability need to be overcome and prediction
performance should be validated experimentally.
- …