520 research outputs found
Metilação diferencial de DNA no envelhecimento: exploração in silico utilizando dados de elevado rendimento (Differential DNA methylation in aging: in silico exploration using high-throughput data)
The emergence of high-throughput methodologies after the completion of the Human Genome Project has brought genome-wide and epigenome-wide studies to the forefront of biological and biomedical research. The focus on genetic mutations as the primary cause of certain disorders is now less prominent than before, since it has been demonstrated that epigenetic mechanisms are involved in cellular programming and gene regulation, providing adaptive variants of a given gene in a changing environment and showing an association with cellular differentiation.
Research in the DNA methylation field has already revealed essential facts, such as the existence of methylation in CpG islands and in alternative contexts that influence gene expression in a tissue-specific manner. The influence of lifestyle choices on aging processes has also been linked to methylome variations. In the case of cancer, the interplay of epigenetic and genetic information is essential to understanding the progress of cancer development as well as the silencing of key regulatory genes. Overall hypomethylation of the cancer genome leads to oncogene activation, whereas hypermethylation in specific regions is associated with silencing of tumour suppressor genes. For that reason, the search for new therapeutic approaches to cancer and aging is a current concern of the scientific community working in the epigenomics field.
In order to contribute to the study of mammalian epigenomes across the lifespan, this research used datasets from public databases to investigate DNA methylation across aged individuals and to extract tissue-specific markers related to healthy aging. The results were validated using samples, available at iBiMED, from healthy individuals with good or bad cognitive performance. In both situations the genes ELOVL2 (cg16867657) and FHL2 (cg06639320) were identified as good markers of age.
Master's degree in Biotechnology (Mestrado em Biotecnologia)
Cracking the Code of Human Diseases Using Next-Generation Sequencing: Applications, Challenges, and Perspectives
Next-generation sequencing (NGS) technologies have greatly impacted every field of molecular research, mainly because they reduce the cost and increase the throughput of DNA sequencing. These features, together with the technology's flexibility, have opened the way to a variety of applications, including the study of the molecular basis of human diseases. Several analytical approaches have been developed to selectively enrich regions of interest from the whole genome in order to identify germline and/or somatic sequence variants and to study DNA methylation. These approaches are now widely used in research, and they are already being used in routine molecular diagnostics. However, some issues are still controversial, namely standardization of methods, data analysis and storage, and ethical aspects. Besides providing an overview of the NGS-based approaches most frequently used to study the molecular basis of human diseases at the DNA level, we discuss the principal challenges and applications of NGS in the field of human genomics.
BRCA1 promoter methylation: the influence on gene expression and the effect of long term drug treatment
Breast cancer is the most common type of cancer among women worldwide, with over 1.67 million new cases in 2012. Heritable breast cancer is closely linked to mutations in the tumor suppressor gene BRCA1, with a lifetime risk of up to 80% of developing breast cancer among women harboring a mutation in this gene. However, most breast cancer cases are sporadic, and somatic mutations of the BRCA1 gene are rare. Furthermore, some tumors show BRCAness despite being BRCA1 wild-type. Thus, it is of great interest to assess alternative mechanisms for inactivation of the BRCA1 gene and to address the missing causality of many breast cancers. It is also of great interest to assess the mechanisms of drug resistance, a major challenge in cancer treatment today, in which BRCA1 may play an important role. The overall aim of this thesis is to increase the understanding of the biological role of BRCA1 promoter methylation in breast cancer. Three sub-aims were outlined for the present project: 1) Quantify the BRCA1 α and β transcripts and the total BRCA1 protein levels, and relate the expression data to the methylation pattern in the BRCA1 promoter region in a panel of breast cancer cell lines. 2) Investigate how the total expression levels, as well as the ratio between the α and β transcripts, are affected by alterations in the α and β promoter region of BRCA1, including methylation of specific CpGs as well as the polymorphisms rs71361504 and rs799905. 3) Investigate the effect of long-term treatment with the drugs olaparib and doxorubicin on BRCA1 promoter methylation in SKBR3 breast cancer cells as a potential cause of drug resistance. The study showed a weak correlation between the BRCA1 methylation pattern and BRCA1 mRNA expression. No correlation was observed between the methylation pattern and protein expression, or between mRNA levels and protein expression.
Analysis of the polymorphisms rs71361504 and rs799905 found in the BRCA1 promoter showed that the two variants seemed to counterbalance each other, giving equal luciferase expression levels when the constructs differed in both positions and lower expression levels when intermediate variants were studied. Finally, long-term drug treatment of the cell line SKBR3 did not alter the methylation levels in the BRCA1 promoter; consequently, demethylation does not seem to be a mechanism of drug resistance in the experimental setup tested in this study.
Master's thesis in molecular biology (Masteroppgåve i molekylærbiologi), MAMN-MOL, MOL39
The mapping task and its various applications in next-generation sequencing
The aim of this thesis is the development and benchmarking of
computational methods for the analysis of high-throughput data from
tiling arrays and next-generation sequencing. Tiling arrays have been
a mainstay of genome-wide transcriptomics, e.g., in the identification
of functional elements in the human genome. Due to limitations of
existing methods for analyzing these data, a novel
statistical approach is presented that identifies expressed segments
as significant differences from the background distribution and thus
avoids dataset-specific parameters. This method detects differentially
expressed segments in biological data with significantly lower false
discovery rates and equivalent sensitivities compared to commonly used
methods. In addition, it is also clearly superior in the recovery of
exon-intron structures. Moreover, the search for local accumulations
of expressed segments in tiling array data has led to the
identification of very large expressed regions that may constitute a
new class of macroRNAs.
This thesis proceeds with next-generation sequencing for which various
protocols have been devised to study genomic, transcriptomic, and
epigenomic features. One of the first crucial steps in most NGS data
analyses is the mapping of sequencing reads to a reference
genome. This work introduces algorithmic methods to solve the mapping
tasks for three major NGS protocols: DNA-seq, RNA-seq, and
MethylC-seq. All methods have been thoroughly benchmarked and
integrated into the segemehl mapping suite.
First, mapping of DNA-seq data is facilitated by the core mapping
algorithm of segemehl. Since the initial publication, it has been
continuously updated and expanded. Here, extensive and reproducible
benchmarks are presented that compare segemehl to state-of-the-art
read aligners on various data sets. The results indicate that it is
not only more sensitive in finding the optimal alignment with respect
to the unit edit distance but also very specific compared to most
commonly used alternative read mappers. These advantages are
observable for both real and simulated reads, are largely independent
of the read length and sequencing technology, but come at the cost of
higher running time and memory consumption.
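
To make the "unit edit distance" criterion concrete, the following is a
minimal Python sketch of the measure itself: every mismatch, insertion,
and deletion costs 1. It is a textbook dynamic-programming illustration,
not the segemehl algorithm, and the function name unit_edit_distance is a
hypothetical choice made for this example. It compares two whole strings,
whereas a read mapper scores a read against a substring of the reference.

```python
def unit_edit_distance(read: str, ref: str) -> int:
    """Levenshtein distance with unit costs (illustrative, not segemehl's code)."""
    # prev[j] holds the distance between the processed read prefix and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, a in enumerate(read, start=1):
        curr = [i]  # aligning read[:i] against an empty reference prefix
        for j, b in enumerate(ref, start=1):
            curr.append(min(
                prev[j] + 1,             # read base a is unmatched (cost 1)
                curr[j - 1] + 1,         # reference base b is unmatched (cost 1)
                prev[j - 1] + (a != b),  # match (cost 0) or mismatch (cost 1)
            ))
        prev = curr
    return prev[-1]
```

An aligner that is "optimal with respect to the unit edit distance"
reports, for each read, a reference location that minimizes this score.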
Second, the split-read extension of segemehl, presented by Hoffmann,
enables the mapping of RNA-seq data, a computationally more difficult
form of the mapping task due to the occurrence of splicing. Here, the
novel tool lack is presented, which aims to recover missed RNA-seq
read alignments using de novo splice junction information. It
performs very well in benchmarks and may thus be a beneficial
extension to RNA-seq analysis pipelines.
Third, a novel method is introduced that facilitates the mapping of
bisulfite-treated sequencing data. This protocol is considered the
gold standard in genome-wide studies of DNA methylation, one of the
major epigenetic modifications in animals and plants. The treatment of
DNA with sodium bisulfite selectively converts unmethylated cytosines
to uracils, while methylated ones remain unchanged. The bisulfite
extension developed here performs seed searches on a collapsed
alphabet followed by bisulfite-sensitive dynamic programming
alignments. Thus, it is insensitive to bisulfite-related mismatches
and does not rely on post-processing, in contrast to other methods. In
comparison to state-of-the-art tools, this method achieves
significantly higher sensitivities and has competitive running times when
mapping millions of sequencing reads to vertebrate
genomes. Remarkably, the increase in sensitivity does not come at the
cost of decreased specificity and thus may finally result in a better
performance in calling the methylation rate.
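
The two ideas in this paragraph, seeding on a collapsed alphabet and
scoring with a bisulfite-aware comparison, can be sketched in a few lines
of Python. This is a simplified illustration under stated assumptions, not
the segemehl implementation: all function names are hypothetical, only the
forward strand (C read as T) is handled, and the comparison is ungapped
rather than a full dynamic-programming alignment. Uracil is written as T
because that is how it appears in sequenced reads after amplification.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """In silico bisulfite treatment: unmethylated C is read as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def collapse(seq):
    """Collapsed alphabet for seed search: C and T become indistinguishable."""
    return seq.replace("C", "T")

def bisulfite_mismatches(read, ref):
    """Count mismatches, accepting a read T opposite a reference C as a match."""
    if len(read) != len(ref):
        raise ValueError("ungapped comparison needs equal lengths")
    return sum(
        1 for r, g in zip(read, ref)
        if r != g and not (r == "T" and g == "C")
    )

def methylation_calls(read, ref):
    """Per reference C covered by the read: True if the read kept C, False if it shows T."""
    return {
        i: r == "C"
        for i, (r, g) in enumerate(zip(read, ref))
        if g == "C" and r in "CT"
    }

# Toy example: the reference has cytosines at positions 1, 4, and 5;
# only position 1 is assumed to be methylated.
ref = "ACGTCCGA"
read = bisulfite_convert(ref, methylated={1})
```

In this toy case the converted read and the reference are identical on
the collapsed alphabet, so a seed search finds the locus despite two
converted bases, and the bisulfite-aware comparison counts no mismatches.
The per-position calls are the raw input for estimating a methylation
rate across many reads.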
Lastly, the potential of mapping strategies for de novo genome
assemblies is demonstrated with the introduction of a new guided
assembly procedure. It incorporates mapping as a major component and
uses additional information (e.g., annotation) as a guide. With this
method, the complete mitochondrial genome of Eulimnogammarus verrucosus was
successfully assembled even though the sequencing library was
heavily dominated by nuclear DNA.
In summary, this thesis introduces algorithmic methods that
significantly improve the analysis of tiling array, DNA-seq, RNA-seq,
and MethylC-seq data, and proposes standards for benchmarking NGS read
aligners. Moreover, it presents a new guided assembly procedure, called
crystallization (Kristallisation), that has been successfully applied in
the de novo assembly of a crustacean mitogenome.
Epigenetic characterization of human hepatocyte subpopulations in context of complex metabolic diseases and during in vitro differentiation of hepatocyte-like cells
The comprehensive transcriptional and epigenetic characterization of human hepatocyte subpopulations is necessary to achieve a better understanding of regulatory processes in health and in complex metabolic diseases, as well as during in vitro differentiation. Based on integrative analysis of genome-wide sequencing data, this thesis aims to unravel hepatocyte heterogeneity in different biological contexts. A deeper understanding of the spatial organization of cells in human tissues is an important challenge. Using a unique experimental set-up based on laser capture microdissection coupled to next-generation sequencing, which preserves spatial orientation and still provides genome-wide data on well-defined subpopulations, the first combined spatial analysis of transcriptomes and methylomes across three micro-dissected zones of human liver provides a wealth of new positional insights, both in health and in the context of fatty liver disease. In addition, these spatial maps serve as a reference for the projection of single-cell data into hepatic pseudospace, which is still a major challenge. Hence, a novel pseudospace inference approach, which considerably improves the spatial reconstruction of single cells into tissue context, is demonstrated for human liver. Finally, the identification of underlying regulatory networks by integrative epigenomic analysis of in vitro differentiated hepatocyte-like cells contributes to the development of reasonable cell culture interventions to improve differentiation.
Targeted Epigenetic Editing to Increase Adult Pancreatic Β-Cell Proliferation
β-cell replacement therapy is potentially a curative approach to treating diabetes, as demonstrated by the success of pancreatic islet transplantation in type 1 diabetes. However, there is an insufficient number of organ donors to meet the demand of this disease, which is increasing in prevalence. One strategy to increase the supply of human β-cells for transplantation in type 1 diabetics, or to increase residual β-cell mass in type 2 diabetics, is to induce human β-cell replication. This strategy has not been implemented clinically because adult human β-cells are largely quiescent and their capacity for proliferation decreases with age. I hypothesized that changes in DNA methylation contribute to the age-related decline in proliferative capacity in human β-cells, and that altering the DNA methylome in a targeted manner could improve proliferative capacity. To investigate this hypothesis, I sought to profile the β-cell across the human lifespan and to develop tools that permit targeted modification and efficient measurement of DNA methylation. I conducted RNA-Seq and whole-genome bisulfite sequencing (WGBS) to profile the aging human β-cell transcriptome and DNA methylome. I found that there are significant changes with age in gene expression and in DNA methylation, particularly at islet-specific active enhancers. Further, I developed transcription activator-like effector (TALE) fusion proteins conjugated to DNA methyltransferases (DNMTs) and demonstrated that targeting TALE-DNMTs to the promoter of the CDKN2A locus, encoding the cell cycle inhibitor p16, increases proliferation in primary human fibroblasts. Finally, I developed BisPCR2, a novel technique for preparing targeted bisulfite next-generation sequencing libraries, which greatly improves the efficiency with which DNA methylation can be measured at target regions. I demonstrated the utility of this tool to validate genome-wide findings of type 2 diabetes CpG risk loci.
Together, these novel datasets and epigenetic tools poise the β-cell regeneration field to investigate targeted epigenetic modifications as a strategy to improve proliferative capacity of adult human β-cells
Analyzing Modern Biomolecules: The Revolution of Nucleic-Acid Sequencing-Review
Recent developments have revolutionized the study of biomolecules. Among them are molecular markers and the amplification and sequencing of nucleic acids. Sequencing technologies are classified into three generations. The first allows the sequencing of small DNA fragments. The second increases throughput while reducing turnaround time and price, and is therefore more convenient for sequencing full genomes and transcriptomes. The third generation is currently pushing technology to its limits, being able to sequence single molecules without prior amplification, which was previously impossible. This also represents a new revolution, allowing researchers to sequence RNA directly, without prior reverse transcription. These technologies are having a significant impact on different areas, such as medicine, agronomy, ecology and biotechnology. Additionally, the study of biomolecules is revealing interesting evolutionary information. That includes deciphering what makes us human, including phenomena like non-coding RNA expansion. All this is redefining the concepts of gene and transcript. Basic analyses and applications are now facilitated by new genome editing tools, such as CRISPR. All these developments in general, and nucleic-acid sequencing in particular, are opening an exciting new era of biomolecule analyses and applications, including personalized medicine and the diagnosis and prevention of diseases in humans and other animals.
Deciphering Organoids: High-Dimensional Analysis of Biomimetic Cultures
Organoids are self-organising stem cell-derived ex vivo cultures widely adopted as biomimetic models of healthy and diseased tissues. Traditional low-dimensional experimental methods such as microscopy and bulk molecular analysis have generated remarkable biological insights from organoids. However, as complex heterocellular systems, organoids are especially well-positioned to take advantage of emerging high-dimensional technologies. In particular, single-cell methods offer considerable opportunities to analyse organoids at unprecedented scale and depth, enabling comprehensive characterisation of cellular processes and spatial organisation underpinning organoid heterogeneity. This review evaluates state-of-the-art analytical methods applied to organoids, discusses the latest advances in single-cell technologies, and speculates on the integration of these two rapidly developing fields
- …