
    ReCombine: A Suite of Programs for Detection and Analysis of Meiotic Recombination in Whole-Genome Datasets

    In meiosis, the exchange of DNA between chromosomes by homologous recombination is a critical step that ensures proper chromosome segregation and increases genetic diversity. Products of recombination include reciprocal exchanges, known as crossovers, and non-reciprocal gene conversions or non-crossovers. The mechanisms underlying meiotic recombination remain elusive, largely because of the difficulty of analyzing large numbers of recombination events by traditional genetic methods. These traditional methods are increasingly being superseded by high-throughput techniques capable of surveying meiotic recombination on a genome-wide basis. Next-generation sequencing or microarray hybridization is used to genotype thousands of polymorphic markers in the progeny of hybrid yeast strains. New computational tools are needed to perform this genotyping and to find and analyze recombination events. We have developed a suite of programs, ReCombine, for using short sequence reads from next-generation sequencing experiments to genotype yeast meiotic progeny. Upon genotyping, the program CrossOver, a component of ReCombine, then detects recombination products and classifies them into categories based on the features found at each location and their distribution among the various chromatids. CrossOver is also capable of analyzing segregation data from microarray experiments or other sources. This package of programs is designed to allow even researchers without computational expertise to use high-throughput, whole-genome methods to study the molecular mechanisms of meiotic recombination

    Characterization of PRL1 and its paralogue PRL2 in Arabidopsis thaliana

    The Arabidopsis gene PLEIOTROPIC REGULATORY LOCUS 1 (PRL1) encodes a protein that belongs to a conserved WDR protein family in eukaryotes. PRL1 mutations lead to sugar hypersensitivity, cause numerous pleiotropic changes in root, leaf, and flower development, and trigger responses to abiotic and biotic stress stimuli by altering plant hormone homeostasis. In contrast to other eukaryotes, the Arabidopsis genome contains a PRL1 paralogue, PRL2, which shares a highly conserved C-terminal WDR domain with PRL1. Compared with PRL1, however, PRL2 has different N-terminal sequences. Nevertheless, the N-terminal regions of PRL1 and PRL2 show a structure that is exceptionally highly conserved across evolution in the plant kingdom. Studies in yeast and mammals showed that PRL1 plays a key role in the activation of the spliceosome. Yet the divergence of the N-terminal PRL1 region in Arabidopsis implies plant-specific functions in splicing. One aim of this work was to provide insight into the regulatory functions of PRL1. Affymetrix ATH1 arrays and tiling arrays, together with RNA-Seq analysis, confirmed the genome-wide changes in the prl1 mutant. In addition, the upregulation of transposons together with the upregulation of many other genes suggests that PRL1 is also involved in the control of gene silencing. Furthermore, the use of modified PRL1 constructs showed that PRL1 transcription and PRL1 stability are stress-dependent and that PRL1 does not interfere with proteasome activity. A further aim was to assess the extent of functional overlap between PRL1 and PRL2. A comparison of PRL1 and PRL2 transcription showed that PRL2 is transcribed at a significantly lower level in most tissues. In flowers, PRL2 is expressed specifically in male reproductive tissue, in pollen grains, in the endosperm, and in the zygote, whereas PRL1 is active in developing ovules and in the integument of developing seeds. The prl2 mutations are embryo-lethal, while somatic prl2 mosaics show a prl1-like phenotype. Swapping the divergent N-terminal domains between PRL1 and PRL2 demonstrated the subfunctionalization of PRL1 and PRL2. The N-terminal domain of PRL2 only partially complements the prl1 root phenotype. It can, however, rescue the embryonic lethality caused by prl2. Moreover, the N-terminal PRL1 domain combined with the C-terminal PRL2 domain complements the prl1 phenotype but not the embryo-lethal prl2 phenotype. In addition to the differing transcription observed for PRL1 and PRL2, the diversification of the N-terminal domains has contributed to the subfunctionalization of PRL1 and PRL2 in Arabidopsis. The results presented in this work further indicate that PRL1 and PRL2 play a specific role in the control of gene silencing in Arabidopsis

    Machine learning approaches for high-dimensional genome-wide association studies

    Genome-wide association studies (GWAS) aim to find statistical associations between genetic variants and traits of interest. Genetic variants that explain a large share of the variation in genome-wide gene expression may lead to confounding in expression quantitative trait loci (eQTL) analyses. To account for these confounding factors, in Article I we proposed LVREML, a method conceptually analogous to estimating fixed and random effects in linear mixed models (LMM). We showed that the maximum-likelihood latent variables can always be chosen orthogonal to the known factors (such as genetic variants). This indicates that the maximum-likelihood variables explain the sample covariance that is not already explained by the genetic variants in the model. To identify which traits are affected by the identified genetic variants, we need to reverse the functional relation between genotypes and traits. In this regard, multi-trait approaches are more advantageous than studying the traits individually. Multi-trait approaches benefit from increased power from considering cross-trait covariances and from a reduced multiple-testing burden, because a single test is needed to test for associations with a set of traits. In Article II, we analyzed various machine learning methods (ridge regression, Naive Bayes/independent univariate correlation, random forests and support vector machines) for reverse regression in multi-trait GWAS, using genotypes, gene expression data and ground-truth transcriptional regulatory networks from the DREAM5 SysGen Challenge and from a cross between two yeast strains to evaluate the methods. In Article III, we extended this approach to a human dataset. An important difference between the data from Article II and Article III is that no ground-truth data are available for the latter. We used genotypes and brain-imaging features extracted from magnetic resonance imaging (MRI) scans obtained from the ADNI database. The results from both Article II and Article III showed that genotype prediction performance varied across genetic variants. This helped in identifying genomic regions that are associated with a high number of traits in high-dimensional phenotypic data. We also observed that the feature coefficients of fitted machine learning models correlated with the strength of association between variants and traits. Our results also showed that non-linear machine learning methods such as random forests identified genetic variants distinct from those identified by the linear methods. In particular, we observed in Article III that random forests were able to identify single-nucleotide polymorphisms (SNPs) that were distinct from the ones identified by ridge and lasso regression. Further analysis showed that the identified SNPs belonged to genes previously associated with brain-related disorders. Doctoral thesis

    Modern Technologies and Their Influence in Fermentation Quality

    During the last few years, industrial fermentation technologies have advanced in order to improve the quality of the final product. Some examples of those modern technologies are the biotechnology developments of microbial materials, such as Saccharomyces and non-Saccharomyces yeasts or lactic bacteria from different genera. Other technologies are related to the use of additives and adjuvants, such as nutrients, enzymes, fining agents, or preservatives and their management, which directly influence the quality and reduce the risks in final fermentation products. Other technologies are based on the management of thermal treatments, filtrations, pressure applications, ultrasounds, UV, and so on, which have also led to improvements in fermentation quality in recent years. The aim of the issue is to study new technologies able to improve the quality parameters of fermentation products, such as aroma, color, turbidity, acidity, or any other parameters related to improving sensory perception by the consumers. Food safety parameters are also included

    Novel Algorithm Development for ‘Next Generation’ Sequencing Data Analysis

    In recent years, the decreasing cost of ‘Next generation’ sequencing has spawned numerous applications for interrogating whole genomes and transcriptomes in research, diagnostic and forensic settings. While the innovations in sequencing have been explosive, the development of scalable and robust bioinformatics software and algorithms for the analysis of the new types of data generated by these technologies has struggled to keep up. As a result, large volumes of NGS data available in public repositories are severely underutilised, despite providing a rich resource for data mining applications. Indeed, the bottleneck in genome and transcriptome sequencing experiments has shifted from data generation to bioinformatics analysis and interpretation. This thesis focuses on the development of novel bioinformatics software to bridge the gap between data availability and interpretation. The work is split between two core topics: computational prioritisation/identification of disease gene variants, and identification of RNA N6-adenosine methylation from sequencing data. The first chapter briefly discusses the emergence and establishment of NGS technology as a core tool in biology and its current applications and perspectives. Chapter 2 introduces the problem of variant prioritisation in the context of Mendelian disease, where tens of thousands of potential candidates are generated by a typical sequencing experiment. Novel software developed for candidate gene prioritisation is described that utilises data mining of tissue-specific gene expression profiles (Chapter 3). The second part of the chapter investigates an alternative approach to candidate variant prioritisation by leveraging functional and phenotypic descriptions of genes and diseases from multiple biomedical domain ontologies (Chapter 4). Chapter 5 discusses N6-adenosine methylation, a recently re-discovered post-transcriptional modification of RNA. The core of the chapter describes novel software developed for transcriptome-wide detection of this epitranscriptomic mark from sequencing data. Chapter 6 presents a case study application of the software, reporting the previously uncharacterised RNA methylome of Kaposi’s Sarcoma Herpes Virus. The chapter further discusses a putative novel N6-methyl-adenosine RNA-binding protein and its possible roles in the progression of viral infection

    Book of Abstracts of MICROBIOTEC09

    Conference website: http://www.deb.uminho.pt/microbiotec09/ This book contains the abstracts presented at the 3rd joint meeting of the Portuguese Society of Microbiology and the Portuguese Society of Biotechnology, MicroBiotec09, held in Vilamoura, Portugal, over 3 days, from the 28th to the 30th of November, 2009. MicroBiotec09 follows the previous conferences organized by each society, from 1982, the date of the I Encontro Nacional de Biotecnologia (Lisbon), to 2005, the date of the first joint meeting, MICRO'05 + BIOTEC'05 (Póvoa de Varzim). Following this joint meeting, another, MICRO 07 + BIOTEC 07 + XXIII JPG, took place in Lisbon (2007). MicroBiotec09 is jointly organized by “Sociedade Portuguesa de Biotecnologia”, “Sociedade Portuguesa de Microbiologia”, the Institute for Biotechnology and Bioengineering (Universidade do Minho, Departamento de Engenharia Biológica) and the Centro de Recursos Microbiológicos (Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia, Departamento de Ciências da Vida). MicroBiotec09 brings together both young and established researchers and end users to discuss recent developments in different areas of Biotechnology and Microbiology. The conference program has thus been divided into 8 major sessions: Microbial Physiology, Molecular Biology and Functional Genomics; Cell and Tissue Engineering, Biomaterials and Nanobiotechnologies; Clinical Microbiology and Epidemiology; Environmental Microbiology and Biotechnology; Health and Pharmaceutical Biotechnology; Cellular Microbiology and Pathogenesis; Industrial and Food Microbiology and Biotechnology; Bioinformatics, Comparative Genomics and Evolution. A special session to celebrate the 200th anniversary of Charles Darwin's birth and the 150th anniversary of the publication of his landmark work “On the Origin of Species by Means of Natural Selection” will also take place. A total of 295 abstracts are included in the book, consisting of 6 invited lectures, 10 oral presentations and 44 short oral presentations given in 3 parallel sessions, along with 4 slots for viewing poster presentations. All abstracts have been reviewed, and we are grateful to the members of the scientific and organizing committees for their evaluations. It was an intensive task, since 328 submitted abstracts were received. It has been an honor for us to contribute to setting up MicroBiotec09 during an intensive period of 6 months. We wish to thank the authors who have contributed to the high scientific standard of the program. We are thankful to the sponsors who have contributed decisively to this event. We also extend our gratitude to all those who, through their dedicated efforts, have assisted us in this task. On behalf of the Scientific and Organizing Committees, we hope that, together with interesting reading, the scientific program and the social events organized will be memorable for all. Fundação para a Ciência e a Tecnologia (FCT)

    Characterisation of transcriptional mediator subunit, MED17 and its regulation by cyclins.

    Mediator is a transcription co-factor complex that co-operates with transcriptional activators to enhance gene-specific transcription and is conserved between yeast, Drosophila and man. The MED17 subunit of Mediator (formerly known as TRAP80/CRSP6/DRIP80) has been characterised as a transcriptional activator interacting with a number of transcription factors, such as heat shock factor and p53. Expression of MED17 in yeast and Drosophila is essential to cell viability, possibly due to its function as a global transcriptional regulator. In a yeast two-hybrid screen with a viral cyclin as bait, MED17 was identified as an interacting clone. Due to the oncogenic potential of viral cyclins, the effects of human MED17 on p53-regulated transcription were investigated. Functional characterisation of MED17 effects on p53 showed that it repressed p53-mediated transcription in luciferase reporter assays. Further, a line constitutively expressing MED17, generated in non-transformed mouse cells, inhibited apoptosis and demonstrated other features of p53 functional loss. Human MED17 still activates heat shock regulated transcription, as previously described for the Drosophila homologue. Other transcription factors regulated by MED17 were investigated by gene expression microarray analysis of the MED17 cell line, revealing a putative co-activator function in β-catenin regulated transcription. Also studied was the interaction of MED17 with cellular homologues of viral cyclin. Cyclin/cdks phosphorylate MED17, with cyclin A/cdk2 specifically phosphorylating MED17 to enhance its expression. This investigation reveals a novel repressor function for MED17 on p53-mediated transcription and links cell cycle regulators to the transcriptional activities of MED17/Mediator and p53