4,182 research outputs found

    Bioinformatic Investigations Into the Genetic Architecture of Renal Disorders

    Modern genomic analysis has a significant bioinformatic component because of the high volume of complex data involved. During investigations into the genetic components of two renal diseases, we developed two software tools. Genome-wide association study (GWAS) datasets may be genotyped on different microarrays and subject to different annotation, leading to a mosaic case-control cohort with inherent errors, primarily due to strand mismatching. Our software REMEDY seeks to detect and correct the strand designation of input datasets, and to filter common sources of noise such as structural and multi-allelic variants. We performed a GWAS on a large cohort of steroid-sensitive nephrotic syndrome samples; the mosaic input datasets were pre-processed with REMEDY prior to merging and analysis. Our results show that REMEDY significantly reduced noise in the GWAS output. REMEDY outperforms existing software because it offers significantly more features, such as automatic strand-designation detection, comprehensive variant filtering and high-speed variant matching to dbSNP. The second tool supported the analysis of a newly characterised rare renal disorder: polycystic kidney disease with hyperinsulinemic hypoglycemia (HIPKD). Identification of the underlying genetic cause led to the hypothesis that a change in chromatin looping at a specific locus affected the aetiology of the disease. We developed LOOPER, a software suite capable of predicting chromatin loops from ChIP-Seq data, to explore the possible conformations of chromatin architecture in the HIPKD genomic region. LOOPER predicted several interesting functional and structural loops that supported our hypothesis. We then extended LOOPER to visualise ChIA-PET and ChIP-Seq data as a force-directed graph showing experimental structural and functional chromatin interactions. Next, we re-analysed the HIPKD region with LOOPER to show experimentally validated chromatin interactions. We first confirmed our original predicted loops and subsequently discovered that the local genomic region has many more chromatin features than first thought.
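
    The strand-mismatch problem described in this abstract can be illustrated with a minimal sketch. It is not REMEDY's algorithm; the function name `classify_strand` and its return labels are hypothetical, and it assumes biallelic SNPs whose alleles are compared against a trusted reference allele pair (for example, one taken from dbSNP).

```python
# Hypothetical illustration of strand checking for biallelic SNPs.
# Not REMEDY's implementation; names and labels are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_strand(alleles, ref_alleles):
    """Compare a SNP's allele pair with a reference allele pair.

    Returns one of:
      'mismatch'  - not a clean biallelic A/C/G/T SNP, or alleles disagree
      'ambiguous' - A/T or C/G SNP: the pair is its own complement, so the
                    strand cannot be inferred from the alleles alone
      'match'     - alleles agree with the reference as given
      'flip'      - alleles agree after complementing (opposite strand)
    """
    a = frozenset(x.upper() for x in alleles)
    r = frozenset(x.upper() for x in ref_alleles)
    valid = set(COMPLEMENT)
    if len(a) != 2 or len(r) != 2 or not a <= valid or not r <= valid:
        return "mismatch"
    flipped = frozenset(COMPLEMENT[x] for x in a)
    if flipped == a:
        return "ambiguous"
    if a == r:
        return "match"
    if flipped == r:
        return "flip"
    return "mismatch"
```

    Palindromic A/T and C/G pairs are tested first because they look identical on both strands; in practice they usually need extra evidence such as allele frequencies, or are filtered out.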

    Population-specific genotype imputations using minimac or IMPUTE2

    To meaningfully analyze common and rare genetic variants, results from genome-wide association studies (GWASs) of multiple cohorts need to be combined in a meta-analysis to obtain enough power. This requires all cohorts to have the same single-nucleotide polymorphisms (SNPs) in their GWASs. To this end, genotypes that have not been measured in a given cohort can be imputed on the basis of a set of reference haplotypes. This protocol provides guidelines for performing imputations

    The Diversity of REcent and Ancient huMan (DREAM): a new microarray for genetic anthropology and genealogy, forensics, and personalized medicine

    The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variation (CNV), drug response, and more. These polymorphisms are of broad interest to population geneticists, forensic investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. While many commercial arrays exist for genome-wide single-nucleotide polymorphism (SNP) genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We therefore aimed to design the Diversity of REcent and Ancient huMan (DREAM) - an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM's autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM's performance is further illustrated in biogeographical, identical by descent (IBD), and CNV analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensics, and personalized medicine studies
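
    For readers unfamiliar with the FST statistic used in this comparison, here is a minimal sketch of Wright's FST for a single biallelic marker in two populations. It assumes the two populations are weighted equally and that allele frequencies are known exactly. The abstract does not say which estimator the authors used, so `fst_two_populations` is a hypothetical illustration only.

```python
def fst_two_populations(p1, p2):
    """Wright's FST = (H_T - H_S) / H_T for one biallelic marker.

    p1, p2: frequency of the same allele in each population (0..1).
    Assumes equal population weights; returns 0.0 when the marker is
    monomorphic overall (H_T == 0), where FST is otherwise undefined.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total
```

    Identical frequencies give 0 and fixed opposite alleles give 1, so a marker set with a higher mean FST separates the sampled populations more strongly.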

    Computational Workflow for the Fine-Grained Analysis of Metagenomic Samples

    The development of new data acquisition technologies has led to an enormous availability of information in almost every field of scientific research, while also enabling a specialisation that results in dedicated software developments. To make it easier for end users to obtain results from their data, a new computing paradigm has emerged strongly: automated workflows for processing information, which have become established thanks to the support they provide for assembling a complete and robust processing system. Bioinformatics is a clear example, where many institutions offer specific processing services that generally need to be combined to obtain an overall result. Workflow managers such as Galaxy [1], Swift [2] or Taverna [3] are used to analyse data (among others) produced by new DNA sequencing technologies such as Next Generation Sequencing [4], which generate vast amounts of data in genomics and, in particular, metagenomics. Metagenomics studies the species present in an uncultured sample collected directly from the environment, and studies of interest try to observe variations in sample composition in order to identify significant differences that correlate with characteristics (phenotype) of the individuals from whom the samples come; this includes functional analysis of the species present in a metagenome to understand their consequences. Analysing complete genomes is already a computationally demanding task, so analysing metagenomes, which contain the genomes of not just one species but of the several that coexist in the sample, is a Herculean task. Metagenomic analysis therefore requires efficient algorithms capable of processing these data effectively in reasonable time. Some of the difficulties to be overcome are (1) comparing samples against reference databases, (2) assigning (mapping) reads to genomes using similarity estimators, (3) the processed data are usually large and need functional means of access, (4) the particularity of each sample requires specific, new programs for its analysis, (5) the visual representation of n-dimensional results for comprehension, and (6) quality and certainty verification at each stage. To this end we present a complete but adaptable workflow, divided into pluggable, reusable modules connected through defined data structures, which also allows easy extension and customisation to meet the demands of new experiments

    Forensic Ancestry Analysis with Autosomal Polymorphisms

    The inference of ancestry from biological material left at a crime scene has been a longstanding but specialised forensic technique, often lacking sufficient detail to make a reliable inference of ancestry. This thesis describes the key steps in developing a forensic ancestry test that can be adopted by any laboratory using capillary electrophoresis equipment: optimisation of a PCR multiplex to detect DNA markers from contact traces; compilation of population data from which to infer the likely population of origin of the person; detection of coancestry patterns in an individual with admixed backgrounds; and development of online statistical tools that calculate the probability of an individual’s ancestry from a submitted SNP profile. Additional types of autosomal markers were compiled from Indel polymorphisms; short tandem repeats (STRs); multiple allele SNPs; and Microhaplotype markers

    Bioinformatic studies on structural elements for the regulation of alternative oxidase (AOX) gene activities

    Master's project work in Informatics Engineering, presented to the University of Lisbon, through the Faculty of Sciences, 2007. Alternative Oxidase (AOX) genes encode a small family of isoenzymes (enzymes that differ somewhat but act in the same chemical reaction). AOX is present in plants, fungi, algae and some yeasts, and has also been found in some classes of the animal kingdom. The enzymes are responsible for an alternative pathway of respiration that responds to stress conditions, to pathogen attack, and to growth and developmental stage. Scaffold/Matrix Attachment Regions (S/MARs) are DNA sequences of 300 to 3000 nucleotides that bind to nuclear proteins and serve as anchors for DNA, thereby influencing DNA organization inside the cell. Several studies have failed to reveal a pattern of organization in these sequences; however, some rules have been found that help computer-based analysis. Experimental identification of these sequences is hard and time-consuming, whereas computational methods could provide a first selection step and cover larger sequences. To highlight possible links between S/MARs and differential regulation of AOX genes, the first part of this project consists of identifying structurally relevant S/MAR regions in the neighborhood of AOX genes in Arabidopsis thaliana and in rice using a selected computer program. Single Nucleotide Polymorphisms (SNPs) are single-nucleotide variations at the same location among DNA sequences from different individuals of the same species. These differences can serve as markers to classify a specific set of individuals. The second part of this project consists of developing a bioinformatic application to help identify specific polymorphisms (SNPs) in sequences obtained experimentally at the EU Marie Curie Chair, ICAM, University of Évora, where this project was carried out
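
    The SNP definition in this abstract (a single-nucleotide difference at the same position across individuals) can be sketched as a column-wise scan of aligned sequences. This is a hypothetical illustration, not the application developed in the project: `find_snp_columns` is an invented name, and it assumes the sequences are already aligned to equal length, with `-` for gaps and `N` for unknown bases.

```python
def find_snp_columns(aligned):
    """Report alignment columns where more than one base is observed.

    aligned: list of equal-length, pre-aligned DNA strings.
    Returns a list of (position, sorted_bases) using 0-based positions.
    Gaps ('-') and ambiguous symbols such as 'N' are ignored, so indels
    are not reported as SNPs.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, column in enumerate(zip(*aligned)):
        bases = {b.upper() for b in column if b.upper() in "ACGT"}
        if len(bases) > 1:
            snps.append((pos, sorted(bases)))
    return snps
```

    A real tool would also need to weigh sequencing quality and how often each variant base occurs before calling a column polymorphic; this sketch flags any column with two or more distinct bases.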

    A Cereal Chemist's Quick Guide to Genetics, Plant Breeding and BioIT

    This book is intended as a guide for cereal chemists in quality testing laboratories and grain product development companies, to help them in their understanding of fundamental genetics, functional genomics and other concepts of relevance during their interactions with crop breeding programs. Consequently, the emphasis is on quick definitions of terms and concepts, assuming that the expertise of the reader is predominantly in another field. Established and supported under the Australian Government’s Cooperative Research Centre Progra

    Multi-omics molecular profiling of lung tumours

    Lung Cancer (LC) is one of the most common malignancies and is the leading cause of cancer death worldwide among both men and women. Current LC classifications are based on histopathological features, which poorly reflect the molecular diversity of these tumours. Consequently, primary and secondary drug resistance are very frequent, and high mortality is usual in LC patients. Although LC has been intensively studied, there is a lack of effective biomarkers for early detection, stratification and prognosis. Integration of omics data is a powerful approach that can be used to identify molecular subgroups relevant in the clinical setting. This thesis addresses this challenge by characterising the molecular alterations accompanying LC at the genetic and DNA methylation level, using a combination of Whole-Exome Sequencing (WES), Targeted Capture Sequencing (TCS), Single Nucleotide Polymorphism (SNP) genotyping, Whole-Genome Bisulfite Sequencing and RNA-sequencing. The integration of different types of omics data first validated previously reported molecular alterations in frequently diagnosed LC tumours. This allowed comparison of the genomic and epigenomic landscapes between these common and rarer LC subtypes. Next, novel molecular subgroups of Non-Small Cell Lung Cancer (NSCLC) tumours with poor prognosis, as well as subgroups of Lung Carcinoids (L-CDs, an understudied LC subtype), were identified and their molecular alterations and signatures characterised. Significant associations with histological features and gene expression programmes were found using several bioinformatic tools. These results show the value of multi-omics approaches to better understand the molecular mechanisms underlying LC and to identify new biomarkers. Importantly, some of these findings may be translatable and are likely to improve detection, monitoring and stratification for targeted therapies in LC patients.

    Whole exome sequencing of chronic lymphocytic leukemia (CLL)
