Large-scale analysis of microRNA evolution.

Author: Enright Anton J
Guerra-Assunção José Afonso
Publication venue: BMC Genomics
Publication date: 01/01/2012

BACKGROUND: In animals, microRNAs (miRNAs) are important genetic regulators. Animal miRNAs appear to have expanded in conjunction with an escalation in complexity during early bilaterian evolution. Their small size and high degree of similarity make them challenging for phylogenetic approaches. Furthermore, genomic locations encoding miRNAs are not clearly defined in many species. A number of studies have looked at the evolution of individual miRNA families. However, we currently lack resources for large-scale analysis of miRNA evolution. RESULTS: We addressed some of these issues in order to analyse the evolution of miRNAs. We perform syntenic and phylogenetic analysis for miRNAs from 80 animal species. We present synteny maps, phylogenies and functional data for miRNAs across these species. These data represent the basis of our analyses and also act as a resource for the community. CONCLUSIONS: We use these data to explore the distribution of miRNAs across phylogenetic space, characterise their birth and death, and examine functional relationships between miRNAs and other genes. These data confirm a number of previously reported findings on a larger scale and also offer novel insights into the evolution of the miRNA repertoire in animals and its genomic organization.

Springer - Publisher Connector

Recommended from our members

Evolutionary analysis of animal microRNAs

Author: Guerra Martins dos Santos Assunção José Afonso
Publication venue: University of Cambridge
Publication date: 08/01/2013

In recent years, microRNAs (miRNAs) have been recognised as important genetic regulators of gene expression in animals and plants. They can potentially target a large fraction of the cellular transcriptome, having been shown to be important for diverse biological processes such as development, cell differentiation, proliferation and metabolism. The publication of the human genome in 2001 marked the start of a great community effort to sequence a variety of other species. These data have great potential for comparative genomics, which can lead to better biological understanding. Some miRNA families are known to be highly conserved across long evolutionary distances, with many found in co-transcribed clusters across the genome. While these phenomena have been previously reported, a large-scale analysis of evolutionary patterns was still lacking. Furthermore, the rate at which new relevant data is being made available makes it challenging to keep up, and many of the evolutionary studies performed before are now significantly out of date. This thesis describes a number of approaches taken to analyse miRNA datasets, harnessing the full potential of currently available data for comparative genomics. These were used not only to revisit many of the notions in the field with a larger and updated dataset, but also to develop novel strategies that enable a coherent view of miRNA evolution at different evolutionary time-scales. A new tool, described within this thesis, was developed for large-scale, species-independent miRNA mapping. An assessment of the evolution of the miRNA repertoire across species was performed, together with detailed sequence conservation analysis and miRNA family clustering. Phylogenetic profile analysis uncovered interesting co-evolution between miRNAs and protein-coding genes. The genomic organisation of miRNAs and their conservation across species was also studied, providing detailed conserved synteny maps for miRNAs and proteins across more than 80 species. Finally, at the intra-specific level, I analysed the occurrence of single nucleotide polymorphisms affecting miRNA loci or their predicted target sites. All the tools built and integrated in this research were made available to the community and designed to be easily updated, making it easier to keep up with the data that is constantly being made available. Many aspects of miRNA biology are still being uncovered, and the ability to easily put these findings into an evolutionary context will potentially be useful for the community.

The Personal Genome Project-UK, an open access resource of human multi-omics data

Author: Beck Stephan
Berner Alison
Chervova Olga
Conde Lucia
Guerra-Assunção José Afonso
Hamoudi Rifat
Herrero Javier
Jesus Tiago F.
Larose Cadieux Elizabeth
Moghul Ismail
Tian Yuan
Voloshin Vitaly
Webster Amy P.
Publication venue: Springer Science and Business Media LLC
Publication date: 31/10/2019

Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics

Warwick Research Archives Portal Repository

A blood pressure-associated variant of the SLC39A8 gene influences cellular cadmium accumulation and toxicity.

Author: Alissa
Arthur T. Tucker
Bondjers
Claudio Mauro
Fu Liang Ng
Havlik
José Afonso Guerra-Assunção
Kate Witkowska
Mark J. Caulfield
Meixia Ren
Ruoxin Zhang
Shu Ye
Tellez-Plaza
Varoni
Publication venue: Oxford University Press (OUP)
Publication date: 27/07/2016

Genome-wide association studies have revealed a relationship between inter-individual variation in blood pressure and the single nucleotide polymorphism rs13107325 in the SLC39A8 gene. This gene encodes the ZIP8 protein, which co-transports divalent metal cations, including the heavy metal cadmium, the accumulation of which has been associated with increased blood pressure. The polymorphism results in two variants of ZIP8 with either an alanine (Ala) or a threonine (Thr) at residue 391. We investigated the functional impact of this variant on protein conformation, cadmium transport, activation of signalling pathways and cell viability in relation to blood pressure regulation. Following incubation with cadmium, higher intracellular cadmium was detected in cultured human embryonic kidney cells (HEK293) expressing heterologous ZIP8-Ala391, compared with HEK293 cells expressing heterologous ZIP8-Thr391. This Ala391-associated cadmium accumulation also increased the phosphorylation of the signal transduction molecule ERK2, activation of the transcription factor NFκB, and reduced cell viability. Similarly, vascular endothelial cells with the Ala/Ala genotype had higher intracellular cadmium concentration and lower cell viability than their Ala/Thr counterparts following cadmium exposure. These results indicate that the ZIP8 Ala391-to-Thr391 substitution has an effect on intracellular cadmium accumulation and cell toxicity, providing a potential mechanistic explanation for the association of this genetic variant with blood pressure.

University of Birmingham Research Portal

Queen Mary Research Online

St George's Online Research Archive

Leicester Research Archive

Whole Genome Sequencing Shows a Low Proportion of Tuberculosis Disease Is Attributable to Known Close Contacts in Rural Malawi.

Author: Clark Taane G
Crampin Amelia C
Glynn Judith R
Guerra-Assunção José Afonso
Houben Rein MGJ
Khan Palwasha
McNerney Ruth
Mwaungulu J Nimrod
Mwaungulu Lorrain K
Mzembe Themba
Parkhill Julian
Sichali Lifted
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

BACKGROUND: The proportion of tuberculosis attributable to transmission from close contacts is not well known. Comparison of the genomes of strains from index patients and prior contacts allows transmission to be confirmed or excluded. METHODS: In Karonga District, Malawi, all tuberculosis patients are asked about prior contact with others with tuberculosis. All available strains from culture-positive patients were sequenced. Up to 10 single nucleotide polymorphisms between index patients and their prior contacts were allowed for confirmation, and ≥ 100 for exclusion. The population attributable fraction was estimated from the proportion of confirmed transmissions and the proportion of patients with contacts. RESULTS: From 1997 to 2010 there were 1907 new culture-confirmed tuberculosis patients, of whom 32% reported at least one family contact and an additional 11% had at least one other contact; 60% of contacts had smear-positive disease. Among case-contact pairs with sequences available, transmission was confirmed from 38% (62/163) of smear-positive prior contacts and 0/17 smear-negative prior contacts. Confirmed transmission was more common in those related to the prior contact (42.4%, 56/132) than in non-relatives (19.4%, 6/31, p = 0.02), and in those with more intense contact, to younger index cases, and in more recent years. The proportion of tuberculosis attributable to known contacts was estimated to be 9.4% overall. CONCLUSIONS: In this population, known contacts explained only a small proportion of tuberculosis cases. Even those with a prior family contact with smear-positive tuberculosis were more likely to have acquired their infection elsewhere.
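
The SNP-distance rule stated in the METHODS above can be illustrated with a minimal Python sketch. The two thresholds (up to 10 SNPs to confirm, 100 or more to exclude) come from the abstract; the function names and the "indeterminate" label for distances between the two thresholds are assumptions made here for illustration, not the authors' code. The `percent` helper simply re-derives the percentages reported in the RESULTS from their stated counts.

```python
# Thresholds taken from the abstract.
CONFIRM_MAX_SNPS = 10    # up to 10 SNPs between index and contact: confirmed
EXCLUDE_MIN_SNPS = 100   # 100 or more SNPs: transmission excluded


def classify_pair(snp_distance: int) -> str:
    """Classify an index/contact pair by pairwise SNP distance.

    The label "indeterminate" for distances of 11 to 99 is an assumption;
    the abstract does not name that category.
    """
    if snp_distance < 0:
        raise ValueError("SNP distance cannot be negative")
    if snp_distance <= CONFIRM_MAX_SNPS:
        return "confirmed"
    if snp_distance >= EXCLUDE_MIN_SNPS:
        return "excluded"
    return "indeterminate"


def percent(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


if __name__ == "__main__":
    for distance in (3, 10, 42, 250):
        print(distance, classify_pair(distance))
    # Counts reported in the abstract.
    print(percent(62, 163))   # smear-positive prior contacts
    print(percent(56, 132))   # relatives of the prior contact
    print(percent(6, 31))     # non-relatives
```

The sketch does not attempt to reproduce the 9.4% population attributable fraction, because the abstract does not give the exact formula used.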

LSHTM Research Online

Enlighten

Identifying mixed Mycobacterium tuberculosis infections from whole genome sequence data.

Author: Banda Louis
Clark Taane G
Crampin Amelia C
Glynn Judith R
Guerra-Assunção José Afonso
Houben Rein MGJ
Mallard Kim
McNerney Ruth
Mzembe Themba
Parkhill Julian
Phelan Jody E
Sobkowiak Benjamin
Viveiros Miguel
Publication venue: BMC Genomics
Publication date: 01/01/2018

BACKGROUND: Mixed, polyclonal Mycobacterium tuberculosis infection occurs in natural populations. Developing an effective method for detecting such cases is important in measuring the success of treatment and reconstruction of transmission between patients. Using whole genome sequence (WGS) data, we assess two methods for detecting mixed infection: (i) a combination of the number of heterozygous sites and the proportion of heterozygous sites to total SNPs, and (ii) Bayesian model-based clustering of allele frequencies from sequencing reads at heterozygous sites. RESULTS: In silico and in vitro artificially mixed and known pure M. tuberculosis samples were analysed to determine the specificity and sensitivity of each method. We found that both approaches were effective in distinguishing between pure strains and mixed infection where there was a relatively high (> 10%) proportion of a minor strain in the mixture. A large dataset of clinical isolates (n = 1963) from the Karonga Prevention Study in Northern Malawi was tested to examine correlations of patient characteristics and outcomes with mixed infection. The frequency of mixed infection in the population was found to be around 10%, with an association with year of diagnosis, but no association with age, sex, HIV status or previous tuberculosis. CONCLUSIONS: Mixed Mycobacterium tuberculosis infection was identified in silico using whole genome sequence data. The methods presented here can be applied to population-wide analyses of tuberculosis to estimate the frequency of mixed infection, and to identify individual cases of mixed infections. These cases are important when considering the evolution and transmission of the disease, and in patient treatment.
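
Method (i) above combines two quantities per sample: the count of heterozygous sites and the ratio of heterozygous sites to total SNPs. A minimal Python sketch of that idea follows. The `SampleCalls` type, the function names and the cut-off values used in the example are all hypothetical placeholders chosen for illustration; the abstract does not state the thresholds the authors used, so the cut-offs are required arguments rather than defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleCalls:
    """Per-sample variant summary (hypothetical structure)."""
    het_sites: int    # number of heterozygous sites called
    total_snps: int   # total number of SNPs called


def het_proportion(sample: SampleCalls) -> float:
    """Proportion of heterozygous sites among all SNPs (0.0 if no SNPs)."""
    if sample.total_snps == 0:
        return 0.0
    return sample.het_sites / sample.total_snps


def looks_mixed(sample: SampleCalls, *, min_het_sites: int,
                min_het_proportion: float) -> bool:
    """Flag a sample as possibly mixed when BOTH criteria are met.

    Both thresholds must be supplied by the caller; none is implied here.
    """
    return (sample.het_sites >= min_het_sites
            and het_proportion(sample) >= min_het_proportion)


if __name__ == "__main__":
    # Placeholder cut-offs for demonstration only, not values from the paper.
    cutoffs = {"min_het_sites": 20, "min_het_proportion": 0.02}
    print(looks_mixed(SampleCalls(het_sites=50, total_snps=1000), **cutoffs))
    print(looks_mixed(SampleCalls(het_sites=3, total_snps=1000), **cutoffs))
```

Requiring both criteria reflects the abstract's wording of "a combination" of the two measures; how the authors actually combined them is not specified there.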

LSHTM Research Online

Repositório da Universidade Nova de Lisboa

Integrated analysis of microRNA and mRNA expression and association with HIF binding reveals the complexity of microRNA expression regulation under hypoxia.

Author: Buffa Francesca M
Camps Carme
Choudhry Hani
Enright Anton J
Guerra-Assunção José Afonso
Harris Adrian L
Hatzigeorgiou Artemis G
Mole David R
Ragoussis Jiannis
Reczko Martin
Saini Harpreet K
Tian Ya-Min
Publication venue: Mol Cancer
Publication date: 01/01/2014

BACKGROUND: In mammals, HIF is a master regulator of hypoxia gene expression through direct binding to DNA, while its role in microRNA expression regulation, critical in the hypoxia response, is not elucidated genome wide. Our aim is to investigate in depth the regulation of microRNA expression by hypoxia in the breast cancer cell line MCF-7, and to establish the relationship between microRNA expression and HIF binding sites, pri-miRNA transcription and microRNA processing gene expression. METHODS: MCF-7 cells were incubated at 1% oxygen for 16, 32 and 48 h. siRNA knockdowns of HIF-1α and HIF-2α were performed as previously published. MicroRNA and mRNA expression were assessed using microRNA microarrays, small RNA sequencing, gene expression microarrays and real-time PCR. The Kraken pipeline was applied for microRNA-seq analysis along with Bioconductor packages. Microarray data were analysed using Limma (Bioconductor), ChIP-seq data were analysed using Gene Set Enrichment Analysis, and multiple testing correction was applied in all analyses. RESULTS: Hypoxia time course microRNA sequencing data analysis identified 41 microRNAs significantly up- and 28 down-regulated, including hsa-miR-4521, hsa-miR-145-3p and hsa-miR-222-5p, reported in conjunction with hypoxia for the first time. Integration of HIF-1α and HIF-2α ChIP-seq data with expression data showed overall association between binding sites and microRNA up-regulation, with hsa-miR-210-3p and microRNAs of the miR-27a/23a/24-2 and miR-30b/30d clusters as predominant examples. Moreover, the expression of hsa-miR-27a-3p and hsa-miR-24-3p was found to be positively associated with a hypoxia gene signature in breast cancer. Gene expression analysis showed no full coordination between pri-miRNA and microRNA expression, pointing towards additional levels of regulation. Several transcripts involved in microRNA processing were found to be regulated by hypoxia, of which DICER (down-regulated) and AGO4 (up-regulated) were HIF dependent. DICER expression was found to be inversely correlated with hypoxia in breast cancer. CONCLUSIONS: Integrated analysis of microRNA, mRNA and ChIP-seq data in a model cell line supports the hypothesis that microRNA expression under hypoxia is regulated at the transcriptional and post-transcriptional levels, with the presence of HIF binding sites at microRNA genomic loci associated with up-regulation. The identification of hypoxia and HIF regulated microRNAs relevant for breast cancer is important for our understanding of disease development and design of therapeutic interventions.

Springer - Publisher Connector

Oxford University Research Archive

Elsevier - Publisher Connector

PolyTB: a genomic variation map for Mycobacterium tuberculosis.

Author: Clark Taane G
Coll Francesc
Drobniewski Francis
Gagneux Sebastien
Glynn Judith R
Guerra-Assunção José Afonso
Harris David
Hill-Cawthorn Grant
Martin Nigel
McNerney Ruth
Pain Arnab
Parkhill Julian
Perdigão João
Portugal Isabel
Preston Mark
Viveiros Miguel
Publication venue: Elsevier
Publication date: 01/01/2014

Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest

LSHTM Research Online

edoc