
146 research outputs found

Allele-specific miRNA-binding analysis identifies candidate target genes for breast cancer risk

Author: A Gilam
A Kozomara
A Maia
A-r Lee
A-T Maia
AC Antoniou
AD Johnson
AJ Enright
AM Burger
B John
B Li
B Panwar
BL Brewster
BP Lewis
CC Lord
D Betel
D Welter
DM Glubb
DP Bartel
DR Zerbino
ER Gamazon
F Kassie
FJ Couch
G Fehringer
G Sun
G Wang
GP Wagner
HS Lo
J Gong
J Lonsdale
J MacArthur
J Wynendaele
J Xavier
J Zhu
JD French
K Lawrenson
K Michailidou
K Michailidou
K Michailidou
KB Meyer
LJ Chin
LW Wattenberg
M Ghoussaini
M Mele
M Morley
M Wang
MJ Li
ML Freedman
MS Nicoloso
MT Maurano
P Flicek
Q Li
R Liu
S Durinck
S-l Hu
SL Edwards
T Koguchi
T Pastinen
T Pastinen
TG Consortium
TGP Consortium
V Agarwal
W McLaren
Y Chen
Y Hamdi
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Most breast cancer (BC) risk-associated single-nucleotide polymorphisms (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may be affecting microRNA (miRNA) genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms, TargetScan and miRanda, to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5×10⁻⁸) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. Also, 11.5% (6 out of 52) raSNPs located in 3'-untranslated regions of putative miRNA target genes were predicted to alter miRNA::mRNA (messenger RNA) pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction, and data supporting cis-regulation of expression, improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases.

Funding Agency: Portuguese Foundation for Science and Technology; CRESC ALGARVE 2020; European Union (EU) 303745; Maratona da Saude Award; DL 57/2016/CP1361/CT0042; SFRH/BPD/99502/2014; CBMR-UID/BIM/04773/2013; POCI-01-0145-FEDER-022184
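The idea of an allele-specific query mentioned in the abstract above can be illustrated with a toy sketch. It only checks for an exact seed match (miRNA nucleotides 2-8) against the reference and alternative allele of a 3'-UTR fragment; it is not the TargetScan or miRanda algorithm, which score binding in much more detail. The miRNA and UTR sequences, and the function names, are invented for illustration.

```python
# Toy allele-specific seed-match check (assumption: exact 7-nt seed pairing only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Return the mRNA site (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # nucleotides 2-8, 1-based
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def allele_specific_seed_match(utr_ref: str, utr_alt: str, mirna: str) -> dict:
    """Report whether each allele of the UTR fragment contains the seed site."""
    site = seed_site(mirna)
    return {"ref": site in utr_ref, "alt": site in utr_alt}


# Hypothetical sequences: the two UTR alleles differ at a single position.
mirna = "UAAGCUAGGACUUCGAUGCAAU"
utr_ref = "GGAUCUAGCUUACCG"
utr_alt = "GGAUCUAGGUUACCG"

result = allele_specific_seed_match(utr_ref, utr_alt, mirna)
print(result)  # the site is present on one allele only
```

In this toy case the single-nucleotide change removes the seed site from the alternative allele, which is the kind of difference an allele-specific query is meant to surface.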

Crossref

Sapientia

Assembly complexity of prokaryotic genomes using short reads

Author: A Guénoche
AR Rubinov
B Bollobás
B Haubold
C Smith
Carl Kingsford
D Gusfield
DH Huson
DR Zerbino
Dvan den Broek
E Myers
EW Myers
I Simon
J Butler
J Parkhill
JAA Quitzau
JC Dohm
JP Hutchinson
JP Hutchinson
M Antoniotti
M Margulies
Michael C Schatz
Mihai Pop
MJ Chaisson
MJ Chaisson
MS Waterman
N de Bruijn
N Whiteford
OG Troyanskaya
P Medvedev
PA Pevzner
PA Pevzner
R Barrangou
R Idury
S Batzoglou
T van Aardenne-Ehrenfest
TD Harris
WR Jeck
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.
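As a rough illustration of how a de Bruijn graph exposes repeat structure, the sketch below builds the graph for an invented toy sequence and lists the nodes with more than one successor, where a reconstruction becomes ambiguous. It is a minimal sketch under simplifying assumptions (a single strand, error-free k-mers taken directly from the sequence, k standing in for read length), not the analysis used in the paper.

```python
from collections import defaultdict


def de_bruijn_graph(sequence: str, k: int) -> dict:
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in `sequence`."""
    graph = defaultdict(set)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        graph[kmer[:-1]].add(kmer[1:])  # edge: prefix -> suffix
    return dict(graph)


def branching_nodes(graph: dict) -> list:
    """Return nodes with more than one distinct successor (ambiguous extensions)."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Invented toy sequence containing the repeat "ACG" twice.
toy_genome = "AACGTTACGGA"

for k in (3, 4, 5):
    print(k, branching_nodes(de_bruijn_graph(toy_genome, k)))
```

With k = 3 or 4 the repeated "ACG" creates a branching node, while with k = 5 every k-mer spans the repeat and the branching disappears, mirroring the point that longer reads resolve more of a genome's repeats.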

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland

Comparing de novo assemblers for 454 transcriptome data

Author: A Barakat
A Guffanti
A Papanicolaou
AJ Enright
AL Eveland
AP Weber
B Chevreux
C Cantacessi
C Soderlund
C Sun
D Bellin
D Schwarz
DA Hahn
DR Zerbino
E Ghedin
E Kristiansson
E Meyer
E Novaes
F Cheung
F Cheung
F Roeding
F Zhang
FD Guerrero
G Pertea
H Wang
I Birol
I Milne
J Schmid
JC Vega-Arreguín
JC Vera
JE Allen
JR Monaghan
L Ferguson
M Margulies
M Zagrobelny
Mark L Blaxter
MS Barker
MS Clark
N Palmieri
PK Wall
RE Timme
RL Tatusov
RT Miller
S Altschul
S Jackman
S Zeng
SJ Emrich
SR Swindell
Sujai Kumar
TL Parchman
W Wang
WJ Kent
X Huang
Y Pauchet
Y Pauchet
Z Ning
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Roche 454 pyrosequencing has become a method of choice for generating transcriptome data from non-model organisms. Once the tens to hundreds of thousands of short (250-450 base) reads have been produced, it is important to correctly assemble these to estimate the sequence of all the transcripts. Most transcriptome assembly projects use only one program for assembling 454 pyrosequencing reads, but there is no evidence that the programs used to date are optimal. We have carried out a systematic comparison of five assemblers (CAP3, MIRA, Newbler, SeqMan and CLC) to establish best practices for transcriptome assemblies, using a new dataset from the parasitic nematode Litomosoides sigmodontis. Results: Although no single assembler performed best on all our criteria, Newbler 2.5 gave longer contigs, better alignments to some reference sequences, and was fast and easy to use. SeqMan assemblies performed best on the criterion of recapitulating known transcripts, and had more novel sequence than the other assemblers, but generated an excess of small, redundant contigs. The remaining assemblers all performed almost as well, with the exception of Newbler 2.3 (the version currently used by most assembly projects), which generated assemblies that had significantly lower total length. As different assemblers use different underlying algorithms to generate contigs, we also explored merging of assemblies and found that the merged datasets not only aligned better to reference sequences than individual assemblies, but were also more consistent in the number and size of contigs. Conclusions: Transcriptome assemblies are smaller than genome assemblies and thus should be more computationally tractable, but are often harder because individual contigs can have highly variable read coverage. Comparing single assemblers, Newbler 2.5 performed best on our trial data set, but other assemblers were closely comparable. Combining differently optimal assemblies from different programs, however, gave a more credible final product, and this strategy is recommended.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Linkage Mapping and Comparative Genomics Using Next-Generation RAD Sequencing of a Non-Model Organism

Author: AE Van't Hof
Anthony M. Shelton
BE Suzek
C Camacho
CA Zraket
CD Jiggins
Chris D. Jiggins
CL Peichel
CN Sun
D Zerbino
David G. Heckel
DR Bentley
E d'Alencon
EG Pringle
J. Spencer Johnston
John W. Davey
JRG Turner
JW Davey
JZ Zhao
K Emerson
K Yamamoto
Mark L. Blaxter
MR Miller
MS Lampropoulou
NA Baird
NS Talekar
P Beldade
P Colosimo
PA Hohenlohe
Pär K. Ingvarsson
QY Xia
SC Kim
Simon W. Baxter
SW Baxter
SW Baxter
T Fujii
T Maeda
W Pfender
Y Chutimanitsakun
Y Yasukochi
Publication venue: Public Library of Science
Publication date: 01/01/2011

Restriction-site associated DNA (RAD) sequencing is a powerful new method for targeted sequencing across the genomes of many individuals. This approach has broad potential for genetic analysis of non-model organisms including genotype-phenotype association mapping, phylogeography, population genetics and scaffolding genome assemblies through linkage mapping. We constructed a RAD library using genomic DNA from a Plutella xylostella (diamondback moth) backcross that segregated for resistance to the insecticide spinosad. Sequencing of 24 individuals was performed on a single Illumina GAIIx lane (51 base paired-end reads). Taking advantage of the lack of crossing over in homologous chromosomes in female Lepidoptera, 3,177 maternally inherited RAD alleles were assigned to the 31 chromosomes, enabling identification of the spinosad resistance and W/Z sex chromosomes. Paired-end reads for each RAD allele were assembled into contigs and compared to the genome of Bombyx mori (n = 28) using BLAST, revealing 28 homologous matches plus 3 expected fusion/breakage events which account for the difference in chromosome number. A genome-wide linkage map (1292 cM) was inferred with 2,878 segregating RAD alleles inherited from the backcross father, producing chromosome and location specific sequenced RAD markers. Here we have used RAD sequencing to construct a genetic linkage map de novo for an organism that has no previous genome data. Comparative analysis of P. xylostella linkage groups with B. mori chromosomes shows, for the first time, that genetic synteny appears common beyond the Macrolepidoptera. RAD sequencing is a powerful system capable of rapidly generating chromosome specific data for non-model organisms.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Adelaide Research & Scholarship

Directory of Open Access Journals

PubMed Central

Texas A&M Repository

Edinburgh Research Explorer

MPG.PuRe

University of Melbourne Institutional Repository

RNA-Seq reveals large quantitative differences between the transcriptomes of outbreak and non-outbreak locusts

Author: A Ayali
A Bouaichi
A Conesa
A Mortazavi
AB Hamouda
AI Tawfik
AI Tawfik
AL Deng
AR McCaffery
BF Hägele
BF Hägele
BP Uvarov
CW Whitfield
DO Ogoyi
DR Zerbino
G Wiesel
GA Miller
GA Sword
H Injeyan
H Li
H Li
HJ Ferenz
I Birol
J Cabrero
JA Dusek
JA Veenstra
JC Dohm
JM Camacho
JR Miller
JT Simpson
K Maeno
L Badisco
L Kang
M Bakkali
M Ruiz-Estevez
MB Hiel Van
ML Anstey
MS Islam
MS Islam
P Davey
P Roessingh
P Roessingh
P Symmons
PE Ellis
PE Ellis
R Martin-Blazquez
R Martín-Blázquez
R Sugahara
R Wu
RL Lester
RR Nayak
S Chen
S Simpson
S Simpson
S Tanaka
S Tanaka
S Tanaka
SD Gillett
SJ Simpson
SM Rogers
SM Rogers
SR Ott
W Guo
X Huang
Y Heifetz
Y Wang
Z Ma
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Outbreaks of locust populations repeatedly devastate economies and ecosystems in large parts of the world. The consequent behavioural shift from solitarious to gregarious and the concomitant changes in the locusts’ biology are of relevant scientific interest. Yet, research on the main locust species has not benefitted from recent advances in genomics. In this first RNA-Seq study on Schistocerca gregaria, we report two transcriptomes, including many novel genes, as well as differential gene expression results. In line with the large biological differences between solitarious and gregarious locusts, almost half of the transcripts are differentially expressed between their central nervous systems. Most of these transcripts are over-expressed in the gregarious locusts, suggesting positive correlations between the levels of activity at the population, individual, tissue and gene expression levels. We group these differentially expressed transcripts by gene function and highlight those that are most likely to be associated with locusts’ phase change either in a species-specific or general manner. Finally, we discuss our findings in the context of population-level and physiological events leading to gregariousness.

M. Bakkali wishes to thank the Spanish Ministerio de Ciencia y Tecnología for the Ramón y Cajal fellowship and for the BFU2010-16438 grant that supported both this research and the FPI studentship to Rubén Martín Blázquez. We thank Mrs. Pernille Lavgesen for revision of the English language writing of this manuscript. We also thank the editor for the valuable comments on the manuscript.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional Universidad de Granada

Pathoadaptive mutations of Escherichia coli K1 in experimental neonatal systemic infection

Author: A Zelmer
AJ Fabich
AJ McCarthy
Alex J. McCarthy
AM Bolger
C Hesslinger
Catarina Pechincha
D Harvey
David Negus
DL Kasper
DR Zerbino
E Ozyamak
Eric Oswald
F Dalgakiran
F Ørskov
FR Blattner
G Croxall
G Pluschke
G Sawers
GMH Birchenough
H Li
H Li
I Kim
I Saint Girons
J Blanco
JB Robbins
K Rutherford
KA Datsenko
KA Simonsen
LA Witcomb
LD Sarff
M Achtman
M Obata-Yasuoka
MN Price
MP Glode
MP Leatham
MS Schiffer
Muna Anjum
N Mushtaq
N Mushtaq
N Peekhaus
NF Alikhan
Patricia Martin
Peter W. Taylor
RA Polin
RD Berg
Richard A. Stabler
RR Chaudhuri
S Tomlinson
SC Soares
T Conway
T Rausch
TJ Carver
TK Korhonen
X Dong
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

Although Escherichia coli K1 strains are benign commensals in adults, their acquisition at birth by the newborn may result in life-threatening systemic infections, most commonly sepsis and meningitis. Key features of these infections, including stable gastrointestinal (GI) colonization and age-dependent invasion of the bloodstream, can be replicated in the neonatal rat. We previously increased the capacity of a septicemia isolate of E. coli K1 to elicit systemic infection following colonization of the small intestine by serial passage through two-day-old (P2) rat pups. The passaged strain, A192PP (belonging to sequence type 95), induces lethal infection in all pups fed 2–6 × 10⁶ CFU. Here we use whole-genome sequencing to identify mutations responsible for the threefold increase in lethality between the initial clinical isolate and the passaged derivative. Only four single nucleotide polymorphisms (SNPs), in genes (gloB, yjgV, tdcE) or promoters (thrA) involved in metabolic functions, were found: no changes were detected in genes encoding virulence determinants associated with the invasive potential of E. coli K1. The passaged strain differed in carbon source utilization in comparison to the clinical isolate, most notably its inability to metabolize glucose for growth. Deletion of each of the four genes from the E. coli A192PP chromosome altered the proteome, reduced the number of colonizing bacteria in the small intestine and increased the number of P2 survivors. This work indicates that changes in metabolic potential lead to increased colonization of the neonatal GI tract, increasing the potential for translocation across the GI epithelium into the systemic circulation.

Public Library of Science (PLOS)

Crossref

LSHTM Research Online

Nottingham Trent Institutional Repository (IRep)

HAL-Inserm

Directory of Open Access Journals

PubMed Central

UCL Discovery

Biology of archaea from a novel family Cuniculiplasmataceae (Thermoplasmata) ubiquitous in hyperacidic environments

Author: AD Baughn
AF Andersson
AL Ducluzeau
AP Yelton
AR Pavlov
BJ Baker
BK Dhillon
C Méndez-García
C Méndez-García
CJ Castelle
CR Pointon
D Gordon
DR Zerbino
DS Jones
E Aizenman
E Desmond
EP Navrocki
F Meyer
FL Sousa
FL Sousa
G Schneider
GJ Dick
GW Tyson
J Goris
J Guo
JP Barnett
K Katoh
K Lassak
K Tamura
KS Makarova
KS Makarova
KS Makarova
LM Rodriguez-R
LX Chen
M Tsuda
MA Kozubal
MH Saier
MS Muntyan
NB Justice
ND Rawlings
O. Fütterer
Olga V. Golyshina
OV Golyshina
OV Golyshina
OV Golyshina
OV Golyshina
R Leinonen
R Mueller
S Kato
SF Altschul
SK Christensen
SW Burge
V Bonnefoy
VB Borisov
VM Markovitz
X Lin
YI Wolf
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2016

The order Thermoplasmatales (Euryarchaeota) is represented by the most acidophilic organisms known so far that are poorly amenable to cultivation. Earlier culture-independent studies in Iron Mountain (California) pointed at an abundant archaeal group, dubbed 'G-plasma'. We examined the genomes and physiology of two cultured representatives of the family Cuniculiplasmataceae, recently isolated from acidic (pH 1-1.5) sites in Spain and the UK, that are 16S rRNA gene sequence-identical with 'G-plasma'. The organisms had the largest genomes among Thermoplasmatales (1.87-1.94 Mbp), which shared 98.7-98.8% average nucleotide identities between themselves and 'G-plasma' and exhibited a high genome conservation even within their genomic islands, despite their remote geographical localisations. Facultatively anaerobic heterotrophs, they possess an ancestral form of A-type terminal oxygen reductase from a distinct parental clade. The lack of complete pathways for biosynthesis of histidine, valine, leucine, isoleucine, lysine and proline pre-determines the reliance on external sources of amino acids and hence the lifestyle of these organisms as scavengers of proteinaceous compounds from surrounding microbial community members. In contrast to earlier metagenomics-based assumptions, isolates were S-layer-deficient, non-motile, non-methylotrophic and devoid of iron-oxidation despite the abundance of methylotrophy substrates and ferrous iron in situ, which underlines the essentiality of experimental validation of bioinformatic predictions.

Helmholtz Zentrum für Infektionsforschung Repository

Crossref

PubMed Central

Bangor University Research Portal

Capturing the cloud of diversity reveals complexity and heterogeneity of MRSA carriage, infection and transmission.

Genome sequencing is revolutionizing clinical microbiology and our understanding of infectious diseases. Previous studies have largely relied on the sequencing of a single isolate from each individual. However, it is not clear what degree of bacterial diversity exists within, and is transmitted between individuals. Understanding this 'cloud of diversity' is key to accurate identification of transmission pathways. Here, we report the deep sequencing of methicillin-resistant Staphylococcus aureus among staff and animal patients involved in a transmission network at a veterinary hospital. We demonstrate considerable within-host diversity and that within-host diversity may rise and fall over time. Isolates from invasive disease contained multiple mutations in the same genes, including inactivation of a global regulator of virulence and changes in phage copy number. This study highlights the need for sequencing of multiple isolates from individuals to gain an accurate picture of transmission networks and to further understand the basis of pathogenesis.

Thanks to Dr Alex O’Neill, University of Leeds and Dr Matthew Ellington, Public Health England for provision of RN4220 and RN4200mutS. We thank the core sequencing and informatics team at the Wellcome Trust Sanger Institute for sequencing of the isolates described in this study. This work was supported by a Medical Research Council Partnership grant (G1001787/1) held between the Department of Veterinary Medicine, University of Cambridge (M.A.H.), the School of Clinical Medicine, University of Cambridge (S.J.P.), the Moredun Research Institute, and the Wellcome Trust Sanger Institute (J.P. and S.J.P). S.J.P. receives support from the NIHR Cambridge Biomedical Research Centre. M.T.G.H., S.R.H. and J.P. were funded by Wellcome Trust grant no. 098051. G.G.R.M. was funded by an MRC studentship.

This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/10.1038/ncomms756

Repository@Hull - Worktribe

Crossref

PubMed Central

Edinburgh Research Explorer

Apollo (Cambridge)

University of St. Andrews - Pure

St Andrews Research Repository

Evaluation of next-generation sequencing software in mapping and assembly

Author: A Bashir
A Bateman
AC McHardy
AD Smith
B Langmead
BinBin Wang
C Trapnell
CA Tilford
D Campagna
D Hernandez
D Weese
DR Bentley
DR Zerbino
DS Horner
DW Bryant Jr
ER Mardis
ER Mardis
ES Lander
EW Myers
F Sanger
H Jiang
H Li
H Li
H Li
H Lin
HL Eaves
J Butler
JC Dohm
JC Venter
JO Korbel
JR Miller
JR Miller
JT Simpson
JT Simpson
K Chen
KE Holt
L Engstrand
L Noe
M Margulies
M Pop
M Pop
MC Schatz
MJ Chaisson
ML Metzker
MS Hossain
N Homer
N Malhis
NL Clement
O Morozova
O Morozova
P Flicek
P Flicek
P Medvedev
PA Pevzner
PJ Campbell
PJ Hurd
R Staden
RF Service
RL Warren
RQ Li
RQ Li
Rui Jiang
SC Schuster
SM Rumble
Suying Bao
WingKeung Kwan
WJ Ansorge
WR Jeck
Xu Ma
Y Chen
YJ Kim
You-Qiang Song
Z Ning
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Next-generation high-throughput DNA sequencing technologies have advanced progressively in sequence-based genomic research and novel biological applications with the promise of sequencing DNA at unprecedented speed. These new non-Sanger-based technologies feature several advantages when compared with traditional sequencing methods in terms of higher sequencing speed, lower per run cost and higher accuracy. However, reads from next-generation sequencing (NGS) platforms, such as 454/Roche, ABI/SOLiD and Illumina/Solexa, are usually short, thereby restricting the applications of NGS platforms in genome assembly and annotation. We presented an overview of the challenges that these novel technologies meet and, in particular, illustrated various bioinformatics approaches to mapping and assembly. We then compared the performance of several programs in these two fields, and further provided advice on selecting suitable tools for specific biological applications.

Crossref

HKU Scholars Hub

Exploring the Zoonotic Potential of Mycobacterium avium Subspecies paratuberculosis through Comparative Genomics

A comparative genomics approach was utilised to compare the genomes of Mycobacterium avium subspecies paratuberculosis (MAP) isolated from early onset paediatric Crohn's disease (CD) patients as well as Johne's diseased animals. Draft genome sequences were produced for MAP isolates derived from four CD patients, one ulcerative colitis (UC) patient, and two non-inflammatory bowel disease (IBD) control individuals using Illumina sequencing, complemented by comparative genome hybridisation (CGH). MAP isolates derived from two bovine and one ovine host were also subjected to whole genome sequencing and CGH. All seven human derived MAP isolates were highly genetically similar and clustered together with one bovine type isolate following phylogenetic analysis. Three other sequenced isolates (including the reference bovine derived isolate K10) were genetically distinct. The human isolates contained two large tandem duplications, the organisations of which were confirmed by PCR. Designated vGI-17 and vGI-18, these duplications spanned 63 and 109 open reading frames, respectively. PCR screening of over 30 additional MAP isolates (3 human derived, 27 animal derived and one environmental isolate) confirmed that vGI-17 and vGI-18 are common across many isolates. Quantitative real-time PCR of vGI-17 demonstrated that the proportion of cells containing the vGI-17 duplication varied from 0.01 to 15% amongst isolates, with human isolates containing a higher proportion of vGI-17 compared to most animal isolates. These findings suggest these duplications are transient genomic rearrangements. We hypothesise that the over-representation of vGI-17 in human derived MAP strains may enhance their ability to infect or persist within a human host by increasing genome redundancy and conferring crude regulation of protein expression across biologically important regions.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

St George's Online Research Archive

University of Melbourne Institutional Repository