719 research outputs found

HLA predictions from long sequence read alignments, streamed directly into HLAminer

Author: Warren René L.
Publication date: 19/09/2022

The rapidly changing landscape of sequencing technologies brings new opportunities to genomics research. Longer sequence reads and higher sequence throughput, coupled with ever-improving base accuracy and decreasing per-base cost, are now making long reads suitable for analyzing polymorphic regions of the human genome, such as those of the human leucocyte antigen (HLA) gene complex. Here I present a simple protocol for predicting HLA signatures from whole genome shotgun (WGS) long sequencing reads, by directly streaming sequence alignments into HLAminer. The method is as simple as running minimap2, it scales with the number of sequences to align, and it can be used with any read aligner capable of SAM format output, without the need to store bulky alignment files to disk. I show how the predictions are robust even with older and less [base] accurate WGS nanopore datasets and relatively low (10X) sequence coverage, and present a step-by-step protocol to predict HLA class I and II genes from the long sequencing reads of modern third-generation technologies.

Comment: 4 pages, 3 tables
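The idea of consuming alignments as a stream, rather than from a stored file, can be illustrated with a minimal Python sketch. It is not HLAminer's scoring logic, which the abstract does not describe; the function name `tally_reference_hits` is hypothetical, and simply counting mapped records per reference sequence is an assumption made for illustration. The only format facts relied on are standard SAM conventions: header lines start with `@`, fields are tab-separated, the second field is the bitwise FLAG (bit 0x4 means unmapped) and the third is the reference name.

```python
from collections import Counter


def tally_reference_hits(sam_lines):
    """Count mapped alignment records per reference name.

    `sam_lines` is any iterable of SAM text lines (a file object, a pipe,
    or a list), so records are processed one at a time and never stored.
    This tally is an illustrative stand-in, not HLAminer's actual method.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM record has 11 mandatory fields
        flag = int(fields[1])
        reference_name = fields[2]
        if flag & 0x4 or reference_name == "*":
            continue  # read is unmapped
        counts[reference_name] += 1
    return counts
```

Because the function accepts any iterable of lines, it could be fed `sys.stdin` at the end of an aligner pipe, which is the sense in which no alignment file needs to be written to disk.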

arXiv.org e-Print Archive

Targeted Assembly of Short Sequence Reads

Author: H Li
JD Freeman
JT Simpson
LD Stein
M Rasmussen
Olivier Lespinet
R Goya
R Li
R Morin
René L. Warren
RK Nam
RL Warren
RM Durbin
Robert A. Holt
S Nacu
SP Shah
WR Jeck
Publication date: 01/01/2011

As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants, by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphism, fusion and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy-to-use tool for targeted assembly.
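The read-recruitment step described in this abstract (keeping only reads that share an exact short word with a target sequence) can be sketched in a few lines of Python. This is an illustrative sketch, not TASR's code: the function names, the word length `k`, and the choice to index both strands of the target are assumptions, and the subsequent stringent assembly step is not shown.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def target_words(target, k):
    """Collect every k-length word of the target, on both strands."""
    words = set()
    for strand in (target, reverse_complement(target)):
        for i in range(len(strand) - k + 1):
            words.add(strand[i:i + k])
    return words


def recruit_reads(reads, target, k=15):
    """Keep reads that contain at least one exact k-length word of the target.

    The value of k and the both-strands index are assumptions made for
    this sketch; the abstract only says "all possible short words".
    """
    words = target_words(target.upper(), k)
    kept = []
    for read in reads:
        upper = read.upper()
        if any(upper[i:i + k] in words for i in range(len(upper) - k + 1)):
            kept.append(read)
    return kept
```

Only the recruited reads would then be passed to an assembler, which is why the approach avoids aligning or assembling the whole data set.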

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Nature Precedings

ntLink: a toolkit for de novo genome assembly scaffolding and mapping using long reads

Author: Birol Inanc
Coombe Lauren
Nikolic Vladimir
Warren René L.
Wong Johnathan
Publication date: 20/01/2023

With the increasing affordability and accessibility of genome sequencing data, de novo genome assembly is an important first step to a wide variety of downstream studies and analyses. Therefore, bioinformatics tools that enable the generation of high-quality genome assemblies in a computationally efficient manner are essential. Recent developments in long-read sequencing technologies have greatly benefited genome assembly work, including scaffolding, by providing long-range evidence that can aid in resolving the challenging repetitive regions of complex genomes. ntLink is a flexible and resource-efficient genome scaffolding tool that utilizes long-read sequencing data to improve upon draft genome assemblies built from any sequencing technology, including the same long reads. Instead of using read alignments to identify candidate joins, ntLink utilizes minimizer-based mappings to infer how input sequences should be ordered and oriented into scaffolds. Recent improvements to ntLink have added important features such as overlap detection, gap-filling and in-code scaffolding iterations. Here, we present three basic protocols demonstrating how to use each of these new features to yield highly contiguous genome assemblies, while still maintaining ntLink's proven computational efficiency. Further, as we illustrate in the alternate protocols, the lightweight minimizer-based mappings that enable ntLink scaffolding can also be utilized for other downstream applications, such as misassembly detection. With its modularity and multiple modes of execution, ntLink has broad benefit to the genomics community, from genome scaffolding and beyond. ntLink is an open-source project and is freely available from https://github.com/bcgsc/ntLink.

Comment: 23 pages, 2 figures
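To make "minimizer-based mappings" concrete, here is a minimal Python sketch of the general (w, k)-minimizer technique: within each window of w consecutive k-mers, keep the smallest one, then compare sequences by the minimizers they share. It is a textbook illustration only and not ntLink's implementation; the function names are hypothetical, and ordering k-mers lexicographically (rather than by a hash value) is an assumption chosen to keep the example short.

```python
def minimizers(seq, k=5, w=4):
    """Return the sorted list of (position, kmer) minimizers of seq.

    Each window of w consecutive k-mers contributes its lexicographically
    smallest k-mer (leftmost on ties). Lexicographic ordering is an
    assumption of this sketch.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    chosen = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: (window[j], j))
        chosen.add((start + offset, window[offset]))
    return sorted(chosen)


def shared_minimizers(seq_a, seq_b, k=5, w=4):
    """Return the set of minimizer k-mers that two sequences have in common."""
    kmers_a = {kmer for _, kmer in minimizers(seq_a, k, w)}
    kmers_b = {kmer for _, kmer in minimizers(seq_b, k, w)}
    return kmers_a & kmers_b
```

Because only a subset of k-mers is kept per sequence, a long read and a contig can be linked by looking up shared minimizers instead of computing a full alignment, which is the source of the efficiency the abstract refers to.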

arXiv.org e-Print Archive

Porous covalent organic polymers used in luminescent methods of analysis

Author: Botnar René M.
Bücker Arno
Kim W. Yong
Manning Warren J.
Spüntrup Elmar
Publication venue: Tomsk Polytechnic University Publishing House (Изд-во ТПУ)
Publication date: 01/01/2005

In recent years the chemical industry has been developing at a tremendous pace, and the volume of chemical products in use is growing rapidly as a result, which in turn leads to pollution of soil, aquatic biological systems, and the environment. Various analytical methods are used to monitor environmental quality; we chose to examine one of the fastest and most sensitive of them, the luminescence method. We therefore set out to obtain five different samples of porous covalent substances that can be used as analyzers in the luminescence method.

Electronic archive of Tomsk Polytechnic University

Activation of an Endogenous Retrovirus-Associated Long Non-Coding RNA in Human Adenocarcinoma

Author: Brown Scott D.
Gibb Ewan A.
Holt Robert A.
Morin Gregg B.
Robertson Gordon A.
Warren René L.
Wilson Gavin W.
Publication date: 01/01/2015

Background: Long non-coding RNAs (lncRNAs) are emerging as molecules that significantly impact many cellular processes and have been associated with almost every human cancer. Compared to protein-coding genes, lncRNA genes are often associated with transposable elements, particularly with endogenous retroviral elements (ERVs). ERVs can have potentially deleterious effects on genome structure and function, so these elements are typically silenced in normal somatic tissues, albeit with varying efficiency. The aberrant regulation of ERVs associated with lncRNAs (ERV-lncRNAs), coupled with the diverse range of lncRNA functions, creates significant potential for ERV-lncRNAs to impact cancer biology. Methods: We used RNA-seq analysis to identify and profile the expression of a novel lncRNA in six large cohorts, including over 7,500 samples from The Cancer Genome Atlas (TCGA). Results: We identified the tumor-specific expression of a novel lncRNA that we have named Endogenous retroViral-associated ADenocarcinoma RNA or ‘EVADR’, by analyzing RNA-seq data derived from colorectal tumors and matched normal control tissues. Subsequent analysis of TCGA RNA-seq data revealed the striking association of EVADR with adenocarcinomas, which are tumors of glandular origin. Moderate to high levels of EVADR were detected in 25 to 53% of colon, rectal, lung, pancreas and stomach adenocarcinomas (mean = 30 to 144 FPKM), and EVADR expression correlated with decreased patient survival (Cox regression; hazard ratio = 1.47, 95% confidence interval = 1.06 to 2.04, P = 0.02). In tumor sites of non-glandular origin, EVADR expression was detectable at only very low levels and in less than 10% of patients. For EVADR, a MER48 ERV element provides an active promoter to drive its transcription. Genome-wide, MER48 insertions are associated with nine lncRNAs, but none of the MER48-associated lncRNAs other than EVADR were consistently expressed in adenocarcinomas, demonstrating the specific activation of EVADR. The sequence and structure of the EVADR locus are highly conserved among Old World monkeys and apes but not New World monkeys or prosimians, where the MER48 insertion is absent. Conservation of the EVADR locus suggests a functional role for this novel lncRNA in humans and our closest primate relatives. Conclusions: Our results describe the specific activation of a highly conserved ERV-lncRNA in numerous cancers of glandular origin, a finding with diagnostic, prognostic and therapeutic implications.

Springer - Publisher Connector

Simon Fraser University Institutional Repository

Milling plant and soil material in plastic tubes over-estimates carbon and under-estimates nitrogen concentrations

Author: A. H. Jean Robertson
Andrew A. Meharg
CR Warren
D Robinson
David Johnson
DS Powlson
E Salvo-Chirnside
JA Herbst
LB Guo
M Nadeem
NS Lameck
PH Bellamy
R Lal
RE Artz
René van der Wal
Robin J. Pakeman
RR Jimenez
Sarah J. Woodin
SE Allen
SJ Kalembasa
Stuart W. Smith
YK Soon
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Peer reviewed; Postprint

Aberdeen University Research

Queen's University Belfast Research Portal

Crossref

University of Brighton Research Portal

The University of Manchester - Institutional Repository

NORA - Norwegian Open Research Archives

The Sensitivity of Massively Parallel Sequencing for Detecting Candidate Infectious Agents Associated with Human Tissue

Author: Chénard Caroline
Freeman J. Douglas
Friedman Jan M.
Gustavsen Julia A.
Holt Robert A.
Moore Richard A.
Suttle Curtis A.
Warren René L.
Zhao Yongjun
Publication venue: Public Library of Science
Publication date: 01/01/2011

Massively parallel sequencing technology now provides the opportunity to sample the transcriptome of a given tissue comprehensively. Transcripts at only a few copies per cell are readily detectable, allowing the discovery of low-abundance viral and bacterial transcripts in human tissue samples. Here we describe an approach for mining large sequence data sets for the presence of microbial sequences. Further, we demonstrate the sensitivity of this approach by sequencing human RNA-seq libraries spiked with decreasing amounts of an RNA virus. At a modest depth of sequencing, viral transcripts can be detected at frequencies less than 1 in 1,000,000. With current sequencing platforms approaching outputs of one billion reads per run, this is a highly sensitive method for detecting putative infectious agents associated with human tissues.
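The sensitivity figures in this abstract can be put into rough numbers with a short calculation. The sketch below is not taken from the paper: it assumes reads are sampled independently, so that the number of reads from a rare transcript is approximately Poisson distributed, and the function names are hypothetical.

```python
import math


def expected_target_reads(total_reads, target_fraction):
    """Expected number of reads from a transcript present at target_fraction."""
    return total_reads * target_fraction


def detection_probability(total_reads, target_fraction, min_reads=1):
    """Probability of observing at least min_reads reads from the target.

    Assumes a Poisson model for the read count, which is an assumption
    of this sketch rather than a claim made in the abstract.
    """
    mean = expected_target_reads(total_reads, target_fraction)
    below = sum(
        math.exp(-mean) * mean ** i / math.factorial(i)
        for i in range(min_reads)
    )
    return 1.0 - below
```

Under these assumptions, a transcript present at 1 in 1,000,000 reads yields about 1,000 expected reads in a run of one billion reads, so the chance of missing it entirely is negligible; at much shallower depths the expected count, and with it the detection probability, drops accordingly.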

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

The cumate gene-switch: a system for regulated expression in mammalian cells

Author: Bourget Lucie
Broussau Sophie
Caron Antoine W
Guilbault Claire
Koutroumanis Maria
Lamoureux Linda
Lo Rita
Malenfant Félix
Massie Bernard
Mullick Alaka
Pilotte Amelie
Warren René
Xu Yan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: A number of expression systems have been developed where transgene expression can be regulated. They all have specific characteristics making them more suitable for certain applications than for others. Since some applications require the regulation of several genes, there is a need for a variety of independent yet compatible systems. RESULTS: We have used the regulatory mechanisms of bacterial operons (cmt and cym) to regulate gene expression in mammalian cells using three different strategies. In the repressor configuration, regulation is mediated by the binding of the repressor (CymR) to the operator site (CuO), placed downstream of a strong constitutive promoter. Addition of cumate, a small molecule, relieves the repression. In the transactivator configuration, a chimaeric transactivator (cTA) protein, formed by the fusion of CymR with the activation domain of VP16, is able to activate transcription when bound to multiple copies of CuO, placed upstream of the CMV minimal promoter. Cumate addition abrogates DNA binding and therefore transactivation by cTA. Finally, an adenoviral library of cTA mutants was screened to identify a reverse cumate activator (rcTA), which activates transcription in the presence rather than the absence of cumate. CONCLUSION: We report the generation of a new versatile inducible expression system.

NRC Publications Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dépôt Institutionnel Numérique

Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-by-synthesis approach

Author: Bainbridge Matthew N
Delaney Allen
Go Anne
Griffith Malachi
Hickenbotham Matthew
Hirst Martin
Jones Steven JM
Magrini Vincent
Mardis Elaine R
Marra Marco A
Romanuik Tammy
Sadar Marianne D
Siddiqui Asim S
Warren René L
Zeng Thomas
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: High-throughput sequencing-by-synthesis is an emerging technology that allows the rapid production of millions of bases of data. Although the sequence reads are short, they can readily be used for re-sequencing. By re-sequencing the mRNA products of a cell, one may rapidly discover polymorphisms and splice variants particular to that cell. RESULTS: We present the utility of massively parallel sequencing by synthesis for profiling the transcriptome of a human prostate cancer cell line, LNCaP, that has been treated with the synthetic androgen R1881. Through the generation of approximately 20 megabases (MB) of EST data, we detect transcription from over 10,000 gene loci, 25 previously undescribed alternative splicing events involving known exons, and over 1,500 high-quality single nucleotide discrepancies with the reference human sequence. Further, we map nearly 10,000 ESTs to positions on the genome where no transcription is currently predicted to occur. We also characterize various obstacles to using sequencing by synthesis for transcriptome analysis and propose solutions to these problems. CONCLUSION: The use of high-throughput sequencing-by-synthesis methods for transcript profiling allows the specific and sensitive detection of many of a cell's transcripts, and also allows the discovery of high-quality base discrepancies and alternative splice variants. Thus, this technology may provide an effective means of understanding various disease states, discovering novel targets for disease treatment, and discovering novel transcripts.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central