9 research outputs found

    Analysis of HIV-1 quasispecies sequences generated by High Throughput Sequencing (HTS) using HIVE

    The high level of genetic variability of Human Immunodeficiency Virus type 1 (HIV-1) is caused by the low fidelity of its replication machinery. This leads to the evolution of swarm-like viral populations often described as quasispecies. High-throughput sequencing (HTS) technology provides higher resolution than Sanger sequencing, enabling detection of low-frequency variant genomes. However, quasispecies analysis is still a challenge because of the systematic noise introduced by HTS technology, which increases type I errors (false positives), and the underlying genetic diversity, which can lead to mathematically insolvable type II errors (false negatives). We have developed a pipeline using the tools in the High-performance Integrated Virtual Environment (HIVE), an HTS platform designed for big data analysis and management, to analyze viral populations within each sample and to identify their subtype classification and the recombination patterns of recombinants. RNA was extracted from 70 plasma samples of chronically HIV-1-infected patients. The 3’ half genomes of HIV-1 were amplified using RT-PCR, and the PCR products were sequenced using Illumina MiSeq. The paired-end reads for each sample were assembled using Geneious software and analyzed for the presence of HIV-1 quasispecies using HIVE tools. Subtype analysis of the 70 samples using Geneious software identified 17 A1s, 4 Bs, 30 Cs, 1 D, 6 CRF02_AG, and 12 unique recombinant forms (URFs). Additionally, we found up to 178 ambiguous bases in the consensus sequences from 41 viral samples (58.6%), suggesting the presence of viral subpopulations. However, Geneious could not determine the major viral populations in each sample. We analyzed the same HTS reads using the HIV-1 quasispecies analysis pipeline and found one predominant population in 11 samples (15.7%), two to ten distinct populations in 45 samples (64.3%), 11-20 in 13 samples (18.6%), and 26 in one sample (1.4%). Interestingly, two equally major viral populations that were not detected by Geneious were identified in five samples (7.1%) by HIVE. The HIV-1 quasispecies analysis pipeline is reliable and more sensitive in its ability to identify distinct viral populations and recombination patterns not identified by the Geneious software.
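    The per-category percentages in this abstract follow directly from the reported sample counts. As a minimal sketch of that arithmetic, the snippet below recomputes each share from the counts given above; the function name and the category labels are illustrative assumptions and are not part of the HIVE pipeline.

```python
def category_percentages(counts):
    """Return each category's share of the total, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Number of samples per count of distinct viral populations, as reported in the abstract.
# The dictionary keys are illustrative labels, not identifiers used by the authors.
samples_by_population_count = {
    "1 population": 11,
    "2-10 populations": 45,
    "11-20 populations": 13,
    "26 populations": 1,
}

if __name__ == "__main__":
    shares = category_percentages(samples_by_population_count)
    for label, pct in shares.items():
        print(f"{label}: {pct}%")
```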

    BioXpress: an integrated RNA-seq-derived gene expression database for pan-cancer analysis.

    BioXpress is a gene expression and cancer association database in which the expression levels are mapped to genes using RNA-seq data obtained from The Cancer Genome Atlas, International Cancer Genome Consortium, Expression Atlas and publications. The BioXpress database includes expression data from 64 cancer types, 6361 patients and 17 469 genes, with 9513 of the genes displaying differential expression between tumor and normal samples. In addition to data directly retrieved from RNA-seq data repositories, manual biocuration of publications supplements the available cancer association annotations in the database. All cancer types are mapped to Disease Ontology terms to facilitate a uniform pan-cancer analysis. The BioXpress database can be searched by HUGO Gene Nomenclature Committee gene symbol or UniProtKB/RefSeq accession or, alternatively, queried by cancer type with specified significance filters. This interface, along with the availability of pre-computed downloadable files containing differentially expressed genes in multiple cancers, enables straightforward retrieval and display of a broad set of cancer-related genes.

    Malignant adenomyoepithelioma of the breast: a case report with review of literature

    Adenomyoepitheliomas are uncommon breast tumours. By definition they have a prominent component of myoepithelial cells, in addition to glandular elements lined by epithelial cells. Malignant adenomyoepithelioma of the breast is even rarer and is characterised by malignant proliferation of epithelial and myoepithelial cells that show characteristic histological and immunohistochemical features. Only 11 cases have been reported to date. A case of malignant adenomyoepithelioma of the breast is reported.

    Streamlined Subpopulation, Subtype, and Recombination Analysis of HIV-1 Half-Genome Sequences Generated by High-Throughput Sequencing

    Copyright © 2020 Hora et al. High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. There are currently no algorithms that can directly determine genotype and quasispecies population using short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3'-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (∼4,500 bp) by HTS. With direct analysis of raw reads using HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined recombinant breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or the analysis of the consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than just evaluating the consensus sequence and also more cost-effective than SGS. IMPORTANCE: The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and emergence of quasispecies. It is important to reliably identify subpopulations to understand the complexity of a viral population for drug resistance surveillance and vaccine development. High-throughput sequencing (HTS) provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods of analysis of HTS reads are unable to fully address accurate population reconstruction. Hence, there is a dire need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm that we previously developed with in silico short sequences to analyze raw HTS short reads. The significance of this study is that our standalone algorithm enables a streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need for additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by single genome sequencing (SGS).

    Baseline human gut microbiota profile in healthy people and standard reporting template

    A comprehensive knowledge of the types and ratios of microbes that inhabit the healthy human gut is necessary before any kind of pre-clinical or clinical study can be performed that attempts to alter the microbiome to treat a condition or improve therapy outcome. To address this need, we present an innovative, scalable, comprehensive analysis workflow, a healthy human reference microbiome list and abundance profile (GutFeelingKB), and a novel Fecal Biome Population Report (FecalBiome) with clinical applicability. GutFeelingKB provides a list of 157 organisms (8 phyla, 18 classes, 23 orders, 38 families, 59 genera and 109 species) that forms the baseline biome and therefore can be used as healthy controls for studies related to dysbiosis. This list can be expanded to 863 organisms if closely related proteomes are considered. The incorporation of microbiome science into routine clinical practice necessitates a standard report for comparison of an individual's microbiome to the growing knowledgebase of “normal” microbiome data. The FecalBiome and the underlying technology of GutFeelingKB address this need. The knowledgebase can be useful to regulatory agencies for the assessment of fecal transplant and other microbiome products, as it contains a list of organisms from healthy individuals. In addition to the list of organisms and their abundances, this study also generated a collection of assembled contiguous sequences (contigs) of metagenomic dark matter. In this study, metagenomic dark matter represents sequences that cannot be mapped to any known sequence but can be assembled into contigs of 10,000 nucleotides or longer. These sequences can be used to create primers to study potential novel organisms.
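    The working definition of metagenomic dark matter given in this abstract (unmapped, yet assembled into contigs of at least 10,000 nucleotides) can be expressed as a simple filter. The sketch below is illustrative only: the `Contig` record, its `mapped` flag, and the function names are hypothetical and do not describe the authors' workflow or file formats; only the length threshold comes from the abstract.

```python
from dataclasses import dataclass

# Length threshold stated in the abstract, in nucleotides.
MIN_CONTIG_LENGTH = 10_000


@dataclass
class Contig:
    name: str       # hypothetical contig identifier
    sequence: str   # assembled nucleotide sequence
    mapped: bool    # hypothetical flag: True if the contig aligned to any known sequence


def is_dark_matter(contig, min_length=MIN_CONTIG_LENGTH):
    """A contig qualifies if it is unmapped and at least min_length nucleotides long."""
    return (not contig.mapped) and len(contig.sequence) >= min_length


def select_dark_matter(contigs):
    """Keep only the contigs that meet the dark-matter definition."""
    return [c for c in contigs if is_dark_matter(c)]


if __name__ == "__main__":
    # Toy contigs made of repeated bases, purely for illustration.
    example = [
        Contig("contig_a", "A" * 12_000, mapped=False),  # unmapped and long: kept
        Contig("contig_b", "C" * 12_000, mapped=True),   # long but mapped: dropped
        Contig("contig_c", "G" * 500, mapped=False),     # unmapped but short: dropped
    ]
    print([c.name for c in select_dark_matter(example)])
```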