Xander: employing a novel method for efficient gene-targeted metagenomic assembly
Background: Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes.
Results: We present a novel method for targeting assembly of specific protein-coding genes. This method combines a de Bruijn graph, as used in standard assembly approaches, and a protein profile hidden Markov model (HMM) for the gene of interest, as used in standard annotation approaches. These are used to create a novel combined weighted assembly graph. Xander performs both assembly and annotation concomitantly using information incorporated in this graph. We demonstrate the utility of this approach by assembling contigs for one phylogenetic marker gene and for two functional marker genes, first on Human Microbiome Project (HMP)-defined community Illumina data and then on 21 rhizosphere soil metagenomic datasets from three different crops totaling over 800 Gbp of unassembled data. We compared our method to a recently published bulk metagenome assembly method and a recently published gene-targeted assembler and found our method produced more, longer, and higher-quality gene sequences.
Conclusion: Xander combines gene assignment with the rapid assembly of full-length or near full-length functional genes from metagenomic data without requiring bulk assembly or post-processing to find genes of interest. HMMs used for assembly can be tailored to the targeted genes, allowing flexibility to improve annotation over generic annotation pipelines. This method is implemented as open source software and is available at https://github.com/rdpstaff/Xander_assembler
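The general idea of steering a walk through a de Bruijn graph with a per-position scoring model can be illustrated with a toy sketch. This is not Xander's algorithm or code: the function names `build_de_bruijn` and `guided_walk`, the nucleotide-level `profile` (a simple stand-in for a protein profile HMM), the greedy search, and the example reads are all assumptions made purely for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def guided_walk(graph, start, profile):
    """Greedily extend `start` through the graph.

    `profile` is a list of dicts (base -> score), one per extension step.
    It is a hypothetical, simplified stand-in for a profile HMM: at each
    step the successor whose newly added base scores highest is chosen.
    """
    contig = start
    node = start
    for column in profile:
        successors = graph.get(node)
        if not successors:
            break  # dead end: no read supports a further extension
        # sorted() makes tie-breaking deterministic
        node = max(sorted(successors), key=lambda s: column.get(s[-1], 0.0))
        contig += node[-1]
    return contig


# Toy reads (hypothetical) that branch after "GCG" into "...T" and "...A".
reads = ["ATGGCG", "GGCGTA", "GGCGAA"]
graph = build_de_bruijn(reads, k=4)

# Toy profile (hypothetical) that prefers T over A at the branch point.
profile = [{"G": 1.0}, {"C": 1.0}, {"G": 1.0}, {"T": 0.9, "A": 0.1}, {"A": 1.0}]
contig = guided_walk(graph, "ATG", profile)
print(contig)
```

In this sketch the graph alone cannot choose between the two branches after `GCG`; the profile scores resolve the choice, which is the intuition behind combining an assembly graph with a gene model. A real implementation would search over HMM states and graph nodes jointly rather than walking greedily.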
Cognitive Phenotypes of HIV Defined Using a Novel Data-driven Approach
The current study applied data-driven methods to identify and explain novel cognitive phenotypes of HIV. Methods: 388 people with HIV (PWH) with an average age of 46 (15.8) and median plasma CD4+ T-cell count of 555 copies/mL (79% virally suppressed) underwent cognitive testing and 3T neuroimaging. Demographics, HIV disease variables, and health comorbidities were recorded within three months of cognitive testing/neuroimaging. Hierarchical clustering was employed to identify cognitive phenotypes, followed by ensemble machine learning to delineate the features that determined membership in the cognitive phenotypes. Hierarchical clustering identified five cognitive phenotypes. Cluster 1 (n=97) comprised individuals with normative performance on all cognitive tests. The remaining clusters were defined by impairment on action fluency (Cluster 2; n=46); verbal learning/memory (Cluster 3; n=73); action fluency and verbal learning/memory (Cluster 4; n=56); and action fluency, verbal learning/memory, and tests of executive function (Cluster 5; n=114). HIV detectability was most common in Cluster 5. Machine learning revealed that polysubstance use, race, educational attainment, and volumes of the precuneus, cingulate, nucleus accumbens, and thalamus differentiated membership in the normal vs. impaired clusters. The determinants of persistent cognitive impairment among PWH receiving suppressive treatment are multifactorial in nature. Viral replication after ART plays a role in the causal pathway, but psychosocial factors (race inequities, substance use) merit increased attention as critical determinants of cognitive impairment in the context of ART. Results underscore the need for comprehensive person-centered interventions that go beyond adherence to patient care to achieve optimal cognitive health among PWH.