69 research outputs found

The khmer software package: enabling efficient nucleotide sequence analysis

Author: Alameldin Hussien
Awad Sherine
Boucher Elmar
Brown C. Titus
Caldwell Adam
Cartwright Reed
Charbonneau Amanda
Constantinides Bede
Crusoe Michael
Edvenson Greg
Fay Scott
Fenton Jacob
Fenzl Thomas
Fish Jordan
Garcia-Gutierrez Leonor
Garland Phillip
Gluck Jonathan
González Iván
Guermond Sarah
Guo Jiarong
Gupta Aditi
Herr Joshua
Howe Adina
Hyer Alex
Härpfer Andreas
Irber Luiz
Kidd Rhys
Lin David
Lippi Justin
Mansour Tamer
McA'Nulty Pamela
McDonald Eric
Mizzi Jessica
Murray Kevin
Nahum Joshua
Nanlohy Kaben
Nederbragt Alexander
Ortiz-Zuazaga Humberto
Ory Jeramia
Pell Jason
Pepe-Ranney Charles
Russ Zachary
Schwarz Erich
Scott Camille
Seaman Josiah
Sievert Scott
Simpson Jared
Skennerton Connor
Spencer James
Srinivasan Ramakrishnan
Standage Daniel
Stapleton James
Stein Joe
Steinman Susan
Taylor Benjamin
Trimble Will
Wiencko Heather
Wright Michael
Wyss Brian
Zhang Qingpeng
zyme en
Publication venue: Iowa State University Digital Repository
Publication date: 25/09/2015

The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at https://github.com/dib-lab/khmer/

Digital Repository @ Iowa State University (ISU)

PubMed Central

eScholarship - University of California

The khmer software package: enabling efficient nucleotide sequence analysis [version 1; referees: 2 approved, 1 approved with reservations]

Author: Alameldin Hussien F.
Awad Sherine
Boucher Elmar
Brown C. Titus
Caldwell Adam
Cartwright Reed
Charbonneau Amanda
Constantinides Bede
Crusoe Michael R.
Edvenson Greg
Fay Scott
Fenton Jacob
Fenzl Thomas
Fish Jordan
Garcia-Gutierrez Leonor
Garland Phillip
Gluck Jonathan
González Iván
Guermond Sarah
Guo Jiarong
Gupta Aditi
Herr Joshua R.
Howe Adina
Hyer Alex
Härpfer Andreas
Irber Luiz
Kidd Rhys
Lin David
Lippi Justin
Mansour Tamer
McA'Nulty Pamela
McDonald Eric
Mizzi Jessica
Murray Kevin D.
Nahum Joshua R.
Nanlohy Kaben
Nederbragt Alexander Johan
Ortiz-Zuazaga Humberto
Ory Jeramia
Pell Jason
Pepe-Ranney Charles
Russ Zachary N.
Schwarz Erich
Scott Camille
Seaman Josiah
Sievert Scott
Simpson Jared
Skennerton Connor T.
Spencer James
Srinivasan Ramakrishnan
Standage Daniel
Stapleton James A.
Stein Joe
Steinman Susan R.
Taylor Benjamin
Trimble Will
Wiencko Heather L.
Wright Michael
Wyss Brian
Zhang Qingpeng
zyme en
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 15/10/2015

NCBI’s virus discovery codeathon: building “FIVE” —the Federated Index of Viral Experiments API index

Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, some consequences of previously decentralized (unfederated) data are lack of consensus or comparisons between feature annotations. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources. To this end, during this three-day continuation of the Virus Hunting Toolkit codeathon series (VHT-2), a new integrated and federated viral index was elaborated. This Federated Index of Viral Experiments (FIVE) integrates pre-existing and novel functional and taxonomy annotations and virus–host pairings. Variability in the context of viral genomic diversity is often overlooked in virus databases. As a proof-of-concept, FIVE was the first attempt to include viral genome variation for HIV, the most well-studied human pathogen, through viral genome diversity graphs. As per the publication of this manuscript, FIVE is the first implementation of a virus-specific federated index of such scope. FIVE is coded in BigQuery for optimal access of large quantities of data and is publicly accessible. Many projects of database or index federation fail to provide easier alternatives to access or query information. To this end, a Python API query system was developed to enhance the accessibility of FIVE.

Multidisciplinary Digital Publishing Institute

Enlighten