8,617 research outputs found

A clone-free, single molecule map of the domestic cow (Bos taurus) genome.

Author: Bechner Michael
Goldstein Steve
Hernandez-Ortiz Juan
Medrano Juan F
Pape Louise
Patino Diego
Place Michael
Potamousis Konstantinos
Ravindran Prabu
Rincon Gonzalo
Schwartz David C
Zhou Shiguo
Publication venue: eScholarship, University of California
Publication date: 28/08/2015

Background: The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource, BtOM1.0, to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. Results: The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances found in UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). Conclusion: Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Image-Processing Techniques for the Creation of Presentation-Quality Astronomical Images

Author: English J.
Jayanne English
Kirk Pu'uohau-Pummill
Lisa M. Frattare
Malin D.
Travis A. Rector
Zoltan G. Levay
Publication venue: 'University of Chicago Press'
Publication date: 01/01/2004

The quality of modern astronomical data, the power of modern computers and the agility of current image-processing software enable the creation of high-quality images in a purely digital form. The combination of these technological advancements has created a new ability to make color astronomical images. And in many ways it has led to a new philosophy towards how to create them. A practical guide is presented on how to generate astronomical images from research data with powerful image-processing programs. These programs use a layering metaphor that allows for an unlimited number of astronomical datasets to be combined in any desired color scheme, creating an immense parameter space to be explored using an iterative approach. Several examples of image creation are presented. A philosophy is also presented on how to use color and composition to create images that simultaneously highlight scientific detail and are aesthetically appealing. This philosophy is necessary because most datasets do not correspond to the wavelength range of sensitivity of the human eye. The use of visual grammar, defined as the elements which affect the interpretation of an image, can maximize the richness and detail in an image while maintaining scientific accuracy. By properly using visual grammar, one can imply qualities that a two-dimensional image intrinsically cannot show, such as depth, motion and energy. In addition, composition can be used to engage viewers and keep them interested for a longer period of time. The use of these techniques can result in a striking image that will effectively convey the science within the image, to scientists and to the public.
Comment: 104 pages, 38 figures, submitted to A

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

Discovery of large genomic inversions using long range information.

Author: Alkan Can
Amemiya Chris T
Antonacci Francesca
Chiatante Giorgia
Eichler Evan E
Eslami Rasekh Marzieh
Miroballo Mattia
Tang Joyce
Ventura Mario
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Background: Although many algorithms are now available that aim to characterize different classes of structural variation, discovery of balanced rearrangements such as inversions remains an open problem. This is mainly due to the fact that breakpoints of such events typically lie within segmental duplications or common repeats, which reduces the mappability of short reads. The algorithms developed within the 1000 Genomes Project to identify inversions are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high throughput sequencing technologies. Results: Here we propose a novel algorithm, VALOR, to discover large inversions using new sequencing methods that provide long range information such as 10X Genomics linked-read sequencing, pooled clone sequencing, or other similar technologies that we commonly refer to as long range sequencing. We demonstrate the utility of VALOR using both pooled clone sequencing and 10X Genomics linked-read sequencing generated from the genome of an individual from the HapMap project (NA12878). We also provide a comprehensive comparison of VALOR against several state-of-the-art structural variation discovery algorithms that use whole genome shotgun sequencing data. Conclusions: In this paper, we show that VALOR is able to accurately discover all previously identified and experimentally validated large inversions in the same genome with a low false discovery rate. Using VALOR, we also predicted a novel inversion, which we validated using fluorescent in situ hybridization. VALOR is available at https://github.com/BilkentCompGen/VALOR

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Springer OAI

SourcererCC: Scaling Code Clone Detection to Big Code

Author: Lopes Cristina V.
Roy Chanchal K.
Saini Vaibhav
Sajnani Hitesh
Svajlenko Jeffrey
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/12/2015

Despite a decade of active research, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone. We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available and state-of-the-art tools. To measure recall, we use two recent benchmarks, (1) a large benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (250MLOC) using a standard workstation.
Comment: Accepted for publication at ICSE'16 (preprint, unrevised
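
The general idea of token-based clone detection with an inverted index can be illustrated with a small sketch. This is not SourcererCC's implementation: the abstract does not state the similarity function or threshold, so the overlap measure, the 0.8 default and all function names below are assumptions chosen for illustration, and the token-ordering filtering heuristics mentioned in the abstract are omitted.

```python
from collections import Counter, defaultdict


def tokenize(code):
    """Turn a code block into a bag (multiset) of whitespace-separated tokens."""
    return Counter(code.split())


def overlap(a, b):
    """Count the tokens two bags share, respecting multiplicity."""
    return sum((a & b).values())


def build_index(blocks):
    """Inverted index: token -> ids of the blocks that contain it."""
    index = defaultdict(set)
    for block_id, bag in blocks.items():
        for token in bag:
            index[token].add(block_id)
    return index


def find_clones(blocks, threshold=0.8):
    """Return pairs of block ids whose token overlap meets the threshold.

    The threshold and the 'overlap / larger block size' measure are
    illustrative assumptions, not values taken from the abstract.
    """
    index = build_index(blocks)
    pairs = set()
    for block_id, bag in blocks.items():
        # Only blocks sharing at least one token are compared at all.
        candidates = set()
        for token in bag:
            candidates |= index[token]
        candidates.discard(block_id)
        for other in candidates:
            other_bag = blocks[other]
            size = max(sum(bag.values()), sum(other_bag.values()))
            if overlap(bag, other_bag) >= threshold * size:
                pairs.add(tuple(sorted((block_id, other))))
    return pairs


if __name__ == "__main__":
    blocks = {
        "a": tokenize("int sum = 0 ; for ( i = 0 ; i < n ; i ++ ) sum += x [ i ] ;"),
        "b": tokenize("int total = 0 ; for ( i = 0 ; i < n ; i ++ ) total += x [ i ] ;"),
        "c": tokenize("return y * 2 ;"),
    }
    print(find_clones(blocks))
```

The index restricts comparisons to blocks that share at least one token. In this sketch that saves little, because common tokens such as `;` appear almost everywhere, which is the kind of cost the filtering heuristics described in the abstract are meant to reduce.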

arXiv.org e-Print Archive

Crossref

A deeply branching thermophilic bacterium with an ancient acetyl-CoA pathway dominates a subsurface ecosystem

Author: A Stamatakis
A Teske
AK Kaster
AL Brioukhanov
Atsushi Toyoda
AY Mulkidjanian
B Boussau
D Wu
DJ Tobler
DS Kelley
E Biegel
E Pierce
EF DeLong
EG Nisbert
F Nikitin
FD Ciccarelli
G Proskurowski
G Wächtershäuser
Gab-Joo Chee
H Hirayama
H Kimura
H Noguchi
H Takami
Hideki Noguchi
Hideto Takami
I Uchiyama
Ikuo Uchiyama
JA Baross
Jack Anthony Gilbert
JB Corliss
JR Torre
K Takai
KC Costa
Ken Takai
M Toei
M Wu
Masahira Hattori
MB Scott
MS Rappé
N Saitou
O White
P Hugenholtz
R Caspi
R Chenna
RF Say
RT Thauer
S Guindon
S Kato
Shinro Nishi
T Murata
T Nunoura
T Sakiyama
T Shibuya
Takehiko Itoh
Takuro Nunoura
TM Lowe
V Müller
W Martin
Wataru Arai
Yoshihiro Takaki
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

A nearly complete genome sequence of Candidatus ‘Acetothermum autotrophicum’, a presently uncultivated bacterium in candidate division OP1, was revealed by metagenomic analysis of a subsurface thermophilic microbial mat community. Phylogenetic analysis based on the concatenated sequences of proteins common among 367 prokaryotes suggests that Ca. ‘A. autotrophicum’ is one of the earliest diverging bacterial lineages. It possesses a folate-dependent Wood-Ljungdahl (acetyl-CoA) pathway of CO2 fixation, is predicted to have an acetogenic lifestyle, and possesses the newly discovered archaeal-autotrophic type of bifunctional fructose 1,6-bisphosphate aldolase/phosphatase. A phylogenetic analysis of the core gene cluster of the acetyl-CoA pathway, shared by acetogens, methanogens, some sulfur- and iron-reducers and dechlorinators, supports the hypothesis that the core gene cluster of Ca. ‘A. autotrophicum’ is a particularly ancient bacterial pathway. The habitat, physiology and phylogenetic position of Ca. ‘A. autotrophicum’ support the view that the first bacterial and archaeal lineages were H2-dependent acetogens and methanogens living in hydrothermal environments.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

JAMSTEC Repository

FigShare

Substrate-specific clades of active marine methylotrophs associated with a phytoplankton bloom in a temperate coastal environment

Author: Boden Rich
Moussard Hélène
Murrell J. C. (J. Colin)
Neufeld Josh D.
Schäfer Hendrik
Publication venue: 'American Society for Microbiology'
Publication date: 10/10/2008

Marine microorganisms that consume one-carbon (C1) compounds are poorly described, despite their impact on global climate via an influence on aquatic and atmospheric chemistry. This study investigated marine bacterial communities involved in the metabolism of C1 compounds. These communities were of relevance to surface seawater and atmospheric chemistry in the context of a bloom that was dominated by phytoplankton known to produce dimethylsulfoniopropionate. In addition to using 16S rRNA gene fingerprinting and clone libraries to characterize samples taken from a bloom transect in July 2006, seawater samples from the phytoplankton bloom were incubated with 13C-labeled methanol, monomethylamine, dimethylamine, methyl bromide, and dimethyl sulfide to identify microbial populations involved in the turnover of C1 compounds, using DNA stable isotope probing. The [13C]DNA samples from a single time point were characterized and compared using denaturing gradient gel electrophoresis (DGGE), fingerprint cluster analysis, and 16S rRNA gene clone library analysis. Bacterial community DGGE fingerprints from 13C-labeled DNA were distinct from those obtained with the nonlabeled community DNA and suggested some overlap in substrate utilization between active methylotroph populations growing on different C1 substrates. Active methylotrophs were affiliated with Methylophaga spp. and several clades of undescribed Gammaproteobacteria that utilized methanol, methylamines (both monomethylamine and dimethylamine), and dimethyl sulfide. rRNA gene sequences corresponding to populations assimilating 13C-labeled methyl bromide and other substrates were associated with members of the Alphaproteobacteria (e.g., the family Rhodobacteraceae), the Cytophaga-Flexibacter-Bacteroides group, and unknown taxa. This study expands the known diversity of marine methylotrophs in surface seawater and provides a comprehensive data set for focused cultivation and metagenomic analyses in the future.

Crossref

PubMed Central

Warwick Research Archives Portal Repository

University of East Anglia digital repository

Recommended from our members

Relationship between latent and rebound viruses in a clinical trial of anti-HIV-1 antibody 3BNC117.

Author: Barton John P
Belblidia Shiraz A
Burke Leah
Butler Allison L
Caskey Marina
Cohen Yehuda Z
Dizon Juan P
Gulick Roy M
Jankovic Mila
Krassnig Lisa
Lorenzi Julio CC
Lu Ching-Lan
Mendoza Pilar
Millard Katrina
Nussenzweig Michel C
Oliveira Thiago Y
Pai Joy
Seaman Michael S
Shimeliovich Irina
Sleckman Christopher
Witmer-Pack Maggi
Publication venue: eScholarship, University of California
Publication date: 01/09/2018

A clinical trial was performed to evaluate 3BNC117, a potent anti-HIV-1 antibody, in infected individuals during suppressive antiretroviral therapy and subsequent analytical treatment interruption (ATI). The circulating reservoir was evaluated by quantitative and qualitative viral outgrowth assay (Q2VOA) at entry and after 6 mo. There were no significant quantitative changes in the size of the reservoir before ATI, and the composition of circulating reservoir clones varied in a manner that did not correlate with 3BNC117 sensitivity. 3BNC117 binding site amino acid variants found in rebound viruses preexisted in the latent reservoir. However, only 3 of 217 rebound viruses were identical to 868 latent viruses isolated by Q2VOA and near full-length sequencing. Instead, 63% of the rebound viruses appeared to be recombinants, even in individuals with 3BNC117-resistant reservoir viruses. In conclusion, viruses emerging during ATI in individuals treated with 3BNC117 are not the dominant species found in the circulating latent reservoir, but frequently appear to represent recombinants of latent viruses.

eScholarship - University of California