Search CORE

2,102 research outputs found

Computational comparison of two mouse draft genomes and the human golden path

Author: Wang J.
Xuan Z.
Zhang M. Q.
Publication venue: Springer Science and Business Media LLC
Publication date: 05/12/2002

BACKGROUND: The availability of both mouse and human draft genomes has marked the beginning of a new era of comparative mammalian genomics. The two available mouse genome assemblies, from the public mouse genome sequencing consortium and Celera Genomics, were obtained using different clone libraries and different assembly methods. RESULTS: We present here a critical comparison of the two latest mouse genome assemblies. The utility of the combined genomes is further demonstrated by comparing them with the human 'golden path' and through a subsequent analysis of a resulting conserved sequence element (CSE) database, which allows us to identify over 6,000 potential novel genes and to derive independent estimates of the number of human protein-coding genes. CONCLUSION: The Celera and public mouse assemblies differ in about 10% of the mouse genome. Each assembly has advantages over the other: Celera has higher accuracy in base-pairs and overall higher coverage of the genome; the public assembly, however, has higher sequence quality in some newly finished bacterial artificial chromosome clone (BAC) regions and the data are freely accessible. Perhaps most important, by combining both assemblies, we can get a better annotation of the human genome; in particular, we can obtain the most complete set of CSEs, one third of which are related to known genes, while some others are related to other functional genomic regions. More than half the CSEs are of unknown function. From the CSEs, we estimate the total number of human protein-coding genes to be about 40,000. This searchable, publicly available online CSEdb will expedite new discoveries through comparative genomics.

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Springer

PubMed Central

Boosting with stumps for predicting transcription start sites

Author: Xuan Zhenyu
Zhang Michael Q
Zhao Xiaoyue
Publication venue: BioMed Central
Publication date: 01/01/2007

Promoter prediction is a difficult but important problem in gene finding, and it is critical for elucidating the regulation of gene expression. We introduce a new promoter prediction program, CoreBoost, which applies a boosting technique with stumps to select important small-scale as well as large-scale features. CoreBoost improves greatly on locating transcription start sites. We also demonstrate that by further utilizing some tissue-specific information, better accuracy can be achieved.

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Experimental and numerical observation of dark and bright breathers in the band gap of a diatomic electrical lattice

Author: Chen Xuan-Lin
Cuevas-Maraver Jesús
English Lars Q.
Kevrekidis Panayotis G.
Li Weilun
Palmero Acebedo Faustino
Publication venue: American Physical Society (APS)
Publication date: 11/11/2018

We observe dark and bright intrinsic localized modes (ILMs), also known as discrete breathers, experimentally and numerically in a diatomic-like electrical lattice. The experimental generation of dark ILMs by driving a dissipative lattice with spatially homogeneous amplitude is, to our knowledge, unprecedented. In addition, the experimental manifestation of bright breathers within the band gap is also novel in this system. In experimental measurements the dark modes appear just below the bottom of the top branch in frequency. As the frequency is then lowered further into the band gap, the dark ILMs persist, until the nonlinear localization pattern reverses and bright ILMs appear on top of the finite background. Deep into the band gap, only a single bright structure survives in a lattice of 32 nodes. The vicinity of the bottom band also features bright and dark self-localized excitations. These results pave the way for a more systematic study of dark breathers and their bifurcations in diatomic-like chains. Funding: VI Plan Propio of the University of Seville, Spain (VI PPITUS); AEI/FEDER, UE MAT2016- 79866-

arXiv.org e-Print Archive

idUS. Depósito de Investigación Universidad de Sevilla

Geochemical and Hf–Nd isotope data of Nanhua rift sedimentary and volcaniclastic rocks indicate a Neoproterozoic continental flood basalt provenance

Author: Li Q.
Li X.
Li Zheng-Xiang
Wang Xuan-Ce
Zhang Q.
Publication venue: Elsevier BV
Publication date: 01/01/2011

espace@Curtin

Using quality scores and longer reads improves accuracy of Solexa read mapping

Author: Smith Andrew D
Xuan Zhenyu
Zhang Michael Q
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Second-generation sequencing has the potential to revolutionize genomics and impact all areas of biomedical science. New technologies will make re-sequencing widely available for such applications as identifying genome variations or interrogating the oligonucleotide content of a large sample (e.g. ChIP-sequencing). The increase in speed, sensitivity and availability of sequencing technology brings demand for advances in computational technology to perform associated analysis tasks. The Solexa/Illumina 1G sequencer can produce tens of millions of reads, ranging in length from ~25–50 nt, in a single experiment. Accurately mapping the reads back to a reference genome is a critical task in almost all applications. Two sources of information that are often ignored when mapping reads from the Solexa technology are the 3' ends of longer reads, which contain a much higher frequency of sequencing errors, and the base-call quality scores. Results: To investigate whether these sources of information can be used to improve accuracy when mapping reads, we developed the RMAP tool, which can map reads having a wide range of lengths and allows base-call quality scores to determine which positions in each read are more important when mapping. We applied RMAP to analyze data re-sequenced from two human BAC regions for varying read lengths, and varying criteria for use of quality scores. RMAP is freely available for downloading at http://rulai.cshl.edu/rmap/. Conclusion: Our results indicate that significant gains in Solexa read mapping performance can be achieved by considering the information in 3' ends of longer reads, and appropriately using the base-call quality scores. The RMAP tool we have developed will enable researchers to effectively exploit this information in targeted re-sequencing projects.

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Dispersive Analysis on the $f_0(600)$ and $f_0(980)$ Resonances in $\gamma\gamma\to\pi^+\pi^-, \pi^0\pi^0$ Processes

Author: H. Q. Zheng
J. K. Bienlein
M. Boglione
M. P. Locher
M. R. Pennington
Ou Zhang
T. Barnes
Xuan-Gong Wang
Yu Mao
Z. Y. Zhou
Publication venue: American Physical Society (APS)
Publication date: 01/01/2009

We estimate the di-photon coupling of the $f_0(600)$, $f_0(980)$ and $f_2(1270)$ resonances in a coupled channel dispersive approach. The $f_0(600)$ di-photon coupling is also reinvestigated using a single channel $T$ matrix for $\pi\pi$ scattering with better analyticity property, and it is found to be significantly smaller than that of a $\bar{q}q$ state. In particular, we also estimate the di-photon coupling of the third sheet pole located near the $\bar{K}K$ threshold, denoted as $f_0^{III}(980)$. It is argued that this third sheet pole may originate from a coupled channel Breit-Wigner description of the $f_0(980)$ resonance. Comment: 24 pages and 13 eps figures. A numerical bug in previous version is fixed. Some results changed. References and new figures added. Version to appear in Phys. Rev.

arXiv.org e-Print Archive

Crossref

TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies

Author: Liu Lihua
Xuan Zhenyu
Zhang Michael Q.
Zhao Fang
Publication venue: Oxford University Press
Publication date: 17/12/2004

In order to understand gene regulation, accurate and comprehensive knowledge of transcriptional regulatory elements is essential. Here, we report our efforts in building a mammalian Transcriptional Regulatory Element Database (TRED) with associated data analysis functions. It collects cis- and trans-regulatory elements and is dedicated to easy data access and analysis for both single-gene-based and genome-scale studies. Distinguishing features of TRED include: (i) relatively complete genome-wide promoter annotation for human, mouse and rat; (ii) availability of gene transcriptional regulation information including transcription factor binding sites and experimental evidence; (iii) data accuracy ensured by hand curation; (iv) efficient user interface for easy and flexible data retrieval; and (v) implementation of on-the-fly sequence analysis tools. TRED can provide good training datasets for further genome-wide cis-regulatory element prediction and annotation, assist detailed functional studies and facilitate the deciphering of gene regulatory networks (http://rulai.cshl.edu/TRED).

CiteSeerX

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central