Statistical significance of normalized global alignment
The comparison of homologous proteins from different species is a first step toward function assignment and reconstruction of species evolution. Though local alignment is mostly used for this purpose, global alignment is important for constructing multiple alignments or phylogenetic trees. However, the statistical significance of global alignments is not completely clear: either a specific statistical model to describe the alignments is lacking, or the assessment depends on computationally expensive methods such as the Z-score. Recently we presented a normalized global alignment, defined as the best compromise between global alignment cost and length, and showed that this new technique led to better classification results than the Z-score at a much lower computational cost. However, the statistical significance of the normalized global alignment must be analyzed before it can be considered a completely functional algorithm for protein alignment.
Experiments with unrelated proteins extracted from the SCOP ASTRAL database showed that normalized global alignment scores can be fitted to a log-normal distribution. This empirical result, obtained without any theoretical support, can be used to derive the statistical significance of normalized global alignments. Results are summarized in a table with fitted parameters for different scoring schemes.
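As a rough illustration of how a fitted log-normal distribution could be turned into a significance estimate, the sketch below fits the two log-normal parameters to a set of background scores and reports an upper-tail probability for a query score. It is not the authors' procedure: the function names are hypothetical, the background scores are synthetic, and the choice of the upper tail assumes that larger scores mean stronger similarity (if the normalized alignment is expressed as a cost, the lower tail would be the relevant one).

```python
import math
import random


def fit_lognormal(scores):
    """Maximum-likelihood log-normal fit: mean and std of log(score)."""
    logs = [math.log(s) for s in scores]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)


def upper_tail_p_value(score, mu, sigma):
    """P(X >= score) for X ~ LogNormal(mu, sigma)."""
    z = (math.log(score) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Synthetic stand-in for scores between unrelated proteins (assumption:
# real background scores would come from an all-against-all comparison).
rng = random.Random(0)
background = [rng.lognormvariate(1.0, 0.5) for _ in range(20000)]

mu, sigma = fit_lognormal(background)
query_score = math.exp(mu + 3.0 * sigma)  # a score far in the upper tail
p = upper_tail_p_value(query_score, mu, sigma)
```

In practice the parameters would be fitted separately for each scoring scheme, since the abstract reports one set of fitted parameters per scheme.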
A new analysis of quasar polarisation alignments
We propose a new method to analyse the alignment of optical polarisation
vectors from quasars. This method leads to a definition of intrinsic preferred
axes and to a determination of the probability that the
distribution of polarisation directions is random. This probability is found to
be as low as 0.003% for one of the regions of redshift. Comment: 20 pages, 9 figures
Polarization alignments of radio quasars in JVAS/CLASS surveys
We test the hypothesis that the polarization vectors of flat-spectrum radio
sources (FSRS) in the JVAS/CLASS 8.4-GHz surveys are randomly oriented on the
sky. The sample with robust polarization measurements is made of objects
and redshift information is known for of them. We performed two
statistical analyses: one in two dimensions and the other in three dimensions
when distance is available. We find significant large-scale alignments of
polarization vectors for samples containing only quasars (QSO) among the
varieties of FSRS's. While these correlations prove difficult to explain either
by a physical effect or by biases in the dataset, the fact that the QSO's which
have significantly aligned polarization vectors are found in regions of the sky
where optical polarization alignments were previously found is striking. Comment: 13 pages, 9 figures, submitted to MNRAS
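The null hypothesis of randomly oriented polarization vectors can be illustrated with a simple Monte Carlo test. The sketch below is not the statistic used in the paper: it is a generic test for axial data (angles defined modulo 180 degrees) based on the mean resultant length of the doubled angles, and it assumes all angles are measured in one common reference frame, which ignores the coordinate dependence of position angles across the sky. All names and the example data are hypothetical.

```python
import math
import random


def axial_resultant(angles_deg):
    """Mean resultant length of doubled angles (axial data, period 180 deg)."""
    c = sum(math.cos(math.radians(2.0 * a)) for a in angles_deg)
    s = sum(math.sin(math.radians(2.0 * a)) for a in angles_deg)
    return math.hypot(c, s) / len(angles_deg)


def random_orientation_p_value(angles_deg, n_sim=2000, seed=0):
    """Fraction of uniform random samples at least as aligned as the data."""
    rng = random.Random(seed)
    observed = axial_resultant(angles_deg)
    n = len(angles_deg)
    hits = 0
    for _ in range(n_sim):
        sim = [rng.uniform(0.0, 180.0) for _ in range(n)]
        if axial_resultant(sim) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


# Synthetic example: 50 angles clustered around 40 degrees.
rng = random.Random(1)
aligned = [rng.gauss(40.0, 10.0) % 180.0 for _ in range(50)]
p_aligned = random_orientation_p_value(aligned)
```

Doubling the angles before summing makes directions that differ by 180 degrees count as identical, which is the appropriate symmetry for polarization vectors.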
Statistical Power, the Bispectrum and the Search for Non-Gaussianity in the CMB Anisotropy
We use simulated maps of the cosmic microwave background anisotropy to
quantify the ability of different statistical tests to discriminate between
Gaussian and non-Gaussian models. Despite the central limit theorem on large
angular scales, both the genus and extrema correlation are able to discriminate
between Gaussian models and a semi-analytic texture model selected as a
physically motivated non-Gaussian model. When run on the COBE 4-year CMB maps,
both tests prefer the Gaussian model. Although the bispectrum has comparable
statistical power when computed on the full sky, once a Galactic cut is imposed
on the data the bispectrum loses the ability to discriminate between models.
Off-diagonal elements of the bispectrum are comparable to the diagonal elements
for the non-Gaussian texture model and must be included to obtain maximum
statistical power. Comment: Accepted for publication in ApJ; 20 pages, 6 figures, uses AASTeX v5.
Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies
Existing sequence alignment algorithms use heuristic scoring schemes which
cannot be used as objective distance metrics. Therefore one relies on measures
like the p- or log-det distances, or makes explicit, and often simplistic,
assumptions about sequence evolution. Information theory provides an
alternative, in the form of mutual information (MI) which is, in principle, an
objective and model independent similarity measure. MI can be estimated by
concatenating and zipping sequences, yielding thereby the "normalized
compression distance". So far this has produced promising results, but with
uncontrolled errors. We describe a simple approach to get robust estimates of
MI from global pairwise alignments. Using standard alignment algorithms, this
gives for animal mitochondrial DNA estimates that are strikingly close to
estimates obtained from the alignment free methods mentioned above. Our main
result uses algorithmic (Kolmogorov) information theory, but we show that
similar results can also be obtained from Shannon theory. Due to the fact that
it is not additive, normalized compression distance is not an optimal metric
for phylogenetics, but we propose a simple modification that overcomes the
issue of additivity. We test several versions of our MI based distance measures
on a large number of randomly chosen quartets and demonstrate that they all
perform better than traditional measures like the Kimura or log-det (resp.
paralinear) distances. Even a simplified version based on single letter Shannon
entropies, which can be easily incorporated in existing software packages, gave
superior results throughout the entire animal kingdom. But we see the main
virtue of our approach in a more general way. For example, it can also help to
judge the relative merits of different alignment algorithms, by estimating the
significance of specific alignments. Comment: 19 pages + 16 pages of supplementary material
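The "normalized compression distance" mentioned in this abstract is the alignment-free baseline, not the authors' alignment-based MI estimate. A minimal sketch of it, using the commonly cited definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)) with zlib as the compressor C, is shown below. The choice of zlib and the synthetic DNA strings are assumptions made for illustration only.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(rng, length):
    return [rng.choice("ACGT") for _ in range(length)]


rng = random.Random(0)
seq_a = random_dna(rng, 2000)

# A "related" sequence: copy of seq_a with about 2% of sites re-drawn.
seq_b = list(seq_a)
for i in rng.sample(range(len(seq_b)), 40):
    seq_b[i] = rng.choice("ACGT")

# An "unrelated" sequence drawn independently.
seq_c = random_dna(rng, 2000)

a = "".join(seq_a).encode()
b = "".join(seq_b).encode()
c = "".join(seq_c).encode()

d_related = ncd(a, b)
d_unrelated = ncd(a, c)
```

A general-purpose compressor only approximates the Kolmogorov complexity the theory refers to, which is one source of the uncontrolled errors the abstract mentions for this kind of estimate.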
Large Scale Cosmological Anomalies and Inhomogeneous Dark Energy
A wide range of large scale observations hint towards possible modifications
to the standard cosmological model, which is based on a homogeneous and
isotropic universe with a small cosmological constant and matter. These
observations, also known as "cosmic anomalies" include unexpected Cosmic
Microwave Background perturbations on large angular scales, large dipolar
peculiar velocity flows of galaxies ("bulk flows"), the measurement of
inhomogeneous values of the fine structure constant on cosmological scales
("alpha dipole") and other effects. The presence of the observational anomalies
could either be a large statistical fluctuation in the context of ΛCDM or it
could indicate a non-trivial departure from the cosmological principle on
Hubble scales. Such a departure is very much constrained by cosmological
observations for matter. For dark energy however there are no significant
observational constraints for Hubble scale inhomogeneities. In this brief
review I discuss some of the theoretical models that can naturally lead to
inhomogeneous dark energy, their observational constraints and their potential
to explain the large scale cosmic anomalies. Comment: 42 pages, 15 figures, Invited Review published in 'Galaxies' at
http://www.mdpi.com/2075-4434/2/1/2
An optimized TOPS+ comparison method for enhanced TOPS models
This article has been made available through the Brunel Open Access Publishing Fund.
Background
Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, the results can lack a high degree of biological significance. Previously we have discussed the basic mechanisms of our novel method for structure comparison based on our TOPS+ model (Topological descriptions of Protein Structures Enhanced with Ligand Information). In this paper we show how these results can be significantly improved using parameter optimization; we call the resulting optimised method the advanced TOPS+ comparison method, i.e. advTOPS+.
Results
We have developed a TOPS+ string model as an improvement to the TOPS [1-3] graph model by considering loops as secondary structure elements (SSEs) in addition to helices and strands, representing ligands as first class objects, and describing interactions between SSEs, and SSEs and ligands, by incoming and outgoing arcs, annotating SSEs with the interaction direction and type. Benchmarking results of an all-against-all pairwise comparison using a large dataset of 2,620 non-redundant structures from the PDB40 dataset [4] demonstrate the biological significance, in terms of SCOP classification at the superfamily level, of our TOPS+ comparison method.
Conclusions
Our advanced TOPS+ comparison shows better performance on the PDB40 dataset [4] than our basic TOPS+ method, giving 90 percent accuracy for SCOP alpha+beta, a 6 percent increase in accuracy over the TOPS and basic TOPS+ methods. It also outperforms the TOPS, basic TOPS+ and SSAP comparison methods on the Chew-Kedem dataset [5], achieving 98 percent accuracy. Software Availability: The TOPS+ comparison server is available at http://balabio.dcs.gla.ac.uk/mallika/WebTOPS/.