
    A network-based dynamical ranking system for competitive sports

    From the viewpoint of networks, a ranking system for players or teams in sports is equivalent to a centrality measure for sports networks, whereby a directed link represents the result of a single game. Previously proposed network-based ranking systems are derived from static networks, i.e., aggregation of the results of games over time. However, the score of a player (or team) fluctuates over time. Defeating a renowned player at peak performance is intuitively more rewarding than defeating the same player in other periods. To account for this factor, we propose a dynamic variant of such a network-based ranking system and apply it to professional men's tennis data. We derive a set of linear online update equations for the score of each player. The proposed ranking system predicts the outcomes of future games with higher accuracy than its static counterparts. Comment: 6 figures

    Shape-based peak identification for ChIP-Seq

    We present a new algorithm for the identification of bound regions from ChIP-seq experiments. Our method for identifying statistically significant peaks from read coverage is inspired by the notion of persistence in topological data analysis and provides a non-parametric approach that is robust to noise in experiments. Specifically, our method reduces the peak calling problem to the study of tree-based statistics derived from the data. We demonstrate the accuracy of our method on existing datasets, and we show that it can discover previously missed regions and can more clearly discriminate between multiple binding events. The software T-PIC (Tree shape Peak Identification for ChIP-Seq) is available at http://math.berkeley.edu/~vhower/tpic.html. Comment: 12 pages, 6 figures

    Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434

    Developing and applying heterogeneous phylogenetic models with XRate

    Modeling sequence evolution on phylogenetic trees is a useful technique in computational biology. Especially powerful are models which take account of the heterogeneous nature of sequence evolution according to the "grammar" of the encoded gene features. However, beyond a modest level of model complexity, manual coding of models becomes prohibitively labor-intensive. We demonstrate, via a set of case studies, the new built-in model-prototyping capabilities of XRate (macros and Scheme extensions). These features allow rapid implementation of phylogenetic models which would have previously been far more labor-intensive. XRate's new capabilities for lineage-specific models, ancestral sequence reconstruction, and improved annotation output are also discussed. XRate's flexible model-specification capabilities and computational efficiency make it well-suited to developing and prototyping phylogenetic grammar models. XRate is available as part of the DART software package: http://biowiki.org/DART. Comment: 34 pages, 3 figures, glossary of XRate model terminology

    Evolutionary Toggling of Vpx/Vpr Specificity Results in Divergent Recognition of the Restriction Factor SAMHD1

    SAMHD1 is a host restriction factor that blocks the ability of lentiviruses such as HIV-1 to undergo reverse transcription in myeloid cells and resting T-cells. This restriction is alleviated by expression of the lentiviral accessory proteins Vpx and Vpr (Vpx/Vpr), which target SAMHD1 for proteasome-mediated degradation. However, the precise determinants within SAMHD1 for recognition by Vpx/Vpr remain unclear. Here we show that evolution of Vpx/Vpr in primate lentiviruses has caused the interface between SAMHD1 and Vpx/Vpr to alter during primate lentiviral evolution. Using multiple HIV-2 and SIV Vpx proteins, we show that Vpx from the HIV-2 and SIVmac lineage, but not Vpx from the SIVmnd2 and SIVrcm lineage, requires the C-terminus of SAMHD1 for interaction, ubiquitylation, and degradation. On the other hand, the N-terminus of SAMHD1 governs interactions with Vpx from SIVmnd2 and SIVrcm, but has little effect on Vpx from HIV-2 and SIVmac. Furthermore, we show here that this difference in SAMHD1 recognition is evolutionarily dynamic, with the importance of the N- and C-terminus for interaction of SAMHD1 with Vpx and Vpr toggling during lentiviral evolution. We present a model to explain how the head-to-tail conformation of SAMHD1 proteins favors toggling of the interaction sites by Vpx/Vpr during this virus-host arms race. Such drastic functional divergence within a lentiviral protein highlights a novel plasticity in the evolutionary dynamics of viral antagonists for restriction factors during lentiviral adaptation to its hosts. © 2013 Fregoso et al.

    Ferritins: furnishing proteins with iron

    Ferritins are a superfamily of iron oxidation, storage and mineralization proteins found throughout the animal, plant, and microbial kingdoms. The majority of ferritins consist of 24 subunits that individually fold into 4-α-helix bundles and assemble in a highly symmetric manner to form an approximately spherical protein coat around a central cavity into which an iron-containing mineral can be formed. Channels through the coat at inter-subunit contact points facilitate passage of iron ions to and from the central cavity, and intrasubunit catalytic sites, called ferroxidase centers, drive Fe²⁺ oxidation and O₂ reduction. Though the different members of the superfamily share a common structure, there is often little amino acid sequence identity between them. Even where there is a high degree of sequence identity between two ferritins there can be major differences in how the proteins handle iron. In this review we describe some of the important structural features of ferritins and their mineralized iron cores and examine in detail how three selected ferritins oxidise Fe²⁺ in order to explore the mechanistic variations that exist amongst ferritins. We suggest that the mechanistic differences reflect differing evolutionary pressures on amino acid sequences, and that these differing pressures are a consequence of different primary functions for different ferritins.

    Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

    Background: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators have been proposed, including the maximum expected accuracy (MEA) estimator. MEA-based estimators are designed to maximize the expected accuracy of the base pairs and have achieved the highest level of accuracy. Those methods, however, do not give a single best prediction of the structure, but employ parameters to control the trade-off between sensitivity and positive predictive value (PPV). It is unclear what parameter value should be used, and even a well-trained default parameter value does not, in general, give the best result under popular accuracy measures for each RNA sequence.
    Results: Instead of the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that secondary structures that balance sensitivity and PPV well can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator.
    Conclusions: This study gives not only a method for predicting a secondary structure that balances sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures, including MCC and F-score.
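    The idea of computing an accuracy measure from base-pairing probabilities can be illustrated with a minimal sketch. It assumes, as one reading of the abstract and not as the authors' published definition, that expected true-positive, false-positive, false-negative and true-negative counts are plugged into the usual formulas for sensitivity, PPV, F-score and MCC. The function name `pseudo_expected_scores`, the dictionary layout of `pair_probs`, and the toy probabilities are all hypothetical.

```python
import math


def pseudo_expected_scores(predicted_pairs, pair_probs, seq_len):
    """Plug expected TP/FP/FN/TN counts into standard accuracy formulas.

    predicted_pairs: set of (i, j) index pairs, i < j, in the candidate structure.
    pair_probs: dict mapping (i, j) to a base-pairing probability (assumed input).
    seq_len: length of the RNA sequence.
    """
    predicted_pairs = set(predicted_pairs)
    n_possible = seq_len * (seq_len - 1) // 2  # all index pairs i < j
    total_prob = sum(pair_probs.values())      # expected number of true pairs

    tp = sum(pair_probs.get(p, 0.0) for p in predicted_pairs)
    fp = len(predicted_pairs) - tp
    fn = total_prob - tp
    tn = n_possible - tp - fp - fn

    sens = tp / (tp + fn) if tp + fn > 0 else 0.0
    ppv = tp / (tp + fp) if tp + fp > 0 else 0.0
    f_score = 2 * sens * ppv / (sens + ppv) if sens + ppv > 0 else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"sensitivity": sens, "ppv": ppv, "f_score": f_score, "mcc": mcc}


def best_candidate(candidates, pair_probs, seq_len, measure="f_score"):
    """Pick the candidate structure with the highest pseudo-expected score."""
    return max(
        candidates,
        key=lambda s: pseudo_expected_scores(s, pair_probs, seq_len)[measure],
    )
```

    In this sketch the candidates passed to `best_candidate` stand in for structures drawn by stochastic sampling; how those samples are generated is outside its scope.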

    The Cryptic African Wolf: Canis aureus lupaster Is Not a Golden Jackal and Is Not Endemic to Egypt

    The Egyptian jackal (Canis aureus lupaster) has hitherto been considered a large, rare subspecies of the golden jackal (C. aureus). It has maintained its taxonomic status to date, despite studies demonstrating morphological similarities to the grey wolf (C. lupus). We have analyzed 2055 bp of mitochondrial DNA from C. a. lupaster and investigated the similarity to C. aureus and C. lupus. Through phylogenetic comparison with all wild wolf-like canids (based on 726 bp of the Cytochrome b gene) we conclusively (100% bootstrap support) place the Egyptian jackal within the grey wolf species complex, together with the Holarctic wolf, the Indian wolf and the Himalayan wolf. Like the two latter taxa, C. a. lupaster seems to represent an ancient wolf lineage which most likely colonized Africa prior to the northern hemisphere radiation. We thus refer to C. a. lupaster as the African wolf. Furthermore, we have detected C. a. lupaster individuals at two localities in the Ethiopian highlands, extending the distribution by at least 2,500 km southeast. The only grey wolf species to inhabit the African continent is a cryptic species for which the conservation status urgently needs assessment.

    Spatio-Temporal Characteristics of Global Warming in the Tibetan Plateau during the Last 50 Years Based on a Generalised Temperature Zone - Elevation Model

    Temperature is one of the primary factors influencing the climate and ecosystem, and examining its change and fluctuation could elucidate the formation of novel climate patterns and trends. In this study, we constructed a generalised temperature zone elevation model (GTEM) to assess the trends of climate change and temporal-spatial differences in the Tibetan Plateau (TP) using the annual and monthly mean temperatures from 1961 to 2010 at 144 meteorological stations in and near the TP. The results showed the following: (1) The TP has undergone robust warming over the study period, and the warming rate was 0.318°C/decade. The warming has accelerated during recent decades, especially in the last 20 years, and the warming has been most significant in the winter months, followed by the spring, autumn and summer seasons. (2) Spatially, the zones that became significantly smaller were the temperature zones of -6°C and -4°C, and these have decreased by 499.44 and 454.26 thousand sq km from 1961 to 2010 at average rates of 25.1% and 11.7%, respectively, over every 5-year interval. These quickly shrinking zones were located in the northwestern and central TP. (3) The elevation dependency of climate warming existed in the TP during 1961-2010, but this tendency has gradually been weakening due to more rapid warming at lower elevations than in the middle and upper elevations of the TP during 1991-2010. The higher regions and some low altitude valleys of the TP were the most significantly warming regions under the same categorizing criteria. Experimental evidence shows that the GTEM is an effective method to analyse climate changes in high altitude mountainous regions.
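    A warming rate expressed in °C/decade, such as the one quoted above, is commonly obtained as the slope of a linear fit of annual mean temperature against year, multiplied by ten. The sketch below assumes an ordinary least-squares fit; the abstract does not say which trend estimator the study used, and the function name `warming_rate_per_decade` and the synthetic series in the usage lines are illustrative only, not the study's data.

```python
def warming_rate_per_decade(years, temps_c):
    """Least-squares slope of temperature vs. year, scaled to °C per decade."""
    n = len(years)
    if n < 2 or n != len(temps_c):
        raise ValueError("need two or more (year, temperature) pairs of equal length")

    mean_year = sum(years) / n
    mean_temp = sum(temps_c) / n
    sxx = sum((x - mean_year) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all years are identical; slope is undefined")
    sxy = sum((x - mean_year) * (y - mean_temp) for x, y in zip(years, temps_c))

    slope_per_year = sxy / sxx
    return 10.0 * slope_per_year


# Synthetic, noise-free series constructed to warm by 0.0318 °C per year.
years = list(range(1961, 2011))
temps = [-2.0 + 0.0318 * (y - 1961) for y in years]
rate = warming_rate_per_decade(years, temps)
```

    With real station records the series would be noisy, so a significance test on the slope would normally accompany the estimate.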