    MINER: software for phylogenetic motif identification

    MINER is web-based software for phylogenetic motif (PM) identification. PMs are sequence regions (fragments) that conserve the overall familial phylogeny. PMs have been shown to correspond to a wide variety of catalytic regions, substrate-binding sites and protein interfaces, making them ideal functional site predictions. The MINER output provides an intuitive interface for interactive PM sequence analysis and structural visualization. The web implementation of MINER is freely available at . Source code is available to the academic community on request.

    Predicting functional sites with an automated algorithm suitable for heterogeneous datasets

    BACKGROUND: In a previous report (La et al., Proteins, 2005), we demonstrated that the identification of phylogenetic motifs, protein sequence fragments conserving the overall familial phylogeny, represents a promising approach for sequence/function annotation. Across a structurally and functionally heterogeneous dataset, phylogenetic motifs have been demonstrated to correspond to a wide variety of functional site archetypes, including those defined by surface loops, active site clefts, and less exposed regions. However, in our original demonstration of the technique, phylogenetic motif identification depended upon a manually determined similarity threshold, prohibiting large-scale application of the technique. RESULTS: In this report, we present an algorithmic approach that determines thresholds without human subjectivity. The approach relies on significant raw data preprocessing to improve signal detection. Subsequently, Partition Around Medoids Clustering (PAMC) of the similarity scores assesses sequence fragments where functional annotation remains in question. The accuracy of the approach is confirmed through comparisons to our previous (manual) results and structural analyses. Triosephosphate isomerase and arginyl-tRNA synthetase are discussed as exemplar cases. A quantitative functional site prediction assessment algorithm indicates that the phylogenetic motif predictions, which require sequence information only, are nearly as good as those from evolutionary trace methods that do incorporate structure. CONCLUSION: The automated threshold detection algorithm has been incorporated into MINER, our web-based phylogenetic motif identification server. MINER is freely available on the web at . Pre-calculated functional site predictions of the COG database and an implementation of the threshold detection algorithm, in the R statistical language, can also be accessed at the website.
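
As a rough illustration of how clustering can stand in for a hand-picked threshold, the sketch below partitions one-dimensional scores around two medoids. It is not the MINER algorithm or its R implementation: the function name `pam_1d`, the toy scores, and the choice of two clusters are assumptions made for illustration, and the exhaustive search over medoid sets (feasible only for tiny inputs) replaces the build/swap heuristic that PAM normally uses to minimise the same objective.

```python
from itertools import combinations

def pam_1d(scores, k=2):
    """Exhaustive k-medoids for small 1-D data: try every set of k medoids and
    keep the set with the lowest total distance to the nearest medoid."""
    best_cost, best_medoids = None, None
    for medoids in combinations(sorted(set(scores)), k):
        cost = sum(min(abs(s - m) for m in medoids) for s in scores)
        if best_cost is None or cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Assign each score to the index of its nearest medoid.
    labels = [min(range(k), key=lambda i: abs(s - best_medoids[i])) for s in scores]
    return best_medoids, labels

# Hypothetical per-window similarity scores (not data from the paper).
scores = [0.10, 0.12, 0.15, 0.80, 0.85, 0.95]
medoids, labels = pam_1d(scores)
print(medoids, labels)
```

The boundary between the two clusters then plays the role of the threshold. Which cluster counts as motif-like depends on the scoring convention, which this sketch does not assume.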

    Searching for evolutionary distant RNA homologs within genomic sequences using partition function posterior probabilities

    BACKGROUND: Identification of RNA homologs within genomic stretches is difficult when pairwise sequence identity is low or unalignable flanking residues are present. In both cases structure-sequence or profile/family-sequence alignment programs become difficult to apply because of unreliable RNA structures or family alignments. As such, local sequence-sequence alignment programs are frequently used instead. We have recently demonstrated that maximal expected accuracy alignments using partition function match probabilities (implemented in Probalign) are significantly better than contemporary methods on heterogeneous length protein sequence datasets, thus suggesting an affinity for local alignment. RESULTS: We create a pairwise RNA-genome alignment benchmark from RFAM families with average pairwise sequence identity up to 60%. Each dataset contains a query RNA aligned to a target RNA (of the same family) embedded in a genomic sequence at least 5K nucleotides long. To simulate common conditions when exact ends of an ncRNA are unknown, each query RNA has 5' and 3' genomic flanks of size 50, 100, and 150 nucleotides. We subsequently compare the error of the Probalign program (adjusted for local alignment) to the commonly used local alignment programs HMMER, SSEARCH, and BLAST, and the popular ClustalW program with zero end-gap penalties. Parameters were optimized for each program on a small subset of the benchmark. Probalign has overall highest accuracies on the full benchmark. It leads by 10% accuracy over SSEARCH (the next best method) on 5 out of 22 families. On datasets restricted to a maximum of 30% sequence identity, Probalign's overall median error is 71.2% vs. 83.4% for SSEARCH (P-value < 0.05). Furthermore, on these datasets Probalign leads SSEARCH by at least 10% on five families; SSEARCH leads Probalign by the same margin on two of the fourteen families. We also demonstrate that the Probalign mean posterior probability, compared to the normalized SSEARCH Z-score, is a better discriminator of alignment quality. All datasets and software are available online. CONCLUSION: We demonstrate, for the first time, that partition function match probabilities used for expected accuracy alignment, as done in Probalign, provide statistically significant improvement over current approaches for identifying distantly related RNA sequences in larger genomic segments.
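
The idea of a maximal expected accuracy alignment can be sketched as a small dynamic program: given a matrix of posterior match probabilities, pick the order-preserving set of aligned pairs with the largest total probability. This is a minimal sketch and not Probalign's code. The function name `mea_alignment` is hypothetical, and the matrix `post` is hand-made for illustration rather than computed from a partition function.

```python
def mea_alignment(post):
    """Return (score, pairs) maximising the summed posterior match
    probability over an order-preserving set of aligned position pairs."""
    n, m = len(post), len(post[0])
    # best[i][j]: best score using the first i query and first j target positions.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + post[i - 1][j - 1],
                             best[i - 1][j],
                             best[i][j - 1])
    # Trace back through the table to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return best[n][m], pairs[::-1]

# Toy posterior matrix: 2 query positions against 3 target positions.
post = [[0.9, 0.05, 0.0],
        [0.1, 0.2, 0.7]]
score, pairs = mea_alignment(post)
print(score, pairs)
```

In this objective, positions left unaligned add nothing to the score, so target position 1 is simply skipped in the toy example.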

    The Influence of Sexual Assault and Fear of Crime on Judgments of Rational Discrimination

    Female undergraduates rated the rationality of using gender stereotypes in several potentially dangerous situations. We tested whether sexual assault history and fear of crime moderated perceptions of the use of gender stereotypes in public and private settings. Primary results revealed differences in ratings among victims and nonvictims of sexual assault as a function of type of setting. Additionally, fear of crime increased ratings of rationality in nighttime public situations. The implications of these results are discussed in the context of the “rational discrimination” phenomenon (Khan & Lambert, 2001).

    Analysis of non-unique solutions in mean field games

    This thesis investigates cases when solutions to a mean field game (MFG) are non-unique. The symmetric Markov perfect information N-player game is considered and restricted to finite states and continuous time. The players' transitions are random with a parameter determined by their control. There is a unique joint distribution of the players for the symmetric Markov perfect equilibrium, but there can be multiple solutions to the MFG equations. This thesis focuses on understanding the behaviors of the many MFG solutions for the 2-state case. It explores methods to determine which MFG solution represents the fluid limit trajectories of the N-player system for large populations. It also investigates the MFG map, which acts on the MFG distributions and outputs a prediction of the population's distribution based on the expected response of any given player. The MFG solutions are exactly the fixed points of the MFG map. The MFG solution that approximates large population trajectories is conjectured to be the only stable point for the MFG map. A second concept investigated is social cost, which is the average accumulated cost per player. But as is shown, the social cost is not a good indicator of which MFG solution approximates large population trajectories. A set, called the bifurcation set, is defined by there being some possibility of multiple trajectories of a large population. Another important set is the indifference set, which indicates when the transition rate of the players to a state is positively reinforced by an increase of the empirical distribution of that state. However, numerical results are given, indicating that the fluid limit trajectory may relate to stability of the MFG map. It appears the MFG map is difficult to handle in many ways; stability of the mapping is difficult to show, even in a simple example, and there are numerical anomalies such that non-fixed points appear to be numerically stable under rigorous tests.

    The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning with Linear Value Function Approximation

    Function approximation is widely used in reinforcement learning to handle the computational difficulties associated with very large state spaces. However, function approximation introduces errors which may lead to instabilities when using approximate dynamic programming techniques to obtain the optimal policy. Therefore, techniques such as lookahead for policy improvement and m-step rollout for policy evaluation are used in practice to improve the performance of approximate dynamic programming with function approximation. We quantitatively characterize, for the first time, the impact of lookahead and m-step rollout on the performance of approximate dynamic programming (DP) with function approximation: (i) without a sufficient combination of lookahead and m-step rollout, approximate DP may not converge, (ii) both lookahead and m-step rollout improve the convergence rate of approximate DP, and (iii) lookahead helps mitigate the effect of function approximation and the discount factor on the asymptotic performance of the algorithm. Our results are presented for two approximate DP methods: one which uses least-squares regression to perform function approximation and another which performs several steps of gradient descent of the least-squares objective in each iteration. Comment: 36 pages, 4 figures
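
One plausible reading of the least-squares variant can be sketched on a toy problem: apply the Bellman optimality operator a few times to the current linear value estimate (lookahead), act greedily, roll the resulting policy forward for m steps, and refit the linear weights by least squares. The 3-state MDP, the features, the helper names, and the parameter values below are all invented for illustration. The paper's exact algorithm and assumptions may differ.

```python
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
# Hypothetical chain MDP. P[a][s] lists next-state probabilities; state 2 is absorbing.
P = [
    [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]],  # action 0: mostly stay
    [[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.0, 0.0, 1.0]],  # action 1: mostly advance
]
R = [[0.0, 0.0, 1.0], [-0.1, -0.1, 1.0]]  # R[a][s]; advancing costs 0.1
X = [0.0, 1.0, 2.0]  # linear features per state are (1, X[s])

def q_value(s, a, v):
    """One-step return of action a in state s with terminal values v."""
    return R[a][s] + GAMMA * sum(p * v[t] for t, p in enumerate(P[a][s]))

def bellman_opt(v):
    """Apply the Bellman optimality operator once."""
    return [max(q_value(s, a, v) for a in range(N_ACTIONS)) for s in range(N_STATES)]

def greedy(v):
    """Greedy policy with respect to v (ties go to the lower action index)."""
    return [max(range(N_ACTIONS), key=lambda a: q_value(s, a, v)) for s in range(N_STATES)]

def fit_linear(targets):
    """Least-squares fit of targets by intercept + slope * X[s]."""
    mx, my = sum(X) / len(X), sum(targets) / len(targets)
    slope = (sum((x - mx) * (y - my) for x, y in zip(X, targets))
             / sum((x - mx) ** 2 for x in X))
    return [my - slope * mx, slope]

def approx_dp_step(theta, lookahead, m):
    v = [theta[0] + theta[1] * x for x in X]   # current linear estimate
    for _ in range(lookahead - 1):             # lookahead: H - 1 optimal backups
        v = bellman_opt(v)
    policy = greedy(v)                         # greedy step completes the lookahead
    for _ in range(m):                         # m-step rollout of that policy
        v = [q_value(s, policy[s], v) for s in range(N_STATES)]
    return fit_linear(v), policy               # project back onto the features

theta, policy = [0.0, 0.0], None
for _ in range(10):
    theta, policy = approx_dp_step(theta, lookahead=2, m=3)
print(theta, policy)
```

On this toy chain the iteration settles on advancing in states 0 and 1. Varying `lookahead` and `m` is a simple way to see how the two knobs interact with the approximation error.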