438 research outputs found

    Improving location prediction services for new users with probabilistic latent semantic analysis

    Location prediction systems that attempt to determine the mobility patterns of individuals in their daily lives have become increasingly common in recent years. Approaches to this prediction task include eigenvalue decomposition [5], non-linear time series analysis of arrival times [10], and variable order Markov models [1]. However, these approaches all assume sufficient sets of training data. For new users, by definition, this data is typically not available, leading to poor predictive performance. Given that mobility is a highly personal behaviour, this represents a significant barrier to entry. Against this background, we present a novel framework to enhance prediction using information about the mobility habits of existing users. At the core of the framework is a hierarchical Bayesian model, a type of probabilistic semantic analysis [7], representing the intuition that the temporal features of the new user’s location habits are likely to be similar to those of an existing user in the system. We evaluate this framework on the real-life location habits of 38 users in the Nokia Lausanne dataset, showing that accuracy is improved by 16%, relative to the state of the art, when predicting the next location of new users.

    Recurring cluster and operon assembly for Phenylacetate degradation genes

    Background: A large number of theories have been advanced to explain why genes involved in the same biochemical processes are often co-located in genomes. Most of these theories have been dismissed because empirical data do not match the expectations of the models. In this work we test the hypothesis that cluster formation is most likely due to a selective pressure to gradually co-localise protein products and that operon formation is not an inevitable conclusion of the process.
    Results: We have selected an exemplar well-characterised biochemical pathway, the phenylacetate degradation pathway, and we show that its complex history is only compatible with a model where a selective advantage accrues from moving genes closer together. This selective pressure is likely to be reasonably weak and only twice in our dataset of 102 genomes do we see independent formation of a complete cluster containing all the catabolic genes in the pathway. Additionally, de novo clustering of genes clearly occurs repeatedly, even though recombination should result in the random dispersal of such genes in their respective genomes. Interspecies gene transfer has frequently replaced in situ copies of genes, resulting in clusters that have similar content but very different evolutionary histories.
    Conclusion: Our model for cluster formation in prokaryotes, therefore, consists of a two-stage selection process. The first stage is selection to move genes closer together, either because of macromolecular crowding, chromatin relaxation or transcriptional regulation pressure. This proximity opportunity sets up a separate selection for co-transcription.

    The tree of genomes: An empirical comparison of genome-phylogeny reconstruction methods

    Background: In the past decade or more, the emphasis for reconstructing species phylogenies has moved from the analysis of a single gene to the analysis of multiple genes and even completed genomes. The simplest method of scaling up is to use familiar analysis methods on a larger scale and this is the most popular approach. However, duplications and losses of genes along with horizontal gene transfer (HGT) can lead to a situation where there is only an indirect relationship between gene and genome phylogenies. In this study we examine five widely-used approaches and their variants to see if indeed they are more-or-less saying the same thing. In particular, we focus on Conditioned Reconstruction as it is a method that is designed to work well even if HGT is present.
    Results: We confirm a previous suggestion that this method has a systematic bias. We show that no two methods produce the same results and most current methods of inferring genome phylogenies produce results that are significantly different to other methods.
    Conclusion: We conclude that genome phylogenies need to be interpreted differently, depending on the method used to construct them.

    On the desirability of models for inferring genome phylogenies

    Genomes are clearly suited for inferring common ancestry and for understanding ancestor–descendent relationships and interspecies gene transfer. Genomic evolutionary models can tell us a great deal about the processes that drive genome evolution, the mutational and selective pressures that lead to the genesis of biochemical pathways and operons, and the nature and extent of lateral gene transfer (LGT). Simultaneously, a robust phylogeny can be constructed that depicts the evolutionary relationships of the organisms in which the genomes are found. Several approaches have been employed to infer species phylogenies at the genome level. In general terms, these can be divided into ad hoc summary statistics based on genome content, the use of concatenated alignments and the use of consensus methods (i.e. phylogenetic supertrees [1]

    Maintenance mass of mature beef cows

    Weights, heights and compositional data were collected on cattle resulting from a four-breed diallel crossbreeding study. Breeds involved were Angus, Hereford, Holstein, and Brown Swiss. Cattle were raised and maintained at two Iowa State University research farms. Objectives of the study were to examine the growth pattern of a population of cows differing widely in productive characteristics and size, to examine the relative tissue growth of steers from the same population, and to hypothesize about the effect of size and composition differences in the cow population and the subsequent effect on production efficiency. Growth pattern was quantified through nonlinear regression of cow weights from birth through six years of age. Asymptotic mature weight was significantly affected by breed of sire, breed of dam, and calfhood management. Crossbred cows were 3.4% heavier at maturity than straightbred cows. Cows with dairy dams and beef sires were 2.7% heavier at maturity than the reciprocal cross. Rate of maturing was 3.7% greater for crossbred cows than straightbred cows. Cows with beef sires and dairy dams matured 3.9% more rapidly than the reciprocal cross. Correlation of maturing rate and mature size was -0.60. Large breed differences were evident in height measurements. Heterosis for height at birth, 180 days, 365 days and maturity was 0.0%, 0.8%, 1.2% and 0.3%, respectively. Large breed and management differences were evident in the tissue composition of steers. Leaner breed combinations were generally associated with larger mature cow size. The study indicates that differences in mass of metabolically active tissue are greater than would be indicated by mature cow weight alone. Maintenance mass of larger cows of dairy breeding appears underestimated relative to the smaller, fatter beef breeds.

    The causes of protein evolutionary rate variation

    The rate of protein evolution varies more than 1000-fold and, for the past 30 years, it was thought that the rate was determined by protein function. Drummond and co-workers have now shown that a single factor underlying mRNA expression, protein abundance and synonymous codon usage is the chief causal agent of protein evolutionary rate in yeast. It will be interesting to see whether this is shown to be a universal rule for all biological systems.

    MultiPhyl: a high-throughput phylogenomics webserver using distributed computing

    With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php
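
The abstract above mentions statistical methods for choosing between alternative substitution models without naming them. As an illustration only, the sketch below uses the Akaike information criterion (AIC = 2k - 2 ln L), one common criterion for this task; it is an assumption that AIC is among the methods MultiPhyl offers. The model names are standard, but the log-likelihood values are hypothetical, and the parameter counts cover substitution-model parameters only (branch lengths are assumed to be common to all fits).

```python
def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2 * num_params - 2 * log_likelihood


def select_model(fits):
    """Pick the model with the lowest AIC.

    fits maps a model name to (maximised log-likelihood, free parameters).
    Returns the best model name and the AIC score of every model.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits of three nucleotide models to one alignment.
fits = {
    "JC69": (-1250.0, 0),   # no free substitution parameters
    "HKY85": (-1200.0, 4),  # kappa + 3 free base frequencies
    "GTR": (-1198.5, 8),    # 5 free rates + 3 free base frequencies
}

best, scores = select_model(fits)
print(best, scores)
```

With these made-up numbers the richer GTR model fits slightly better than HKY85, but not by enough to offset the penalty for its four extra parameters.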

    New methods ring changes for the tree of life

    Relationships among prokaryotes and the origin of eukaryotes have both proven controversial, with results depending upon the gene sequences and methods used. Extensive horizontal gene transfer is one possible reason why inferring such deep phylogenetic relationships is difficult. In two recent papers, Lake and Rivera introduce new methods that can be used to reconstruct the genomic tree in the presence of horizontal gene transfers, but which suggest that a ring rather than a tree is a better representation of some parts of the history of life on Earth.
