145 research outputs found

    Retrospective Higher-Order Markov Processes for User Trails

    Users form information trails as they browse the web, check in with a geolocation, rate items, or consume media. A common problem is to predict what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been widely used to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains, in contrast, have too many parameters and suffer from overfitting the training data. Fitting these parameters with regularization and smoothing offers only mild improvements. In this paper we propose the retrospective higher-order Markov process (RHOMP) as a low-parameter model for such sequences. This model is a special case of a higher-order Markov chain where the transitions depend retrospectively on a single history state instead of an arbitrary combination of history states. There are two immediate computational advantages: the number of parameters is linear in the order of the Markov chain and the model can be fit to large state spaces. Furthermore, by providing a specific structure to the higher-order chain, RHOMPs improve the model accuracy by efficiently utilizing history states without risk of overfitting the data. We demonstrate how to estimate a RHOMP from data and we demonstrate the effectiveness of our method on various real application datasets spanning geolocation data, review sequences, and business locations. The RHOMP model uniformly outperforms higher-order Markov chains, Kneser-Ney regularization, and tensor factorizations in terms of prediction accuracy.
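
    As a rough illustration of the first-order baseline that the abstract above contrasts with RHOMP, the following sketch estimates transition probabilities from trails by counting and predicts the most likely next state. It is not the RHOMP estimator itself; the function names and the toy trails are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_first_order(trails):
    """Maximum-likelihood first-order Markov chain from a list of trails."""
    counts = defaultdict(Counter)
    for trail in trails:
        # Count every observed (previous state -> next state) transition.
        for prev, nxt in zip(trail, trail[1:]):
            counts[prev][nxt] += 1
    # Normalize each row of counts into transition probabilities.
    return {
        state: {t: c / sum(row.values()) for t, c in row.items()}
        for state, row in counts.items()
    }


def predict_next(model, state):
    """Most probable next state, or None if the state was never a source."""
    row = model.get(state)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical toy trails, e.g. sequences of check-in locations.
trails = [
    ["home", "cafe", "work"],
    ["home", "cafe", "gym"],
    ["home", "cafe", "work"],
]
model = fit_first_order(trails)
```

    Here `predict_next(model, "cafe")` depends only on the current state; a higher-order model would condition on earlier states as well, at the cost of many more parameters.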

    HypTrails: A Bayesian Approach for Comparing Hypotheses About Human Trails on the Web

    When users interact with the Web today, they leave sequential digital trails on a massive scale. Examples of such human trails include Web navigation, sequences of online restaurant reviews, or online music playlists. Understanding the factors that drive the production of these trails can be useful for, e.g., improving underlying network structures, predicting user clicks or enhancing recommendations. In this work, we present a general approach called HypTrails for comparing a set of hypotheses about human trails on the Web, where hypotheses represent beliefs about transitions between states. Our approach utilizes Markov chain models with Bayesian inference. The main idea is to incorporate hypotheses as informative Dirichlet priors and to leverage the sensitivity of Bayes factors on the prior for comparing hypotheses with each other. For eliciting Dirichlet priors from hypotheses, we present an adaptation of the so-called (trial) roulette method. We demonstrate the general mechanics and applicability of HypTrails by performing experiments with (i) synthetic trails for which we control the mechanisms that have produced them and (ii) empirical trails stemming from different domains including website navigation, business reviews and online music played. Our work expands the repertoire of methods available for studying human trails on the Web. Comment: Published in the proceedings of WWW'1
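
    The comparison described in the abstract above can be sketched with the standard Dirichlet-multinomial marginal likelihood of a first-order Markov chain. This is a minimal sketch under stated assumptions, not the HypTrails implementation: the trial roulette elicitation step is not reproduced, and the transition counts and prior pseudo-counts below are hypothetical values chosen for illustration.

```python
import math


def log_evidence(counts, alpha):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors.

    counts[i][j] is the number of observed transitions i -> j;
    alpha[i][j] is the Dirichlet pseudo-count for that transition.
    """
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        # Ratio of Dirichlet normalizers: B(alpha + n) / B(alpha), in log space.
        total += math.lgamma(sum(a_row)) - math.lgamma(sum(a_row) + sum(n_row))
        total += sum(
            math.lgamma(a + n) - math.lgamma(a) for a, n in zip(a_row, n_row)
        )
    return total


# Hypothetical counts in which users mostly move 0 -> 1 -> 2 -> 0.
counts = [[0, 9, 1], [1, 0, 9], [9, 1, 0]]

# Two hypothetical hypotheses expressed as pseudo-counts:
# no preference, versus a belief in the cyclic pattern.
uniform_prior = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
cyclic_prior = [[1, 6, 1], [1, 1, 6], [6, 1, 1]]

# Log Bayes factor; positive values favor the cyclic hypothesis.
log_bf = log_evidence(counts, cyclic_prior) - log_evidence(counts, uniform_prior)
```

    Because the evidence depends on the prior, a hypothesis whose pseudo-counts agree with the observed transitions receives a higher marginal likelihood than one whose pseudo-counts do not.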

    Modelos determinísticos de gestão de ativo/passivo: uma aplicação no Brasil (Deterministic asset/liability management models: an application in Brazil)

    This paper presents an application of Asset/Liability Management (ALM) optimization models in Brazil. As opposed to traditional profit maximization models, which are subject to risk limitations, these models seek to optimize the risk-reward relation. This paper aims to apply and adapt some existing asset/liability portfolio optimization models, presented in the literature, to the Brazilian reality. Some concepts about Brazilian pension funds are discussed and the applicability of the models is analyzed on the basis of data from a supplementary retirement plan belonging to a multi-sponsor Brazilian pension fund.

    Multimedia Vocabularies on the Semantic Web

    This document gives an overview of the state of the art in multimedia metadata formats. First, practically relevant vocabularies for developers of Semantic Web applications are listed according to their modality scope. The second part of the document focuses on the integration of multimedia vocabularies into the Semantic Web, that is, on formal representations of the vocabularies.

    Distribution and paleoenvironmental framework of middle Miocene marine vertebrates along the western side of the lower Ica Valley (East Pisco Basin, Peru).

    We report 130 vertebrate fossils preserved as bony elements and the co-occurring assemblage of fish teeth and spines from the lower strata of the Pisco Formation exposed along the western side of the lower Ica Valley (East Pisco Basin, Peru). Geological mapping at 1:10,000 scale reveals that all these fossils originate from the Langhian–Serravallian P0 allomember. In the study area, P0 is up to ∼40 m thick and features a sandy lower portion, reflecting shoreface deposition, that fines upwards into a package of offshore silts. Marine vertebrates only occur in the lower sandy layers and include whales, dolphins, reptiles, birds, and bony and cartilaginous fishes. The reconstructed paleoenvironment is consistent with a warm-water, marginal marine setting with a strong connection to the open ocean. This work helps to elucidate the rich yet still poorly understood middle Miocene portions of the Pisco Formation, and highlights the need to conserve this outstanding Fossil-Lagerstätte.

    Reversible Data Perturbation Techniques for Multi-level Privacy-preserving Data Publication

    The amount of digital data generated in the Big Data age is increasing rapidly. Privacy-preserving data publishing techniques based on differential privacy through data perturbation provide a safe release of datasets such that sensitive information present in the dataset cannot be inferred from the published data. Existing privacy-preserving data publishing solutions have focused on publishing a single snapshot of the data with the assumption that all users of the data share the same level of privilege and access the data with a fixed privacy level. Thus, such schemes do not directly support data release in cases when data users have different levels of access to the published data. While a straightforward approach of releasing a separate snapshot of the data for each possible data access level can allow multi-level access, it can result in a higher storage cost, requiring separate storage space for each instance of the published data. In this paper, we develop a set of reversible data perturbation techniques for large bipartite association graphs that use perturbation keys to control the sequential generation of multiple snapshots of the data to offer multi-level access based on privacy levels. The proposed schemes enable multi-level data privacy, allowing selective de-perturbation of the published data when suitable access credentials are provided. We evaluate the techniques through extensive experiments on a large real-world association graph dataset, and our experiments show that the proposed techniques are efficient, scalable and effectively support multi-level data privacy on the published data.

    Facies analysis, stratigraphy and marine vertebrate assemblage of the lower Miocene Chilcatay Formation at Ullujaya (Pisco basin, Peru)

    This paper is the first integrated account of the sedimentology, stratigraphy and vertebrate paleontology for the marine strata of the Chilcatay Formation exposed at Ullujaya, Pisco basin (southern Peru). An allostratigraphic framework for the investigated strata was established using geological mapping (1:4,000 scale) and conventional sedimentary facies analysis and resulted in recognition of two unconformity-bounded allomembers (designated Ct1 and Ct2 in ascending order). The chronostratigraphic framework is well constrained by integration of micropaleontological data and isotope geochronology and indicates deposition during the early Miocene. The marine vertebrate fossil assemblage is largely dominated by cetaceans (odontocetes), whereas isolated teeth and spines indicate a well-diversified elasmobranch assemblage. Our field surveys, conducted to evaluate the paleontological sensitivity of the investigated strata, indicate that vertebrate remains come only from a rather restricted stratigraphic interval of the Ct1 allomember and reveal the high potential for these sediments to yield abundant and scientifically significant fossil assemblages.