150 research outputs found
Retrospective Higher-Order Markov Processes for User Trails
Users form information trails as they browse the web, check in with a
geolocation, rate items, or consume media. A common problem is to predict what
a user might do next for the purposes of guidance, recommendation, or
prefetching. First-order and higher-order Markov chains are widely used
to study such sequences of data. First-order Markov chains are easy to
estimate, but lack accuracy when history matters. Higher-order Markov chains,
in contrast, have too many parameters and suffer from overfitting the training
data. Fitting these parameters with regularization and smoothing only offers
mild improvements. In this paper we propose the retrospective higher-order
Markov process (RHOMP) as a low-parameter model for such sequences. This model
is a special case of a higher-order Markov chain where the transitions depend
retrospectively on a single history state instead of an arbitrary combination
of history states. There are two immediate computational advantages: the number
of parameters is linear in the order of the Markov chain and the model can be
fit to large state spaces. Furthermore, by providing a specific structure to
the higher-order chain, RHOMPs improve the model accuracy by efficiently
utilizing history states without the risk of overfitting the data. We show
how to estimate a RHOMP from data and demonstrate the effectiveness of our
method on various real application datasets spanning geolocation data, review
sequences, and business locations. The RHOMP model uniformly outperforms
higher-order Markov chains, Kneser-Ney regularization, and tensor
factorizations in terms of prediction accuracy.
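
The parameter-count argument in the abstract above can be illustrated with a small sketch. It assumes, purely for illustration, a restricted model that keeps one n-by-n transition matrix per history position; the abstract only states that the parameter count is linear in the order, so `per_lag_entries` is a hypothetical stand-in and not the paper's exact parametrization.

```python
def full_order_entries(n_states: int, order: int) -> int:
    """Transition-table entries of an unrestricted Markov chain of the given order.

    Each of the n_states**order possible histories has its own row of
    n_states next-state probabilities.
    """
    return n_states ** (order + 1)


def per_lag_entries(n_states: int, order: int) -> int:
    """Entries when every history position gets its own n-by-n matrix.

    ASSUMPTION: an illustrative low-parameter structure whose size grows
    linearly in the order; not taken from the paper.
    """
    return order * n_states ** 2


if __name__ == "__main__":
    # Compare growth for a state space of 100 states.
    for m in (1, 2, 3, 4):
        print(m, full_order_entries(100, m), per_lag_entries(100, m))
```

With 100 states the unrestricted table grows by a factor of 100 per extra order, while the per-lag structure grows by a constant 10,000 entries per extra order.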
HypTrails: A Bayesian Approach for Comparing Hypotheses About Human Trails on the Web
When users interact with the Web today, they leave sequential digital trails
on a massive scale. Examples of such human trails include Web navigation,
sequences of online restaurant reviews, or online music play lists.
Understanding the factors that drive the production of these trails can be
useful, e.g., for improving underlying network structures, predicting user
clicks or enhancing recommendations. In this work, we present a general
approach called HypTrails for comparing a set of hypotheses about human trails
on the Web, where hypotheses represent beliefs about transitions between
states. Our approach utilizes Markov chain models with Bayesian inference. The
main idea is to incorporate hypotheses as informative Dirichlet priors and to
leverage the sensitivity of Bayes factors on the prior for comparing hypotheses
with each other. For eliciting Dirichlet priors from hypotheses, we present an
adaptation of the so-called (trial) roulette method. We demonstrate the general
mechanics and applicability of HypTrails by performing experiments with (i)
synthetic trails for which we control the mechanisms that have produced them
and (ii) empirical trails stemming from different domains including website
navigation, business reviews and online music played. Our work expands the
repertoire of methods available for studying human trails on the Web.
Comment: Published in the proceedings of WWW'1
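
The idea of comparing hypotheses through Dirichlet priors and Bayes factors can be sketched as follows. This is a minimal illustration that uses the standard closed-form marginal likelihood of a first-order Markov chain with independent Dirichlet priors on each row; the function names and the toy counts are assumptions, and the roulette-based prior elicitation mentioned in the abstract is not reproduced here.

```python
from math import lgamma, log


def log_evidence(counts, alphas):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors.

    counts[i][j]: observed transitions from state i to state j.
    alphas[i][j]: Dirichlet pseudo-counts encoding a hypothesis (all > 0).
    """
    total = 0.0
    for n_row, a_row in zip(counts, alphas):
        # Normalising terms of the Dirichlet-multinomial for this row.
        total += lgamma(sum(a_row)) - lgamma(sum(n_row) + sum(a_row))
        # Per-transition terms.
        for n, a in zip(n_row, a_row):
            total += lgamma(n + a) - lgamma(a)
    return total


def log_bayes_factor(counts, alphas_a, alphas_b):
    """Positive values favour hypothesis A over hypothesis B."""
    return log_evidence(counts, alphas_a) - log_evidence(counts, alphas_b)


if __name__ == "__main__":
    # Toy data (hypothetical): users mostly stay in their current state.
    counts = [[8, 2], [1, 9]]
    stay = [[5, 1], [1, 5]]    # hypothesis A: self-transitions are likely
    switch = [[1, 5], [5, 1]]  # hypothesis B: switching is likely
    print(log_bayes_factor(counts, stay, switch))
```

In this toy setting the prior that agrees with the observed counts receives the higher evidence, so the log Bayes factor is positive.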
Modelos determinísticos de gestão de ativo/passivo: uma aplicação no Brasil
Este artigo apresenta uma aplicação de modelos de otimização do tipo Gestão Ativo/Passivo (Asset/Liability Management ou ALM) no Brasil. Esses modelos, ao contrário dos modelos tradicionais de maximização de ganhos sujeitos a limitações de risco, buscam otimizar a aplicação de recursos de uma entidade, dadas as características de seus passivos. A idéia central do trabalho é aplicar e adaptar alguns dos modelos existentes de otimização de carteiras do tipo Asset/Liability Management, apresentados na literatura, à realidade brasileira. A aplicabilidade dos modelos será analisada com base em um plano de aposentadoria complementar pertencente a um Fundo Multipatrocinado.
This paper presents an application of Asset Liability Management (ALM) optimization models in Brazil. As opposed to traditional profit maximization models, which are subject to risk limitations, these models seek to optimize the risk-reward relation. This paper aims to apply and adapt some existing asset/liability portfolio optimization models, presented in literature, to the Brazilian reality. Some concepts about Brazilian pension funds are discussed and the applicability of the models is analyzed on the basis of data from a Brazilian pension fund.
Multimedia Vocabularies on the Semantic Web
This document gives an overview of the state of the art in multimedia metadata formats. First, practically relevant vocabularies for developers of Semantic Web applications are listed according to their modality scope. The second part of the document focuses on the integration of the multimedia vocabularies into the Semantic Web, that is to say, formal representations of the vocabularies are discussed.
Distribution and paleoenvironmental framework of middle Miocene marine vertebrates along the western side of the lower Ica Valley (East Pisco Basin, Peru).
We report 130 vertebrate fossils preserved as bony elements and the co-occurring assemblage of fish teeth and spines from the lower strata of the Pisco Formation exposed along the western side of the lower Ica Valley (East Pisco Basin, Peru). Geological mapping at 1:10,000 scale reveals that all these fossils originate from the Langhian–Serravallian P0 allomember. In the study area, P0 is up to ∼40 m thick and features a sandy lower portion, reflecting shoreface deposition, that fines upwards into a package of offshore silts. Marine vertebrates only occur in the lower sandy layers and include whales, dolphins, reptiles, birds, and bony and cartilaginous fishes. The reconstructed paleoenvironment is consistent with a warm-water, marginal marine setting with a strong connection to the open ocean. This work helps to elucidate the rich yet still poorly understood middle Miocene portions of the Pisco Formation, and highlights the need to conserve this outstanding Fossil-Lagerstätte.
Reversible Data Perturbation Techniques for Multi-level Privacy-preserving Data Publication
The amount of digital data generated in the Big Data age is increasing rapidly. Privacy-preserving data publishing techniques based on differential privacy through data perturbation provide a safe release of datasets such that sensitive information present in the dataset cannot be inferred from the published data. Existing privacy-preserving data publishing solutions have focused on publishing a single snapshot of the data with the assumption that all users of the data share the same level of privilege and access the data with a fixed privacy level. Thus, such schemes do not directly support data release in cases where data users have different levels of access to the published data. While a straightforward approach of releasing a separate snapshot of the data for each possible data access level can allow multi-level access, it can result in a higher storage cost, requiring separate storage space for each instance of the published data. In this paper, we develop a set of reversible data perturbation techniques for large bipartite association graphs that use perturbation keys to control the sequential generation of multiple snapshots of the data to offer multi-level access based on privacy levels. The proposed schemes enable multi-level data privacy, allowing selective de-perturbation of the published data when suitable access credentials are provided. We evaluate the techniques through extensive experiments on a large real-world association graph dataset, and our experiments show that the proposed techniques are efficient, scalable and effectively support multi-level data privacy on the published data.
Facies analysis, stratigraphy and marine vertebrate assemblage of the lower Miocene Chilcatay Formation at Ullujaya (Pisco basin, Peru)
This paper is the first integrated account of the sedimentology, stratigraphy and vertebrate paleontology for the marine strata of the Chilcatay Formation exposed at Ullujaya, Pisco basin (southern Peru). An allostratigraphic framework for the investigated strata was established using geological mapping (1:4,000 scale) and conventional sedimentary facies analysis and resulted in recognition of two unconformity-bounded allomembers (designated Ct1 and Ct2 in ascending order). The chronostratigraphic framework is well constrained by integration of micropaleontological data and isotope geochronology and indicates deposition during the early Miocene.
The marine vertebrate fossil assemblage is largely dominated by cetaceans (odontocetes), whereas isolated teeth and spines indicate a well-diversified elasmobranch assemblage. Our field surveys, conducted to evaluate the paleontological sensitivity of the investigated strata, indicate that vertebrate remains only came from a rather restricted stratigraphic interval of the Ct1 allomember and reveal the high potential for these sediments to yield abundant and scientifically significant fossil assemblages.