37,033 research outputs found

    Vista: A Visually, Socially, and Temporally-aware Model for Artistic Recommendation

    Understanding users' interactions with highly subjective content, such as artistic images, is challenging due to the complex semantics that guide our preferences. On the one hand, one has to overcome 'standard' recommender systems challenges, such as dealing with large, sparse, and long-tailed datasets. On the other, several new challenges present themselves, such as the need to model content in terms of its visual appearance, or even social dynamics, such as a preference toward a particular artist that is independent of the art they create. In this paper we build large-scale recommender systems to model the dynamics of a vibrant digital art community, Behance, consisting of tens of millions of interactions (clicks and 'appreciates') of users toward digital art. Methodologically, our main contributions are to model (a) rich content, especially in terms of its visual appearance; (b) temporal dynamics, in terms of how users prefer 'visually consistent' content within and across sessions; and (c) social dynamics, in terms of how users exhibit preferences both towards certain art styles and towards the artists themselves. Comment: 8 pages, 3 figures

    A Class of Temporal Hierarchical Exponential Random Graph Models for Longitudinal Network Data

    As a representation of relational data over time series, longitudinal networks provide opportunities to study link formation processes. However, networks at scale often exhibit community structure (i.e., clustering), which may confound local structural effects if it is not considered appropriately in statistical analysis. To simultaneously infer the (possibly) evolving clusters and other network structures (e.g., degree distribution and/or transitivity) within each community, we propose a class of statistical models named Temporal Hierarchical Exponential Random Graph Models (THERGM). Our generative model imposes a Markovian transition matrix for nodes to change their membership, and assumes they join a new community in a preferential attachment way. Nodes remaining in the same cluster follow a specific temporal ERG model (TERGM). Because direct MCMC-based Bayesian estimation is computationally infeasible, we propose a two-stage strategy. In the first stage, a specific dynamic latent space model is used as the working model for clustering. In the second stage, the estimated memberships are taken as given to fit a TERG model in each cluster. We evaluate our methods on simulated data in terms of the mis-clustering rate, as well as the goodness of fit and link prediction accuracy.

    Community Specific Temporal Topic Discovery from Social Media

    Studying the temporal dynamics of topics in social media is very useful for understanding online user behaviors. Most existing work on this subject monitors global trends, ignoring variation among communities. Since users from different communities tend to have varying tastes and interests, capturing community-level temporal change can improve the understanding and management of social content. Additionally, it can further facilitate applications such as community discovery, temporal prediction and online marketing. However, this kind of extraction becomes challenging due to the intricate interactions between community and topic, and intractable computational complexity. In this paper, we take a unified approach to community-level topic dynamic extraction. A probabilistic model, CosTot (Community Specific Topics-over-Time), is proposed to uncover the hidden topics and communities, as well as capture community-specific temporal dynamics. Specifically, CosTot considers text, time, and network information simultaneously, and effectively discovers the interactions between community and topic over time. We then discuss the approximate inference implementation to enable scalable computation of model parameters, especially for large social data. Based on this, the application layer support for multi-scale temporal analysis and community exploration is also investigated. We conduct extensive experimental studies on a large real microblog dataset, and demonstrate the superiority of the proposed model on the tasks of time stamp prediction, link prediction and topic perplexity. Comment: 12 pages, 16 figures, submitted to VLDB 201

    Improving Latent User Models in Online Social Media

    Modern social platforms are characterized by the presence of rich user-behavior data associated with the publication, sharing and consumption of textual content. Users interact with content and with each other in a complex and dynamic social environment while simultaneously evolving over time. In order to effectively characterize users and predict their future behavior in such a setting, it is necessary to overcome several challenges. Content heterogeneity and temporal inconsistency of behavior data result in severe sparsity at the user level. In this paper, we propose a novel mutual-enhancement framework to simultaneously partition and learn latent activity profiles of users. We propose a flexible user partitioning approach to effectively discover rare behaviors and tackle user-level sparsity. We extensively evaluate the proposed framework on massive datasets from real-world platforms including Q&A networks and interactive online courses (MOOCs). Our results indicate significant gains over state-of-the-art behavior models (15% avg) in a varied range of tasks, and our gains are further magnified for users with limited interaction data. The proposed algorithms are amenable to parallelization, scale linearly in the size of datasets, and provide flexibility to model diverse facets of user behavior.

    Understanding Urban Dynamics via Context-aware Tensor Factorization with Neighboring Regularization

    Recent years have witnessed the world-wide emergence of mega-metropolises with incredibly huge populations. Understanding residents' mobility patterns, or urban dynamics, thus becomes crucial for building modern smart cities. In this paper, we propose a Neighbor-Regularized and context-aware Non-negative Tensor Factorization model (NR-cNTF) to discover interpretable urban dynamics from urban heterogeneous data. Different from many existing studies concerned with prediction tasks via tensor completion, NR-cNTF focuses on gaining urban managerial insights from spatial, temporal, and spatio-temporal patterns. This is enabled by high-quality Tucker factorizations regularized by both POI-based urban contexts and geographically neighboring relations. NR-cNTF is also capable of unveiling long-term evolutions of urban dynamics via a pipeline initialization approach. We apply NR-cNTF to a real-life data set containing rich taxi GPS trajectories and POI records of Beijing. The results indicate: 1) NR-cNTF accurately captures four kinds of city rhythms and seventeen spatial communities; 2) the rapid development of Beijing, epitomized by the CBD area, indeed intensifies the job-housing imbalance; 3) the southern areas with recent government investments have shown a healthier development tendency. Finally, NR-cNTF is compared with some baselines on traffic prediction, which further justifies the importance of urban context awareness and neighboring regularization.

    A Review of Dynamic Network Models with Latent Variables

    We present a selective review of statistical modeling of dynamic networks. We focus on models with latent variables, specifically, the latent space models and the latent class models (or stochastic blockmodels), which investigate both the observed features and the unobserved structure of networks. We begin with an overview of the static models, and then we introduce the dynamic extensions. For each dynamic model, we also discuss its applications that have been studied in the literature, with the data sources listed in the Appendix. Based on the review, we summarize a list of open problems and challenges in dynamic network modeling with latent variables.

    Modeling Implicit Communities using Spatio-Temporal Point Processes from Geo-tagged Event Traces

    The location check-ins of users through various location-based services, such as Foursquare, Twitter, and Facebook Places, generate large traces of geo-tagged events. These event-traces often manifest in hidden (possibly overlapping) communities of users with similar interests. Inferring these implicit communities is crucial for forming user profiles for improvements in recommendation and prediction tasks. Given only time-stamped geo-tagged traces of users, can we find these implicit communities and the characteristics of the underlying influence network? Can we use this network to improve the next location prediction task? In this paper, we focus on the problem of community detection as well as capturing the underlying diffusion process. We propose COLAB, a model based on spatio-temporal point processes in continuous time but a discrete space of locations, which simultaneously models the implicit communities of users based on their check-in activities without making use of their social network connections. COLAB captures the semantic features of the location and user-to-user influence, along with the spatial and temporal preferences of users. To learn the latent community of users and model parameters, we propose an algorithm based on stochastic variational inference. To the best of our knowledge, this is the first attempt at jointly modeling the diffusion process with activity-driven implicit communities. We demonstrate that COLAB achieves up to 27% improvement in the location prediction task over recent deep point-process based methods on geo-tagged event traces collected from Foursquare check-ins. Comment: 17 pages

    Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data

    Spatio-temporal data are ubiquitous in the agricultural, ecological, and environmental sciences, and their study is important for understanding and predicting a wide variety of processes. One of the difficulties with modeling spatial processes that change in time is the complexity of the dependence structures that must describe how such a process varies, and the presence of high-dimensional complex data sets and large prediction domains. It is particularly challenging to specify parameterizations for nonlinear dynamic spatio-temporal models (DSTMs) that are simultaneously useful scientifically and efficient computationally. Statisticians have developed deep hierarchical models that can accommodate process complexity as well as the uncertainties in the predictions and inference. However, these models can be expensive and are typically application specific. On the other hand, the machine learning community has developed alternative "deep learning" approaches for nonlinear spatio-temporal modeling. These models are flexible yet are typically not implemented in a probabilistic framework. The two paradigms have many things in common and suggest hybrid approaches that can benefit from elements of each framework. This overview paper presents a brief introduction to the deep hierarchical DSTM (DH-DSTM) framework, and deep models in machine learning, culminating with the deep neural DSTM (DN-DSTM). Recent approaches that combine elements from DH-DSTMs and echo state network DN-DSTMs are presented as illustrations. Comment: 26 pages, including 6 figures and references

    Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction

    Identifying changes in model parameters is fundamental in machine learning and statistics. However, standard changepoint models are limited in expressiveness, often addressing unidimensional problems and assuming instantaneous changes. We introduce change surfaces as a multidimensional and highly expressive generalization of changepoints. We provide a model-agnostic formalization of change surfaces, illustrating how they can provide variable, heterogeneous, and non-monotonic rates of change across multiple dimensions. Additionally, we show how change surfaces can be used for counterfactual prediction. As a concrete instantiation of the change surface framework, we develop Gaussian Process Change Surfaces (GPCS). We demonstrate counterfactual prediction with Bayesian posterior mean and credible sets, as well as massive scalability by introducing novel methods for additive non-separable kernels. Using two large spatio-temporal datasets, we employ GPCS to discover and characterize complex changes that can provide scientific and policy-relevant insights. Specifically, we analyze twentieth-century measles incidence across the United States and discover previously unknown heterogeneous changes after the introduction of the measles vaccine. Additionally, we apply the model to requests for lead testing kits in New York City, discovering distinct spatial and demographic patterns.

    Mixed Effects Modeling for Areal Data that Exhibit Multivariate-Spatio-Temporal Dependencies

    There are many data sources available that report related variables of interest that are also referenced over geographic regions and time; however, there are relatively few general statistical methods that one can readily use that incorporate these multivariate-spatio-temporal dependencies. As such, we introduce the multivariate-spatio-temporal mixed effects model (MSTM) to analyze areal data with multivariate-spatio-temporal dependencies. The proposed MSTM extends the notion of Moran's I basis functions to the multivariate-spatio-temporal setting. This extension leads to several methodological contributions including extremely effective dimension reduction, a dynamic linear model for multivariate-spatio-temporal areal processes, and the reduction of a high-dimensional parameter space using a novel parameter model. Several examples are used to demonstrate that the MSTM provides an extremely viable solution to many important problems found in different and distinct corners of the spatio-temporal statistics literature including: modeling nonseparable and nonstationary covariances, combining data from multiple repeated surveys, and analyzing massive multivariate-spatio-temporal datasets.