
    Location-aware online learning for top-k recommendation

    We address the problem of recommending highly volatile items to users whose location is potentially ambiguous and may change over time. The three main ingredients of our method are (1) online machine learning for the highly volatile items; (2) learning the personalized importance of hierarchical geolocation (for example, town, region, country, continent); and (3) modeling temporal relevance by counting recent items with an exponential decay in recency. For (1), we consider a time-aware setting in which evaluation by traditional measures is cumbersome, since the top recommendations differ at different times; we describe a time-aware framework based on individual item discounted gain. For (2), we observe that trends and geolocation turn out to be more important than personalized user preferences: user-item and content-item matrix factorization improves in combination with our geo-trend learning methods, but on its own it is greatly inferior to our location-based models. In fact, since our best performing methods are based on spatiotemporal data, they are also applicable in the user cold-start setting and perform even better than content-based cold-start methods. Finally, for (3), we estimate the probability that an item will be viewed based on its previous views, obtaining a powerful model that combines item popularity and recency. To generate realistic data for measuring our new methods, we rely on Twitter messages with known GPS location and consider hashtags as items that we recommend to users for inclusion in their next message. © 2016 Elsevier B.V.
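
    As an illustration of ingredient (3) above, a minimal sketch of an exponentially decayed view counter is shown below. The class name, half-life parameter, and top-k query are assumptions for demonstration only, not the paper's exact formulation.

```python
import math
from collections import defaultdict

# Minimal sketch: each item (hashtag) keeps an exponentially decayed view count,
# so the score blends popularity (many views) with recency (recent views count more).
class DecayedItemCounter:
    def __init__(self, half_life_hours=6.0):
        # a view loses half its weight after `half_life_hours` (assumed value)
        self.decay = math.log(2) / (half_life_hours * 3600.0)
        self.score = defaultdict(float)      # item -> decayed view count
        self.last_seen = defaultdict(float)  # item -> timestamp of last update

    def record_view(self, item, timestamp):
        # decay the stored count to `timestamp`, then add the new view
        elapsed = timestamp - self.last_seen[item]
        self.score[item] = self.score[item] * math.exp(-self.decay * elapsed) + 1.0
        self.last_seen[item] = timestamp

    def top_k(self, timestamp, k=10):
        # decay every count to the query time and return the k highest-scoring items
        current = {i: s * math.exp(-self.decay * (timestamp - self.last_seen[i]))
                   for i, s in self.score.items()}
        return sorted(current, key=current.get, reverse=True)[:k]
```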

    Site Selection Using Geo-Social Media: A Study For Eateries In Lisbon

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. The rise in the influx of multicultural societies, studentification, and overall population growth has positively impacted the local economy of eateries in Lisbon, Portugal. However, it has also increased retail competition, especially in tourism. The growth of multicultural communities has likewise produced multiple smaller hotspots of human-urban attraction, blurring the notion of a single downtown in the city. These transformations of urban cities pose a big challenge for prospective retail and eatery owners in finding the optimal location to set up shop. An optimal site-selection strategy should recommend new locations that can maximize the revenue of a business. Unfortunately, with dynamically changing human-urban interactions, traditional methods such as relying on census data or surveys to understand neighborhoods and their impact on businesses are no longer reliable or scalable. This study addresses this gap by using geo-social data extracted from social media platforms such as Twitter, Flickr, Instagram, and Google Maps, which acts as a proxy for the real population. Seven variables are engineered at the neighborhood level from this data: business interest, age, gender, spatial competition, spatial proximity to stores, homogeneous neighborhoods, and percentage of the native population. A Random Forest-based binary classification method is then used to predict whether a Point of Interest (POI) can be part of any neighborhood n. The results show that, using only these seven variables, an F1-score of 83% can be achieved in classifying whether a neighborhood is good for an “eateries” POI. The methodology is designed to work with open data and to be generic and reproducible for any city worldwide.
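
    To illustrate the classification step, a minimal sketch of a Random Forest binary classifier over seven neighborhood-level features is shown below. The feature names mirror the variables listed above, but the synthetic data, labels, and hyperparameters are placeholders, not the study's dataset or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Seven neighborhood-level variables, named after those engineered in the study.
FEATURES = ["business_interest", "age", "gender", "spatial_competition",
            "proximity_to_stores", "neighborhood_homogeneity", "pct_native_population"]

# Synthetic stand-in data: 500 neighborhoods with random feature values and labels.
rng = np.random.default_rng(0)
X = rng.random((500, len(FEATURES)))
y = rng.integers(0, 2, size=500)   # 1 = suitable for an "eateries" POI, 0 = not

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("F1-score:", f1_score(y_test, clf.predict(X_test)))
```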

    Extracting News Events from Microblogs

    The Twitter stream has become a large source of information for many people, but the sheer volume of tweets and the noisy nature of their content have long made harvesting knowledge from Twitter a challenging task for researchers. Aiming to overcome some of the main challenges of extracting hidden information from tweet streams, this work proposes a new approach for real-time detection of news events from the Twitter stream. Our approach comprises three steps. The first step uses a deep-learning neural network to detect news-relevant tweets in the stream. The second step applies a novel streaming-data clustering algorithm to the detected news tweets to form news events. The third and final step ranks the detected events based on the size of the event clusters and the growth speed of the tweet frequencies. We evaluate the proposed system on a large, publicly available corpus of annotated news events from Twitter. As part of the evaluation, we compare our approach with a related state-of-the-art solution. Overall, our experiments and user-based evaluation show that our approach to detecting current (real) news events delivers state-of-the-art performance.
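
    A minimal sketch of the third (ranking) step is shown below: events are scored by cluster size and by how quickly tweets have been arriving recently. The scoring weights, window length, and class name are assumptions for illustration, not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EventCluster:
    name: str
    tweet_times: List[float] = field(default_factory=list)  # tweet arrival times (seconds)

    def size(self) -> int:
        return len(self.tweet_times)

    def growth(self, now: float, window: float = 600.0) -> float:
        # tweets per second observed within the last `window` seconds
        return sum(1 for t in self.tweet_times if now - t <= window) / window

def rank_events(clusters, now, alpha=1.0, beta=500.0):
    # combine cluster size with recent growth speed; larger, faster-growing events rank first
    return sorted(clusters, key=lambda c: alpha * c.size() + beta * c.growth(now), reverse=True)
```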

    Temporal Signature Modeling and Analysis

    A vast amount of digital satellite and aerial imagery is collected over time, which calls for techniques to extract useful high-level information, such as recognizable events. One part of this thesis proposes a framework for streaming analysis of time series that can recognize events without supervision and memorize them by building temporal contexts. The memorized historical data are then used to predict the future and detect anomalies. A new incremental clustering method is proposed to recognize events without training. A memorization method of double localization, comprising relative and absolute localization, is proposed to model the temporal context, and the predictive model is built on top of this memorization. The Edinburgh Pedestrian Dataset, which offers about 1000 observed pedestrian trajectories detected in camera images each working day over several months, is used as an example to illustrate the framework. Although a large amount of image data is captured, most of it is not available to the public. The other part of this thesis develops a method for generating spatial-spectral-temporal synthetic images by extending the capabilities of an existing tool, DIRSIG (Digital Imaging and Remote Sensing Image Generation). Currently, DIRSIG can model only limited temporal signatures. In order to observe general temporal changes in a process within a scene, a process model that links the observable signatures of interest temporally should be developed and incorporated into DIRSIG. Such sub-process models fall into two types: one drives the properties of each facet of an object as they change over time, and the other drives the geometric location of an object in the scene as a function of time. Two example process models are used to show how process models can be incorporated into DIRSIG.
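
    The incremental clustering idea can be sketched as a simple leader-style algorithm: a new observation joins the nearest existing cluster if it is within a distance threshold, otherwise it starts a new cluster. The threshold and running-mean centroid update below are illustrative assumptions; the thesis's actual method may differ.

```python
import numpy as np

def incremental_cluster(stream, threshold=1.0):
    """Assign each observation in `stream` to a cluster, creating clusters on the fly."""
    centroids, counts, labels = [], [], []
    for x in stream:
        x = np.asarray(x, dtype=float)
        if centroids:
            dists = [np.linalg.norm(x - c) for c in centroids]
            j = int(np.argmin(dists))
            if dists[j] <= threshold:
                counts[j] += 1
                centroids[j] += (x - centroids[j]) / counts[j]  # running-mean update
                labels.append(j)
                continue
        centroids.append(x.copy())   # start a new cluster around this observation
        counts.append(1)
        labels.append(len(centroids) - 1)
    return labels, centroids
```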

    Can we predict a riot? Disruptive event detection using Twitter

    In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables detection of events as well as related smaller-scale “disruptive events”: incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.
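
    As an illustration of how the five components chain together, a skeleton of such a pipeline is sketched below. The keyword classifier, keyword-based grouping, and earliest-tweet summary are simple placeholders to show the flow of data between stages; they are not the authors' implementation.

```python
from collections import defaultdict

# Placeholder vocabulary for the toy classifier and clustering stages below.
KEYWORDS = ("riot", "fire", "protest", "crash")

def preprocess(tweet):
    # minimal text normalisation; real pre-processing would strip URLs, mentions, etc.
    return {"text": tweet["text"].lower(), "time": tweet["time"]}

def classify(tweet):
    # placeholder classifier: keep tweets mentioning an incident-related keyword
    return any(k in tweet["text"] for k in KEYWORDS)

def cluster(tweets):
    # placeholder online clustering: group tweets by the first keyword they mention
    groups = defaultdict(list)
    for t in tweets:
        key = next(k for k in KEYWORDS if k in t["text"])
        groups[key].append(t)
    return list(groups.values())

def summarise(cluster_tweets):
    # placeholder summarisation: report the earliest tweet and the cluster size
    first = min(cluster_tweets, key=lambda t: t["time"])
    return {"size": len(cluster_tweets), "summary": first["text"]}

def detect_events(raw_tweets):
    # pre-processing -> classification -> online clustering -> summarisation
    kept = [t for t in (preprocess(r) for r in raw_tweets) if classify(t)]
    return [summarise(c) for c in cluster(kept)]
```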

    Unsupervised learning on social data
