
    Location-Specific Tweet Detection and Topic Summarization in Twitter

    Automatic detection of tweets that provide location-specific information will be extremely useful in conveying geo-location based knowledge to users. However, retrieving such tweets is a significant challenge due to the sparsity of geo-tag information, the short textual nature of tweets, and the lack of a predefined set of topics. In this paper, we develop a novel framework to identify and summarize tweets that are specific to a location. First, we propose a weighting scheme called Location Centric Word Co-occurrence (LCWC) that uses the content of the tweets and the network information of the twitterers to identify tweets that are location-specific. We evaluate the proposed model using a set of annotated tweets and compare its performance with other weighting schemes studied in the literature. This paper reports three key findings: (a) top trending tweets from a location are poor descriptors of location-specific tweets, (b) ranking tweets purely based on users' geo-location cannot ascertain the location specificity of tweets, and (c) users' network information plays an important role in determining the location-specific characteristics of the tweets. Finally, we train a topic model based on Latent Dirichlet Allocation (LDA) using a large collection of local news and tweet-based URLs to predict the topics of the location-specific tweets, and we present them using an interactive web-based interface.

    Big Social Data and GIS: Visualize Predictive Crime

    Social media is a desirable Big Data source for examining the relationship between crime and social behavior. Observation of this connection is enriched within a geographic information system (GIS) rooted in environmental criminology theory, and produces several different results to substantiate such a claim. This paper presents the construction and implementation of a GIS artifact that produces visualization and statistical outcomes to develop evidence supporting predictive crime analysis. An information system research prototype guides the inquiry and uses crime as the dependent variable and a social media tweet corpus, operationalized via natural language processing, as the independent variable. Recognizing social media as a predictive crime variable is prudent; researchers and practitioners will better appreciate its capability. The visual and statistical results are novel, represent state-of-the-art predictive analysis, increase the baseline R² value by 7.26%, and support future predictive crime-based research when front-run with real-time social media.

    AN EXTENDABLE VISUALIZATION AND USER INTERFACE DESIGN FOR TIME-VARYING MULTIVARIATE GEOSCIENCE DATA

    Geoscience data has unique and complex data structures, and its visualization has been challenging due to a lack of effective data models and visual representations to tackle the heterogeneity of geoscience data. In today’s big data era, the need to visualize geoscience data has become urgent, driven especially by its potential value to human societies in areas such as environmental disaster prediction and urban growth simulation. In this thesis, I created a novel geoscience data visualization framework and applied interface automata theory to geoscience data visualization tasks. The framework can support heterogeneous geoscience data and facilitate data operations. The interface automata can generate a series of interactions that can efficiently impress users, which also provides an intuitive method for visualizing and analyzing geoscience data. Besides clearly guiding users to a specific visualization, interface automata can also enhance user experience by eliminating automation surprises, and the maintenance overhead is also reduced. The new framework was applied to INSIGHT, a scientific hydrology visualization and analysis system that was developed by the Nebraska Department of Natural Resources (NDNR). Compared to the existing INSIGHT solution, the new framework brings many advantages that the existing solution lacks, which shows that the framework is efficient and extendable for visualizing geoscience data. Adviser: Hongfeng Y

    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the world-wide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have been spent on dealing with the new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim to offer an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions. Comment: Accepted to TKDE. 30 pages, 1 figure.