    A framework for clustering and adaptive topic tracking on evolving text and social media data streams.

    Recent advances in, and widespread usage of, online web services and social media platforms, coupled with ubiquitous low-cost devices, mobile technologies, and the increasing capacity of lower-cost storage, have led to a proliferation of Big Data, ranging from news, e-commerce clickstreams, and online business transactions to continuous event logs and social media expressions. These large amounts of online data, often referred to as data streams because they are generated at extremely high throughput or velocity, can make conventional and classical data analytics methodologies obsolete. For these reasons, the management and analysis of data streams have been researched extensively in recent years. The special case of social media Big Data brings additional challenges, particularly because of the unstructured nature of the data, specifically free text. One classical approach to mining text data has been Topic Modeling. Topic models are statistical models that can be used for discovering the abstract "topics" that may occur in a corpus of documents. Topic models have emerged as a powerful technique in machine learning and data science, providing a great balance between simplicity and complexity. They also provide sophisticated insight without the need for real natural language understanding. However, they have not been designed to cope with the type of text data that is abundant on social media platforms, but rather for traditional medium-size corpora consisting of longer documents, adhering to a specific language and typically spanning a stable set of topics. Unlike traditional document corpora, social media messages tend to be very short, sparse, and noisy, and do not adhere to a standard vocabulary, linguistic patterns, or stable topic distributions. They are also generated at a high velocity that imposes high demands on topic modeling, and their evolving or dynamic nature makes any set of results from topic modeling quickly become stale in the face of changes in the textual content and topics discussed within social media streams. In this dissertation, we propose an integrated topic modeling framework built on top of an existing stream-clustering framework called Stream-Dashboard, which can extract, isolate, and track topics over any given time period. In this new framework, Stream-Dashboard first clusters the data stream points into homogeneous groups. Then data from each group is passed to the topic modeling framework, which extracts finer topics from the group. The proposed framework tracks the evolution of the clusters over time to detect milestones corresponding to changes in topic evolution, and to trigger an adaptation of the learned groups and topics at each milestone. The proposed approach differs from a generic Topic Modeling approach because it works in a compartmentalized fashion: the input document stream is split into distinct compartments, and Topic Modeling is applied to each compartment separately. Furthermore, we propose extensions to existing topic modeling and stream clustering methods, including: an adaptive query reformulation approach to help focus topic discovery over time; a topic modeling extension with an adaptive hyper-parameter and an infinite vocabulary; and an adaptive stream clustering algorithm that incorporates the automated estimation of dynamic, cluster-specific temporal scales for adaptive forgetting, to help facilitate clustering in a fast-evolving data stream. Our experimental results show that the proposed adaptive forgetting clustering algorithm can mine better-quality clusters; that our proposed compartmentalized framework is able to mine topics of better quality compared to competitive baselines; and that the proposed framework can automatically adapt to focus on changing topics using the proposed query reformulation strategy.

    Insights from sentiment analysis to leverage local tourism business in restaurants

    Yu, T., Rita, P., Moro, S., & Oliveira, C. (2021). Insights from sentiment analysis to leverage local tourism business in restaurants. International Journal of Culture, Tourism, and Hospitality Research. https://doi.org/10.1108/IJCTHR-02-2021-0037
    Funding Information: The work by S. Moro and C. Oliveira was partially funded by national funds through FCT ‐ Fundação para a Ciência e Tecnologia, I.P., under the project FCT UIDB/04466/2020. The work by P. Rita was partially funded by national funds through FCT ‐ Fundação para a Ciência e Tecnologia, I.P., under the project FCT UIDB/04152/2020 ‐ Centro de Investigação em Gestão de Informação (MagIC).
    Publisher Copyright: © 2021, Emerald Publishing Limited.
    Purpose: Social media has become the main venue for users to express their opinions and feelings, generating a vast amount of available and valuable data to be scrutinized by researchers and marketers. This paper aims to extend previous studies analyzing social media reviews through text mining and sentiment analysis to provide useful recommendations for management in the restaurant industry.
    Design/methodology/approach: Lexalytics, a text mining artificial intelligence tool, is applied to analyze the text of online reviews of the restaurants in a touristic Dutch village, extracted from the most frequently used social media platforms, focusing on the four restaurant quality factors, namely, food and beverage, service, atmosphere and value.
    Findings: The findings are presented by identified key themes, comparing customers’ review sentiment for a selected restaurant, Zwaantje, with that of its benchmark restaurants, which were set by a specific approach, under the abovementioned quality dimensions, of which food and beverage and service are the most commented on by customers. Results demonstrate that text mining can generate insights from different aspects and that the proposed approach is valuable to restaurant management.
    Originality/value: The paper draws on a relatively large number and range of sources of social media reviews to further explore the most important service dimensions in the restaurant industry in a specific tourist area. It also offers a useful framework for applying the text mining business intelligence tool, through comparison with peers, so that local small business restaurant practitioners can improve their management skills beyond manually reading social media reviews.

    AAPOR Report on Big Data

    In recent years we have seen an increase in the amount of statistics in society describing different phenomena based on so-called Big Data. The term Big Data is used for a variety of data, as explained in the report, many of them characterized not just by their large volume, but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inferences from them. The change in the nature of the new types of data, in their availability, and in the way in which they are collected and disseminated is fundamental. The change constitutes a paradigm shift for survey research. There is great potential in Big Data, but there are some fundamental challenges that have to be resolved before its full potential can be realized. In this report we give examples of different types of Big Data and their potential for survey research. We also describe the Big Data process and discuss its main challenges.

    A Framework for Integrating Transportation Into Smart Cities

    In recent years, economic, environmental, and political forces have quickly given rise to “Smart Cities”, an array of strategies that can transform transportation in cities. Using a multi-method approach, this study develops a framework for smart cities that can be employed to: understand what a smart city is and how to replicate smart city successes; understand the role of pilot projects, metrics, and evaluations in testing, implementing, and replicating strategies; and understand the role of shared micromobility, big data, and other key issues impacting communities. This research provides recommendations for policy and professional practice as they relate to integrating transportation into smart cities.