118 research outputs found

    Unsupervised hierarchical clustering approach for tourism market segmentation based on crowdsourced mobile phone data

    Understanding tourism-related behavior and travel patterns is an essential element of transportation system planning and tourism management at tourism destinations. Traditionally, tourism market segmentation is conducted to recognize tourist profiles for which personalized services can be provided. Today, the availability of wearable sensors, such as smartphones, holds the potential to tackle the data collection problems of paper-based surveys and deliver relevant mobility data in a timely and cost-effective way. In this paper, we develop and implement a hierarchical clustering approach for smartphone geo-localized data to detect meaningful tourism-related market segments. For these segments, we provide detailed insights into their characteristics and related mobility behavior. The applicability of the proposed approach is demonstrated on a use case in the Province of Zeeland in the Netherlands. We collected data from 1505 users over five months using the Zeeland app. The proposed approach resulted in two major clusters and four sub-clusters, which we were able to interpret based on their spatio-temporal patterns and the recurrence of their visits to the region
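
As an illustration of the hierarchical clustering idea described in this abstract, here is a minimal Python sketch of agglomerative clustering with average linkage. It is not the authors' implementation: the linkage choice, the two per-user features, and the toy values are all assumptions made for the example, and real features would normally be scaled first.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters (average linkage)
    until only n_clusters remain. Returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def avg_link(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # combinations() yields index pairs with a < b, so deleting b is safe.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: avg_link(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical per-user features: (days with a visit, mean trips per day).
users = [(1, 2.0), (2, 2.5), (1, 3.0), (30, 1.0), (28, 1.5), (35, 0.8)]
major = agglomerative(users, 2)
print(major)
```

Running the same function again on the members of each major cluster is one simple way to obtain sub-clusters; cutting a full dendrogram at two heights would be an alternative.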

    Density-based spatial clustering and ordering points approach for characterizations of tourist behaviour

    Knowledge about the spots where tourist activity takes place, including which segments of the tourist market visit them, is valuable information for tourist service managers. Nowadays, crowdsourced smartphone applications are used as part of tourist surveys that seek knowledge about tourists in all phases of their journey. However, the representativeness of this type of source, and how to validate the outcomes, are issues that still need to be solved. In this research, a method is proposed to discover hotspots using clustering techniques and to give these hotspots a data-driven interpretation. The representativeness of the dataset and the validation of the results against existing statistics are assessed. The method was evaluated using 124,725 trips gathered by 1505 devices. The results show that the proposed approach successfully detects hotspots related to the most common activities undertaken by overnight tourists and repeat visitors in the region under study
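
The hotspot detection described above relies on density-based clustering. The following Python sketch shows a plain DBSCAN on toy stop locations; it is an illustration only, not the method from the paper. The coordinates (assumed to be projected metres), `eps`, and `min_pts` are hypothetical values, and the ordering-points (OPTICS) step named in the title is not shown.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical stop locations in projected metres: two dense spots and one stray point.
stops = [(0, 0), (10, 0), (0, 10), (10, 10),
         (500, 500), (510, 500), (500, 510),
         (1000, 0)]
labels = dbscan(stops, eps=20, min_pts=3)
print(labels)
```

Each non-negative label marks one candidate hotspot; points labelled -1 are treated as noise rather than forced into a cluster, which is the main reason to prefer a density-based method for this kind of data.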

    The emerging landscape of Social Media Data Collection: anticipating trends and addressing future challenges

    Social media has become a powerful tool for creating and sharing user-generated content across the internet. Its widespread use has generated an enormous amount of information, which presents a great opportunity for digital marketing. Through social media, companies can reach millions of potential consumers and capture valuable consumer data that can be used to optimize marketing strategies and actions. The potential benefits and challenges of using social media for digital marketing are also attracting growing interest in the academic community. Although social media offers companies the opportunity to reach a large audience and collect valuable consumer data, the volume of information generated can lead to unfocused marketing and negative consequences such as social overload. To make the most of social media marketing, companies need to collect reliable data for specific purposes such as selling products, raising brand awareness, or fostering engagement, and to predict consumers' future behavior. The availability of quality data can help build brand loyalty, but consumers' willingness to share information depends on their level of trust in the company or brand requesting it. This thesis therefore aims to address the research gap through a bibliometric analysis of the field, a mixed analysis of the profiles and motivations of users who provide their data on social media, and a comparison of supervised and unsupervised algorithms for clustering consumers. This research used a database of more than 5.5 million data collections over a 10-year period. 
Technological advances now allow sophisticated analysis and reliable predictions based on captured data, which is especially useful for digital marketing. Several studies have explored digital marketing through social media, some focusing on a specific field and others taking a multidisciplinary approach. However, because the discipline evolves rapidly, a bibliometric approach is required to capture and synthesize the most up-to-date information and add more value to studies in the field. The contributions of this thesis are therefore as follows. First, it provides a comprehensive review of the literature on methods for collecting consumers' personal data from social media for digital marketing, and establishes the most relevant trends through the analysis of significant articles, keywords, authors, institutions, and countries. Second, this thesis identifies which user profiles lie the most and why. Specifically, this research shows that some user profiles are more prone to making mistakes, while others intentionally provide false information. The study also shows that the main motivations for providing false information include fun and a lack of trust in data privacy and security measures. Finally, this thesis aims to fill the gap in the literature regarding which type of algorithm, supervised or unsupervised, can better cluster consumers who provide their data on social media in order to predict their future behavior

    On the use of multi-sensor digital traces to discover spatio-temporal human behavioral patterns

    134 p. Technology is already part of our lives, and every time we interact with it, whether through a phone call, a credit card payment, or our activity on social networks, digital traces are stored. This thesis is concerned with those digital traces that also record people's geolocation as they carry out their daily activities. This information reveals how people interact with the city, which is highly valuable for urban planning, traffic management, and public policy, and even for taking preventive action against natural disasters. This thesis aims to study human behavioral patterns from digital traces. To that end, it uses three massive datasets that record the activity of anonymized users: phone calls, credit card purchases, and social network activity (check-ins, images, comments, and tweets). It proposes a methodology for extracting human behavioral patterns using latent semantic models, namely Latent Dirichlet Allocation and Dynamic Topic Models, the former to detect spatial patterns and the latter to detect spatio-temporal patterns. In addition, a set of metrics is proposed to provide an objective method for evaluating the patterns obtained

    A New Picture of the City: Volunteered Geographic Image Information and the Cities

    The urbanisation process continuously influences human life, causing long-term challenges for the planning and management of urban areas. In recent years, with the emergence of new forms of data and advances in techniques, the ways of managing and governing this process have evolved and formed a new research field: urban analytics. A growing number of human behaviours can be traced through large quantities of data, which enables attributes of the urban environment to be managed more efficiently and can benefit stakeholders' complex decision-making processes. As such, how to extract useful information from new data and provide more suitable methods requires careful consideration. The question of how human activity relates to the built environment has been an important topic in the sensing of cities. Existing ways to perceive the city either focus on environmental aspects that cover historical, social, or cultural dimensions of urban space through surveys, interviews, or mobility data (e.g., social media data), or extract visible features from georeferenced images to gain perceptions of the city. However, the two approaches are often disconnected and lack dynamic consideration. The main aim of this thesis is to address these challenges and gaps within urban analytics. It develops a methodological framework that leverages user-generated geotagged images and modern analytical techniques to obtain insights. The framework is designed to mine spatial, temporal and image attributes of Flickr images, and combines multiple dimensions including spatiotemporal dynamic analysis, computer vision models, summary statistics, and various machine learning algorithms that allow an understanding of human interactions with the built environment. The overall analysis and results enrich our current understanding of how user-generated urban pictures both represent and shape the city. 
This is especially important given the growing popularity of volunteered geographic information and urban analytics over the last decade. Their rapid growth has facilitated debates worldwide, but the potential of volunteered geographic information, such as geotagged images, is still underestimated in most circumstances. The findings presented in this thesis offer richer evidence that aims to help improve strategic planning systems and empower policymakers to make smarter decisions about urban governance

    Spatial and Temporal Sentiment Analysis of Twitter data

    The public uses Twitter worldwide to express opinions. This study focuses on the spatio-temporal variation of the sentiment polarity of georeferenced Tweets, with a view to understanding how opinions evolve on Twitter over space and time and across communities of users. More specifically, the question this study tested is whether sentiment polarity on Twitter exhibits specific time-location patterns. The aim of the study is to investigate the spatial and temporal distribution of georeferenced Twitter sentiment polarity within a 1 km buffer around the Curtin Bentley campus boundary in Perth, Western Australia. Tweets posted on campus were assigned to six spatial zones and four time zones. A sentiment analysis was then conducted for each zone using the sentiment analyser tool in the Starlight Visual Information System software. The Feature Manipulation Engine was employed to convert non-spatial files into spatial and temporal feature classes. The distribution of Twitter sentiment polarity over space and time was mapped using Geographic Information Systems (GIS). Some interesting results were identified. For example, the highest percentage of positive Tweets occurred in the social science area, while the science and engineering and dormitory areas had the highest percentage of negative postings. The number of negative Tweets increases in the library and science and engineering areas as the end of the semester approaches, reaching a peak around an exam period, while the percentage of negative Tweets drops at the end of the semester in the entertainment and sport and dormitory areas. This study provides some insights into student and staff sentiment variation on Twitter, which could be useful for university teaching and learning management

    Combating Attacks and Abuse in Large Online Communities

    Internet users today are connected more widely and ubiquitously than ever before. As a result, various online communities are formed, ranging from online social networks (Facebook, Twitter), to mobile communities (Foursquare, Waze), to content/interests based networks (Wikipedia, Yelp, Quora). While users are benefiting from the ease of access to information and social interactions, there is a growing concern for users' security and privacy against various attacks such as spam, phishing, malware infection and identity theft. Combating attacks and abuse in online communities is challenging. First, today’s online communities are increasingly dependent on users and user-generated content. Securing online systems demands a deep understanding of complex and often unpredictable human behaviors. Second, online communities can easily have millions or even billions of users, which requires the corresponding security mechanisms to be highly scalable. Finally, cybercriminals are constantly evolving to launch new types of attacks. This further demands high robustness of security defenses. In this thesis, we take concrete steps towards measuring, understanding, and defending against attacks and abuse in online communities. We begin with a series of empirical measurements to understand user behaviors in different online services and the unique security and privacy challenges that users are facing. This effort covers a broad set of popular online services including social networks for question and answering (Quora), anonymous social networks (Whisper), and crowdsourced mobile communities (Waze). Despite the differences between specific online communities, our study provides a first look at their user activity patterns based on empirical data, and reveals the need for reliable mechanisms to curate user content, protect privacy, and defend against emerging attacks. Next, we turn our attention to attacks targeting online communities, with a focus on spam campaigns. 
While traditional spam is mostly generated by automated software, attackers today have started to introduce "human intelligence" to implement attacks. This is malicious crowdsourcing (or crowdturfing), where a large group of real users is organized to carry out malicious campaigns, such as writing fake reviews or spreading rumors on social media. Using collective human efforts, attackers can easily bypass many existing defenses (e.g., CAPTCHA). To understand the ecosystem of crowdturfing, we first use measurements to examine their detailed campaign organization, workers and revenue. Based on insights from empirical data, we develop effective machine learning classifiers to detect crowdturfing activities. In the meantime, considering the adversarial nature of crowdturfing, we also build practical adversarial models to simulate how attackers can evade or disrupt machine learning based defenses. To aid in this effort, we next explore using user behavior models to detect a wider range of attacks. Instead of making assumptions about attacker behavior, our idea is to model normal user behaviors and capture (malicious) behaviors that deviate from the norm. In this way, we can detect previously unknown attacks. Our behavior model is based on detailed clickstream data, which are sequences of click events generated by users when using the service. We build a similarity graph where each user is a node and the edges are weighted by clickstream similarity. By partitioning this graph, we obtain "clusters" of users with similar behaviors. We then use a small set of known good users to "color" these clusters to differentiate the malicious ones. This technique has been adopted by real-world social networks (Renren and LinkedIn), and has already detected unexpected attacks. Finally, we extend the clickstream model to understand finer-grained behaviors of attackers (and real users), and to track how user behavior changes over time. 
In summary, this thesis illustrates a data-driven approach to understanding and defending against attacks and abuse in online communities. Our measurements have revealed new insights about how attackers are evolving to bypass existing security defenses today. In addition, our data-driven systems provide new solutions for online services to gain a deep understanding of their users, and defend them from emerging attacks and abuse
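
The clickstream approach in this abstract (similarity graph, partitioning, then "coloring" clusters with known good users) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the thesis's system: the similarity measure (cosine over event bigrams), the fixed threshold, the use of connected components as a stand-in for graph partitioning, and all user names and events are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(events, n=2):
    """Count consecutive n-grams of click events in one user's clickstream."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_users(clickstreams, threshold):
    """Link users whose similarity reaches the threshold, then return the
    connected components (a simple stand-in for graph partitioning)."""
    users = list(clickstreams)
    profiles = {u: ngram_counts(clickstreams[u]) for u in users}
    adj = {u: set() for u in users}
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if cosine(profiles[u], profiles[v]) >= threshold:
                adj[u].add(v)
                adj[v].add(u)
    seen, clusters = set(), []
    for u in users:
        if u in seen:
            continue
        stack, comp = [u], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters


def color_clusters(clusters, known_good):
    """Mark a cluster 'normal' if it contains a known good user, else 'suspicious'."""
    return [("normal" if c & known_good else "suspicious", c) for c in clusters]


# Hypothetical clickstreams: two ordinary users and two accounts spamming requests.
streams = {
    "alice": ["home", "profile", "photo", "profile", "message"],
    "bob": ["home", "profile", "photo", "message"],
    "bot1": ["home", "friend_request", "friend_request", "friend_request"],
    "bot2": ["friend_request", "friend_request", "friend_request", "friend_request"],
}
colored = color_clusters(cluster_users(streams, threshold=0.5), known_good={"alice"})
print(colored)
```

A single seed user is enough to mark an entire cluster as normal, so the approach needs only a small labeled set; clusters that contain no seed are flagged for review rather than declared malicious outright.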