    Reflecting Human Knowledge of Place and Route-Choice Behavior Using Big Data

    Exploring human knowledge of geographical space and related behavior not only helps in understanding human-environment interactions and dynamic geographic processes, but also advances Geographic Information Systems (GIS) toward a human-centric paradigm that makes daily life more efficient. Today’s relatively easy acquisition of various kinds of big data provides an unprecedented opportunity for geographers to answer research questions that previously could not be adequately addressed. However, new challenges also arise regarding data quality and bias, as well as the changes in methodology needed to handle big data, which differ from traditional data types. Representing people’s perception of place and studying drivers’ route-choice behavior are two of the many applications of big data in answering research questions about human knowledge and behavior in the fields of GIS and transportation. Incorporating three papers, this dissertation focuses on these two applications to achieve the following objectives: 1) examine the degree to which a geographic place’s spatial extent can be estimated from human-generated geotagged photos; 2) address the challenge of geotagged photos’ uneven spatial distribution in place estimation and explore an approach that can better derive a place’s spatial extent; 3) develop a method that can properly estimate the spatial extent of a place that has multiple disjoint regions while accounting for geotagged photos’ uneven distribution; 4) explore useful spatiotemporal patterns of taxi drivers’ route-choice behavior in a dynamic urban environment.
This dissertation makes three major contributions to the systematic theory of big data applications: 1) it proposes an effective approach to handling the uneven spatial distribution of geotagged photos, as a type of volunteered geographic data, by modeling their representativeness; 2) it develops methods that can properly derive the vague spatial extent of a place with or without disjoint regions; and 3) it explores taxi drivers’ route-choice patterns in different situations, which can inform future transportation decisions and policy-making processes.

    Visual analytics of location-based social networks for decision support

    Recent advances in technology have enabled people to add location information to social networks, known as Location-Based Social Networks (LBSNs), where people share their communication and whereabouts not only in their daily lives but also during abnormal situations, such as crisis events. However, since the volume of the data exceeds the boundaries of human analytical capabilities, it is almost impossible to perform a straightforward qualitative analysis of the data. The emerging field of visual analytics has been introduced to tackle such challenges by integrating approaches from statistical data analysis and human-computer interaction into highly interactive visual environments. Based on the idea of visual analytics, this research contributes knowledge discovery techniques for social media data that provide comprehensive situational awareness. We extract valuable hidden information from the huge volume of unstructured social media data and model the extracted information so that it can be visualized meaningfully through user-centered interactive interfaces. We develop visual analytics techniques and systems for spatial decision support by coupling the modeling of spatiotemporal social media data with scalable and interactive visual environments. These systems allow analysts to detect and examine abnormal events within social media data by integrating automated analytical techniques and visual methods. We provide a comprehensive analysis of public behavioral response in disaster events by exploring and examining the spatial and temporal distribution of LBSN data. We also propose a trajectory-based visual analytics approach to LBSNs for analyzing anomalous human movement during crises, incorporating a novel classification technique. Finally, we introduce a visual analytics approach for forecasting the overall flow of human crowds.

    Using Flickr to identify and connect tourism Points of Interest: The case of Lisbon, Porto and Faro

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Understanding the movement of tourists helps not only in managing cities but also in enhancing their most attractive places. The growing number of people on social media gives us greater access to information about user preferences, reviews, and shared moments. This information can be used to study tourist activity. Here, geo-tagged photographs from the social media platform Flickr are used to identify the locations of tourists’ Points of Interest in Lisbon, Porto and Faro and to quantify the relationships between them from users’ co-occurrence at the identified points. The results show that, using standard clustering methods, it is possible to identify likely candidate Points of Interest. Associating the Points of Interest through users’ social media activity (i.e., posting of photos) results in a non-trivial network that does not simply follow geographical proximity. It was found that, in all the cities under study, historical places (such as churches and cathedrals), viewpoints and beaches are captured.
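The co-occurrence step described in this abstract can be illustrated with a minimal Python sketch. It assumes the Points of Interest have already been identified (for example by a clustering step, not shown here) and that each photo has been reduced to a hypothetical `(user_id, poi_id)` pair. The function name, the data and the edge-weight definition (number of distinct users who posted at both points) are illustrative assumptions, not the dissertation's actual procedure.

```python
from collections import defaultdict
from itertools import combinations


def poi_cooccurrence(visits):
    """Build an undirected, weighted PoI network from (user_id, poi_id) pairs.

    The weight of edge (a, b) is the number of distinct users who posted
    at both a and b. This weighting is an assumption made for illustration.
    """
    pois_by_user = defaultdict(set)
    for user, poi in visits:
        pois_by_user[user].add(poi)  # repeated photos at one PoI count once

    weights = defaultdict(int)
    for pois in pois_by_user.values():
        # Sorting gives each unordered pair a single canonical key.
        for a, b in combinations(sorted(pois), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data: three users, three PoIs.
visits = [
    ("u1", "Belem Tower"), ("u1", "Alfama viewpoint"), ("u1", "Belem Tower"),
    ("u2", "Belem Tower"), ("u2", "Alfama viewpoint"), ("u2", "Cascais beach"),
    ("u3", "Cascais beach"),
]
edges = poi_cooccurrence(visits)
for (a, b), w in sorted(edges.items()):
    print(f"{a} -- {b}: {w}")
```

Because edges come from shared visitors rather than from distance, two points far apart can end up strongly linked, which is the sense in which such a network need not follow geographical proximity.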

    Exploring human mobility patterns based on geotagged Flickr photos

    Predicting human mobility behaviour has long been a topic of scientific interest. Such studies generally rely on tracking human movements through a range of data collection methodologies, such as GPS trackers and cellular network data. Some of this data may be confidential or hard to acquire. This thesis explores whether existing publicly available data on online photo-sharing platforms can be used to determine human mobility patterns with reasonable accuracy. We choose the Flickr website as the data collection medium because it has an extensive user base actively sharing photos, many of which carry embedded geotags that Flickr preserves. Our analysis reveals that, while the data from Flickr is sparse and discontinuous, making it unsuitable for reliable mobility prediction, typical human mobility trends based on time of day, day of week and month of the year can still be extracted. Such patterns could potentially be used in traffic engineering or for user profiling. More specifically, we describe how to obtain a subset of frequently active users and their information from Flickr, and the sliding window mechanism used to filter the users' active periods. We then explain the statistical methods applied to the filtered subset of data to identify the categories into which users can be classified, mainly short-distance travellers and long-distance travellers. The short-distance travellers are considered for the prediction of mobility trends.

    Development of Context-Aware Recommenders of Sequences of Touristic Activities

    In recent years, recommender systems have become ubiquitous on the web. Many web services, including movie streaming, web search and e-commerce, use recommender systems to aid human decision-making. Tourism is one industry that is highly represented on the web. There are several web services (e.g. TripAdvisor, Yelp) that benefit from integrating recommender systems to help tourists explore tourism destinations. This has increased research focused on improving tourism recommender systems and solving the main issues they face. This thesis proposes new algorithms for tourism recommender systems that learn tourist preferences from their social media data to suggest a sequence of touristic activities that align with various contexts and include related activities. To accomplish this, we propose methods for identifying tourists from their frequent Twitter posts, identifying the activities experienced in these posts, and profiling similar tourists based on their interests, contextual information, and activity periods. User profiles are then combined with an association rule mining algorithm to capture implicit relationships between the points of interest that appear in each profile. Finally, a rule ranking and activity selection process produces a set of recommendable activities. The recommendations were evaluated for accuracy and for the effect of user profiling. We further order the set of activities using a multi-objective algorithm to enrich the tourist experience. We also carry out a second-stage analysis of tourist flows at destinations, which is beneficial to destination management organisations seeking to understand tourist mobility. Overall, the methods and algorithms proposed in this thesis are shown to be useful in various aspects of tourism recommender systems.
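As a rough illustration of association rule mining over points of interest, the sketch below scores pairwise rules "visited A, therefore likely to visit B" by support and confidence. The abstract does not say which mining algorithm or thresholds the thesis uses, so the brute-force pair enumeration, the function name `pair_rules`, the thresholds and the toy visit lists are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (a, b, support, confidence) meaning "a implies b".

    support    = share of transactions containing both a and b
    confidence = share of transactions containing a that also contain b
    Only single-item antecedents and consequents are considered.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = set(t)  # ignore repeated visits within one transaction
        item_counts.update(items)
        # Ordered pairs, so (a, b) and (b, a) are scored as separate rules.
        pair_counts.update(permutations(sorted(items), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Rank by confidence, then support, then name for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical visit lists for four tourists in one profile.
profile_visits = [
    ["museum", "cathedral", "beach"],
    ["museum", "cathedral"],
    ["beach", "viewpoint"],
    ["museum", "cathedral", "viewpoint"],
]
rules = pair_rules(profile_visits)
for a, b, s, c in rules:
    print(f"{a} -> {b}  support={s:.2f}  confidence={c:.2f}")
```

A practical miner would also handle multi-item rules and large item sets, typically with an Apriori-style or FP-growth-style search rather than exhaustive pair counting.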

    Mining, Modeling and Predicting Mobility

    Mobility is a central aspect of our life, and our movements reveal much more about us than simply our whereabouts. In this thesis, we are interested in mobility and study it from three different perspectives: the modeling perspective, the information-theoretic perspective, and the data mining perspective. For the modeling perspective, we represent mobility as a probabilistic process described by both observable and latent variables, and we formally introduce the notion of individual and collective dimensions in mobility models. Ideally, we should take advantage of both dimensions to learn accurate mobility models, but the nature of the data might limit us. We take a data-driven approach to study three scenarios, which differ in the nature of the mobility data, and present, for each scenario, a mobility model that is tailored to it. The first scenario is individual-specific, as we have mobility data about individuals but are unable to cross-reference data from them. In the second scenario, we introduce the collective model that we use to overcome the sparsity of individual traces, and for which we assume that individuals in the same group exhibit similar mobility patterns. Finally, we present the ideal scenario, for which we can take advantage of both the individual and collective dimensions, and analyze collective mobility patterns in order to create individual models. In the second part of the thesis, we take an information-theoretic approach in order to quantify mobility uncertainty and its evolution with location updates. We discretize the user's world to obtain a map that we represent as a mobility graph. We model mobility as a random walk on this graph (equivalent to a Markov chain) and quantify trajectory uncertainty as the entropy of the distribution over possible trajectories. In this setting, a location update amounts to conditioning on a particular state of the Markov chain, which requires the computation of the entropy of conditional Markov trajectories.
Our main result enables us to compute this entropy through a transformation of the original Markov chain. We apply our framework to real-world mobility datasets and show that the influence of intermediate locations on trajectory entropy depends on the nature of these locations. We build on this finding and design a segmentation algorithm that uncovers intermediate destinations along a trajectory. The final perspective from which we analyze mobility is the data mining perspective: we go beyond simple mobility and analyze geo-tagged data that is generated on online social media and that describes the whole user experience. We postulate that mining geo-tagged data enables us to obtain a rich representation of the user experience and all that surrounds their mobility. We propose a hierarchical probabilistic model that enables us to uncover specific descriptions of geographical regions by analyzing the geo-tagged content generated on online social media. By applying our method to a dataset of 8 million geo-tagged photos, we are able to associate with each neighborhood the tags that describe it specifically, and to find the most unique neighborhoods in a city.
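To make "entropy of the distribution over possible trajectories" concrete, here is a small Python sketch for a simplified case: the entropy, in bits, of all fixed-length trajectories of a Markov chain started from a known state. It uses the chain rule, under which the joint entropy is the sum over steps of the expected entropy of the next move. This is not the thesis's result, which concerns conditional trajectories and a transformation of the chain; the transition matrix and function names are hypothetical.

```python
import math


def row_entropy(row):
    """Shannon entropy (bits) of one row of a transition matrix."""
    return -sum(p * math.log2(p) for p in row if p > 0)


def trajectory_entropy(P, start, steps):
    """Entropy (bits) of (X_1, ..., X_steps) given X_0 = start.

    Chain rule for a Markov chain: the joint entropy equals the sum over
    t of sum_i Pr[X_t = i] * H(P[i]), where H(P[i]) is the entropy of the
    next move out of state i.
    """
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0  # the walker's position at time 0 is known
    local = [row_entropy(row) for row in P]

    total = 0.0
    for _ in range(steps):
        total += sum(d * h for d, h in zip(dist, local))
        # Propagate the state distribution one step forward.
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return total


# Hypothetical 3-location mobility graph; each row sums to 1.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
]
h = trajectory_entropy(P, start=0, steps=3)
print(f"{h:.2f} bits")
```

A location whose outgoing row is deterministic, such as the third row here, contributes no uncertainty of its own, which gives some intuition for why the effect of a location on trajectory entropy can depend on the nature of that location.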

    Recommending places based on the wisdom-of-the-crowd

    The collective opinion of a great number of users, popularly known as the wisdom of the crowd, has been seen as a powerful tool for solving problems. As suggested by Surowiecki in his book [134], large groups of people are now considered smarter than an elite few, regardless of how brilliant the latter are at solving problems or coming to wise decisions. This phenomenon, together with the availability of a huge amount of data on the Web, has fostered the development of solutions that employ the wisdom-of-the-crowd to solve a variety of problems in different domains, such as recommender systems [128], social networks [100] and combinatorial problems [152, 151]. The vast majority of data on the Web has been generated in the last few years by billions of users around the globe using their mobile devices and web applications, mainly on social networks. This information carries astonishing details of daily activities, ranging from urban mobility and tourism behavior to emotions and interests. The largest social network nowadays is Facebook, which in December 2015 had 1.31 billion mobile active users and 4.5 billion “likes” generated daily. In addition, every 60 seconds 510 comments are posted, 293,000 statuses are updated, and 136,000 photos are uploaded. This flood of data has brought great opportunities to discover individual and collective preferences, and to use this information to offer services that meet people’s needs, such as recommending relevant and interesting items (e.g. news, places, movies). Furthermore, it is now possible to exploit the experiences of groups of people as a collective behavior so as to augment the experience of others. The latter illustrates the important scenario where the discovery of collective behavioral patterns, the wisdom-of-the-crowd, may enrich the experience of individual users.
In this light, this thesis has the objective of taking advantage of the wisdom of the crowd in order to better understand human mobility behavior, with the final purpose of supporting users (e.g. people) by providing intelligent and effective recommendations. We accomplish this objective by following three main lines of investigation, as discussed below. In the first line of investigation, we conduct a study of human mobility using the wisdom-of-the-crowd, culminating in the development of an analytical framework that offers a methodology to understand how the points of interest (PoIs) in a city are related to each other on the basis of the displacement of people. We tested our methodology by using the PoI network topology to identify new classes of points of interest based on visiting patterns, spatial displacement from one PoI to another, and the popularity of the PoIs. Important relationships between PoIs are mined by discovering communities (groups) of PoIs that are closely related to each other based on user movements, and different analytical metrics are proposed to better understand this perspective. The second line of investigation exploits the wisdom-of-the-crowd collected through user-generated content to recommend itineraries in tourist cities. To this end, we propose an unsupervised framework, called TripBuilder, that leverages large collections of Flickr photos, as the wisdom-of-the-crowd, and points of interest from Wikipedia in order to support tourists in planning their visits to the cities. We extensively tested our framework using real data, demonstrating the effectiveness and efficiency of the proposal. Based on the theoretical framework, we designed and developed a platform encompassing the main features required to create personalized sightseeing tours.
This platform has received significant interest within the research community, since understanding the needs of tourists when they are planning a visit to a new city is recognized as crucial. Consequently, this led to outstanding scientific results. In the third line of investigation, we exploit the wisdom-of-the-crowd to recommend groups of people (e.g. friends) who can enjoy an item (e.g. a restaurant) together. We propose GroupFinder to address the novel user-item group formation problem, which aims at recommending the best group of friends for a given user-item pair. The proposal combines user-item relevance information with the user’s social network (ego network), while trying to balance the satisfaction of all the members of the group with the item against the intra-group relationships. Algorithmic solutions are proposed and tested in the location-based recommendation domain using four publicly available Location-Based Social Network (LBSN) datasets, showing that our solution is effective and outperforms strong baselines.

    A Trajectory-Based Recommender System for Tourism

    Geographic Feature Mining: Framework and Fundamental Tasks for Geographic Knowledge Discovery from User-generated Data

    We live in a data-rich environment where massive amounts of data such as text messages, articles, images, and search queries are continuously generated by users. In this environment, new opportunities to discover and utilize knowledge about the real world arise, such as the extraction and description of places and events from social media records, the organization of documents by spatio-temporal topics, and the prediction of epidemics from search engine queries. Major challenges addressed in these data- and application-specific works arise from the unstructured and complex nature of the data, and the high level of uncertainty and sparsity of the attributes. Despite the evident progress in utilizing specific data sources for different applications, there remains a lack of common concepts and techniques for exploiting the data as high-quality sensors of geographic space in a general manner. Such a general point of view, however, makes it possible to address the common challenges and to define fundamental building blocks for dealing with problems in fields like information retrieval, recommender systems, market research, health surveillance, and the social sciences. In this thesis, we develop concepts and techniques to utilize various kinds of user-generated data as a steady source of information about geographic processes and entities (together called geographic phenomena). For this, we introduce a novel conceptual data mining framework, called geographic feature mining, that provides the foundation to discover and extract highly informative and discriminative dimensions of geographic space in a unifying and systematic fashion. This is achieved by representing the qualitative and geographic information in the records as geographic feature signals, each constituting a potential dimension to describe geographic space.
The mining process then determines highly informative features or feature combinations from the candidate sets that can be used as a steady source of auxiliary information for domain-specific applications. In developing the framework, we make contributions to several fundamental problems: (1) We introduce a novel probabilistic model to extract high-quality geographic feature signals. The signals are robust to noise and background distributions, and the model allows diverse kinds of qualitative and geographic information in the records to be exploited. This flexibility is achieved by utilizing a Bayesian network model, and the robustness by choosing appropriate prior distributions. (2) We address the problem of categorizing and selecting geographic features based on their spatio-temporal type, such as feature signals having landmark, regional, or global semantics. For this, we introduce representations of the signals by interaction characteristics and evaluate their performance in clustering and data summarization tasks. (3) To extract a small number of highly informative feature combinations that reflect geographic phenomena, we introduce a model that extracts latent geographic features from the candidate signals using dimensionality reduction. We show that this model outperforms document-centric topic models with respect to the informativeness of the extracted phenomena, and we exhaustively evaluate how different statistical properties of the approaches affect the characteristics of the resulting feature combinations.