    Data analytics 2016: proceedings of the fifth international conference on data analytics

    “WARES”, a Web Analytics Recommender System

    It is hard to imagine modern business without analytics. Even small companies and individual entrepreneurs are starting to use analytics tools, in one way or another, for their business. Unsurprisingly, many different tools exist for different domains. They range from simple friend and visit statistics for a Facebook page to large, sophisticated systems designed for big corporations, and they may be free or paid. Some tools require special training, certification, or even a degree to use; others offer a simple user interface with dashboards that are easy to understand even for first-time users. Anyone who is thinking about using analytics for their own needs therefore faces a question: “which tool should I use, which one suits my needs, and how can I pay less and gain the most?” In this work, I try to answer this question by proposing a recommender tool that helps the user with this “simple” task. The paper is devoted to the creation of WARES, short for Web Analytics REcommender System, and focuses on web analytics tools. The proposed recommender system uses a hybrid approach but relies mostly on content-based techniques for making suggestions, using some initial user ratings as input to address the “cold start” problem. The system produces recommendations that depend on the user’s needs and allows quick adjustments to the selection, without the need for expensive consultations with experts or many hours spent searching the Internet for the right tool. The system itself should perform an online search using pre-cached data in an offline database, represented as an ontology of existing web analytics tools extracted during previous online searches.
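
As a rough illustration of the content-based part described in this abstract, the sketch below ranks tools by cosine similarity between a new user's initial feature ratings and each tool's feature vector. The tool names, feature names and scores are hypothetical and are not taken from the WARES ontology; the paper's actual scoring may differ.

```python
import math

# Hypothetical catalogue: each tool is described by feature scores in [0, 1].
TOOLS = {
    "tool_a": {"dashboards": 1.0, "free_tier": 1.0, "custom_reports": 0.2},
    "tool_b": {"dashboards": 0.6, "free_tier": 0.0, "custom_reports": 1.0},
    "tool_c": {"dashboards": 0.1, "free_tier": 1.0, "custom_reports": 0.0},
}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_ratings, tools=TOOLS):
    """Return tool names sorted from best to worst match for the user."""
    scores = {name: cosine(user_ratings, feats) for name, feats in tools.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Initial ratings supplied by a new user (the "cold start" input).
ranking = recommend({"dashboards": 1.0, "free_tier": 1.0, "custom_reports": 0.0})
print(ranking)
```

Changing the initial ratings and calling `recommend` again is one simple way to picture the quick adjustments to the selection that the abstract mentions.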

    Time-delayed collective flow diffusion models for inferring latent people flow from aggregated data at limited locations

    The rapid adoption of wireless sensor devices has made it easier to record location information of people in a variety of spaces (e.g., exhibition halls). Location information is often aggregated due to privacy and/or cost concerns. The aggregated data we use as input consist of the numbers of incoming and outgoing people at each location and at each time step. Since the aggregated data lack tracking information of individuals, determining the flow of people between locations is not straightforward. In this article, we address the problem of inferring latent people flows, that is, transition populations between locations, from just aggregated population data gathered from observed locations. Existing models assume that everyone is always in one of the observed locations at every time step; this, however, is an unrealistic assumption, because we do not always have a large enough number of sensor devices to cover the large-scale spaces targeted. To overcome this drawback, we propose a probabilistic model with flow conservation constraints that incorporate travel duration distributions between observed locations. To handle noisy settings, we adopt noisy observation models for the numbers of incoming and outgoing people, where the noise is regarded as a factor that may disturb flow conservation, e.g., people may appear in or disappear from the predefined space of interest. We develop an approximate expectation-maximization (EM) algorithm that simultaneously estimates transition populations and model parameters. Our experiments demonstrate the effectiveness of the proposed model on real-world datasets of pedestrian data in exhibition halls, bike trip data and taxi trip data in New York City
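
To make the idea of flow conservation with travel durations concrete, here is a minimal forward-simulation sketch. It is not the authors' probabilistic model or their EM algorithm: it only shows how, under assumed transition counts and an assumed travel-duration distribution, people who leave one location at a given time step contribute to the incoming counts of another location at later steps. All names and numbers are illustrative.

```python
def expected_incoming(departures, duration_pmf, num_steps):
    """Expected incoming counts per location and time step.

    departures[t][(i, j)] is the number of people leaving i for j at step t.
    duration_pmf[(i, j)][d] is the probability that the trip i -> j takes
    d time steps (d = 0 means arrival within the same step).
    """
    locations = {loc for flows in departures for pair in flows for loc in pair}
    incoming = {loc: [0.0] * num_steps for loc in locations}
    for t, flows in enumerate(departures):
        for (i, j), count in flows.items():
            for d, prob in enumerate(duration_pmf[(i, j)]):
                if t + d < num_steps:
                    incoming[j][t + d] += count * prob
    return incoming


def outgoing(departures, num_steps):
    """Outgoing counts per location and time step (sum over destinations)."""
    locations = {loc for flows in departures for pair in flows for loc in pair}
    out = {loc: [0.0] * num_steps for loc in locations}
    for t, flows in enumerate(departures):
        for (i, _j), count in flows.items():
            out[i][t] += count
    return out


# Two observed locations; trips A -> B take 1 or 2 steps with equal probability.
departures = [{("A", "B"): 10}, {("A", "B"): 4}, {}, {}]
pmf = {("A", "B"): [0.0, 0.5, 0.5]}
inc = expected_incoming(departures, pmf, num_steps=4)
out = outgoing(departures, num_steps=4)
print(inc["B"], out["A"])
```

In this toy setting everyone who leaves A eventually arrives at B, so the totals match once the delays have elapsed. The inference problem described in the abstract runs in the opposite direction: only noisy incoming and outgoing counts are observed, and the transition populations have to be estimated.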

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of these studies adopt the concept of Granger causality to infer statistical cause-effect relationships while using traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
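
For readers unfamiliar with the traditional baseline mentioned in this abstract, the sketch below shows a single-lag Granger causality test built from two autoregressive least-squares fits. It uses synthetic data and plain Python; the lag order, the data-generating process and all function names are assumptions made for illustration, and the time series classification methods studied in the article are not shown.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))


def granger_f(cause, effect):
    """F-statistic for 'cause Granger-causes effect' with a single lag."""
    targets = effect[1:]
    # Restricted model: the effect is explained by its own past only.
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    # Full model: the past of the candidate cause is added as a regressor.
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r, rss_f = rss(restricted, targets), rss(full, targets)
    dof = len(targets) - 3  # observations minus parameters of the full model
    return (rss_r - rss_f) / (rss_f / dof)


# Synthetic data in which x drives y with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0.0, 0.1))

f_xy = granger_f(x, y)
f_yx = granger_f(y, x)
print(f"x -> y: F = {f_xy:.1f}, y -> x: F = {f_yx:.1f}")
```

A large F-statistic means that adding the past of the candidate cause clearly reduces the prediction error, which is the sense in which Granger causality is a statistical rather than a physical notion of cause and effect.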

    Development of Context-Aware Recommenders of Sequences of Touristic Activities

    In recent years, recommender systems have become ubiquitous on the web. Many web services, including movie streaming, web search and e-commerce, use recommender systems to aid human decision-making. Tourism is one industry that is highly represented on the web. Several web services (e.g. TripAdvisor, Yelp) benefit from integrating recommender systems to help tourists explore tourism destinations. This has increased research focused on improving tourism recommender systems and solving the main issues they face. This thesis proposes new algorithms for tourism recommender systems that learn tourist preferences from their social media data to suggest a sequence of touristic activities that aligns with various contexts and includes related activities. To accomplish this, we propose methods for identifying tourists from their frequent Twitter posts, identifying the activities experienced in these posts, and profiling similar tourists based on their interests, contextual information, and activity periods. User profiles are then combined with an association rule mining algorithm to capture implicit relationships between the points of interest that appear in each profile. Finally, a rule ranking and activity selection process produces a set of recommendable activities. The recommendations were evaluated for accuracy and for the effect of user profiling. We further order the set of activities using a multi-objective algorithm to enrich the tourist experience. We also carry out a second-stage analysis of tourist flows at destinations, which is beneficial to destination management organisations seeking to understand tourist mobility. Overall, the methods and algorithms proposed in this thesis are shown to be useful in various aspects of tourism recommender systems.
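
The association rule mining and rule ranking steps mentioned in this abstract can be pictured with a small sketch. It mines pairwise rules between points of interest from hypothetical visit sets and ranks them by confidence and support; the thresholds, the data and the restriction to single-item rules are assumptions, not the thesis's actual algorithm.

```python
from itertools import permutations


def mine_rules(visits, min_support=0.4, min_confidence=0.6):
    """Return ranked rules (antecedent, consequent, support, confidence)."""
    n = len(visits)
    pois = sorted({p for v in visits for p in v})
    rules = []
    for a, b in permutations(pois, 2):
        count_both = sum(1 for v in visits if a in v and b in v)
        count_a = sum(1 for v in visits if a in v)
        support = count_both / n  # share of tourists who visited both
        confidence = count_both / count_a if count_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Rank rules by confidence first, then by support.
    return sorted(rules, key=lambda r: (r[3], r[2]), reverse=True)


# Hypothetical visit sets, one per tourist in a profile.
visits = [
    {"museum", "cathedral", "market"},
    {"museum", "cathedral"},
    {"museum", "beach"},
    {"cathedral", "market"},
    {"beach"},
]
rules = mine_rules(visits)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

A tourist who has already visited the antecedent of a highly ranked rule could then be offered its consequent as a candidate activity.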

    A context aware recommender system for tourism with ambient intelligence

    Recommender systems (RS) hold a significant place in the tourism sector. A major part of trip planning is selecting relevant Points of Interest (PoI) from the tourism domain. An RS is supposed to collect information about user behavior, personality, preferences and other contextual information. This work focuses mainly on the user’s personality and preferences and on analyzing the user’s psychological traits. It aims to improve user profile modeling, expose the relationship between user personality and PoI categories, and find a solution using constraint satisfaction programming (CSP). An architecture based on an ambient intelligence perspective is proposed to offer the best possible tourist place to the end user. The key contribution of this RS is representing the model as a CSP and treating it as an optimization problem. We implemented our system with the MiniZinc solver, with domain restrictions represented by user preferences. The CSP allowed user preferences to guide the system toward finding the optimal solutions.
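
The abstract does not show the MiniZinc model itself, so the following is only a plain-Python stand-in that conveys the shape of such a constrained optimization: choose a set of PoIs that maximizes total user preference subject to a time budget and a per-category limit. The PoIs, categories, preference weights and constraints are all hypothetical, and exhaustive search replaces the constraint solver.

```python
from itertools import combinations

# Hypothetical PoIs: (name, category, visit time in hours).
POIS = [
    ("old_town", "culture", 2),
    ("art_museum", "culture", 3),
    ("city_park", "nature", 1),
    ("river_cruise", "nature", 2),
    ("food_market", "gastronomy", 1),
]


def best_plan(preferences, max_hours, max_per_category=1):
    """Exhaustively search PoI subsets for the highest total preference.

    Constraints: total visit time <= max_hours and at most
    max_per_category PoIs from any single category.
    """
    best, best_score = (), 0
    for size in range(1, len(POIS) + 1):
        for plan in combinations(POIS, size):
            hours = sum(p[2] for p in plan)
            categories = [p[1] for p in plan]
            if hours > max_hours:
                continue
            if any(categories.count(c) > max_per_category for c in categories):
                continue
            # Objective: sum of the user's preference for each chosen category.
            score = sum(preferences.get(c, 0) for c in categories)
            if score > best_score:
                best, best_score = plan, score
    return [p[0] for p in best], best_score


plan, score = best_plan({"culture": 5, "nature": 3, "gastronomy": 4}, max_hours=4)
print(plan, score)
```

Exhaustive enumeration is only workable for a handful of PoIs; a constraint solver such as MiniZinc is the kind of tool that makes the same formulation practical at realistic sizes.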