    Data mining and fusion

    Privacy-Friendly Mobility Analytics using Aggregate Location Data

    Location data can be extremely useful for studying commuting patterns and disruptions, as well as for predicting real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as it can reveal sensitive information about users, such as lifestyle, political and religious inclinations, or even identities. In this paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location using a privacy-preserving aggregation protocol, so that the server can only recover aggregates -- i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport for London authority and the San Francisco Cabs network, and present a novel methodology based on time-series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we present and evaluate a mobile app prototype, called Mobility Data Donors (MDD), in terms of computation, communication, and energy overhead, demonstrating the real-world deployability of our techniques.
    Comment: Published at ACM SIGSPATIAL 201
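    The abstract leaves the aggregation protocol unspecified; as a hedged illustration of how a server can recover "how many, but not which", consider additive masking, where pairwise blinding values cancel in the sum. This is a minimal Python sketch, not necessarily the paper's protocol: in a real deployment the per-pair masks would come from key agreement between users, not a single shared random generator.

        import random

        MOD = 2**32  # arithmetic modulo a large power of two

        def masked_reports(locations, num_regions, seed=0):
            """Each user reports a masked one-hot region vector; pairwise
            masks cancel in the sum, so only the aggregate is recoverable."""
            rng = random.Random(seed)  # stand-in for per-pair key agreement
            n = len(locations)
            reports = [[0] * num_regions for _ in range(n)]
            for i, region in enumerate(locations):
                reports[i][region] = 1
            # pairwise blinding: user i adds +r, user j adds -r (mod MOD)
            for i in range(n):
                for j in range(i + 1, n):
                    for k in range(num_regions):
                        r = rng.randrange(MOD)
                        reports[i][k] = (reports[i][k] + r) % MOD
                        reports[j][k] = (reports[j][k] - r) % MOD
            return reports

        def aggregate(reports, num_regions):
            """Server-side sum: the masks cancel, leaving per-region counts."""
            totals = [0] * num_regions
            for rep in reports:
                for k in range(num_regions):
                    totals[k] = (totals[k] + rep[k]) % MOD
            return totals

        # three users in regions 0, 2, 2 of a 4-region grid
        print(aggregate(masked_reports([0, 2, 2], 4), 4))  # -> [1, 0, 2, 0]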

    Automatic generation of natural language descriptions of visual data: describing images and videos using recurrent and self-attentive models

    Humans are faced with a constant flow of visual stimuli, e.g., from the environment or when looking at social media. Visually impaired people, in contrast, are often unable to perceive and process this information, which could help them navigate everyday situations and activities. Audible feedback such as natural language can make them better aware of their surroundings, enabling them to master everyday challenges autonomously. One way to create audible feedback is to produce natural language descriptions for visual data such as still images and then read this text to the person. Moreover, textual descriptions of images can be further used for text analysis (e.g., sentiment analysis) and information aggregation.
    In this work, we investigate different approaches and techniques for the automatic generation of natural language descriptions of visual data such as still images and video clips. In particular, we look at language models that generate textual descriptions with recurrent neural networks. First, we present a model that generates image captions for scenes depicting interactions between humans and branded products. We focus on the correct identification of the brand name in a multi-task training setting and present two new metrics to evaluate this requirement. Second, we explore the automatic answering of questions posed about an image. We propose a model that generates answers from scratch instead of predicting an answer from a limited set of possibilities; in contrast to related work, we can therefore generate rare answers that are not contained in the pool of frequent answers. Third, we address the automatic generation of doctors' reports for chest X-ray images. We introduce a model that copes with the bias of medical datasets (abnormal cases are very rare) and generates reports with a hierarchical recurrent model. We also investigate the correlation between the distinctiveness of a report and its score under traditional metrics, and find a discrepancy between good scores and accurate reports.
    We then examine self-attentive language models, specifically the Transformer architecture, which improve computational efficiency and performance over the recurrent models. First, we extend automatic description generation to the video domain, presenting a video-to-text (VTT) model that can easily synchronize audio-visual features; an extensive experimental exploration verifies the effectiveness of our video-to-text translation pipeline. Finally, we revisit our recurrent models with this self-attentive approach.
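    As a rough, generic illustration of the recurrent captioning approach (not the thesis's actual models): an image feature vector initializes an LSTM that greedily emits caption tokens. This is a minimal PyTorch sketch; the feature dimension, vocabulary size, and special-token ids are illustrative assumptions.

        import torch
        import torch.nn as nn

        class CaptionDecoder(nn.Module):
            """Minimal recurrent captioner: a CNN image feature conditions
            an LSTM that emits one vocabulary token per step."""
            def __init__(self, feat_dim, vocab_size, hidden=512, emb=256):
                super().__init__()
                self.init_h = nn.Linear(feat_dim, hidden)  # image -> initial state
                self.init_c = nn.Linear(feat_dim, hidden)
                self.embed = nn.Embedding(vocab_size, emb)
                self.lstm = nn.LSTMCell(emb, hidden)
                self.out = nn.Linear(hidden, vocab_size)

            def forward(self, feats, max_len=20, bos=1):
                h, c = self.init_h(feats), self.init_c(feats)
                tok = torch.full((feats.size(0),), bos, dtype=torch.long)
                caption = []
                for _ in range(max_len):
                    h, c = self.lstm(self.embed(tok), (h, c))
                    tok = self.out(h).argmax(dim=-1)  # greedy decoding
                    caption.append(tok)
                return torch.stack(caption, dim=1)

        # one fake 2048-d image feature -> a (1, 20) tensor of token ids
        decoder = CaptionDecoder(feat_dim=2048, vocab_size=10000)
        print(decoder(torch.randn(1, 2048)).shape)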

    Back to the future: Throughput prediction for cellular networks using radio KPIs

    The availability of reliable predictions for cellular throughput would offer a fundamental change in the way applications are designed and operated. Numerous cellular applications, including video streaming and VoIP, embed logic that attempts to estimate achievable throughput and adapt their behaviour accordingly. We believe that providing applications with reliable predictions several seconds into the future would enable profoundly better adaptation decisions and dramatically benefit demanding applications like mobile virtual and augmented reality. The question we pose and seek to address is whether such reliable predictions are possible. We conduct a preliminary study of throughput prediction in a cellular environment using statistical machine learning techniques. Accurate prediction is challenging in large-scale cellular environments, which are characterized by highly fluctuating channel conditions. Using simulations and real-world experiments, we study how prediction error varies as a function of the prediction horizon and the granularity of the available data. In particular, our simulation experiments show that the prediction error for mobile devices can be reduced significantly by combining measurements from the network with measurements from the end device. Our results indicate that it is possible to accurately predict achievable throughput up to 8 seconds into the future, with the 50th percentile of errors below 15% for mobile devices and below 2% for static devices.
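    The abstract does not name a specific learner; as a hedged sketch of the general setup, one can regress throughput some horizon ahead on radio KPI features with an off-the-shelf model. The feature set (RSRP, RSRQ, CQI, cell load), the random-forest choice, and the synthetic data below are all assumptions for illustration.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        # synthetic stand-in for logged radio KPIs (RSRP, RSRQ, CQI, cell load)
        rng = np.random.default_rng(0)
        kpis = rng.normal(size=(5000, 4))
        throughput = 20 + 5 * kpis[:, 2] - 3 * kpis[:, 3] \
            + rng.normal(scale=1.0, size=5000)

        HORIZON = 8  # predict throughput 8 samples (~seconds) ahead
        X, y = kpis[:-HORIZON], throughput[HORIZON:]
        X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X_train, y_train)

        # report the median (50th percentile) relative prediction error
        rel_err = np.abs(model.predict(X_test) - y_test) / np.abs(y_test)
        print(f"median relative error: {np.median(rel_err):.1%}")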

    Deep Learning for Video Modelling

    This thesis presents an exploration of generative models in the context of video generation. It focuses on the problems faced by researchers working in this branch of computer vision. It is argued throughout this thesis that video modelling suffers from two main issues, one on the data side and one on the model side. Data-wise, current state-of-the-art models in this field are applied to datasets that can misrepresent the true challenges of real videos, pushing model innovation into corners that could turn out to be dead ends for this task. A new dataset is proposed in light of this situation to address these problems. Model-wise, video generation sits at the very frontier of generative applications. It remains an area wide open for breakthroughs, since it not only faces engineering obstacles in both hardware and software, but also poses a real puzzle for models. If deep learning modelling of static images is entering a more mature phase, how does one transition to a sequence of images, and moreover generate them? Very recent models have yielded impressive next-frame generations and can show long sequences of frames that do not rapidly degrade. This thesis proposes the feature flow model as a natural choice for this task and argues why.
    Generation as an object of study in itself is also given attention throughout this thesis. The thesis augments the already popular generative adversarial networks with an inference mechanism, adversarially learned inference. This upgraded version excels at the same tasks as its predecessor while offering an abstract representation of its data through the inference procedure. Future work may display its full potential, establishing it as a strong model choice.
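    A minimal sketch of the adversarially learned inference objective mentioned above, assuming small MLP networks rather than the architectures actually used in the thesis: the discriminator scores joint (x, z) pairs, trying to tell encoder pairs (x, E(x)) from generator pairs (G(z), z), while the encoder and generator are trained on the flipped labels.

        import torch
        import torch.nn as nn

        # toy dimensions; a real model would use convolutional networks
        X_DIM, Z_DIM = 784, 64

        E = nn.Sequential(nn.Linear(X_DIM, 256), nn.ReLU(), nn.Linear(256, Z_DIM))
        G = nn.Sequential(nn.Linear(Z_DIM, 256), nn.ReLU(), nn.Linear(256, X_DIM))
        D = nn.Sequential(nn.Linear(X_DIM + Z_DIM, 256), nn.ReLU(),
                          nn.Linear(256, 1), nn.Sigmoid())

        bce = nn.BCELoss()

        def ali_losses(x):
            z = torch.randn(x.size(0), Z_DIM)
            enc_pair = torch.cat([x, E(x)], dim=1)   # (x, E(x))
            gen_pair = torch.cat([G(z), z], dim=1)   # (G(z), z)
            ones = torch.ones(x.size(0), 1)
            zeros = torch.zeros(x.size(0), 1)
            # discriminator: label encoder pairs real, generator pairs fake
            d_loss = (bce(D(enc_pair.detach()), ones)
                      + bce(D(gen_pair.detach()), zeros))
            # encoder + generator: fool the discriminator (flipped labels)
            eg_loss = bce(D(enc_pair), zeros) + bce(D(gen_pair), ones)
            return d_loss, eg_loss

        d_loss, eg_loss = ali_losses(torch.randn(32, X_DIM))
        print(d_loss.item(), eg_loss.item())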