33,471 research outputs found

    Multimodal Classification of Urban Micro-Events

    Get PDF
    In this paper we seek methods to effectively detect urban micro-events. Urban micro-events are events which occur in cities, have limited geographical coverage and typically affect only a small group of citizens. Because of their scale these are difficult to identify in most data sources. However, by using citizen sensing to gather data, detecting them becomes feasible. The data gathered by citizen sensing is often multimodal and, as a consequence, the information required to detect urban micro-events is distributed over multiple modalities. This makes it essential to have a classifier capable of combining them. In this paper we explore several methods of creating such a classifier, including early, late, hybrid fusion and representation learning using multimodal graphs. We evaluate performance on a real world dataset obtained from a live citizen reporting system. We show that a multimodal approach yields higher performance than unimodal alternatives. Furthermore, we demonstrate that our hybrid combination of early and late fusion with multimodal embeddings performs best in classification of urban micro-events

    Personalized Video Recommendation Using Rich Contents from Videos

    Full text link
    Video recommendation has become an essential way of helping people explore the massive videos and discover the ones that may be of interest to them. In the existing video recommender systems, the models make the recommendations based on the user-video interactions and single specific content features. When the specific content features are unavailable, the performance of the existing models will seriously deteriorate. Inspired by the fact that rich contents (e.g., text, audio, motion, and so on) exist in videos, in this paper, we explore how to use these rich contents to overcome the limitations caused by the unavailability of the specific ones. Specifically, we propose a novel general framework that incorporates arbitrary single content feature with user-video interactions, named as collaborative embedding regression (CER) model, to make effective video recommendation in both in-matrix and out-of-matrix scenarios. Our extensive experiments on two real-world large-scale datasets show that CER beats the existing recommender models with any single content feature and is more time efficient. In addition, we propose a priority-based late fusion (PRI) method to gain the benefit brought by the integrating the multiple content features. The corresponding experiment shows that PRI brings real performance improvement to the baseline and outperforms the existing fusion methods

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin
    • …
    corecore