98,002 research outputs found

    A machine learning-based method for the large-scale evaluation of the qualities of the urban environment

    Given the present size of modern cities, it is beyond the perceptual capacity of most people to develop a good knowledge of the qualities of the urban space at every street corner. Correspondingly, it is also difficult for planners to accurately answer questions such as 'where in the city is the physical environment most dilapidated, so that regeneration should be considered first?' and 'how is the city's appearance changing in fast-urbanising cities?'. To address this issue, we present a computer vision method comprising three machine learning models for the large-scale, automatic evaluation of the qualities of the urban environment, leveraging state-of-the-art machine learning techniques and wide-coverage street view images. From the various physical qualities that previous research has identified as important for the urban visual experience, we choose two key qualities to measure: the construction and maintenance quality of building facades and the continuity of the street wall. To test the validity of the proposed method, we compare the machine scores with rating scores collected on-site from 752 passers-by at 56 locations in the city. We show that the machine learning models produce a medium-to-good estimation of people's real experience, and that the modelling results can be applied in many ways by researchers, planners and local residents. This research is funded by the National Natural Science Foundation of China (Grant No. 51478232), the Independent Research Project of Tsinghua University (Grant No. 20131089262) and a scholarship from the China Scholarship Council (CSC No. 201306210039).
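
    As a rough illustration of the evaluation step described above, the sketch below scores street view images with a generic pretrained backbone and checks the machine scores against on-site ratings using Spearman's rank correlation. The paper's three models and data are not public here; the backbone, the regressor, the file names and the ratings layout are all assumptions for illustration only.

        # Minimal sketch, assuming one image per surveyed location and a
        # CSV of mean on-site facade-quality ratings; none of these names
        # come from the paper itself.
        import numpy as np
        import torch
        import torchvision.models as models
        import torchvision.transforms as T
        from PIL import Image
        from scipy.stats import spearmanr
        from sklearn.linear_model import Ridge

        # Frozen ImageNet backbone as a generic image-feature extractor.
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        backbone.fc = torch.nn.Identity()  # keep the 2048-d pooled features
        backbone.eval()

        preprocess = T.Compose([
            T.Resize(256), T.CenterCrop(224), T.ToTensor(),
            T.Normalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225]),
        ])

        def embed(path):
            """Return a 2048-d feature vector for one street view image."""
            with torch.no_grad():
                x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
                return backbone(x).squeeze(0).numpy()

        # Hypothetical data: 56 locations, matching the on-site survey size.
        image_paths = [f"streetview/loc_{i}.jpg" for i in range(56)]
        human_scores = np.loadtxt("onsite_ratings.csv")  # shape (56,)

        X = np.stack([embed(p) for p in image_paths])
        train, test = slice(0, 40), slice(40, 56)

        # Simple regressor from image features to perceived quality.
        model = Ridge(alpha=1.0).fit(X[train], human_scores[train])
        machine_scores = model.predict(X[test])

        rho, p = spearmanr(machine_scores, human_scores[test])
        print(f"Spearman rho, machine vs. human: {rho:.2f} (p = {p:.3f})")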

    Driven to Distraction: Self-Supervised Distractor Learning for Robust Monocular Visual Odometry in Urban Environments

    We present a self-supervised approach to ignoring "distractors" in camera images for the purpose of robustly estimating vehicle motion in cluttered urban environments. We leverage offline multi-session mapping approaches to automatically generate a per-pixel ephemerality mask and depth map for each input image, which we use to train a deep convolutional network. At run-time we use the predicted ephemerality and depth as inputs to a monocular visual odometry (VO) pipeline, using either sparse features or dense photometric matching. Our approach yields metric-scale VO using only a single camera and can recover the correct egomotion even when 90% of the image is obscured by dynamic, independently moving objects. We evaluate our robust VO methods on more than 400 km of driving from the Oxford RobotCar Dataset and demonstrate reduced odometry drift and significantly improved egomotion estimation in the presence of large moving vehicles in urban traffic.
    Comment: International Conference on Robotics and Automation (ICRA), 2018. Video summary: http://youtu.be/ebIrBn_nc-
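
    The abstract describes feeding a predicted per-pixel ephemerality mask into a sparse-feature VO front end. The sketch below shows one plausible way such a mask could gate feature selection before pose estimation; the mask-predicting network is not reproduced (the mask is taken as a given HxW array in [0, 1]), and the camera intrinsics and the 0.5 threshold are illustrative assumptions. The metric-scale recovery from predicted depth described in the abstract is omitted, so the returned translation is unit-scale only.

        # Minimal sketch of ephemerality-gated sparse monocular VO; high
        # mask values mean "likely a distractor" (e.g. a moving vehicle).
        import cv2
        import numpy as np

        K = np.array([[718.0, 0.0, 607.0],   # assumed pinhole intrinsics
                      [0.0, 718.0, 185.0],
                      [0.0, 0.0, 1.0]])
        EPH_THRESHOLD = 0.5                  # assumed gating threshold

        def static_features(gray, eph_map, orb):
            """Detect ORB features, keeping only likely-static pixels."""
            kps, descs = orb.detectAndCompute(gray, None)
            keep = [i for i, kp in enumerate(kps)
                    if eph_map[int(kp.pt[1]), int(kp.pt[0])] < EPH_THRESHOLD]
            return [kps[i] for i in keep], descs[keep]

        def relative_pose(img0, eph0, img1, eph1):
            """Estimate egomotion between two frames from static features."""
            orb = cv2.ORB_create(2000)
            kps0, d0 = static_features(img0, eph0, orb)
            kps1, d1 = static_features(img1, eph1, orb)
            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
            matches = matcher.match(d0, d1)
            p0 = np.float32([kps0[m.queryIdx].pt for m in matches])
            p1 = np.float32([kps1[m.trainIdx].pt for m in matches])
            # RANSAC on the surviving (mostly static) correspondences.
            E, inliers = cv2.findEssentialMat(p0, p1, K, cv2.RANSAC,
                                              0.999, 1.0)
            _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inliers)
            return R, t  # rotation and unit-scale translation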

    Multimodal Classification of Urban Micro-Events

    In this paper we seek methods to effectively detect urban micro-events: events that occur in cities, have limited geographical coverage and typically affect only a small group of citizens. Because of their scale, such events are difficult to identify in most data sources; however, by using citizen sensing to gather data, detecting them becomes feasible. The data gathered by citizen sensing is often multimodal and, as a consequence, the information required to detect urban micro-events is distributed over multiple modalities, which makes a classifier capable of combining them essential. In this paper we explore several methods of creating such a classifier, including early, late and hybrid fusion, as well as representation learning using multimodal graphs. We evaluate performance on a real-world dataset obtained from a live citizen reporting system. We show that a multimodal approach yields higher performance than unimodal alternatives. Furthermore, we demonstrate that our hybrid combination of early and late fusion with multimodal embeddings performs best in the classification of urban micro-events.
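
    To make the fusion variants concrete, the sketch below contrasts early fusion (concatenating modality features into one classifier) with late fusion (averaging per-modality class probabilities). The paper's hybrid scheme and multimodal-graph embeddings are not reproduced, and the random features below merely stand in for real per-modality representations of a citizen report, such as text and image embeddings.

        # Minimal sketch of early vs. late fusion; all data is synthetic.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import f1_score

        rng = np.random.default_rng(0)
        n = 1000
        X_text = rng.normal(size=(n, 300))  # stand-in text embeddings
        X_img = rng.normal(size=(n, 512))   # stand-in image embeddings
        y = rng.integers(0, 5, size=n)      # micro-event category labels

        idx_train, idx_test = train_test_split(np.arange(n), random_state=0)

        # Early fusion: concatenate modality features, train one classifier.
        X_early = np.hstack([X_text, X_img])
        early = LogisticRegression(max_iter=1000).fit(X_early[idx_train],
                                                      y[idx_train])
        pred_early = early.predict(X_early[idx_test])

        # Late fusion: one classifier per modality, average probabilities.
        clf_text = LogisticRegression(max_iter=1000).fit(X_text[idx_train],
                                                         y[idx_train])
        clf_img = LogisticRegression(max_iter=1000).fit(X_img[idx_train],
                                                        y[idx_train])
        proba = (clf_text.predict_proba(X_text[idx_test]) +
                 clf_img.predict_proba(X_img[idx_test])) / 2
        pred_late = proba.argmax(axis=1)

        print("early fusion F1:",
              f1_score(y[idx_test], pred_early, average="macro"))
        print("late fusion F1:",
              f1_score(y[idx_test], pred_late, average="macro"))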