
    Integrating aerial and street view images for urban land use classification

    Urban land use is key to rational urban planning and management. Traditional land use classification methods rely heavily on domain experts, which is both expensive and inefficient. In this paper, deep neural network-based approaches are presented to label urban land use at the pixel level using high-resolution aerial images and ground-level street view images. We use a deep neural network to extract semantic features from sparsely distributed street view images and interpolate them in the spatial domain to match the spatial resolution of the aerial images; the two sources are then fused through a deep neural network to classify land use categories. Our methods are tested on a large, publicly available dataset of aerial and street view images of New York City. The results show that using aerial images alone achieves relatively high classification accuracy, that ground-level street view images contain useful information for urban land use classification, and that fusing street view image features with aerial images improves classification accuracy. Moreover, we present experimental studies showing that street view images add more value when the resolution of the aerial images is lower, and case studies illustrating how street view images provide useful auxiliary information to aerial images to boost performance.
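    As a concrete illustration of the fusion step, the following minimal sketch interpolates sparse street-view feature vectors onto an aerial pixel grid and concatenates them with an aerial feature map. All array names, sizes, and the nearest-neighbor interpolation are illustrative assumptions, not the paper's exact design:

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

# Hypothetical inputs: locations and CNN embeddings of N street view
# images, to be rasterized onto an H x W aerial pixel grid.
rng = np.random.default_rng(0)
N, D, H, W = 200, 16, 64, 64
street_xy = rng.uniform(0, [W, H], size=(N, 2))   # image locations (x, y)
street_feat = rng.normal(size=(N, D))             # per-image embeddings

# Interpolate the sparse ground-level features over the aerial grid so
# they match its spatial resolution (nearest-neighbor for simplicity).
gx, gy = np.meshgrid(np.arange(W), np.arange(H))
interp = NearestNDInterpolator(street_xy, street_feat)
dense = interp(gx.ravel(), gy.ravel()).reshape(H, W, D)

# Fuse channel-wise with an aerial feature map of the same resolution;
# the concatenated tensor would feed the land-use classification network.
aerial_feat = rng.normal(size=(H, W, 32))
fused = np.concatenate([aerial_feat, dense], axis=-1)
print(fused.shape)  # (64, 64, 48)
```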

    Utilization of Deep Learning for Mapping Land Use Change Base on Geographic Information System: A Case Study of Liquefaction

    This study aims to extract buildings and roads and determine the extent of changes before and after a liquefaction disaster. The research method is automatic extraction from Google Earth images for 2017 and 2018, analyzed with deep learning in a geographic information system (GIS). Before the disaster, the extracted built-up area was 23.61 ha, the undeveloped area was 147.53 ha, and the total road length was 35.50 km. After the liquefaction disaster, the remaining built-up area was 1.20 ha, while 22.41 ha of buildings were lost. Of the 35.50 km of roads, 11.20 km were lost, leaving 24.30 km. Deep learning in GIS is proliferating and offers advantages in many aspects of life, including technology, geography, health, education, social life, and disaster management.
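    The change figures above are simple differences of extracted areas. Here is a minimal sketch of that computation, assuming the extracted building footprints have been rasterized into boolean masks with a known pixel size (all names and values below are made up for illustration):

```python
import numpy as np

# Assumed pixel size: 1 m x 1 m, i.e., 1e-4 ha per pixel.
PIXEL_AREA_HA = 0.0001

# Illustrative before/after building masks (True = built-up pixel).
before = np.zeros((1000, 1000), dtype=bool)
after = np.zeros((1000, 1000), dtype=bool)
before[100:600, 100:600] = True   # built-up area before the disaster
after[100:200, 100:600] = True    # what remains after the disaster

built_before = before.sum() * PIXEL_AREA_HA
built_after = (before & after).sum() * PIXEL_AREA_HA   # surviving buildings
lost = (before & ~after).sum() * PIXEL_AREA_HA         # destroyed buildings
print(f"before: {built_before:.2f} ha, "
      f"after: {built_after:.2f} ha, lost: {lost:.2f} ha")
```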

    Street context of various demographic groups in their daily mobility

    We present an urban science framework to characterize phone users' exposure to different street context types based on network science, geographic information systems (GIS), daily individual trajectories, and street imagery. We consider street context to be the inferred usage of the street, based on its buildings and construction, categorized into nine possible labels. The labels define whether the street is residential, commercial or downtown, a throughway or not, and other special categories. We apply the analysis to the City of Boston, considering daily trajectories synthetically generated with a model based on call detail records (CDR) and images from Google Street View. Images are categorized both manually and using artificial intelligence (AI). We focus on the city's four main racial/ethnic demographic groups (White, Black, Hispanic, and Asian), aiming to characterize the differences in what these groups of people see during their daily activities. Based on the daily trajectories, we reconstruct the most common paths over the street network. We use street demand (the number of times a street is included in a trajectory) to detect each group's most relevant streets and regions. Based on their street demand, we measure the street context distribution for each group. The inclusion of images allows us to quantitatively measure the prevalence of each context and points to qualitative differences in where each context takes place. Other AI methodologies can further exploit these differences. This approach provides the building blocks for further studies relating mobile devices' dynamic records to differences in urban exposure across demographic groups. Adding AI-based image analysis to street demand can power up urban planning methodologies, allow multiple cities to be compared under a unified framework, and reduce the crudeness of GIS-only mobility analysis. Shortening the gap between big-data-driven analysis and traditional human classification analysis can help build smarter and more equal cities while reducing the effort needed to study a city's characteristics.
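    A minimal sketch of the street demand and context distribution computations described above, with made-up trajectories and context labels (the real study uses CDR-based trajectories over Boston's street network and nine labels):

```python
from collections import Counter

# Illustrative data: each trajectory is a list of street-segment ids,
# and street_context maps each segment to a context label.
street_context = {"s1": "residential", "s2": "commercial", "s3": "downtown"}
trajectories = {
    "group_A": [["s1", "s2"], ["s1", "s3"]],
    "group_B": [["s2", "s3"], ["s2", "s3"]],
}

for group, trips in trajectories.items():
    # Street demand: number of trajectories that include each street.
    demand = Counter(seg for trip in trips for seg in set(trip))
    # Weight each street's context label by its demand, then normalize
    # to get the group's street context distribution.
    ctx = Counter()
    for seg, n in demand.items():
        ctx[street_context[seg]] += n
    total = sum(ctx.values())
    dist = {label: n / total for label, n in ctx.items()}
    print(group, dist)
```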

    Network Architecture for Generating a Labeled Overhead Image

    A computer-implemented process is disclosed for generating a labeled overhead image of a geographical area. A plurality of ground level images of the geographical area is retrieved. A ground level feature map is generated, via a ground level convolutional neural network, based on features extracted from the plurality of ground level images. An overhead image of the geographical area is also retrieved. A joint feature map is generated, via an overhead convolutional neural network, based on the ground level feature map and features extracted from the overhead image. Geospatial function values at a plurality of pixels of the overhead image are estimated based on at least the joint feature map and the overhead image. The plurality of pixels of the overhead image is labeled according to the estimated geospatial function values.
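    A minimal PyTorch sketch of the disclosed two-branch idea follows. The layer sizes are illustrative, and the placement of ground-level features over the overhead grid is simplified to a global average, whereas the disclosure estimates values per pixel location:

```python
import torch
import torch.nn as nn

class LabeledOverheadNet(nn.Module):
    def __init__(self, n_classes=4, d=16):
        super().__init__()
        self.ground_cnn = nn.Sequential(  # embeds each ground-level image
            nn.Conv2d(3, d, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.overhead_cnn = nn.Sequential(  # fuses overhead + ground maps
            nn.Conv2d(3 + d, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1))  # per-pixel geospatial values

    def forward(self, overhead, ground_imgs):
        # overhead: (1, 3, H, W); ground_imgs: (N, 3, h, w)
        _, _, H, W = overhead.shape
        feats = self.ground_cnn(ground_imgs)            # (N, d)
        # Broadcast the mean ground feature over the overhead grid
        # (a simplification of per-location feature placement).
        gmap = feats.mean(0).view(1, -1, 1, 1).expand(1, -1, H, W)
        joint = torch.cat([overhead, gmap], dim=1)      # joint feature map
        return self.overhead_cnn(joint)                 # (1, C, H, W) labels

net = LabeledOverheadNet()
out = net(torch.randn(1, 3, 64, 64), torch.randn(5, 3, 32, 32))
print(out.shape)  # torch.Size([1, 4, 64, 64])
```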

    Holistic Multi-View Building Analysis in the Wild with Projection Pooling

    We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to the growth of large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset consisting of 49,426 images (top-view and street-view) of 9,674 buildings, assembled together with geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified top-view representation of the top view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy compared to highly tuned baseline models, indicating its suitability for building analysis. (Accepted for publication at the 35th AAAI Conference on Artificial Intelligence, AAAI 2021.)
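    The following is a minimal sketch of the intuition behind projection pooling: side-view features are projected into a shared top-view grid using geometric metadata and pooled per cell. The geometry, the choice of max-pooling, and all sizes are illustrative assumptions, not the paper's exact layer:

```python
import torch

def projection_pool(side_feats, ground_xy, grid_size=8):
    # side_feats: (N, D) features from N side-view crops;
    # ground_xy: (N, 2) their projected ground-plane coords in [0, 1).
    N, D = side_feats.shape
    grid = torch.zeros(grid_size, grid_size, D)
    cells = (ground_xy * grid_size).long().clamp(0, grid_size - 1)
    for i in range(N):
        x, y = cells[i]
        # Max-pool features that land in the same top-view cell.
        grid[y, x] = torch.maximum(grid[y, x], side_feats[i])
    return grid  # unified top-view representation, (G, G, D)

feats = torch.rand(10, 4)
xy = torch.rand(10, 2)
print(projection_pool(feats, xy).shape)  # torch.Size([8, 8, 4])
```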

    Enhanced faster region-based convolutional neural network for oil palm tree detection

    Oil palm trees are important economic crops in Malaysia. One of the audit procedures is to count the number of oil palm trees for plantation management, which helps the manager predict the plantation yield and the amount of fertilizer and labor needed. However, the current counting method for oil palm plantations is manual counting using GIS software, which is tedious and inefficient for large-scale plantations. To overcome this problem, researchers have proposed automatic counting methods based on machine learning and image processing. Traditional machine learning and image processing methods, however, use handcrafted feature extraction: they extract only low- to mid-level features from the image, lack generalization ability, and are applicable to a single task, requiring reprogramming for other applications. Widely used feature extraction methods such as local binary patterns (LBP), scale-invariant feature transform (SIFT), and the histogram of oriented gradients (HOG) usually achieve low accuracy because of their limited feature representation ability and lack of generalization capability. Hence, this research aims to close these gaps by exploring a deep learning-based object detection algorithm and a classical convolutional neural network (CNN) to build an automatic deep learning-based oil palm tree detection and counting framework. This study proposes a new deep learning method based on Faster RCNN for oil palm tree detection and counting. To reduce overfitting during training, the training dataset is augmented with image processing methods: randomly flipping the images and increasing their contrast and brightness. A transfer learning model of ResNet50 is used as the CNN backbone, and the Faster RCNN network is retrained to obtain the weights for automatic oil palm tree counting. To improve the performance of Faster RCNN, feature concatenation is used to integrate high-level and low-level features from ResNet50. The proposed model was validated on a testing dataset covering three palm tree regions with mature, young, and mixed mature and young palm trees. The detection results were compared with two machine learning methods (ANN and SVM), an image processing-based TM method, and the original Faster RCNN model. The proposed enhanced Faster RCNN (EFRCNN) model shows promising results for oil palm tree detection and counting: it achieved an overall accuracy of 97% on the testing dataset, 97.2% in the mixed palm tree region, and 96.9% in the mature and young palm tree regions, while the traditional ANN, SVM, and TM methods achieved less than 90%. The accuracy comparison reveals that the proposed EFRCNN model outperforms Faster RCNN and the traditional ANN, SVM, and TM methods, and it has the potential to be applied to counting over large areas of oil palm plantation.
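    A minimal transfer-learning sketch in the spirit of this framework, using torchvision's off-the-shelf Faster R-CNN with a ResNet50-FPN backbone; the thesis's feature concatenation enhancement is not reproduced here, and the class count and score threshold are illustrative assumptions:

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a COCO-pretrained Faster R-CNN and swap in a new box predictor
# for the task-specific classes (background + oil palm tree).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
num_classes = 2
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# After fine-tuning on annotated plantation tiles, counting trees
# reduces to thresholding the detection scores on each aerial tile.
model.eval()
with torch.no_grad():
    pred = model([torch.rand(3, 512, 512)])[0]
    count = (pred["scores"] > 0.5).sum().item()
print(f"estimated trees in tile: {count}")
```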