On the Application of Data Clustering Algorithm used in Information Retrieval for Satellite Imagery Segmentation
This study proposes an automated technique for segmenting satellite imagery using unsupervised learning. Autoencoders, a type of neural network, are employed for dimensionality reduction and feature extraction. The study evaluates different segmentation architectures and encoders and identifies the best performing combination as the DeepLabv3+ architecture with a ResNet-152 encoder. This approach achieves high performance scores across multiple metrics and can be beneficial in various fields, including agriculture, land use monitoring, and disaster response.
The role of earth observation in an integrated deprived area mapping “system” for low-to-middle income countries
Urbanization in the global South has been accompanied by the proliferation of vast informal and marginalized urban areas that lack access to essential services and infrastructure. UN-Habitat estimates that close to a billion people currently live in these deprived and informal urban settlements, generally grouped under the term of urban slums. Two major knowledge gaps undermine the efforts to monitor progress towards the corresponding sustainable development goal (i.e., SDG 11: Sustainable Cities and Communities). First, the data available for cities worldwide is patchy and insufficient to differentiate between the diversity of urban areas with respect to their access to essential services and their specific infrastructure needs. Second, existing approaches used to map deprived areas (i.e., aggregated household data, Earth observation (EO), and community-driven data collection) are mostly siloed, and, individually, they often lack transferability and scalability and fail to include the opinions of different interest groups. In particular, EO-based deprived area mapping approaches are mostly top-down, with very little attention given to ground information and interaction with urban communities and stakeholders. Existing top-down methods should be complemented with bottom-up approaches to produce routinely updated, accurate, and timely deprived area maps. In this review, we first assess the strengths and limitations of existing deprived area mapping methods. We then propose an Integrated Deprived Area Mapping System (IDeAMapS) framework that leverages the strengths of EO- and community-based approaches. The proposed framework offers a way forward to map deprived areas globally, routinely, and with maximum accuracy to support SDG 11 monitoring and the needs of different interest groups.
Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding
Understanding intrinsic patterns and predicting spatiotemporal
characteristics of cities require a comprehensive representation of urban
neighborhoods. Existing works relied on either inter- or intra-region
connectivities to generate neighborhood representations but failed to fully
utilize the informative yet heterogeneous data within neighborhoods. In this
work, we propose Urban2Vec, an unsupervised multi-modal framework which
incorporates both street view imagery and point-of-interest (POI) data to learn
neighborhood embeddings. Specifically, we use a convolutional neural network to
extract visual features from street view images while preserving geospatial
similarity. Furthermore, we model each POI as a bag-of-words containing its
category, rating, and review information. Analogous to document embedding in
natural language processing, we establish the semantic similarity between
neighborhood ("document") and the words from its surrounding POIs in the vector
space. By jointly encoding visual, textual, and geospatial information into the
neighborhood representation, Urban2Vec outperforms baseline models and is
comparable to fully-supervised methods in downstream
prediction tasks. Extensive experiments on three U.S. metropolitan areas also
demonstrate the model's interpretability, generalization capability, and
value in neighborhood similarity analysis.
Comment: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
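The bag-of-words treatment of POIs described in this abstract can be illustrated with a minimal sketch. It only shows how a neighborhood might be turned into a "document" of POI words; it does not learn embeddings, and the record fields, the sample values, and the helper names `poi_to_words` and `neighborhood_bag` are hypothetical, not taken from the paper.

```python
from collections import Counter

# Hypothetical POI records; field names and values are illustrative only.
pois = [
    {"category": "cafe", "rating": 4.5, "review": "cozy quiet coffee"},
    {"category": "gym", "rating": 3.0, "review": "crowded loud"},
]

def poi_to_words(poi):
    """Turn one POI into a bag of words: category, a rating token, review words."""
    # Assumption: ratings are bucketed by truncating to an integer.
    rating_token = f"rating_{int(poi['rating'])}"
    return [poi["category"], rating_token] + poi["review"].lower().split()

def neighborhood_bag(pois):
    """Treat the neighborhood as a document: count the words of all its POIs."""
    counts = Counter()
    for poi in pois:
        counts.update(poi_to_words(poi))
    return counts

bag = neighborhood_bag(pois)
```

In a full pipeline, word counts like these would feed a document-embedding objective rather than being used directly.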
An Ensemble Learning Approach for Fast Disaster Response using Social Media Analytics
Natural disasters result from natural hazards and cause financial, environmental, or human losses. They strike unexpectedly, affecting the lives of tens of thousands of people. During floods, social media sites have also been heavily used to disseminate information about flooded areas, rescue agencies, and food and relief centres. This work proposes an ensemble learning strategy for combining and analysing social media data in order to close the gap and make progress in catastrophic situations. To enable scalability and broad accessibility of dynamically streamed multimodal data, namely text, image, audio, and video, this work is designed around social media data. A fusion technique was employed at the decision level, based on a database of 15 characteristics for more than 300 disasters around the world (trained with the MNIST dataset: 60,000 training images and 10,000 testing images). This work allows the collected multimodal social media data to share a common semantic space, making individual variable prediction easier. Each merged numerical vector (tensor) of text and audio is sent to K-CNN, an unsupervised learning algorithm, and the image and video data are given to a deep-learning-based Progressive Neural Architecture Search (PNAS). The trained model acts as a predictor for future incidents, allowing for the estimation of total deaths, total individuals impacted, and total damage, as well as specific suggestions for food, shelter, and housing inspections. To make such a prediction, the trained model is given a satellite image from before the incident as well as the geographic and demographic conditions, which is expected to result in a prediction accuracy of more than 85%.
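Decision-level fusion, as mentioned in this abstract, combines the outputs of per-modality models rather than their raw inputs. The abstract does not state the fusion rule, so the sketch below assumes a weighted average of class probabilities; the function name `fuse_decisions` and the sample probabilities are hypothetical.

```python
def fuse_decisions(prob_by_modality, weights=None):
    """Fuse per-modality class probabilities by a weighted average.

    Assumption: averaging is used as the fusion rule; the source does not
    specify one.
    """
    names = list(prob_by_modality)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal trust in each modality
    total_weight = sum(weights[name] for name in names)
    num_classes = len(prob_by_modality[names[0]])
    return [
        sum(weights[name] * prob_by_modality[name][c] for name in names) / total_weight
        for c in range(num_classes)
    ]

# Hypothetical outputs of three separately trained classifiers
# for the classes ["disaster-related", "unrelated"].
predictions = {
    "text": [0.7, 0.3],
    "image": [0.4, 0.6],
    "audio": [0.6, 0.4],
}
fused = fuse_decisions(predictions)
label = max(range(len(fused)), key=lambda c: fused[c])
```

Passing a `weights` dictionary lets a more reliable modality count for more in the final decision.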
When Urban Region Profiling Meets Large Language Models
Urban region profiling from web-sourced data is of utmost importance for
urban planning and sustainable development. We are witnessing a rising trend of
LLMs for various fields, especially dealing with multi-modal data research such
as vision-language learning, where the text modality serves as supplementary
information for the image. Since the textual modality has never been introduced
into modality combinations in urban region profiling, we aim to answer two
fundamental questions in this paper: i) Can the textual modality enhance urban
region profiling? ii) If so, in what ways and with regard to which aspects?
To answer the questions, we leverage the power of Large Language Models (LLMs)
and introduce the first-ever LLM-enhanced framework that integrates the
knowledge of textual modality into urban imagery profiling, named LLM-enhanced
Urban Region Profiling with Contrastive Language-Image Pretraining (UrbanCLIP).
Specifically, it first generates a detailed textual description for each
satellite image by an open-source Image-to-Text LLM. Then, the model is trained
on the image-text pairs, seamlessly unifying natural language supervision for
urban visual representation learning, jointly with contrastive loss and
language modeling loss. Results on predicting three urban indicators in four
major Chinese metropolises demonstrate its superior performance, with an
average improvement of 6.1% on R^2 compared to the state-of-the-art methods.
Our code and the image-language dataset will be released upon paper
notification.
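The contrastive part of the training objective described in this abstract can be sketched as a symmetric cross-entropy over image-text similarities, in the style of CLIP. This is a minimal illustration, not the authors' implementation: it omits the language modeling loss, and the temperature value of 0.07 and all function names are assumptions.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss; pair i of images and texts is the positive."""
    img = [normalize(v) for v in image_emb]
    txt = [normalize(v) for v in text_emb]
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txt]
              for i in img]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy embeddings: matched pairs point the same way, swapped pairs do not.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(images, [[2.0, 0.0], [0.0, 3.0]])
swapped = contrastive_loss(images, [[0.0, 3.0], [2.0, 0.0]])
```

The loss is low when each satellite image is most similar to its own description and high when descriptions are mismatched.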
Prediction of Housing Price and Forest Cover Using Mosaics with Uncertain Satellite Imagery
Estimating land use, road length, and forest cover with a planet-scale ground monitoring system is increasingly expensive. Satellite imagery contains a significant amount of detailed but uncertain information. Combining it with machine learning (SIML) helps organize these data and estimate each variable separately. However, the resources needed to deploy machine learning on remote sensing images restrict its reach and application. Satellite observations are notably underutilised in impoverished nations, where the practical expertise to implement SIML may be limited. Encoded forms of the images are shared across tasks: they are calculated and sent to an unlimited number of researchers, who can achieve top-tier SIML performance by fitting a regression to the actual data. By separating these duties, the proposed SIML solution, MOSAIKS, makes SIML approachable and global. A featurization stage turns remote sensing data into concise vector representations, and a regression step learns the task-specific correlations that link the obtained features to the set of uncertain data.
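The two-stage split described in this abstract, a task-agnostic featurization followed by a task-specific regression, can be sketched as follows. This is a toy illustration under stated assumptions: the filters are random Gaussian 3x3 kernels, the images and labels are synthetic, and the helper names `featurize`, `solve`, and `ridge_fit` are hypothetical rather than part of MOSAIKS.

```python
import random

def featurize(image, filters):
    """Convolve each filter over the image, apply ReLU, and average-pool to one number."""
    h, w = len(image), len(image[0])
    k = len(filters[0])  # square filters of size k x k
    features = []
    for f in filters:
        total, count = 0.0, 0
        for r in range(h - k + 1):
            for c in range(w - k + 1):
                response = sum(f[i][j] * image[r + i][c + j]
                               for i in range(k) for j in range(k))
                total += max(response, 0.0)  # ReLU
                count += 1
        features.append(total / count)
    return features

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ridge_fit(X, y, lam=1e-3):
    """Ridge regression weights: solve (X^T X + lam * I) w = X^T y."""
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)

random.seed(0)
# Assumption: four random 3x3 filters and twenty synthetic 8x8 "images".
filters = [[[random.gauss(0, 1) for _ in range(3)] for _ in range(3)] for _ in range(4)]
images = [[[random.random() for _ in range(8)] for _ in range(8)] for _ in range(20)]
X = [featurize(img, filters) for img in images]  # task-agnostic, reusable features
y = [sum(map(sum, img)) / 64 for img in images]  # toy label: mean brightness
weights = ridge_fit(X, y)                        # task-specific regression step
```

Because `X` does not depend on the label, the same features could be reused to fit a separate `ridge_fit` call for each new variable of interest.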