Urban Land Cover Classification with Missing Data Modalities Using Deep Convolutional Neural Networks
Automatic urban land cover classification is a fundamental problem in remote
sensing, e.g. for environmental monitoring. The problem is highly challenging,
as classes generally have low inter-class and high intra-class variance.
Techniques to improve urban land cover classification performance in remote
sensing include fusion of data from different sensors with different data
modalities. However, such techniques require all modalities to be available to
the classifier in the decision-making process, i.e. at test time, as well as in
training. If a data modality is missing at test time, current state-of-the-art
approaches have in general no procedure available for exploiting information
from these modalities. This represents a waste of potentially useful
information. We propose as a remedy a convolutional neural network (CNN)
architecture for urban land cover classification which is able to embed all
available training modalities in a so-called hallucination network. The network
will in effect replace missing data modalities in the test phase, enabling
fusion capabilities even when data modalities are missing in testing. We
demonstrate the method using two datasets consisting of optical and digital
surface model (DSM) images. We simulate missing modalities by assuming that DSM
images are missing during testing. Our method outperforms both standard CNNs
trained only on optical images and an ensemble of two standard CNNs. We
further evaluate the potential of our method to handle situations where only
some DSM images are missing during testing. Overall, we show that we can
clearly exploit training-time information of the missing modality during
testing.
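The modality-hallucination idea described above can be sketched roughly as follows. This PyTorch snippet is a minimal illustration, not the paper's architecture: the layer widths, the number of classes, and the unweighted sum of losses are all placeholder choices.

    # Minimal sketch of a hallucination branch; all sizes are illustrative.
    import torch
    import torch.nn as nn

    def make_encoder(in_channels):
        # Small CNN encoder producing a feature map.
        return nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )

    optical_enc = make_encoder(3)          # RGB branch, always available
    dsm_enc = make_encoder(1)              # DSM branch, available only during training
    halluc_enc = make_encoder(3)           # hallucination branch: optical in, DSM-like features out
    classifier = nn.Conv2d(64 + 64, 5, 1)  # fuses both feature maps; 5 hypothetical classes

    optical = torch.randn(2, 3, 64, 64)
    dsm = torch.randn(2, 1, 64, 64)
    labels = torch.randint(0, 5, (2, 64, 64))

    # Training: the hallucination branch is supervised to reproduce DSM features,
    # while the fused features are trained for classification as usual.
    f_opt, f_dsm, f_hal = optical_enc(optical), dsm_enc(dsm), halluc_enc(optical)
    halluc_loss = nn.functional.mse_loss(f_hal, f_dsm.detach())
    logits = classifier(torch.cat([f_opt, f_hal], dim=1))
    cls_loss = nn.functional.cross_entropy(logits, labels)
    (cls_loss + halluc_loss).backward()

    # Test time with DSM missing: the hallucination branch stands in for the DSM encoder.
    with torch.no_grad():
        test_logits = classifier(torch.cat([optical_enc(optical), halluc_enc(optical)], dim=1))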
Google Street View image of a house predicts car accident risk of its resident
Road traffic injuries are a leading cause of death worldwide. Proper
estimation of car accident risk is critical for appropriate allocation of
resources in healthcare, insurance, civil engineering, and other industries. We
show how images of houses are predictive of car accidents. We analyze 20,000
addresses of insurance company clients, collect a corresponding house image
using Google Street View, and annotate house features such as age, type, and
condition. We find that this information substantially improves car accident
risk prediction compared to the state-of-the-art risk model of the insurance
company and could be used for price discrimination. From this perspective,
public availability of house images raises legal and social concerns, as they
can be a proxy for ethnicity, religion, and other sensitive attributes.
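To make the feature-augmentation setup concrete, here is a minimal sketch with synthetic data and hypothetical feature columns; it is not the insurer's actual risk model, only an illustration of comparing a baseline predictor against one augmented with annotated house features.

    # Illustrative only: synthetic data, hypothetical features, not the study's model.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 5000
    baseline = rng.normal(size=(n, 4))   # stand-in actuarial features (driver age, bonus-malus, ...)
    house = rng.normal(size=(n, 3))      # stand-in annotated house features (age, type, condition)
    risk = 1 / (1 + np.exp(-(baseline[:, 0] + 0.8 * house[:, 2])))
    y = rng.binomial(1, risk)            # simulated accident outcome

    Xb_tr, Xb_te, Xh_tr, Xh_te, y_tr, y_te = train_test_split(baseline, house, y, random_state=0)

    base_model = LogisticRegression(max_iter=1000).fit(Xb_tr, y_tr)
    aug_model = LogisticRegression(max_iter=1000).fit(np.hstack([Xb_tr, Xh_tr]), y_tr)

    print("baseline AUC: ", roc_auc_score(y_te, base_model.predict_proba(Xb_te)[:, 1]))
    print("augmented AUC:", roc_auc_score(y_te, aug_model.predict_proba(np.hstack([Xb_te, Xh_te]))[:, 1]))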
Resource-Constrained Simultaneous Detection and Labeling of Objects in High-Resolution Satellite Images
We describe a strategy for detection and classification of man-made objects
in large high-resolution satellite photos under computational resource
constraints. We detect and classify candidate objects using five convolutional
neural network (CNN) processing pipelines run in parallel. Each pipeline has
its own strategy for fine-tuning parameters, filtering proposal regions, and
handling image scales. Conflicting region
proposals are merged based on region confidence and not just based on overlap
areas, which improves the quality of the final bounding-box regions selected.
We demonstrate this strategy using the recent xView challenge, which is a
complex benchmark with more than 1,100 high-resolution images, spanning 800,000
aerial objects around the world, covering a total area of 1,400 square
kilometers at 0.3 meter ground sample distance. To tackle the
resource-constrained problem posed by the xView challenge, where inferences are
restricted to run on a CPU with an 8 GB memory limit, we used lightweight CNNs
trained with the single-shot detector algorithm. Our approach was competitive
on the sequestered sets; it was ranked third.
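A rough sketch of confidence-aware proposal merging of the kind described above; the thresholds and the exact suppression rule are illustrative, not the xView entry's implementation.

    # Merge proposals from parallel pipelines: an overlapping box is suppressed only when
    # its confidence is well below that of an already-kept box, not on overlap alone.
    def iou(a, b):
        ax1, ay1, ax2, ay2 = a; bx1, by1, bx2, by2 = b
        ix1, iy1 = max(ax1, bx1), max(ay1, by1)
        ix2, iy2 = min(ax2, bx2), min(ay2, by2)
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
        return inter / union if union > 0 else 0.0

    def merge_proposals(proposals, iou_thr=0.5, conf_ratio=0.8):
        # proposals: list of (box, confidence), box = (x1, y1, x2, y2)
        kept = []
        for box, conf in sorted(proposals, key=lambda p: -p[1]):
            suppressed = any(iou(box, kb) >= iou_thr and conf < conf_ratio * kc
                             for kb, kc in kept)
            if not suppressed:
                kept.append((box, conf))
        return kept

    print(merge_proposals([((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.6), ((20, 20, 30, 30), 0.7)]))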
Object Detection in Specific Traffic Scenes using YOLOv2
An object detection framework plays a crucial role in autonomous driving. In this
paper, we introduce the real-time object detection framework You Only
Look Once (YOLOv1) and the improvements introduced in YOLOv2. We further explore
the capability of YOLOv2 by applying its pre-trained model to object
detection tasks in specific traffic scenes. The four artificially designed
traffic scenes are single-car, single-person, frontperson-rearcar and
frontcar-rearperson.
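A minimal sketch of running a pre-trained Darknet YOLO model on one traffic-scene image with OpenCV's dnn module; the file paths are placeholders, a reasonably recent OpenCV build is assumed, and box decoding plus non-maximum suppression are omitted because they depend on the exact model variant.

    import cv2
    import numpy as np

    net = cv2.dnn.readNetFromDarknet("yolov2.cfg", "yolov2.weights")  # placeholder local files
    image = cv2.imread("traffic_scene.jpg")                           # e.g. a frontperson-rearcar scene
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())

    for row in np.vstack(outputs):
        class_scores = row[5:]             # common Darknet layout: [cx, cy, w, h, obj, class scores...]
        class_id = int(np.argmax(class_scores))
        if class_scores[class_id] > 0.5:   # illustrative confidence threshold
            print("detected class", class_id, "at (relative coords)", row[:4])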
Confidential Inference via Ternary Model Partitioning
Today's cloud vendors are competing to provide various offerings to simplify
and accelerate AI service deployment. However, cloud users have persistent concerns
about the confidentiality of their runtime data, which must be
processed on third-party compute infrastructure. Information disclosure of
user-supplied data may jeopardize users' privacy and breach increasingly
stringent data protection regulations. In this paper, we systematically
investigate the life cycles of inference inputs in deep learning image
classification pipelines and understand how the information could be leaked.
Based on the discovered insights, we develop a Ternary Model Partitioning
mechanism and bring trusted execution environments to mitigate the identified
information leakages. Our research prototype consists of two co-operative
components: (1) Model Assessment Framework, a local model evaluation and
partitioning tool that assists cloud users in deployment preparation; (2)
Infenclave, an enclave-based model serving system for online confidential
inference in the cloud. We have conducted comprehensive security and
performance evaluation on three representative ImageNet-level deep learning
models with different network depths and architectural complexity. Our results
demonstrate the feasibility of launching confidential inference services in the
cloud with maximized confidentiality guarantees and low performance overhead.
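The partitioning step can be illustrated with a toy example. The split points below are arbitrary, whereas the Model Assessment Framework would choose them from measured leakage, and the sensitive partition would actually execute inside an enclave (as in Infenclave) rather than in the same process.

    # Split a sequential model into three consecutive partitions and run inference across them.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 10),
    )

    def partition(seq, i, j):
        layers = list(seq.children())
        return nn.Sequential(*layers[:i]), nn.Sequential(*layers[i:j]), nn.Sequential(*layers[j:])

    front, middle, back = partition(model, 2, 6)   # the sensitive partition would be enclave-protected

    x = torch.randn(1, 3, 32, 32)
    with torch.no_grad():
        logits = back(middle(front(x)))            # inference flows through the three partitions
    print(logits.shape)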
Mini-Unmanned Aerial Vehicle-Based Remote Sensing: Techniques, Applications, and Prospects
The past few decades have witnessed great progress in unmanned aerial
vehicles (UAVs) in civilian fields, especially in photogrammetry and remote
sensing. In contrast with manned aircraft and satellite platforms, the
UAV platform offers many promising characteristics: flexibility, efficiency,
high-spatial/temporal resolution, low cost, easy operation, etc., which make it
an effective complement to other remote-sensing platforms and a cost-effective
means for remote sensing. Considering the popularity and expansion of UAV-based
remote sensing in recent years, this paper provides a systematic survey on the
recent advances and future prospects of UAVs in the remote-sensing
community. Specifically, we first discuss and summarize the main challenges
and key technologies of UAV-based remote-sensing data processing.
Then, we provide an overview of the widespread applications of UAVs in
remote sensing. Finally, some prospects for future work are discussed. We hope
this paper will provide remote-sensing researchers with an overall picture of recent
UAV-based remote-sensing developments and help guide further research on
this topic.
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have been promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still poses many challenges due to the need to perform different tasks in real time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system for maintaining regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may have occurred in the mosaic using an algorithm based on histogram equalization and RGB-Local Binary Patterns (RGB-LBP). If changes are found, the mosaic is updated. The second mode performs real-time classification using, again, our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real time and performs mosaicking and change detection at low altitude, thus allowing the classification even of small objects. The proposed system was tested on the full set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. Evaluation with well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as change detection and object detection.
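As a rough illustration of the change test mentioned above, the following sketch compares per-channel LBP histograms of corresponding patches after histogram equalization; the patch size, LBP parameters, and distance threshold are placeholder values, not the thesis's tuned settings.

    import numpy as np
    from skimage.exposure import equalize_hist
    from skimage.feature import local_binary_pattern

    def rgb_lbp_histogram(patch, points=8, radius=1):
        # patch: H x W x 3 float image; equalize each channel, then concatenate uniform-LBP histograms.
        hists = []
        for c in range(3):
            channel = equalize_hist(patch[:, :, c])
            lbp = local_binary_pattern(channel, points, radius, method="uniform")
            hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)
            hists.append(hist)
        return np.concatenate(hists)

    def changed(ref_patch, new_patch, threshold=0.25):
        # Flag a change when the L1 distance between the two descriptors exceeds the threshold.
        return np.abs(rgb_lbp_histogram(ref_patch) - rgb_lbp_histogram(new_patch)).sum() > threshold

    rng = np.random.default_rng(0)
    a = rng.random((64, 64, 3))
    print(changed(a, a), changed(a, np.zeros_like(a)))  # identical patches vs. a flat patch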
Fine-Grained Land Use Classification at the City Scale Using Ground-Level Images
We perform fine-grained land use mapping at the city scale using ground-level
images. Mapping land use is considerably more difficult than mapping land cover
and is generally not possible using overhead imagery as it requires close-up
views and seeing inside buildings. We postulate that the growing collections of
georeferenced, ground-level images suggest an alternate approach to this
geographic knowledge discovery problem. We develop a general framework that
uses Flickr images to map 45 different land-use classes for the City of San
Francisco. Individual images are classified using a novel convolutional neural
network containing two streams, one for recognizing objects and another for
recognizing scenes. This network is trained in an end-to-end manner directly on
the labeled training images. We propose several strategies to overcome the
noisiness of our user-generated data including search-based training set
augmentation and online adaptive training. We derive a ground truth map of San
Francisco in order to evaluate our method. We demonstrate the effectiveness of
our approach through geo-visualization and quantitative analysis. Our framework
achieves over 29% recall at the individual land-parcel level, which represents a
strong baseline for the challenging 45-way land use classification problem,
especially given the noisiness of the image data.
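The two-stream design can be sketched as follows. The tiny encoders are stand-ins for the paper's object- and scene-recognition streams (which would start from pretrained networks); only the fuse-then-classify structure is the point here.

    import torch
    import torch.nn as nn

    class TwoStreamClassifier(nn.Module):
        def __init__(self, num_classes=45):
            super().__init__()
            def stream():
                return nn.Sequential(
                    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                )
            self.object_stream = stream()   # would be initialized from an object-recognition model
            self.scene_stream = stream()    # would be initialized from a scene-recognition model
            self.head = nn.Linear(32 + 32, num_classes)

        def forward(self, x):
            # Concatenate the two streams' features and predict one of the land-use classes.
            return self.head(torch.cat([self.object_stream(x), self.scene_stream(x)], dim=1))

    model = TwoStreamClassifier()
    logits = model(torch.randn(4, 3, 224, 224))   # a batch of ground-level images
    print(logits.shape)                           # torch.Size([4, 45])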
Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers
3D object classification and segmentation using deep neural networks has been
extremely successful. As the problem of identifying 3D objects has many
safety-critical applications, neural networks have to be robust against
adversarial changes to the input data. There is a growing body of research
on generating human-imperceptible adversarial attacks and defenses against them
in the 2D image classification domain. However, 3D objects differ from 2D
images in various ways, and this specific domain has not yet been rigorously
studied.
We present a preliminary evaluation of adversarial attacks on deep 3D point
cloud classifiers, namely PointNet and PointNet++, by evaluating both white-box
and black-box adversarial attacks that were proposed for 2D images and
extending those attacks to reduce the perceptibility of the perturbations in 3D
space. We also show the high effectiveness of simple defenses against those
attacks by proposing new defenses that exploit the unique structure of 3D point
clouds. Finally, we attempt to explain the effectiveness of the defenses
through the intrinsic structures of both the point clouds and the neural
network architectures. Overall, we find that networks that process 3D point
cloud data are vulnerable to adversarial attacks, but they are also more easily
defended than 2D image classifiers. Our investigation will provide the
groundwork for future studies on improving the robustness of deep neural
networks that handle 3D data.
Comment: Abridged version accepted at the 2019 IEEE International Conference
on Image Processing (ICIP). Source code:
https://github.com/Daniel-Liu-c0deb0t/3D-Neural-Network-Adversarial-Attack
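One defense in the spirit of those described above, statistical outlier removal, can be sketched as follows; the neighbourhood size and cut-off are illustrative, not the paper's settings (see the repository above for the actual code).

    # Drop points whose mean distance to their k nearest neighbours is unusually large;
    # adversarially injected points often end up far from the underlying object surface.
    import numpy as np

    def remove_outliers(points, k=8, std_factor=1.5):
        # points: N x 3 array
        d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)  # pairwise distances
        knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)                # skip self-distance 0
        keep = knn_mean < knn_mean.mean() + std_factor * knn_mean.std()
        return points[keep]

    rng = np.random.default_rng(0)
    cloud = rng.normal(scale=0.1, size=(512, 3))                       # stand-in object surface
    adversarial = np.vstack([cloud, rng.uniform(1.0, 2.0, (16, 3))])   # far-away injected points
    print(adversarial.shape, remove_outliers(adversarial).shape)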
Semantic Change Detection with Hypermaps
Change detection is the study of detecting changes between two different
images of a scene taken at different times. From the detected change areas
alone, however, a human cannot understand how the two images differ. Therefore,
semantic understanding is required in change detection applications such as
disaster investigation. This paper proposes the concept of semantic change
detection, which involves intuitively assigning semantic meaning to detected
change areas. We mainly focus on semantic segmentation in addition to
a conventional change detection approach. In order to solve this problem and
obtain a high level of performance, we propose an improvement to the
hypercolumns representation, hereafter known as hypermaps, which effectively
uses convolutional maps obtained from convolutional neural networks (CNNs). We
also employ multi-scale feature representation captured by different image
patches. We applied our method to the TSUNAMI Panoramic Change Detection
dataset, re-annotating the changed areas with semantic
classes. The results show that our multi-scale hypermaps provide outstanding
performance on the re-annotated TSUNAMI dataset.
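The hypercolumn idea that hypermaps build on, stacking upsampled convolutional feature maps into a per-pixel descriptor, can be sketched as follows; the toy backbone and tapped layers are illustrative, not the paper's network.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    backbone = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 1/2 resolution
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 1/4 resolution
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    )

    def per_pixel_features(image, tap_indices=(1, 4, 7)):
        # Run the backbone, keep activations at the tapped layers, upsample each to the
        # input size, and concatenate along the channel axis -> one descriptor per pixel.
        maps, x = [], image
        for i, layer in enumerate(backbone):
            x = layer(x)
            if i in tap_indices:
                maps.append(F.interpolate(x, size=image.shape[-2:],
                                          mode="bilinear", align_corners=False))
        return torch.cat(maps, dim=1)

    features = per_pixel_features(torch.randn(1, 3, 64, 64))
    print(features.shape)   # torch.Size([1, 112, 64, 64]) = 16 + 32 + 64 channels per pixel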