191 research outputs found

    Automatic Caption Generation for Aerial Images: A Survey

    Aerial images have long attracted attention from the research community. Generating a caption that comprehensively describes the content of an aerial image is a less studied but important task, with applications in agriculture, defence, disaster management, and many other areas. Although various approaches have been followed for natural image caption generation, generating a caption for an aerial image remains challenging because of its special nature. Emerging techniques from the Artificial Intelligence (AI) and Natural Language Processing (NLP) domains have resulted in captions of acceptable quality for aerial images. However, much remains to be done to fully exploit the potential of aerial image caption generation. This paper presents a detailed survey of the approaches researchers have followed for aerial image caption generation. The datasets available for experimentation, the criteria used for performance evaluation, and future directions are also discussed.

    Pixel2point: 3D Object Reconstruction From a Single Image Using CNN and Initial Sphere

    Remote Sensing Image Scene Classification: Benchmark and State of the Art

    Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various datasets or present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small scale of scene classes and the image numbers, the lack of image variations and diversity, and the saturation of accuracy. These limitations severely limit the development of new approaches especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale on the scene classes and the total image number, (ii) holds big variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset and the results are reported as a useful baseline for future research. Comment: This manuscript is the accepted version for Proceedings of the IEE

    Intelligent Data Analytics using Deep Learning for Data Science

    Nowadays, data science stimulates the interest of academics and practitioners because it can assist in the extraction of significant insights from massive amounts of data. From the years 2018 through 2025, the Global Datasphere is expected to rise from 33 Zettabytes to 175 Zettabytes, according to the International Data Corporation. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties when implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations in features gathered from various data sources. The proposed intelligent data analytics framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that utilizes confident learning techniques to improve the data quality from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions to identify easily confused samples of different classes. The convolutional fusion is further enhanced with the power of Graph Transformers, aggregating the relevant neighboring features in graph-based input data structures and achieving state-of-the-art performance on a large-scale building damage dataset. 
Finally, weakly-supervised strategies, noise regularization, and label propagation are proposed to train a model on sparse input labeled data, ensuring the model's robustness to errors and supporting the automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with just about 7% of the images annotated by a human. The proposed framework's capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite pictures, disaster scene description, and storm-surge visualization.

    Change Detection of Marine Environments Using Machine Learning

    NPS NRP Technical Report. HQMC Intelligence Department (I). Chief of Naval Operations (CNO). This research is supported by funding from the Naval Postgraduate School, Naval Research Program (PE 0605853N/2098), https://nps.edu/nrp. Approved for public release. Distribution is unlimited.

    Harnessing Big Data and Machine Learning for Event Detection and Localization

    Anomalous events are rare and deviate significantly from expected patterns and other data instances, making them hard to predict. Detecting severe anomalous events correctly and promptly can help reduce risks and losses. Many anomalous event detection techniques have been studied in the literature. Recently, big data and machine learning based techniques have shown remarkable success in a wide range of fields. It is important to tailor these techniques to each application; otherwise they may result in expensive computation, slow prediction, false alarms, and improper prediction granularity. We aim to address the above challenges by harnessing big data and machine learning techniques for fast and reliable prediction and localization of severe events. First, to improve storage failure prediction, we develop a new lightweight and high-performing tensor decomposition-based method, named SEFEE, for storage error forecasting in large-scale enterprise storage systems. SEFEE employs a tensor decomposition technique to capture latent spatio-temporal information embedded in storage event logs. By utilizing this latent spatio-temporal information, we can forecast storage errors accurately without the training requirements of typical machine learning techniques. The training-free method allows for live prediction of storage errors and their locations in the storage system based on previous observations that were used in the tensor decomposition pipeline to extract meaningful latent correlations. Moreover, we propose an extension that includes the severity of the errors as contextual information to improve the accuracy of the tensor decomposition, which in turn improves the prediction accuracy. We further provide a detailed characterization of the NetApp dataset to give the community additional insight into the dynamics of typical large-scale enterprise storage systems. Next, we focus on another application: AI-driven wildfire prediction. 
Wildfires cause billions of dollars in property damage and loss of life, along with serious health threats. We aim to correctly detect and localize wildfire events at an early stage and also to classify wildfire smoke based on the perceived pixel density of camera images. Because no publicly available dataset exists for early wildfire smoke detection, we first collect and process images from the AlertWildfire camera network. The images are annotated with bounding boxes and densities for use by deep learning methods. We then adapt a transformer-based end-to-end object detection model for wildfire detection using our dataset. The dataset and detection model together form a benchmark named the Nevada smoke detection benchmark, or Nemo for short. Nemo is the first open-source benchmark for wildfire smoke detection that focuses on the early incipient stage. We further provide a weakly supervised version of Nemo to enable wider support as a benchmark.