244 research outputs found

    A Trimodel SAR Semisupervised Recognition Method Based on Attention-Augmented Convolutional Networks

    Semisupervised learning for synthetic aperture radar (SAR) is a research hotspot in radar image automatic target recognition. It can efficiently handle challenging settings in which the SAR dataset contains few labeled samples and many unlabeled ones. In recent years, consistency regularization methods in semisupervised learning have shown considerable improvement in recognition accuracy and efficiency. Current consistency regularization approaches suffer from two main shortcomings: first, extracting all of the relevant information about the image target is difficult because conventional convolutional neural networks cannot capture global relational information; second, the standard teacher–student regularization methodology causes confirmation bias because of the tight coupling between the teacher and student models. This article proposes a trimodel semisupervised method based on attention-augmented convolutional networks to address these shortcomings. Specifically, we develop an attention mechanism incorporating a novel positional embedding method based on recurrent neural networks and integrate it with a standard convolutional network as a feature extractor, to improve the network's ability to extract global feature information from images. Furthermore, we address the confirmation bias problem by adding a classmate model to the standard teacher–student structure and using it to impose a weak consistency constraint on the student, which weakens the strong coupling between the teacher and the student. Comparative experiments on the Moving and Stationary Target Acquisition and Recognition dataset show that our method outperforms state-of-the-art semisupervised methods in terms of recognition accuracy, demonstrating its potential as a new benchmark approach for the deep learning and SAR research community.
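
    The weak classmate constraint described in this abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything here is an assumption: the mean-squared-error consistency term, the function names, and the weights `w_teacher` and `w_classmate` are hypothetical and chosen only to show a strong teacher term combined with a weaker classmate term.

```python
def mse(p, q):
    """Mean squared error between two equal-length prediction vectors."""
    if len(p) != len(q):
        raise ValueError("prediction vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def trimodel_consistency_loss(student, teacher, classmate,
                              w_teacher=1.0, w_classmate=0.1):
    """Combine a strong teacher term with a weaker classmate term.

    Assumption: both terms are MSE consistency losses, and the classmate
    weight is smaller than the teacher weight. The paper may differ.
    """
    strong = mse(student, teacher)    # pull the student toward the teacher
    weak = mse(student, classmate)    # weaker pull toward the classmate
    return w_teacher * strong + w_classmate * weak


# Class-probability predictions for one unlabeled sample (made-up values).
loss = trimodel_consistency_loss(
    student=[0.7, 0.3], teacher=[0.9, 0.1], classmate=[0.5, 0.5]
)
```

    Keeping the classmate weight small lets the second model nudge the student without replacing the teacher as the main source of the consistency signal.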

    Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements

    This book is a reprint of the Special Issue entitled "Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements" that was published in Remote Sensing, MDPI. It provides insights into both core technical challenges and selected critical applications of satellite remote sensing image analytics.

    Understanding Human Actions in Video

    Understanding human behavior is crucial for any autonomous system that interacts with humans. For example, assistive robots need to know when a person is signaling for help, and autonomous vehicles need to know when a person is waiting to cross the street. However, identifying human actions in video is a challenging and unsolved problem. In this work, we address several of the key challenges in human action recognition. To enable better representations of video sequences, we develop novel deep learning architectures which improve representations both at the level of instantaneous motion and at the level of long-term context. In addition, to reduce reliance on fixed action vocabularies, we develop a compositional representation of actions which allows novel action descriptions to be represented as a sequence of sub-actions. Finally, we address the issue of data collection for human action understanding by creating a large-scale video dataset, consisting of 70 million videos collected from internet video sharing sites and their matched descriptions. We demonstrate that these contributions improve the generalization performance of human action recognition systems on several benchmark datasets.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162887/1/stroud_1.pd

    Multi-Task Hypergraphs for Semi-supervised Learning using Earth Observations

    There are many ways of interpreting the world, and they are highly interdependent. We exploit such complex dependencies and introduce a powerful multi-task hypergraph, in which every node is a task and different paths through the hypergraph reaching a given task become unsupervised teachers, forming ensembles that learn to generate reliable pseudolabels for that task. Each hyperedge is part of an ensemble teacher for a given task and is also a student of the self-supervised hypergraph system. We apply our model to one of the most important problems of our times, Earth Observation, which is highly multi-task and often suffers from missing ground-truth data. Through extensive experiments on the NASA NEO Dataset, spanning a period of 22 years, we demonstrate the value of our multi-task semi-supervised approach through consistent improvements over strong baselines and recent work. We also show that the hypergraph can adapt without supervision to gradual data distribution shifts and reliably recover, through its multi-task self-supervision process, the missing data for several observational layers for up to seven years.
    Comment: Accepted in ICCV 2023 Workshop.
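
    The idea of several paths acting as an ensemble teacher for one task can be sketched as follows. The abstract says the ensembles are learned; this sketch swaps in a plain element-wise mean purely for illustration, and the function name and input layout are hypothetical.

```python
def ensemble_pseudolabel(path_predictions):
    """Average the predictions of several paths that reach the same task.

    Assumption: each path yields a flat list of per-pixel values for the
    target task. A simple mean stands in for the learned ensemble that the
    abstract describes.
    """
    if not path_predictions:
        raise ValueError("need at least one path prediction")
    n_paths = len(path_predictions)
    # zip(*...) groups the values predicted for the same pixel across paths.
    return [sum(values) / n_paths for values in zip(*path_predictions)]


# Three hypothetical paths predicting the same two-pixel output layer.
pseudolabel = ensemble_pseudolabel([[1.0, 2.0], [3.0, 4.0], [2.0, 6.0]])
```

    The resulting pseudolabel could then supervise any student predicting that task where ground truth is missing.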

    Defect detection method for key area guided transmission line components based on knowledge distillation

    Introduction: The aim of this paper is to address the problem of the limited number of defect images for both metal tools and insulators, as well as the small range of defect features.
    Methods: A defect detection method for key area-guided transmission line components based on knowledge distillation is proposed. First, the PGW (Prediction-Guided Weighting) module is introduced to improve the foreground target distillation region: using a mask, the distillation range is concentrated precisely on the positions of the top k feature pixels with the highest quality scores. The feature knowledge of defects in hardware and insulators is used as the focus for the teacher network to guide the student network. Then, the GcBlock module is used to capture the relationship between the target defects of the hardware and the transmission lines in the background, and the overall relationship information of the image is used to help the student network learn the teacher network's ability to perceive relationship information. Finally, the classification task mask and regression task mask generated by the PGW module, combined with the overall image relationship loss, form a distillation loss function for network training to improve the student network's detection accuracy.
    Results and Discussion: The effectiveness of the proposed method is verified using self-built metal fittings and insulator defect data sets. The experimental results show that, with the knowledge distillation algorithm proposed in this paper, the student network mAP_50 (Mean Average Precision at 50) increases by 8.44% for the Faster R-CNN model, by 2.6% for the RetinaNet model, and by 5.28% for the Cascade R-CNN model.
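
    The top-k masking step in the Methods paragraph can be sketched in a few lines. This is not the paper's PGW module: the way quality scores are computed is not given in the abstract, so the score map is taken as input, and the function names and the squared-error feature loss are assumptions made for illustration.

```python
def topk_mask(scores, k):
    """Return a binary mask marking the k highest-scoring positions."""
    flat = [(s, r, c) for r, row in enumerate(scores) for c, s in enumerate(row)]
    flat.sort(key=lambda item: item[0], reverse=True)
    keep = {(r, c) for _, r, c in flat[:k]}
    return [[1 if (r, c) in keep else 0 for c in range(len(row))]
            for r, row in enumerate(scores)]


def masked_distill_loss(student_feat, teacher_feat, mask):
    """Mean squared feature error over the masked positions only."""
    total, count = 0.0, 0
    for s_row, t_row, m_row in zip(student_feat, teacher_feat, mask):
        for s, t, m in zip(s_row, t_row, m_row):
            if m:
                total += (s - t) ** 2
                count += 1
    return total / count if count else 0.0


# Hypothetical 2x2 quality-score map and single-channel feature maps.
mask = topk_mask([[0.1, 0.9], [0.8, 0.2]], k=2)
loss = masked_distill_loss([[1.0, 2.0], [3.0, 4.0]],
                           [[1.0, 1.0], [1.0, 1.0]], mask)
```

    Restricting the loss to the masked positions keeps the student from spending capacity on imitating background features.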

    Very High Resolution (VHR) Satellite Imagery: Processing and Applications

    Recently, there has been growing interest in using remote sensing imagery for synoptic maps of water quality parameters in coastal and inland water ecosystems; monitoring of complex land ecosystems for biodiversity conservation; precision agriculture for the management of soils, crops, and pests; urban planning; disaster monitoring, etc. However, for these maps to achieve their full potential, it is important to engage in periodic monitoring and analysis of multi-temporal changes. In this context, very high resolution (VHR) satellite-based optical, infrared, and radar imaging instruments provide reliable information to implement spatially-based conservation actions. Moreover, they enable observations of parameters of our environment at broader spatial and finer temporal scales than those allowed through field observation alone. In this sense, recent very high resolution satellite technologies and image processing algorithms present the opportunity to develop quantitative techniques that have the potential to improve upon traditional techniques in terms of cost, mapping fidelity, and objectivity. Typical applications include multi-temporal classification, recognition and tracking of specific patterns, multisensor data fusion, analysis of land/marine ecosystem processes and environment monitoring, etc. This book aims to collect new developments, methodologies, and applications of very high resolution satellite data for remote sensing. The selected works give the research community the most recent advances in all aspects of VHR satellite remote sensing.

    End-to-end Lip-reading: A Preliminary Study

    Deep lip-reading combines the domains of computer vision and natural language processing. It uses deep neural networks to extract speech from silent videos. Most works in lip-reading use a multi-stage training approach due to the complex nature of the task. A single-stage, end-to-end, unified training approach, which is an ideal of machine learning, is also the goal in lip-reading. However, pure end-to-end systems have not yet been able to perform as well as non-end-to-end systems. Some exceptions to this are the very recent Temporal Convolutional Network (TCN) based architectures. This work lays out a preliminary study of deep lip-reading, with a special focus on various end-to-end approaches. The research aims to test whether a purely end-to-end approach is justifiable for a task as complex as deep lip-reading. To achieve this, the meaning of pure end-to-end is first defined and several lip-reading systems that follow the definition are analysed. The system that most closely matches the definition is then adapted for pure end-to-end experiments. Four main contributions have been made: i) an analysis of 9 different end-to-end deep lip-reading systems, ii) creation and public release of a pipeline to adapt the sentence-level Lipreading Sentences in the Wild 3 (LRS3) dataset to word level, iii) pure end-to-end training of a TCN-based network and evaluation on the LRS3 word-level dataset as a proof of concept, iv) a public online portal to analyse visemes and experiment with live end-to-end lip-reading inference. The study is able to verify that pure end-to-end is a sensible approach and an achievable goal for deep machine lip-reading.

    Remote Sensing of the Aquatic Environments

    The book highlights recent research efforts in the monitoring of aquatic districts with remote sensing observations and proximal sensing technology integrated with laboratory measurements. Optical satellite imagery gathered at spatial resolutions down to a few meters has been used for quantitative estimations of harmful algal bloom extent and Chl-a mapping, while SAR acquisitions have been used to estimate winds and currents. The knowledge and understanding gained from this book can be used for the sustainable management of bodies of water across our planet.