6,021 research outputs found

    Multi-Context Attention for Human Pose Estimation

    In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple resolutions with various semantics. A Conditional Random Field (CRF) is used to model the correlations among neighboring regions in the attention map. We further combine the holistic attention model, which focuses on the global consistency of the full human body, and the body part attention model, which focuses on the detailed description of different body parts. Hence our model can focus on different granularities, from local salient regions to global semantic-consistent spaces. Additionally, we design novel Hourglass Residual Units (HRUs) to increase the receptive field of the network. These units are extensions of residual units with a side branch incorporating filters with larger receptive fields, so features at various scales are learned and combined within the HRUs. The effectiveness of the proposed multi-context attention mechanism and the hourglass residual units is evaluated on two widely used human pose estimation benchmarks. Our approach outperforms all existing methods on both benchmarks over all the body parts. Comment: The first two authors contribute equally to this work.

    Image Reconstruction from Bag-of-Visual-Words

    The objective of this work is to reconstruct an original image from Bag-of-Visual-Words (BoVW). Image reconstruction from features can be a means of identifying the characteristics of those features. Additionally, it enables us to generate novel images via features. Although BoVW is the de facto standard feature for image recognition and retrieval, successful image reconstruction from BoVW has not yet been reported. What complicates this task is that BoVW lacks spatial information about the visual words it includes. To estimate the original arrangement, we propose an evaluation function that incorporates the naturalness of local adjacency and the global position, together with a method to obtain the related parameters using an external image database. To evaluate the performance of our method, we reconstruct images of 101 kinds of objects. Additionally, we apply our method to analyze object classifiers and to generate novel images via BoVW.
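
The loss of spatial information mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the codebook, the descriptors, and the function name `bovw_histogram` are hypothetical and randomly generated for illustration. It only shows that a BoVW histogram is unchanged when the local descriptors are rearranged, which is why the original arrangement has to be estimated.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest visual word and count the words."""
    # Pairwise Euclidean distances, shape (num_descriptors, num_words).
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=-1)
    words = dists.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))

# Hypothetical data: 5 visual words and 20 local descriptors of dimension 8.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(5, 8))
descriptors = rng.normal(size=(20, 8))

# Rearranging the descriptors stands in for moving patches around the image.
shuffled = descriptors[rng.permutation(len(descriptors))]

h1 = bovw_histogram(descriptors, codebook)
h2 = bovw_histogram(shuffled, codebook)
print(np.array_equal(h1, h2))
```

Because the two histograms are identical, the histogram alone cannot tell the two arrangements apart.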

    Remote Sensing of the Oceans

    This book covers different topics in the framework of remote sensing of the oceans. Latest research advancements and brand-new studies are presented that address the exploitation of remote sensing instruments and simulation tools to improve the understanding of ocean processes and enable cutting-edge applications with the aim of preserving the ocean environment and supporting the blue economy. Hence, this book provides a reference framework for state-of-the-art remote sensing methods that deal with the generation of added-value products and the geophysical information retrieval in related fields, including: Oil spill detection and discrimination; Analysis of tropical cyclones and sea echoes; Shoreline and aquaculture area extraction; Monitoring coastal marine litter and moving vessels; Processing of SAR, HF radar and UAV measurements

    Distinctive Fire and Smoke Detection with Self-Similar

    Deep-learning-based object detection has become dominant in practical artificial intelligence. However, some objects, such as fire and smoke, remain difficult to recognize because of their non-solid shapes. These objects nonetheless have a mathematical fractal property, self-similarity, that can relieve us from struggling with their varied shapes. To this end, we propose using the Hausdorff distance to evaluate self-similarity and accordingly tailor a loss function to improve the detection accuracy of fire and smoke. Moreover, we propose a general labeling criterion for these objects based on their geometrical features. Our experiments on commonly used baseline networks for object detection verify that our method is valid and improves detection accuracy by 2.23%.
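
For readers unfamiliar with the Hausdorff distance named in the abstract, the following is a minimal sketch of the standard symmetric Hausdorff distance between two finite point sets. It is not the authors' loss function, which the abstract does not specify. The point sets and function names are hypothetical and chosen for illustration.

```python
import numpy as np

def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return dists.min(axis=1).max()

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical shapes: a unit square, a half-size copy, and a shifted copy.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
scaled = 0.5 * square
shifted = square + np.array([3.0, 0.0])

print(hausdorff(square, square))
print(hausdorff(square, scaled))
print(hausdorff(square, shifted))
```

A small distance between a shape and a rescaled copy of itself is one possible way to quantify self-similarity. How the paper turns this into a training loss is not stated in the abstract.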

    Study on multi-SVM systems and their applications to pattern recognition

    System: New; Report number: Kō No. 3136; Degree type: Doctor (Engineering); Date conferred: 2010/7/12; Waseda University degree certificate number: New 541

    Autonomous Vehicles: MMW Radar Backscattering Modeling of Traffic Environment, Vehicular Communication Modeling, and Antenna Designs

    77 GHz millimeter-wave (mmWave) radar is an essential component among the many sensors required for autonomous navigation. High-fidelity simulation is indispensable for today's development of advanced automotive radar systems because radar simulation can accelerate the design and testing process and help people better understand and process the radar data. The main challenge in automotive radar simulation is to simulate the complex scattering behavior of various targets in real time, which is required for sensor fusion with other sensor simulations, e.g. optical image simulation. In this thesis, an asymptotic method based on a fast wideband physical optics (PO) calculation is developed and applied to obtain high-fidelity radar responses of traffic scenes and to generate the corresponding radar images of traffic targets. The targets include pedestrians, vehicles, and other stationary targets. To further accelerate the simulation to real time, a physics-based statistical approach is developed. The radar cross sections (RCS) of targets are fitted to statistical distributions, and the statistical parameters are then summarized as functions of range, aspect angle, and other attributes of the targets. For advanced radar with multiple transmitters and receivers, pixelated-scatterer statistical RCS models are developed to represent objects as extended targets and relax the far-field requirement. Real-time radar scene simulation software, referred to as the Michigan Automotive Radar Scene Simulator (MARSS), is developed based on the statistical models and integrated with physical 3D scene generation software (Unreal Engine 4). One of the major challenges in radar signal processing is to detect the angle of arrival (AOA) of multiple targets. A new analytic multiple-source AOA estimation algorithm that outperforms many well-known AOA estimation algorithms is developed and verified by experiments. 
Moreover, the statistical parameters of the RCS of targets and the radar images are used in target classification approaches based on machine learning methods. In realistic road traffic environments, foliage is commonly encountered and can potentially block the line-of-sight link. In the second part of the thesis, a non-line-of-sight (NLoS) vehicular propagation channel model for tree trunks at two vehicular communication bands (5.9 GHz and 60 GHz) is proposed. Both near-field and far-field models of scattering from a tree trunk are developed based on the modal expansion and surface current integral methods. To make the results quickly accessible and retractable, a macro model based on an artificial neural network (ANN) is proposed to fit the path loss calculated by the complex electromagnetic (EM) methods. In the third part of the thesis, two broadband (bandwidth > 50%) omnidirectional antenna designs are discussed to enable polarization diversity for next-generation communication systems. The first design is a compact horizontally polarized (HP) antenna, which contains four folded dipole radiators and utilizes their mutual coupling to enhance the bandwidth. The second is a circularly polarized (CP) antenna. It is composed of one ultra-wideband (UWB) monopole, the compact HP antenna, and a purpose-designed feeding network based on an asymmetric power divider. It has about 53% overlapping bandwidth for both impedance and axial ratio, with a peak RHCP gain of 0.9 dBi. PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163001/1/caixz_1.pd

    Poly-GAN: A Multi-Conditioned GAN for Multiple Tasks

    We present Poly-GAN, a novel conditional GAN architecture motivated by different image generation and manipulation applications, such as fashion synthesis, in which garments are automatically placed on images of human models in an arbitrary pose, and image inpainting, in which we try to recover a damaged image using the edges or a rough sketch of the image. While different applications use different GAN setups for image generation, we propose a single architecture for multiple applications with little to no change in the pipeline. Poly-GAN allows conditioning on multiple inputs and is suitable for many different tasks. Our novel architecture enforces the conditions at all layers of the encoder and utilizes skip connections from the coarse layers of the encoder to the respective layers of the decoder. Coarse layers are easier to manipulate for shape change using the conditions, which results in higher-level change in the result. Our system achieves state-of-the-art quantitative results on fashion synthesis, based on the Structural Similarity Index and Inception Score metrics, using the DeepFashion dataset. For the image inpainting task, we achieve competitive results compared to current state-of-the-art methods.