71 research outputs found

    Thermal-Infrared Remote Target Detection System for Maritime Rescue based on Data Augmentation with 3D Synthetic Data

    This paper proposes a thermal-infrared (TIR) remote target detection system for maritime rescue using deep learning and data augmentation. We established a self-collected TIR dataset consisting of multiple scenes imitating human rescue situations using a TIR camera (FLIR). Additionally, to address dataset scarcity and improve model robustness, we collected a synthetic dataset from a 3D game (ARMA3) to augment the data. However, a significant domain gap exists between synthetic TIR and real TIR images, so a proper domain adaptation algorithm is essential. To address this issue, we suggest a domain adaptation algorithm, based on a generative model, that translates 3D game images to real ones in a target-background separated manner. Furthermore, because remote TIR targets inherently suffer from unclear boundaries, a segmentation network with fixed-weight kernels at the head is proposed to improve the signal-to-noise ratio (SNR) and provide weak attention. Experimental results reveal that the network trained on augmented data consisting of translated synthetic and real TIR data outperforms the network trained on only real TIR data by a large margin. In addition, the proposed segmentation model surpasses the performance of state-of-the-art segmentation methods. Comment: 12 pages

    Retina-Inspired Carbon Nitride-Based Photonic Synapses for Selective Detection of UV Light

    Photonic synapses combine sensing and processing in a single device, so they are promising candidates to emulate the visual perception of a biological retina. However, photonic synapses with wavelength selectivity, which is a key property for visual perception, have not been developed so far. Herein, organic photonic synapses that selectively detect UV rays and process various optical stimuli are presented. The photonic synapses use carbon nitride (C3N4) as a UV-responsive floating-gate layer in a transistor geometry. C3N4 nanodots predominantly absorb UV light; this trait is the basis of UV selectivity in these photonic synapses. The presented devices consume only 18.06 fJ per synaptic event, which is comparable to the energy consumption of biological synapses. Furthermore, in situ modulation of exposure to UV light is demonstrated by integrating the devices with UV transmittance modulators. These smart systems can be further developed to combine detection and dose calculation to determine how and when to decrease UV transmittance for preventive health care.

    Low-temperature synthesis of LiFePO4 nanocrystals by solvothermal route

    LiFePO4 nanocrystals were synthesized at a very low temperature of 170°C using carbon nanoparticles by a solvothermal process in a polyol medium (diethylene glycol), without any post-synthesis heat treatment. The powder X-ray diffraction pattern of the LiFePO4 was indexed well to a pure orthorhombic olivine structure (space group: Pnma) with no undesirable impurities. The LiFePO4 nanocrystals synthesized at low temperature were mono-dispersed, carbon-mixed, plate-type nanoparticles with average length, width, and thickness of approximately 100 to 300 nm, 100 to 200 nm, and 50 nm, respectively. They also appeared to show considerably enhanced electrochemical properties compared with pristine LiFePO4. These results clearly indicate the effect of carbon in improving the reactivity and enabling the synthesis of LiFePO4 nanoparticles at a significantly lower temperature.

    Multi-View Attention Network for Visual Dialog

    Visual dialog is a challenging vision-language task in which a series of questions visually grounded in a given image are answered. To resolve the visual dialog task, a high-level understanding of various multimodal inputs (e.g., question, dialog history, and image) is required. Specifically, it is necessary for an agent to (1) determine the semantic intent of the question and (2) align question-relevant textual and visual contents among heterogeneous modality inputs. In this paper, we propose the Multi-View Attention Network (MVAN), which leverages multiple views of heterogeneous inputs based on attention mechanisms. MVAN effectively captures the question-relevant information from the dialog history with two complementary modules (i.e., Topic Aggregation and Context Matching), and builds multimodal representations through sequential alignment processes (i.e., Modality Alignment). Experimental results on the VisDial v1.0 dataset show the effectiveness of our proposed model, which outperforms previous state-of-the-art methods under both single-model and ensemble settings.