294 research outputs found

    Video-based Sign Language Recognition without Temporal Segmentation

    Millions of hearing-impaired people around the world routinely use some variant of sign language to communicate, so the automatic translation of sign language is meaningful and important. Currently, there are two sub-problems in Sign Language Recognition (SLR): isolated SLR, which recognizes signs word by word, and continuous SLR, which translates entire sentences. Existing continuous SLR methods typically use isolated SLR methods as building blocks, with an extra layer of preprocessing (temporal segmentation) and another layer of post-processing (sentence synthesis). Unfortunately, temporal segmentation is itself non-trivial and inevitably propagates errors into subsequent steps. Worse still, isolated SLR methods typically require laborious labeling of each word in a sentence separately, which severely limits the amount of attainable training data. To address these challenges, we propose a novel continuous sign recognition framework, the Hierarchical Attention Network with Latent Space (LS-HAN), which eliminates the temporal segmentation preprocessing step. The proposed LS-HAN consists of three components: a two-stream Convolutional Neural Network (CNN) for generating video feature representations, a Latent Space (LS) for bridging the semantic gap, and a Hierarchical Attention Network (HAN) for latent-space-based recognition. Experiments are carried out on two large-scale datasets, and the results demonstrate the effectiveness of the proposed framework. Comment: 32nd AAAI Conference on Artificial Intelligence (AAAI-18), Feb. 2-7, 2018, New Orleans, Louisiana, US

    Relay Selection Considering Successive Packets Transmission in Cooperative Communication Networks

    Relay selection is regarded as an effective method for improving the performance of cooperative communication systems. However, frequent relay selection can incur enormous control-message overhead and thereby degrade the performance of cooperative communication. To reduce the relay selection frequency, in this paper we propose a relay selection scheme that chooses the best relay by considering successive packet transmission. In this scheme, the best relay is selected, according to the data packet length, the data transmission rate, and the estimated channel state information (CSI), to maximize the number of successive packets transmitted while the given symbol-error-rate (SER) is maintained. Finally, numerical results show that the proposed relay selection scheme can support successive packet transmission in cooperative wireless networks, and that the maximum number of successive packets transmitted is affected by the network parameters, i.e., the data transmission rate, the packet length, and the Doppler frequency at a relay node.
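
The selection rule described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not say how SER is predicted from the CSI, so each relay is represented here by a hypothetical callable `ser_at(t)` that returns the predicted SER `t` seconds after selection. The names `max_successive_packets` and `select_relay`, and the linear SER models in the example, are likewise made up for illustration.

```python
from typing import Callable, Dict, Tuple


def max_successive_packets(
    ser_at: Callable[[float], float],  # hypothetical SER predictor (assumption)
    packet_bits: int,
    rate_bps: float,
    ser_target: float,
    max_packets: int = 10_000,
) -> int:
    """Count packets that can be sent back to back while SER stays within target."""
    packet_time = packet_bits / rate_bps  # seconds needed to send one packet
    count = 0
    # Check the predicted SER at the end of each additional packet.
    while count < max_packets and ser_at((count + 1) * packet_time) <= ser_target:
        count += 1
    return count


def select_relay(
    relays: Dict[str, Callable[[float], float]],
    packet_bits: int,
    rate_bps: float,
    ser_target: float,
) -> Tuple[str, int]:
    """Pick the relay that sustains the most successive packets."""
    counts = {
        name: max_successive_packets(ser_at, packet_bits, rate_bps, ser_target)
        for name, ser_at in relays.items()
    }
    best = max(counts, key=counts.get)
    return best, counts[best]


# Toy SER models (assumptions): SER grows linearly as the CSI estimate ages.
example_relays = {
    "A": lambda t: 1e-3 + 1.0 * t,
    "B": lambda t: 1e-3 + 0.4 * t,
}
best_relay, n_packets = select_relay(
    example_relays, packet_bits=1000, rate_bps=1e6, ser_target=0.0105
)
```

In this toy setup the relay whose predicted SER degrades more slowly is selected, which is how the scheme can reduce how often selection has to be repeated.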

    IA2U: A Transfer Plugin with Multi-Prior for In-Air Model to Underwater

    In underwater environments, variations in suspended particle concentration and turbidity cause severe image degradation, posing significant challenges for image enhancement (IE) and object detection (OD) tasks. In-air image enhancement and detection methods have made notable progress, but their application in underwater conditions is limited by the complexity and variability of these environments. Fine-tuning in-air models avoids much of the overhead of building an underwater model from scratch and can draw on more existing reference work. To address these issues, we design a transfer plugin with multiple priors for converting in-air models to underwater applications, named IA2U. IA2U enables efficient application in underwater scenarios, thereby improving performance in underwater IE and OD. IA2U integrates three types of underwater priors: the water type prior, which characterizes the degree of image degradation, such as color and visibility; the degradation prior, which focuses on differences in details and textures; and the sample prior, which considers the environmental conditions at the time of capture and the characteristics of the photographed object. Using a Transformer-like structure, IA2U employs these priors as query conditions, together with a joint task loss function, to achieve hierarchical enhancement of task-level underwater image features, thereby accounting for the requirements of the two different tasks, IE and OD. Experimental results show that IA2U combined with an in-air model can achieve superior performance in underwater image enhancement and object detection tasks. The code will be made publicly available.

    Why the processing of repeated targets are better than that of no repetition: evidence from easy-to-difficult and difficult-to-easy switching situations

    Background: Previous studies have found that the processing of repeated targets is easier than that of non-repeated targets. Although several theories attempt to explain this finding, the underlying mechanism remains unclear. In this study, we tried to address this issue by exploring the underlying brain responses during this process. Methods: Brain activity was recorded while thirty participants performed a Stroop task (Chinese version) in the MRI scanner. Using pseudo-random strategies, we created two types of switching conditions (easy-to-difficult; difficult-to-easy) and the corresponding repeating conditions. Results: In the difficult-to-easy switching condition, higher brain activation was found in the left precuneus than in repeating trials (the precuneus is thought to be related to attention demands). In the easy-to-difficult switching condition, higher brain activation was found in the precuneus, superior temporal gyrus, posterior cingulate cortex, and inferior frontal gyrus than in repeating trials (most of these regions are thought to be related to executive function). No overlapping brain regions were observed in the con_CON and incon_INCON conditions. Beta figures for the surviving clusters in the different conditions, and correlations between brain activation and switch cost, were calculated. Conclusions: The present study suggests that the longer response times in switching trials than in repeating trials are caused by the extra effort engaged in the switching process.

    Construction of hierarchical CuO/Cu2O@NiCo2S4 Nanowire arrays on copper foam for high performance supercapacitor electrodes

    Hierarchical copper oxide @ ternary nickel cobalt sulfide (CuO/Cu2O@NiCo2S4) core-shell nanowire arrays on Cu foam have been successfully constructed by a facile two-step strategy. Vertically aligned CuO/Cu2O nanowire arrays are first grown on Cu foam by one-step thermal oxidation of the foam, followed by electrodeposition of NiCo2S4 nanosheets on the surface of the CuO/Cu2O nanowires to form the CuO/Cu2O@NiCo2S4 core-shell nanostructures. Structural and morphological characterizations indicate that the average thickness of the NiCo2S4 nanosheets is ~20 nm and the diameter of the CuO/Cu2O core is ~50 nm. The electrochemical properties of the hierarchical composites as integrated binder-free electrodes for supercapacitors were evaluated by various electrochemical methods. The hierarchical composite electrodes achieve an ultrahigh specific capacitance of 3.186 F cm⁻² at 10 mA cm⁻², good rate capability (82.06% capacitance retention as the current density increases from 2 to 50 mA cm⁻²), and excellent cycling stability, with capacitance retention of 96.73% after 2000 cycles at 10 mA cm⁻². These results demonstrate the significance of the optimized design and fabrication of electrode materials with a more sufficient electrolyte-electrode interface, robust structural integrity, and fast ion/electron transfer. © 2017 by the authors. Licensee MDPI, Basel, Switzerland. Funding: National Key R&D Program of China [2016YFE0131200]; National Natural Science Foundation of China [21371057]; International Cooperation Project of Shanghai Municipal Science and Technology Committee [15520721100

    Ladder Loss for Coherent Visual-Semantic Embedding

    For visual-semantic embedding, existing methods normally treat the relevance between queries and candidates in a bipolar way: relevant or irrelevant. All "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their varying proximity to the query. This practice disregards relatively discriminative information and could lead to suboptimal ranking in the retrieval results and a poorer user experience, especially in the long-tail query scenario where a matching candidate may not necessarily exist. In this paper, we introduce a continuous variable to model the degree of relevance between queries and multiple candidates, and propose to learn a coherent embedding space in which candidates with higher relevance degrees are mapped closer to the query than those with lower relevance degrees. In particular, the new ladder loss is proposed by extending the triplet loss inequality to a more general inequality chain, which implements variable push-away margins according to the respective relevance degrees. In addition, a Coherent Score metric is proposed to better measure the ranking results, including those of "irrelevant" candidates. Extensive experiments on multiple datasets validate the efficacy of our proposed method, which achieves significant improvement over existing state-of-the-art methods. Comment: Accepted to AAAI-202
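
The idea of extending the triplet inequality to an inequality chain can be sketched as below. This is a minimal NumPy illustration under assumptions, not the paper's implementation: it assumes the continuous relevance degree has already been discretized into integer levels (0 = most relevant), and the function name `ladder_loss` and its `margins` and `weights` parameters are hypothetical. For each cut point in the chain, every candidate above the cut should be closer to the query than every candidate below it, by that cut's margin.

```python
import numpy as np


def ladder_loss(dist, level, margins, weights=None):
    """Sum of hinge terms over the chain d(level 0) < d(level 1) < ... (sketch).

    dist[i]    : distance from the query to candidate i in the embedding space
    level[i]   : integer relevance level of candidate i, 0 = most relevant
                 (the discretization is an assumption of this sketch)
    margins[l] : push-away margin between levels <= l and levels > l
    weights[l] : optional weight of the l-th term (defaults to 1)
    """
    dist = np.asarray(dist, dtype=float)
    level = np.asarray(level)
    n_cuts = int(level.max())  # number of cut points in the inequality chain
    if weights is None:
        weights = np.ones(n_cuts)
    total = 0.0
    for l in range(n_cuts):
        near = dist[level <= l]  # candidates that should rank above the cut
        far = dist[level > l]    # candidates that should rank below the cut
        # Triplet-style hinge on every (near, far) pair for this cut.
        hinge = np.maximum(0.0, near[:, None] - far[None, :] + margins[l])
        total += weights[l] * hinge.sum()
    return float(total)


# Candidates already ordered by relevance with enough separation: zero loss.
ordered = ladder_loss([0.1, 0.5, 0.9], [0, 1, 2], margins=[0.2, 0.2])
# The most relevant candidate is farther than a less relevant one: positive loss.
violated = ladder_loss([0.5, 0.4, 0.9], [0, 1, 2], margins=[0.2, 0.2])
```

With a single cut point (two levels) this reduces to an ordinary triplet-style hinge loss, which is the sense in which the chain generalizes the triplet inequality.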