123 research outputs found

    Attention Correctness in Neural Image Captioning

    Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision. But despite their popularity, the "correctness" of the implicitly-learned attention maps has only been assessed qualitatively by visualization of several examples. In this paper we focus on evaluating and improving the correctness of attention in neural image captioning models. Specifically, we propose a quantitative evaluation metric for the consistency between the generated attention maps and human annotations, using recently released datasets with alignment between regions in images and entities in captions. We then propose novel models with different levels of explicit supervision for learning attention maps during training. The supervision can be strong when alignment between regions and caption entities is available, or weak when only object segments and categories are provided. We show on the popular Flickr30k and COCO datasets that introducing supervision of attention maps during training solidly improves both attention correctness and caption quality, showing the promise of making machine perception more human-like. Comment: To appear in AAAI-17. See http://www.cs.jhu.edu/~cxliu/ for supplementary material.
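The abstract does not spell out the metric, so the following is only a minimal sketch of one plausible way to score consistency between an attention map and a human-annotated region: the share of attention mass that falls inside the region. The function name `attention_correctness` and the nested-list representation of the map and mask are assumptions made for illustration, not the authors' implementation.

```python
def attention_correctness(attention, region_mask):
    """Return the fraction of attention mass inside the annotated region.

    attention   -- 2D list of non-negative attention weights (hypothetical layout)
    region_mask -- 2D list of 0/1 flags marking the human-annotated region
    """
    total = sum(sum(row) for row in attention)
    if total <= 0:
        raise ValueError("attention map must have positive total mass")
    # Sum only the weights whose grid cell lies inside the annotated region.
    inside = sum(
        weight
        for att_row, mask_row in zip(attention, region_mask)
        for weight, flag in zip(att_row, mask_row)
        if flag
    )
    return inside / total


# Toy 2x2 example: the annotated entity occupies the bottom row.
toy_attention = [[0.1, 0.2], [0.3, 0.4]]
toy_mask = [[0, 0], [1, 1]]
score = attention_correctness(toy_attention, toy_mask)
```

Under this assumed definition, a score near 1 means the model attends almost entirely to the annotated region when generating the entity word, and a score near 0 means it attends elsewhere.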

    Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

    In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, we apply the m-RNN model to retrieval tasks for retrieving images or sentences, and achieve significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. The project page of this work is: www.stat.ucla.edu/~junhua.mao/m-RNN.html. Comment: Adds a simple strategy that significantly boosts performance on the image captioning task. More details are shown in Section 8 of the paper. The code and related data are available at https://github.com/mjhucla/mRNN-CR. arXiv admin note: substantial text overlap with arXiv:1410.109
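As a hedged illustration of "the probability distribution of generating a word given previous words and an image", the probability of a whole caption can be written with the chain rule. The symbols $w_n$ (the n-th word), $L$ (caption length), and $I$ (the image) are notation assumed here for illustration and are not taken from the abstract.

```latex
P(w_{1:L} \mid I) \;=\; \prod_{n=1}^{L} P\bigl(w_n \mid w_{1:n-1},\, I\bigr)
```

Sampling a caption then amounts to drawing each word in turn from the conditional factor on the right-hand side.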

    Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

    In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated according to this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

    Storax Protected Oxygen-Glucose Deprivation/Reoxygenation Induced Primary Astrocyte Injury by Inhibiting NF-κB Activation in vitro

    Stroke is the second leading cause of death and the leading cause of long-term disability in the world. There is an urgent unmet need to develop a range of neuroprotective strategies to restrain the damage that occurs in the hours and days following a stroke. Storax, a natural resin extracted from injured Liquidambar orientalis Mill., has been used to treat acute stroke in traditional Chinese medicine for many centuries. Storax has demonstrated neuroprotective effects in cerebrovascular diseases. However, the neuroprotective mechanisms activated by storax in ischemia/reperfusion-injured astrocytes have not been elucidated. In this study, we established an oxygen-glucose deprivation/reoxygenation (OGD/R)-induced astrocyte injury model to investigate the effects of storax on OGD/R-induced astrocyte injury and the potential mechanisms. Experimental results showed that storax reduced the expression of inflammatory cytokines and protected primary cortical astrocytes injured by OGD/R. Furthermore, storax inhibited NF-κB activation in astrocytes injured by OGD/R, and inhibition of NF-κB with Bay-11-7082 obscured the neuroprotective effects of storax. In conclusion, storax reduced the expression of inflammatory cytokines and protected primary cortical astrocytes injured by OGD/R, an effect partially mediated by inhibition of NF-κB signaling pathway activation.

    Minangkabau culture in the mirror of randai theatre plays

    The dissertation focuses on the culture of the Minangkabau ethnic group, whose home territory lies in the present-day Indonesian province of West Sumatra, yet whose diaspora, densest on the islands of Sumatra and Java, now covers the whole Indonesian archipelago and extends into neighbouring Malaysia, Singapore, and beyond. Minangkabau culture is viewed through the lens of the Minang folk theatre randai, which originated as part of the initiation of young Minang males and has over time become an important means of symbolic expression of Minang identity. Selected randai plays, recorded by the author in the field in their authentic social context, serve as examples for explaining key concepts that form the core of Minang culture, while the socio-historical conditions that shaped these concepts into their present form are also considered. The topics explained include: first, the social position and key social roles of women in a system of social organization based on the principles of matrilineal kinship, relative to the position and roles of men; second, the basic principles of Minang ethnopsychology, which underlie the gender differentiation of social roles and the differing practices of socialization; third, institutionalized temporary migration as part of male initiation; and finally, the relationship of adat (the system of norms, values, and patterns of behaviour...
    Institute of Ethnology, Faculty of Arts

    Comparing Evapotranspiration from Eddy Covariance Measurements, Water Budgets, Remote Sensing, and Land Surface Models over Canada

    This study compares six evapotranspiration (ET) products for Canada's landmass, namely, eddy covariance (EC) measurements; surface water budget ET; remote sensing ET from MODIS; and land surface model (LSM) ET from the Community Land Model (CLM), the Ecological Assimilation of Land and Climate Observations (EALCO) model, and the Variable Infiltration Capacity (VIC) model. The ET climatology over the Canadian landmass is characterized and the advantages and limitations of the datasets are discussed. The EC measurements have limited spatial coverage, which makes model validation at the national scale difficult. Water budget ET has the largest uncertainty because of data quality issues with precipitation in mountainous regions and in the north. MODIS ET shows relatively large uncertainty in cold seasons and sparsely vegetated regions. The LSM products cover the entire landmass and exhibit small differences in ET among them. Annual ET from the LSMs ranges from small negative values to over 600 mm across the landmass, with a countrywide average of 256 ± 15 mm. Seasonally, the countrywide average monthly ET varies from a low of about 3 mm in the four winter months (November–February) to 67 ± 7 mm in July. The ET uncertainty is scale dependent. Larger regions tend to have smaller uncertainties because positive and negative biases within the region offset one another. More observation networks and better quality controls are critical to improving ET estimates. Future techniques should also consider a hybrid approach that integrates the strengths of the various ET products to help reduce uncertainties in ET estimation.
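The claim that larger regions show smaller uncertainty because biases of opposite sign offset can be illustrated with a toy calculation. This is only a sketch: the bias values below are invented for illustration and are not data from the study, and the helper names `mean_abs_bias` and `regional_bias` are hypothetical.

```python
def mean_abs_bias(biases):
    """Average magnitude of the bias at the scale of individual grid cells."""
    return sum(abs(b) for b in biases) / len(biases)


def regional_bias(biases):
    """Magnitude of the bias after averaging all cells into one region."""
    return abs(sum(biases) / len(biases))


# Hypothetical ET biases (mm) for four grid cells; NOT values from the study.
cell_biases_mm = [20.0, -15.0, 5.0, -10.0]

cell_scale = mean_abs_bias(cell_biases_mm)    # typical error of a single cell
region_scale = regional_bias(cell_biases_mm)  # error of the regional average
```

With these invented numbers the positive and negative cell biases cancel when aggregated, so the regional figure is smaller than the typical single-cell error, which is the effect the abstract describes.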