Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Author: Huang Zhiheng
Mao Junhua
Wang Jiang
Xu Wei
Yang Yi
Yuille Alan
Publication date: 01/01/2015

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, we apply the m-RNN model to retrieval tasks for retrieving images or sentences, and achieve significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. The project page of this work is: www.stat.ucla.edu/~junhua.mao/m-RNN.html. Comment: Added a simple strategy that significantly boosts the performance of the image captioning task. More details are shown in Section 8 of the paper. The code and related data are available at https://github.com/mjhucla/mRNN-CR. arXiv admin note: substantial text overlap with arXiv:1410.109

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

A Study on the Bond Behavior of Corroded Reinforced Concrete Containing Recycled Aggregates

Author: Haifeng Yang
Liangsheng Lv
Yinghong Qin
Zhiheng Deng
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Crossref

Gastroesophageal Reflux Disease: Molecular Predictors in Neoplastic Progression of Barrett's Esophagus

Author: Abraham Khan
Fritz Francois
Liying Yang
Sam M. Serouya
Zhiheng Pei
Publication venue: 'IntechOpen'
Publication date: 16/03/2012

IntechOpen

Processing-structure-protrusion relationship of 3D Cu TSVs: control at the atomic scale

Author: Jinxin Liu
Paul Conway
Yang Liu
Zhiheng Huang
Publication date: 01/01/2019

A phase-field-crystal model is used to investigate the processing-structure-protrusion relationship of blind Cu through-silicon vias (TSVs) at the atomic scale. A higher temperature results in a larger TSV protrusion. Deformation via dislocation motion dominates at temperatures lower than around 300 °C, while both diffusional and dislocation creep occur at temperatures greater than around 300 °C. TSVs with smaller sidewall roughness Ra and wavelength λa exhibit larger protrusions. Moreover, different protrusion profiles are observed for TSVs with different grain structures. Both protrusions and intrusions are observed when a single grain is placed near the TSV top end, while the top surface protrudes near both edges when it contains more grains. Under symmetric loading, coalescence of the grains occurs near the top end, and a symmetric grain structure can accelerate this process. The strain distributions in TSVs are calculated, and the eigenstrain projection along the vertical direction can be considered an index to predict the TSV protrusion tendency.

Loughborough University Institutional Repository

Directory of Open Access Journals

Protrusion of Cu-TSV under different strain states

Author: Jinxin Liu
Paul Conway
Yang Liu
Zhiheng Huang
Publication date: 01/01/2019

A phase-field-crystal (PFC) model is used to investigate the protrusion of blind TSVs under different strain states. The direction of loading applied to the TSVs has an effect on the protrusion, which is closely related to the copper grains and their orientations at the TSV edges. A nonlinear relation between protrusion and strain rate has been found, which can be explained by different mechanisms of deformation. A higher strain occurring near the top end of the TSVs leads to a larger protrusion of the blind TSVs.

Loughborough University Institutional Repository

Crossref

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Author: Huang Zhiheng
Mao Junhua
Wang Jiang
Xu Wei
Yang Yi
Yuille Alan L.
Publication venue: Center for Brains, Minds and Machines (CBMM), arXiv
Publication date: 07/05/2015

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated according to this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

DSpace@MIT