Multimodal Convolutional Neural Networks for Matching Image and Sentence

Authors: Li Hang, Lu Zhengdong, Ma Lin, Shang Lifeng
Publication date: 29/08/2015

In this paper, we propose multimodal convolutional neural networks (m-CNNs) for matching image and sentence. Our m-CNN provides an end-to-end framework with convolutional architectures to exploit image representation, word composition, and the matching relations between the two modalities. More specifically, it consists of one image CNN encoding the image content, and one matching CNN learning the joint representation of image and sentence. The matching CNN composes words into different semantic fragments and learns the inter-modal relations between the image and the composed fragments at different levels, thereby fully exploiting the matching relations between image and sentence. Experimental results on benchmark databases for bidirectional image and sentence retrieval demonstrate that the proposed m-CNNs can effectively capture the information necessary for image and sentence matching. Specifically, our proposed m-CNNs achieve state-of-the-art performance for bidirectional image and sentence retrieval on the Flickr30K and Microsoft COCO databases.
Comment: Accepted by ICCV 201

arXiv.org e-Print Archive

Crossref

A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks

Authors: Chen Ran, Cho Minsik, Li Wei, Ma Yuzhe, Shang Fanhua, Yu Bei, Yu Wenjian
Publication date: 19/08/2019

Deep neural networks (DNNs) have achieved significant success in a variety of real-world applications, e.g., image classification. However, the large number of parameters in these networks limits their efficiency because of the large model size and the intensive computation. To address this issue, various approximation techniques have been investigated, which seek a lightweight network that accepts a small performance degradation in exchange for smaller model size or faster inference. Both low-rankness and sparsity are appealing properties for network approximation. In this paper we propose a unified framework to compress convolutional neural networks (CNNs) by combining these two properties, while taking the nonlinear activation into consideration. Each layer in the network is approximated by the sum of a structured sparse component and a low-rank component, which is formulated as an optimization problem. Then, an extended version of the alternating direction method of multipliers (ADMM) with guaranteed convergence is presented to solve the relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet and GoogLeNet with large image classification datasets. The results outperform previous work in terms of accuracy degradation, compression rate and speedup ratio. The proposed method is able to compress the model substantially (with up to 4.9x reduction of parameters) with little or no loss in accuracy.
Comment: 8 pages, 5 figures, 6 table
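The idea of approximating a layer by a sparse component plus a low-rank component can be illustrated with a small sketch. This is not the paper's method: it ignores the nonlinear activation, uses unstructured (element-wise) sparsity rather than structured sparsity, and replaces the extended ADMM solver with plain alternating projection. The function name `low_rank_plus_sparse` and its parameters `rank`, `keep`, and `iters` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def low_rank_plus_sparse(W, rank, keep, iters=20):
    """Heuristically approximate W by L + S.

    L has rank at most `rank`; S has at most `keep` nonzero entries.
    Simple alternating projection, NOT the ADMM scheme from the paper.
    """
    S = np.zeros(W.shape)
    for _ in range(iters):
        # Low-rank step: best rank-`rank` fit to W - S via truncated SVD.
        U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep only the `keep` largest-magnitude residuals.
        R = W - L
        S = np.zeros(W.shape)
        if keep > 0:
            idx = np.argpartition(np.abs(R).ravel(), -keep)[-keep:]
            S.flat[idx] = R.flat[idx]
    return L, S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 32))  # stand-in for one layer's weights
    L, S = low_rank_plus_sparse(W, rank=4, keep=100)
    rel_err = np.linalg.norm(W - L - S) / np.linalg.norm(W)
    # Stored parameters: two rank-4 factors plus the sparse entries.
    stored = 4 * (64 + 32) + np.count_nonzero(S)
    print(f"relative error {rel_err:.3f}, {stored} of {W.size} parameters")
```

Each step minimizes the Frobenius error over one component with the other held fixed, so the error never increases from one iteration to the next. A random matrix like the one in the demo has little low-rank structure, so the error stays large; trained weight matrices are the case the abstract has in mind.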

arXiv.org e-Print Archive

Crossref

Nanoparticles for post-infarct ventricular remodeling

Authors: Dong C., Ma A., Shang Lijun
Publication venue: 'Future Medicine Ltd'
Publication date: 24/10/2018

In recent years, tremendous progress has been made in the treatment of acute myocardial infarction (AMI), but pathological ventricular remodeling often causes survivors to suffer from fatal heart failure. Currently, there is no effective therapy to attenuate ventricular remodeling. Recently, nanoparticle-based drug delivery systems have been widely applied in biomedicine, especially in cancer and liver fibrosis, owing to their excellent physical, chemical, and biological properties. Nanoparticles are therefore expected to serve as delivery vehicles for small molecules, polypeptides, etc., to improve post-infarct ventricular remodeling. In this review, we summarize recent research in this fast-growing area and suggest further work that is needed.

Bradford Scholars