Multimodal Convolutional Neural Networks for Matching Image and Sentence
In this paper, we propose multimodal convolutional neural networks (m-CNNs)
for matching image and sentence. Our m-CNN provides an end-to-end framework
with convolutional architectures to exploit image representation, word
composition, and the matching relations between the two modalities. More
specifically, it consists of one image CNN encoding the image content, and one
matching CNN learning the joint representation of image and sentence. The
matching CNN composes words to different semantic fragments and learns the
inter-modal relations between image and the composed fragments at different
levels, thus fully exploiting the matching relations between image and sentence.
Experimental results on benchmark databases of bidirectional image and sentence
retrieval demonstrate that the proposed m-CNNs can effectively capture the
information necessary for image and sentence matching. Specifically, our
proposed m-CNNs for bidirectional image and sentence retrieval on Flickr30K and
Microsoft COCO databases achieve state-of-the-art performance.
Comment: Accepted by ICCV 201
A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks
Deep neural networks (DNNs) have achieved significant success in a variety of
real-world applications, e.g., image classification. However, the large number
of parameters in these networks restricts their efficiency due to
the large model size and the intensive computation. To address this issue,
various approximation techniques have been investigated, which seek a
lightweight network that trades little performance degradation for a smaller
model size or faster inference. Both low-rankness and sparsity are appealing
properties for the network approximation. In this paper we propose a unified
framework to compress the convolutional neural networks (CNNs) by combining
these two properties, while taking the nonlinear activation into consideration.
Each layer in the network is approximated by the sum of a structured sparse
component and a low-rank component, which is formulated as an optimization
problem. Then, an extended version of the alternating direction method of
multipliers (ADMM) with guaranteed convergence is presented to solve the
relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet
and GoogLeNet with large image classification datasets. The results outperform
previous work in terms of accuracy degradation, compression rate and speedup
ratio. The proposed method compresses the model substantially (with up to
4.9x reduction of parameters) with little or no loss of accuracy.
Comment: 8 pages, 5 figures, 6 tables
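The idea of approximating a layer by the sum of a sparse component and a low-rank component can be illustrated with a minimal sketch. This is not the paper's method: it uses a simple alternating heuristic (truncated SVD, then hard thresholding) instead of the extended ADMM solver, the sparsity is element-wise rather than structured, and the nonlinear activation is ignored. The function name `low_rank_plus_sparse` and its parameters `rank`, `num_nonzero`, and `num_iters` are hypothetical names chosen for illustration, and the weight matrix is synthetic.

```python
import numpy as np


def low_rank_plus_sparse(W, rank, num_nonzero, num_iters=20):
    """Approximate W by L + S, with rank(L) <= rank and at most
    num_nonzero nonzero entries in S (illustrative heuristic, not ADMM)."""
    S = np.zeros_like(W)
    for _ in range(num_iters):
        # Low-rank step: best rank-`rank` fit to W - S via truncated SVD.
        U, sigma, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank, :]
        # Sparse step: keep only the largest-magnitude entries of W - L.
        R = W - L
        S = np.zeros_like(W)
        if num_nonzero > 0:
            idx = np.argpartition(np.abs(R).ravel(), -num_nonzero)[-num_nonzero:]
            S.flat[idx] = R.flat[idx]
    return L, S


# Synthetic "layer": a rank-8 matrix plus 40 large isolated entries.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 32))
spikes = rng.choice(W.size, size=40, replace=False)
W.flat[spikes] += 10.0

L, S = low_rank_plus_sparse(W, rank=8, num_nonzero=40)

# Baseline for comparison: low-rank only (plain truncated SVD of W).
U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
L_only = (U[:, :8] * sigma[:8]) @ Vt[:8, :]

err_combined = np.linalg.norm(W - L - S)
err_low_rank_only = np.linalg.norm(W - L_only)
print(err_combined, err_low_rank_only)
```

Because each step exactly minimizes the Frobenius error over one component while the other is held fixed, the combined error never exceeds that of the low-rank-only baseline at the same rank. Stored as two factors plus the sparse entries, the approximation here needs 8 x (64 + 32) + 40 values instead of 64 x 32.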
Nanoparticles for post-infarct ventricular remodeling
In recent years, tremendous progress has been made in the treatment of acute myocardial infarction (AMI), but pathological ventricular remodeling often causes survivors to suffer from fatal heart failure. Currently, there is no effective therapy to attenuate ventricular remodeling. Recently, nanoparticle-based drug delivery systems have been widely applied in biomedicine, especially in cancer and liver fibrosis, owing to their excellent physical, chemical, and biological properties. Nanoparticles are therefore expected to serve as delivery vehicles for small molecules, polypeptides, etc., to improve post-infarct ventricular remodeling. In this review, we summarize recent research in this fast-growing area and suggest further work that is needed.