DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices
Deploying deep neural networks on mobile devices is a challenging task.
Current model compression methods such as matrix decomposition effectively
reduce the deployed model size, but still cannot satisfy real-time processing
requirements. This paper first discovers that the major obstacle is the
excessive execution time of non-tensor layers such as pooling and normalization
without tensor-like trainable parameters. This motivates us to design a novel
acceleration framework, DeepRebirth, which "slims" existing consecutive and
parallel non-tensor and tensor layers. The layer slimming is executed at
different substructures: (a) streamline slimming, which merges consecutive
non-tensor and tensor layers vertically; (b) branch slimming, which merges
non-tensor and tensor branches horizontally. The proposed optimization
operations significantly accelerate the model execution and also greatly reduce
the run-time memory cost since the slimmed model architecture contains fewer
hidden layers. To avoid accuracy loss as far as possible, the parameters in the
newly generated layers are learned with layer-wise fine-tuning based on both
theoretical analysis and empirical verification. As observed in the experiment,
DeepRebirth achieves more than 3x speed-up and 2.5x run-time memory saving on
GoogLeNet with only a 0.4% drop in top-5 accuracy on ImageNet. Furthermore, when
combined with other model compression techniques, DeepRebirth offers an
average inference time of 65ms on the CPU of a Samsung Galaxy S6 with 86.5% top-5
accuracy, 14% faster than SqueezeNet, which has a top-5 accuracy of only 80.5%.
Comment: AAAI 201
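To make the idea of merging a non-tensor layer into a neighboring tensor layer concrete, here is a minimal sketch of the simplest analytic case: folding an inference-time per-channel normalization into the linear layer before it. This is not the DeepRebirth procedure, which according to the abstract relearns the merged layer's parameters by layer-wise fine-tuning; the closed-form fold below is a different and simpler technique, and all function and parameter names are hypothetical.

```python
def linear(x, W, b):
    """Compute y = W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def normalize(y, mean, std, gamma, beta):
    """Per-channel inference-time normalization: gamma * (y - mean) / std + beta."""
    return [g * (yi - m) / s + bt
            for yi, m, s, g, bt in zip(y, mean, std, gamma, beta)]


def fold(W, b, mean, std, gamma, beta):
    """Return (W_f, b_f) so that linear(x, W_f, b_f) equals
    normalize(linear(x, W, b), mean, std, gamma, beta) for every x."""
    scale = [g / s for g, s in zip(gamma, std)]
    # Scale each output row of W by that channel's gamma / std.
    W_f = [[sc * w for w in row] for row, sc in zip(W, scale)]
    # Shift the bias by the mean, rescale it, then add beta.
    b_f = [sc * (bi - m) + bt for sc, bi, m, bt in zip(scale, b, mean, beta)]
    return W_f, b_f
```

After folding, one layer does the work of two at inference time, which is the kind of saving the abstract attributes to having fewer layers in the slimmed model.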
Document generality: its computation for ranking
The increased variety of information makes it critical to retrieve documents that are not only relevant but also broad enough to cover as many different aspects of a topic as possible. The increased variety of users also makes it critical to retrieve documents that are jargon-free and easy to understand rather than specific technical materials. In this paper, we propose a new concept, document generality computation. The generality of a document is of fundamental importance to information retrieval. Document generality is the state or quality of a document being general. We compute document generality with a domain-ontology method that analyzes the scope and semantic cohesion of the concepts appearing in the text. For test purposes, the proposed approach is then applied to improve document ranking in bio-medical information retrieval. The retrieved documents are re-ranked by a combined score of similarity and the closeness of each document's generality to that of the query. The experiments show that our method works on a large-scale bio-medical text corpus, OHSUMED (Hersh, Buckley, Leone & Hickam 1994), a subset of the MEDLINE collection containing 348,566 medical journal references and 101 test queries, with encouraging performance.
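The re-ranking step described above can be sketched as follows. The abstract does not say how similarity and generality closeness are combined, so this sketch assumes a simple weighted sum with a hypothetical weight `alpha`, and assumes generality values lie in [0, 1] so that closeness can be taken as one minus the absolute difference. The field names are illustrative only.

```python
def rerank(docs, query_generality, alpha=0.7):
    """Re-rank documents by a combined score (assumed form, not the paper's):
    alpha * similarity + (1 - alpha) * closeness of generality to the query."""
    def combined(doc):
        # Closeness is 1.0 when document and query generality match exactly.
        closeness = 1.0 - abs(doc["generality"] - query_generality)
        return alpha * doc["similarity"] + (1.0 - alpha) * closeness
    return sorted(docs, key=combined, reverse=True)
```

With `alpha=1.0` the ordering reduces to plain similarity ranking; lowering `alpha` lets a document whose generality matches the query overtake a slightly more similar but mismatched one.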
Learning to Diversify Web Search Results with a Document Repulsion Model
Search diversification (also called diversity search) is an important approach to tackling the query ambiguity problem in information retrieval. It aims to diversify search results that are originally ranked by their probability of relevance to a given query, re-ranking them to cover as many different aspects (or subtopics) of the query as possible. Most existing diversity search models heuristically balance the relevance ranking and the diversity ranking, yet lack an efficient learning mechanism to reach an optimized parameter setting. To address this problem, we propose a learning-to-diversify approach that can directly optimize search diversification performance (in terms of any effectiveness metric). We first extend the ranking function of a widely used learning-to-rank framework, LambdaMART, so that the extended ranking function can correlate relevance and diversity indicators. Furthermore, we develop an effective learning algorithm, the Document Repulsion Model (DRM), to train the ranking function based on a Document Repulsion Theory (DRT). DRT assumes that two result documents covering similar query aspects (i.e., subtopics) should be mutually repulsive, for the purpose of search diversification. Accordingly, the proposed DRM exerts a repulsion force between each pair of similar documents in the learning process, and includes the diversity effectiveness metric to be optimized as part of the loss function. Existing learning-based diversity search methods often involve an iterative sequential selection process during ranking, which is computationally complex and time-consuming to train, whereas our proposed learning strategy can largely reduce the time cost. Extensive experiments are conducted on the TREC diversity track data (2009, 2010 and 2011). The results demonstrate that our model significantly outperforms a number of baselines in terms of effectiveness and robustness.
Further, an efficiency analysis shows that the proposed DRM has a lower computational complexity than the state-of-the-art learning-to-diversify methods.
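The goal of covering as many query subtopics as possible can be illustrated with subtopic recall at rank k, one simple diversity measure. The abstract does not state which effectiveness metrics the paper optimizes, so this is an illustration of the objective only, not the paper's loss function; the function name and data layout are hypothetical.

```python
def subtopic_recall(ranking, doc_subtopics, num_subtopics, k):
    """Fraction of a query's subtopics covered by the top-k ranked documents."""
    covered = set()
    for doc_id in ranking[:k]:
        # Documents with no judged subtopics contribute nothing.
        covered.update(doc_subtopics.get(doc_id, ()))
    return len(covered) / num_subtopics
```

A ranking that places two documents on the same subtopics at the top scores lower than one that interleaves documents on different subtopics, which is the behavior a repulsion between similar documents is meant to encourage.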
