Dual Skipping Networks
Inspired by recent neuroscience studies on the left-right asymmetry of the
human brain in processing low and high spatial frequency information, this
paper introduces a dual skipping network that carries out coarse-to-fine
object categorization. The network has two branches that simultaneously handle
coarse and fine-grained classification tasks. Specifically, we propose a
layer-skipping mechanism that learns a gating network to predict which layers
to skip at test time. This mechanism gives the network considerable
flexibility and efficiency in practice. Evaluations conducted on several
widely used coarse-to-fine object categorization benchmarks show that our
proposed network achieves promising results.
Comment: CVPR 2018 (poster); fix typos
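As a rough illustration of the idea, the sketch below shows a residual block whose execution is controlled by a small learned gate; the module name, gate architecture, and 0.5 threshold are assumptions for illustration, not the paper's exact design.

```python
# A minimal sketch of a layer-skipping block with a learned gate, assuming
# a ResNet-style backbone. GatedSkipBlock and its gate design are
# illustrative, not the paper's reported architecture.
import torch
import torch.nn as nn

class GatedSkipBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # Tiny gating network: global pooling -> linear -> execute probability.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = self.gate(x)  # per-sample probability of executing the block
        if not self.training:
            # Hard decision at test time: skip the block when p < 0.5.
            # (A real deployment would avoid computing the body for
            # skipped samples; here we mask for simplicity.)
            keep = (p > 0.5).float().view(-1, 1, 1, 1)
            return x + keep * self.body(x)
        # Soft gate during training keeps the skip decision differentiable.
        return x + p.view(-1, 1, 1, 1) * self.body(x)
```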
Unifying and Merging Well-trained Deep Neural Networks for Inference Stage
We propose a novel method for merging convolutional neural networks at the
inference stage. Given two well-trained networks, possibly with different
architectures and handling different tasks, our method aligns the layers of
the original networks and merges them into a unified model by sharing the
representative codes of their weights. The shared weights are then re-trained
to fine-tune the performance of the merged model. The proposed method
effectively produces a compact model that can run the original tasks
simultaneously on resource-limited devices. Because it preserves the general
architectures and leverages the co-used weights of well-trained networks,
substantial training overhead is avoided, shortening system development time.
Experimental results demonstrate satisfactory performance and validate the
effectiveness of the method.
Comment: To appear in the 27th International Joint Conference on Artificial
Intelligence and the 23rd European Conference on Artificial Intelligence,
2018 (IJCAI-ECAI 2018).
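To illustrate what sharing "representative codes of weights" between aligned layers could look like, here is a minimal sketch that quantizes the filters of two aligned convolutional layers onto a single k-means codebook; the clustering granularity, codebook size, and the assumption that aligned layers share filter shape are ours, not necessarily the paper's.

```python
# A minimal sketch of weight sharing via a joint codebook, assuming
# k-means quantization over flattened filters of two already-aligned
# conv layers with identical filter shape. merge_layer_weights is an
# illustrative helper, not the paper's algorithm.
import numpy as np
from sklearn.cluster import KMeans

def merge_layer_weights(w_a: np.ndarray, w_b: np.ndarray, n_codes: int = 64):
    """w_a, w_b: weights of shape (out_ch, in_ch, k, k) from aligned layers.

    Returns the shared codebook, plus per-layer index maps and the
    reconstructed weights; the codebook would then be re-trained
    jointly on both tasks to recover accuracy.
    """
    flat = np.concatenate([w.reshape(len(w), -1) for w in (w_a, w_b)])
    km = KMeans(n_clusters=n_codes, n_init=10).fit(flat)
    codes_a = km.labels_[: len(w_a)]
    codes_b = km.labels_[len(w_a):]
    # Rebuild each layer from the shared codebook entries.
    rec_a = km.cluster_centers_[codes_a].reshape(w_a.shape)
    rec_b = km.cluster_centers_[codes_b].reshape(w_b.shape)
    return km.cluster_centers_, (codes_a, rec_a), (codes_b, rec_b)
```

Both layers now index into one table of filter codes, so the merged model stores n_codes filters plus two small index maps instead of two full weight tensors.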
Byte-based Language Identification with Deep Convolutional Networks
We report on our system for the shared task on discriminating between similar
languages (DSL 2016). The system uses only byte representations in a deep
residual network (ResNet). The system, named ResIdent, is trained only on the
data released with the task (closed training). We obtain 84.88% accuracy on
subtask A, 68.80% accuracy on subtask B1, and 69.80% accuracy on subtask B2.
Large differences in accuracy on development data can be observed after
relatively minor changes to the network's architecture and hyperparameters;
we therefore expect fine-tuning of these parameters to yield higher accuracies.
Comment: 7 pages. Adapted reviewer comments. arXiv admin note: text overlap
with arXiv:1609.0705
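For concreteness, the following is a minimal sketch of a byte-level residual classifier in the spirit of ResIdent; the depth, width, pooling, and classifier head are illustrative assumptions rather than the reported configuration.

```python
# A minimal sketch of a byte-level residual classifier, assuming an
# embedding over the 256 possible byte values followed by 1-D residual
# conv blocks. ResIdentSketch and its hyperparameters are illustrative.
import torch
import torch.nn as nn

class ByteResBlock(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv1d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv1d(ch, ch, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard residual connection around two 1-D convolutions.
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class ResIdentSketch(nn.Module):
    def __init__(self, n_classes: int, ch: int = 128, depth: int = 4):
        super().__init__()
        self.embed = nn.Embedding(256, ch)  # one embedding per byte value
        self.blocks = nn.Sequential(*[ByteResBlock(ch) for _ in range(depth)])
        self.head = nn.Linear(ch, n_classes)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        # byte_ids: (batch, seq_len) integers in [0, 255].
        h = self.embed(byte_ids).transpose(1, 2)  # (batch, ch, seq_len)
        h = self.blocks(h).mean(dim=2)            # global average pooling
        return self.head(h)                       # per-language logits
```

Operating on raw bytes sidesteps tokenization and character encoding issues entirely, which is part of what makes such systems sensitive to architecture and hyperparameter choices.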