Auto-Sizing Neural Networks: With Applications to n-gram Language Models
Neural networks have been shown to improve performance across a range of
natural-language tasks. However, designing and training them can be
complicated. Frequently, researchers resort to repeated experimentation to pick
optimal settings. In this paper, we address the issue of choosing the correct
number of units in hidden layers. We introduce a method for automatically
adjusting network size by pruning out hidden units through $\ell_{\infty,1}$
and $\ell_{2,1}$ regularization. We apply this method to language modeling and
demonstrate its ability to correctly choose the number of hidden units while
maintaining perplexity. We also include these models in a machine translation
decoder and show that these smaller neural models maintain the significant
improvements of their unpruned versions. Comment: EMNLP 201
Transfer Learning for Speech and Language Processing
Transfer learning is a vital technique that generalizes models trained for
one setting or task to other settings or tasks. For example, in speech
recognition, an acoustic model trained for one language can be used to
recognize speech in another language, with little or no re-training data.
Transfer learning is closely related to multi-task learning (cross-lingual vs.
multilingual), and is traditionally studied under the name of `model adaptation'.
Recent advances in deep learning show that transfer learning becomes much
easier and more effective with high-level abstract features learned by deep
models, and the `transfer' can be conducted not only between data distributions
and data types, but also between model structures (e.g., shallow nets and deep
nets) or even model types (e.g., Bayesian models and neural models). This
review paper summarizes some recent prominent research towards this direction,
particularly for speech and language processing. We also report some results
from our group and highlight the potential of this very interesting research
field. Comment: 13 pages, APSIPA 201