3,455 research outputs found
Dual Rectified Linear Units (DReLUs): A Replacement for Tanh Activation Functions in Quasi-Recurrent Neural Networks
In this paper, we introduce a novel type of Rectified Linear Unit (ReLU),
called a Dual Rectified Linear Unit (DReLU). A DReLU, which comes with an
unbounded positive and negative image, can be used as a drop-in replacement for
a tanh activation function in the recurrent step of Quasi-Recurrent Neural
Networks (QRNNs) (Bradbury et al. (2017)). Similar to ReLUs, DReLUs are less
prone to the vanishing gradient problem, they are noise robust, and they induce
sparse activations.
We independently reproduce the QRNN experiments of Bradbury et al. (2017) and
compare our DReLU-based QRNNs with the original tanh-based QRNNs and Long
Short-Term Memory networks (LSTMs) on sentiment classification and word-level
language modeling. Additionally, we evaluate on character-level language
modeling, showing that we are able to stack up to eight QRNN layers with
DReLUs, thus making it possible to improve the current state-of-the-art in
character-level language modeling over shallow architectures based on LSTMs.
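The abstract does not spell out the activation's definition; a minimal NumPy sketch, assuming the dual form DReLU(a, b) = max(0, a) - max(0, b), i.e. one rectifier per input stream, which yields the unbounded positive and negative image mentioned above:

```python
import numpy as np

def drelu(a, b):
    """Dual Rectified Linear Unit (sketch).

    Combines two pre-activations into a single output whose image is
    unbounded in both directions, unlike a plain ReLU. The exact form
    max(0, a) - max(0, b) is an assumption, not quoted from the abstract.
    """
    return np.maximum(0.0, a) - np.maximum(0.0, b)
```

Like a ReLU, each branch passes gradients unattenuated when active, and the output is exactly zero when both pre-activations are non-positive, which induces the sparsity noted above.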
Psychosocial risk factors for sick leave at the individual and organizational level : a multilevel analysis
Psychosocial work and home stressors predict sickness absence from work
The normalized freebase distance
In this paper, we propose the Normalized Freebase Distance (NFD), a new measure for determining semantic concept relatedness that is based on similar principles as the Normalized Web Distance (NWD). We illustrate that the NFD is more effective when comparing ambiguous concepts.
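The abstract says the NFD follows the same principles as the Normalized Web Distance; a hedged sketch of that underlying formula, where the NFD would substitute Freebase entity statistics for web page counts (the count arguments here are illustrative assumptions):

```python
import math

def normalized_web_distance(fx, fy, fxy, n):
    """Normalized Web Distance between concepts x and y.

    fx, fy : occurrence counts of each concept
    fxy    : co-occurrence count of both concepts
    n      : total size of the index

    NWD(x, y) = (max(log fx, log fy) - log fxy)
                / (log n - min(log fx, log fy))

    Identical, fully co-occurring concepts get distance 0; rarely
    co-occurring concepts get larger distances.
    """
    num = max(math.log(fx), math.log(fy)) - math.log(fxy)
    den = math.log(n) - min(math.log(fx), math.log(fy))
    return num / den
```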
Socioeconomic disparities in diet vary according to migration status among adolescents in Belgium
Little information concerning social disparities in adolescent dietary habits is currently available, especially regarding migration status. The aim of the present study was to estimate socioeconomic disparities in dietary habits of school adolescents from different migration backgrounds. In the 2014 cross-sectional Health Behavior in School-Aged Children survey in Belgium, food consumption was estimated using a self-administrated short food frequency questionnaire. In total, 19,172 school adolescents aged 10-19 years were included in analyses. Multilevel multiple binary and multinomial logistic regressions were performed, stratified by migration status (natives, 2nd- and 1st-generation immigrants). Overall, immigrants more frequently consumed both healthy and unhealthy foods. Indeed, 32.4% of 1st-generation immigrants, 26.5% of 2nd-generation immigrants, and 16.7% of natives consumed fish two days a week. Compared to those having a high family affluence scale (FAS), adolescents with a low FAS were more likely to consume chips and fries once a day (vs. <once a day: Natives aRRR = 1.39 (95%CI: 1.12-1.73); NS in immigrants). Immigrants at schools in Flanders were less likely than those in Brussels to consume sugar-sweetened beverages 2-6 days a week (vs. once a week: Natives aRRR = 1.86 (95%CI: 1.32-2.62); 2nd-generation immigrants aRRR = 1.52 (1.11-2.09); NS in 1st-generation immigrants). The migration gradient observed here underlines a process of acculturation. Narrower socioeconomic disparities in immigrant dietary habits compared with natives suggest that such habits are primarily defined by culture of origin. Nutrition interventions should thus include cultural components of dietary habits
Improving language modeling using densely connected recurrent neural networks
In this paper, we introduce the novel concept of densely connected layers
into recurrent neural networks. We evaluate our proposed architecture on the
Penn Treebank language modeling task. We show that we can obtain similar
perplexity scores with six times fewer parameters compared to a standard
stacked 2-layer LSTM model trained with dropout (Zaremba et al. 2014). In
contrast with the current usage of skip connections, we show that densely
connecting only a few stacked layers with skip connections already yields
significant perplexity reductions.Comment: Accepted at Workshop on Representation Learning, ACL201
Part-of-speech tagging of Twitter microposts only using distributed word representations and a neural network
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Character-level features are currently used in different neural network-based
natural language processing algorithms. However, little is known about the
character-level patterns those models learn. Moreover, models are often
compared only quantitatively while a qualitative analysis is missing. In this
paper, we investigate which character-level patterns neural networks learn and
if those patterns coincide with manually-defined word segmentations and
annotations. To that end, we extend the contextual decomposition technique
(Murdoch et al. 2018) to convolutional neural networks which allows us to
compare convolutional neural networks and bidirectional long short-term memory
networks. We evaluate and compare these models for the task of morphological
tagging on three morphologically different languages and show that these models
implicitly discover understandable linguistic rules. Our implementation can be
found at https://github.com/FredericGodin/ContextualDecomposition-NLP . Comment: Accepted at EMNLP 201
A Simple Geometric Method for Cross-Lingual Linguistic Transformations with Pre-trained Autoencoders
Powerful sentence encoders trained for multiple languages are on the rise.
These systems are capable of embedding a wide range of linguistic properties
into vector representations. While explicit probing tasks can be used to verify
the presence of specific linguistic properties, it is unclear whether the
vector representations can be manipulated to indirectly steer such properties.
We investigate the use of a geometric mapping in embedding space to transform
linguistic properties, without any tuning of the pre-trained sentence encoder
or decoder. We validate our approach on three linguistic properties using a
pre-trained multilingual autoencoder and analyze the results in both
monolingual and cross-lingual settings.
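A minimal sketch of one such geometric mapping, assuming a simple centroid-offset translation between two values of a linguistic property (the abstract does not specify the exact mapping, so this is illustrative, not the paper's method):

```python
import numpy as np

def property_offset(src_embeddings, tgt_embeddings):
    """Estimate a direction in embedding space that maps one value of a
    linguistic property to another (e.g. singular -> plural) as the
    difference of the two class centroids. A hypothetical, minimal
    instance of a geometric mapping; no encoder tuning is involved.
    """
    return tgt_embeddings.mean(axis=0) - src_embeddings.mean(axis=0)

def transform(sentence_vec, offset):
    """Steer a sentence embedding toward the target property value by
    translating it along the estimated offset direction."""
    return sentence_vec + offset
```

The transformed vector can then be fed to the frozen decoder; if the property is encoded linearly, the decoded sentence should exhibit the target property while the encoder and decoder stay untouched.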
Explaining character-aware neural networks for word-level prediction : do they discover linguistic rules?