GIRNet: Interleaved Multi-Task Recurrent State Sequence Models
In several natural language tasks, labeled sequences are available in
separate domains (say, languages), but the goal is to label sequences with
mixed domains (such as code-switched text). Alternatively, we may have models
for labeling whole passages (say, with sentiments) that we would like to exploit
toward better position-specific label inference (say, target-dependent
sentiment annotation). A key characteristic shared across such tasks is that
different positions in a primary instance can benefit from different 'experts'
trained from auxiliary data, but labeled primary instances are scarce, and
labeling the best expert for each position entails unacceptable cognitive
burden. We propose GIRNet, a unified position-sensitive multi-task recurrent
neural network (RNN) architecture for such applications. Auxiliary and primary
tasks need not share training instances. Auxiliary RNNs are trained over
auxiliary instances. A primary instance is also submitted to each auxiliary
RNN, but their state sequences are gated and merged into a novel composite
state sequence tailored to the primary inference task. Our approach is in sharp
contrast to recent multi-task networks such as the cross-stitch and sluice
networks, which do not control state transfer at such fine granularity. We
demonstrate the superiority of GIRNet using three applications: sentiment
classification of code-switched passages, part-of-speech tagging of
code-switched text, and target position-sensitive annotation of sentiment in
monolingual passages. In all cases, we establish new state-of-the-art
performance beyond recent competitive baselines.
Comment: Accepted at AAAI 201
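The per-position gating of auxiliary state sequences described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact equations: the function names, the linear scoring of experts, and the softmax gate form are all assumptions.

```python
import numpy as np

def merge_auxiliary_states(primary_states, aux_states_list, gate_weights):
    """Merge auxiliary RNN state sequences into a composite sequence via
    per-position gates, in the spirit of GIRNet.

    primary_states: (T, d) hidden states of the primary RNN
    aux_states_list: list of K arrays, each (T, d), produced by running
                     the auxiliary RNNs over the same primary instance
    gate_weights: (K, d) one scoring vector per auxiliary 'expert'
                  (a hypothetical parameterization for illustration)
    """
    # score each expert at each position from the primary state
    scores = primary_states @ gate_weights.T              # (T, K)
    # softmax over experts -> per-position mixture weights
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha = e / e.sum(axis=1, keepdims=True)              # (T, K)
    aux = np.stack(aux_states_list, axis=1)               # (T, K, d)
    # composite state: position-wise convex combination of expert states
    return (alpha[..., None] * aux).sum(axis=1)           # (T, d)
```

With this form, each position can lean on a different expert: the gate depends on the primary state at that position, so the mixture weights vary along the sequence rather than being fixed per task.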
A Convolutional Neural Network for Modelling Sentences
The ability to accurately represent sentences is central to language
understanding. We describe a convolutional architecture dubbed the Dynamic
Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of
sentences. The network uses Dynamic k-Max Pooling, a global pooling operation
over linear sequences. The network handles input sentences of varying length
and induces a feature graph over the sentence that is capable of explicitly
capturing short and long-range relations. The network does not rely on a parse
tree and is easily applicable to any language. We test the DCNN in four
experiments: small scale binary and multi-class sentiment prediction, six-way
question classification and Twitter sentiment prediction by distant
supervision. The network achieves excellent performance in the first three
tasks and a greater than 25% error reduction in the last task with respect to
the strongest baseline.
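The k-max pooling operation at the heart of the DCNN can be sketched as below: for each feature dimension, keep the k largest activations over the sequence, preserving their original order. The function names and the NumPy formulation are illustrative assumptions; `dynamic_k` mirrors the paper's linear schedule in which the pooling width shrinks with layer depth but never falls below a fixed top-level value.

```python
import numpy as np

def k_max_pooling(features, k):
    """For each feature dimension, select the k largest values along the
    sequence axis, keeping them in their original sequence order."""
    seq_len, n_dims = features.shape
    pooled = np.empty((k, n_dims))
    for d in range(n_dims):
        # indices of the k largest activations, sorted to preserve order
        idx = np.sort(np.argpartition(features[:, d], -k)[-k:])
        pooled[:, d] = features[idx, d]
    return pooled

def dynamic_k(layer, total_layers, k_top, seq_len):
    """Dynamic pooling width: a linear function of depth, floored at k_top."""
    return max(k_top, int(np.ceil((total_layers - layer) / total_layers * seq_len)))
```

Because k depends only on the sentence length and the layer index, the network maps variable-length input sentences to a fixed-size representation at the top layer without requiring a parse tree.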
Improving Distributed Representations of Tweets - Present and Future
Unsupervised representation learning for tweets is an important research
field which helps in solving several business applications such as sentiment
analysis, hashtag prediction, paraphrase detection and microblog ranking. A
good tweet representation learning model must handle the idiosyncratic nature
of tweets which poses several challenges such as short length, informal words,
unusual grammar and misspellings. However, there is a lack of prior work which
surveys the representation learning models with a focus on tweets. In this
work, we organize the models based on their objective functions, which aids
understanding of the literature. We also provide interesting future directions,
which we believe are fruitful in advancing this field by building high-quality
tweet representation learning models.
Comment: To be presented in the Student Research Workshop (SRW) at ACL 201