3,600 research outputs found
Constrained structure of ancient Chinese poetry facilitates speech content grouping
Ancient Chinese poetry consists of structured language that deviates from ordinary language usage [1, 2]; its poetic genres impose unique combinatory constraints on linguistic elements [3]. How does the constrained poetic structure facilitate speech segmentation when common linguistic [4, 5, 6, 7, 8] and statistical cues [5, 9] are unreliable for listeners of poems? We generated artificial Jueju, which arguably has the most constrained structure in ancient Chinese poetry, and presented each poem twice as an isochronous sequence of syllables to native Mandarin speakers while conducting magnetoencephalography (MEG) recording. We found that listeners deployed their prior knowledge of Jueju to build the line structure and to establish the conceptual flow of Jueju. We also found a previously unreported phase precession phenomenon indicating predictive processes of speech segmentation: the neural phase advanced faster after listeners acquired knowledge of incoming speech. The statistical co-occurrence of monosyllabic words in Jueju negatively correlated with speech segmentation, which provides an alternative perspective on how statistical cues facilitate speech segmentation. Our findings suggest that constrained poetic structures serve as a temporal map for listeners to group speech contents and to predict incoming speech signals. Listeners can parse speech streams by using not only grammatical and statistical cues but also their prior knowledge of the form of language.
Learning a Recurrent Visual Representation for Image Caption Generation
In this paper we explore the bi-directional mapping between images and their
sentence-based descriptions. We propose learning this mapping using a recurrent
neural network. Unlike previous approaches that map both sentences and images
to a common embedding, we enable the generation of novel sentences given an
image. Using the same model, we can also reconstruct the visual features
associated with an image given its visual description. We use a novel recurrent
visual memory that automatically learns to remember long-term visual concepts
to aid in both sentence generation and visual feature reconstruction. We
evaluate our approach on several tasks. These include sentence generation,
sentence retrieval and image retrieval. State-of-the-art results are shown for
the task of generating novel image descriptions. When compared to human
generated captions, our automatically generated captions are preferred by
humans over of the time. Results are better than or comparable to
state-of-the-art results on the image and sentence retrieval tasks for methods
using similar visual features.
Selecting Informative Contexts Improves Language Model Finetuning
We present a general finetuning meta-method that we call information gain
filtration for improving the overall training efficiency and final performance
of language model finetuning. This method uses a secondary learner which
attempts to quantify the benefit of finetuning the language model on each given
example. During the finetuning process, we use this learner to decide whether
or not each given example should be trained on or skipped. We show that it
suffices for this learner to be simple and that the finetuning process itself
is dominated by the relatively trivial relearning of a new unigram frequency
distribution over the modelled language domain, a process which the learner
aids. Our method trains to convergence using 40% fewer batches than normal
finetuning, and achieves a median perplexity of 54.0 on a books dataset
compared to a median perplexity of 57.3 for standard finetuning using the same
neural architecture.
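The train-or-skip decision described in this abstract can be illustrated with a minimal sketch. The function name `filter_and_train`, the `threshold` parameter, and the toy gain function are all hypothetical stand-ins introduced here for illustration; the abstract does not specify how the secondary learner scores examples or what cutoff is used.

```python
from typing import Callable, Iterable, List, Tuple


def filter_and_train(
    examples: Iterable[str],
    predicted_gain: Callable[[str], float],  # hypothetical secondary learner
    train_step: Callable[[str], None],       # hypothetical finetuning update
    threshold: float = 0.0,                  # assumed cutoff, not from the abstract
) -> Tuple[int, int]:
    """Train only on examples whose predicted benefit exceeds the threshold."""
    trained = skipped = 0
    for example in examples:
        if predicted_gain(example) > threshold:
            train_step(example)  # example judged informative: train on it
            trained += 1
        else:
            skipped += 1         # example judged uninformative: skip it
    return trained, skipped


# Toy stand-ins: "gain" grows with the number of distinct words in the text.
seen: List[str] = []
toy_gain = lambda text: len(set(text.split())) - 3
counts = filter_and_train(
    ["a a a a", "the cat sat on a mat", "b b"], toy_gain, seen.append
)
```

In this toy run only the second example passes the filter, so `train_step` is called once and two examples are skipped. A real secondary learner would be trained to predict the benefit of each example rather than computed from a fixed rule.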
An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss
Affect conveys important implicit information in human communication. Having
the capability to correctly express affect during human-machine conversations
is one of the major milestones in artificial intelligence. In recent years,
extensive research on open-domain neural conversational models has been
conducted. However, embedding affect into such models is still underexplored.
In this paper, we propose an end-to-end affect-rich open-domain neural
conversational model that produces responses not only appropriate in syntax and
semantics, but also with rich affect. Our model extends the Seq2Seq model and
adopts VAD (Valence, Arousal and Dominance) affective notations to embed each
word with affects. In addition, our model considers the effect of negators and
intensifiers via a novel affective attention mechanism, which biases attention
towards affect-rich words in input sentences. Lastly, we train our model with
an affect-incorporated objective function to encourage the generation of
affect-rich words in the output responses. Evaluations based on both perplexity
and human evaluations show that our model outperforms the state-of-the-art
baseline model of comparable size in producing natural and affect-rich
responses.
Comment: AAAI-1
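One way to picture a weighted cross-entropy objective of the kind named in this paper's title is to up-weight the loss on target tokens that carry strong affect. The sketch below is an assumption-laden illustration, not the paper's formula: the weight `1 + beta * strength`, the `beta` parameter, and the `affect_strength` lookup are hypothetical, and the abstract does not say how VAD values enter the objective.

```python
import math
from typing import Dict, List


def weighted_cross_entropy(
    target_tokens: List[str],
    token_probs: List[float],           # model probability of each target token
    affect_strength: Dict[str, float],  # hypothetical per-word affect score >= 0
    beta: float = 1.0,                  # assumed weighting coefficient
) -> float:
    """Mean negative log-likelihood with larger weights on affect-rich tokens."""
    total = 0.0
    for token, prob in zip(target_tokens, token_probs):
        # Assumed weighting scheme: neutral words get weight 1.
        weight = 1.0 + beta * affect_strength.get(token, 0.0)
        total += -weight * math.log(prob)
    return total / len(target_tokens)


tokens = ["i", "love", "it"]
probs = [0.5, 0.25, 0.5]
strength = {"love": 1.0}

plain_loss = weighted_cross_entropy(tokens, probs, strength, beta=0.0)
affect_loss = weighted_cross_entropy(tokens, probs, strength, beta=1.0)
```

With `beta=0.0` the function reduces to ordinary cross-entropy. With `beta=1.0` the error on "love" counts double, so a model that assigns low probability to affect-rich target words is penalized more heavily.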