32,929 research outputs found
From Biological to Synthetic Neurorobotics Approaches to Understanding the Structure Essential to Consciousness (Part 3)
This third paper locates the synthetic neurorobotics research reviewed in the second paper in terms of themes introduced in the first paper. It begins with biological non-reductionism as understood by Searle. It emphasizes the role of synthetic neurorobotics studies in accessing the dynamic structure essential to consciousness with a focus on system criticality and self, develops a distinction between simulated and formal consciousness based on this emphasis, reviews Tani and colleagues' work in light of this distinction, and ends by forecasting the increasing importance of synthetic neurorobotics studies for cognitive science and philosophy of mind going forward, finally in regards to most- and myth-consciousness
Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification
Connectionist temporal classification (CTC) is a powerful approach for
sequence-to-sequence learning, and has been popularly used in speech
recognition. The central ideas of CTC include adding a label "blank" during
training. With this mechanism, CTC eliminates the need of segment alignment,
and hence has been applied to various sequence-to-sequence learning problems.
In this work, we applied CTC to abstractive summarization for spoken content.
The "blank" in this case implies the corresponding input data are less
important or noisy; thus it can be ignored. This approach was shown to
outperform the existing methods in term of ROUGE scores over Chinese Gigaword
and MATBN corpora. This approach also has the nice property that the ordering
of words or characters in the input documents can be better preserved in the
generated summaries.Comment: Accepted by Interspeech 201
Audio Caption: Listen and Tell
Increasing amount of research has shed light on machine perception of audio
events, most of which concerns detection and classification tasks. However,
human-like perception of audio scenes involves not only detecting and
classifying audio sounds, but also summarizing the relationship between
different audio events. Comparable research such as image caption has been
conducted, yet the audio field is still quite barren. This paper introduces a
manually-annotated dataset for audio caption. The purpose is to automatically
generate natural sentences for audio scene description and to bridge the gap
between machine perception of audio and image. The whole dataset is labelled in
Mandarin and we also include translated English annotations. A baseline
encoder-decoder model is provided for both English and Mandarin. Similar BLEU
scores are derived for both languages: our model can generate understandable
and data-related captions based on the dataset.Comment: accepted by ICASSP201
- …