144 research outputs found
A convolutional attentional neural network for sentiment classification
Neural network models with attention mechanisms have proven effective on various tasks. However, there is little research on attention mechanisms for text classification, and existing attention models for text classification lack cognitive intuition and mathematical explanation. In this paper, we propose a new neural network architecture based on the attention model for text classification. In particular, we show mathematically that the convolutional neural network (CNN) is a reasonable model for extracting attention from text sequences. We then propose a novel attention model based on CNN and introduce a new network architecture that combines a recurrent neural network with our CNN-based attention model. Experimental results on five datasets show that our proposed models can accurately capture the salient parts of sentences to improve the performance of text classification.
Commonsense knowledge enhanced memory network for stance classification
Stance classification aims to identify the attitude expressed in a text toward the given targets as favorable, negative, or unrelated. Existing models for stance classification leverage only the textual representation and ignore commonsense knowledge. To better incorporate commonsense knowledge into stance classification, we propose a novel model named commonsense knowledge enhanced memory network, which jointly represents the textual and commonsense knowledge of the given target and text. The textual memory module in our model treats the textual representation as memory vectors and uses an attention mechanism to highlight the important parts. For the commonsense knowledge memory module, we jointly leverage the entity and relation embeddings learned by the TransE model to take full advantage of the constraints of the knowledge graph. Experimental results on the SemEval dataset show that combining the commonsense knowledge memory with the textual memory can improve stance classification.
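As a rough illustration of the TransE embeddings mentioned in the abstract above: TransE treats a relation as a translation in embedding space, so a plausible triple (head, relation, tail) should satisfy head + relation ≈ tail. The sketch below scores a triple by the negative Euclidean distance. The function name and the toy vectors are hypothetical and are not taken from the paper.

```python
import math

def transe_score(head, relation, tail):
    """Negative L2 distance between (head + relation) and tail.

    Scores closer to zero indicate a more plausible triple under
    the translation assumption head + relation ≈ tail.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)

# Toy 2-dimensional embeddings (illustrative values only).
good = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])   # exact translation
bad = transe_score([1.0, 0.0], [0.0, 1.0], [-1.0, -1.0])  # far from head + relation
```

Here `good` is higher than `bad`, which is the ordering a model consuming these embeddings would rely on.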
Convolution-based neural attention with applications to sentiment classification
Neural attention mechanisms have achieved many successes in various natural language processing tasks. However, existing neural attention models based on a densely connected network are only loosely related to the attention mechanism found in psychology and neuroscience. Motivated by the finding in neuroscience that humans possess a template-searching attention mechanism, we propose to use the convolution operation to simulate attention and give a mathematical explanation of our neural attention model. We then introduce a new network architecture, which combines a recurrent neural network with our convolution-based attention model and further stacks an attention-based neural model to build a hierarchical sentiment classification model. The experimental results show that our proposed models can capture salient parts of the text to improve the performance of sentiment classification at both the sentence level and the document level.
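The idea of using a convolution to produce attention weights can be sketched as follows. This is only a minimal illustration, not the architecture from the paper: a single filter (the "template") slides over a sequence of hidden-state vectors, the filter responses are normalized with a softmax, and the resulting weights form a weighted sum of the states. The function name `conv_attention`, the zero padding, and the single-filter setup are all assumptions.

```python
import math

def conv_attention(states, kernel):
    """Attention weights from a 1-D convolution over hidden states.

    states: list of n vectors (each a list of floats).
    kernel: list of w vectors of the same dimension (the template).
    Returns (weights, context), where weights sum to 1.
    """
    width = len(kernel)
    pad = width // 2
    dim = len(states[0])
    zero = [0.0] * dim
    # Zero-pad so every position gets a filter response.
    padded = [zero] * pad + list(states) + [zero] * pad
    scores = []
    for i in range(len(states)):
        window = padded[i:i + width]
        scores.append(sum(k * x
                          for kvec, xvec in zip(kernel, window)
                          for k, x in zip(kvec, xvec)))
    # Numerically stable softmax over the filter responses.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the original states.
    context = [sum(w * s[d] for w, s in zip(weights, states))
               for d in range(dim)]
    return weights, context

# Toy sequence of three 2-dimensional states and a width-1 template.
weights, context = conv_attention(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5]],
)
```

In this toy run the third state matches the template most strongly, so it receives the largest weight.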
Transition-based directed graph construction for emotion-cause pair extraction
Emotion-cause pair extraction aims to extract all potential pairs of emotions and corresponding causes from unannotated emotion text. Most existing methods are pipelined frameworks that identify emotions and extract causes separately, which leads to error propagation. To address this issue, we propose a transition-based model that transforms the task into a parsing-like directed graph construction procedure. The proposed model incrementally generates the directed graph with labeled edges based on a sequence of actions, from which we can recognize emotions and their corresponding causes simultaneously, thereby optimizing the separate subtasks jointly and letting the interdependent tasks benefit from each other. Experimental results show that our approach achieves the best performance, outperforming the state-of-the-art methods by 6.71% (p<0.01) in F1 measure.
Quality Index for Stereoscopic Images by Separately Evaluating Adding and Subtracting
The human visual system (HVS) plays an important role in stereo image quality perception. How to exploit knowledge of visual perception in image quality assessment models has therefore attracted considerable interest. This paper proposes a full-reference metric for quality assessment of stereoscopic images based on the binocular difference channel and the binocular summation channel. For a stereo pair, the binocular summation map and binocular difference map are computed first by adding and subtracting the left image and right image. Then the binocular summation is decoupled into two parts, namely additive impairments and detail losses. The quality of binocular summation is obtained as the adaptive combination of the quality of detail losses and additive impairments. The quality of binocular summation is computed by using the Contrast Sensitivity Function (CSF) and weighted multi-scale structural similarity (MS-SSIM). Finally, the quality of binocular summation and binocular difference is integrated into an overall quality index. The experimental results indicate that, compared with existing metrics, the proposed metric is highly consistent with subjective quality assessment and is a robust measure. The results also indirectly support the hypothesis that binocular summation and binocular difference channels exist.
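The first step described in the abstract above, forming the binocular summation and difference maps by adding and subtracting the two views, can be sketched as below. Images are represented as nested lists of grayscale values; the function name is hypothetical, and any scaling or normalization the paper may apply is not reproduced here.

```python
def binocular_maps(left, right):
    """Pixel-wise sum and difference of a stereo pair.

    left, right: equally sized 2-D lists of grayscale values.
    Returns (summation_map, difference_map).
    """
    summation = [[l + r for l, r in zip(lrow, rrow)]
                 for lrow, rrow in zip(left, right)]
    difference = [[l - r for l, r in zip(lrow, rrow)]
                  for lrow, rrow in zip(left, right)]
    return summation, difference

# Toy 2x2 stereo pair (illustrative values only).
summation, difference = binocular_maps([[10, 20], [30, 40]],
                                       [[12, 18], [30, 35]])
```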
Capacity Constrained Influence Maximization in Social Networks
Influence maximization (IM) aims to identify a small number of influential
individuals to maximize the information spread and finds applications in
various fields. It was first introduced in the context of viral marketing,
where a company pays a few influencers to promote the product. However, apart
from the cost factor, the capacity of individuals to consume content poses
challenges for implementing IM in real-world scenarios. For example, players on
online gaming platforms can only interact with a limited number of friends. In
addition, we observe that in these scenarios, (i) the initial adopters of
promotion are likely to be the friends of influencers rather than the
influencers themselves, and (ii) existing IM solutions produce sub-par results
with high computational demands. Motivated by these observations, we propose a
new IM variant called capacity constrained influence maximization (CIM), which
aims to select a limited number of influential friends for each initial adopter
such that the promotion can reach more users. To solve CIM effectively, we
design two greedy algorithms, MG-Greedy and RR-Greedy, ensuring the
-approximation ratio. To improve the efficiency, we devise the scalable
implementation named RR-OPIM+ with -approximation and
near-linear running time. We extensively evaluate the performance of 9
approaches on 6 real-world networks, and our solutions outperform all
competitors in terms of result quality and running time. Additionally, we
deploy RR-OPIM+ to online game scenarios, which improves the baseline
considerably.
Comment: The technical report of the paper entitled 'Capacity Constrained
Influence Maximization in Social Networks' in SIGKDD'2
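To make the capacity constraint concrete, here is a generic greedy sketch. It is not MG-Greedy, RR-Greedy, or RR-OPIM+ from the paper. It assumes a hypothetical, precomputed `reach` table that maps each (initial adopter, friend) pair to the set of users that pair would reach, a deterministic stand-in for expected influence spread. Each adopter may be assigned at most `capacity` friends, and each step picks the feasible pair with the largest marginal gain.

```python
def greedy_capacity_selection(reach, capacity):
    """Greedy selection of friends under a per-adopter capacity.

    reach: dict mapping (adopter, friend) -> set of users reached
           (toy stand-in for influence spread; an assumption).
    capacity: maximum number of friends chosen per adopter.
    Returns (chosen, covered): chosen maps adopter -> list of friends.
    """
    chosen = {}
    covered = set()
    remaining = set(reach)
    while remaining:
        best, best_gain = None, 0
        for pair in sorted(remaining):  # sorted for deterministic ties
            adopter = pair[0]
            if len(chosen.get(adopter, [])) >= capacity:
                continue  # this adopter has no capacity left
            gain = len(reach[pair] - covered)
            if gain > best_gain:
                best, best_gain = pair, gain
        if best is None:
            break  # no feasible pair adds new users
        chosen.setdefault(best[0], []).append(best[1])
        covered |= reach[best]
        remaining.discard(best)
    return chosen, covered

# Toy instance: two adopters, capacity of one friend each.
toy_reach = {
    ("a", "x"): {1, 2, 3},
    ("a", "y"): {3, 4},
    ("a", "z"): {5},
    ("b", "x"): {1, 2},
    ("b", "w"): {6, 7},
}
chosen, covered = greedy_capacity_selection(toy_reach, capacity=1)
```

In the toy instance, adopter `a` is assigned friend `x` and adopter `b` is assigned friend `w`, since `b`'s other candidate only reaches users who are already covered.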
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
Achieving nuanced and accurate emulation of human voice has been a
longstanding goal in artificial intelligence. Although significant progress has
been made in recent years, the mainstream of speech synthesis models still
relies on supervised speaker modeling and explicit reference utterances.
However, there are many aspects of human voice, such as emotion, intonation,
and speaking style, for which it is hard to obtain accurate labels. In this
paper, we propose VoxGenesis, a novel unsupervised speech synthesis framework
that can discover a latent speaker manifold and meaningful voice editing
directions without supervision. VoxGenesis is conceptually simple. Instead of
mapping speech features to waveforms deterministically, VoxGenesis transforms a
Gaussian distribution into speech distributions conditioned and aligned by
semantic tokens. This forces the model to learn a speaker distribution
disentangled from the semantic content. During inference, sampling from the
Gaussian distribution enables the creation of novel speakers with distinct
characteristics. More importantly, the exploration of latent space uncovers
human-interpretable directions associated with specific speaker characteristics
such as gender attributes, pitch, tone, and emotion, allowing for voice editing
by manipulating the latent codes along these identified directions. We conduct
extensive experiments to evaluate the proposed VoxGenesis using both subjective
and objective metrics, finding that it produces significantly more diverse and
realistic speakers with distinct characteristics than the previous approaches.
We also show that latent space manipulation produces consistent and
human-identifiable effects that are not detrimental to the speech quality,
which was not possible with previous approaches. Audio samples of VoxGenesis
can be found at: https://bit.ly/VoxGenesis
Comment: preprin
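The two latent-space operations described in the abstract above, sampling a new speaker from a Gaussian and editing a voice by moving a latent code along a direction, can be sketched in a few lines. This is only an illustration of the vector arithmetic. The function names, the dimensionality, and the example direction are hypothetical, and no part of the actual VoxGenesis model is reproduced.

```python
import random

def sample_speaker(dim, rng):
    """Draw a latent speaker code from a standard Gaussian."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def edit_latent(z, direction, strength):
    """Move latent code z along an editing direction.

    direction: a vector assumed to correspond to one speaker
               attribute (for example pitch); hypothetical here.
    strength: signed step size along that direction.
    """
    return [zi + strength * di for zi, di in zip(z, direction)]

rng = random.Random(0)          # fixed seed for reproducibility
z = sample_speaker(4, rng)      # a novel speaker code
pitch_direction = [1.0, 0.0, 0.0, 0.0]  # illustrative direction only
z_edited = edit_latent(z, pitch_direction, 2.0)
```

Only the first coordinate of `z` changes in this example, because the illustrative direction is a unit vector along that axis.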