Conflating point of interest (POI) data: A systematic review of matching methods
Point of interest (POI) data provide digital representations of places in the
real world, and have been increasingly used to understand human-place
interactions, support urban management, and build smart cities. Many POI
datasets have been developed, which often have different geographic coverages,
attribute focuses, and data quality. From time to time, researchers may need to
conflate two or more POI datasets in order to build a better representation of
the places in the study areas. While various POI conflation methods have been
developed, a systematic review is lacking; consequently, it is difficult
for researchers new to POI conflation to quickly grasp and use these existing
methods. This paper fills such a gap. Following the protocol of Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), we conduct a
systematic review by searching through three bibliographic databases using
reproducible syntax to identify related studies. We then focus on a main step
of POI conflation, i.e., POI matching, and systematically summarize and
categorize the identified methods. Current limitations and future opportunities
are discussed afterwards. We hope that this review can provide some guidance
for researchers interested in conflating POI datasets for their research.
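A common family of POI matching methods combines a spatial filter with name similarity. The sketch below is a minimal illustration of that idea, not any specific method from the reviewed literature; the thresholds, field names, and greedy one-to-one strategy are all illustrative assumptions:

```python
import math
from difflib import SequenceMatcher

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance between two WGS84 points, in metres.
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_pois(pois_a, pois_b, max_dist_m=100.0, min_name_sim=0.7):
    """Greedy one-to-one matching: spatial filter first, then name similarity."""
    matches, used = [], set()
    for a in pois_a:
        best, best_sim = None, min_name_sim
        for b in pois_b:
            if b["id"] in used:
                continue
            if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) > max_dist_m:
                continue  # too far apart to be the same place
            sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
            if sim >= best_sim:
                best, best_sim = b, sim
        if best is not None:
            used.add(best["id"])
            matches.append((a["id"], best["id"], round(best_sim, 2)))
    return matches
```

Real conflation pipelines typically add blocking (spatial indexing) for scale and richer attribute comparisons (category, address, phone), but the filter-then-score structure is the same.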
MMF3: Neural Code Summarization Based on Multi-Modal Fine-Grained Feature Fusion
Background: Code summarization automatically generates the corresponding
natural language descriptions according to the input code. Comprehensiveness of
code representation is critical to code summarization task. However, most
existing approaches typically use coarse-grained fusion methods to integrate
multi-modal features. They generally represent different modalities of a piece
of code, such as an Abstract Syntax Tree (AST) and a token sequence, as two
embeddings and then fuse the two at the AST/code level. Such a coarse
integration makes it difficult to learn the correlations between fine-grained
code elements across modalities effectively. Aims: This study intends to
improve the model's prediction performance for high-quality code summarization
by accurately aligning and fully fusing semantic and syntactic structure
information of source code at node/token levels. Method: This paper proposes a
Multi-Modal Fine-grained Feature Fusion approach (MMF3) for neural code
summarization. We introduce a novel fine-grained fusion method, which allows
fine-grained fusion of multiple code modalities at the token and node levels.
Specifically, we use this method to fuse information from both token and AST
modalities and apply the fused features to code summarization. Results: We
conduct experiments on one Java and one Python dataset, and evaluate the
generated
summaries using four metrics. The results show that: 1) the performance of our
model outperforms the current state-of-the-art models, and 2) the ablation
experiments show that our proposed fine-grained fusion method can effectively
improve the accuracy of generated summaries. Conclusion: MMF3 can mine the
relationships between cross-modal elements and perform accurate fine-grained
element-level alignment and fusion accordingly. As a result, more clues can be
provided to improve the accuracy of the generated code summaries.
Comment: 12 pages, 5 figures
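The token/node-level fusion described above can be illustrated with a single cross-attention step in which each code token attends over AST node embeddings and mixes the attended summary back in. This is a simplified sketch of the general idea only, not the authors' exact MMF3 architecture; the function name and residual fusion are assumptions:

```python
import numpy as np

def cross_modal_fuse(token_emb, node_emb):
    """Fine-grained fusion sketch: every token attends over all AST node
    embeddings, and the attended summary is added to the token embedding.
    token_emb: (T, d) token-modality vectors; node_emb: (N, d) AST nodes."""
    scores = token_emb @ node_emb.T / np.sqrt(token_emb.shape[1])  # (T, N)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)                  # row-wise softmax
    attended = weights @ node_emb                                  # (T, d)
    return token_emb + attended                                    # residual fusion
```

The key contrast with coarse-grained fusion is that the attention operates per token and per node, so correlations between individual elements across modalities can be learned rather than only a whole-sequence pairing.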
Graph Neural Networks for Natural Language Processing: A Survey
Deep learning has become the dominant approach in coping with various tasks
in Natural Language Processing (NLP). Although text inputs are typically
represented as a sequence of tokens, there is a rich variety of NLP problems
that can be best expressed with a graph structure. As a result, there is a
surge of interest in developing new deep learning techniques on graphs for a
large number of NLP tasks. In this survey, we present a comprehensive overview
of Graph Neural Networks (GNNs) for Natural Language Processing. We propose a
new taxonomy of GNNs for NLP, which systematically organizes existing research
on GNNs for NLP along three axes: graph construction, graph representation
learning, and graph-based encoder-decoder models. We further introduce a large
number of NLP applications that exploit the power of GNNs and summarize the
corresponding benchmark datasets, evaluation metrics, and open-source code.
Finally, we discuss various outstanding challenges for making full use of
GNNs for NLP as well as future research directions. To the best of our
knowledge, this is the first comprehensive overview of Graph Neural Networks
for Natural Language Processing.
Comment: 127 pages
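For readers new to the area, the graph representation learning axis can be illustrated with one mean-aggregation message-passing step over a text graph. This is a generic GCN-style sketch, not any particular model from the survey; the residual connection and ReLU are illustrative choices:

```python
import numpy as np

def gnn_layer(node_feats, adj):
    """One message-passing step: each node averages its neighbours'
    features and mixes the result into its own representation.
    node_feats: (N, d) node features; adj: (N, N) binary adjacency matrix."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                       # isolated nodes: avoid divide-by-zero
    agg = adj @ node_feats / deg              # mean over neighbours
    return np.maximum(0.0, node_feats + agg)  # residual + ReLU
```

In the NLP setting, the nodes would be words or sentences and the adjacency matrix would come from the graph construction step (e.g. dependency edges or sentence similarity), with learned weight matrices inserted before the aggregation in a real model.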
Layer-wise Representation Fusion for Compositional Generalization
Despite successes across a broad range of applications, the solutions that
sequence-to-sequence models construct are argued to be less compositional than
human-like generalization. There is mounting evidence that one of the reasons
hindering compositional generalization is that the representations of the
encoder's and decoder's uppermost layers are entangled. In other words, the
syntactic and semantic representations of sequences are twisted
inappropriately. However, most previous studies mainly concentrate on
enhancing token-level semantic information to alleviate the representation
entanglement problem, rather than composing and using the syntactic and
semantic representations of sequences appropriately as humans do. In addition,
we explain why the entanglement problem exists from the perspective of recent
studies on training deeper Transformers: it is mainly owing to the ``shallow''
residual connections and their simple, one-step operations, which fail to fuse
previous layers' information effectively. Starting from this finding and
inspired by humans' strategies, we
propose \textsc{FuSion} (\textbf{Fu}sing \textbf{S}yntactic and
Semant\textbf{i}c Representati\textbf{on}s), an extension to
sequence-to-sequence models to learn to fuse previous layers' information back
into the encoding and decoding process appropriately through introducing a
\emph{fuse-attention module} at each encoder and decoder layer. \textsc{FuSion}
achieves competitive and even \textbf{state-of-the-art} results on two
realistic benchmarks, which empirically demonstrates the effectiveness of our
proposal.
Comment: work in progress. arXiv admin note: substantial text overlap with
arXiv:2305.1216
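The core idea of a fuse-attention module, letting each position attend over its own representations from earlier layers instead of relying only on a one-step residual sum, can be sketched as follows. This is a simplified single-head illustration with assumed names, not the exact FuSion module:

```python
import numpy as np

def fuse_attention(layer_outputs, query):
    """For each sequence position, attend over that position's representations
    from all previous layers and mix the result back in.
    layer_outputs: list of L arrays of shape (T, d); query: (T, d) current layer."""
    stack = np.stack(layer_outputs, axis=1)               # (T, L, d)
    scores = np.einsum("td,tld->tl", query, stack)
    scores /= np.sqrt(query.shape[1])                     # scaled dot-product
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                     # softmax over layers
    fused = np.einsum("tl,tld->td", w, stack)             # layer-weighted mix
    return query + fused                                  # residual connection
```

Compared with a plain residual connection, the softmax over the layer axis lets the model learn, per position, how much syntactic (lower-layer) versus semantic (upper-layer) information to recombine.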
Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization
Generating a text abstract from a set of documents remains a challenging
task. The neural encoder-decoder framework has recently been exploited to
summarize single documents, but its success can in part be attributed to the
availability of large parallel data automatically acquired from the Web. In
contrast, parallel data for multi-document summarization are scarce and costly
to obtain. There is a pressing need to adapt an encoder-decoder model trained
on single-document summarization data to work with multiple-document input. In
this paper, we present an initial investigation into a novel adaptation method.
It exploits the maximal marginal relevance method to select representative
sentences from multi-document input, and leverages an abstractive
encoder-decoder model to fuse disparate sentences into an abstractive summary.
The adaptation method is robust and itself requires no training data. Our
system compares favorably to state-of-the-art extractive and abstractive
approaches judged by automatic metrics and human assessors.
Comment: 11 pages
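The maximal marginal relevance (MMR) selection step balances relevance to a query against redundancy with already-selected sentences. The sketch below illustrates the standard MMR greedy loop; the string-overlap similarity and the λ value are illustrative assumptions standing in for whatever similarity the actual system uses:

```python
from difflib import SequenceMatcher

def sim(a, b):
    # Toy stand-in for a proper sentence similarity (e.g. TF-IDF cosine).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily pick k sentences maximizing
    lam * relevance(query) - (1 - lam) * redundancy(selected)."""
    selected, candidates = [], list(sentences)
    while candidates and len(selected) < k:
        def score(s):
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, query) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Because the redundancy term penalizes overlap with sentences already chosen, the selected set covers the multi-document input more broadly than a pure relevance ranking, which is exactly the property the adaptation method exploits before abstractive fusion.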
A Study on Emotional Conversation Analysis Based on Deep Learning
A chatbot's ability to express specific emotions during a conversation is one of the key parts of artificial intelligence, with an intuitive and quantifiable impact on the chatbot's usability and user satisfaction. Enabling machines to recognize emotions in conversation is challenging, mainly because emotions in human dialogue are innately conveyed through long-term experience, abundant knowledge, context, and the intricate patterns between affective states. Recently, many studies on neural emotional conversational models have been conducted. However, enabling a chatbot to control what kind of emotion to respond with, according to its own character, is still underexplored. At this stage, people are no longer satisfied with using a dialogue system to solve specific tasks, and are more eager to achieve emotional communication. In the chat process, if the robot can perceive the user's emotions and process them accurately, it can greatly enrich the content of the dialogue and evoke empathy in the user.
In the process of emotional dialogue, our ultimate goal is to make the machine understand human emotions and give matching responses. Based on these two points, this thesis explores in depth the emotion recognition in conversation task and the emotional dialogue generation task. In the past few years, although considerable progress has been made in emotion research in dialogue, some difficulties and challenges remain due to the complex nature of human emotions. The key contributions of this thesis are summarized below:
(1) Researchers have recently paid more attention to enhancing natural language models with knowledge graphs, since knowledge graphs encode a wealth of systematic knowledge. A large number of studies have shown that introducing external commonsense knowledge is very helpful for enriching feature information. We address the task of emotion recognition in conversations using external knowledge to enhance semantics. In this work, we employ the external knowledge graph ATOMIC to extract knowledge sources. We propose the KES model, a new framework that incorporates different elements of external knowledge and conversational semantic role labeling, and builds upon them to learn interactions between the interlocutors participating in a conversation. A conversation is a sequence of coherent, ordered utterances, and capturing long-range context information is a weakness of traditional recurrent networks. We therefore adopt the Transformer, a structure composed of self-attention and feed-forward networks, instead of the traditional RNN model, aiming to capture long-range context information. We design a self-attention layer specialized for text features semantically enhanced with external commonsense knowledge. Then, two different LSTM-based networks are responsible for tracking the individual internal state and the external context state. In addition, we evaluate the proposed model on three emotion detection in conversation datasets. The experimental results show that our model outperforms state-of-the-art approaches on most of the tested datasets.
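The two tracking networks can be illustrated with a heavily simplified sketch: one running state per speaker (internal) plus one shared context state, each updated as the conversation unfolds. The scalar `update` function below merely stands in for an LSTM cell, and all names are assumptions rather than the thesis's actual components:

```python
def track_states(utterances, speakers, update):
    """Track a per-speaker internal state and a shared context state,
    updating both once per utterance."""
    internal, context, states = {}, 0.0, []
    for u, s in zip(utterances, speakers):
        internal[s] = update(internal.get(s, 0.0), u)  # speaker's own state
        context = update(context, u)                   # global context state
        states.append((internal[s], context))
    return states
```

The point of the separation is that a speaker's internal state only advances on that speaker's turns, while the context state advances on every turn, so the emotion classifier can condition on both views.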
(2) We propose an emotional dialogue model based on Seq2Seq, improved in three aspects: model input, encoder structure, and decoder structure, so that the model can generate responses that are emotionally rich, diverse, and context-aware. In terms of model input, emotional information and positional information are added on top of the word vectors. In terms of the encoder, the proposed model first encodes the current input together with the sentence sentiment to generate a semantic vector, and additionally encodes the context with the sentence sentiment to generate a context vector, adding contextual information while preserving the independence of the current input. On the decoder side, attention is used to compute weights over the two semantic vectors separately before decoding, so as to fully integrate the local and global emotional semantic information. We use seven objective evaluation metrics to assess the model's generated results in terms of context similarity, response diversity, and emotional response. Experimental results show that the model can generate diverse responses with rich sentiment and contextual associations.