Replication issues in syntax-based aspect extraction for opinion mining
Reproducing experiments is an important instrument to validate previous work
and build upon existing approaches. It has been tackled numerous times in
different areas of science. In this paper, we present an empirical
replicability study of three well-known algorithms for syntax-centric
aspect-based opinion mining. We show that reproducing results remains a
difficult endeavor, mainly due to the lack of details regarding preprocessing
and parameter settings, as well as the absence of available implementations
that would clarify these details. We consider these to be important threats to
the validity of research in the field, especially when compared to other NLP
problems where public datasets and code availability are critical validity
components. We conclude by encouraging code-based research, which we believe
plays a key role in helping researchers better understand the state of the art
and in generating continuous advances.
Comment: Accepted in the EACL 2017 SR
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
We propose a model to automatically describe changes introduced in the source
code of a program using natural language. Our method receives as input a set
of code commits, each of which contains both the modifications and the message
written by a user. These two modalities are used to train an encoder-decoder
architecture. We evaluated our approach on twelve real-world open-source
projects spanning four programming languages. Quantitative and qualitative
results showed that the proposed approach can generate feasible and
semantically sound descriptions, not only in the standard in-project setting
but also in a cross-project setting.
Comment: Accepted at ACL 201
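The commit-to-description pipeline sketched in this abstract can be illustrated with a toy, stdlib-only stand-in. The "encoder" and "decoder" below are deliberate placeholders (bag-of-words hashing and nearest-neighbour retrieval over training messages), not the paper's neural encoder-decoder; all data and function names are illustrative assumptions.

```python
# Toy sketch of a commit -> description pipeline. The encoder/decoder
# here are trivial stand-ins (bag-of-words + nearest-neighbour retrieval),
# NOT the neural encoder-decoder architecture described in the abstract.
from collections import Counter
import math

def encode(diff_tokens):
    """'Encode' a code change as a bag-of-words vector
    (stand-in for the neural encoder)."""
    return Counter(diff_tokens)

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)  # Counter returns 0 for missing keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def decode(vec, memory):
    """'Decode' by retrieving the message of the most similar
    training commit (stand-in for the neural decoder)."""
    return max(memory, key=lambda pair: cosine(vec, pair[0]))[1]

# Hypothetical training pairs: (encoded diff tokens, human-written message).
memory = [
    (encode(["fix", "null", "check", "user.py"]), "fix null pointer check"),
    (encode(["add", "test", "parser"]), "add parser unit test"),
]

msg = decode(encode(["null", "check", "login"]), memory)
print(msg)  # -> fix null pointer check
```

A real implementation would replace both stand-ins with learned sequence models trained jointly on the diff and message modalities, as the abstract describes.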
Variational Inference for Learning Representations of Natural Language Edits
Document editing has become a pervasive component of the production of
information, with version control systems enabling edits to be efficiently
stored and applied. In light of this, the task of learning distributed
representations of edits has recently been proposed. We propose a novel
approach that employs variational inference to learn a continuous latent
space of vector representations capturing the underlying semantics of the
document editing process. We achieve this by introducing a latent variable
that explicitly models the edit. This latent variable is then combined with a
representation of the original document to guide the generation of its edited
version. Additionally, to facilitate standardized automatic evaluation of
edit representations, which has so far relied heavily on direct human input,
we also propose PEER, a suite of downstream tasks specifically designed to
measure the quality of edit representations in the context of natural
language processing.
Comment: Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
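The latent-variable scheme this abstract describes (an inferred Gaussian code for the edit, combined with the document to drive generation) can be sketched with plain Python. Everything below is a conceptual stand-in under stated assumptions: the "encoder" is a crude hand-written feature map, the latent dimension and all names are hypothetical, and no neural networks are involved.

```python
# Minimal sketch of a variational latent-variable model of edits.
# All function bodies are illustrative placeholders, NOT the paper's model.
import math
import random

random.seed(0)

def edit_encoder(original, edited):
    """Stand-in for q(z | original, edited): returns the mean and
    log-variance of a 2-d Gaussian over the latent edit code z."""
    delta = len(edited) - len(original)          # crude "edit feature"
    mu = [float(delta), float(len(edited))]
    log_var = [0.0, 0.0]
    return mu, log_var

def sample_z(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * random.gauss(0, 1)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ): the ELBO's regularizer."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

mu, log_var = edit_encoder("the cat sat", "the black cat sat")
z = sample_z(mu, log_var)   # latent edit code; a decoder would condition
                            # on (original document, z) to emit the edit
print(round(kl_to_standard_normal(mu, log_var), 2))  # -> 162.5
```

In a full model, `edit_encoder` would be a learned inference network, and a decoder network would consume the document representation together with `z` to generate the edited version, with the KL term regularizing the latent space during training.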