Applying and Combining Three Different Aspect Mining Techniques
Understanding a software system at source-code level requires understanding
the different concerns that it addresses, which in turn requires a way to
identify these concerns in the source code. Whereas some concerns are
explicitly represented by program entities (like classes, methods and
variables) and thus are easy to identify, crosscutting concerns are not
captured by a single program entity but are scattered over many program
entities and are tangled with the other concerns. Because of this crosscutting
nature, such concerns are difficult to identify and reduce the
understandability of the system as a whole.
In this paper, we report on a combined experiment in which we try to identify
crosscutting concerns in the JHotDraw framework automatically. We first apply
three independently developed aspect mining techniques to JHotDraw and evaluate
and compare their results. Based on this analysis, we present three interesting
combinations of these three techniques, and show how these combinations provide
a more complete coverage of the detected concerns as compared to the original
techniques individually. Our results are a first step towards improving the
understandability of a system that contains crosscutting concerns, and can be
used as a basis for refactoring the identified crosscutting concerns into
aspects.
Comment: 28 pages
Active learning in annotating micro-blogs dealing with e-reputation
Elections unleash strong political views on Twitter, but what do people
really think about politics? Opinion and trend mining on micro-blogs dealing
with politics has recently attracted researchers in several fields including
Information Retrieval and Machine Learning (ML). Since the performance of ML
and Natural Language Processing (NLP) approaches is limited by the amount and
quality of data available, one promising alternative for some tasks is the
automatic propagation of expert annotations. This paper intends to develop a
so-called active learning process for automatically annotating French-language
tweets that deal with the image (i.e., representation, web reputation) of
politicians. Our main focus is on the methodology followed to build an original
annotated dataset expressing opinions about two French politicians over time. We
therefore review state-of-the-art NLP-based ML algorithms to automatically
annotate tweets using a manual initiation step as bootstrap. This paper focuses
on key issues in active learning when building a large annotated dataset in
the presence of noise, which is introduced by human annotators, the abundance
of data, and the label distribution across data and entities. In turn, we show
that Twitter characteristics such as the author's name or hashtags can serve as
a pivot not only to improve automatic systems for Opinion Mining (OM) and
Topic Classification but also to reduce noise in human annotations. However, a
later thorough analysis shows that reducing noise might induce the loss of
crucial information.
Comment: Journal of Interdisciplinary Methodologies and Issues in Science - Vol 3 - Contextualisation digitale - 201