3,766 research outputs found
Replication issues in syntax-based aspect extraction for opinion mining
Reproducing experiments is an important instrument to validate previous work
and build upon existing approaches. It has been tackled numerous times in
different areas of science. In this paper, we introduce an empirical
replicability study of three well-known algorithms for syntactic centric
aspect-based opinion mining. We show that reproducing results continues to be a
difficult endeavor, mainly due to the lack of details regarding preprocessing
and parameter setting, as well as due to the absence of available
implementations that clarify these details. We consider these are important
threats to validity of the research on the field, specifically when compared to
other problems in NLP where public datasets and code availability are critical
validity components. We conclude by encouraging code-based research, which we
think has a key role in helping researchers to understand the meaning of the
state-of-the-art better and to generate continuous advances.Comment: Accepted in the EACL 2017 SR
Alexandria: Extensible Framework for Rapid Exploration of Social Media
The Alexandria system under development at IBM Research provides an
extensible framework and platform for supporting a variety of big-data
analytics and visualizations. The system is currently focused on enabling rapid
exploration of text-based social media data. The system provides tools to help
with constructing "domain models" (i.e., families of keywords and extractors to
enable focus on tweets and other social media documents relevant to a project),
to rapidly extract and segment the relevant social media and its authors, to
apply further analytics (such as finding trends and anomalous terms), and
visualizing the results. The system architecture is centered around a variety
of REST-based service APIs to enable flexible orchestration of the system
capabilities; these are especially useful to support knowledge-worker driven
iterative exploration of social phenomena. The architecture also enables rapid
integration of Alexandria capabilities with other social media analytics
system, as has been demonstrated through an integration with IBM Research's
SystemG. This paper describes a prototypical usage scenario for Alexandria,
along with the architecture and key underlying analytics.Comment: 8 page
Multimedia search without visual analysis: the value of linguistic and contextual information
This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features
Corpora for Computational Linguistics
Since the mid 90s corpora has become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction.
Their influence on other fields is also briefly discussed
- âŠ