Proceedings of the Conference on Natural Language Processing 2010
This book contains state-of-the-art contributions to the 10th
conference on Natural Language Processing, KONVENS 2010
(Konferenz zur Verarbeitung natürlicher Sprache), with a focus
on semantic processing.
The KONVENS in general aims at offering a broad perspective
on current research and developments within the interdisciplinary
field of natural language processing. The central theme
draws specific attention towards addressing linguistic aspects
of meaning, covering deep as well as shallow approaches to semantic
processing. The contributions address both knowledge-based
and data-driven methods for modelling and acquiring
semantic information, and discuss the role of semantic information
in applications of language technology.
The articles demonstrate the importance of semantic processing,
and present novel and creative approaches to natural
language processing in general. Some contributions put their
focus on developing and improving NLP systems for tasks like
Named Entity Recognition or Word Sense Disambiguation, or
focus on semantic knowledge acquisition and exploitation with
respect to collaboratively built resources, or harvesting semantic
information in virtual games. Others are set within the
context of real-world applications, such as Authoring Aids, Text
Summarisation and Information Retrieval. The collection highlights
the importance of semantic processing for different areas
and applications in Natural Language Processing, and provides
the reader with an overview of current research in this field.
A Framework of Personality Cues for Conversational Agents
Conversational agents (CAs), software systems that emulate conversations with humans through natural language, reshape our communication environment. As CAs have been widely used for applications requiring human-like interactions, a key goal in information systems (IS) research and practice is to be able to create CAs that exhibit a particular personality. However, existing research on CA personality is scattered across different fields, and researchers and practitioners have difficulty understanding the current state of the art in the design of CA personality. To address this gap, we systematically analyze existing studies and develop a framework on how to imbue CAs with personality cues and how to organize the underlying range of expressive variation regarding the Big Five personality traits. Our framework contributes to IS research by providing an overview of CA personality cues in verbal and non-verbal language and supports practitioners in designing CAs with a particular personality.
A Review on Identification of Contextual Similar Sentences
The task of identifying contextually similar sentences plays a crucial role in various natural language processing applications such as information retrieval, paraphrase detection, and question answering systems. This paper presents a comprehensive review of the methodologies, techniques, and advancements in the identification of contextually similar sentences. Beginning with an overview of the importance and challenges associated with this task, the paper delves into the various approaches employed, including traditional similarity metrics, deep learning architectures, and transformer-based models. Furthermore, the review explores different datasets and evaluation metrics used to assess the performance of these methods. Additionally, the paper discusses recent trends, emerging research directions, and potential applications in the field. By synthesizing existing literature, this review aims to provide researchers and practitioners with insights into the state-of-the-art techniques and future avenues for advancing the identification of contextually similar sentences.
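As a rough illustration of what a "traditional similarity metric" can look like, the sketch below computes cosine similarity over bag-of-words counts. It is not taken from the reviewed paper; the function names `bag_of_words` and `cosine_similarity` and the simple regex tokenizer are assumptions made for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    # Lowercase the sentence and count word tokens (hypothetical tokenizer).
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(a, b):
    # Cosine of the angle between the two word-count vectors.
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one sentence has no tokens
    return dot / (norm_a * norm_b)
```

A metric of this kind only sees word overlap, so two paraphrases with different vocabulary score low; that limitation is one motivation for the learned and transformer-based approaches the review covers.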
PowerAqua: Open Question Answering on the Semantic Web
With the rapid growth of semantic information on the Web, the processes of searching and querying these very large amounts of heterogeneous content have become increasingly challenging. This research tackles the problem of supporting users in querying and exploring information across multiple and heterogeneous Semantic Web (SW) sources.
A review of literature on ontology-based Question Answering reveals the limitations of existing technology. Our approach is based on providing a natural language Question Answering interface for the SW, PowerAqua. The realization of PowerAqua represents a considerable advance with respect to other systems, which restrict their scope to an ontology-specific or homogeneous fraction of the publicly available SW content. To our knowledge, PowerAqua is the only system that is able to take advantage of the semantic data available on the Web to interpret and answer user queries posed in natural language. In particular, PowerAqua is uniquely able to answer queries by combining and aggregating information, which can be distributed across heterogeneous semantic resources.
Here, we provide a complete overview of our work on PowerAqua, including: the research challenges it addresses; its architecture; the techniques we have realised to map queries to semantic data, to integrate partial answers drawn from different semantic resources and to rank alternative answers; and the evaluation studies we have performed to assess the performance of PowerAqua. We believe our experiences can be extrapolated to a variety of end-user applications that wish to open up to large-scale and heterogeneous structured datasets, to be able to exploit effectively what is possibly the greatest wealth of data in the history of Artificial Intelligence.
Online Hate Speech against Women: Automatic Identification of Misogyny and Sexism on Twitter
Patriarchal behavior, like other social habits, has been transferred online, appearing as misogynistic and sexist comments, posts or tweets. This online hate speech against women has serious consequences in real life, and recently, various legal cases have arisen against social platforms that scarcely block the spread of hate messages towards individuals. In this difficult context, this paper presents an approach that is able to detect the two sides of patriarchal behavior, misogyny and sexism, analyzing three collections of English tweets, and obtaining promising results.
The work of Simona Frenda and Paolo Rosso was partially funded by the Spanish MINECO under the research project SomEMBED (TIN2015-71147-C2-1-P). We are also grateful for the support of CONACYT-Mexico (project FC-2410).
Frenda, S.; Ghanem, B.; Montes-Y-Gómez, M.; Rosso, P. (2019). Online Hate Speech against Women: Automatic Identification of Misogyny and Sexism on Twitter. Journal of Intelligent & Fuzzy Systems. 36(5):4743-4752. https://doi.org/10.3233/JIFS-179023
Linguistic challenges in automatic summarization technology
Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one or more documents that retains the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study gives an overview of some current approaches to the implementation of automatic summarization technology and discusses the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization.
Diedrichsen, E. (2017). Linguistic challenges in automatic summarization technology. Journal of Computer-Assisted Linguistic Research. 1(1):40-60. doi:10.4995/jclr.2017.7787
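To make the extraction-based idea concrete, here is a minimal sketch that copies whole sentences out of the source unchanged and keeps the ones whose words are most frequent in the document. It is an illustrative toy and not the method of the paper; the names `split_sentences` and `extractive_summary`, the naive sentence splitter, and the word-frequency scoring are all assumptions chosen for brevity.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, max_sentences=2):
    sentences = split_sentences(text)
    # Document-wide word frequencies serve as a crude importance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score (ties keep source order), take the top
    # ones, then restore source order so the summary reads naturally.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

Every sentence returned is a verbatim copy from the input, which is the defining property of an extractive system; an abstractive system would instead be free to compress, fuse or paraphrase.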
PSYCHOACOUSTIC OPTIMIZATION OF THE VQ-VAE AND TRANSFORMER ARCHITECTURES FOR HUMAN-LIKE AUDITORY PERCEPTION IN MUSIC INFORMATION RETRIEVAL AND GENERATION TASKS
Despite incredible advancements in the utilization of learning-based architectures
(AI) in natural language and image domains, their applicability to the domain of
music has remained limited. In fact, the performance of state-of-the-art Automated
Music Transcription (AMT) systems has seen only marginal improvements from
novel AI architectures. Moreover, the importance of psychoacoustic perception and
its incorporation into MIR systems have mostly stayed unaddressed, leading to shortcomings
in current approaches. This thesis provides an overview of music processing
and novel neural architectures, investigates the reasons behind the subpar performance
achieved by their utilization in music information retrieval (MIR) tasks,
and proposes several ways of adjusting both the music (data-related) pre-processing
pipelines and a psychoacoustically adjusted transformer-based model to improve the
performance on MIR and AMT tasks. In particular, a new music transformer architecture
is proposed, and various algorithms of music pre-processing for psychoacoustic
optimization are implemented along with several adaptive models aimed at
addressing the missing factor of modeling human music perception. The preliminary
performance results exhibit promising outcomes, warranting the continued investigation
of transformer architectures for music information retrieval applications.
Several intriguing insights unveiled during the research process are discussed and
presented. The thesis concludes by delineating a set of promising future research directions,
paving the way for further advancements in the field of music information
retrieval and generation using the proposed architectures.
Information Access in a Multilingual World: Transitioning from Research to Real-World Applications
Multilingual Information Access (MLIA) is at a turning point wherein substantial real-world applications are being introduced after fifteen years of research into cross-language information retrieval, question answering, statistical machine translation and named entity recognition. Previous workshops on this topic have focused on research and small-scale applications. The focus of this workshop was on technology transfer from research to applications and on what future research is needed to facilitate MLIA in an increasingly connected multilingual world.