158 research outputs found
Proceedings of the Conference on Natural Language Processing 2010
This book contains state-of-the-art contributions to the 10th
conference on Natural Language Processing, KONVENS 2010
(Konferenz zur Verarbeitung natürlicher Sprache), with a focus
on semantic processing.
The KONVENS in general aims at offering a broad perspective
on current research and developments within the interdisciplinary
field of natural language processing. The central theme
draws specific attention towards addressing linguistic aspects
ofmeaning, covering deep as well as shallow approaches to semantic
processing. The contributions address both knowledgebased
and data-driven methods for modelling and acquiring
semantic information, and discuss the role of semantic information
in applications of language technology.
The articles demonstrate the importance of semantic processing,
and present novel and creative approaches to natural
language processing in general. Some contributions put their
focus on developing and improving NLP systems for tasks like
Named Entity Recognition or Word Sense Disambiguation, or
focus on semantic knowledge acquisition and exploitation with
respect to collaboratively built ressources, or harvesting semantic
information in virtual games. Others are set within the
context of real-world applications, such as Authoring Aids, Text
Summarisation and Information Retrieval. The collection highlights
the importance of semantic processing for different areas
and applications in Natural Language Processing, and provides
the reader with an overview of current research in this field
D8.6 Dissemination, training and exploitation results
Mauerhofer, C., Rajagopal, K., & Greller, W. (2011). D8.6 Dissemination, training and exploitation results. LTfLL-project.Report on sustainability, dissemination and exploitation of the LtfLL projectThe work on this publication has been sponsored by the LTfLL STREP that is funded by the European Commission's 7th Framework Programme. Contract 212578 [http://www.ltfll-project.org
Diachronic Evaluation of NER Systems on Old Newspapers
In recent years, many cultural institutions have engaged in large-scale newspaper digitization projects and large amounts of historical texts are being acquired (via transcription or OCRization). Beyond document preservation, the next step consists in providing an enhanced access to the content of these digital resources. In this regard, the processing of units which act as referential anchors, namely named entities (NE), is of particular importance. Yet, the application of standard NE tools to historical texts faces several challenges and performances are often not as good as on contemporary documents. This paper investigates the performances of different NE recognition tools applied on old newspapers by conducting a diachronic evaluation over 7 time-series taken from the archives of Swiss newspaper Le Temps
Workshop Proceedings of the 12th edition of the KONVENS conference
The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic at the crossroads of different research traditions which deal with natural language as a container of knowledge, and with methods to extract and manage knowledge that is linguistically represented is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years
On the Complementarity of Images and Text for the Expression of Emotions in Social Media
Authors of posts in social media communicate their emotions and what causes
them with text and images. While there is work on emotion and stimulus
detection for each modality separately, it is yet unknown if the modalities
contain complementary emotion information in social media. We aim at filling
this research gap and contribute a novel, annotated corpus of English
multimodal Reddit posts. On this resource, we develop models to automatically
detect the relation between image and text, an emotion stimulus category and
the emotion class. We evaluate if these tasks require both modalities and find
for the image-text relations, that text alone is sufficient for most categories
(complementary, illustrative, opposing): the information in the text allows to
predict if an image is required for emotion understanding. The emotions of
anger and sadness are best predicted with a multimodal model, while text alone
is sufficient for disgust, joy, and surprise. Stimuli depicted by objects,
animals, food, or a person are best predicted by image-only models, while
multimodal models are most effective on art, events, memes, places, or
screenshots.Comment: accepted for WASSA 2022 at ACL 202
Distinguishing specialised discourse : the example of juridical texts on industrial property rights and trademark legislation
In this paper, we have given an overview of computational linguistic tools available to us, which can be used to produce raw material for the lexicographic description of a specialised language. The underlying idea of our method is the following: what is significantly more frequent in a domain-specific text than in a general language reference text may be a term (or collocation) of the domain. In the near future, our tools will be integrated in a web-based environment in order to make them available for textbased research, e.g. in the humanities, whenever needed. The researcher interested in term or phraseology candidate extraction of a certain domain would identify and upload texts to be searched, and the tools would be running on servers of e.g. computational linguistics centres. The researcher would select tools to be applied and receive the analysis results over the network
- …