A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyze five semantic processing tasks, namely word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.
Comment: Published in Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missing from the published version due
to the publication policies. Please contact Prof. Erik Cambria for details.
Argumentation models and their use in corpus annotation: practice, prospects, and challenges
The study of argumentation is transversal to several research domains, from philosophy to linguistics, from the law to computer science and artificial intelligence. In discourse analysis, several distinct models have been proposed to harness argumentation, each with a different focus or aim. To analyze the use of argumentation in natural language, several corpus annotation efforts have been carried out, with a more or less explicit grounding in one of these theoretical argumentation models. In fact, given the recent growing interest in argument mining applications, argument-annotated corpora are crucial for training machine learning models in a supervised way. However, the proliferation of such corpora has led to a wide disparity in the granularity of the argument annotations employed. In this paper, we review the most relevant theoretical argumentation models, after which we survey argument annotation projects that closely follow those theoretical models. We also highlight the main simplifications that are often introduced in practice. Furthermore, we glance at other annotation efforts that are not so theoretically grounded but instead follow a shallower approach. It turns out that most argument annotation projects make their own assumptions and simplifications, both in terms of the textual genre they focus on and in terms of adapting the adopted theoretical argumentation model to their own agenda. Issues of compatibility among argument-annotated corpora are discussed by looking at the problem from a syntactical, semantic, and practical perspective. Finally, we discuss current and prospective applications of models that take advantage of argument-annotated corpora.
Multimodality and superdiversity: evidence for a research agenda
In recent years, social science research in superdiversity has questioned notions such as multiculturalism and pluralism, which hinge on and de facto reproduce ideological constructs such as separate and clearly identifiable national cultures and ethnic identities; research in language and superdiversity, in translanguaging, polylanguaging and metrolingualism has analogously questioned concepts such as multi- and bi-lingualism, which hinge on ideological constructs such as national languages, mother tongue and native speaker proficiency. Research in multimodality has questioned the centrality of language in everyday communication as well as its paradigmatic role in the understanding of communicative practices. While the multimodality of communication is generally acknowledged in work on language and superdiversity, the potential of a social semiotic multimodal approach for understanding communication in superdiversity has not yet been adequately explored and developed, and neither has the concept of superdiversity been addressed in multimodal research. The present paper aims to begin filling this gap. By discussing sign-making practices in the superdiverse context of Leeds Kirkgate Market (UK), it maps the potentials of an ethnographic social semiotics for the study of communication in superdiversity and sketches an agenda for research on multimodality and superdiversity, identifying a series of working hypotheses, research questions, areas of investigation, and domains and fields of enquiry.
Action Categorisation in Multimodal Instructions
We present an explorative study for the (semi-)automatic categorisation of actions in Dutch multimodal first aid instructions, where the actions needed to successfully execute the procedure in question are presented verbally and in pictures. We start with the categorisation of verbalised actions and expect that this will later facilitate the identification of those actions in the pictures, which is known to be hard. Comparisons of and user-based experimentation with the verbal and visual representations will allow us to determine the effectiveness of picture-text combinations and will eventually support the automatic generation of multimodal documents. We used Natural Language Processing tools to identify and categorise 2,388 verbs in a corpus of 78 multimodal instructions (MIs). We show that the main action structure of an instruction can be retrieved through verb identification using the Alpino parser followed by a manual selection operation. The selected main action verbs were subsequently generalised and categorised with the use of Cornetto, a lexical resource that combines a Dutch Wordnet and a Dutch Reference Lexicon. Results show that these tools are useful but also have limitations, which make human intervention essential to guide an accurate categorisation of actions in multimodal instructions.
Accessing spoken interaction through dialogue processing [online]
Summary
Our lives, our achievements, and our environment are all currently
documented through written language. The rapid advance of the
technical means to record, store, and play back audio, images, and
video can be used to support, supplement, or even replace the
written documentation of human communication, for example
meetings. These new technologies can enable us to capture
information that would otherwise be lost, to lower the cost of
documentation, and to enrich high-quality documents with
audiovisual material. Indexing such recordings is the core
technology for realizing this potential. This work presents
effective alternatives to keyword-based indices that restrict the
search space and can in part be computed with simple means.
Speech documents can be indexed at several levels: stylistically,
a document belongs to a particular database, which can be
determined automatically with high accuracy from very simple
features. This kind of classification can reduce the search space
by a factor on the order of 410. Applying topical features for
text classification to a news database results in a reduction by
a factor of 18. Because speech documents can be very long, they
must be divided into topical segments. A new probabilistic
approach and new features (speaker initiative and style) deliver
results comparable to or better than traditional keyword-based
approaches. These topical segments can be characterized by the
predominant activity (storytelling, discussing, planning, ...),
which can be detected by a neural network. The detection rates
are limited, however, since even humans determine these
activities only imprecisely. A maximum search space reduction by
a factor of 6 is theoretically possible on the data used. A
topical classification of these segments was also carried out on
one database, but the detection rates for this index are low.
At the level of individual utterances, dialogue acts such as
statements, questions, and backchannels (aha, oh yes, really?,
...) can be recognized with a discriminatively trained hidden
Markov model. This method can be extended to recognize short
sequences such as question/answer games (dialogue games).
Dialogue acts and games can be used to build classifiers for
global speaking styles. Likewise, a user might remember a
particular dialogue act sequence and try to find it again in a
graphical representation.
In a study with very pessimistic assumptions, users were able to
identify one of four similar and equally probable conversations
with an accuracy of ~ 43% using a graphical representation of
activity. Dialogue acts could be equally useful in this scenario,
but the user study could not give a conclusive answer because of
the small amount of data. The study could not, however, show any
effect for detailed basic features such as formality and speaker
identity.
Abstract
Written language is one of our primary means for documenting our
lives, achievements, and environment. Our capabilities to
record, store and retrieve audio, still pictures, and video are
undergoing a revolution and may support, supplement or even
replace written documentation. This technology enables us to
record information that would otherwise be lost, lower the cost
of documentation and enhance high-quality documents with
original audiovisual material.
The indexing of the audio material is the key technology to
realize those benefits. This work presents effective
alternatives to keyword-based indices which restrict the search
space and may in part be calculated with very limited resources.
Indexing speech documents can be done at various levels:
Stylistically, a document belongs to a certain database, which can
be determined automatically with high accuracy using very simple
features. The resulting search space reduction factor is on
the order of 410, while topic classification yielded a factor
of 18 in a news domain.
Since documents can be very long, they need to be segmented into
topical regions. A new probabilistic segmentation framework as
well as new features (speaker initiative and style) prove to be
very effective compared to traditional keyword-based methods. At
the topical segment level, activities (storytelling, discussing,
planning, ...) can be detected using a machine learning approach
with limited accuracy; however, even human annotators do not
annotate them very reliably. A maximum search space reduction
factor of 6 is theoretically possible on the databases used. A
topical classification of these regions has been attempted
on one database; the detection accuracy for that index, however,
was very low.
At the utterance level, dialogue acts such as statements,
questions, backchannels (aha, yeah, ...), etc. are
recognized using a novel discriminatively trained HMM procedure.
The procedure can be extended to recognize short sequences such
as question/answer pairs, so-called dialogue games.
Dialogue acts and games are useful for building classifiers for
speaking style. Similarly, a user may remember a certain dialogue
act sequence and may search for it in a graphical
representation.
In a study with very pessimistic assumptions, users are able to
pick one out of four similar and equiprobable meetings correctly
with an accuracy of ~ 43% using graphical activity information.
Dialogue acts may be useful in this situation as well, but the
sample size did not allow final conclusions to be drawn. However, the
user study fails to show any effect for detailed basic features
such as formality or speaker identity.
Representation Learning for Natural Language Processing
This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing