644 research outputs found

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.
    Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Integrating knowledge graph embeddings to improve mention representation for bridging anaphora resolution

    Lexical semantics and world knowledge are crucial for interpreting bridging anaphora. Yet, existing computational methods for acquiring and injecting this type of information into bridging resolution systems suffer from important limitations. Based on explicit querying of external knowledge bases, earlier approaches are computationally expensive (hence, hardly scalable) and they map the data to be processed into high-dimensional spaces (which requires careful handling of the curse of dimensionality and overfitting). In this work, we take a different and principled approach which naturally addresses these issues. Specifically, we convert the external knowledge source (in this case, WordNet) into a graph, and learn low-dimensional embeddings of the graph nodes to capture the crucial features of the graph topology and, at the same time, rich semantic information. Once properly identified from the mention text spans, these low-dimensional graph node embeddings are combined with distributional text-based embeddings to provide enhanced mention representations. We illustrate the effectiveness of our approach by evaluating it on commonly used datasets, namely ISNotes (Markert et al., 2012) and BASHI (Rösiger, 2018). Our enhanced mention representations yield significant accuracy improvements on both datasets when compared to different standalone text-based mention representations.
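
    The combination step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `enhanced_mention_representation`, the use of plain concatenation as the way the two embeddings are "combined", the zero-vector fallback for mentions whose head word has no graph node, and the toy vectors.

```python
from typing import Dict, List

Vector = List[float]


def enhanced_mention_representation(
    head_word: str,
    text_embedding: Vector,
    graph_embeddings: Dict[str, Vector],
    graph_dim: int,
) -> Vector:
    """Combine a text-based embedding with a graph node embedding.

    Assumption: "combined" is modelled as concatenation. The paper may
    use a different operator.
    """
    # Look up the graph node for the mention's head word (hypothetical
    # matching rule: case-insensitive exact match on the node key).
    node_vec = graph_embeddings.get(head_word.lower())
    if node_vec is None:
        # Assumed fallback: a zero vector when no node is identified.
        node_vec = [0.0] * graph_dim
    return list(text_embedding) + list(node_vec)


# Toy data, invented for illustration only.
toy_graph = {"house": [0.1, 0.2]}
known = enhanced_mention_representation("House", [1.0, 2.0, 3.0], toy_graph, 2)
unknown = enhanced_mention_representation("xyzzy", [1.0, 2.0, 3.0], toy_graph, 2)
```

    In this sketch the output always has the text dimension plus the graph dimension, so a downstream model sees a fixed-size input whether or not a graph node was found.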

    Constraints on metalinguistic anaphora

    The focus of this paper is on a subset of heteronymous mention, namely those cases in which the mentioning expression is, roughly speaking, anaphorically linked to the string it mentions. I will distinguish two subclasses. In the first one, the antecedent of the metalinguistic anaphor is a quotation. This means that both the antecedent and the anaphor refer to a linguistic entity (the same one, it turns out; these expressions are co-referential). In the second subclass, the antecedent is not a quotation; it is a string in ordinary use. Here we have no co-reference: whereas the anaphor refers metalinguistically, the antecedent either refers to an object in the world or does not refer at all. This second subclass is especially interesting because it instantiates a shift in the universe of discourse, from extralinguistic reality to language. Where such a shift occurs, I will speak of ‘world-to-language’ anaphora. I will argue that metalinguistic anaphora is best described in terms of a theory that assumes that various anaphoric expressions encode various degrees of salience of referents. But I will also show that salience is built in the context of utterance. It is not necessarily an acquired feature of the referent by the time the anaphor is processed: there is adjustment between the anaphor and its immediate linguistic environment. Besides, we will see that other factors may also affect anaphora resolution, which suggests that the best account must, in essence, be pragmatic.