Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling
As the foundation of current natural language processing methods, pre-trained
language models have achieved excellent performance. However, the black-box
structure of the deep neural network in pre-trained language models seriously
limits the interpretability of the language modeling process. After revisiting
the coupled requirement of deep neural representation and semantic logic of
language modeling, a Word-Context-Coupled Space (W2CSpace) is proposed by
introducing the alignment processing between uninterpretable neural
representation and interpretable statistical logic. Moreover, a clustering
process is also designed to connect the word- and context-level semantics.
Specifically, an associative knowledge network (AKN), considered interpretable
statistical logic, is introduced in the alignment process for word-level
semantics. Furthermore, the context-relative distance is employed as the
semantic feature for the downstream classifier, which is greatly different from
the current uninterpretable semantic representations of pre-trained models. Our
experiments for performance evaluation and interpretable analysis are executed
on several types of datasets, including SIGHAN, Weibo, and ChnSenti. In these
experiments, a novel evaluation strategy for the interpretability of machine
learning models is proposed for the first time. According to the experimental
results, our language model achieves better performance and highly credible
interpretability compared to related state-of-the-art methods.
Comment: Accepted at ACL 2023, Findings
Learning for semantic parsing using statistical syntactic parsing techniques
Natural language understanding is a sub-field of natural language processing, which builds automated systems to understand natural language. It is such an ambitious task that it is sometimes referred to as an AI-complete problem, implying that its difficulty is equivalent to solving the central artificial intelligence problem -- making computers as intelligent as people. Despite its complexity, natural language understanding continues to be a fundamental problem in natural language processing in terms of its theoretical and empirical importance. In recent years, startling progress has been made at different levels of natural language processing tasks, which provides great opportunity for deeper natural language understanding. In this thesis, we focus on the task of semantic parsing, which maps a natural language sentence into a complete, formal meaning representation in a meaning representation language. We present two novel state-of-the-art learned syntax-based semantic parsers using statistical syntactic parsing techniques, motivated by the following two reasons. First, syntax-based semantic parsing is theoretically well-founded in computational semantics. Second, adopting a syntax-based approach allows us to directly leverage the enormous progress made in statistical syntactic parsing. The first semantic parser, Scissor, adopts an integrated syntactic-semantic parsing approach, in which a statistical syntactic parser is augmented with semantic parameters to produce a semantically-augmented parse tree (SAPT). This integrated approach allows both syntactic and semantic information to be available during parsing time to obtain an accurate combined syntactic-semantic analysis. The performance of Scissor is further improved by using discriminative reranking for incorporating non-local features. The second semantic parser, SynSem, exploits an existing syntactic parser to produce disambiguated parse trees that drive the compositional semantic interpretation. This pipeline approach allows semantic parsing to conveniently leverage the most recent progress in statistical syntactic parsing. We report experimental results on two real applications: an interpreter for coaching instructions in robotic soccer and a natural-language database interface, showing that the improvement of Scissor and SynSem over other systems is mainly on long sentences, where the knowledge of syntax given in the form of annotated SAPTs or syntactic parses from an existing parser helps semantic composition. SynSem also significantly improves results with limited training data, and is shown to be robust to syntactic errors.
Computer Science
From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field.
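
As a rough illustration of the word-context class of VSMs mentioned in this abstract, the sketch below counts co-occurrences within a fixed window and compares the resulting rows by cosine similarity. It is not taken from the survey: the toy corpus, the window size, the use of raw counts without any weighting, and the helper names `word_context_counts` and `cosine` are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict
import math


def word_context_counts(sentences, window=2):
    """Build a sparse word-context matrix: counts[word][context_word]."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not its own context
                    counts[word][tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, used only to exercise the functions above.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
counts = word_context_counts(corpus, window=2)
sim_cat_dog = cosine(counts["cat"], counts["dog"])
sim_cat_car = cosine(counts["cat"], counts["car"])
```

In this toy setting, "cat" and "dog" share more context words than "cat" and "car", so their rows end up closer. Practical systems typically reweight the raw counts and use far larger corpora.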