
    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks: word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Prosody-Based Unsupervised Speech Summarization with Two-Layer Mutually Reinforced Random Walk

    This paper presents a graph-based model that integrates prosodic features into an unsupervised speech summarization framework without any lexical information. In particular, it builds on previous work using mutually reinforced random walks, in which a two-layer graph structure is used to select the most salient utterances of a conversation. The model consists of one layer of utterance nodes and another layer of prosody nodes. The random walk algorithm propagates scores between layers to use shared information for selecting the utterance nodes with the highest scores as summaries. A comparative evaluation of our prosody-based model against several baselines on a corpus of academic multi-party meetings reveals that it performs competitively on very short summaries, and better on longer summaries according to ROUGE scores as well as the average relevance of selected utterances.
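
The score propagation between the two layers can be illustrated with a minimal sketch. It assumes a commonly used interpolated update, in which each layer's scores mix its own initial scores with scores propagated from the other layer; the paper's exact formulation may differ. The affinity matrix `weights`, the interpolation weight `alpha`, the iteration count, and all function names are hypothetical choices made for this illustration, not values from the paper.

```python
def normalize_columns(matrix):
    """Scale each column to sum to 1 (all-zero columns stay zero)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [[matrix[r][c] / sums[c] if sums[c] else 0.0 for c in range(n_cols)]
            for r in range(n_rows)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def mutually_reinforced_walk(weights, utt_init, pros_init, alpha=0.9, iters=50):
    """weights[i][j]: assumed affinity between utterance i and prosody node j."""
    to_utt = normalize_columns(weights)              # prosody layer -> utterance layer
    to_pros = normalize_columns(transpose(weights))  # utterance layer -> prosody layer
    utt, pros = list(utt_init), list(pros_init)
    for _ in range(iters):
        # Each layer keeps (1 - alpha) of its initial scores and takes
        # alpha of the scores propagated from the other layer.
        new_utt = [(1 - alpha) * u0 + alpha * s
                   for u0, s in zip(utt_init, matvec(to_utt, pros))]
        new_pros = [(1 - alpha) * p0 + alpha * s
                    for p0, s in zip(pros_init, matvec(to_pros, utt))]
        utt, pros = new_utt, new_pros
    return utt, pros


# Toy example: 3 utterances, 2 prosody nodes; utterance 1 links to both nodes.
weights = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
scores, _ = mutually_reinforced_walk(weights, [1 / 3] * 3, [0.5, 0.5])
# Pick the single highest-scoring utterance as a one-utterance summary.
summary = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:1]
```

In this toy setup the utterance connected to both prosody nodes receives the highest score, which is the intended effect of sharing information across layers.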

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Spoken content retrieval beyond pipeline integration of automatic speech recognition and information retrieval

    The dramatic increase in the creation of multimedia content is leading to the development of large archives in which a substantial amount of the information is in spoken form. Efficient access to this information requires effective spoken content retrieval (SCR) methods. Traditionally, SCR systems have focused on a pipeline integration of two fundamental technologies: transcription using automatic speech recognition (ASR) and search supported using text-based information retrieval (IR). Existing SCR approaches estimate the relevance of a spoken retrieval item based on the lexical overlap between a user’s query and the textual transcriptions of the items. However, the speech signal contains other potentially valuable non-lexical information that remains largely unexploited by SCR approaches. In particular, acoustic correlates of speech prosody, which have been shown to be useful for identifying salient words and determining topic changes, have not been exploited by existing SCR approaches. In addition, the temporal nature of multimedia content means that accessing content is a user-intensive, time-consuming process. In order to minimise user effort in locating relevant content, SCR systems could suggest playback points in retrieved content indicating the locations where the system believes relevant information may be found. This typically requires adopting a segmentation mechanism for splitting documents into smaller “elements” to be ranked and from which suitable playback points could be selected. Existing segmentation approaches do not generalise well to every possible information need or provide robustness to ASR errors.
    This thesis extends SCR beyond the standard ASR and IR pipeline approach by: (i) exploring the utilisation of prosodic information as complementary evidence of topical relevance to enhance current SCR approaches; (ii) determining elements of content that, when retrieved, minimise user search effort and provide increased robustness to ASR errors; and (iii) developing enhanced evaluation measures that could better capture the factors that affect user satisfaction in SCR.

    Computational Mechanisms of Language Understanding and Use in the Brain and Behaviour

    Linguistic communication is a unique characteristic of intelligent behaviour that distinguishes humans from non-human animals. Natural language is a structured, complex communication system supported by a variety of cognitive functions, realized by hundreds of millions of neurons in the brain. Artificial neural networks typically used in natural language processing (NLP) are often designed to focus on benchmark performance, where one of the main goals is reaching the state-of-the-art performance on a set of language tasks. Although the advances in NLP have been tremendous in the past decade, such networks provide only limited insights into biological mechanisms underlying linguistic processing in the brain. In this thesis, we propose an integrative approach to the study of computational mechanisms underlying fundamental language processes, spanning biologically plausible neural networks, and learning of basic communicative abilities through environmentally grounded behaviour. In doing so, we argue for the usage-based approach to language, where language is supported by a variety of cognitive functions and learning mechanisms. Thus, we focus on the three following questions: How are basic linguistic units, such as words, represented in the brain? Which neural mechanisms operate on those representations in cognitive tasks? How can aspects of such representations, such as associative similarity and structure, be learned in a usage-based framework? To answer the first two questions, we build novel, biologically realistic models of neural function that perform different semantic processing tasks: the Remote Associates Test (RAT) and the semantic fluency task. Both tasks have been used in experimental and clinical environments to study organizational principles and retrieval mechanisms from semantic memory. 
    The models we propose realize the mental lexicon and cognitive retrieval processes operating on that lexicon using associative mechanisms in a biologically plausible manner. We argue that such models are the first and only biologically plausible models that propose specific mechanisms as well as reproduce a wide range of human behavioural data on those tasks, further corroborating their plausibility. To address the last question, we use an interactive, collaborative agent-based reinforcement learning setup in a navigation task where agents learn to communicate to solve the task. We argue that agents in such a setup learn to jointly coordinate their actions, and develop a communication protocol that is often optimal for the performance on the task, while exhibiting some core properties of language, such as representational similarity structure and compositionality, essential for associative mechanisms underlying cognitive representations.

    Visualizing Evaluative Language in Relation to Constructing Identity in English Editorials and Op-Eds

    This thesis is concerned with the problem of managing complexity in Systemic Functional Linguistic (SFL) analyses of language, particularly at the discourse semantics level. To deal with this complexity, the thesis develops AppAnn, a suite of linguistic visualization techniques that are specifically designed to provide both synoptic and dynamic views on discourse semantic patterns in text and corpus. Moreover, AppAnn visualizations are illustrated in a series of explorations of identity in a corpus of editorials and op-eds about the bin Laden killing. The findings suggest that the intriguing intricacies of discourse semantic meanings can be successfully discerned and more readily understood through linguistic visualization. The findings also provide insightful implications for discourse analysis by contributing to our understanding of a number of underdeveloped concepts of SFL, including coupling, commitment, instantiation, affiliation and individuation.

    Attention Restraint, Working Memory Capacity, and Mind Wandering: Do Emotional Valence or Intentionality Matter?

    Attention restraint appears to mediate the relationship between working memory capacity (WMC) and mind wandering (Kane et al., 2016). Prior work has identified two dimensions of mind wandering: emotional valence and intentionality. However, less is known about how WMC and attention restraint correlate with these dimensions. The current study examined the relationship between WMC, attention restraint, and mind wandering by emotional valence and intentionality. A confirmatory factor analysis demonstrated that WMC and attention restraint were strongly correlated, but only attention restraint was related to overall mind wandering, consistent with prior findings. However, when examining the emotional valence of mind wandering, attention restraint and WMC were related to negatively and positively valenced, but not neutral, mind wandering. Attention restraint was also related to intentional but not unintentional mind wandering. These results suggest that WMC and attention restraint predict some, but not all, types of mind wandering.

    Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology

    The great behavioral heterogeneity observed between individuals with the same psychiatric disorder and even within one individual over time complicates both clinical practice and biomedical research. However, modern technologies are an exciting opportunity to improve behavioral characterization. Existing psychiatry methods that are qualitative or unscalable, such as patient surveys or clinical interviews, can now be collected at a greater capacity and analyzed to produce new quantitative measures. Furthermore, recent capabilities for continuous collection of passive sensor streams, such as phone GPS or smartwatch accelerometer, open avenues of novel questioning that were previously entirely unrealistic. Their temporally dense nature enables a cohesive study of real-time neural and behavioral signals. To develop comprehensive neurobiological models of psychiatric disease, it will be critical to first develop strong methods for behavioral quantification. There is huge potential in what can theoretically be captured by current technologies, but this in itself presents a large computational challenge -- one that will necessitate new data processing tools, new machine learning techniques, and ultimately a shift in how interdisciplinary work is conducted. In my thesis, I detail research projects that take different perspectives on digital psychiatry, subsequently tying ideas together with a concluding discussion on the future of the field. I also provide software infrastructure where relevant, with extensive documentation. Major contributions include scientific arguments and proof of concept results for daily free-form audio journals as an underappreciated psychiatry research datatype, as well as novel stability theorems and pilot empirical success for a proposed multi-area recurrent neural network architecture. Comment: PhD thesis cop