Personality Dysfunction Manifest in Words : Understanding Personality Pathology Using Computational Language Analysis
Personality disorders (PDs) are some of the most prevalent and high-risk mental health conditions, and yet remain poorly understood. Today, the development of new technologies means that there are advanced tools that can be used to improve our understanding and treatment of PD. One promising tool, and indeed the focus of this thesis, is computational language analysis. By looking at patterns in how people with personality pathology use words, it is possible to gain insight into their constellation of thoughts, feelings, and behaviours. To date, however, there has been little research at the intersection of verbal behaviour and personality pathology. Accordingly, the central goal of this thesis is to demonstrate how PD can be better understood through the analysis of natural language. This thesis presents three research articles, comprising four empirical studies, that each leverage computational language analysis to better understand personality pathology. Each paper focuses on a distinct core feature of PD, while incorporating language analysis methods: Paper 1 (Study 1) focuses on interpersonal dysfunction; Paper 2 (Studies 2 and 3) focuses on emotion dysregulation; and Paper 3 (Study 4) focuses on behavioural dysregulation (i.e., engagement in suicidality and deliberate self-harm). Findings from this research have generated better understanding of fundamental features of PD, including insight into characterising dimensions of social dysfunction (Paper 1), maladaptive emotion processes that may contribute to emotion dysregulation (Paper 2), and psychosocial dynamics relating to suicidality and deliberate self-harm (Paper 3) in PD. Such theoretical knowledge subsequently has important implications for clinical practice, particularly regarding the potential to inform psychological therapy.
More broadly, this research highlights how language can provide implicit and unobtrusive insight into the personality and psychological processes that underlie personality pathology at a large scale, using an individualised, naturalistic approach.
Opportunities and risks of stochastic deep learning
This thesis studies opportunities and risks associated with stochasticity in deep learning that specifically manifest in the context of adversarial robustness and neural architecture search (NAS). On the one hand, opportunities arise because stochastic methods have a strong impact on robustness and generalisation, both from a theoretical and an empirical standpoint. In addition, they provide a framework for navigating non-differentiable search spaces, and for expressing data and model uncertainty. On the other hand, trade-offs (i.e., risks) that are coupled with these benefits need to be carefully considered. The three novel contributions that comprise the main body of this thesis are, by these standards, instances of opportunities and risks.
In the context of adversarial robustness, our first contribution proves that the impact of an adversarial input perturbation on the output of a stochastic neural network (SNN) is theoretically bounded. Specifically, we demonstrate that SNNs are maximally robust when they achieve weight-covariance alignment, i.e., when the vectors of their classifier layer are aligned with the eigenvectors of that layer's covariance matrix. Based on our theoretical insights, we develop a novel SNN architecture with excellent empirical adversarial robustness and show that our theoretical guarantees also hold experimentally.
Furthermore, we discover that SNNs partially owe their robustness to having a noisy loss landscape. Gradient-based adversaries find this landscape difficult to ascend during adversarial perturbation search, and therefore fail to create strong adversarial examples. We show that inducing a noisy loss landscape is not an effective defence mechanism, as it is easy to circumvent. To demonstrate that point, we develop a stochastic loss-smoothing extension to state-of-the-art gradient-based adversaries that allows them to attack successfully. Interestingly, our loss-smoothing extension can also (i) be successful against non-stochastic neural networks that defend by altering their loss landscape in different ways, and (ii) strengthen gradient-free adversaries.
Our third and final contribution lies in the field of few-shot learning, where we develop a stochastic NAS method for adapting pre-trained neural networks to previously unseen classes, by observing only a few training examples of each new class. We determine that the adaptation of a pre-trained backbone is not as simple as adapting all of its parameters. In fact, adapting or fine-tuning the entire architecture is sub-optimal, as many layers already encode knowledge optimally. Our NAS algorithm searches for the optimal subset of pre-trained parameters to be adapted or fine-tuned, which yields a significant improvement over the existing paradigm for few-shot adaptation.
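The abstract above notes that a noisy loss landscape hinders gradient-based adversaries and that a stochastic loss-smoothing step lets them attack successfully. The abstract does not describe how that extension works, so the following is only a generic illustration of the underlying idea, not the thesis's method: averaging many noisy gradient estimates taken at randomly perturbed inputs recovers a usable ascent direction. The toy quadratic loss, the function names, and all parameter values are assumptions made for this sketch.

```python
import random


def smoothed_gradient(grad_fn, x, sigma=0.1, n_samples=200, seed=0):
    """Average grad_fn over Gaussian perturbations of x (generic smoothing)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Evaluate the (noisy) gradient at a randomly perturbed input.
        total += grad_fn(x + rng.gauss(0.0, sigma))
    return total / n_samples


def make_noisy_grad(seed=1, noise=5.0):
    """Hypothetical stand-in for a stochastic model's gradient.

    Returns the gradient of the toy loss (x - 3)^2 plus zero-mean noise,
    so any single evaluation can point in the wrong direction.
    """
    rng = random.Random(seed)

    def grad(x):
        return 2.0 * (x - 3.0) + rng.gauss(0.0, noise)

    return grad


noisy_grad = make_noisy_grad()
estimate = smoothed_gradient(noisy_grad, x=0.0, n_samples=2000)
```

With enough samples the averaged estimate lies close to the noise-free gradient at `x = 0.0` (which is `-6.0` for this toy loss), whereas a single noisy evaluation may not even have the correct sign.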
Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently have abundant monitoring data, accessing and effectively measuring it is another story. The challenges exist in many aspects. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection covering a large-scale network system can be very expensive given the growing size of networks, e.g., the number of cells in a radio network and the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in a network element to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements.
The contributions made in this thesis significantly advance the state of the art in the domain of network measurement and monitoring techniques. Overall, we leverage a cutting-edge machine learning technology, deep generative modeling, throughout the entire thesis. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing approach with a conditional generative model, which requires only open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient drive testing system based on a generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system with latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network) system. The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback
We propose a method to capture the handling abilities of fast jet pilots in a software model via reinforcement learning (RL) from human preference feedback. We use pairwise preferences over simulated flight trajectories to learn an interpretable rule-based model called a reward tree, which enables the automated scoring of trajectories alongside an explanatory rationale. We train an RL agent to execute high-quality handling behaviour by using the reward tree as the objective, and thereby generate data for iterative preference collection and further refinement of both tree and agent. Experiments with synthetic preferences show reward trees to be competitive with uninterpretable neural network reward models on quantitative and qualitative evaluations.
Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers
With the advent of modern pre-trained Transformers, text preprocessing has come to be neglected and is not specifically addressed in recent NLP literature. However, from both a linguistic and a computer science point of view, we believe that even when using modern Transformers, text preprocessing can significantly impact the performance of a classification model. Through this study, we investigate and compare how preprocessing impacts the Text Classification (TC) performance of modern and traditional classification models. We report and discuss the preprocessing techniques found in the literature and their most recent variants or applications to address TC tasks in different domains. In order to assess how much preprocessing affects classification performance, we apply the three most referenced preprocessing techniques (alone or in combination) to four publicly available datasets from different domains. Then, nine machine learning models, including modern Transformers, get the preprocessed text as input. The results show that an educated choice of text preprocessing strategy should be based on the task as well as on the model considered. Outcomes in this survey show that choosing the best preprocessing technique in place of the worst can significantly improve classification accuracy (up to 25%, as in the case of an XLNet on the IMDB dataset). In some cases, by means of a suitable preprocessing strategy, even a simple Naïve Bayes classifier outperformed the best performing Transformer (by 2% in accuracy). We found that both Transformers and traditional models exhibit a high impact of preprocessing on TC performance.
Our main findings are: (1) even for modern pre-trained language models, preprocessing can affect performance, depending on the datasets and on the preprocessing technique or combination of techniques used; (2) in some cases, using a proper preprocessing strategy, simple models can outperform Transformers on TC tasks; (3) similar classes of models exhibit a similar level of sensitivity to text preprocessing.
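The experimental design described above, applying three preprocessing techniques alone or in combination, can be sketched as an enumeration of pipelines. The abstract does not name the three techniques, so lowercasing, punctuation removal, and stopword removal are assumed here purely for illustration; the stopword list and all function names are likewise hypothetical.

```python
import string
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the survey).
STOPWORDS = {"the", "a", "an", "is", "of", "and"}


def lowercase(text):
    return text.lower()


def remove_punctuation(text):
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_stopwords(text):
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


# Assumed techniques; insertion order fixes the order steps are applied in.
TECHNIQUES = {
    "lowercase": lowercase,
    "remove_punctuation": remove_punctuation,
    "remove_stopwords": remove_stopwords,
}


def build_pipelines(techniques):
    """Return every non-empty combination of techniques, keyed by name tuple."""
    names = list(techniques)
    pipelines = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            pipelines[combo] = [techniques[name] for name in combo]
    return pipelines


def apply_pipeline(steps, text):
    """Apply each preprocessing step to the text in sequence."""
    for step in steps:
        text = step(text)
    return text


PIPELINES = build_pipelines(TECHNIQUES)
```

Three techniques yield seven non-empty pipelines; each would be applied to every dataset before training each model, so that the best and worst preprocessing choice can be compared per model and task.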
Harnessing eXplainable artificial intelligence for feature selection in time series energy forecasting : a comparative analysis of Grad-CAM and SHAP
DATA AVAILABILITY: Datasets related to this article can be found at [63], an open-source online data repository hosted at Mendeley Data.
This study investigates the efficacy of Explainable Artificial Intelligence (XAI) methods, specifically Gradient-weighted Class Activation Mapping (Grad-CAM) and Shapley Additive Explanations (SHAP), in the feature selection process for national demand forecasting. Utilising a multi-headed Convolutional Neural Network (CNN), both XAI methods exhibit capabilities in enhancing forecasting accuracy and model efficiency by identifying and eliminating irrelevant features. Comparative analysis revealed Grad-CAM's exceptional computational efficiency in high-dimensional applications and SHAP's superior ability in revealing features that degrade forecast accuracy. However, limitations are found in both methods, with Grad-CAM including features that decrease model stability, and SHAP inaccurately ranking significant features. Future research should focus on refining these XAI methods to overcome these limitations and further probe into other XAI methods' applicability within the time-series forecasting domain. This study underscores the potential of XAI in improving load forecasting, which can contribute significantly to the development of more interpretable, accurate and efficient forecasting models.
National Key R&D Program of China, National Natural Science Foundation of China, National Research Foundation China/South Africa Research Cooperation Programme, China/South Africa Bilateral, and Royal Academy of Engineering Transforming Systems through Partnership.
http://www.elsevier.com/locate/apenergy
Electrical, Electronic and Computer Engineerin
Linking language and emotion: how emotion is understood in language comprehension, production and prediction using psycholinguistic methods
Emotions are an integral part of why and how we use language in everyday life. We communicate our concerns, express our woes, and share our joy through the use of non-verbal and verbal language. Yet there is a limited understanding of when and how emotional language is processed differently to neutral language, or of how emotional information facilitates or inhibits language processing. Indeed, various efforts have been made in the last decade to bring emotions back into the discipline of psycholinguistics. This can be seen in many interdisciplinary models focusing on the role played by emotion in each aspect of linguistic experience. In this thesis, I answer this call and pursue questions that remain unanswered in psycholinguistics regarding its interaction with emotion. The general approach I take to bring emotion into psycholinguistic research is straightforward. Where applicable and relevant, I use well-established tasks or paradigms to investigate the effects of emotional content in language processing. I focus on three main areas of language processing: comprehension, production and prediction.
The first experimental chapter includes a series of experiments utilising the Modality Switching Paradigm to investigate whether sentences describing emotional states are processed differently from sentences describing cognitive states. No switching effects were found consistently in my three experiments. My results suggest that these distinct classes of interoceptive concepts, such as “thinking” or “being happy”, are not processed differently from each other, suggesting that people do not switch attention between different interoceptive systems when comprehending emotional or cognitive sentences. I discuss the implications for grounded cognition theory in the embodiment literature.
In my second experimental chapter, I used the Cumulative Semantic Interference Paradigm to investigate these two questions: (1) whether emotion concepts interfere with one another when repeatedly retrieved (emotion label objects), and (2) whether similar interference occurs for concrete objects that share similar valence association (emotion-laden objects). This could indicate that people use information such as valence and arousal to group objects in semantic memory. I found that interference occurs when people retrieve direct emotion labels repeatedly (e.g., “happy” and “sad”) but not when they retrieve the names of concrete objects that have similar emotion connotations (e.g., “puppy” and “rainbow”). I discuss my findings in terms of the different types of information that support representation of abstract vs. concrete concepts.
In my final experimental chapter, I used the Visual World Paradigm to investigate whether the emotional state of an agent is used to inform predictions during sentence processing. I found that people do use the description of emotional state of an agent (e.g., “The boy is happy”) to predict the cause of that affective state during sentence processing (e.g., “because he was given an ice-cream”). A key result here is that people were more likely to fixate on the emotionally congruent objects (e.g., ice-cream) compared to incongruent objects (e.g., broccoli). This suggests that people rapidly and automatically inform predictions about upcoming sentence information based on the emotional state of the agent. I discuss my findings as a novel contribution to the Visual World literature.
I conducted a diverse set of experiments using a range of established psycholinguistic methods to investigate the roles of emotional information in language processing. I found clear results in the eye-tracking study but inconsistent effects in both the switching and interference studies. I interpret these mixed findings in the following way: emotional content does not always affect language processing, and effects are most likely in tasks that explicitly require participants to simulate emotion states in some way. Regardless, not only was I successful in finding some novel results by extending previous tasks, but I was also able to show that this is an avenue that can be explored further to advance the field of affective psycholinguistics.