Deep fusion of multi-channel neurophysiological signal for emotion recognition and monitoring
How to fuse multi-channel neurophysiological signals for emotion recognition is emerging as a hot research topic in the Computational Psychophysiology community. Nevertheless, prior feature-engineering approaches require extracting various domain-knowledge-related features at a high time cost. Moreover, traditional fusion methods cannot fully utilise the correlation information between different channels and frequency components. In this paper, we design a hybrid deep learning model in which a Convolutional Neural Network (CNN) is used to extract task-related features and to mine inter-channel and inter-frequency correlations, while a Recurrent Neural Network (RNN) is concatenated to integrate contextual information from the frame-cube sequence. Experiments are carried out on a trial-level emotion recognition task using the DEAP benchmark dataset. The experimental results demonstrate that the proposed framework outperforms classical methods on both the Valence and Arousal emotional dimensions.
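The following is a minimal sketch of the CNN+RNN hybrid pattern the abstract describes: a CNN processes each 2-D "frame cube" (frequency bands laid out over an electrode grid), and an RNN integrates the resulting features across the frame sequence. Shapes, layer sizes, and the 9x9 grid are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of a CNN+RNN hybrid for trial-level EEG emotion
# recognition. Each trial is assumed to be a sequence of frame cubes
# of shape (bands, height, width); all sizes here are illustrative.
import torch
import torch.nn as nn

class CnnRnnEmotionNet(nn.Module):
    def __init__(self, n_bands=4, n_classes=2, hidden=64):
        super().__init__()
        # CNN: mines inter-channel / inter-frequency correlations per frame
        self.cnn = nn.Sequential(
            nn.Conv2d(n_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                  # -> (B*T, 64, 1, 1)
        )
        # RNN: integrates contextual information across the frame sequence
        self.rnn = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)      # valence or arousal

    def forward(self, x):  # x: (batch, time, bands, height, width)
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1)).flatten(1)      # (b*t, 64)
        f = f.view(b, t, -1)                          # (b, t, 64)
        out, _ = self.rnn(f)
        return self.head(out[:, -1])                  # last-step logits

# e.g. 8 trials, 60 one-second frames, 4 bands on a 9x9 electrode grid
logits = CnnRnnEmotionNet()(torch.randn(8, 60, 4, 9, 9))
```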
A Distribution Separation Method Using Irrelevance Feedback Data for Information Retrieval
In many research and application areas, such as information retrieval and machine learning, we often deal with a probability distribution that mixes one distribution relevant to the task at hand with another that is irrelevant and should be removed. Separating the irrelevant distribution from the mixture is therefore an essential problem. This paper focuses on the application in Information Retrieval, where relevance feedback is a widely used technique for building a refined query model from a set of feedback documents. In practice, however, the relevance feedback set, whether provided by users explicitly or implicitly, is often a mixture of relevant and irrelevant documents. Consequently, the resultant query model (typically a term distribution) is often a mixture rather than a true relevance term distribution, with a negative impact on retrieval performance. To tackle this problem, we recently proposed a Distribution Separation Method (DSM), which aims to approximate the true relevance distribution by separating a seed irrelevance distribution from the mixture. While DSM achieved promising performance in an empirical evaluation with simulated explicit irrelevance feedback data, it had not been deployed in scenarios where the irrelevance feedback data must be obtained automatically. In this article, we propose a substantial extension of the basic DSM from two perspectives: developing a regularization framework and deploying DSM in the automatic irrelevance feedback scenario. Specifically, to prevent the output distribution of DSM from drifting away from the true relevance distribution when the quality of the seed irrelevance distribution (the input to DSM) is not guaranteed, we propose a DSM regularization framework that constrains the estimation of the relevance distribution. This framework includes three algorithms, each corresponding to a regularization strategy incorporated into the objective function of DSM. In addition, we exploit DSM in automatic (i.e., pseudo) irrelevance feedback by automatically detecting seed irrelevant documents via three different document re-ranking methods. We have carried out extensive experiments on various TREC data sets to systematically evaluate the proposed methods. The experimental results demonstrate the effectiveness of our approaches in comparison with various strong baselines.
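The core linear-combination idea behind DSM can be sketched as follows: if the observed mixture is m = lam * r + (1 - lam) * s for a relevance distribution r and a seed irrelevance distribution s, then r can be recovered by subtraction. The mixing coefficient lam is assumed known here; DSM itself estimates it (e.g. via a correlation-based criterion), so this is an illustration of the separation step, not the full method.

```python
# Minimal sketch of linear distribution separation, assuming the
# mixing coefficient lam is given.
import numpy as np

def separate(mixture, seed_irrelevant, lam):
    """Approximate the relevance distribution from the mixture."""
    r = (mixture - (1.0 - lam) * seed_irrelevant) / lam
    r = np.clip(r, 0.0, None)         # keep probabilities non-negative
    return r / r.sum()                # renormalize to a distribution

m = np.array([0.40, 0.30, 0.20, 0.10])   # mixed query term distribution
s = np.array([0.10, 0.20, 0.30, 0.40])   # seed irrelevance distribution
print(separate(m, s, lam=0.6))
```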
Generalized bias-variance evaluation of TREC participated systems
Recent research has shown that improving mean retrieval effectiveness (e.g., MAP) may sacrifice retrieval stability across queries, implying a tradeoff between effectiveness and stability. The evaluation of both effectiveness and stability is often based on a baseline model, which could be weak or biased. In addition, the effectiveness-stability tradeoff has not been systematically or quantitatively evaluated over systems that participated in TREC. These two problems, to some extent, limit our awareness of the tradeoff and its impact on developing future IR models. In this paper, motivated by a recently proposed bias-variance based evaluation, we adopt a strong and unbiased "baseline": a virtual target model constructed from the best per-query performance among all participating systems in a retrieval task. We also propose generalized bias-variance metrics, based on which we carry out a systematic and quantitative evaluation of the effectiveness-stability tradeoff over the participating systems in the TREC Ad-hoc Track (1993-1999) and Web Track (2010-2012). We observe a clear effectiveness-stability tradeoff that has become more pronounced in recent years. This implies that as more effective IR systems have been pursued over the years, stability has become problematic and may have been largely overlooked.
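A hedged sketch of the evaluation idea: for each query, the target score is the best score any participating system achieved (the virtual model), a system's bias is its average shortfall from that target, and its variance is the instability of that shortfall across queries. The formulas below illustrate this idea, not the paper's exact generalized metrics.

```python
# Sketch of bias-variance evaluation against a virtual best baseline.
import numpy as np

def bias_variance(system_ap, all_systems_ap):
    """system_ap: (n_queries,); all_systems_ap: (n_systems, n_queries)."""
    target = all_systems_ap.max(axis=0)       # virtual best per query
    gap = target - system_ap                  # shortfall on each query
    return gap.mean(), gap.var()              # (bias, variance)

runs = np.array([[0.9, 0.4, 0.7],            # per-query AP of 3 systems
                 [0.6, 0.8, 0.5],
                 [0.7, 0.5, 0.9]])
print(bias_variance(runs[0], runs))          # evaluate the first system
```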
Generalized Analysis of a Distribution Separation Method
Separating two probability distributions from a mixture model that is a combination of the two is essential to a wide range of applications. For example, in information retrieval (IR), there often exists a mixture distribution consisting of a relevance distribution that we need to estimate and an irrelevance distribution that we hope to get rid of. Recently, a distribution separation method (DSM) was proposed to approximate the relevance distribution by separating a seed irrelevance distribution from the mixture distribution. It was successfully applied to an IR task, namely pseudo-relevance feedback (PRF), where the query expansion model is often a mixture term distribution. Although initially developed in the context of IR, DSM is in fact a general mathematical formulation for probability distribution separation. It is therefore important to further generalize its basic analysis and to explore its connections to related methods. In this article, we first extend DSM's theoretical analysis, which was originally based on the Pearson correlation coefficient, to entropy-related measures, including the KL-divergence (Kullback-Leibler divergence), the symmetrized KL-divergence and the JS-divergence (Jensen-Shannon divergence). Second, we investigate the distribution separation idea in a well-known method, namely the mixture model feedback (MMF) approach. We prove that MMF also complies with the linear combination assumption, and that DSM's linear separation algorithm can therefore largely simplify the EM algorithm in MMF. These theoretical analyses, together with further empirical evaluation results, demonstrate the advantages of our DSM approach.
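For reference, the entropy-related measures the extended analysis covers are standard and easy to compute for discrete distributions; the sketch below uses their textbook definitions, with a small epsilon guarding against log(0).

```python
# KL-divergence, symmetrized KL, and JS-divergence between two
# discrete probability distributions (standard definitions).
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def symmetrized_kl(p, q):
    return kl(p, q) + kl(q, p)

def js(p, q):
    m = (np.asarray(p) + np.asarray(q)) / 2.0
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
print(kl(p, q), symmetrized_kl(p, q), js(p, q))
```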
EEG Based Emotion Identification Using Unsupervised Deep Feature Learning
Capturing a user's emotional state is an emerging way to obtain implicit relevance feedback in information retrieval (IR). Recently, EEG-based emotion recognition has drawn increasing attention. However, a key challenge is effectively learning useful features from EEG signals. In this paper, we present our ongoing work on using a Deep Belief Network (DBN) to automatically extract high-level features from raw EEG signals. Our preliminary experiment on the DEAP dataset shows that the learned features perform comparably to manually engineered features for emotion recognition.
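A hedged sketch of unsupervised feature learning in the DBN spirit: a stack of restricted Boltzmann machines trained greedily layer by layer, here via scikit-learn's BernoulliRBM. The preprocessing of raw EEG into [0, 1]-scaled vectors and all layer sizes are assumptions for illustration, not the paper's actual architecture.

```python
# Greedy layer-wise RBM stacking for unsupervised EEG feature learning.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
eeg = rng.normal(size=(200, 128))            # 200 segments, 128 samples each
x = MinMaxScaler().fit_transform(eeg)        # RBMs expect inputs in [0, 1]

layer1 = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20)
layer2 = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20)

h1 = layer1.fit_transform(x)                 # greedy layer-wise pretraining
features = layer2.fit_transform(h1)          # high-level learned features
print(features.shape)                        # (200, 32) -> feed a classifier
```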
VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
With the emergence of large language models (LLMs) and vision foundation models, how to combine the intelligence and capacity of these open-sourced or API-available models to achieve open-world visual perception remains an open question. In this paper, we introduce VisionGPT to consolidate and automate the integration of state-of-the-art foundation models, thereby facilitating vision-language understanding and the development of vision-oriented AI. VisionGPT builds upon a generalized multimodal framework that distinguishes itself through three key features: (1) utilizing LLMs (e.g., LLaMA-2) as the pivot to break down users' requests into detailed action proposals that call suitable foundation models; (2) automatically integrating multi-source outputs from foundation models and generating comprehensive responses for users; (3) adaptability to a wide range of applications such as text-conditioned image understanding/generation/editing and visual question answering. This paper outlines the architecture and capabilities of VisionGPT, demonstrating its potential to revolutionize the field of computer vision through enhanced efficiency, versatility, generalization, and performance. Our code and models will be made publicly available.
Keywords: VisionGPT, Open-world visual perception, Vision-language understanding, Large language model, Foundation model
Comment: 17 pages, 5 figures, and 1 table. arXiv admin note: substantial text overlap with arXiv:2311.1012
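A hedged sketch of the "LLM as pivot" pattern the abstract describes: an LLM decomposes a user request into action proposals, each routed to a foundation model, and the outputs are merged into one response. All names below (plan_with_llm, REGISTRY, the Action schema) are hypothetical illustrations, not VisionGPT's actual API.

```python
# Toy orchestration loop: plan with an LLM, dispatch to foundation
# models, integrate the outputs. Foundation models are stubbed.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Action:
    model: str      # which foundation model to call, e.g. "detector"
    payload: str    # the sub-task handed to that model

# Registry of available foundation models (stubbed as plain functions)
REGISTRY: Dict[str, Callable[[str], str]] = {
    "detector": lambda p: f"[boxes for: {p}]",
    "captioner": lambda p: f"[caption for: {p}]",
}

def plan_with_llm(request: str) -> List[Action]:
    # Stand-in for prompting an LLM (e.g. LLaMA-2) to emit a plan.
    return [Action("detector", request), Action("captioner", request)]

def run(request: str) -> str:
    actions = plan_with_llm(request)                           # 1. decompose
    outputs = [REGISTRY[a.model](a.payload) for a in actions]  # 2. dispatch
    return " | ".join(outputs)                                 # 3. integrate

print(run("find and describe the dog in the image"))
```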