Frequency Recognition in SSVEP-based BCI using Multiset Canonical Correlation Analysis
Canonical correlation analysis (CCA) has been one of the most popular methods
for frequency recognition in steady-state visual evoked potential (SSVEP)-based
brain-computer interfaces (BCIs). Despite its efficiency, a potential problem
is that the pre-constructed sine-cosine waves used as reference signals in the
CCA method often do not yield optimal recognition accuracy, because they lack
features of the real EEG data. To address this
problem, this study proposes a novel method based on multiset canonical
correlation analysis (MsetCCA) to optimize the reference signals used in the
CCA method for SSVEP frequency recognition. The MsetCCA method learns multiple
linear transforms that implement joint spatial filtering to maximize the
overall correlation among canonical variates, and hence extracts SSVEP common
features from multiple sets of EEG data recorded at the same stimulus
frequency. The optimized reference signals are formed by combination of the
common features and completely based on training data. Experimental study with
EEG data from ten healthy subjects demonstrates that the MsetCCA method
improves the recognition accuracy of SSVEP frequency in comparison with the CCA
method and two other competing methods (multiway CCA (MwayCCA) and phase
constrained CCA (PCCA)), especially for a small number of channels and a short
time window length. This superiority indicates that the proposed MsetCCA method
is a promising new candidate for frequency recognition in SSVEP-based BCIs.
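The CCA baseline described above can be sketched in a few lines: build sine-cosine reference signals for each candidate stimulus frequency, compute the largest canonical correlation between the multichannel EEG and each reference set, and pick the frequency with the highest score. This is a minimal illustration of the baseline, not the MsetCCA method itself; the sampling rate, harmonic count, and synthetic test signal are illustrative assumptions.

```python
import numpy as np

def max_canonical_corr(X, Y):
    """Largest canonical correlation between two data matrices (rows = samples)."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    # Canonical correlations are the singular values of Qx^T Qy.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

def make_reference(freq, fs, n_samples, n_harmonics=2):
    """Pre-constructed sine-cosine reference signals for one stimulus frequency."""
    t = np.arange(n_samples) / fs
    return np.column_stack([f(2 * np.pi * h * freq * t)
                            for h in range(1, n_harmonics + 1)
                            for f in (np.sin, np.cos)])

def recognize_frequency(eeg, candidate_freqs, fs):
    """Standard CCA frequency recognition: pick the best-correlated reference."""
    n = eeg.shape[0]
    scores = [max_canonical_corr(eeg, make_reference(f, fs, n))
              for f in candidate_freqs]
    return candidate_freqs[int(np.argmax(scores))]

# Demo on synthetic data: a noisy 10 Hz oscillation over three channels.
fs, n = 250, 500
t = np.arange(n) / fs
rng = np.random.default_rng(0)
eeg = np.column_stack([np.sin(2 * np.pi * 10.0 * t) + 0.3 * rng.standard_normal(n)
                       for _ in range(3)])
recognized = recognize_frequency(eeg, [8.0, 10.0, 12.0], fs)
```

MsetCCA differs in replacing `make_reference` with reference signals learned from multiple training trials recorded at the same stimulus frequency, which is where the reported accuracy gains come from.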
Computational strategies for single-cell multi-omics integration
Single-cell omics technologies are currently solving biological and medical problems that previously remained elusive, such as the discovery of new cell types, cellular differentiation trajectories, and communication networks across cells and tissues. Current advances, especially in single-cell multi-omics, hold high potential for breakthroughs by integrating multiple different omics layers. To keep pace with the recent biotechnological developments, many computational approaches to process and analyze single-cell multi-omics data have been proposed. In this review, we first introduce recent developments in single-cell multi-omics in general and then focus on the available data integration strategies. The integration approaches are divided into three categories: early, intermediate, and late data integration. For each category, we describe the underlying conceptual principles and main characteristics, as well as provide examples of currently available tools and how they have been applied to analyze single-cell multi-omics data. Finally, we explore the challenges and prospective future directions of single-cell multi-omics data integration, including examples of adopting multi-view analysis approaches used in other disciplines to single-cell multi-omics.
Multi-view Data Analysis
Multi-view data analysis is a key technology for making effective decisions by leveraging information from multiple data sources. The process of data acquisition across various sensory modalities gives rise to the heterogeneous property of data. In my thesis, multi-view data representations are studied with the aim of exploiting the enriched information encoded in different domains or feature types, and novel algorithms are formulated to enhance feature discriminability. Extracting informative data representations is a critical step in visual recognition and data mining tasks. Multi-view embeddings provide a new way of representation learning that bridges the semantic gap between low-level observations and high-level human-comprehensible knowledge, benefiting from the enriched information in multiple modalities. Recent advances in multi-view learning have introduced a new paradigm for jointly modeling cross-modal data. Subspace learning, which extracts compact features by exploiting a common latent space to fuse multi-view information, has emerged as a prominent category of multi-view learning techniques. This thesis provides novel solutions for learning compact and discriminative multi-view data representations by exploiting data structures in a low-dimensional subspace. We also demonstrate the performance of the learned representation scheme on a number of challenging recognition, retrieval, and ranking tasks. The major contribution of the thesis is a unified solution for subspace learning methods, which is extensible to multiple views, supervised learning, and non-linear transformations. Traditional statistical learning techniques, including Canonical Correlation Analysis, Partial Least Squares regression, and Linear Discriminant Analysis, are studied by constructing graphs of specific forms under the same framework.
Methods using non-linear transforms based on kernels and (deep) neural networks are derived, which lead to superior performance compared to the linear ones. First, a novel multi-view discriminant embedding method is proposed that takes the view difference into consideration. Second, a multi-view nonparametric discriminant analysis method is introduced that exploits the class boundary structure and discrepancy information of the available views. This allows for multiple projection directions by relaxing the Gaussian distribution assumption of related methods. Third, we propose a composite ranking method that keeps a close correlation with the individual rankings for optimal rank fusion. We propose a multi-objective solution to ranking problems by capturing inter-view and intra-view information using autoencoder-like networks. Finally, a novel end-to-end solution is introduced to enhance joint ranking with minimum view-specific ranking loss, so that we can achieve maximum global view agreement within a single optimization process. In summary, this thesis aims to address the challenges in representing multi-view data across different tasks. The proposed solutions have shown superior performance in numerous tasks, including object recognition, cross-modal image retrieval, face recognition, and object ranking.
Multiview Learning with Sparse and Unannotated Data
PhD thesis.
Obtaining annotated training data for supervised learning is a bottleneck in many
contemporary machine learning applications. The increasing prevalence of multi-modal
and multi-view data creates both new opportunities for circumventing this issue, and
new application challenges. In this thesis we explore several approaches to alleviating
annotation issues in multi-view scenarios.
We start by studying the problem of zero-shot learning (ZSL) for image recognition,
where class-level annotations are eliminated by transferring information
from the text modality instead. We next look at cross-modal matching, where
paired instances across views provide the supervised label information for learning. We
develop methodology for unsupervised and semi-supervised learning of pairing,
thus eliminating the need for manual annotation.
We first apply these ideas to unsupervised multi-view matching in the context of
bilingual dictionary induction (BLI), where instances are words in two languages and
finding a correspondence between the words produces a cross-lingual word translation
model. We then return to vision and language and look at learning unsupervised pairing
between images and text. We show that this can be viewed as a limiting case of ZSL
where text-image pairing annotation requirements are completely eliminated.
Overall these contributions in multi-view learning provide a suite of methods for
reducing annotation requirements: both in conventional classification and cross-view
matching settings.
Latent Factor Analysis of High-Dimensional Brain Imaging Data
Recent advances in neuroimaging, especially functional magnetic resonance imaging (fMRI), have made it an important tool in understanding the human brain. Human cognitive functions can be mapped to the brain's functional organization through high-resolution fMRI scans. However, the high-dimensional data arising from the increasing number of scanning tasks and subjects pose a challenge to existing methods that were not optimized for high-dimensional imaging data. In this thesis, I develop advanced data-driven methods that utilize more of the available sources of information in order to reveal a more robust brain-behavior relationship. In the first chapter, I provide an overview of the current related research in fMRI and my contributions to the field. In the second chapter, I propose two extensions to the connectome-based predictive modeling (CPM) method that are able to combine multiple connectomes when building predictive models. Both extensions generate higher prediction accuracy than using a single connectome or the average of multiple connectomes, suggesting the advantage of incorporating multiple sources of information in predictive modeling. In the third chapter, I improve CPM from the perspective of the target behavioral measure. I propose another two extensions of CPM that combine multiple available behavioral measures into a composite measure for CPM to predict. The derived composite measures are shown to be predicted more accurately than any single behavioral measure, suggesting a more robust brain-behavior relationship. In the fourth chapter, I propose a nonlinear dimensionality reduction framework to embed fMRI data from multiple tasks into a low-dimensional space. This framework helps reveal the common brain state across the available tasks while also helping discover the differences among them.
The results also provide valuable insights into the varying prediction performance of connectomes from different tasks. In the fifth chapter, I propose a hyperbolic geometry-based brain graph edge embedding framework. The framework is based on Poincaré embedding and is able to represent edges of the brain graph in a low-dimensional space more accurately than traditional Euclidean geometry-based embeddings. Utilizing the embedding, we are able to cluster edges of the brain graph into disjoint clusters. The edge clusters can then be used to define overlapping brain networks, and derived metrics such as the network overlapping number can be used to investigate the functional flexibility of each brain region. Overall, these works provide rich data-driven methods that help understand the brain-behavior relationship through predictive modeling and low-dimensional data representation.
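The key quantity behind a Poincaré embedding is the geodesic distance in the unit ball, which has a closed form. The numpy sketch below (with arbitrary example points) illustrates the property that makes hyperbolic space attractive for graph data: the same Euclidean gap costs far more distance near the boundary than near the origin, leaving exponential room for tree-like structure.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    # d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps)))

# The same Euclidean gap of 0.05 costs far more near the boundary.
near_origin = poincare_distance(np.array([0.0, 0.0]), np.array([0.05, 0.0]))
near_boundary = poincare_distance(np.array([0.9, 0.0]), np.array([0.95, 0.0]))
```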
Method for Enabling Causal Inference in Relational Domains
The analysis of data from complex systems is quickly becoming a fundamental aspect of modern business, government, and science. The field of causal learning is concerned with developing a set of statistical methods that allow practitioners to make inferences about unseen interventions. This field has seen significant advances in recent years. However, the vast majority of this work assumes that data instances are independent, whereas many systems are best described in terms of interconnected instances, i.e., relational systems. This discrepancy prevents causal inference techniques from being reliably applied in many real-world settings. In this thesis, I present three contributions to the field of causal inference that seek to enable the analysis of relational systems. First, I present theory for consistently testing statistical dependence in relational domains. I then show how the significance of this test can be measured in practice using a novel bootstrap method for structured domains. Second, I show that statistical dependence in relational domains is inherently asymmetric, implying a simple test of causal direction from observational data. This test requires no assumptions on either the marginal distributions of variables or the functional form of dependence. Third, I describe relational causal adjustment, a procedure to identify the effects of arbitrary interventions from observational relational data via an extension of Pearl's backdoor criterion. A series of evaluations on synthetic domains shows that the estimates obtained by relational causal adjustment are close to those obtained from explicit experimentation.
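The backdoor criterion that relational causal adjustment extends can be illustrated in the classic i.i.d. setting: conditioning on a confounder Z blocks the backdoor path X ← Z → Y, so the regression coefficient of X recovers the causal effect. The simulation below is a hypothetical toy example (coefficients and sample size are arbitrary choices), not the relational procedure from the thesis.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000
z = rng.standard_normal(n)                       # confounder
x = 1.5 * z + rng.standard_normal(n)             # treatment, influenced by z
y = 2.0 * x + 3.0 * z + rng.standard_normal(n)   # true causal effect of x on y is 2.0

# Naive regression of y on x is biased by the open backdoor path x <- z -> y.
naive_slope = np.polyfit(x, y, 1)[0]

# Backdoor adjustment: include z as a regressor to block the path.
design = np.column_stack([x, z, np.ones(n)])
adjusted_slope = np.linalg.lstsq(design, y, rcond=None)[0][0]
```

With these coefficients the naive slope concentrates around 2 + 3·cov(x, z)/var(x) = 2 + 4.5/3.25 ≈ 3.38, while the adjusted slope concentrates around the true effect 2.0.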
Statistical Methods in Integrative Genomics
Statistical methods in integrative genomics aim to answer important biological questions by jointly analyzing multiple types of genomic data (vertical integration) or by aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions.
A Comprehensive Survey on Deep Graph Representation Learning
Graph representation learning aims to effectively encode high-dimensional
sparse graph-structured data into low-dimensional dense vectors, which is a
fundamental task that has been widely studied in a range of fields, including
machine learning and data mining. Classic graph embedding methods follow the
basic idea that the embedding vectors of interconnected nodes in the graph can
still maintain a relatively close distance, thereby preserving the structural
information between the nodes in the graph. However, this is sub-optimal
because: (i) traditional methods have limited model capacity, which restricts
learning performance; (ii) existing techniques typically rely on unsupervised
learning strategies and fail to couple with the latest learning paradigms;
(iii) representation learning and downstream tasks depend on each other
and should be jointly enhanced. With the remarkable success of deep learning,
deep graph representation learning has shown great potential and advantages
over shallow (traditional) methods, and a large number of deep graph
representation learning techniques, especially graph neural networks, have
been proposed in the past decade. In this survey, we comprehensively review
current deep graph representation learning algorithms by proposing a
new taxonomy of existing state-of-the-art literature. Specifically, we
systematically summarize the essential components of graph representation
learning and categorize existing approaches by the ways of graph neural network
architectures and the most recent advanced learning paradigms. Moreover, this
survey also provides the practical and promising applications of deep graph
representation learning. Last but not least, we state new perspectives and
suggest challenging directions that deserve further investigation.
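The classic embedding idea the survey describes, keeping interconnected nodes close in the vector space, is exemplified by Laplacian eigenmaps, a shallow spectral method. Below is a minimal numpy sketch on a made-up toy graph (two triangles joined by a bridge edge); it stands in for the "classic graph embedding methods" the survey contrasts with deep approaches.

```python
import numpy as np

def laplacian_eigenmap(adj, dim=2):
    """Embed graph nodes so that connected nodes stay close (Laplacian eigenmaps)."""
    lap = np.diag(adj.sum(axis=1)) - adj   # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(lap)          # eigenvectors, ascending eigenvalues
    return vecs[:, 1:dim + 1]              # skip the constant (zero-eigenvalue) vector

# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge edge (2, 3).
adj = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

emb = laplacian_eigenmap(adj)
fiedler = emb[:, 0]   # the Fiedler coordinate separates the two clusters by sign
```

Deep methods such as graph neural networks replace this fixed spectral map with a learned encoder that can be coupled to supervision and downstream tasks, which is exactly the limitation (i)-(iii) above.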