951 research outputs found
VIGAN: Missing View Imputation with Generative Adversarial Networks
In an era when big data are becoming the norm, there is less concern with the
quantity but more with the quality and completeness of the data. In many
disciplines, data are collected from heterogeneous sources, resulting in
multi-view or multi-modal datasets. The missing data problem has been
challenging to address in multi-view data analysis. Especially, when certain
samples miss an entire view of data, it creates the missing view problem.
Classic multiple imputations or matrix completion methods are hardly effective
here when no information can be based on in the specific view to impute data
for such samples. The commonly-used simple method of removing samples with a
missing view can dramatically reduce sample size, thus diminishing the
statistical power of a subsequent analysis. In this paper, we propose a novel
approach for view imputation via generative adversarial networks (GANs), which
we name by VIGAN. This approach first treats each view as a separate domain and
identifies domain-to-domain mappings via a GAN using randomly-sampled data from
each view, and then employs a multi-modal denoising autoencoder (DAE) to
reconstruct the missing view from the GAN outputs based on paired data across
the views. Then, by optimizing the GAN and DAE jointly, our model enables the
knowledge integration for domain mappings and view correspondences to
effectively recover the missing view. Empirical results on benchmark datasets
validate the VIGAN approach by comparing against the state of the art. The
evaluation of VIGAN in a genetic study of substance use disorders further
proves the effectiveness and usability of this approach in life science.Comment: 10 pages, 8 figures, conferenc
A Comprehensive Survey on Deep Graph Representation Learning
Graph representation learning aims to effectively encode high-dimensional
sparse graph-structured data into low-dimensional dense vectors, which is a
fundamental task that has been widely studied in a range of fields, including
machine learning and data mining. Classic graph embedding methods follow the
basic idea that the embedding vectors of interconnected nodes in the graph can
still maintain a relatively close distance, thereby preserving the structural
information between the nodes in the graph. However, this is sub-optimal due
to: (i) traditional methods have limited model capacity which limits the
learning performance; (ii) existing techniques typically rely on unsupervised
learning strategies and fail to couple with the latest learning paradigms;
(iii) representation learning and downstream tasks are dependent on each other
which should be jointly enhanced. With the remarkable success of deep learning,
deep graph representation learning has shown great potential and advantages
over shallow (traditional) methods, there exist a large number of deep graph
representation learning techniques have been proposed in the past decade,
especially graph neural networks. In this survey, we conduct a comprehensive
survey on current deep graph representation learning algorithms by proposing a
new taxonomy of existing state-of-the-art literature. Specifically, we
systematically summarize the essential components of graph representation
learning and categorize existing approaches by the ways of graph neural network
architectures and the most recent advanced learning paradigms. Moreover, this
survey also provides the practical and promising applications of deep graph
representation learning. Last but not least, we state new perspectives and
suggest challenging directions which deserve further investigations in the
future
Source-Aware Embedding Training on Heterogeneous Information Networks
Heterogeneous information networks (HINs) have been extensively applied to
real-world tasks, such as recommendation systems, social networks, and citation
networks. While existing HIN representation learning methods can effectively
learn the semantic and structural features in the network, little awareness was
given to the distribution discrepancy of subgraphs within a single HIN.
However, we find that ignoring such distribution discrepancy among subgraphs
from multiple sources would hinder the effectiveness of graph embedding
learning algorithms. This motivates us to propose SUMSHINE (Scalable
Unsupervised Multi-Source Heterogeneous Information Network Embedding) -- a
scalable unsupervised framework to align the embedding distributions among
multiple sources of an HIN. Experimental results on real-world datasets in a
variety of downstream tasks validate the performance of our method over the
state-of-the-art heterogeneous information network embedding algorithms.Comment: Published in Data Intelligence 202
Towards Data-centric Graph Machine Learning: Review and Outlook
Data-centric AI, with its primary focus on the collection, management, and
utilization of data to drive AI models and applications, has attracted
increasing attention in recent years. In this article, we conduct an in-depth
and comprehensive review, offering a forward-looking outlook on the current
efforts in data-centric AI pertaining to graph data-the fundamental data
structure for representing and capturing intricate dependencies among massive
and diverse real-life entities. We introduce a systematic framework,
Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of
the graph data lifecycle, including graph data collection, exploration,
improvement, exploitation, and maintenance. A thorough taxonomy of each stage
is presented to answer three critical graph-centric questions: (1) how to
enhance graph data availability and quality; (2) how to learn from graph data
with limited-availability and low-quality; (3) how to build graph MLOps systems
from the graph data-centric view. Lastly, we pinpoint the future prospects of
the DC-GML domain, providing insights to navigate its advancements and
applications.Comment: 42 pages, 9 figure
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs that represent structural relations between entities have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graph covering overall research topics about 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph,
and 4) knowledge-aware applications, and summarize recent breakthroughs and
perspective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized from four aspects of representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, embedding methods, path inference, and logical rule
reasoning, are reviewed. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries on different tasks. In the
end, we have a thorough outlook on several promising research directions
Recent Advances in Deep Learning Techniques for Face Recognition
In recent years, researchers have proposed many deep learning (DL) methods
for various tasks, and particularly face recognition (FR) made an enormous leap
using these techniques. Deep FR systems benefit from the hierarchical
architecture of the DL methods to learn discriminative face representation.
Therefore, DL techniques significantly improve state-of-the-art performance on
FR systems and encourage diverse and efficient real-world applications. In this
paper, we present a comprehensive analysis of various FR systems that leverage
the different types of DL techniques, and for the study, we summarize 168
recent contributions from this area. We discuss the papers related to different
algorithms, architectures, loss functions, activation functions, datasets,
challenges, improvement ideas, current and future trends of DL-based FR
systems. We provide a detailed discussion of various DL methods to understand
the current state-of-the-art, and then we discuss various activation and loss
functions for the methods. Additionally, we summarize different datasets used
widely for FR tasks and discuss challenges related to illumination, expression,
pose variations, and occlusion. Finally, we discuss improvement ideas, current
and future trends of FR tasks.Comment: 32 pages and citation: M. T. H. Fuad et al., "Recent Advances in Deep
Learning Techniques for Face Recognition," in IEEE Access, vol. 9, pp.
99112-99142, 2021, doi: 10.1109/ACCESS.2021.309613
- …