88 research outputs found
Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning
Many interesting problems in machine learning are being revisited with new
deep learning tools. For graph-based semi-supervised learning, a recent
important development is graph convolutional networks (GCNs), which nicely
integrate local vertex features and graph topology in the convolutional layers.
Although the GCN model compares favorably with other state-of-the-art methods,
its mechanisms are not clear and it still requires a considerable amount of
labeled data for validation and model selection. In this paper, we develop
deeper insights into the GCN model and address its fundamental limits. First,
we show that the graph convolution of the GCN model is actually a special form
of Laplacian smoothing, which is the key reason why GCNs work, but it also
brings potential concerns of over-smoothing with many convolutional layers.
Second, to overcome the limits of the GCN model with shallow architectures, we
propose both co-training and self-training approaches to train GCNs. Our
approaches significantly improve GCNs in learning with very few labels, and
exempt them from requiring additional labels for validation. Extensive
experiments on benchmarks have verified our theory and proposals.
Comment: AAAI-2018 Oral Presentation
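The Laplacian-smoothing view can be checked in a few lines. The sketch below is illustrative only (a toy path graph and hand-picked features, not the paper's code): it applies the GCN propagation operator, the symmetrically normalized adjacency with self-loops, many times without learned weights, and the node features collapse toward a degree-scaled constant, which is the over-smoothing concern the abstract raises.

```python
import numpy as np

# Toy undirected path graph on 4 nodes (edges 0-1, 1-2, 2-3).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0], [0.0], [0.0], [1.0]])  # one scalar feature per node

# GCN propagation operator: add self-loops, then normalize symmetrically.
A_hat = A + np.eye(4)
d = A_hat.sum(axis=1)
S = np.diag(d ** -0.5) @ A_hat @ np.diag(d ** -0.5)

# Stacking many (linear, weight-free) graph convolutions is repeated
# Laplacian smoothing: all node features converge, up to a degree scaling,
# to the same value, so nodes become indistinguishable.
H = X.copy()
for _ in range(50):
    H = S @ H

print(np.round((H / np.sqrt(d)[:, None]).ravel(), 4))  # near-constant
```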
Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision
Feature selection is essential for effective visual recognition. We propose
an efficient joint classifier learning and feature selection method that
discovers sparse, compact representations of input features from a vast sea of
candidates, with an almost unsupervised formulation. Our method requires only
the following knowledge, which we call the \emph{feature sign}---whether or not
a particular feature has on average stronger values over positive samples than
over negatives. We show how this can be estimated using as few as a single
labeled training sample per class. Then, using these feature signs, we extend
an initial supervised learning problem into an (almost) unsupervised clustering
formulation that can incorporate new data without requiring ground truth
labels. Our method works both as a feature selection mechanism and as a fully
competitive classifier. It has two important properties: low computational
cost and excellent accuracy, especially in difficult cases with very limited
training data. We experiment on large-scale recognition in video and show superior speed
and performance to established feature selection approaches such as AdaBoost,
Lasso, greedy forward-backward selection, and powerful classifiers such as SVM.
Comment: arXiv admin note: text overlap with arXiv:1411.771
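The feature-sign idea is easy to state concretely. Below is a minimal, self-contained sketch under stated assumptions: the toy matrix and the one-labeled-sample-per-class sign estimate are illustrative, not the paper's data or method details.

```python
import numpy as np

# Toy data: rows are samples, columns are candidate features. Feature 0 is
# higher on positives, feature 1 is higher on negatives, feature 2 is
# uninformative.
X = np.array([[0.9, 0.1, 0.5],
              [0.8, 0.2, 0.4],
              [0.7, 0.3, 0.6],
              [0.2, 0.8, 0.5],
              [0.1, 0.9, 0.4],
              [0.3, 0.7, 0.6]])
y = np.array([1, 1, 1, 0, 0, 0])  # used only to pick one sample per class

# The "feature sign": does the feature average higher over positives than
# over negatives? Estimated here from a single labeled sample per class.
pos_example, neg_example = X[0], X[3]
signs = np.sign(pos_example - neg_example)   # +1, -1, or 0 per feature

# Sign-corrected features all point toward the positive class; their mean
# is then an unsupervised score that needs no further labels.
scores = (X * signs).mean(axis=1)
pred = (scores > scores.mean()).astype(int)
print(pred)
```

On this toy set the score recovers the labels; the point is only that the sign, not the magnitude, of each feature's class difference is enough to get started.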
Enabling Un-/Semi-Supervised Machine Learning for MDSE of the Real-World CPS/IoT Applications
In this paper, we propose a novel approach to support domain-specific
Model-Driven Software Engineering (MDSE) for the real-world use-case scenarios
of smart Cyber-Physical Systems (CPS) and the Internet of Things (IoT). We
argue that the majority of the data available in nature for Artificial
Intelligence (AI), and specifically Machine Learning (ML), is unlabeled. Hence,
unsupervised and/or semi-supervised ML approaches are the practical choice.
However, prior work in the MDSE literature has considered supervised ML
approaches, which work only with labeled training data. Our proposed approach
is fully implemented and integrated with an existing state-of-the-art MDSE tool
to serve the CPS/IoT domain. Moreover, we validate the proposed approach using
a portion of the open data of the REFIT reference dataset for the smart energy
systems domain. Our model-to-code transformations (code generators) provide the
full source code of the desired IoT services out of the model instances in an
automated manner. Currently, we generate the source code in Java and Python.
The Python code is responsible for the ML functionalities and uses the APIs of
several ML libraries and frameworks, namely Scikit-Learn, Keras and TensorFlow.
For unsupervised and semi-supervised learning, the APIs of Scikit-Learn are
deployed. In addition to the pure MDSE approach, where certain ML methods,
e.g., K-Means, Mini-Batch K-Means, DBSCAN, Spectral Clustering, Gaussian
Mixture Model, Self-Training, Label Propagation and Label Spreading are
supported, a more flexible, hybrid approach is also enabled to support the
practitioner in deploying a pre-trained ML model with any arbitrary
architecture and learning algorithm.
Comment: Preliminary version
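As an illustration of the kind of Scikit-Learn semi-supervised API the generated Python code deploys, the snippet below fits LabelPropagation on a hand-made toy dataset (not output of the MDSE tool); scikit-learn's convention is that the label -1 marks an unlabeled sample.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Two well-separated 1-D clusters; only one sample per cluster is labeled.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y = np.array([0, -1, -1, 1, -1, -1])   # -1 = unlabeled

model = LabelPropagation(kernel="rbf", gamma=1.0)
model.fit(X, y)
print(model.transduction_)             # labels inferred for every sample
```

Self-Training, Label Spreading, and the clustering methods named in the abstract follow the same fit/predict pattern, which is what makes them amenable to code generation.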
Enhancing Domain Word Embedding via Latent Semantic Imputation
We present a novel method named Latent Semantic Imputation (LSI) to transfer
external knowledge into semantic space for enhancing word embedding. The method
integrates graph theory to extract the latent manifold structure of the
entities in the affinity space and leverages non-negative least squares with
standard simplex constraints and the power iteration method to derive spectral
embeddings. It provides an effective and efficient approach to combining entity
representations defined in different Euclidean spaces. Specifically, our
approach generates and imputes reliable embedding vectors for low-frequency
words in the semantic space and benefits downstream language tasks that depend
on word embedding. We conduct comprehensive experiments on a carefully designed
classification problem and language modeling and demonstrate the superiority of
the enhanced embedding via LSI over several well-known benchmark embeddings. We
also confirm the consistency of the results under different parameter settings
of our method.
Comment: ACM SIGKDD 201
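A toy sketch of the imputation step may help. In the sketch the simplex-constrained weights are hand-picked rather than fit by non-negative least squares, and the two "rare" words depend on each other, which is why the solve is iterative in the spirit of power iteration; none of the numbers come from the paper.

```python
import numpy as np

# Toy 2-D semantic space: three anchor words with known embeddings and two
# rare words (u, v) whose embeddings we impute.
known = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])

# Rows: rare words u, v. Columns: weights over [anchor0, anchor1, anchor2,
# u, v], non-negative and summing to 1 (standard simplex). In LSI such
# weights come from a simplex-constrained least-squares fit in the affinity
# space; here they are fixed by hand for illustration.
W = np.array([[0.4, 0.0, 0.3, 0.0, 0.3],
              [0.0, 0.4, 0.3, 0.3, 0.0]])
assert np.allclose(W.sum(axis=1), 1.0) and W.min() >= 0

# Power-iteration-style refinement: each rare embedding is repeatedly
# replaced by the convex combination of its neighbors (anchors and the
# other rare word) until the coupled system converges.
E = np.zeros((2, 2))
for _ in range(100):
    full = np.vstack([known, E])
    E = W @ full

print(np.round(E, 4))  # imputed embeddings lie in the anchors' convex hull
```

Because the rare-to-rare coupling weights are below 1, the iteration is a contraction and converges to the unique fixed point.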
Transfer Learning across Networks for Collective Classification
This paper addresses the problem of transferring useful knowledge from a
source network to predict node labels in a newly formed target network. While
existing transfer learning research has primarily focused on vector-based data,
in which the instances are assumed to be independent and identically
distributed, how to effectively transfer knowledge across different information
networks has not been well studied, mainly because networks may have their
distinct node features and link relationships between nodes. In this paper, we
propose a new transfer learning algorithm that attempts to transfer common
latent structure features across the source and target networks. The proposed
algorithm discovers these latent features by constructing label propagation
matrices in the source and target networks, and mapping them into a shared
latent feature space. The latent features capture common structure patterns
shared by two networks, and serve as domain-independent features to be
transferred between networks. Together with domain-dependent node features, we
thereafter propose an iterative classification algorithm that leverages label
correlations to predict node labels in the target network. Experiments on
real-world networks demonstrate that our proposed algorithm can successfully
achieve knowledge transfer between networks to help improve the accuracy of
classifying nodes in the target network.
Comment: Published in the proceedings of IEEE ICDM 201
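The label-propagation ingredient can be shown on a toy network. The sketch below is a generic clamped label-propagation scheme on a hand-made graph, not the authors' transfer algorithm: it builds the row-normalized propagation matrix and iterates until the scores of unlabeled nodes settle.

```python
import numpy as np

# Toy network: a path of 5 nodes; node 0 is labeled class 0, node 4 is
# labeled class 1, and the three middle nodes are unlabeled.
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

# Row-normalized label propagation matrix: each node takes the average of
# its neighbors' label scores.
P = A / A.sum(axis=1, keepdims=True)

F = np.zeros((5, 2))          # per-class label scores
F[0, 0] = 1.0                 # node 0: class 0
F[4, 1] = 1.0                 # node 4: class 1

for _ in range(200):          # propagate, re-clamping the labeled nodes
    F = P @ F
    F[0] = [1.0, 0.0]
    F[4] = [0.0, 1.0]

pred = F.argmax(axis=1)
print(np.round(F, 3))         # scores fade linearly between the two seeds
```

The converged score matrix (or the propagation matrix itself) is exactly the kind of structural object that can then be mapped into a shared latent space across two networks.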