Efficient Deep Learning on Multi-Source Private Data
Machine learning models benefit from large and diverse datasets. Using such
datasets, however, often requires trusting a centralized data aggregator. For
sensitive applications like healthcare and finance this is undesirable as it
could compromise patient privacy or divulge trade secrets. Recent advances in
secure and privacy-preserving computation, including trusted hardware enclaves
and differential privacy, offer a way for mutually distrusting parties to
efficiently train a machine learning model without revealing the training data.
In this work, we introduce Myelin, a deep learning framework which combines
these privacy-preservation primitives, and use it to establish a baseline level
of performance for fully private machine learning.
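As a concrete illustration of one of the privacy primitives mentioned above, the sketch below shows a differentially private SGD step (per-sample gradient clipping plus Gaussian noise) in PyTorch. The function name, hyperparameters, and toy model are illustrative assumptions and do not reflect Myelin's actual API.

```python
# Hedged sketch: one step of differentially private SGD. All names and
# hyperparameters are hypothetical; this is not Myelin's actual interface.
import torch
import torch.nn as nn

def dp_sgd_step(model, loss_fn, batch_x, batch_y, lr=0.1,
                clip_norm=1.0, noise_multiplier=1.1):
    """Clip each per-sample gradient, add Gaussian noise, apply the update."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(batch_x, batch_y):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        # Bound each example's influence by clipping its gradient norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for acc, g in zip(summed, grads):
            acc += g * scale
    with torch.no_grad():
        for p, g in zip(model.parameters(), summed):
            # Gaussian noise calibrated to the clipping norm provides the DP guarantee.
            noise = torch.randn_like(g) * noise_multiplier * clip_norm
            p -= lr * (g + noise) / len(batch_x)

# Toy usage on a hypothetical two-class problem.
model = nn.Sequential(nn.Linear(20, 2))
dp_sgd_step(model, nn.CrossEntropyLoss(),
            torch.randn(32, 20), torch.randint(0, 2, (32,)))
```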
Learning Robust, Transferable Sentence Representations for Text Classification
Although deep recurrent neural networks (RNNs) demonstrate strong performance
in text classification, training RNN models is often expensive and requires an
extensive collection of annotated data which may not be available. To overcome
the data limitation issue, existing approaches leverage either pre-trained word
embedding or sentence representation to lift the burden of training RNNs from
scratch. In this paper, we show that jointly learning sentence representations
from multiple text classification tasks and combining them with pre-trained
word-level and sentence-level encoders results in robust sentence
representations that are useful for transfer learning. Extensive experiments
and analyses using a wide range of transfer and linguistic tasks endorse the
effectiveness of our approach.
Comment: arXiv admin note: substantial text overlap with arXiv:1804.0791
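The sketch below illustrates the general idea of jointly learning sentence representations across several text classification tasks: a shared encoder with one classification head per task. The module names, encoder choice, and dimensions are illustrative assumptions rather than the paper's exact architecture.

```python
# Hedged sketch of multi-task sentence-representation learning: a shared
# encoder plus per-task heads. Sizes and names are illustrative only.
import torch
import torch.nn as nn

class MultiTaskSentenceEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, task_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Shared BiLSTM encoder; max-pooling over time yields the sentence vector.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # One linear classification head per text-classification task.
        self.heads = nn.ModuleDict({
            task: nn.Linear(2 * hidden_dim, n_cls)
            for task, n_cls in task_classes.items()
        })

    def encode(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))
        return states.max(dim=1).values          # (batch, 2 * hidden_dim)

    def forward(self, token_ids, task):
        return self.heads[task](self.encode(token_ids))

# Training alternates mini-batches across tasks so the shared encoder
# absorbs signal from all of them.
model = MultiTaskSentenceEncoder(vocab_size=10000, embed_dim=128,
                                 hidden_dim=256,
                                 task_classes={"sentiment": 2, "topic": 5})
logits = model(torch.randint(0, 10000, (4, 20)), task="sentiment")
```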
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) aims to leverage the knowledge learned
from a labeled source dataset to solve similar tasks in a new unlabeled domain.
Prior UDA methods typically require access to the source data when learning to
adapt the model, making them risky and inefficient for decentralized private
data. This work tackles a practical setting where only a trained source model
is available and investigates how we can effectively utilize such a model
without source data to solve UDA problems. We propose a simple yet generic
representation learning framework, named \emph{Source HypOthesis Transfer}
(SHOT). SHOT freezes the classifier module (hypothesis) of the source model and
learns the target-specific feature extraction module by exploiting both
information maximization and self-supervised pseudo-labeling to implicitly
align representations from the target domains to the source hypothesis. To
verify its versatility, we evaluate SHOT in a variety of adaptation cases
including closed-set, partial-set, and open-set domain adaptation. Experiments
indicate that SHOT yields state-of-the-art results on multiple domain
adaptation benchmarks.
Comment: ICML2020. Fix the typos for Digits. Code is available at
https://github.com/tim-learn/SHO
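As a rough sketch of the adaptation objective described above, the snippet below implements an information-maximization loss (per-sample entropy minimization plus a batch-level diversity term) that could be used to train the target feature extractor while the source hypothesis stays frozen. Function and variable names are illustrative, not SHOT's released code.

```python
# Hedged sketch of an information-maximization objective for source-free
# adaptation; names are illustrative and do not mirror SHOT's released code.
import torch
import torch.nn.functional as F

def information_maximization_loss(logits, eps=1e-6):
    """Per-sample entropy minimization plus a batch-level diversity term."""
    probs = F.softmax(logits, dim=1)
    # Push each target prediction toward a confident (low-entropy) output.
    entropy = -(probs * torch.log(probs + eps)).sum(dim=1).mean()
    # Penalize collapsed solutions by spreading predictions across classes.
    mean_probs = probs.mean(dim=0)
    diversity = (mean_probs * torch.log(mean_probs + eps)).sum()
    return entropy + diversity

# Adaptation step (sketch): only the feature extractor receives gradients,
# while the source classifier (the "hypothesis") stays frozen.
# classifier.requires_grad_(False)
# loss = information_maximization_loss(classifier(feature_extractor(x_target)))
# loss.backward(); optimizer.step()
```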
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
With the breakthroughs in deep learning, the recent years have witnessed a
booming of artificial intelligence (AI) applications and services, spanning
from personal assistant to recommendation systems to video/audio surveillance.
More recently, with the proliferation of mobile computing and
Internet-of-Things (IoT), billions of mobile and IoT devices are connected to
the Internet, generating zillions of bytes of data at the network edge. Driven by
this trend, there is an urgent need to push the AI frontiers to the network
edge so as to fully unleash the potential of the edge big data. To meet this
demand, edge computing, an emerging paradigm that pushes computing tasks and
services from the network core to the network edge, has been widely recognized
as a promising solution. The resulting new interdisciplinary field, edge AI or edge
intelligence, is beginning to receive a tremendous amount of interest. However,
research on edge intelligence is still in its infancy stage, and a dedicated
venue for exchanging the recent advances of edge intelligence is highly desired
by both the computer system and artificial intelligence communities. To this
end, we conduct a comprehensive survey of the recent research efforts on edge
intelligence. Specifically, we first review the background and motivation for
artificial intelligence running at the network edge. We then provide an
overview of the overarching architectures, frameworks and emerging key
technologies for deep learning model training and inference at the network
edge. Finally, we discuss future research opportunities on edge intelligence.
We believe that this survey will attract escalating attention, stimulate
fruitful discussions and inspire further research ideas on edge intelligence.
Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang,
"Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge
Computing," Proceedings of the IEE
Edge Intelligence: The Confluence of Edge Computing and Artificial Intelligence
Along with the rapid developments in communication technologies and the surge
in the use of mobile devices, a brand-new computation paradigm, Edge Computing,
is surging in popularity. Meanwhile, Artificial Intelligence (AI) applications
are thriving with the breakthroughs in deep learning and the many improvements
in hardware architectures. Billions of data bytes, generated at the network
edge, put massive demands on data processing and structural optimization. Thus,
there exists a strong demand to integrate Edge Computing and AI, which gives
birth to Edge Intelligence. In this paper, we divide Edge Intelligence into AI
for edge (Intelligence-enabled Edge Computing) and AI on edge (Artificial
Intelligence on Edge). The former focuses on providing better solutions
to key problems in Edge Computing with the help of popular and effective AI
technologies while the latter studies how to carry out the entire process of
building AI models, i.e., model training and inference, on the edge. This paper
provides insights into this new inter-disciplinary field from a broader
perspective. It discusses the core concepts and the research road-map, which
should provide the necessary background for potential future research
initiatives in Edge Intelligence.
Comment: 13 pages, 3 figures
Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks
Learning distributed sentence representations is one of the key challenges in
natural language processing. Previous work demonstrated that a recurrent neural
network (RNN) based sentence encoder trained on a large collection of
annotated natural language inference data is effective for transfer
learning to facilitate other related tasks. In this paper, we show that joint
learning of multiple tasks results in better generalizable sentence
representations by conducting extensive experiments and analysis comparing the
multi-task and single-task learned sentence encoders. The quantitative analysis
using auxiliary tasks shows that multi-task learning helps to embed better
semantic information in the sentence representations compared to single-task
learning. In addition, we compare multi-task sentence encoders with
contextualized word representations and show that combining both of them can
further boost the performance of transfer learning.
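A minimal sketch of the combination discussed above: concatenating a multi-task sentence embedding with a pooled contextualized word representation before a downstream classifier. Dimensions and names are illustrative assumptions.

```python
# Hedged sketch: fuse a sentence embedding with pooled contextualized word
# vectors for a downstream classifier. Dimensions are illustrative only.
import torch
import torch.nn as nn

class CombinedClassifier(nn.Module):
    def __init__(self, sent_dim, ctx_dim, num_classes):
        super().__init__()
        self.head = nn.Linear(sent_dim + ctx_dim, num_classes)

    def forward(self, sent_emb, ctx_word_states):
        # Mean-pool the contextualized word vectors into one sentence vector.
        pooled_ctx = ctx_word_states.mean(dim=1)
        return self.head(torch.cat([sent_emb, pooled_ctx], dim=1))

clf = CombinedClassifier(sent_dim=512, ctx_dim=768, num_classes=3)
logits = clf(torch.randn(4, 512), torch.randn(4, 20, 768))
```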
Deep Learning Towards Mobile Applications
Recent years have witnessed an explosive growth of mobile devices. Mobile
devices are permeating every aspect of our daily lives. With the increasing
usage of mobile devices and intelligent applications, there is a soaring demand
for mobile applications with machine learning services. Inspired by the
tremendous success achieved by deep learning in many machine learning tasks, it
becomes a natural trend to push deep learning towards mobile applications.
However, there exist many challenges to realize deep learning in mobile
applications, including the contradiction between the miniature nature of
mobile devices and the resource requirement of deep neural networks, the
privacy and security concerns about individuals' data, and so on. To resolve
these challenges, during the past few years, great leaps have been made in this
area. In this paper, we provide an overview of the current challenges and
representative achievements about pushing deep learning on mobile devices from
three aspects: training with mobile data, efficient inference on mobile
devices, and applications of mobile deep learning. The former two aspects cover
the primary tasks of deep learning. Then, we go through our two recent
applications that use the data collected by mobile devices to infer mood
disturbance and perform user identification. Finally, we conclude this paper with a
discussion of the future of this area.
Comment: Conference version accepted by ICDCS'1
Joint auto-encoders: a flexible multi-task learning framework
The incorporation of prior knowledge into learning is essential in achieving
good performance based on small noisy samples. Such knowledge is often
incorporated through the availability of related data arising from domains and
tasks similar to the one of current interest. Ideally one would like to allow
both the data for the current task and for previous related tasks to
self-organize the learning system in such a way that commonalities and
differences between the tasks are learned in a data-driven fashion. We develop
a framework for learning multiple tasks simultaneously, based on sharing
features that are common to all tasks, achieved through the use of a modular
deep feedforward neural network consisting of shared branches, dealing with the
common features of all tasks, and private branches, learning the specific
unique aspects of each task. Once an appropriate weight sharing architecture
has been established, learning takes place through standard algorithms for
feedforward networks, e.g., stochastic gradient descent and its variations. The
method deals with domain adaptation and multi-task learning in a unified
fashion, and can easily deal with data arising from different types of sources.
Numerical experiments demonstrate the effectiveness of learning in domain
adaptation and transfer learning setups, and provide evidence for the flexible
and task-oriented representations arising in the network.
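The sketch below illustrates the shared/private branch structure described in the abstract: a branch shared by all tasks alongside one private branch per task, trained jointly with standard feedforward-network optimizers. Layer sizes and task names are illustrative assumptions.

```python
# Hedged sketch of a modular network with a shared branch (commonalities
# across tasks) and private branches (task-specific aspects). Sizes and
# task names are illustrative only.
import torch
import torch.nn as nn

class SharedPrivateNet(nn.Module):
    def __init__(self, in_dim, shared_dim, private_dim, task_out_dims):
        super().__init__()
        # Shared branch: features common to all tasks.
        self.shared = nn.Sequential(nn.Linear(in_dim, shared_dim), nn.ReLU())
        # Private branches: one per task, capturing its unique aspects.
        self.private = nn.ModuleDict({
            t: nn.Sequential(nn.Linear(in_dim, private_dim), nn.ReLU())
            for t in task_out_dims})
        self.heads = nn.ModuleDict({
            t: nn.Linear(shared_dim + private_dim, d)
            for t, d in task_out_dims.items()})

    def forward(self, x, task):
        feats = torch.cat([self.shared(x), self.private[task](x)], dim=1)
        return self.heads[task](feats)

net = SharedPrivateNet(in_dim=100, shared_dim=64, private_dim=32,
                       task_out_dims={"task_a": 10, "task_b": 3})
y = net(torch.randn(8, 100), task="task_a")
```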
Confidential Inference via Ternary Model Partitioning
Today's cloud vendors are competing to provide various offerings to simplify
and accelerate AI service deployment. However, cloud users always have concerns
about the confidentiality of their runtime data, which are supposed to be
processed on third-party compute infrastructures. Information disclosure of
user-supplied data may jeopardize users' privacy and breach increasingly
stringent data protection regulations. In this paper, we systematically
investigate the life cycles of inference inputs in deep learning image
classification pipelines and understand how the information could be leaked.
Based on the discovered insights, we develop a Ternary Model Partitioning
mechanism and bring trusted execution environments to mitigate the identified
information leakages. Our research prototype consists of two co-operative
components: (1) Model Assessment Framework, a local model evaluation and
partitioning tool that assists cloud users in deployment preparation; (2)
Infenclave, an enclave-based model serving system for online confidential
inference in the cloud. We have conducted comprehensive security and
performance evaluation on three representative ImageNet-level deep learning
models with different network depths and architectural complexity. Our results
demonstrate the feasibility of launching confidential inference services in the
cloud with maximized confidentiality guarantees and low performance costs.
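As a rough illustration of model partitioning for enclave-based serving, the sketch below splits a sequential model into three segments so that the input-facing segment could run inside a trusted execution environment. The split points and deployment mapping are assumptions for illustration, not the output of the paper's Model Assessment Framework.

```python
# Hedged sketch: split a model's layers into three sequential segments; the
# front segment (which sees raw inputs) would run inside an enclave. Split
# points and the toy model are illustrative assumptions.
import torch
import torch.nn as nn

def partition_ternary(layers, cut1, cut2):
    """Split an ordered list of layers into (front, middle, back) segments."""
    front = nn.Sequential(*layers[:cut1])
    middle = nn.Sequential(*layers[cut1:cut2])
    back = nn.Sequential(*layers[cut2:])
    return front, middle, back

layers = [nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
          nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10)]
front, middle, back = partition_ternary(layers, cut1=2, cut2=4)

x = torch.randn(1, 3, 32, 32)
# front(x) would execute inside the trusted enclave; only its intermediate
# activations cross the trust boundary to the untrusted segments.
logits = back(middle(front(x)))
```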
Seesaw-Net: Convolution Neural Network With Uneven Group Convolution
In this paper, we are interested in boosting the representation capability of
convolutional neural networks that utilize the inverted residual structure.
Based on the success of the Inverted Residual structure [Sandler et al. 2018] and
Interleaved Low-Rank Group Convolutions [Sun et al. 2018], we rethink these two
patterns of neural network structure. Rather than using NAS (neural architecture
search) methods [Zoph and Le 2017; Pham et al. 2018; Liu et al. 2018b], we
introduce uneven point-wise group convolution, which provides a novel search
space for designing basic blocks to obtain a better trade-off between
representation capability and computational cost. Meanwhile, we propose two
novel information flow patterns that will enable cross-group information flow
for multiple group convolution layers with and without any channel
permute/shuffle operation. Extensive experiments on image classification tasks show
that our proposed model, named Seesaw-Net, achieves state-of-the-art (SOTA)
performance with limited computation and memory cost. Our code will be
open-sourced and made available together with pre-trained models.
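The sketch below illustrates an uneven point-wise group convolution followed by a channel shuffle that lets information flow across groups; the particular channel split and shuffle used here are illustrative assumptions, not Seesaw-Net's exact block design.

```python
# Hedged sketch: uneven point-wise (1x1) group convolution plus a channel
# shuffle for cross-group information flow. The split sizes are illustrative.
import torch
import torch.nn as nn

class UnevenPointwiseGroupConv(nn.Module):
    def __init__(self, in_splits, out_splits):
        """in_splits/out_splits: channel counts per (uneven) group, e.g. [16, 48]."""
        super().__init__()
        self.in_splits = in_splits
        self.convs = nn.ModuleList(
            nn.Conv2d(ci, co, kernel_size=1)
            for ci, co in zip(in_splits, out_splits))

    def forward(self, x):
        # Apply a separate 1x1 convolution to each unevenly sized channel group.
        chunks = torch.split(x, self.in_splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)

def channel_shuffle(x, groups):
    """Permute channels so information mixes across groups in the next layer."""
    n, c, h, w = x.shape
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

block = UnevenPointwiseGroupConv(in_splits=[16, 48], out_splits=[32, 96])
y = channel_shuffle(block(torch.randn(1, 64, 28, 28)), groups=2)
```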