A Very Brief Introduction to Machine Learning With Applications to Communication Systems
Given the unprecedented availability of data and computing resources, there
is widespread renewed interest in applying data-driven machine learning methods
to problems for which the development of conventional engineering solutions is
challenged by modelling or algorithmic deficiencies. This tutorial-style paper
starts by addressing the questions of why and when such techniques can be
useful. It then provides a high-level introduction to the basics of supervised
and unsupervised learning. For both supervised and unsupervised learning,
illustrative applications to communication networks are discussed,
distinguishing between tasks carried out at the edge and at the cloud segments
of the network, at different layers of the protocol stack.
Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection
Real-life applications, heavily relying on machine learning, such as dialog
systems, demand out-of-domain detection methods. Intent classification models
should be equipped with a mechanism to distinguish seen intents from unseen
ones so that the dialog agent is capable of rejecting the latter and avoiding
undesired behavior. However, despite increasing attention paid to the task, the
best practices for out-of-domain intent detection have not yet been fully
established.
This paper conducts a thorough comparison of out-of-domain intent detection
methods. We prioritize methods that do not require access to out-of-domain
data during training, since gathering such data is extremely time- and
labor-consuming due to the lexical and stylistic variation of user utterances.
We evaluate multiple contextual encoders and established, effective methods on three standard
datasets for intent classification, expanded with out-of-domain utterances. Our
main findings show that fine-tuning Transformer-based encoders on in-domain
data leads to superior results. The Mahalanobis distance, applied to utterance
representations derived from Transformer-based encoders, outperforms other
methods by a wide margin and establishes new state-of-the-art results for all
datasets.
Broader analysis shows that this success stems from the fine-tuned
Transformer constructing homogeneous representations of in-domain utterances
that are geometrically separated from out-of-domain utterances; the
Mahalanobis distance captures this disparity easily.
Comment: to appear in AAAI 202
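A minimal sketch of the scoring idea described above, assuming a common formulation (class centroids with a shared covariance over fine-tuned encoder embeddings); the function name, data shapes, and tie-to-nearest-centroid choice are illustrative, not the paper's exact procedure:

```python
import numpy as np

def mahalanobis_ood_scores(train_emb, train_labels, test_emb):
    """Score test utterances by Mahalanobis distance to the nearest
    in-domain class centroid, using a covariance shared across classes.
    Higher score = farther from every in-domain class = more likely OOD."""
    classes = np.unique(train_labels)
    centroids = {c: train_emb[train_labels == c].mean(axis=0) for c in classes}
    # Shared covariance of class-centered in-domain embeddings.
    centered = np.concatenate(
        [train_emb[train_labels == c] - centroids[c] for c in classes])
    cov = centered.T @ centered / len(train_emb)
    prec = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    scores = []
    for x in test_emb:
        d = min(float((x - mu) @ prec @ (x - mu)) for mu in centroids.values())
        scores.append(d)
    return np.array(scores)
```

An utterance close to any in-domain cluster receives a small score, while one far from all clusters receives a large score, which is exactly the disparity the abstract attributes to fine-tuned representations.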
ICF-SRSR: Invertible scale-Conditional Function for Self-Supervised Real-world Single Image Super-Resolution
Single image super-resolution (SISR) is a challenging ill-posed problem that
aims to up-sample a given low-resolution (LR) image to a high-resolution (HR)
counterpart. Due to the difficulty in obtaining real LR-HR training pairs,
recent approaches are trained on simulated LR images degraded by simplified
down-sampling operators, e.g., bicubic. Such an approach can be problematic in
practice because of the large gap between the synthesized and real-world LR
images. To alleviate the issue, we propose a novel Invertible scale-Conditional
Function (ICF), which can scale an input image and then restore the original
input with different scale conditions. By leveraging the proposed ICF, we
construct a novel self-supervised SISR framework (ICF-SRSR) to handle the
real-world SR task without using any paired/unpaired training data.
Furthermore, our ICF-SRSR can generate realistic and feasible LR-HR pairs,
which can make existing supervised SISR networks more robust. Extensive
experiments demonstrate the effectiveness of the proposed method in handling
SISR in a fully self-supervised manner. Our ICF-SRSR demonstrates superior
performance compared to the existing methods trained on synthetic paired images
in real-world scenarios, and performs comparably to state-of-the-art
supervised/unsupervised methods on public benchmark datasets.
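The self-supervised contract the abstract describes can be sketched with a toy stand-in: a function that rescales an image under a scale condition, where training would encourage the down-then-up round trip to restore the input. Nearest-neighbor resampling is used here purely for illustration; the paper learns a network with this property, so the `icf` function below is a hypothetical placeholder, not the authors' model:

```python
import numpy as np

def icf(x, scale):
    """Toy stand-in for an Invertible scale-Conditional Function:
    nearest-neighbor resampling of a 2-D array by `scale`. The learned ICF
    is trained so that icf(icf(x, s), 1/s) restores x; this toy only
    illustrates that round-trip contract, not a learned restoration."""
    h, w = x.shape
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    return x[np.ix_(rows, cols)]

# Self-supervised signal: down-scale, up-scale back, compare to the input.
img = np.arange(16.0).reshape(4, 4)
lr = icf(img, 0.5)                    # simulated low-resolution view
sr = icf(lr, 2.0)                     # restored to the original resolution
recon_loss = np.abs(sr - img).mean()  # training would minimize this
```

Because the supervision signal comes from the image itself, no paired or unpaired LR-HR training data is needed, which is the point of the framework.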
Embedding Semantic Relations into Word Representations
Learning representations for semantic relations is important for various
tasks such as analogy detection, relational search, and relation
classification. Although there have been several proposals for learning
representations for individual words, learning word representations that
explicitly capture the semantic relations between words remains
underdeveloped. We propose an unsupervised method for learning vector
representations for words such that the learnt representations are sensitive to
the semantic relations that exist between two words. First, we extract lexical
patterns from the co-occurrence contexts of two words in a corpus to represent
the semantic relations that exist between those two words. Second, we represent
a lexical pattern as the weighted sum of the representations of the words that
co-occur with that lexical pattern. Third, we train a binary classifier to
detect relationally similar vs. non-similar lexical pattern pairs. The proposed
method is unsupervised in the sense that the lexical pattern pairs we use as
training data are automatically sampled from a corpus, without requiring any
manual intervention. Our proposed method statistically significantly
outperforms the current state-of-the-art word representations on three
benchmark datasets for proportional analogy detection, demonstrating its
ability to accurately capture the semantic relations among words.Comment: International Joint Conferences in AI (IJCAI) 201
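The second step above, representing a lexical pattern as a weighted sum of co-occurring word vectors, can be sketched as follows. The word vectors and co-occurrence counts are hypothetical toy values; in the paper the word representations are what is being learned:

```python
import numpy as np

# Hypothetical toy word vectors (illustrative only).
word_vecs = {
    "ostrich": np.array([0.9, 0.1]),
    "bird":    np.array([0.8, 0.3]),
    "large":   np.array([0.2, 0.9]),
}

def pattern_vector(cooccurring_words, counts):
    """Represent a lexical pattern as the weighted sum of the vectors of
    the words that co-occur with it, weighting by co-occurrence counts."""
    weights = np.asarray(counts, dtype=float)
    weights = weights / weights.sum()              # normalize counts
    vecs = np.stack([word_vecs[w] for w in cooccurring_words])
    return weights @ vecs                          # weighted sum of rows

p = pattern_vector(["ostrich", "bird", "large"], [3, 2, 1])
```

A binary classifier over pairs of such pattern vectors (relationally similar vs. not) then provides the training signal, with pattern pairs sampled automatically from the corpus.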
Hierarchical VAEs Know What They Don't Know
Deep generative models have been demonstrated as state-of-the-art density
estimators. Yet, recent work has found that they often assign a higher
likelihood to data from outside the training distribution. This seemingly
paradoxical behavior has caused concerns over the quality of the attained
density estimates. In the context of hierarchical variational autoencoders, we
provide evidence to explain this behavior by out-of-distribution data having
in-distribution low-level features. We argue that this is both expected and
desirable behavior. With this insight in hand, we develop a fast, scalable and
fully unsupervised likelihood-ratio score for OOD detection that requires data
to be in-distribution across all feature-levels. We benchmark the method on a
vast set of data and model combinations and achieve state-of-the-art results on
out-of-distribution detection.
Comment: Appeared in Proceedings of the 38th International Conference on
Machine Learning (ICML 2021). 18 pages; source code available at
https://github.com/JakobHavtorn/hvae-oodd,
https://github.com/vlievin/biva-pytorch, and
https://github.com/larsmaaloee/BIV
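The requirement that data be in-distribution "across all feature-levels" can be illustrated with a toy likelihood-ratio: subtract the part of the log-likelihood attributable to the lowest feature levels from the full log-likelihood. The per-level decomposition below is a hypothetical simplification for illustration; the paper derives its ratio from two ELBOs of the same hierarchical VAE:

```python
import numpy as np

def llr_score(per_level_logp, k):
    """Toy likelihood-ratio OOD score: full log-likelihood minus the
    contribution of the lowest k feature levels, so only high-level fit
    counts. per_level_logp has shape (n_examples, n_levels); higher
    scores suggest in-distribution."""
    per_level_logp = np.asarray(per_level_logp, dtype=float)
    full = per_level_logp.sum(axis=1)       # log p(x), all levels
    low = per_level_logp[:, :k].sum(axis=1)  # low-level contribution
    return full - low
```

Under this score, an out-of-distribution input that fits well only at the low levels (e.g. shares textures with the training data) no longer gets credit for that fit, which is the failure mode of raw likelihoods that the abstract describes.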