Exponential Family Embeddings
Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. Exponential family embeddings extend the idea of word embeddings to other types of high-dimensional data. Exponential family embeddings have three ingredients: embeddings as latent variables, a predefined conditioning set for each observation called the context, and a conditional likelihood from the exponential family. The embeddings are inferred with a scalable algorithm. This thesis highlights three advantages of the exponential family embedding model class: (A) The approximations used for existing methods such as word2vec can be understood as a biased stochastic gradient procedure on a specific type of exponential family embedding model, the Bernoulli embedding. (B) By choosing different likelihoods from the exponential family, we can generalize the task of learning distributed representations to different application domains. For example, we can learn embeddings of grocery items from shopping data, embeddings of movies from click data, or embeddings of neurons from recordings of zebrafish brains. On all three applications, we find exponential family embedding models to be more effective than other types of dimensionality reduction: they better reconstruct held-out data and find interesting qualitative structure. (C) Finally, the probabilistic modeling perspective allows us to incorporate structure and domain knowledge in the embedding space. We develop models for studying how language varies over time, how it differs between related groups of data, and how word usage differs between languages. Key to the success of these methods is that the embeddings share statistical information through hierarchical priors or neural networks. We demonstrate the benefits of this approach in empirical studies of Senate speeches, scientific abstracts, and shopping baskets.
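As a sketch of how the three ingredients fit together (the notation below is assumed here for illustration, following the generic exponential family embedding formulation rather than quoting the thesis): each observation x_i is drawn from an exponential family whose natural parameter couples its embedding vector with the context vectors of the observations in its context set.

```latex
% Illustrative notation: \rho_i = embedding of item i, \alpha_j = context
% vector of item j, c_i = context set of observation i, f_i = link function,
% t = sufficient statistic.
x_i \mid \mathbf{x}_{c_i} \sim \mathrm{ExpFam}\!\left(\eta_i,\; t(x_i)\right),
\qquad
\eta_i = f_i\!\left(\rho_i^\top \sum_{j \in c_i} \alpha_j \, x_j\right)
```

Specializing the likelihood to a Bernoulli turns each conditional into a logistic regression of an entry on its context, which is the model that, per the abstract, word2vec approximates with biased stochastic gradients.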
Hybrid Modeling Design Patterns
Design patterns provide a systematic way to convey solutions to recurring modeling challenges. This paper introduces design patterns for hybrid modeling, an approach that combines modeling based on first principles with data-driven modeling techniques. While both approaches have complementary advantages, there are often multiple ways to combine them into a hybrid model, and the appropriate solution will depend on the problem at hand. In this paper, we provide four base patterns that can serve as blueprints for combining data-driven components with domain knowledge into a hybrid approach. In addition, we present two composition patterns that govern the combination of the base patterns into more complex hybrid models. Each design pattern is illustrated by typical use cases from application areas such as climate modeling, engineering, and physics.
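One of the simplest shapes such a base pattern can take is a first-principles model augmented with a data-driven correction fitted to its residuals. The sketch below is illustrative only; the names (physics_model, DeltaHybrid) and the choice of a polynomial corrector are assumptions, not the paper's terminology.

```python
# Minimal sketch of one common hybrid base pattern: a data-driven
# residual correction on top of a first-principles model.
import numpy as np

def physics_model(x: np.ndarray, k: float = 0.5) -> np.ndarray:
    """First-principles component, e.g. exponential decay from a known law."""
    return np.exp(-k * x)

class DeltaHybrid:
    """Hybrid model: first-principles prediction plus a learned residual."""

    def __init__(self, degree: int = 3):
        self.degree = degree
        self.coef = None

    def fit(self, x: np.ndarray, y: np.ndarray) -> None:
        # Fit the data-driven component only to what the physics misses.
        residual = y - physics_model(x)
        self.coef = np.polyfit(x, residual, self.degree)

    def predict(self, x: np.ndarray) -> np.ndarray:
        return physics_model(x) + np.polyval(self.coef, x)
```

The division of labor is the point of the pattern: the physics term carries the known structure, while the fitted term absorbs only the unexplained remainder.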
Raising the Bar in Graph-level Anomaly Detection
Graph-level anomaly detection has become a critical topic in diverse areas, such as financial fraud detection and detecting anomalous activities in social networks. While most research has focused on anomaly detection for visual data such as images, where high detection accuracies have been obtained, existing deep learning approaches for graphs currently show considerably worse performance. This paper raises the bar on graph-level anomaly detection, i.e., the task of detecting abnormal graphs in a set of graphs. By drawing on ideas from self-supervised learning and transformation learning, we present a new deep learning approach that significantly improves existing deep one-class approaches by fixing some of their known problems, including hypersphere collapse and performance flip. Experiments on nine real-world data sets involving nine techniques reveal that our method achieves an average performance improvement of 11.8% AUC compared to the best existing approach.

Comment: To appear in IJCAI-ECAI 2022.
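The abstract does not spell out the scoring function, but the general recipe it draws on (self-supervised transformation learning combined with a one-class objective) can be sketched as below. Every name here is an assumption for illustration, not the authors' implementation; the graph encoder producing the view embeddings is left out.

```python
# Hypothetical sketch: score a graph by how far the embeddings of its
# K transformed views drift from a reference view. Normalizing embeddings
# rules out the all-zeros solution, one cause of hypersphere collapse.
import torch
import torch.nn.functional as F

def transformation_anomaly_score(z_views: torch.Tensor) -> torch.Tensor:
    """z_views: (K, d) embeddings of K views of one graph, view 0 = reference.

    Returns a scalar anomaly score; higher means more anomalous.
    """
    z = F.normalize(z_views, dim=-1)   # unit norm prevents collapse to zero
    cos_sim = z[1:] @ z[0]             # cosine similarity to the reference
    return (1.0 - cos_sim).sum()       # total drift of the views
```

This is a simplified ingredient: objectives in this line of work additionally push different views apart, so the encoder cannot satisfy the loss by mapping every view to the same point.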
Complex-Valued Autoencoders for Object Discovery
Object-centric representations form the basis of human perception and enable us to reason about the world and to systematically generalize to new settings. Currently, most machine learning work on unsupervised object discovery focuses on slot-based approaches, which explicitly separate the latent representations of individual objects. While the result is easily interpretable, it usually requires the design of involved architectures. In contrast, we propose a distributed approach to object-centric representations: the Complex AutoEncoder. Following a coding scheme theorized to underlie object representations in biological neurons, its complex-valued activations represent two messages: their magnitudes express the presence of a feature, while the relative phase differences between neurons express which features should be bound together to create joint object representations. We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets. Additionally, we show that it achieves unsupervised object discovery performance competitive with a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7 to 70 times faster to train.
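The magnitude/phase split can be made concrete with a small sketch. This is a simplified reading of the coding scheme, in PyTorch with illustrative layer names, not the paper's exact architecture.

```python
import torch

def apply_complex(fc: torch.nn.Linear, z: torch.Tensor) -> torch.Tensor:
    # Shared real-valued weights applied to real and imaginary parts alike;
    # without a bias, this map commutes with global phase shifts of the input.
    return torch.complex(fc(z.real), fc(z.imag))

class MagnitudePhaseActivation(torch.nn.Module):
    """Nonlinearity on the magnitude only; the phase passes through."""

    def __init__(self, dim: int):
        super().__init__()
        self.bias = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        presence = torch.relu(torch.abs(z) + self.bias)  # feature presence
        return torch.polar(presence, torch.angle(z))     # keep binding phase

# Usage sketch: |out| signals which features are present; units whose phases
# agree are read out as belonging to the same object.
fc = torch.nn.Linear(64, 32, bias=False)
act = MagnitudePhaseActivation(32)
z = torch.randn(8, 64, dtype=torch.cfloat)
out = act(apply_complex(fc, z))
```

Grouping output units by phase agreement is what the "relative phase differences" in the abstract refer to: binding is carried in the angles, not in separate slots.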
Extending Machine Language Models toward Human-Level Language Understanding
Language is central to human intelligence. We review recent breakthroughs in machine language processing and consider what remains to be achieved. Recent approaches rely on domain-general principles of learning and representation captured in artificial neural networks. Most current models, however, focus too closely on language itself. In humans, language is part of a larger system for acquiring, representing, and communicating about objects and situations in the physical and social world, and future machine language models should emulate such a system. We describe existing machine models linking language to concrete situations, and point toward extensions to address more abstract cases. Human language processing exploits complementary learning systems, including a deep neural network-like learning system that learns gradually as machine systems do, as well as a fast-learning system that supports learning new information quickly. Adding such a system to machine language models will be an important further step toward truly human-like language understanding.