Exponential Family Embeddings
Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. Exponential family embeddings extend the idea of word embeddings to other types of high-dimensional data. Exponential family embeddings have three ingredients: embeddings as latent variables, a predefined conditioning set for each observation called the context, and a conditional likelihood from the exponential family. The embeddings are inferred with a scalable algorithm. This thesis highlights three advantages of the exponential family embedding model class: (A) The approximations used for existing methods such as word2vec can be understood as a biased stochastic gradient procedure on a specific type of exponential family embedding model, the Bernoulli embedding. (B) By choosing different likelihoods from the exponential family, we can generalize the task of learning distributed representations to different application domains. For example, we can learn embeddings of grocery items from shopping data, embeddings of movies from click data, or embeddings of neurons from recordings of zebrafish brains. On all three applications, we find exponential family embedding models to be more effective than other types of dimensionality reduction: they better reconstruct held-out data and find interesting qualitative structure. (C) Finally, the probabilistic modeling perspective allows us to incorporate structure and domain knowledge in the embedding space. We develop models for studying how language varies over time, how it differs between related groups of data, and how word usage differs between languages. Key to the success of these methods is that the embeddings share statistical information through hierarchical priors or neural networks. We demonstrate the benefits of this approach in empirical studies of Senate speeches, scientific abstracts, and shopping baskets.
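As a sketch of how the three ingredients fit together (the notation below is assumed here for illustration, following the generic exponential family embedding formulation rather than quoting the thesis): each observation x_i is drawn from an exponential family whose natural parameter couples its embedding vector with the context vectors of the observations in its context set.

```latex
% Illustrative notation: \rho_i = embedding of item i, \alpha_j = context
% vector of item j, c_i = context set of observation i, f_i = link function,
% t = sufficient statistic.
x_i \mid \mathbf{x}_{c_i} \sim \mathrm{ExpFam}\!\left(\eta_i,\; t(x_i)\right),
\qquad
\eta_i = f_i\!\left(\rho_i^\top \sum_{j \in c_i} \alpha_j \, x_j\right)
```

Specializing the likelihood to a Bernoulli turns each conditional into a logistic regression of an entry on its context, which is the model that, per the abstract, word2vec approximates with biased stochastic gradients.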
Hybrid Modeling Design Patterns
Design patterns provide a systematic way to convey solutions to recurring modeling challenges. This paper introduces design patterns for hybrid modeling, an approach that combines modeling based on first principles with data-driven modeling techniques. While both approaches have complementary advantages, there are often multiple ways to combine them into a hybrid model, and the appropriate solution will depend on the problem at hand. In this paper, we provide four base patterns that can serve as blueprints for combining data-driven components with domain knowledge into a hybrid approach. In addition, we present two composition patterns that govern the combination of the base patterns into more complex hybrid models. Each design pattern is illustrated by typical use cases from application areas such as climate modeling, engineering, and physics.
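One of the simplest shapes such a base pattern can take is a first-principles model augmented with a data-driven correction fitted to its residuals. The sketch below is illustrative only; the names (physics_model, DeltaHybrid) and the choice of a polynomial corrector are assumptions, not the paper's terminology.

```python
# Minimal sketch of one common hybrid base pattern: a data-driven
# residual correction on top of a first-principles model.
import numpy as np

def physics_model(x: np.ndarray, k: float = 0.5) -> np.ndarray:
    """First-principles component, e.g. exponential decay from a known law."""
    return np.exp(-k * x)

class DeltaHybrid:
    """Hybrid model: first-principles prediction plus a learned residual."""

    def __init__(self, degree: int = 3):
        self.degree = degree
        self.coef = None

    def fit(self, x: np.ndarray, y: np.ndarray) -> None:
        # Fit the data-driven component only to what the physics misses.
        residual = y - physics_model(x)
        self.coef = np.polyfit(x, residual, self.degree)

    def predict(self, x: np.ndarray) -> np.ndarray:
        return physics_model(x) + np.polyval(self.coef, x)
```

The division of labor is the point of the pattern: the physics term carries the known structure, while the fitted term absorbs only the unexplained remainder.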
Raising the Bar in Graph-level Anomaly Detection
Graph-level anomaly detection has become a critical topic in diverse areas, such as financial fraud detection and detecting anomalous activities in social networks. While most research has focused on anomaly detection for visual data such as images, where high detection accuracies have been obtained, existing deep learning approaches for graphs currently show considerably worse performance. This paper raises the bar on graph-level anomaly detection, i.e., the task of detecting abnormal graphs in a set of graphs. By drawing on ideas from self-supervised learning and transformation learning, we present a new deep learning approach that significantly improves existing deep one-class approaches by fixing some of their known problems, including hypersphere collapse and performance flip. Experiments on nine real-world data sets involving nine techniques reveal that our method achieves an average performance improvement of 11.8% AUC compared to the best existing approach.

Comment: To appear in IJCAI-ECAI 2022.
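The abstract does not spell out the scoring function, but the general recipe it draws on (self-supervised transformation learning combined with a one-class objective) can be sketched as below. Every name here is an assumption for illustration, not the authors' implementation; the graph encoder producing the view embeddings is left out.

```python
# Hypothetical sketch: score a graph by how far the embeddings of its
# K transformed views drift from a reference view. Normalizing embeddings
# rules out the all-zeros solution, one cause of hypersphere collapse.
import torch
import torch.nn.functional as F

def transformation_anomaly_score(z_views: torch.Tensor) -> torch.Tensor:
    """z_views: (K, d) embeddings of K views of one graph, view 0 = reference.

    Returns a scalar anomaly score; higher means more anomalous.
    """
    z = F.normalize(z_views, dim=-1)   # unit norm prevents collapse to zero
    cos_sim = z[1:] @ z[0]             # cosine similarity to the reference
    return (1.0 - cos_sim).sum()       # total drift of the views
```

This is a simplified ingredient: objectives in this line of work additionally push different views apart, so the encoder cannot satisfy the loss by mapping every view to the same point.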
Complex-Valued Autoencoders for Object Discovery
Object-centric representations form the basis of human perception and enable us to reason about the world and to systematically generalize to new settings. Currently, most machine learning work on unsupervised object discovery focuses on slot-based approaches, which explicitly separate the latent representations of individual objects. While the result is easily interpretable, it usually requires the design of involved architectures. In contrast, we propose a distributed approach to object-centric representations: the Complex AutoEncoder. Following a coding scheme theorized to underlie object representations in biological neurons, its complex-valued activations represent two messages: their magnitudes express the presence of a feature, while the relative phase differences between neurons express which features should be bound together to create joint object representations. We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets. Additionally, we show that it achieves unsupervised object discovery performance competitive with a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7 to 70 times faster to train.
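The magnitude/phase split can be made concrete with a small sketch. This is a simplified reading of the coding scheme, in PyTorch with illustrative layer names, not the paper's exact architecture.

```python
import torch

def apply_complex(fc: torch.nn.Linear, z: torch.Tensor) -> torch.Tensor:
    # Shared real-valued weights applied to real and imaginary parts alike;
    # without a bias, this map commutes with global phase shifts of the input.
    return torch.complex(fc(z.real), fc(z.imag))

class MagnitudePhaseActivation(torch.nn.Module):
    """Nonlinearity on the magnitude only; the phase passes through."""

    def __init__(self, dim: int):
        super().__init__()
        self.bias = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        presence = torch.relu(torch.abs(z) + self.bias)  # feature presence
        return torch.polar(presence, torch.angle(z))     # keep binding phase

# Usage sketch: |out| signals which features are present; units whose phases
# agree are read out as belonging to the same object.
fc = torch.nn.Linear(64, 32, bias=False)
act = MagnitudePhaseActivation(32)
z = torch.randn(8, 64, dtype=torch.cfloat)
out = act(apply_complex(fc, z))
```

Grouping output units by phase agreement is what the "relative phase differences" in the abstract refer to: binding is carried in the angles, not in separate slots.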
Extending Machine Language Models toward Human-Level Language Understanding
Language is central to human intelligence. We review recent breakthroughs in machine language processing and consider what remains to be achieved. Recent approaches rely on domain-general principles of learning and representation captured in artificial neural networks. Most current models, however, focus too closely on language itself. In humans, language is part of a larger system for acquiring, representing, and communicating about objects and situations in the physical and social world, and future machine language models should emulate such a system. We describe existing machine models linking language to concrete situations, and point toward extensions to address more abstract cases. Human language processing exploits complementary learning systems, including a deep neural network-like learning system that learns gradually as machine systems do, as well as a fast-learning system that supports learning new information quickly. Adding such a system to machine language models will be an important further step toward truly human-like language understanding.