Consistent inference of a general model using the pseudo-likelihood method
The maximum pseudo-likelihood (MPL) inference method has recently been
applied with success to statistical physics models with intractable
likelihoods. We use information theory to derive a relation between the
pseudo-likelihood and likelihood functions, and use this relation to show
consistency of the pseudo-likelihood method for a general model.

Comment: A new, more mathematically rigorous version accepted for publication
in Physical Review E: Rapid Communications
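The pseudo-likelihood replaces an intractable joint likelihood with a product of tractable conditionals, one per variable. A minimal sketch for an Ising model illustrates the idea (the model, couplings, and function names here are illustrative, not taken from the paper):

```python
import math

def pseudo_log_likelihood(spins, J, h):
    """Log pseudo-likelihood of one Ising configuration:
    sum_i log P(s_i | s_{-i}), each conditional being tractable."""
    n = len(spins)
    total = 0.0
    for i in range(n):
        # local field acting on spin i from the external field and all other spins
        field = h[i] + sum(J[i][j] * spins[j] for j in range(n) if j != i)
        # P(s_i | rest) = 1 / (1 + exp(-2 * s_i * field))
        total += -math.log(1.0 + math.exp(-2.0 * spins[i] * field))
    return total

# two ferromagnetically coupled spins (J > 0), zero external field
J = [[0.0, 1.0], [1.0, 0.0]]
h = [0.0, 0.0]
aligned = pseudo_log_likelihood([+1, +1], J, h)
opposed = pseudo_log_likelihood([+1, -1], J, h)
# aligned spins receive higher pseudo-likelihood, as the true likelihood would give
```

Maximising this objective over J and h recovers the MPL estimator whose consistency the abstract addresses.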
Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly Detection
In contextual anomaly detection (CAD), an object is only considered anomalous
within a specific context. Most existing methods for CAD use a single context
based on a set of user-specified contextual features. However, identifying the
right context can be very challenging in practice, especially in datasets with
a large number of attributes. Furthermore, in real-world systems, there might
be multiple anomalies that occur in different contexts and, therefore, require
a combination of several "useful" contexts to unveil them. In this work, we
leverage active learning and ensembles to effectively detect complex contextual
anomalies in situations where the true contextual and behavioral attributes are
unknown. We propose a novel approach, called WisCon (Wisdom of the Contexts),
that automatically creates contexts from the feature set. Our method constructs
an ensemble of multiple contexts, with varying importance scores, based on the
assumption that not all contexts are equally useful. Experiments show that
WisCon significantly outperforms existing baselines in different categories
(i.e., active classifiers, unsupervised contextual and non-contextual anomaly
detectors, and supervised classifiers) on seven datasets. Furthermore, the
results support our initial hypothesis that there is no single perfect context
that successfully uncovers all kinds of contextual anomalies, and leveraging
the "wisdom" of multiple contexts is necessary.

Comment: Submitted to IEEE TKD
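The two ingredients the abstract describes, automatically generated contexts and an importance-weighted ensemble, can be sketched as follows. This is a deliberately simplified illustration (subset enumeration and weighted averaging), not WisCon's actual context-generation or active-learning procedure:

```python
from itertools import combinations

def candidate_contexts(features, max_size=2):
    """Enumerate candidate contexts as small subsets of the feature set
    (a simplification of automatic context creation)."""
    contexts = []
    for k in range(1, max_size + 1):
        contexts.extend(combinations(features, k))
    return contexts

def ensemble_score(per_context_scores, importances):
    """Importance-weighted anomaly score for one object: contexts judged
    more useful (e.g. via active-learning feedback) contribute more."""
    total_w = sum(importances)
    return sum(s * w for s, w in zip(per_context_scores, importances)) / total_w

features = ["temp", "hour", "load"]          # hypothetical attribute names
contexts = candidate_contexts(features)      # 3 singletons + 3 pairs
# per-context anomaly scores for one object, and learned importance weights
score = ensemble_score([0.9, 0.1, 0.4], [0.7, 0.1, 0.2])
```

An object scoring high only under one useful context still receives a high combined score, which is how a combination of contexts can unveil anomalies a single context misses.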
Efficient differentially private learning improves drug sensitivity prediction
Background: Users of a personalised recommendation system face a dilemma: recommendations can be improved by learning from data, but only if other users are willing to share their private information. Good personalised predictions are vitally important in precision medicine, but the genomic information on which the predictions are based is also particularly sensitive, as it directly identifies the patients and hence cannot easily be anonymised. Differential privacy has emerged as a promising solution: privacy is considered sufficient if the presence of individual patients cannot be distinguished. However, differentially private learning with current methods does not improve predictions at feasible data sizes and dimensionalities.

Results: We show that useful predictors can be learned under powerful differential privacy guarantees, even from moderately sized data sets, by demonstrating significant improvements in the accuracy of private drug sensitivity prediction with a new robust private regression method. Our method matches the predictive accuracy of state-of-the-art non-private lasso regression using only 4x more samples under relatively strong differential privacy guarantees. Good performance with limited data is achieved by limiting the sharing of private information: decreasing the dimensionality and projecting outliers to fit tighter bounds, so that less noise needs to be added for equal privacy.

Conclusions: The proposed differentially private regression method combines theoretical appeal and asymptotic efficiency with good prediction accuracy even on moderately sized data. As even this simple-to-implement method shows promise on challenging genomic data, we anticipate rapid progress towards practical applications in many fields.

Peer reviewed
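The key trade-off the abstract exploits, projecting outliers to tighter bounds so that less noise is needed for the same privacy level, can be illustrated on a simpler statistic than regression. The sketch below privatises a mean with the Laplace mechanism; it is a generic illustration, not the paper's regression method, and all names are hypothetical:

```python
import math
import random

def private_mean(values, bound, epsilon, rng):
    """Differentially private mean: project outliers into [-bound, bound],
    then add Laplace noise calibrated to the clipped mean's sensitivity,
    which is 2 * bound / n. A tighter bound means a smaller noise scale."""
    clipped = [max(-bound, min(bound, v)) for v in values]
    mean = sum(clipped) / len(clipped)
    scale = 2.0 * bound / (len(clipped) * epsilon)    # Laplace scale b
    u = rng.random() - 0.5                            # inverse-CDF sampling
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return mean + noise

# the outlier 100.0 is projected down to the bound before averaging
rng = random.Random(42)
est = private_mean([0.1, 0.2, 100.0], bound=1.0, epsilon=1e6, rng=rng)
```

The bias introduced by projection is the price paid for the smaller sensitivity; the abstract's result is that this trade is favourable for drug sensitivity prediction.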
Differentially Private Variational Inference for Non-conjugate Models
Many machine learning applications are based on data collected from people, such as their tastes and behaviour as well as biological traits and genetic data. However important the application might be, one has to make sure that individuals' identities and the privacy of the data are not compromised in the analysis. Differential privacy constitutes a powerful framework that prevents breaching of data-subject privacy from the output of a computation. Differentially private versions of many important Bayesian inference methods have been proposed, but an efficient unified approach applicable to arbitrary models has been lacking. In this contribution, we propose a differentially private variational inference method with very wide applicability. It is built on top of doubly stochastic variational inference, a recent advance that provides a variational solution for a large class of models. We add differential privacy into doubly stochastic variational inference by clipping and perturbing the gradients. The algorithm is made more efficient through privacy amplification from subsampling. We demonstrate that the method can reach accuracy close to the non-private level under reasonably strong privacy guarantees, clearly improving over previous sampling-based alternatives, especially in the strong privacy regime.

Peer reviewed
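The privatisation step the abstract describes, clipping and perturbing gradients, follows the familiar DP-SGD pattern. A minimal sketch of that mechanism (function names and parameters are illustrative, not the paper's implementation, and the privacy accounting is omitted):

```python
import math
import random

def dp_gradient_step(per_example_grads, clip_norm, noise_sigma, rng):
    """Privatise one gradient aggregation: clip each per-example gradient
    to L2 norm clip_norm (bounding any individual's influence), sum, then
    perturb every coordinate with Gaussian noise of std noise_sigma * clip_norm."""
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += g[i] * scale
    return [s + rng.gauss(0.0, noise_sigma * clip_norm) for s in summed]

# with zero noise, a gradient of norm 5 is clipped down to norm 1
clipped = dp_gradient_step([[3.0, 4.0]], clip_norm=1.0,
                           noise_sigma=0.0, rng=random.Random(0))
```

Applying this step to the noisy gradients of the doubly stochastic variational objective, and subsampling minibatches to amplify privacy, yields the kind of private variational update the abstract proposes.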
Many machine learning applications are based on data collected from people, such as their tastes and behaviour as well as biological traits and genetic data. Regardless of how important the application might be, one has to make sure individuals' identities or the privacy of the data are not compromised in the analysis. Differential privacy constitutes a powerful framework that prevents breaching of data subject privacy from the output of a computation. Differentially private versions of many important Bayesian inference methods have been proposed, but there is a lack of an efficient unified approach applicable to arbitrary models. In this contribution, we propose a differentially private variational inference method with a very wide applicability. It is built on top of doubly stochastic variational inference, a recent advance which provides a variational solution to a large class of models. We add differential privacy into doubly stochastic variational inference by clipping and perturbing the gradients. The algorithm is made more efficient through privacy amplification from subsampling. We demonstrate the method can reach an accuracy close to non-private level under reasonably strong privacy guarantees, clearly improving over previous sampling-based alternatives especially in the strong privacy regime.Peer reviewe