Detecting correlated Gaussian databases

Author: Nazer Bobak
Publication venue: IEEE
Publication date: 17/02/2023

CCF-1955981 - National Science Foundation; https://arxiv.org/abs/2206.12011; First author draft

Boston University Institutional Repository (OpenBU)

The Umeyama algorithm for matching correlated Gaussian geometric models in the low-dimensional regime

Authors: Gong Shuyang; Li Zhangsong
Publication date: 22/02/2024

Motivated by the problem of matching two correlated random geometric graphs, we study the problem of matching two Gaussian geometric models correlated through a latent node permutation. Specifically, given an unknown permutation $\pi^*$ on $\{1,\ldots,n\}$ and given $n$ i.i.d. pairs of correlated Gaussian vectors $\{X_{\pi^*(i)},Y_i\}$ in $\mathbb{R}^d$ with noise parameter $\sigma$, we consider two types of (correlated) weighted complete graphs with edge weights given by $A_{i,j}=\langle X_i,X_j \rangle$ and $B_{i,j}=\langle Y_i,Y_j \rangle$. The goal is to recover the hidden vertex correspondence $\pi^*$ based on the observed matrices $A$ and $B$. For the low-dimensional regime where $d=O(\log n)$, Wang, Wu, Xu, and Yolou [WWXY22+] established the information thresholds for exact and almost exact recovery in matching correlated Gaussian geometric models. They also conducted numerical experiments for the classical Umeyama algorithm. In our work, we prove that this algorithm achieves exact recovery of $\pi^*$ when the noise parameter $\sigma=o(d^{-3}n^{-2/d})$, and almost exact recovery when $\sigma=o(d^{-3}n^{-1/d})$. Our results approach the information thresholds up to a $\operatorname{poly}(d)$ factor in the low-dimensional regime. Comment: 31 pages

arXiv.org e-Print Archive
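The setup in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation, and it rests on several assumptions that the abstract does not state: the correlation model is taken to be $Y_i = X_{\pi^*(i)} + \sigma Z_i$ with standard Gaussian $Z_i$, only the top $d$ eigenvectors of $A$ and $B$ are used (both matrices have rank at most $d$), sign ambiguity is handled with the classical absolute-value trick, and the final assignment is found by brute force over permutations, which is feasible only for very small $n$. The function names `sample_model`, `top_eigvecs`, and `umeyama_match` are hypothetical.

```python
import itertools
import numpy as np


def sample_model(n, d, sigma, rng):
    """Draw Gram matrices A, B and the hidden permutation pi_star.

    Assumed model (not stated in the abstract): Y_i = X_{pi*(i)} + sigma * Z_i.
    """
    X = rng.standard_normal((n, d))
    pi_star = rng.permutation(n)
    Y = X[pi_star] + sigma * rng.standard_normal((n, d))
    return X @ X.T, Y @ Y.T, pi_star


def top_eigvecs(M, d):
    """Eigenvectors of the d largest eigenvalues of a symmetric matrix."""
    _, vecs = np.linalg.eigh(M)      # eigenvalues come back in ascending order
    return vecs[:, -d:][:, ::-1]     # keep the last d columns, largest first


def umeyama_match(A, B, d):
    """Umeyama-style spectral matching restricted to the top d eigenvectors.

    Returns pi_hat with pi_hat[i] = index of the A-vertex matched to B-vertex i.
    """
    U = np.abs(top_eigvecs(A, d))    # absolute values remove the sign ambiguity
    V = np.abs(top_eigvecs(B, d))
    S = V @ U.T                      # S[i, j]: similarity of B-vertex i and A-vertex j
    n = S.shape[0]
    # Brute-force linear assignment; only practical for tiny n.
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(S[i, p[i]] for i in range(n)),
    )
    return np.array(best)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B, pi_star = sample_model(n=6, d=2, sigma=1e-3, rng=rng)
    pi_hat = umeyama_match(A, B, d=2)
    print("fraction correctly matched:", np.mean(pi_hat == pi_star))
```

For larger $n$ the brute-force search would be replaced by a polynomial-time linear assignment solver such as the Hungarian method; the similarity matrix `S` is the only input that step needs.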

Joint Correlation Detection and Alignment of Gaussian Databases

Author: Tamir Ran
Publication date: 02/11/2022

In this work, we propose an efficient two-stage algorithm solving a joint problem of correlation detection and permutation recovery between two Gaussian databases. Correlation detection is a hypothesis testing problem: under the null hypothesis, the databases are independent, and under the alternative hypothesis, they are correlated under an unknown row permutation. We develop relatively tight bounds on the type-I and type-II error probabilities, and show that the analyzed detector performs better than a recently proposed detector, at least for some specific parameter choices. Because the proposed detector relies on a statistic that is a sum of dependent indicator random variables, we develop a novel graph-theoretic technique for bounding the $k$-th order moments of such statistics in order to bound the type-I probability of error. When the databases are accepted as correlated, the algorithm also outputs an estimate of the underlying row permutation. By comparing to known converse results for this problem, we prove that the alignment error probability converges to zero under the asymptotically lowest possible correlation coefficient. Comment: 41 pages, 7 figures

arXiv.org e-Print Archive

Database Matching Under Noisy Synchronization Errors

Authors: Bakirtas Serhat; Erkip Elza
Publication date: 24/10/2023

The re-identification or de-anonymization of users from anonymized data through matching with publicly available correlated user data has raised privacy concerns, leading to the complementary measure of obfuscation in addition to anonymization. Recent research provides a fundamental understanding of the conditions under which privacy attacks, in the form of database matching, are successful in the presence of obfuscation. Motivated by synchronization errors stemming from the sampling of time-indexed databases, this paper presents a unified framework considering both obfuscation and synchronization errors and investigates the matching of databases under noisy entry repetitions. By investigating different structures for the repetition pattern, replica detection and seeded deletion detection algorithms are devised and sufficient and necessary conditions for successful matching are derived. Finally, the impacts of some variations of the underlying assumptions, such as the adversarial deletion model, seedless database matching, and zero-rate regime, on the results are discussed. Overall, our results provide insights into the privacy-preserving publication of anonymized and obfuscated time-indexed data as well as the closely related problem of the capacity of synchronization channels.

arXiv.org e-Print Archive

Database Alignment with Gaussian Features

Authors: Cullina Daniel; Dai Osman Emre; Kiyavash Negar
Publication venue: MLR Press
Publication date: 31/03/2020

We consider the problem of aligning a pair of databases with jointly Gaussian features. We consider two algorithms: complete database alignment via MAP estimation among all possible database alignments, and partial alignment via thresholding of log-likelihood ratios. We derive conditions on mutual information between feature pairs, identifying the regimes where the algorithms are guaranteed to perform reliably and those where they cannot be expected to succeed.

Infoscience - École polytechnique fédérale de Lausanne
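The two algorithms named in the abstract above can be illustrated with a small pure-Python sketch. It is not the authors' code and makes simplifying assumptions that the abstract does not state: every feature coordinate is a standard normal, matched coordinates share a known correlation coefficient `rho`, unmatched rows are independent, and the prior over alignments is uniform (so MAP coincides with maximum likelihood). Under those assumptions the per-coordinate log-likelihood ratio of "correlated" versus "independent" is $-\tfrac12\log(1-\rho^2) - \frac{\rho^2(x^2+y^2) - 2\rho xy}{2(1-\rho^2)}$. The names `pair_llr`, `map_align`, and `threshold_align` are hypothetical, and the brute-force search stands in for a proper linear assignment solver.

```python
import itertools
import math


def pair_llr(x, y, rho):
    """Log-likelihood ratio that rows x and y are a correlated pair.

    Assumes unit-variance Gaussian coordinates with correlation rho when
    matched and independence otherwise (a simplifying assumption).
    """
    total = 0.0
    for a, b in zip(x, y):
        total += -0.5 * math.log(1.0 - rho ** 2)
        total -= (rho ** 2 * (a * a + b * b) - 2.0 * rho * a * b) / (
            2.0 * (1.0 - rho ** 2)
        )
    return total


def map_align(X, Y, rho):
    """Complete alignment: the permutation p maximizing sum_i LLR(X[p[i]], Y[i]).

    Brute force over all permutations, so only usable for very small databases.
    """
    n = len(X)

    def score(p):
        return sum(pair_llr(X[p[i]], Y[i], rho) for i in range(n))

    return list(max(itertools.permutations(range(n)), key=score))


def threshold_align(X, Y, rho, tau):
    """Partial alignment: all pairs (i, j) with LLR(X[j], Y[i]) above tau."""
    return [
        (i, j)
        for i in range(len(Y))
        for j in range(len(X))
        if pair_llr(X[j], Y[i], rho) > tau
    ]


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 2.0], [-1.5, 0.5], [3.0, -1.0]]
    perm = [2, 0, 3, 1]
    Y = [X[p] for p in perm]          # noise-free shuffled copy for illustration
    print(map_align(X, Y, rho=0.9))
    print(threshold_align(X, Y, rho=0.9, tau=0.0))
```

The thresholding variant may return several candidates for one row or none at all, which is why it yields only a partial alignment; raising `tau` trades missed pairs for fewer false matches.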