Search CORE

11 research outputs found

Nonconvex Optimization Algorithms for Structured Matrix Estimation in Large-Scale Data Applications

Author: Giampouras Paraskevas
Γιαμπουράς Παρασκευάς
Publication venue
Publication date: 01/01/2018
Field of study

Το πρόβλημα της εκτίμησης δομημένου πίνακα ανήκει στην κατηγορία των προβλημάτων εύρεσης αναπαραστάσεων χαμηλής διάστασης (low-dimensional embeddings) σε δεδομένα υψηλής διάστασης. Στις μέρες μας συναντάται σε μια πληθώρα εφαρμογών που σχετίζονται με τις ερευνητικές περιοχές της επεξεργασίας σήματος και της μηχανικής μάθησης. Στην παρούσα διατριβή προτείνονται νέοι μαθηματικοί φορμαλισμοί σε τρία διαφορετικά προβλήματα εκτίμησης δομημένων πινάκων από δεδομένα μεγάλης κλίμακας. Πιο συγκεκριμένα, μελετώνται τα ερευνητικά προβλήματα α) της εκτίμησης πίνακα που είναι ταυτόχρονα αραιός, χαμηλού βαθμού και μη-αρνητικός, β) της παραγοντοποίησης πίνακα χαμηλού βαθμού, και γ) της ακολουθιακής (online) εκτίμησης πίνακα υποχώρου (subspace matrix) χαμηλού βαθμού από ελλιπή δεδομένα. Για όλα τα προβλήματα αυτά προτείνονται καινoτόμοι και αποδοτικοί αλγόριθμοι βελτιστοποίησης (optimization algorithms). Βασική υπόθεση που υιοθετείται σε κάθε περίπτωση είναι πως τα δεδομένα έχουν παραχθεί με βάση ένα γραμμικό μοντέλο. Το σύνολο των προσεγγίσεων που ακολουθούνται χαρακτηρίζονται από μη-κυρτότητα. Όπως γίνεται φανερό στην παρούσα διατριβή, η ιδιότητα αυτή, παρά τις δυσκολίες που εισάγει στην θεωρητική τεκμηρίωση των προτεινόμενων μεθόδων (σε αντίθεση με τις κυρτές προσεγγίσεις στις οποίες η θεωρητική ανάλυση είναι σχετικά ευκολότερη), οδηγεί σε σημαντικά οφέλη όσον αφορά την απόδοσή τους σε πλήθος πραγματικών εφαρμογών. Για την εκτίμηση πίνακα που είναι ταυτόχρονα αραιός, χαμηλού βαθμού και μη-αρνητικός, προτείνονται στην παρούσα διατριβή τρεις νέοι αλγόριθμοι, από τους οποίους οι δύο πρώτοι ελαχιστοποιούν μια κοινή συνάρτηση κόστους και ο τρίτος μια ελαφρώς διαφορετική συνάρτηση κόστους. Κοινό χαρακτηριστικό και των δύο αυτών συναρτήσεων είναι ότι κατά βάση αποτελούνται από έναν όρο προσαρμογής στα δεδομένα και δύο όρους κανονικοποίησης, οι οποίοι χρησιμοποιούνται για την επιβολή αραιότητας και χαμηλού βαθμού, αντίστοιχα. Στην πρώτη περίπτωση αυτό επιτυγχάνεται με την αξιοποίηση του αθροίσματος της επανασταθμισμένης l1 νόρμας (reweighted l1 norm) και της επανασταθμισμένης πυρηνικής νόρμας (reweighted nuclear norm), οι οποίες ευθύνονται για το μη- κυρτό χαρακτήρα της προκύπτουσας συνάρτησης κόστους. Από τους δύο προτεινόμενους αλγορίθμους που ελαχιστοποιούν τη συνάρτηση αυτή, ο ένας ακολουθεί τη μέθοδο καθόδου σταδιακής εγγύτητας και ο άλλος βασίζεται στην πιο απαιτητική υπολογιστικά μέθοδο ADMM. Η δεύτερη συνάρτηση κόστους διαφοροποιείται σε σχέση με την πρώτη καθώς χρησιμοποιεί μια προσέγγιση παραγοντοποίησης για τη μοντελοποίηση του χαμηλού βαθμού του δομημένου πίνακα. Επιπλέον, λόγω της μη εκ των προτέρων γνώσης του πραγματικού βαθμού, ενσωματώνει έναν όρο επιβολής χαμηλού βαθμού, μέσω της μη- κυρτής έκφρασης που έχει προταθεί ως ένα άνω αυστηρό φράγμα της (κυρτής) πυρηνικής νόρμας (σ.σ. στο εξής θα αναφέρεται ως εναλλακτική μορφή της πυρηνικής νόρμας). Και στην περίπτωση αυτή, το πρόβλημα που προκύπτει είναι μη-κυρτό λόγω του φορμαλισμού του μέσω της παραγοντοποίησης πίνακα, ενώ η βελτιστοποίηση πραγματοποιείται εφαρμόζοντας μια υπολογιστικά αποδοτική μέθοδο καθόδου συνιστωσών ανά μπλοκ (block coordinate descent). Tο σύνολο των προτεινόμενων σχημάτων χρησιμοποιείται για τη μοντελοποίηση, με καινοτόμο τρόπο, του προβλήματος φασματικού διαχωρισμού υπερφασματικών εικόνων (ΥΦΕ). Όπως εξηγείται αναλυτικά, τόσο η αραιότητα όσο και ο χαμηλός βαθμός παρέχουν πολύτιμες ερμηνείες ορισμένων φυσικών χαρακτηριστικών των ΥΦΕ, όπως π.χ. η χωρική συσχέτιση. Πιο συγκεκριμένα, η αραιότητα και ο χαμηλός βαθμός μπορούν να υιοθετηθούν ως δομές στον πίνακα αφθονίας (abundance matrix - ο πίνακας που περιέχει τα ποσοστά παρουσίας των υλικών στην περιοχή που απεικονίζει κάθε εικονοστοιχείο). Τα σημαντικά πλεονεκτήματα που προσφέρουν οι προτεινόμενες τεχνικές, σε σχέση με ανταγωνιστικούς αλγορίθμους, αναδεικνύονται σε ένα πλήθος διαφορετικών πειραμάτων που πραγματοποιούνται τόσο σε συνθετικά όσο και σε αληθινά υπερφασματικά δεδομένα. Στο πλαίσιο της παραγοντοποίησης πίνακα χαμηλού βαθμού (low-rank matrix factorization) περιγράφονται στη διατριβή τέσσερις νέοι αλγόριθμοι, ο καθένας εκ των οποίων έχει σχεδιαστεί για μια διαφορετική έκφανση του συγκεκριμένου προβλήματος. Όλα τα προτεινόμενα σχήματα έχουν ένα κοινό χαρακτηριστικό: επιβάλλουν χαμηλό βαθμό στους πίνακες-παράγοντες καθώς και στο γινόμενό τους με την εισαγωγή ενός νέου όρου κανονικοποίησης. Ο όρος αυτός προκύπτει ως μια γενίκευση της εναλλακτικής έκφρασης της πυρηνικής νόρμας με τη μετατροπή της σε σταθμισμένη μορφή. Αξίζει να επισημανθεί πως με κατάλληλη επιλογή των πινάκων στάθμισης καταλήγουμε σε μια ειδική έκφραση της συγκεκριμένης νόρμας η οποία ανάγει την διαδικασία επιβολής χαμηλού βαθμού σε αυτή της από κοινού επιβολής αραιότητας στις στήλες των δύο πινάκων. Όπως αναδεικνύεται αναλυτικά, η ιδιότητα αυτή είναι πολύ χρήσιμη ιδιαιτέρως σε εφαρμογές διαχείρισης δεδομένων μεγάλης κλίμακας. Στα πλαίσια αυτά μελετώνται τρία πολύ σημαντικά προβλήματα στο πεδίο της μηχανικής μάθησης και συγκεκριμένα αυτά της αποθορυβοποίησης σήματος (denoising), πλήρωσης πίνακα (matrix completion) και παραγοντοποίησης μη-αρνητικού πίνακα (nonnegative matrix factorization). Χρησιμοποιώντας τη μέθοδο ελαχιστοποίησης άνω φραγμάτων συναρτήσεων διαδοχικών μπλοκ (block successive upper bound minimization) αναπτύσσονται τρεις νέοι επαναληπτικά σταθμισμένοι αλγόριθμοι τύπου Newton, οι οποίοι σχεδιάζονται κατάλληλα, λαμβάνοντας υπόψη τα ιδιαίτερα χαρακτηριστικά του εκάστοτε προβλήματος. Τέλος, παρουσιάζεται αλγόριθμος παραγοντοποίησης πίνακα ο οποίος έχει σχεδιαστεί πάνω στην προαναφερθείσα ιδέα επιβολής χαμηλού βαθμού, υποθέτοντας παράλληλα αραιότητα στον ένα πίνακα-παράγοντα. Η επαλήθευση της αποδοτικότητας όλων των αλγορίθμων που εισάγονται γίνεται με την εφαρμογή τους σε εκτεταμένα συνθετικά πειράματα, όπως επίσης και σε εφαρμογές πραγματικών δεδομένων μεγάλης κλίμακας π.χ. αποθορυβοποίηση ΥΦΕ, πλήρωση πινάκων από συστήματα συστάσεων (recommender systems) ταινιών, διαχωρισμός μουσικού σήματος και τέλος μη-επιβλεπόμενος φασματικός διαχωρισμός. Το τελευταίο πρόβλημα το οποίο διαπραγματεύεται η παρούσα διατριβή είναι αυτό της ακολουθιακής εκμάθησης υποχώρου χαμηλού βαθμού και της πλήρωσης πίνακα. Το πρόβλημα αυτό εδράζεται σε ένα διαφορετικό πλαίσιο μάθησης, την επονομαζόμενη ακολουθιακή μάθηση, η οποία αποτελεί μια πολύτιμη προσέγγιση σε εφαρμογές δεδομένων μεγάλης κλίμακας, αλλά και σε εφαρμογές που λαμβάνουν χώρα σε χρονικά μεταβαλλόμενα περιβάλλοντα. Στην παρούσα διατριβή προτείνονται δύο διαφορετικοί αλγόριθμοι, ένας μπεϋζιανός και ένας ντετερμινιστικός. Ο πρώτος αλγόριθμος προκύπτει από την εφαρμογή μιας καινοτόμου ακολουθιακής μεθόδου συμπερασμού βασισμένου σε μεταβολές. Αυτή η μέθοδος χρησιμοποιείται για την πραγματοποίηση προσεγγιστικού συμπερασμού στο προτεινόμενο ιεραρχικό μπεϋζιανό μοντέλο. Αξίζει να σημειωθεί πως το μοντέλο αυτό έχει σχεδιαστεί με κατάλληλο τρόπο έτσι ώστε να ενσωματώνει, σε πιθανοτικό πλαίσιο, την ίδια ιδέα επιβολής χαμηλού βαθμού που προτείνεται για το πρόβλημα παραγοντοποίησης πίνακα χαμηλού βαθμού, δηλαδή επιβάλλοντας από-κοινού αραιότητα στους πίνακες-παράγοντες. Ωστόσο, ακολουθώντας την πιθανοτική προσέγγιση, αυτό πραγματοποιείται επιβάλλοντας πολύ-επίπεδες a priori κατανομές Laplace στις στήλες τους. Ο αλγόριθμος που προκύπτει είναι πλήρως αυτοματοποιημένος, μιας και δεν απαιτεί τη ρύθμιση κάποιας παραμέτρου κανονικοποίησης. Ο δεύτερος αλγόριθμος προκύπτει από την ελαχιστοποίηση μιας κατάλληλα διαμορφωμένης συνάρτησης κόστους. Και στην περίπτωση αυτή, χρησιμοποιείται η προαναφερθείσα ιδέα επιβολής χαμηλού βαθμού (κατάλληλα τροποποιημένη έτσι ώστε να μπορεί να εφαρμοστεί στο ακολουθιακό πλαίσιο μάθησης). Ενδιαφέρον παρουσιάζει το γεγονός πως ο τελευταίος αλγόριθμος μπορεί να θεωρηθεί ως μια ντετερμινιστική εκδοχή του προαναφερθέντος πιθανοτικού αλγορίθμου. Τέλος, σημαντικό χαρακτηριστικό και των δύο αλγορίθμων είναι ότι δεν είναι απαραίτητη η εκ των προτέρων γνώση του βαθμού του πίνακα υποχώρου. Τα πλεονεκτήματα των προτεινόμενων προσεγγίσεων παρουσιάζονται σε ένα μεγάλο εύρος πειραμάτων που πραγματοποιήθηκαν σε συνθετικά δεδομένα, στο πρόβλημα της ακολουθιακής πλήρωσης ΥΦΕ και στην εκμάθηση ιδιο-προσώπων κάνοντας χρήση πραγματικών δεδομένων.Structured matrix estimation belongs to the family of learning tasks whose main goal is to reveal low-dimensional embeddings of high-dimensional data. Nowadays, this task appears in various forms in a plethora of signal processing and machine learning applications. In the present thesis, novel mathematical formulations for three different instances of structured matrix estimation are proposed. Concretely, the problems of a) simultaneously sparse, low-rank and nonnegative matrix estimation, b) low-rank matrix factorization and c) online low-rank subspace learning and matrix completion, are addressed and analyzed. In all cases, it is assumed that data are generated by a linear process, i.e., we deal with linear measurements. A suite of novel and efficient {\it optimization algorithms} amenable to handling {\it large-scale data} are presented. A key common feature of all the introduced schemes is {\it nonconvexity}. It should be noted that albeit nonconvexity complicates the derivation of theoretical guarantees (contrary to convex relevant approaches, which - in most cases - can be theoretically analyzed relatively easily), significant gains in terms of the estimation performance of the emerging algorithms have been recently witnessed in several real practical situations. Let us first focus on simultaneously sparse, low-rank and nonnegative matrix estimation from linear measurements. In the thesis this problem is resolved by three different optimization algorithms, which address two different and novel formulations of the relevant task. All the proposed schemes are suitably devised for minimizing a cost function consisting of a least-squares data fitting term and two regularization terms. The latter are utilized for promoting sparsity and low-rankness. The novelty of the first formulation lies in the use, for the first time in the literature, of the sum of the reweighted

\ell_1

and the reweighted nuclear norms. The merits of reweighted

\ell_1

and nuclear norms have been exposed in numerous sparse and low-rank matrix recovery problems. As is known, albeit these two norms induce nonconvexity in the resulting optimization problems, they provide a better approximation of the

\ell_0

norm and the rank function, respectively, as compared to relevant convex regularizers. Herein, we aspire to benefit from the use of the combination of these two norms. The first algorithm is an incremental proximal minimization scheme, while the second one is an ADMM solver. The third algorithm's main goal is to further reduce the computational complexity. Towards this end, it deviates from the other two in the use of a matrix factorization based approach for modelling low-rankness. Since the rank of the sought matrix is generally unknown, a low-rank imposing term, i.e., the variational form of the nuclear norm, which is a function of the matrix factors, is utilized. In this case, the optimization process takes place via a block coordinate descent type scheme. The proposed formulations are utilized for modelling in a pioneering way a very important problem in hyperspectral image processing, that of hyperspectral image unmixing. It is shown that both sparsity and low-rank offer meaningful interpretations of inherent natural characteristics of hyperspectral images. More specifically, both sparsity and low-rankness are reasonable hypotheses that can be made for the so-called {\it abundance} matrix, i.e., the nonnegative matrix containing the fractions of presence of the different materials, called {\it endmembers}, at the region depicted by each pixel. The merits of the proposed algorithms over other state-of-the-art hyperspectral unmixing algorithms are corroborated in a wealth of simulated and real hyperspectral imaging data experiments. In the framework of low-rank matrix factorization (LRMF) four novel optimization algorithms are presented, each modelling a different instance of it. All the proposed schemes share a common thread: they impose low-rank on both matrix factors and the sought matrix by a newly introduced regularization term. This term can be considered as a generalized weighted version of the variational form of the nuclear norm. Notably, by appropriately selecting the weight matrix, low-rank enforcement amounts to imposing joint column sparsity on both matrix factors. This property is actually proven to be quite important in applications dealing with large-scale data, since it leads to a significant decrease of the induced computational complexity. Along these lines, three well-known machine learning tasks, namely, denoising, matrix completion and low-rank nonnegative matrix factorization (NMF), are redefined according to the new low-rank regularization approach. Then, following the block successive upper bound minimization framework, alternating iteratively reweighted least-squares, Newton-type algorithms are devised accounting for the particular characteristics of the problem that each time is addressed. Lastly, an additional low-rank and sparse NMF algorithm is proposed, which hinges upon the same low-rank promoting idea mentioned above, while also accounting for sparsity on one of the matrix factors. All the derived algorithms are tested on extensive simulated data experiments and real large-scale data applications such as hyperspectral image denoising, matrix completion for recommender systems, music signal decomposition and unsupervised hyperspectral image unmixing with unknown number of endmembers. The last problem that this thesis touches upon is online low-rank subspace learning and matrix completion. This task follows a different learning model, i.e., online learning, which offers a valuable processing framework when one deals with large-scale streaming data possibly under time-varying conditions. In the thesis, two different online algorithms are put forth. The first one stems from a newly developed online variational Bayes scheme. This is applied for performing approximate inference based on a carefully designed novel multi-hierarchical Bayesian model. Notably, the adopted model encompasses similar low-rank promoting ideas to those mentioned for LRMF. That is, low-rank is imposed via promoting jointly column sparsity on the columns of the matrix factors. However, following the Bayesian rationale, this now takes place by assigning Laplace-type marginal priors on the matrix factors. Going one step further, additional sparsity is independently modelled on the subspace matrix thus imposing multiple structures on the same matrix. The resulting algorithm is fully automated, i.e., it does not demand fine-tuning of any parameters. The second algorithm follows a cost function minimization based strategy. Again, the same low-rank promoting idea introduced for LRMF is incorporated in this problem via the use of a - modified to the online processing scenario - low-rank regularization term. Interestingly, the resulting optimization scheme can be considered as the deterministic analogue of the Bayesian one. Both the proposed algorithms present a favorable feature, i.e., they are competent to learn subspaces without requiring the a priori knowledge of their true rank. Their effectiveness is showcased in extensive simulated data experiments and in online hyperspectral image completion and eigenface learning using real data

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Author: Tauman Kalai Yael
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023
Field of study

LIPIcs, Volume 251, ITCS 2023, Complete Volum

Dagstuhl Research Online Publication Server

Compressed sensing with approximate message passing: measurement matrix and algorithm design

Author: Guo Chunli
Publication venue: The University of Edinburgh
Publication date: 29/06/2015
Field of study

Compressed sensing (CS) is an emerging technique that exploits the properties of a sparse or compressible signal to efficiently and faithfully capture it with a sampling rate far below the Nyquist rate. The primary goal of compressed sensing is to achieve the best signal recovery with the least number of samples. To this end, two research directions have been receiving increasing attention: customizing the measurement matrix to the signal of interest and optimizing the reconstruction algorithm. In this thesis, contributions in both directions are made in the Bayesian setting for compressed sensing. The work presented in this thesis focuses on the approximate message passing (AMP) schemes, a new class of recovery algorithm that takes advantage of the statistical properties of the CS problem. First of all, a complete sample distortion (SD) framework is presented to fundamentally quantify the reconstruction performance for a certain pair of measurement matrix and recovery scheme. In the SD setting, the non-optimality region of the homogeneous Gaussian matrix is identified and the novel zeroing matrix is proposed with an improved performance. With the SD framework, the optimal sample allocation strategy for the block diagonal measurement matrix are derived for the wavelet representation of natural images. Extensive simulations validate the optimality of the proposed measurement matrix design. Motivated by the zeroing matrix, we extend the seeded matrix design in the CS literature to the novel modulated matrix structure. The major advantage of the modulated matrix over the seeded matrix lies in the simplicity of its state evolution dynamics. Together with the AMP based algorithm, the modulated matrix possesses a 1-D performance prediction system, with which we can optimize the matrix configuration. We then focus on a special modulated matrix form, designated as the two block matrix, which can also be seen as a generalization of the zeroing matrix. The effectiveness of the two block matrix is demonstrated through both sparse and compressible signals. The underlining reason for the improved performance is presented through the analysis of the state evolution dynamics. The final contribution of the thesis explores improving the reconstruction algorithm. By taking the signal prior into account, the Bayesian optimal AMP (BAMP) algorithm is demonstrated to dramatically improve the reconstruction quality. The key insight for its success is that it utilizes the minimum mean square error (MMSE) estimator for the CS denoising. However, the prerequisite of the prior information makes it often impractical. A novel SURE-AMP algorithm is proposed to address the dilemma. The critical feature of SURE-AMP is that the Stein’s unbiased risk estimate (SURE) based parametric least square estimator is used to replace the MMSE estimator. Given the optimization of the SURE estimator only involves the noisy data, it eliminates the need for the signal prior, thus can accommodate more general sparse models

Edinburgh Research Archive

LIPIcs, Volume 274, ESA 2023, Complete Volume

Author: Farach-Colton Martin
Herman Grzegorz
Puglisi Simon J.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual European Symposium on Algorithms (ESA 2023)
Publication date: 01/01/2023
Field of study

LIPIcs, Volume 274, ESA 2023, Complete Volum

Dagstuhl Research Online Publication Server

Fundamentals

Author
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 30/01/2023
Field of study

Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

Directory of Open Access Books (DOAB)

Fundamentals

Author
Publication venue: 'Walter de Gruyter GmbH'
Publication date
Field of study

OAPEN Library

Statistical inference for some choice models

Author: Zhou Kaifang
Publication venue
Publication date: 01/03/2023
Field of study

This thesis comprises two chapters that study the statistical inference problems for two types of choice models, namely, the discrete voter model and the Bradley-Terry models, respectively. In Chapter 1, we consider a discrete-time voter model process on a set of nodes, each being in one of two states, either 0 or 1. In each time step, each node adopts the state of a randomly sampled neighbour according to sampling probabilities, referred to as node interaction parameters. We study the maximum likelihood estimation of the node interaction parameters from observed node states for a given number of realizations of the voter model process. We present parameter estimation error bounds by interpreting the observation data as being generated according to an extended voter process that consists of cycles, each corresponding to a realization of the voter model process until absorption to a consensus state. We present new bounds for all moments and a probability tail bound for consensus time. We also present a sampling complexity lower bound for parameter estimation within a prescribed error tolerance for the class of locally stable estimators. In Chapter 2, we study the popular methods for inference of the Bradley-Terry model parameters, namely the gradient descent and MM algorithm, for maximum likelihood estimation and maximum a posteriori probability estimation. This class of models includes the Bradley-Terry model of paired comparisons, the Rao-Kupper model of paired comparisons allowing for tie outcomes, the Luce choice model, and the Plackett-Luce ranking model. We propose a simple modification of the classical gradient descent and MM algorithm with a parameter rescaling performed at each iteration step that avoids the observed slow convergence issue that we found in our previous work (Vojnovic et al. [2020]). We study the convergence rates of accelerated gradient descent and MM Algorithms for Bradley-Terry models. We also produce some experimental results using synthetic and real-world data to show that significant efficiency gains can be obtained by our new proposed method

LSE Theses Online

LIPIcs, Volume 261, ICALP 2023, Complete Volume

Author: Etessami Kousha
Feige Uriel
Puppis Gabriele
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)
Publication date: 01/01/2023
Field of study

LIPIcs, Volume 261, ICALP 2023, Complete Volum

Dagstuhl Research Online Publication Server

LIPIcs, Volume 244, ESA 2022, Complete Volume

Author: Chechik Shiri
Herman Grzegorz
Navarro Gonzalo
Rotenberg Eva
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022
Field of study

LIPIcs, Volume 244, ESA 2022, Complete Volum

Dagstuhl Research Online Publication Server

TOPICS IN COMPUTATIONAL NUMBER THEORY AND CRYPTANALYSIS - On Simultaneous Chinese Remaindering, Primes, the MiNTRU Assumption, and Functional Encryption

Author: Barthel Jim Jean-Pierre
Publication venue: University of Luxembourg, Luxembourg
Publication date: 02/09/2022
Field of study

This thesis reports on four independent projects that lie in the intersection of mathematics, computer science, and cryptology: Simultaneous Chinese Remaindering: The classical Chinese Remainder Problem asks to find all integer solutions to a given system of congruences where each congruence is defined by one modulus and one remainder. The Simultaneous Chinese Remainder Problem is a direct generalization of its classical counterpart where for each modulus the single remainder is replaced by a non-empty set of remainders. The solutions of a Simultaneous Chinese Remainder Problem instance are completely defined by a set of minimal positive solutions, called primitive solutions, which are upper bounded by the lowest common multiple of the considered moduli. However, contrary to its classical counterpart, which has at most one primitive solution, the Simultaneous Chinese Remainder Problem may have an exponential number of primitive solutions, so that any general-purpose solving algorithm requires exponential time. Furthermore, through a direct reduction from the 3-SAT problem, we prove first that deciding whether a solution exists is NP-complete, and second that if the existence of solutions is guaranteed, then deciding whether a solution of a particular size exists is also NP-complete. Despite these discouraging results, we studied methods to find the minimal solution to Simultaneous Chinese Remainder Problem instances and we discovered some interesting statistical properties. A Conjecture On Primes In Arithmetic Progressions And Geometric Intervals: Dirichlet’s theorem on primes in arithmetic progressions states that for any positive integer q and any coprime integer a, there are infinitely many primes in the arithmetic progression a + nq (n ∈ N), however, it does not indicate where those primes can be found. Linnik’s theorem predicts that the first such prime p0 can be found in the interval [0;q^L] where L denotes an absolute and explicitly computable constant. Albeit only L = 5 has been proven, it is widely believed that L ≤ 2. We generalize Linnik’s theorem by conjecturing that for any integers q ≥ 2, 1 ≤ a ≤ q − 1 with gcd(q, a) = 1, and t ≥ 1, there exists a prime p such that p ∈ [q^t;q^(t+1)] and p ≡ a mod q. Subsequently, we prove the conjecture for all sufficiently large exponent t, we computationally verify it for all sufficiently small modulus q, and we investigate its relation to other mathematical results such as Carmichael’s totient function conjecture. On The (M)iNTRU Assumption Over Finite Rings: The inhomogeneous NTRU (iNTRU) assumption is a recent computational hardness assumption, which claims that first adding a random low norm error vector to a known gadget vector and then multiplying the result with a secret vector is sufficient to obfuscate the considered secret vector. The matrix inhomogeneous NTRU (MiNTRU) assumption essentially replaces vectors with matrices. Albeit those assumptions strongly remind the well-known learning-with-errors (LWE) assumption, their hardness has not been studied in full detail yet. We provide an elementary analysis of the corresponding decision assumptions and break them in their basis case using an elementary q-ary lattice reduction attack. Concretely, we restrict our study to vectors over finite integer rings, which leads to a problem that we call (M)iNTRU. Starting from a challenge vector, we construct a particular q-ary lattice that contains an unusually short vector whenever the challenge vector follows the (M)iNTRU distribution. Thereby, elementary lattice reduction allows us to distinguish a random challenge vector from a synthetically constructed one. A Conditional Attack Against Functional Encryption Schemes: Functional encryption emerged as an ambitious cryptographic paradigm supporting function evaluations over encrypted data revealing the result in plain. Therein, the result consists either in a valid output or a special error symbol. We develop a conditional selective chosen-plaintext attack against the indistinguishability security notion of functional encryption. Intuitively, indistinguishability in the public-key setting is based on the premise that no adversary can distinguish between the encryptions of two known plaintext messages. As functional encryption allows us to evaluate functions over encrypted messages, the adversary is restricted to evaluations resulting in the same output only. To ensure consistency with other primitives, the decryption procedure of a functional encryption scheme is allowed to fail and output an error. We observe that an adversary may exploit the special role of these errors to craft challenge messages that can be used to win the indistinguishability game. Indeed, the adversary can choose the messages such that their functional evaluation leads to the common error symbol, but their intermediate computation values differ. A formal decomposition of the underlying functionality into a mathematical function and an error trigger reveals this dichotomy. Finally, we outline the impact of this observation on multiple DDH-based inner-product functional encryption schemes when we restrict them to bounded-norm evaluations only

Open Repository and Bibliography - Luxembourg