    The Query Complexity of Mastermind with l_p Distances

    Consider a variant of the Mastermind game in which queries are answered with l_p distances rather than the usual Hamming distance. That is, a codemaker chooses a hidden vector y in {-k,-k+1,...,k-1,k}^n and answers queries of the form ||y-x||_p, where x in {-k,-k+1,...,k-1,k}^n. The goal is to minimize the number of queries needed to correctly guess y. In this work, we show an upper bound of O(min{n,(n log k)/(log n)}) queries for any real 10. Thus, essentially any approximation of this problem is as hard as finding the hidden vector exactly, up to constant factors. Finally, we show that for the noisy version of the problem, i.e., the setting in which the codemaker answers queries with any q = (1 +/- epsilon)||y-x||_p, there is no query-efficient algorithm.
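
The query model described above can be sketched in a few lines. The following is a minimal illustration, assuming a finite real p >= 1; the names `make_lp_oracle` and `recover_naive` are hypothetical, and the recovery routine is a naive baseline using n + 1 queries, not the algorithm behind the bound stated in the abstract.

```python
from typing import Callable, List, Sequence


def make_lp_oracle(y: Sequence[int], p: float) -> Callable[[Sequence[int]], float]:
    """Return a codemaker that answers each query x with ||y - x||_p (finite p)."""
    def oracle(x: Sequence[int]) -> float:
        if len(x) != len(y):
            raise ValueError("query must have the same length as the hidden vector")
        return sum(abs(a - b) ** p for a, b in zip(y, x)) ** (1.0 / p)
    return oracle


def recover_naive(oracle: Callable[[Sequence[int]], float], n: int, k: int, p: float) -> List[int]:
    """Naive baseline (an assumption of this sketch): recover y with n + 1 queries.

    The first query is x = (-k, ..., -k). Query i then moves coordinate i to +k,
    which changes the p-th power of the answer by (y_i + k)^p - (k - y_i)^p.
    This quantity is strictly increasing in y_i, so it identifies y_i.
    """
    base_query = [-k] * n
    base = oracle(base_query) ** p
    recovered = []
    for i in range(n):
        x = list(base_query)
        x[i] = k
        delta = base - oracle(x) ** p
        # Pick the candidate value whose predicted change is closest to delta.
        best = min(range(-k, k + 1),
                   key=lambda v: abs((v + k) ** p - (k - v) ** p - delta))
        recovered.append(best)
    return recovered


hidden = [3, -1, 0, 2]
print(recover_naive(make_lp_oracle(hidden, 2), n=4, k=3, p=2))
```

Matching by nearest candidate keeps the sketch tolerant of floating-point rounding for small p; it is not meant to handle the noisy setting mentioned at the end of the abstract.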

    The Exact Query Complexity of Yes-No Permutation Mastermind

    Mastermind is a famous two-player game. The first player (codemaker) chooses a secret code which the second player (codebreaker) is supposed to crack within a minimum number of code guesses (queries). Therefore, the codemaker's duty is to help the codebreaker by providing a well-defined error measure between the secret code and the guessed code after each query. We consider a variant, called Yes-No AB-Mastermind, where both secret code and queries must be repetition-free and the information provided by the codemaker only indicates whether a query contains any correct position at all. For this Mastermind version with $n$ positions, $k \geq n$ colors, and $\ell := k + 1 - n$, we prove a lower bound of $\sum_{j=\ell}^{k} \log_2 j$ and an upper bound of $n \log_2 n + k$ on the number of queries necessary to break the secret code. For the important case $k = n$, where both secret code and queries represent permutations, our results imply an exact asymptotic complexity of $\Theta(n \log n)$ queries.
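
To make the Yes-No feedback and the two stated bounds concrete, here is a small sketch. The function names are hypothetical; only the feedback rule and the two formulas (lower bound: sum of log_2 j for j from k + 1 - n to k; upper bound: n log_2 n + k) are taken from the abstract.

```python
import math
from typing import Sequence


def yes_no_answer(secret: Sequence[int], guess: Sequence[int]) -> bool:
    """Codemaker's reply: True iff the guess matches the secret in at least one position."""
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")
    if len(set(guess)) != len(guess):
        raise ValueError("queries must be repetition-free")
    return any(s == g for s, g in zip(secret, guess))


def lower_bound(n: int, k: int) -> float:
    """Sum of log2(j) for j = k + 1 - n, ..., k."""
    ell = k + 1 - n
    return sum(math.log2(j) for j in range(ell, k + 1))


def upper_bound(n: int, k: int) -> float:
    """n * log2(n) + k."""
    return n * math.log2(n) + k


print(yes_no_answer((1, 2, 3), (2, 3, 1)))  # no position agrees
print(lower_bound(8, 8), upper_bound(8, 8))
```

For k = n the lower bound equals log_2(n!), which grows as n log n and so matches the upper bound up to a constant factor.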

    On the Query Complexity of Black-Peg AB-Mastermind

    Mastermind is a two-player zero-sum game of imperfect information. Starting with Erdős and Rényi (1963), its combinatorics have been studied to date by several authors, e.g., Knuth (1977), Chvátal (1983), Goodrich (2009). The first player, called "codemaker", chooses a secret code and the second player, called "codebreaker", tries to break the secret code by making as few guesses as possible, exploiting information that is given by the codemaker after each guess. For variants that allow color repetition, Doerr et al. (2016) showed optimal results. In this paper, we consider the so-called Black-Peg variant of Mastermind, where the only information concerning a guess is the number of positions in which the guess coincides with the secret code. More precisely, we deal with a special version of the Black-Peg game with $n$ holes and $k \geq n$ colors where no repetition of colors is allowed. We present upper and lower bounds on the number of guesses necessary to break the secret code. For the case $k = n$, the secret code can be algorithmically identified within fewer than $(n-3)\lceil \log_2 n \rceil + \frac{5}{2} n - 1$ queries. This result improves the result of Ker-I Ko and Shia-Chung Teng (1985) by almost a factor of 2. For the case $k > n$, we prove an upper bound of $(n-2)\lceil \log_2 n \rceil + k + 1$. Furthermore, we prove a new lower bound of $n$ for the case $k = n$, which improves the recent $n - \log\log(n)$ bound of Berger et al. (2016). We then generalize this lower bound to $k$ queries for the case $k \geq n$.
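
The Black-Peg feedback and the two upper bounds quoted above can be evaluated directly. This is an illustrative sketch with hypothetical function names; it implements only the feedback rule and the formulas, not the guessing strategy of the paper.

```python
from typing import Sequence


def black_pegs(secret: Sequence[int], guess: Sequence[int]) -> int:
    """Number of positions in which the guess coincides with the secret code."""
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")
    if len(set(secret)) != len(secret) or len(set(guess)) != len(guess):
        raise ValueError("no repetition of colors is allowed")
    return sum(s == g for s, g in zip(secret, guess))


def ceil_log2(n: int) -> int:
    """Exact ceil(log2(n)) for n >= 1, computed with integer arithmetic."""
    return (n - 1).bit_length()


def upper_bound_permutation(n: int) -> float:
    """(n - 3) * ceil(log2 n) + (5/2) * n - 1, the stated bound for k = n."""
    return (n - 3) * ceil_log2(n) + 2.5 * n - 1


def upper_bound_general(n: int, k: int) -> int:
    """(n - 2) * ceil(log2 n) + k + 1, the stated bound for k > n."""
    return (n - 2) * ceil_log2(n) + k + 1


print(black_pegs((1, 2, 3, 4), (1, 3, 2, 4)))
print(upper_bound_permutation(8), upper_bound_general(8, 10))
```

Using `bit_length` instead of a floating-point logarithm avoids rounding issues when n is an exact power of two.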

    Complexity Theory for Discrete Black-Box Optimization Heuristics

    A predominant topic in the theory of evolutionary algorithms and, more generally, theory of randomized black-box optimization techniques is running time analysis. Running time analysis aims at understanding the performance of a given heuristic on a given problem by bounding the number of function evaluations that are needed by the heuristic to identify a solution of a desired quality. As in general algorithms theory, this running time perspective is most useful when it is complemented by a meaningful complexity theory that studies the limits of algorithmic solutions. In the context of discrete black-box optimization, several black-box complexity models have been developed to analyze the best possible performance that a black-box optimization algorithm can achieve on a given problem. The models differ in the classes of algorithms to which these lower bounds apply. This way, black-box complexity contributes to a better understanding of how certain algorithmic choices (such as the amount of memory used by a heuristic, its selective pressure, or properties of the strategies that it uses to create new solution candidates) influence performance. In this chapter we review the different black-box complexity models that have been proposed in the literature, survey the bounds that have been obtained for these models, and discuss how the interplay of running time analysis and black-box complexity can inspire new algorithmic solutions to well-researched problems in evolutionary computation. We also discuss in this chapter several interesting open questions for future work. Comment: This survey article is to appear (in a slightly modified form) in the book "Theory of Randomized Search Heuristics in Discrete Search Spaces", which will be published by Springer in 2018. The book is edited by Benjamin Doerr and Frank Neumann. Missing numbers of pointers to other chapters of this book will be added as soon as possible.

    Improved Approximation Algorithm for the Number of Queries Necessary to Identify a Permutation

    In the past three decades, deductive games have become interesting from the algorithmic point of view. Deductive games are two-player zero-sum games of imperfect information. The first player, called "codemaker", chooses a secret code and the second player, called "codebreaker", tries to break the secret code by making as few guesses as possible, exploiting information that is given by the codemaker after each guess. A well-known deductive game is the famous Mastermind game. In this paper, we consider the so-called Black-Peg variant of Mastermind, where the only information concerning a guess is the number of positions in which the guess coincides with the secret code. More precisely, we deal with a special version of the Black-Peg game with n holes and k >= n colors where no repetition of colors is allowed. We present a strategy that identifies the secret code in O(n log n) queries. Our algorithm improves the previous result of Ker-I Ko and Shia-Chung Teng (1985) by almost a factor of 2 for the case k = n. To our knowledge there is no previous work dealing with the case k > n. Keywords: Mastermind; combinatorial problems; permutations; algorithm

    Toward a complexity theory for randomized search heuristics: black-box models

    Randomized search heuristics are a broadly used class of general-purpose algorithms. Analyzing them via classical methods of theoretical computer science is a growing field. While several strong runtime bounds exist, a powerful complexity theory for such algorithms is yet to be developed. We contribute to this goal in several aspects. In a first step, we analyze existing black-box complexity models. Our results indicate that these models are not restrictive enough. This remains true if we restrict the memory of the algorithms under consideration. These results motivate us to enrich the existing notions of black-box complexity by the additional restriction that not actual objective values, but only the relative quality of the previously evaluated solutions may be taken into account by the algorithms. Many heuristics belong to this class of algorithms. We show that our ranking-based model gives more realistic complexity estimates for some problems, while for others the low complexities of the previous models still hold. Surprisingly, our results have an interesting game-theoretic aspect as well. We show that analyzing the black-box complexity of the OneMax_n function class, a class often used to analyze how heuristics progress in easy parts of the search space, is the same as analyzing optimal winning strategies for the generalized Mastermind game with 2 colors and length-n codewords. This connection was seemingly overlooked so far in the search heuristics community.
Randomized search heuristics are versatile algorithms that, owing to their high flexibility, are widely used, and not only in industrial contexts. Despite numerous successful applications, the runtime analysis of such heuristics is still in its infancy. In particular, we lack a good understanding of the situations in which problem-independent heuristics deliver good solutions in short running time. A complexity theory similar to the one that exists in classical algorithmics would be desirable. With this work we contribute to the development of such a complexity theory for search heuristics. We show by means of several examples that existing models do not always capture the difficulty of a problem satisfactorily. We therefore propose a further model. In our Ranking-Based Black-Box Model, the algorithms do not learn exact function values but only the ranking of the search points queried so far. For some problems this model gives a better estimate of the difficulty. However, we also show that the new model, too, admits problems whose complexity is estimated as too low. Our results also have a game-theoretic aspect. Optimal winning strategies for the guesser in the Mastermind game (also known as SuperHirn) with n positions correspond exactly to optimal algorithms for maximizing OneMax_n functions. This connection has apparently been overlooked so far. This thesis is written in English.
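
The correspondence between OneMax_n and two-color Mastermind can be checked exhaustively for small n. The sketch below assumes the usual definition of the generalized OneMax function with hidden target z, namely the number of positions in which x agrees with z; the function names are hypothetical.

```python
from itertools import product
from typing import Sequence


def onemax_z(z: Sequence[int], x: Sequence[int]) -> int:
    """Generalized OneMax with hidden bit string z: n minus the Hamming distance of x and z."""
    return sum(1 - (xi ^ zi) for xi, zi in zip(x, z))


def mastermind_black_pegs(secret: Sequence[int], guess: Sequence[int]) -> int:
    """Mastermind feedback with colors {0, 1}: number of positions guessed correctly."""
    return sum(s == g for s, g in zip(secret, guess))


def agree_everywhere(n: int) -> bool:
    """Check that both functions coincide for every secret and every query of length n."""
    points = list(product((0, 1), repeat=n))
    return all(onemax_z(z, x) == mastermind_black_pegs(z, x)
               for z in points for x in points)


print(agree_everywhere(4))
```

Because a black-box algorithm sees exactly the same values that a codebreaker would see, a query strategy for one problem transfers unchanged to the other.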

    Recovery from Non-Decomposable Distance Oracles

    A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence $s \in \{0,1\}^{\leq n}$, and one chooses a set of queries $y \in \{0,1\}^{\mathcal{O}(n)}$ and receives $d(s,y)$ for a distance function $d$. The goal is to make as few queries as possible to recover $s$. Although this problem is well-studied for decomposable distances, i.e., distances of the form $d(s,y) = \sum_{i=1}^n f(s_i, y_i)$ for some function $f$, which includes the important cases of Hamming distance, $\ell_p$-norms, and $M$-estimators, to the best of our knowledge this problem has not been studied for non-decomposable distances, for which there are important special cases such as edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and so on. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence $s$ is provably impossible, so we show that allowing the characters in $y$ to be drawn from a slightly larger alphabet makes recovery possible. In a number of cases we obtain optimal or near-optimal query complexity. We also study the role of adaptivity for a number of different distance functions. One motivation for understanding non-adaptivity is that the query sequence can be fixed and the distances of the input to the queries provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing. Comment: This work has been presented at the 14th Innovations in Theoretical Computer Science conference (ITCS 2023) and accepted for publication in the journal IEEE Transactions on Information Theory.
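
As a point of reference for the decomposable case, the following sketch evaluates a distance of the form d(s, y) = sum of f(s_i, y_i) and recovers a bit string from Hamming-distance queries. It assumes the length of s is known to be exactly n, which is stronger than the setting in the abstract, and the n + 1 query baseline shown is a textbook-style illustration with hypothetical names, not a result of the paper.

```python
from typing import Callable, List, Sequence


def decomposable_distance(s: Sequence[int], y: Sequence[int],
                          f: Callable[[int, int], float]) -> float:
    """d(s, y) = sum_i f(s_i, y_i) for sequences of equal length."""
    if len(s) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(f(a, b) for a, b in zip(s, y))


def hamming_f(a: int, b: int) -> int:
    """Per-coordinate cost that turns the sum into the Hamming distance."""
    return int(a != b)


def recover_bits(oracle: Callable[[Sequence[int]], float], n: int) -> List[int]:
    """Recover s in {0,1}^n with n + 1 Hamming queries (assumes the length n is known)."""
    zero = [0] * n
    base = oracle(zero)  # equals the number of ones in s
    recovered = []
    for i in range(n):
        y = list(zero)
        y[i] = 1
        # Setting y_i = 1 lowers the distance by one iff s_i = 1.
        recovered.append(1 if oracle(y) < base else 0)
    return recovered


secret = [1, 0, 1, 1, 0]
print(recover_bits(lambda y: decomposable_distance(secret, y, hamming_f), len(secret)))
```

The coordinate-by-coordinate argument works only because each position contributes independently to the sum, which is exactly what fails for distances such as edit distance or DTW.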

    Secure and Efficient Comparisons between Untrusted Parties

    A vast number of online services are based on users contributing their personal information. Examples are manifold, including social networks, electronic commerce, sharing websites, lodging platforms, and genealogy. In all cases user privacy depends on collective trust in all involved intermediaries, like service providers, operators, administrators or even help desk staff. A single adversarial party in the whole chain of trust voids user privacy. Moreover, the number of intermediaries is ever growing. Thus, user privacy must be preserved at all times and stages, independent of the intrinsic goals of any involved party. Furthermore, alongside these new services, traditional offline analytic systems are being replaced by online services run in large data centers. Centralized processing of electronic medical records, genomic data or other health-related information is anticipated due to advances in medical research, better analytic results based on large amounts of medical information and lowered costs. In these scenarios privacy is of utmost concern due to the large amount of personal information contained within the centralized data. We focus on the challenge of privacy-preserving processing of genomic data, specifically comparing genomic sequences. The problem that arises is how to efficiently compare private sequences of two parties while preserving confidentiality of the compared data. It follows that the privacy of the data owner must be preserved, which means that as little information as possible must be leaked to any party participating in the comparison. Leakage can happen at several points during a comparison. The secured inputs for the comparing party might leak some information about the original input, or the output might leak information about the inputs. In the latter case, results of several comparisons can be combined to infer information about the confidential input of the party under observation.
Genomic sequences serve as a use-case, but the proposed solutions are more general and can be applied to the generic field of privacy-preserving comparison of sequences. The solution should be efficient, such that a comparison runs in time linear in the length of the input sequences and thus incurs acceptable costs for a typical use-case. To tackle the problem of efficient, privacy-preserving sequence comparisons, we propose a framework consisting of three main parts. a) The basic protocol presents an efficient sequence comparison algorithm, which transforms a sequence into a set representation, so that distance measures over input sequences can be approximated by distance measures over sets. The sets are then represented by an efficient data structure, the Bloom filter, which allows evaluation of certain set operations without storing the actual elements of the possibly large set. This representation yields low distortion for comparing similar sequences. Operations upon the set representation are carried out using efficient, partially homomorphic cryptographic systems for data confidentiality of the inputs. The output can be adjusted to either return the actual approximated distance or the result of an in-range check of the approximated distance. b) Building upon this efficient basic protocol we introduce the first mechanism to reduce the success of inference attacks by detecting and rejecting similar queries in a privacy-preserving way. This is achieved by generating generalized commitments for inputs. This generalization is done by treating inputs as messages received from a noisy channel, upon which error-correction from coding theory is applied. This way, similar inputs are defined as inputs whose generalized inputs have a Hamming distance below a certain predefined threshold. We present a protocol to perform a zero-knowledge proof to assess if the generalized input is indeed a generalization of the actual input.
Furthermore, we generalize a very efficient inference attack on privacy-preserving sequence comparison protocols and use it to evaluate our inference-control mechanism. c) The third part of the framework lightens the computational load of the client taking part in the comparison protocol by presenting a compression mechanism for partially homomorphic cryptographic schemes. It reduces the transmission and storage overhead induced by the semantically secure homomorphic encryption schemes, as well as encryption latency. The compression is achieved by constructing an asymmetric stream cipher such that the generated ciphertext can be converted into a ciphertext of an associated homomorphic encryption scheme without revealing any information about the plaintext. This is the first compression scheme available for partially homomorphic encryption schemes. Compression schemes for ciphertexts of fully homomorphic encryption are several orders of magnitude slower at the conversion from the transmission ciphertext to the homomorphically encrypted ciphertext. Indeed our compression scheme achieves optimal conversion performance. It further allows keystreams to be generated offline and thus supports offloading to trusted devices. This way transmission, storage, and power efficiency are improved. We give security proofs for all relevant parts of the proposed protocols and algorithms to evaluate their security. A performance evaluation of the core components demonstrates the practicability of our proposed solutions, including a theoretical analysis and practical experiments to show the accuracy as well as efficiency of approximations and probabilistic algorithms. Several variations and configurations to detect similar inputs are studied during an in-depth discussion of the inference-control mechanism. A human mitochondrial genome database is used for the practical evaluation to compare genomic sequences and detect similar inputs as described by the use-case.
In summary, we show that it is indeed possible to construct an efficient and privacy-preserving (genomic) sequence comparison, while being able to control the amount of information that leaves the comparison. To the best of our knowledge we also contribute to the field by proposing the first efficient privacy-preserving inference detection and control mechanism, as well as the first ciphertext compression system for partially homomorphic cryptographic systems.
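
The idea in part a), turning a sequence into a set and the set into a Bloom filter so that set-based similarity approximates sequence similarity, can be sketched in plaintext as follows. Everything specific here is an assumption of the sketch: the use of overlapping q-grams as the set representation, the SHA-256 based hash positions, the parameter values, and the Jaccard-style comparison. All cryptographic steps (homomorphic encryption, commitments, zero-knowledge proofs) are omitted.

```python
import hashlib
from typing import Set


def qgrams(sequence: str, q: int = 3) -> Set[str]:
    """Set representation of a sequence: all overlapping substrings of length q."""
    return {sequence[i:i + q] for i in range(len(sequence) - q + 1)}


def bloom_filter(items: Set[str], m: int = 256, hashes: int = 2) -> int:
    """Insert all items into an m-bit Bloom filter, returned as an integer bit mask."""
    bits = 0
    for item in items:
        for seed in range(hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            position = int.from_bytes(digest[:8], "big") % m
            bits |= 1 << position
    return bits


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def bloom_similarity(a: int, b: int) -> float:
    """Jaccard-style similarity on the bit level: |a AND b| / |a OR b|."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


reference = "ACGTTGCAAGCTTAGGCTAACG"
variant = "ACGTTGCAAGGTTAGGCTAACG"    # one substitution relative to reference
unrelated = "TTTTTTTTTTTTTTTTTTTTTT"

f_ref, f_var, f_unr = (bloom_filter(qgrams(s)) for s in (reference, variant, unrelated))
print(bloom_similarity(f_ref, f_var), bloom_similarity(f_ref, f_unr))
```

Building the filter takes time linear in the sequence length, and comparing two filters uses only bitwise operations on fixed-size bit vectors. The abstract describes carrying out such set operations under partially homomorphic encryption.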
