
    Notes on Information-Theoretic Privacy

    We investigate the tradeoff between privacy and utility when both are measured in terms of mutual information. For the binary case, we fully characterize this tradeoff under perfect privacy and give an upper bound for the case where some privacy leakage is allowed. We then introduce a new quantity that measures the amount of private information contained in the observable data and connect it to the optimal tradeoff between privacy and utility. Comment: The corrected version of a paper that appeared in Allerton 201
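A minimal sketch of how the two quantities in this abstract can be computed, assuming a setup the abstract does not spell out: private data X, observable data Y, and a released variable Z obtained by passing Y through a randomized mapping P(Z | Y). The function names `mutual_information` and `release` and the example distribution are hypothetical, chosen only for illustration.

```python
import math

def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as a nested list joint[a][b]."""
    pa = [sum(row) for row in joint]          # marginal of A
    pb = [sum(col) for col in zip(*joint)]    # marginal of B
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:                          # 0 * log(0) is taken as 0
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def release(joint_xy, channel):
    """Pass Y through channel[y][z] = P(Z=z | Y=y).

    Returns the joint pmfs of (X, Z) and (Y, Z), assuming Z depends on X
    only through Y (an assumption of this sketch).
    """
    ny, nz = len(channel), len(channel[0])
    joint_xz = [[sum(row[y] * channel[y][z] for y in range(ny))
                 for z in range(nz)] for row in joint_xy]
    py = [sum(col) for col in zip(*joint_xy)]
    joint_yz = [[py[y] * channel[y][z] for z in range(nz)] for y in range(ny)]
    return joint_xz, joint_yz

# Hypothetical binary example: X and Y agree with probability 0.8.
joint_xy = [[0.4, 0.1],
            [0.1, 0.4]]
noisy = [[0.9, 0.1],
         [0.1, 0.9]]                           # flip Y with probability 0.1
joint_xz, joint_yz = release(joint_xy, noisy)
leakage = mutual_information(joint_xz)         # privacy leakage I(X;Z)
utility = mutual_information(joint_yz)         # utility I(Y;Z)
print(f"leakage={leakage:.4f} bits, utility={utility:.4f} bits")
```

Under this reading, perfect privacy corresponds to choosing the mapping so that the leakage I(X;Z) is exactly zero, and the tradeoff asks how large I(Y;Z) can still be.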

    Exponential and power laws in public procurement markets

    We present the first analysis of a unique public procurement database that records the number of bidders, the final price, the winner, and the contracting authority for each of more than 40,000 public procurements in the Czech Republic between 2006 and 2011, focusing on the distributional properties of the variables of interest. We uncover several scaling laws: an exponential law for the number of bidders, and power laws for the total revenues and total spending of the participating companies, with the latter even following Zipf's law for the 100 highest-spending institutions. We propose an analogy between extensive and non-extensive systems in physics and public procurement market situations. Through entropy maximization, this analogy yields some interesting results and policy implications with respect to the Maxwell-Boltzmann and Pareto distributions in the analyzed quantities. Comment: 6 pages, 3 figures
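As a rough illustration of what fitting a power law to such data can involve, the sketch below uses the standard maximum-likelihood (Hill-type) estimator for the exponent of a continuous power-law tail. The abstract does not say which estimator the authors used, so this is an assumption; the function names and the synthetic data are hypothetical and are not taken from the procurement database.

```python
import math
import random

def powerlaw_alpha_mle(values, x_min):
    """MLE of alpha for a density p(x) ~ x**(-alpha) on x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one observation above x_min")
    return 1.0 + len(tail) / log_sum

def sample_powerlaw(n, alpha, x_min, rng):
    """Draw n synthetic values by inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

rng = random.Random(0)                      # fixed seed for repeatability
data = sample_powerlaw(20000, alpha=2.0, x_min=1.0, rng=rng)
print(f"estimated alpha: {powerlaw_alpha_mle(data, x_min=1.0):.3f}")
```

A density exponent near 2 corresponds to a rank-size exponent near 1, which is the Zipf's-law regime the abstract mentions.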

    Statistical inference of the generation probability of T-cell receptors from sequence repertoires

    Stochastic rearrangement of germline DNA by VDJ recombination is at the origin of immune system diversity. This process is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Since any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on non-productive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our distribution predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system. Comment: 20 pages, including Appendix
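The point that one sequence can arise from several hidden recombination scenarios can be made concrete with a deliberately tiny toy model. Everything below is a hypothetical simplification and not the authors' model: a single gene segment is chosen, some nucleotides are deleted from its end, and uniformly random nucleotides are appended. The generation probability of a sequence is then the sum of the probabilities of all scenarios that produce it.

```python
from itertools import product

def generation_probability(seq, genes, p_del, p_ins):
    """Sum the probabilities of every (gene, deletion, insertion) scenario
    that yields seq in the toy model described above.

    genes: name -> (template sequence, choice probability)
    p_del: number of deleted end nucleotides -> probability
    p_ins: number of inserted nucleotides -> probability
    """
    total = 0.0
    for template, p_gene in genes.values():
        for n_del, p_d in p_del.items():
            if n_del > len(template):
                continue
            kept = template[:len(template) - n_del]
            n_ins = len(seq) - len(kept)
            if n_ins < 0 or not seq.startswith(kept):
                continue                     # scenario cannot produce seq
            # Each inserted nucleotide is uniform over A, C, G, T.
            total += p_gene * p_d * p_ins.get(n_ins, 0.0) * 0.25 ** n_ins
    return total

# Hypothetical toy parameters.
genes = {"V1": ("ACG", 0.6), "V2": ("ACT", 0.4)}
p_del = {0: 0.5, 1: 0.5}
p_ins = {0: 0.5, 1: 0.5}

# "ACG" arises from three scenarios: V1 intact, V1 trimmed plus inserted G,
# and V2 trimmed plus inserted G.
print(generation_probability("ACG", genes, p_del, p_ins))

# Sanity check: probabilities over all reachable sequences sum to one.
all_seqs = ("".join(s) for n in range(2, 5) for s in product("ACGT", repeat=n))
print(sum(generation_probability(s, genes, p_del, p_ins) for s in all_seqs))
```

Inference runs in the opposite direction: given many observed sequences, the scenario probabilities are adjusted to maximize the likelihood of the data, which is why the abstract describes a maximum likelihood method rather than direct counting.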