Nearly Optimal Private Convolution
We study computing the convolution of a private input x with a public input
h, while satisfying the guarantees of (ε, δ)-differential
privacy. Convolution is a fundamental operation, intimately related to Fourier
Transforms. In our setting, the private input may represent a time series of
sensitive events or a histogram of a database of confidential personal
information. Convolution then captures important primitives including linear
filtering, which is an essential tool in time series analysis, and aggregation
queries on projections of the data.
We give a nearly optimal algorithm for computing convolutions while
satisfying (ε, δ)-differential privacy. Surprisingly, we follow
the simple strategy of adding independent Laplacian noise to each Fourier
coefficient and bounding the privacy loss using the composition theorem of
Dwork, Rothblum, and Vadhan. We derive a closed form expression for the optimal
noise to add to each Fourier coefficient using convex programming duality. Our
algorithm is very efficient -- it is essentially no more computationally
expensive than a Fast Fourier Transform.
To prove near optimality, we use the recent discrepancy lower bounds of
Muthukrishnan and Nikolov and derive a spectral lower bound using a
characterization of discrepancy in terms of determinants.
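The strategy described in this abstract (perturb each Fourier coefficient of the private input with Laplace noise, then multiply by the public input's coefficients) can be sketched roughly as follows. This is only an illustration under stated assumptions: `noise_scale` is a hypothetical uniform parameter that is not calibrated to any privacy budget, whereas the abstract derives an optimal per-coefficient noise level; a naive DFT stands in for an FFT; and taking the real part of the result is a simplification.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; an FFT would be used in practice."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * f * t / n)
        out.append(total / n if inverse else total)
    return out


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as a difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_circular_convolution(x, h, noise_scale, rng=None):
    """Circular convolution of private x with public h, with Laplace noise
    added to each Fourier coefficient of x.

    noise_scale is a hypothetical, uniform noise level chosen for
    illustration; it does NOT implement the optimal allocation and gives
    no stated privacy guarantee.
    """
    if rng is None:
        rng = random.Random()
    x_hat = dft(x)
    h_hat = dft(h)
    # Perturb the real and imaginary parts of every coefficient of x.
    noisy = [
        c + complex(laplace(noise_scale, rng), laplace(noise_scale, rng))
        for c in x_hat
    ]
    # Convolution theorem: multiply coefficient-wise, then invert.
    y_hat = [a * b for a, b in zip(noisy, h_hat)]
    # Simplification: keep only the real part of the inverse transform.
    return [c.real for c in dft(y_hat, inverse=True)]
```

With `noise_scale=0` the function reduces to an exact circular convolution, which is a convenient sanity check before any noise is introduced.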
Urban Gravity: a Model for Intercity Telecommunication Flows
We analyze the anonymous communication patterns of 2.5 million customers of a
Belgian mobile phone operator. Grouping customers by billing address, we build
a social network of cities, that consists of communications between 571 cities
in Belgium. We show that inter-city communication intensity is characterized by
a gravity model: the communication intensity between two cities is proportional
to the product of their sizes divided by the square of their distance.
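The gravity relation stated above can be written as a one-line function. This is a minimal sketch: the function name, the `scale` constant of proportionality, and the example numbers are assumptions for illustration and are not taken from the study.

```python
def gravity_intensity(size_a, size_b, distance, scale=1.0):
    """Communication intensity proportional to the product of the two city
    sizes divided by the squared distance between them.

    scale is a hypothetical constant of proportionality; the abstract does
    not report a fitted value.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return scale * size_a * size_b / distance ** 2


# Illustrative values only: doubling the distance quarters the intensity.
near = gravity_intensity(100, 200, 10)
far = gravity_intensity(100, 200, 20)
```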
On Delays in Management Frameworks: Metrics, Models and Analysis
Management performance evaluation means assessing scalability, complexity, accuracy, throughput, delays and resource consumption. In this paper, we focus on evaluating the delays of management frameworks through a set of specific metrics. We investigate the statistical properties of these metrics as the number of management nodes increases. We show that management delays measured at the application level are statistically modeled by heavy-tailed distributions, especially the Weibull distribution. Given that delays can substantially degrade the capacity of management algorithms to react and resolve problems, it is useful to have a finer model to describe them. We suggest the Weibull distribution as a model of delays for the analysis and simulation of such algorithms.
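For readers who want to experiment with a Weibull delay model of the kind suggested here, the distribution's CDF and quantile function are short enough to write directly. This is a sketch only: the `scale` and `shape` parameter names and any values passed in are assumptions, since the abstract reports no fitted parameters. A shape below 1 gives a tail heavier than an exponential distribution.

```python
import math


def weibull_cdf(x, scale, shape):
    """P(delay <= x) for a Weibull distribution with the given scale and shape."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_quantile(p, scale, shape):
    """Delay value below which a fraction p of delays fall (0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)
```

A simulation could draw delays with the standard library call `random.weibullvariate(scale, shape)` and compare empirical percentiles against `weibull_quantile`.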
Identifiability of flow distributions from link measurements with applications to computer networks
We study the problem of identifiability of distributions of flows on a graph from aggregate measurements collected on its edges. This is a canonical example of a statistical inverse problem motivated by recent developments in computer networks. In this paper we (i) introduce a number of models for multi-modal data that capture their spatio-temporal correlation, and (ii) provide sufficient conditions for the identifiability of nth order cumulants and also for a special class of heavy-tailed distributions. Further, we investigate conditions on network routing for the flows that prove sufficient for identifiability of their distributions (up to mean). Finally, we extend our results to directed acyclic graphs and discuss some open problems. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/58107/2/ip7_5_004.pd
Draft Genome Sequence of Xanthomonas axonopodis pv. allii Strain CFBP 6369
We report here the draft genome sequence of Xanthomonas axonopodis pv. allii strain CFBP 6369, the causal agent of bacterial blight of onion. The draft genome has a size of 5,425,942 bp and a G+C content of 64.4%.
Structure and expression analysis of rice paleo duplications
Having a well-known history of genome duplication, rice is a good model for studying structural and functional evolution of paleo duplications. Improved sequence alignment criteria were used to characterize 10 major chromosome-to-chromosome duplication relationships associated with 1440 paralogous pairs, covering 47.8% of the rice genome, with 12.6% of genes that are conserved within sister blocks. Using a micro-array experiment, a genome-wide expression map has been produced, in which 2382 genes show significant differences of expression in root, leaf and grain. By integrating both structural (1440 paralogous pairs) and functional information (2382 differentially expressed genes), we identified 115 paralogous gene pairs for which at least one copy is differentially expressed in one of the three tissues. A vast majority of the 115 paralogous gene pairs have been neofunctionalized or subfunctionalized, as 88%, 89% and 96% of duplicates expressed in grain, leaf and root, respectively, show distinct expression patterns. On the basis of a Gene Ontology analysis, we have identified and characterized the gene families that have been structurally and functionally preferentially retained in the duplication, showing that the vast majority (>85%) of duplicated genes have been either lost or have been subfunctionalized or neofunctionalized during 50–70 million years of evolution.
Characterization of globulin storage proteins of a low prolamin cereal species in relation to celiac disease
Brachypodium distachyon, a small annual grass with seed storage globulins as primary protein reserves, was used in our study to analyse the toxic nature of non-prolamin seed storage proteins related to celiac disease. The main storage proteins of B. distachyon are the 7S globulin type proteins and the 11S, 12S seed storage globulins similar to oat and rice. Immunoblot analyses using serum samples from celiac disease patients were carried out, followed by the identification of immune-responsive proteins using mass spectrometry. Serum samples from celiac patients on a gluten-free diet, from patients with Crohn's disease and from healthy subjects were used as controls. The identified proteins with intense serum-IgA reactivity belong to the 7S and 11-12S seed globulin family. Structure prediction and epitope prediction analyses confirmed the presence of celiac disease-related linear B cell epitope homologs and the presence of peptide regions with strong HLA-DQ8 and DQ2 binding capabilities. These results highlight that both MHC-II presentation and B cell response may be developed not only to prolamins but also to seed storage globulins. This is the first study of the non-prolamin type seed storage proteins of Brachypodium from the perspective of celiac disease.
Specific patterns of gene space organisation revealed in wheat by using the combination of barley and wheat genomic resources
Background: Because of its size, allohexaploid nature and high repeat content, the wheat genome has always been perceived as too complex for efficient molecular studies. We recently constructed the first physical map of a wheat chromosome (3B). However, gene mapping is still laborious in wheat because of high redundancy between the three homoeologous genomes. In contrast, in the closely related diploid species, barley, numerous gene-based markers have been developed. This study aims at combining the unique genomic resources developed in wheat and barley to decipher the organisation of gene space on wheat chromosome 3B. Results: Three-dimensional pools of the minimal tiling path of the wheat chromosome 3B physical map were hybridised to a barley Agilent 15K expression microarray. This led to the fine mapping of 738 barley orthologous genes on wheat chromosome 3B. In addition, comparative analyses revealed that 68% of the genes identified were syntenic between wheat chromosome 3B and barley chromosome 3H and 59% between wheat chromosome 3B and rice chromosome 1, together with some wheat-specific rearrangements. Finally, it indicated an increasing gradient of gene density from the centromere to the telomeres, positively correlated with the number of genes clustered in islands on wheat chromosome 3B. Conclusion: Our study shows that novel structural genomics resources now available in wheat and barley can be combined efficiently to overcome specific problems of genetic anchoring of physical contigs in wheat and to perform high-resolution comparative analyses with rice for deciphering the organisation of the wheat gene space.