Finding the Median (Obliviously) with Bounded Space
We prove that any oblivious algorithm using space to find the median of a
list of integers from requires time . This bound also applies to the problem of determining whether the median
is odd or even. It is nearly optimal since Chan, following Munro and Raman, has
shown that there is a (randomized) selection algorithm using only
registers, each of which can store an input value or -bit counter,
that makes only passes over the input. The bound also implies
a size lower bound for read-once branching programs computing the low order bit
of the median and implies the analog of for length oblivious branching programs
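The bounded-space, multi-pass setting can be illustrated with a much simpler (and far less efficient) procedure than the selection algorithm attributed to Chan above. The sketch below is an assumption-laden illustration, not that algorithm: it binary-searches the value range, counts elements in one sequential pass per round, keeps only a constant number of counters, and runs a fixed number of passes so that the read pattern does not depend on the data. The function name and the `lo`/`hi` range parameters are hypothetical.

```python
from typing import Sequence


def median_by_counting_passes(values: Sequence[int], lo: int, hi: int) -> int:
    """Return the lower median of `values`, assuming every value lies in [lo, hi].

    Illustrative only: uses O(1) counters and one sequential pass per round.
    The number of rounds depends only on the range size, not on the data.
    """
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    target = (n + 1) // 2            # 1-based rank of the lower median
    rounds = (hi - lo).bit_length()  # ceil(log2(range size)) passes
    for _ in range(rounds):
        mid = (lo + hi) // 2
        count = 0
        for v in values:             # fixed, data-independent read order
            count += v <= mid
        if count >= target:
            hi = mid                 # median is at most mid
        else:
            lo = mid + 1             # median is larger than mid
    return lo


print(median_by_counting_passes([5, 1, 4, 2, 3], 1, 10))
```

This makes a number of passes logarithmic in the size of the value range, which is many more than the pass count quoted in the abstract; it is meant only to show what "few registers, sequential passes, fixed access pattern" looks like in code.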
Oblivious Median Slope Selection
We study the median slope selection problem in the oblivious RAM model. In
this model memory accesses have to be independent of the data processed, i.e.,
an adversary cannot use observed access patterns to derive additional
information about the input. We show how to modify the randomized algorithm of
Matoušek (1991) to obtain an oblivious version with O(n log^2 n) expected
time for n points in R^2. This complexity matches a theoretical upper bound
that can be obtained through general oblivious transformation. In addition,
results from a proof-of-concept implementation show that our algorithm is also
practically efficient. Comment: 14 pages, to appear in Proceedings of CCCG 202
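For readers unfamiliar with the problem itself, the following sketch computes the median slope by brute force. It is neither Matoušek's randomized algorithm nor the oblivious variant described in the abstract: it takes quadratic time and its memory accesses are not data-independent. The function name is hypothetical, and skipping pairs with equal x-coordinates is an assumption made here for simplicity.

```python
from itertools import combinations
from statistics import median_low
from typing import Sequence, Tuple

Point = Tuple[float, float]


def median_slope_bruteforce(points: Sequence[Point]) -> float:
    """Return the (lower) median of the slopes of all lines through two points.

    Pairs with equal x-coordinates are skipped (an assumption of this sketch).
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x1 != x2
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x-coordinates")
    return median_low(slopes)


# Three collinear points and one outlier: the median slope ignores the outlier.
print(median_slope_bruteforce([(0, 0), (1, 1), (2, 2), (3, 10)]))
```

The brute-force version is useful mainly as a reference against which a faster or oblivious implementation can be checked on small inputs.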
The Bane of Low-Dimensionality Clustering
In this paper, we give a conditional lower bound of on
running time for the classic k-median and k-means clustering objectives (where
n is the size of the input), even in low-dimensional Euclidean space of
dimension four, assuming the Exponential Time Hypothesis (ETH). We also
consider k-median (and k-means) with penalties where each point need not be
assigned to a center, in which case it must pay a penalty, and extend our lower
bound to at least three-dimensional Euclidean space.
This stands in stark contrast to many other geometric problems such as the
traveling salesman problem, or computing an independent set of unit spheres.
While these problems benefit from the so-called (limited) blessing of
dimensionality, as they can be solved in time or
in d dimensions, our work shows that widely-used clustering
objectives have a lower bound of , even in dimension four.
We complete the picture by considering the two-dimensional case: we show that
there is no algorithm that solves the penalized version in time less than
, and provide a matching upper bound of .
The main tool we use to establish these lower bounds is the placement of
points on the moment curve, which takes its inspiration from constructions of
point sets yielding Delaunay complexes of high complexity
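As a small illustration of the two ingredients named in this abstract, the sketch below generates points on the moment curve and evaluates the k-median objective, optionally with penalties. It does not reproduce the paper's reduction. The function names are hypothetical, and the use of a single uniform penalty for every unassigned point is an assumption of this sketch.

```python
import math
from typing import Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def moment_curve_point(t: float, d: int = 4) -> Vector:
    """Point (t, t^2, ..., t^d) on the moment curve in d dimensions."""
    return tuple(t ** i for i in range(1, d + 1))


def k_median_cost(
    points: Sequence[Vector],
    centers: Sequence[Vector],
    penalty: Optional[float] = None,
) -> float:
    """Sum over points of the distance to the nearest center.

    If `penalty` is given, a point may stay unassigned and pay `penalty`
    instead (uniform penalty: an assumption of this sketch).
    """
    total = 0.0
    for p in points:
        nearest = min(math.dist(p, c) for c in centers)
        total += nearest if penalty is None else min(nearest, penalty)
    return total


pts = [moment_curve_point(t) for t in range(1, 6)]
print(pts[1])                                   # the point for t = 2
print(k_median_cost(pts, [pts[0], pts[4]]))     # two centers, no penalties
```

For the k-means objective, the distance to the nearest center would be squared before summing.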
Privacy-preserving statistical analysis based on secure multi-party computation
The electronic version of the dissertation does not include the publications. In modern society, a digital record is created for a person immediately after birth. From that moment on, the person's behaviour is tracked and data are collected about different areas of their life. When you use a loyalty card in a shop, visit a doctor, file a tax return, or simply move around with a mobile phone in your pocket, companies and government agencies collect and store your sensitive data.
Sometimes we voluntarily permit this kind of surveillance in return for some benefit. For example, we may get a discount when we use a loyalty card. At other times we face a difficult decision: give up the ability to make mobile phone calls, or allow ourselves to be tracked through information transmitted via cell towers. Government agencies manage information about our health, education and income in order to treat us better, educate us and collect taxes from us. We hope that our data are used sensibly, but we also expect our privacy to be protected.
This work investigates how to perform statistical analysis in a way that guarantees the privacy of the individual. To achieve this goal, we use secure multi-party computation. This cryptographic method allows data to be analysed such that individual values can never be seen. Although secure multi-party computation is time-consuming, we show that it is fast enough to be used even on very large volumes of data.
We have made it possible to use the most popular statistical analysis methods in the context of secure multi-party computation. We introduce Rmind, a privacy-preserving statistical analysis tool that contains all the functions created in the course of this work. Rmind resembles the tools that statisticians are accustomed to. This lets them carry out studies without needing detailed knowledge of the underlying cryptographic protocols.
We use the methods described in the dissertation to prepare a statistical study that links two Estonian national databases. The aim of the study is to find out whether Estonian students who work during their university studies are less likely to graduate within the nominal time than their peers who focus on their studies.
In a modern society, from the moment a person is born, a digital record is created. From there on, the person's behaviour is constantly tracked and data are collected about the different aspects of his or her life. Whether one is swiping a customer loyalty card in a store, going to the doctor, doing taxes or simply moving around with a mobile phone in one's pocket, sensitive data are being gathered and stored by governments and companies.
Sometimes, we give our permission for this kind of surveillance in exchange for some benefit. For instance, we could get a discount using a customer loyalty card. Other times we have a difficult choice: either we cannot make phone calls or our movements are tracked based on cellular data. The government tracks information about our health, education and income to cure us, educate us and collect taxes. We hope that the data are used in a meaningful way; however, we also have an expectation of privacy.
This work focuses on how to perform statistical analyses in a way that preserves the privacy of the individual. To achieve this goal, we use secure multi-party computation. This cryptographic technique allows data to be analysed without seeing the individual values. Even though using secure multi-party computation is a time-consuming process, we show that it is feasible even for large-scale databases.
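The idea of analysing data without seeing individual values can be illustrated with additive secret sharing, one common building block of secure multi-party computation. The sketch below is a single-process simulation under stated assumptions and is not the protocol used in the thesis or in Rmind: the three-party setup, the 64-bit modulus, and all function names are hypothetical choices made for illustration, and it assumes parties that follow the protocol honestly.

```python
import secrets
from typing import List, Sequence

MODULUS = 2 ** 64  # assumed ring size for this sketch


def share(value: int, n_parties: int = 3) -> List[int]:
    """Split `value` into random additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: Sequence[int]) -> int:
    """Recombine shares; any proper subset of them looks uniformly random."""
    return sum(shares) % MODULUS


def secure_sum(private_values: Sequence[int], n_parties: int = 3) -> int:
    """Sum the inputs while each simulated party only ever holds shares."""
    totals = [0] * n_parties
    for v in private_values:
        for i, s in enumerate(share(v, n_parties)):
            totals[i] = (totals[i] + s) % MODULUS  # local addition per party
    return reconstruct(totals)  # only the aggregate is revealed


incomes = [1200, 950, 3100]
print(secure_sum(incomes))  # the total, without exposing any single income
```

Addition works locally on shares because the sharing is linear; operations such as multiplication or comparison need interaction between the parties, which is one reason secure multi-party computation is slower than computing on plain data.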
We have developed ways for using the most popular statistical analysis methods with secure multi-party computation. We introduce a privacy-preserving statistical analysis tool called Rmind that contains all of our resulting implementations. Rmind is similar to tools that statistical analysts are used to. This allows them to carry out studies on the data without having to know the details of the underlying cryptographic protocols.
The methods described in the thesis are used in practice to prepare for running a statistical study on large-scale real-life data to find out whether Estonian students who are working during university studies are less likely to graduate in nominal time …