Using unknowns to prevent discovery of association rules

Authors: Clifton C.; Saygin Y.; Verykios V.S.
Publication date: 01/01/2001

Data mining technology has given us new capabilities to identify correlations in large data sets. This introduces risks when the data is to be made public, but the correlations are private. We introduce a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other applications. The efficacy and complexity of this method are discussed. We also present an experiment showing an example of this methodology.

Bilkent University Institutional Repository
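
The idea in the abstract above, replacing selected values with unknowns so that a rule can no longer be confirmed, can be illustrated with a small sketch. This is not the authors' algorithm: the function names, the choice of which value to blank, and the sample data are all hypothetical. It assumes each transaction stores 1 (present), 0 (absent) or None (unknown) for every item, and that a rule is treated as hidden once the minimum support of its itemset drops below a threshold.

```python
def support_bounds(db, itemset):
    """Return (min_support, max_support) of itemset as fractions of the database.

    min_support counts only transactions where every item is definitely 1;
    max_support also counts transactions where some items are unknown (None).
    """
    definite = sum(all(t[i] == 1 for i in itemset) for t in db)
    possible = sum(all(t[i] in (1, None) for i in itemset) for t in db)
    return definite / len(db), possible / len(db)


def hide_itemset(db, itemset, threshold):
    """Blank one value per supporting transaction until min support < threshold."""
    victim = itemset[0]  # hypothetical choice: always blank the first item
    for t in db:
        if support_bounds(db, itemset)[0] < threshold:
            break
        if all(t[i] == 1 for i in itemset):
            t[victim] = None  # the value becomes an unknown, not a 0
    return db


# Made-up example data: four transactions over items "a" and "b".
db = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 1},
    {"a": 1, "b": 0},
    {"a": 0, "b": 1},
]
hide_itemset(db, ["a", "b"], threshold=0.5)
print(support_bounds(db, ["a", "b"]))
```

After the call, a miner can only say that the support lies somewhere between the two bounds, which is the kind of uncertainty the abstract alludes to.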

Criminal Adjudication, Error Correction, and Hindsight Blind Spots

Author: Griffin Lisa Kern
Publication venue: Duke University School of Law
Publication date: 01/01/2016

Concerns about hindsight in the law typically arise with regard to the bias that outcome knowledge can produce. But a more difficult problem than the clear view that hindsight appears to provide is the blind spot that it actually has. Because of the conventional wisdom about error review, there is a missed opportunity to ensure meaningful scrutiny. Beyond the confirmation biases that make convictions seem inevitable lies the question whether courts can see what they are meant to assess when they do look closely for error. Standards that require a retrospective showing of materiality, prejudice, or harm turn on what a judge imagines would have happened at trial under different circumstances. The interactive nature of the fact-finding process, however, means that the effect of error can rarely be assessed with confidence. Moreover, changing paradigms in criminal procedure scholarship make accuracy and error correction newly paramount. The empirical evidence of known innocents found guilty in the criminal justice system is mounting, and many of those wrongful convictions endured because errors were reviewed under hindsight standards. New insights about the cognitive psychology of decision-making, taken together with this heightened awareness of error, suggest that it is time to reevaluate some thresholds for reversal. The problem of hindsight blindness is particularly evident in the rules concerning the discovery of exculpatory evidence, the adequacy of defense counsel, and the harmfulness of erroneous rulings at trial. The standards applied in each of those contexts share a common flaw: a barrier between the mechanism for evaluation and the source of error. This essay concludes that reviewing courts should consider the trial that actually occurred rather than what “might have been” in a different proceeding and proposes some new vocabulary for weighing error.

bepress Legal Repository

Washington and Lee University School of Law

Duke Law Scholarship Repository

Learning Language from a Large (Unannotated) Corpus

Authors: Goertzel Ben; Vepstas Linas
Publication date: 14/01/2014

A novel approach to the fully automated, unsupervised extraction of dependency grammars and associated syntax-to-semantic-relationship mappings from large text corpora is described. The suggested approach builds on the authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well as on a number of prior papers and approaches from the statistical language learning literature. If successful, this approach would enable the mining of all the information needed to power a natural language comprehension and generation system, directly from a large, unannotated corpus. Comment: 29 pages, 5 figures, research proposal

arXiv.org e-Print Archive

CiteSeerX

Double Whammy - How ICT Projects are Fooled by Randomness and Screwed by Political Intent

Authors: Budzier Alexander; Flyvbjerg Bent
Publication date: 01/01/2011

The cost-benefit analysis formulates the holy trinity of objectives of project management: cost, schedule, and benefits. As our previous research has shown, ICT projects deviate from their initial cost estimate by more than 10% in 8 out of 10 cases. Academic research has argued that Optimism Bias and Black Swan Blindness cause forecasts to fall short of actual costs. Firstly, optimism bias has been linked to effects of deception and delusion, which are caused by taking the inside view and ignoring distributional information when making decisions. Secondly, we argued before that Black Swan Blindness makes decision-makers ignore outlying events even if decisions and judgements are based on the outside view. Using a sample of 1,471 ICT projects with a total value of USD 241 billion, we answer the question: Can we show the different effects of Normal Performance, Delusion, and Deception? We calculated the cumulative distribution function (CDF) of (actual-forecast)/forecast. Our results show that the CDF changes at two tipping points. The first one transforms an exponential function into a Gaussian bell curve. The second tipping point transforms the bell curve into a power law distribution with the power of 2. We argue that these results show that project performance up to the first tipping point is politically motivated and project performance above the second tipping point indicates that project managers and decision-makers are fooled by random outliers, because they are blind to thick tails. We then show that Black Swan ICT projects are a significant source of uncertainty to an organisation and that management needs to be aware of

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive
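
The quantity named in the abstract above, the CDF of (actual-forecast)/forecast, can be computed empirically in a few lines. The sketch below is only an illustration: the function names and the four sample projects are made up and do not come from the study's data set of 1,471 projects.

```python
def relative_overrun(forecast, actual):
    """Cost deviation relative to the forecast: (actual - forecast) / forecast."""
    return (actual - forecast) / forecast


def empirical_cdf(values):
    """Return (x, F(x)) pairs in ascending x, where F(x) is the share of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for rank, x in enumerate(ordered, start=1):
        cdf[x] = rank / n  # repeated values overwrite, so ties keep the highest rank
    return list(cdf.items())


# Made-up (forecast, actual) cost pairs, for illustration only.
projects = [(100, 120), (100, 100), (200, 300), (50, 45)]
overruns = [relative_overrun(f, a) for f, a in projects]
for x, share in empirical_cdf(overruns):
    print(f"overrun <= {x:+.0%}: {share:.0%} of projects")
```

Fitting the exponential, Gaussian and power-law segments that the abstract mentions would be a separate step on top of this curve and is not attempted here.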

A Framework for High-Accuracy Privacy-Preserving Mining

Authors: Agrawal Shipra; Haritsa Jayant R.
Publication date: 01/01/2004

To preserve client privacy in the data mining process, a variety of techniques based on random perturbation of data records have been proposed recently. In this paper, we present a generalized matrix-theoretic model of random perturbation, which facilitates a systematic approach to the design of perturbation mechanisms for privacy-preserving mining. Specifically, we demonstrate that (a) the prior techniques differ only in their settings for the model parameters, and (b) through appropriate choice of parameter settings, we can derive new perturbation techniques that provide highly accurate mining results even under strict privacy guarantees. We also propose a novel perturbation mechanism wherein the model parameters are themselves characterized as random variables, and demonstrate that this feature provides significant improvements in privacy at a very marginal cost in accuracy. While our model is valid for random-perturbation-based privacy-preserving mining in general, we specifically evaluate its utility here with regard to frequent-itemset mining on a variety of real datasets. The experimental results indicate that our mechanisms incur substantially lower identity and support errors as compared to the prior techniques.

arXiv.org e-Print Archive

CiteSeerX

Open Access Repository of IISc Research Publications
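
As a rough illustration of what a matrix view of random perturbation looks like, the sketch below uses the simplest case: one binary attribute that is kept with probability p and flipped otherwise (classic randomized response). This is an assumed special case chosen for brevity, not the mechanism proposed in the paper, and all function names are hypothetical. The matrix entry A[v][u] is the probability that a true value u is reported as v, so the observed distribution is A times the true one, and the miner estimates the true distribution by inverting A.

```python
import random


def perturbation_matrix(p):
    """A[v][u] = probability that true value u is reported as v (binary attribute)."""
    return [[p, 1 - p], [1 - p, p]]


def perturb(value, matrix, rng):
    """Client side: report 0 or 1, drawn from column `value` of the matrix."""
    return 0 if rng.random() < matrix[0][value] else 1


def reconstruct(observed, matrix):
    """Miner side: solve matrix @ x = observed (2x2 case) for the true distribution x."""
    (a, b), (c, d) = matrix
    det = a * d - b * c  # zero when p == 0.5, where reconstruction is impossible
    y0, y1 = observed
    return [(d * y0 - b * y1) / det, (a * y1 - c * y0) / det]


rng = random.Random(0)
A = perturbation_matrix(0.8)
true_values = [1] * 7000 + [0] * 3000  # made-up data: 70% ones
reported = [perturb(v, A, rng) for v in true_values]
share_of_ones = sum(reported) / len(reported)
print(reconstruct([1 - share_of_ones, share_of_ones], A))  # roughly [0.3, 0.7]
```

The closer p is to 0.5, the more privacy each client gets and the noisier the reconstruction becomes, which is the accuracy and privacy trade-off the abstract refers to.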

Efficient Privacy Preserving Distributed Clustering Based on Secret Sharing

Author: Savaş Erkay
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2007

In this paper, we propose a privacy-preserving distributed clustering protocol for horizontally partitioned data based on a very efficient homomorphic additive secret sharing scheme. The model we use for the protocol is novel in the sense that it utilizes two non-colluding third parties. We provide a brief security analysis of our protocol from an information-theoretic point of view, which is a stronger security model. We present a communication and computation complexity analysis of our protocol along with another protocol previously proposed for the same problem. We also include experimental results for the computation and communication overhead of these two protocols. Our protocol not only outperforms the others in execution time and communication overhead on data holders, but also uses a more efficient model for many data mining applications.

Sabanci University Research Database
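
The building block named in the abstract above, additive secret sharing with two non-colluding third parties, can be sketched briefly. This is not the authors' clustering protocol: it shows only how data holders could split private numbers into two random-looking shares so that the third parties can add them up without either one seeing an individual value. The modulus and all function names are assumptions made for this example.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus; it only needs to exceed the true sum


def share(value, n_shares=2):
    """Split value into n_shares numbers that sum to value modulo MODULUS."""
    parts = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    parts.append((value - sum(parts)) % MODULUS)
    return parts


def secure_sum(private_values):
    """Simulate data holders sending one share each to third parties A and B."""
    shares = [share(v) for v in private_values]
    total_a = sum(s[0] for s in shares) % MODULUS  # all that party A learns
    total_b = sum(s[1] for s in shares) % MODULUS  # all that party B learns
    # Combining the two partial totals reveals the sum, not the inputs.
    return (total_a + total_b) % MODULUS


print(secure_sum([3, 10, 29]))  # 42
```

Because addition of shares corresponds to addition of the hidden values, sums of this kind can be computed without disclosing any single holder's data, provided the two third parties do not collude.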

Electronic Discovery/Disclosure: From Litigation to International Commercial Arbitration

Author: Devey C.
Publication venue: Sweet & Maxwell
Publication date: 01/01/2008

City Research Online