
    Allocation of Advertisement Extensions & Formats

    This document describes a process for optimizing which ad formats are allocated to a budget-constrained advertiser. The optimization selectively chooses formats based on the advertiser's budget. In particular, the document describes allocating formats to budget-constrained advertisers in advertising auctions to create value for both the advertisers and an advertiser manager.

    A Polly Cracker system based on Satisfiability

    This paper presents a public-key cryptosystem based on a subclass of the well-known satisfiability problem from propositional logic, namely the doubly-balanced 3-SAT problem. We first describe the construction of an instance of our system starting from such a 3-SAT formula. Then we discuss security issues, on the one hand by exploring the best methods to date for solving this particular problem, and on the other hand by studying algorithms for solving (systems of multivariate) polynomial equations in this particular setting. The result of our investigations is that both types of methods fail to break our instances. We end the paper with some complexity considerations and implementation results.

    De-identifying a public use microdata file from the Canadian national discharge abstract database

    Background: The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serving as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records.

    Methods: Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set at between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy.

    Results: Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression.

    Conclusions: The strategies we used to maximize data utility and minimize information loss can result in a PUMF that would be useful for the specific purposes noted earlier. However, to create a more detailed file with less information loss suitable for more complex health services research, the risk would need to be mitigated by requiring the data recipient to commit to a data sharing agreement.
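The threshold idea in the Methods paragraph can be illustrated with a minimal sketch. It assumes that re-identification risk for a record is estimated as 1 divided by the number of records sharing its quasi-identifier values, which is a common convention but is not stated in the abstract. The function names, the field names `age_group` and `region`, and the toy data are all hypothetical, and this is not the suppression algorithm the authors developed.

```python
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Per-record risk, assumed here to be 1 / equivalence-class size."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return [1.0 / class_sizes[k] for k in keys]

def flag_for_suppression(records, quasi_identifiers, threshold=0.05):
    """Mark records whose assumed risk exceeds the acceptable threshold."""
    risks = reidentification_risk(records, quasi_identifiers)
    return [risk > threshold for risk in risks]

# Hypothetical toy data: one large class and one very small class.
records = (
    [{"age_group": "30-39", "region": "ON"}] * 25
    + [{"age_group": "90+", "region": "NU"}] * 2
)
flags = flag_for_suppression(records, ["age_group", "region"])
```

In this toy data the 25 records in the large class have risk 0.04 and pass, while the two records in the small class have risk 0.5 and are flagged, which mirrors the abstract's observation that small regions and rarely admitted age groups see the most suppression.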

    Processing Queries for First-Few Answers

    Special support for quickly finding the first-few answers of a query is already appearing in commercial database systems. This support is useful in active databases, when dealing with potentially unmanageable query results, and as a declarative alternative to navigational techniques. In this paper, we discuss query processing techniques for first-answer queries. We provide a method for predicting the cost of a first-answer query plan under an execution model that attempts to reduce wasted effort in join pipelining. We define new statistics necessary for accurate cost prediction, and discuss techniques for obtaining the statistics through traditional statistical measures (e.g. selectivity) and semantic data properties commonly specified through modern OODB and relational schemas. The proposed techniques also apply to all-answer query processing when optimizing for fast delivery of the initial query results.
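To make the notion of a first-few-answers query concrete, here is a minimal sketch of a pipelined join whose consumer stops after k results. It does not reproduce the paper's cost model or statistics; the helper names (`scan`, `pipelined_join`, `first_k`) and the `orders`/`customers` tables are hypothetical, and the inner table is assumed small enough to index in memory.

```python
from itertools import islice

scanned = []  # records which outer rows were actually read

def scan(rows):
    """Lazily scan a table, logging each row as it is read."""
    for row in rows:
        scanned.append(row)
        yield row

def pipelined_join(outer, inner, key_outer, key_inner):
    """Index the inner table once, then yield matches one outer row at a time."""
    index = {}
    for row in inner:
        index.setdefault(row[key_inner], []).append(row)
    for o in outer:
        for i in index.get(o[key_outer], []):
            yield (o, i)

def first_k(plan, k):
    """Pull only the first k answers from a lazy query plan."""
    return list(islice(plan, k))

orders = [{"id": n, "cust": n % 3} for n in range(1000)]
customers = [{"cust": c, "name": f"c{c}"} for c in range(3)]
answers = first_k(pipelined_join(scan(orders), customers, "cust", "cust"), 5)
```

Because every order matches exactly one customer in this toy data, producing five answers reads only five of the 1000 outer rows. That saved work is the kind of effort a first-answer cost model has to account for.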

    Introducing Undergraduate Students to Science

    Understanding the scientific method fosters the development of critical thinking and logical analysis of information. Additionally, proposing and testing a hypothesis is applicable not only to science, but also to ordinary facts of daily life. Knowing the way science is done and how its results are published is useful for all citizens and mandatory for science students. A 60-hour course was created to offer undergraduate students a framework in which to learn the procedures of scientific production and publication. The course's main focus was biochemistry, and it comprised two modules. Module I dealt with scientific articles, and Module II with research project writing. Module I covered the topics: 1) the difference between scientific knowledge and common sense, 2) different conceptions of science, 3) scientific methodology, 4) scientific publishing categories, 5) logical principles, 6) deductive and inductive approaches, and 7) critical reading of scientific articles. Module II dealt with 1) selection of an experimental problem for investigation, 2) literature review, 3) materials and methods, 4) project writing and presentation, 5) funding agencies, and 6) critical analysis of experimental results. The course adopted a collaborative learning strategy, and each topic was studied through activities performed by the students. Qualitative and quantitative course evaluations with Likert questionnaires were carried out at each stage, and the results showed the students' high approval of the course. The staff responsible for course planning and development also evaluated it positively. The Biochemistry Department of the Chemistry Institute of the University of Sao Paulo has offered the course four times.