31,811 research outputs found

    New probabilistic interest measures for association rules

    Full text link
    Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules. However, these measures fail to take the probabilistic properties of the mined data into account. In this paper, we start with presenting a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left hand side of rules and that lift performs poorly to filter random noise in transaction data. Based on the probabilistic framework we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic

    Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation

    Full text link
    The growing expanse of e-commerce and the widespread availability of online databases raise many fears regarding loss of privacy and many statistical challenges. Even with encryption and other nominal forms of protection for individual databases, we still need to protect against the violation of privacy through linkages across multiple databases. These issues parallel those that have arisen and received some attention in the context of homeland security. Following the events of September 11, 2001, there has been heightened attention in the United States and elsewhere to the use of multiple government and private databases for the identification of possible perpetrators of future attacks, as well as an unprecedented expansion of federal government data mining activities, many involving databases containing personal information. We present an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included. We also explore their link to the related literature on privacy-preserving data mining. In particular, we focus on the matching problem across databases and the concept of ``selective revelation'' and their confidentiality implications.Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    ATM automation: guidance on human technology integration

    Get PDF
    © Civil Aviation Authority 2016Human interaction with technology and automation is a key area of interest to industry and safety regulators alike. In February 2014, a joint CAA/industry workshop considered perspectives on present and future implementation of advanced automated systems. The conclusion was that whilst no additional regulation was necessary, guidance material for industry and regulators was required. Development of this guidance document was completed in 2015 by a working group consisting of CAA, UK industry, academia and industry associations (see Appendix B). This enabled a collaborative approach to be taken, and for regulatory, industry, and workforce perspectives to be collectively considered and addressed. The processes used in developing this guidance included: review of the themes identified from the February 2014 CAA/industry workshop1; review of academic papers, textbooks on automation, incidents and accidents involving automation; identification of key safety issues associated with automated systems; analysis of current and emerging ATM regulatory requirements and guidance material; presentation of emerging findings for critical review at UK and European aviation safety conferences. In December 2015, a workshop of senior management from project partner organisations reviewed the findings and proposals. EASA were briefed on the project before its commencement, and Eurocontrol contributed through membership of the Working Group.Final Published versio

    EBSLG Annual General Conference, 18. - 21.05.2010, Cologne. Selected papers

    Get PDF
    Am 18.-21. Mai 2010 fand in der Universitäts- und Stadtbibliothek (USB) Köln die „Annual General Conference“ der European Business Schools Librarians Group (EBSLG) statt. Die EBSLG ist eine relativ kleine, aber exklusive Gruppe von Bibliotheksdirektorinnen und –direktoren bzw. Bibliothekarinnen und Bibliothekaren in Leitungspositionen aus den Bibliotheken führender Business Schools. Im Mittelpunkt der Tagung standen zwei Themenschwerpunkte: Der erste Themenkreis beschäftigte sich mit Bibliotheksportalen und bibliothekarischen Suchmaschinen. Der zweite Themenschwerpunkt Fragen der Bibliotheksorganisation wie die Aufbauorganisation einer Bibliothek, Outsourcing und Relationship Management. Der vorliegende Tagungsband enthält ausgewählte Tagungsbeiträge

    Post-processing of association rules.

    Get PDF
    In this paper, we situate and motivate the need for a post-processing phase to the association rule mining algorithm when plugged into the knowledge discovery in databases process. Major research effort has already been devoted to optimising the initially proposed mining algorithms. When it comes to effectively extrapolating the most interesting knowledge nuggets from the standard output of these algorithms, one is faced with an extreme challenge, since it is not uncommon to be confronted with a vast amount of association rules after running the algorithms. The sheer multitude of generated rules often clouds the perception of the interpreters. Rightful assessment of the usefulness of the generated output introduces the need to effectively deal with different forms of data redundancy and data being plainly uninteresting. In order to do so, we will give a tentative overview of some of the main post-processing tasks, taking into account the efforts that have already been reported in the literature.

    Interactive analysis of high-dimensional association structures with graphical models

    Get PDF
    Graphical chain models are a capable tool for analyzing multivariate data. However, their practical use may still be cumbersome in some respect since fitting the model requires the application of an intensive selection strategy based on the calculation of an enormous number of different regressions. In this paper, we present a computer system especially designed for the calculation of graphical chain models which is not only planned to automatically carry out the model search but also to visualize the corresponding graph at each stage of the model fit on request by the user. It additionally allows to modify the graph and the model fit interactively
    • …
    corecore