
    Parallel Algorithm for Frequent Itemset Mining on Intel Many-core Systems

    Frequent itemset mining leads to the discovery of associations and correlations among items in large transactional databases. Apriori is a classical frequent itemset mining algorithm that employs iterative passes over the database, combined with generation of candidate itemsets based on the frequent itemsets found at the previous iteration and pruning of clearly infrequent itemsets. The Dynamic Itemset Counting (DIC) algorithm is a variation of Apriori that tries to reduce the number of passes made over a transactional database while keeping the number of itemsets counted in a pass relatively low. In this paper, we address the problem of accelerating DIC on the Intel Xeon Phi many-core system for the case when the transactional database fits in main memory. Intel Xeon Phi provides a large number of small compute cores with vector processing units. The paper presents a parallel implementation of DIC based on OpenMP technology and thread-level parallelism. We exploit a bit-based internal layout for transactions and itemsets. This technique reduces the memory space for storing the transactional database, simplifies the support count via logical bitwise operations, and allows for vectorization of this step. Experimental evaluation on the Intel Xeon CPU and Intel Xeon Phi coprocessor platforms with large synthetic and real databases showed good performance and scalability of the proposed algorithm. (Comment: Accepted for publication in Journal of Computing and Information Technology, http://cit.fer.hr)
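
    As an illustration of the bit-based layout, the sketch below (Python rather than the paper's C/OpenMP implementation; all names are illustrative) maps each item to a bit vector over the transactions, so the support of an itemset reduces to the popcount of the bitwise AND of its items' vectors.

        # Minimal sketch of bit-based support counting, assuming an in-memory
        # transactional database.  The paper's implementation adds thread-level
        # parallelism (OpenMP) and vectorization on top of this idea.

        def build_bit_columns(transactions, items):
            """Map each item to an integer bit vector over transactions:
            bit t is set when transaction t contains the item."""
            columns = {item: 0 for item in items}
            for t, transaction in enumerate(transactions):
                for item in transaction:
                    if item in columns:
                        columns[item] |= 1 << t
            return columns

        def support(itemset, columns):
            """Support of an itemset = popcount of the AND of its item columns."""
            items = iter(itemset)
            acc = columns[next(items)]      # assumes a non-empty itemset
            for item in items:
                acc &= columns[item]
            return bin(acc).count("1")

        # Example: how many transactions contain both 'a' and 'b'?
        transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
        cols = build_bit_columns(transactions, {"a", "b", "c"})
        print(support({"a", "b"}, cols))    # -> 2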

    Improving mining efficiency: A new scheme for extracting association rules

    In the age of information technology, the amount of accumulated data is tremendous. Extracting association rules from this data is one of the important tasks in data mining. Most existing association rule mining algorithms assume that the data set can fit in memory. In this paper, we propose a practical and effective scheme to mine association rules from frequent patterns, called the Prefixfoldtree scheme (PFT scheme). The original dataset is divided into folds, and then from each fold the frequent patterns are mined using the tree projection approach. These frequent patterns are combined into one set, and finally interestingness constraints are used to extract the association rules. Experiments are conducted to illustrate the efficiency of our scheme.
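
    A rough sketch of the fold-based workflow follows (Python; the per-fold tree projection miner of the PFT scheme is replaced here by a naive stand-in, and all names and thresholds are illustrative rather than taken from the paper).

        from collections import Counter
        from itertools import combinations

        def mine_fold(transactions, min_support):
            """Naive per-fold frequent-itemset miner (a stand-in for the
            tree projection step used by the PFT scheme)."""
            counts = Counter()
            for t in transactions:
                items = sorted(t)
                for k in range(1, len(items) + 1):
                    for itemset in combinations(items, k):
                        counts[itemset] += 1
            return {s for s, c in counts.items() if c >= min_support}

        def pft_like_scheme(transactions, n_folds, min_support, min_confidence):
            """Divide the data into folds, mine each fold, combine the frequent
            patterns, then keep rules meeting the interestingness constraint."""
            folds = [transactions[i::n_folds] for i in range(n_folds)]
            patterns = set()
            for fold in folds:
                patterns |= mine_fold(fold, min_support)

            def support(itemset):
                return sum(1 for t in transactions if set(itemset) <= set(t))

            rules = []
            for itemset in patterns:
                if len(itemset) < 2:
                    continue
                for k in range(1, len(itemset)):
                    for antecedent in combinations(itemset, k):
                        consequent = tuple(i for i in itemset if i not in antecedent)
                        confidence = support(itemset) / support(antecedent)
                        if confidence >= min_confidence:
                            rules.append((antecedent, consequent, confidence))
            return rules

        # Example usage with a toy dataset.
        data = [{"bread", "milk"}, {"bread", "milk", "eggs"},
                {"milk", "eggs"}, {"bread", "milk"}]
        print(pft_like_scheme(data, n_folds=2, min_support=2, min_confidence=0.8))
        # -> [(('bread',), ('milk',), 1.0)]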

    How Does Science Come to Speak in the Courts? Citations, Intertexts, Expert Witnesses, Consequential Facts, and Reasoning

    Citations, in their highly conventionalized forms, visibly indicate each text's explicit use of the prior literature that embodies the knowledge and contentions of its field. This relation to prior texts has been called intertextuality in literary and literacy studies. Here, Bazerman discusses the citation practices and intertextuality in science and the law in theoretical and historical perspective, and considers the intersection of science and law by identifying the judicial rules that limit and shape the role of scientific literature in court proceedings. He emphasizes that, from the historical and theoretical analysis, it is clear that, in the US, judicial reasoning is an intertextually tight and self-referring system that pays only limited attention to documents outside the laws, precedents, and judicial rules. The window for scientific literature to enter the courts is narrow, focused, and highly filtered. It serves as a warrant for the expert witnesses' expertise, which in turn makes their opinion admissible in a way not available to ordinary witnesses.

    What is Probable Cause, and Why Should We Care?: The Costs, Benefits, and Meaning of Individualized Suspicion

    Taslitz defines probable cause as having four components: one quantitative, one qualitative, one temporal, and one moral. He focuses on the last of these components. Individualized suspicion, the US Supreme Court has suggested, is perhaps the most important of the four components of probable cause. That is a position with which he heartily agrees. The other three components each play only a supporting role. But individualized suspicion is the beating heart that gives probable cause its vitality.

    The Knowledge Level Approach To Intelligent Information System Design

    Traditional approaches to building intelligent information systems employ an ontology to define a representational structure for the data and information of interest within the target domain of the system. At runtime, the ontology provides a constrained template for the creation of the individual objects and relationships that together define the state of the system at a given point in time. The ontology also provides a vocabulary for expressing domain knowledge, typically in the form of rules (declarative knowledge) or methods (procedural knowledge). The system utilizes the encoded knowledge, often in conjunction with user input, to progress the state of the system towards the specific goals indicated by the users. While this approach has been very successful, it has some drawbacks. Regardless of the implementation paradigm, the knowledge is essentially buried in the code and therefore inaccessible to most domain experts. The knowledge also tends to be very domain specific and is not extensible at runtime. This paper describes a variation on the traditional approach that employs an explicit knowledge level within the ontology to mitigate the identified drawbacks.
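
    A toy contrast may make the distinction concrete (Python; the rule, attributes, and data are hypothetical and not drawn from the paper): in the traditional style the declarative knowledge is buried in code, whereas a knowledge-level representation keeps it as inspectable data that can be extended at runtime.

        # Traditional style: the domain knowledge is hard-coded in the program,
        # so it is invisible to domain experts and cannot be extended at runtime.
        def assess_patient_traditional(patient):
            if patient["temperature"] > 38.0 and patient["cough"]:
                return "possible infection"
            return "no finding"

        # Knowledge-level style: the same rule lives alongside the ontology as
        # data, so it can be listed, edited, or added to while the system runs.
        knowledge_level = [
            {
                "name": "possible-infection",
                "conditions": [("temperature", ">", 38.0), ("cough", "==", True)],
                "conclusion": "possible infection",
            },
        ]

        OPERATORS = {">": lambda a, b: a > b, "==": lambda a, b: a == b}

        def assess_patient(patient, rules):
            for rule in rules:
                if all(OPERATORS[op](patient[attr], value)
                       for attr, op, value in rule["conditions"]):
                    return rule["conclusion"]
            return "no finding"

        print(assess_patient({"temperature": 38.5, "cough": True}, knowledge_level))
        # -> possible infection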

    Doctor of Philosophy

    With the growing national dissemination of the electronic health record (EHR), there are expectations that the public will benefit from biomedical research and discovery enabled by electronic health data. Clinical data are needed for many diseases and conditions to meet the demands of rapidly advancing genomic and proteomic research. Many biomedical research advancements require rapid access to clinical data as well as broad population coverage. A fundamental issue in the secondary use of clinical data for scientific research is the identification of study cohorts of individuals with a disease or medical condition of interest. The problem addressed in this work is the need for generalized, efficient methods to identify cohorts in the EHR for use in biomedical research. To approach this problem, an associative classification framework was designed with the goal of accurate and rapid identification of cases for biomedical research: (1) a set of exemplars for a given medical condition is presented to the framework, (2) a predictive rule set comprised of EHR attributes is generated by the framework, and (3) the rule set is applied to the EHR to identify additional patients that may have the specified condition. Based on this functionality, the approach was termed the 'cohort amplification' framework. The development and evaluation of the cohort amplification framework are the subject of this dissertation. An overview of the framework design is presented. Improvements to some standard associative classification methods are described and validated. A qualitative evaluation of predictive rules to identify diabetes cases and a study of the accuracy of identification of asthma cases in the EHR using framework-generated prediction rules are reported. The framework demonstrated accurate and reliable rules to identify diabetes and asthma cases in the EHR and contributed to methods for identification of biomedical research cohorts.
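
    The three-step workflow can be sketched as follows (Python; the attributes, thresholds, and rule representation are purely illustrative, and the dissertation's actual associative classification methods are only stubbed out).

        from collections import Counter
        from itertools import combinations

        def learn_rules(exemplars, min_support=2, max_len=2):
            """Step 2: derive a predictive rule set from exemplar patients.
            Each rule is a set of (attribute, value) pairs co-occurring in at
            least `min_support` exemplars (a stand-in for the associative
            classification methods described in the dissertation)."""
            counts = Counter()
            for patient in exemplars:
                items = sorted(patient.items())
                for k in range(1, max_len + 1):
                    for combo in combinations(items, k):
                        counts[frozenset(combo)] += 1
            return [rule for rule, c in counts.items() if c >= min_support]

        def amplify_cohort(ehr, rules):
            """Step 3: apply the rule set to the EHR to flag additional
            patients who may have the condition of interest."""
            flagged = []
            for patient_id, record in ehr.items():
                if any(rule <= set(record.items()) for rule in rules):
                    flagged.append(patient_id)
            return flagged

        # Step 1: exemplar patients for a hypothetical condition.
        exemplars = [
            {"dx_code": "E11", "med": "metformin"},
            {"dx_code": "E11", "med": "metformin"},
        ]
        ehr = {
            "p1": {"dx_code": "E11", "med": "metformin", "age": 54},
            "p2": {"dx_code": "J45", "med": "albuterol", "age": 30},
        }
        rules = learn_rules(exemplars)
        print(amplify_cohort(ehr, rules))  # -> ['p1']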

    Asbestos Lessons: The Consequences of Asbestos Litigation

    Abstract not available.

    Unlocking the “Virtual Cage” of Wildlife Surveillance

    The electronic surveillance of wildlife has grown more extensive than ever. For instance, thousands of wolves wear collars transmitting signals to wildlife biologists. Some collars inject wolves with tranquilizers that allow for their immediate capture if they stray outside of the boundaries set by anthropocentric management policies. Hunters have intercepted the signals from surveillance collars and have used this information to track and slaughter the animals. While the ostensible reason for the surveillance programs is to facilitate the peaceful coexistence of humanity and wildlife, the reality is less benign—an outdoor version of Bentham’s Panopticon. This Article reconceptualizes the enterprise of wildlife surveillance. Without suggesting that animals have standing to assert constitutional rights, the Article posits a public interest in protecting the privacy of wildlife. The very notion of wildness implies privacy. The law already protects the bodily integrity of animals to some degree, and a protected zone of privacy is penumbral to this core protection, much the same way that human privacy emanates from narrower guarantees against government intrusion. Policy implications follow that are akin to the rules under the Fourth Amendment limiting the government’s encroachment on human privacy. Just as the police cannot install a wiretap without demonstrating a particularized investigative need for which all less intrusive methods would be insufficient, so too should surveillance of wildlife necessitate a specific showing of urgency. A detached, neutral authority should review all applications for electronic monitoring of wildlife. Violations of the rules should result in substantial sanctions. The Article concludes by considering—and refuting—foreseeable objections to heightened requirements for the surveillance of wildlife.