Bagging ensemble selection

Authors: Pfahringer, Bernhard; Sun, Quan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Ensemble selection has recently appeared as a popular ensemble learning method, not only because its implementation is fairly straightforward, but also due to its excellent predictive performance on practical problems. The method has been highlighted in winning solutions of many data mining competitions, such as the Netflix competition, the KDD Cup 2009 and 2010, the UCSD FICO contest 2010, and a number of data mining competitions on the Kaggle platform. In this paper we present a novel variant: bagging ensemble selection. Three variations of the proposed algorithm are compared to the original ensemble selection algorithm and other ensemble algorithms. Experiments with ten real-world problems from diverse domains demonstrate the benefit of the bagging ensemble selection algorithm.
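The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the model library is given as precomputed probabilities on a held-out "hillclimb" set, uses greedy forward selection with replacement scored by accuracy, and forms bags by bootstrapping rows of that hillclimb set. The paper's three variants may differ on every one of these points; the function names (`ensemble_selection`, `bagged_ensemble_selection`) are hypothetical.

```python
import random


def accuracy(avg_probs, labels):
    """Fraction of examples whose thresholded average probability matches the label."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(avg_probs, labels))
    return hits / len(labels)


def ensemble_selection(library, labels, rounds=10):
    """Greedy forward selection with replacement (illustrative sketch).

    library: dict mapping model name -> list of predicted probabilities
             on the hillclimb set (assumed to be precomputed).
    Returns the list of selected model names; repeats are allowed.
    """
    selected = []
    sums = [0.0] * len(labels)  # running sum of the selected models' predictions
    for _ in range(rounds):
        best_name, best_score = None, -1.0
        for name in sorted(library):  # sorted for deterministic tie-breaking
            k = len(selected) + 1
            avg = [(s + p) / k for s, p in zip(sums, library[name])]
            score = accuracy(avg, labels)
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        sums = [s + p for s, p in zip(sums, library[best_name])]
    return selected


def bagged_ensemble_selection(library, labels, bags=5, rounds=10, seed=0):
    """Run ensemble selection on bootstrap samples and pool the selections.

    Assumption: each bag resamples rows of the hillclimb set with replacement.
    """
    rng = random.Random(seed)
    n = len(labels)
    combined = []
    for _ in range(bags):
        idx = [rng.randrange(n) for _ in range(n)]
        sub_library = {name: [preds[i] for i in idx] for name, preds in library.items()}
        sub_labels = [labels[i] for i in idx]
        combined.extend(ensemble_selection(sub_library, sub_labels, rounds))
    return combined
```

The pooled list can be read as a weighted ensemble: a model that is selected more often across bags receives a proportionally larger weight when predictions are averaged.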

Research Commons@Waikato

Data-mining chess databases

Authors: Bleicher, Eiko; Haworth, Guy McCrossan; van der Heijden, Harold M J F
Publication venue: The International Computer Games Association
Publication date: 31/12/2010

This is a report on the data-mining of two chess databases, the objective being to compare their sub-7-man content with perfect play as documented in Nalimov endgame tables. Van der Heijden’s ENDGAME STUDY DATABASE IV is a definitive collection of 76,132 studies in which White should have an essentially unique route to the stipulated goal. Chessbase’s BIG DATABASE 2010 holds some 4.5 million games. Insight gained into both database content and data-mining has led to some delightful surprises and created a further agenda.

Central Archive at the University of Reading

Rural demographic change in the new century: slower growth, increased diversity

Author: Johnson Kenneth M.
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 21/02/2012

This brief examines rural demographic trends in the first decade of the twenty-first century using newly available data from the 2010 Census. The rural population grew by just 2.2 million between 2000 and 2010, a gain barely half as great as that during the 1990s. Population growth was particularly slow in farming and mining counties and sharply reduced in rural manufacturing counties. Rural population gains were largest in high-amenity counties and just beyond the metropolitan fringe. Diversity accelerated in rural America, with racial and ethnic minorities accounting for 83 percent of rural population growth between 2000 and 2010.

UNH Scholars' Repository

Semi-Trusted Mixer Based Privacy Preserving Distributed Data Mining for Resource Constrained Devices

Authors: Kaosar, Md. Golam; Yi, Xun
Publication date: 01/01/2010

In this paper, a homomorphic privacy-preserving association rule mining algorithm is proposed that can be deployed on resource-constrained devices (RCD). Privacy-preserving exchange of itemset counts among distributed mining sites is a vital part of the association rule mining process. Existing cryptography-based privacy-preserving solutions consume a lot of computation because of the complex mathematical equations involved. Privacy solutions that require less computation are therefore necessary to deploy mining applications on RCD. In this algorithm, a semi-trusted mixer is used to unify the counts of itemsets encrypted by all mining sites without revealing individual values. The proposed algorithm is built on a well-known, communication-efficient association rule mining algorithm named count distribution (CD). Security proofs, along with performance analysis and comparison, show the acceptability and effectiveness of the proposed algorithm. The efficient and straightforward privacy model and the satisfactory performance of the protocol make it a promising initiative for deploying data mining applications on RCD.

Comment: IEEE Publication format, International Journal of Computer Science and Information Security, IJCSIS, Vol. 8 No. 1, April 2010, USA. ISSN 1947 5500, http://sites.google.com/site/ijcsis
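To illustrate how a mixer can unify encrypted itemset counts without seeing individual values, here is a toy sketch using an additively homomorphic scheme. The abstract says only that the algorithm is "homomorphic"; the choice of a Paillier-style cryptosystem, the tiny fixed primes, the key-holder role, and all function names are assumptions made for illustration. The parameters are far too small to be secure, and this is not the paper's protocol.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier-style key generation with tiny fixed primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, count, rng):
    """A mining site encrypts its local itemset count under the public modulus n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, a ciphertext is g^m * r^n mod n^2.
    return (pow(n + 1, count, n2) * pow(r, n, n2)) % n2


def mixer_combine(n, ciphertexts):
    """The mixer multiplies ciphertexts; the product encrypts the sum of the counts.

    The mixer never holds the private key, so it cannot read any single count.
    """
    n2 = n * n
    combined = 1
    for c in ciphertexts:
        combined = (combined * c) % n2
    return combined


def decrypt(n, private_key, ciphertext):
    """The key holder (hypothetical role) recovers only the aggregated count."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n) * mu % n


# Usage: three sites report local support counts for the same itemset.
rng = random.Random(42)
n, private_key = keygen()
site_counts = [120, 75, 310]
ciphertexts = [encrypt(n, count, rng) for count in site_counts]
global_count = decrypt(n, private_key, mixer_combine(n, ciphertexts))
```

The aggregate must stay below the modulus `n` for the decrypted value to equal the true sum, which is why a real deployment would use a much larger modulus.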

arXiv.org e-Print Archive

Research Repository

Victoria University Eprints Repository

FP-tree and COFI Based Approach for Mining of Multiple Level Association Rules in Large Databases

Authors: Kumar, Parveen; Pardasani, K. R.; Shrivastava, Virendra Kumar
Publication venue: Research Publishing Services
Publication date: 01/01/2010

In recent years, the discovery of association rules among itemsets in a large database has been described as an important database-mining problem. The problem has received considerable research attention, and several algorithms for mining frequent itemsets have been developed. Many algorithms have been proposed to discover rules at a single concept level. However, mining association rules at multiple concept levels may lead to the discovery of more specific and concrete knowledge from data, and such multiple-level rules are useful in many applications. In most studies of multiple-level association rule mining, the database is scanned repeatedly, which reduces the efficiency of the mining process. In this research paper, a new method for discovering multilevel association rules is proposed. It is based on the FP-tree structure and uses a co-occurrence frequent item tree to find frequent items in a multilevel concept hierarchy.

Comment: Pages IEEE format, International Journal of Computer Science and Information Security, IJCSIS, Vol. 7 No. 2, February 2010, USA. ISSN 1947 5500, http://sites.google.com/site/ijcsis
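As a rough illustration of what "frequent items in a multilevel concept hierarchy" means, the sketch below counts support for each item and for each of its ancestors in a single pass over the transactions, with a separate minimum support per level. It deliberately uses plain counting rather than the FP-tree and COFI structures the abstract names; the hierarchy encoding (a child-to-parent dictionary), the per-level thresholds, and the function name are all assumptions made for this example.

```python
from collections import Counter


def multilevel_frequent_items(transactions, parent, min_support):
    """Count support at every level of a concept hierarchy in one pass.

    transactions: iterable of sets of leaf items.
    parent:       dict mapping an item to its parent concept (roots are absent).
    min_support:  dict mapping level (1 = most general) to a minimum count.
    Returns {(level, concept): support} for concepts meeting their level's threshold.
    """
    def path_from_root(item):
        chain = [item]
        while chain[-1] in parent:
            chain.append(parent[chain[-1]])
        return chain[::-1]  # most general concept first

    counts = Counter()
    for transaction in transactions:
        seen = set()  # count each concept at most once per transaction
        for item in transaction:
            for level, concept in enumerate(path_from_root(item), start=1):
                seen.add((level, concept))
        counts.update(seen)

    return {
        key: support
        for key, support in counts.items()
        if support >= min_support.get(key[0], float("inf"))
    }
```

For example, with a hierarchy where "skim milk" and "whole milk" both roll up to "milk", the concept "milk" can be frequent at level 1 even when neither specific product meets the level 2 threshold on its own.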

arXiv.org e-Print Archive

Crossref