39 research outputs found

Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method

Author: A Agresti
A Silberschatz
AA Freitas
B Efron
B Liu
BG Buchanan
C Silverstein
D Klahr
D Tsur
FR Hampel
Hilderman and Hamilton
MD Gordon
Mir S Siadaty
MJ Zaki
ML Antonie
N Ye
OR Zaïane
P Srinivasan
PN Tan
R Bayardo
R Grossman
S Mitra
S Yoon
V Maojo
William A Knaus
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Data mining can be used to automate the analysis of the substantial amounts of data produced in many organizations. However, data mining produces large numbers of rules and patterns, many of which are not useful. Existing methods for pruning uninteresting patterns have only begun to automate the knowledge acquisition step (which is required for subjective measures of interestingness), leaving a serious bottleneck. In this paper we propose a method for automatically acquiring knowledge to shorten the pattern list by locating the novel and interesting patterns.

METHODS: The dual-mining method is based on automatically comparing the strength of patterns mined from a database with the strength of equivalent patterns mined from a relevant knowledgebase. When these two estimates of pattern strength do not match, a high "surprise score" is assigned to the pattern, identifying the pattern as potentially interesting. The surprise score captures the degree of novelty or interestingness of the mined pattern. In addition, we show how to compute p values for each surprise score, thus filtering out noise and attaching statistical significance.

RESULTS: We have implemented the dual-mining method using scripts written in Perl and R. We applied the method to a large patient database and a biomedical literature citation knowledgebase. The system estimated association scores for 50,000 patterns, composed of disease entities and lab results, by querying the database and the knowledgebase. It then computed the surprise scores by comparing the pairs of association scores. Finally, the system estimated the statistical significance of the scores.

CONCLUSION: The dual-mining method eliminates more than 90% of patterns with strong associations, thus identifying them as uninteresting. We found that the pruning of patterns using the surprise score matched the biomedical evidence in the 100 cases that were examined by hand. The method automates the acquisition of knowledge, thus reducing dependence on the knowledge elicited from human experts, which is usually a rate-limiting step.
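The comparison described in METHODS can be illustrated with a minimal Python sketch. The abstract does not state the association measure or the surprise formula, so everything here is an assumption: association strength is taken to be a log odds ratio over a 2x2 co-occurrence table (with a 0.5 continuity correction), the surprise score is taken to be the absolute gap between the two estimates, and the pattern names and counts are hypothetical. The authors' own implementation used Perl and R and may differ substantially.

```python
import math

def log_odds_ratio(both, only_a, only_b, neither):
    """Log odds ratio of a 2x2 co-occurrence table.

    Assumed association measure (not taken from the paper); 0.5 is added
    to every cell so that empty cells do not cause division by zero.
    """
    a, b, c, d = (x + 0.5 for x in (both, only_a, only_b, neither))
    return math.log((a * d) / (b * c))

def surprise_score(db_counts, kb_counts):
    """Assumed surprise score: absolute gap between the association
    strength in the database and in the knowledgebase."""
    return abs(log_odds_ratio(*db_counts) - log_odds_ratio(*kb_counts))

# Hypothetical patterns; each maps to (database counts, knowledgebase counts),
# where counts are (both, only_a, only_b, neither).
patterns = {
    "disease_X + lab_Y": ((90, 10, 10, 90), (80, 20, 20, 80)),  # strong in both
    "disease_X + lab_Z": ((90, 10, 10, 90), (25, 25, 25, 25)),  # strong only in database
}

scores = {name: surprise_score(db, kb) for name, (db, kb) in patterns.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: surprise = {scores[name]:.2f}")
```

Under these assumptions, a pattern that is strong in both sources gets a low score and would be pruned, while a pattern that is strong in the database but absent from the knowledgebase rises to the top of the list.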

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Does a pay-for-performance program for primary care physicians alleviate health inequity in childhood vaccination rates?

Author: A Katz
A Scott
Alan Katz
AT Chien
C Gini
C Mills
CA Mustard
Carole Taylor
CG Victora
Colleen Metge
D Raphael
D You
Dan Chateau
Doug Jutte
DP Jutte
E Doherty
EC Chumney
Elaine Burland
IT Williams
J Huang
J Li
J Rosenthal
JE Shepherd
Jeanette Edwards
Jennifer Emily Enns
JL Mathew
JP Mackenbach
JS Weissman
KJ Mullen
L Glidewell
Lisa Lix
LL Roos
LT Fadnes
M Marmot
M Ueda
Marni Brownell
ME Canavan
N Kakwani
Nathan Nickel
NC Nickel
NP Roos
NW Crawford
O O'Donnell
OO Odusanya
P Martens
PM Dixon
R Poulton
RF Schoeni
SAS Institute
SM Priedeman
T Hilderman
T Rieck
TJ Dummer
World Health Organization
YW Jeong
Publication venue: Springer Science and Business Media LLC

Crossref

Algorithms for Association Rules

Author: B. Ganter
N. Pasquier
R. J. Hilderman
Publication date: 01/01/2002

Association rules are "if-then" rules accompanied by two measures that quantify the support and confidence of the rule for a given data set.
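As a brief illustration of the two measures, the Python sketch below uses the standard definitions: the support of a rule A -> B is the fraction of transactions containing every item in A and B, and its confidence is that support divided by the support of A. The function names and the toy transactions are hypothetical and are not taken from the listed publication.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent.

    Returns 0.0 when the antecedent never occurs, to avoid dividing by zero.
    """
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0
    joint = set(antecedent) | set(consequent)
    return support(transactions, joint) / antecedent_support

# Toy data set: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule: if bread then milk.
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Here two of the four transactions contain both bread and milk, and three contain bread, so the rule has support 2/4 and confidence 2/3.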

CiteSeerX

Crossref

Combining Quality Measures to Identify Interesting Association Rules

Author: B. Liu
J.M. Adamo
N. Lavrač
R. Kohavi
R.J. Hilderman
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Crossref

On Semantic Properties of Interestingness Measures for Extracting Rules from Data

Author: G. Piatetsky-Shapiro
N. Lavrač
R.J. Hilderman
T. Fukuda
Y. Kodratoff
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Crossref

FormFormatter: A Software Tool for Constructing Mouse-Optional Visual Basic Applications

Author: Robert J. Hilderman
Trevor N. Mansuy

Repeated use of a pointing device can cause repetitive stress injuries due to the nature of the movement required to manipulate the device. Several solutions to this problem exist; one such solution is a keyboard-based interface where use of the mouse is optional. That is, the keyboard is used as the primary input device to control all aspects of interacting with software. For an application to use this keyboard-based interface, the interface must be added to the application's source code. In this paper we look at the feasibility of a software tool that can automate this activity. Given the volume of Visual Basic source code in existence, such a tool would be extremely useful. We discuss the implementation, capabilities, and limitations of FormFormatter, an automatic Visual Basic source code formatter designed to convert any application to one where the use of the mouse is optional.

CiteSeerX