1,200 research outputs found

Signature-based Tree for Finding Frequent Itemsets

Author: Deye Mohamed Mahmoud
El Hadi Benelhadj Mohamed
Slimani Yahya
Publication venue: The Croatian Communications and Information Society (CCIS) in cooperation with Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, University of Split
Publication date: 01/01/2023

The efficiency of a data mining process depends on the data structure used to find frequent itemsets. Two approaches are possible: use the original transaction dataset, or transform it into a more compact structure. Many algorithms use a tree as the compact structure, such as the FP-Tree and its associated algorithm, FP-Growth. Although this structure reduces the number of dataset scans to only two, its efficiency depends on two criteria: (i) the size of the support (small or large); (ii) the type of transaction dataset (sparse or dense). These two criteria can lead to very large trees. In this paper, we propose a new tree-based structure that emphasizes transactions rather than itemsets. Hence, we avoid the negative impact that support values have on the generated tree.
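To make the role of the support threshold in the abstract above concrete, here is a minimal sketch of frequent itemset counting. It is a brute-force enumeration for illustration only: it is neither FP-Growth nor the signature-based tree the paper proposes. The function name `frequent_itemsets`, the `max_size` cap, and the toy transactions are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (as sorted tuples) that occur in at least
    `min_support` transactions. Brute force: every itemset of up to
    `max_size` items is counted in a single pass over the data."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # drop duplicates, fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical toy dataset (not from the paper).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

high = frequent_itemsets(transactions, min_support=3)  # single items only
low = frequent_itemsets(transactions, min_support=2)   # pairs qualify too
print(len(high), len(low))
```

Lowering the threshold from 3 to 2 doubles the number of frequent itemsets even on this tiny dataset, which hints at why small support values can make itemset-oriented structures grow quickly.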

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Image Database Management System : Design Considerations, Algorithms and Architecture

Author: Nes N.J. (Niels)
Publication date: 14/12/2000

CWI's Institutional Repository

Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

Author: Chang Ming-Yang
Publication venue: Academy Publisher

Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study combines a statistical method with decision tree techniques to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we first use the value of each node as a cut point to generate all possible rules from the tree. Then, once all possible rules have been generated, we verify them using a t-test to discover useful description rules. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.
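The idea of checking a cut-point rule with a t-test can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is assumed here, and the helper names `split_by_cut` and `welch_t`, the field names `x` and `y`, and the numbers are all invented for the example rather than taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def split_by_cut(records, feature, cut, outcome):
    """Split outcome values into two groups: feature <= cut and feature > cut."""
    left = [r[outcome] for r in records if r[feature] <= cut]
    right = [r[outcome] for r in records if r[feature] > cut]
    return left, right


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Invented toy records: `x` is a feature tested at a tree node, `y` an outcome.
records = [
    {"x": 1, "y": 1.0}, {"x": 2, "y": 2.0}, {"x": 3, "y": 3.0},
    {"x": 4, "y": 5.0}, {"x": 5, "y": 6.0}, {"x": 6, "y": 7.0},
]

left, right = split_by_cut(records, feature="x", cut=3, outcome="y")
t = welch_t(left, right)
print(round(t, 3))
```

A large absolute value of `t` suggests the two sides of the cut point differ in their mean outcome. Turning the statistic into a p-value requires the t distribution, which the standard library does not provide.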

Tamkang University Institutional Repository

SHIRAZ: an automated histology image annotation system for zebrafish phenomics

Author: AWM Smeulders
B Grünbaum
Brian A. Canada
G Salton
GC Cross
Georgia K. Thomas
GS Tsao-Wu
H Müller
HL Tang
I Daubechies
J Li
J Li
James Z. Wang
JZ Wang
Keith C. Cheng
M Petrou
M Plummer
MT Weirauch
NA Sabaliauskas
P Colquhoun
PW Hamilton
R Datta
RC Gonzalez
RE Plotnick
RM Haralick
S Mitra
W Lin
Y Liu
YA Lussier
Publication venue: Springer US
Publication date: 01/01/2010

Histological characterization is used in clinical and research contexts as a highly sensitive method for detecting the morphological features of disease and abnormal gene function. Histology has recently been accepted as a phenotyping method for the forthcoming Zebrafish Phenome Project, a large-scale community effort to characterize the morphological, physiological, and behavioral phenotypes resulting from the mutations in all known genes in the zebrafish genome. In support of this project, we present a novel content-based image retrieval system for the automated annotation of images containing histological abnormalities in the developing eye of the larval zebrafish.

Crossref

Springer - Publisher Connector

PubMed Central

IMPROVED INTEGRATED MINING OF HETEROGENEOUS DATA IN DECISION SUPPORT SYSTEMS

Author: Afolabi I. T.
Publication date: 01/03/2012

Covenant University Repository

Dynamic Assembly for System Adaptability, Dependability, and Assurance

Author: Luqi
Publication venue: Naval Postgraduate School
Publication date: 01/12/2002

(DASASA) Project. Author-contributed print item.

Calhoun, Institutional Archive of the Naval Postgraduate School

DescribeX: A Framework for Exploring and Querying XML Web Collections

Author: Rizzolo Flavio
Publication date: 01/01/2008

This thesis introduces DescribeX, a framework for describing arbitrarily complex XML summaries of web collections, which supports more efficient evaluation of XPath workloads. DescribeX permits the declarative description of document structure using all axes and language constructs in XPath, and generalizes many of the XML indexing and summarization approaches in the literature. DescribeX supports the construction of heterogeneous summaries, where different document elements sharing a common structure can be declaratively defined and refined by means of path regular expressions on axes, or axis path regular expressions (AxPREs). DescribeX can significantly help in understanding both the structure of complex, heterogeneous XML collections and the behaviour of XPath queries evaluated on them. Experimental results demonstrate the scalability of DescribeX summary refinements and stabilizations (the key enablers for tailoring summaries) with multi-gigabyte web collections. A comparative study suggests that using a DescribeX summary created from a given workload can produce query evaluation times orders of magnitude better than using existing summaries. DescribeX's lightweight approach of combining summaries with a file-at-a-time XPath processor can be a very competitive alternative, in terms of performance, to conventional fully-fledged XML query engines that provide DB-like functionality such as security, transaction processing, and native storage.

Comment: PhD thesis, University of Toronto, 2008, 163 pages

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server

Efficiently indexing sparse wide tables in community systems

Author: HUI MEI
Publication date: 25/05/2010

Master's (Master of Science)

ScholarBank@NUS