3 research outputs found

    A Novel Algorithm for Discovering Frequent Closures and Generators

    An important step in constructing many association rules is computing frequent closed itemsets and frequent generator itemsets (FCIS/FGIS). However, these two tasks are rarely combined. Most existing methods use a level-wise breadth-first search. Although the performance of depth-first search depends on the characteristics of the data, it is often better than the alternatives. Hence, this paper proposes the FCFG algorithm, which combines the mining of frequent closed itemsets and frequent generators. FCFG first extracts frequent itemsets (FIs) using a depth-first search. It then extracts the FCIS and FGIS from the FIs with a level-wise approach and associates the generators with their closures. In this way, FCFG extends an arbitrary FI-miner algorithm with a generic technique that supports the generation of minimal non-redundant association rules. Experimental results indicate that FCFG outperforms other level-wise methods in most cases.
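
    The three steps named in the abstract (mine FIs depth-first, split them into closed itemsets and generators, link each generator to its closure) can be illustrated with a minimal sketch. This is not the authors' FCFG implementation: the function names, the Eclat-style tidset miner, and the toy transactions are all assumptions made for illustration, and the empty itemset is ignored when testing for generators.

```python
def mine_frequent_itemsets(transactions, min_support):
    """Depth-first (Eclat-style) mining; returns {frozenset: support count}."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset of prefix + item), all already frequent
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            narrower = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    narrower.append((other, shared))
            extend(itemset, narrower)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


def split_closed_and_generators(frequent):
    """Closed: no superset with equal support. Generator: no subset with equal support."""
    closed = set(frequent)
    generators = set(frequent)
    for itemset, support in frequent.items():
        for item in itemset:
            subset = itemset - {item}
            # Checking subsets one item smaller is enough because support is monotone.
            if subset and frequent.get(subset) == support:
                closed.discard(subset)       # subset has an equal-support superset
                generators.discard(itemset)  # itemset has an equal-support subset
    return closed, generators


def associate_generators(closed, generators, frequent):
    """Map each closed itemset to the generators whose closure it is."""
    mapping = {c: [] for c in closed}
    for g in generators:
        for c in closed:
            if g <= c and frequent[g] == frequent[c]:
                mapping[c].append(g)
                break  # the closure of a generator is unique
    return mapping


# Hypothetical toy data, for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
frequent = mine_frequent_itemsets(transactions, min_support=2)
closed, generators = split_closed_and_generators(frequent)
closure_of = associate_generators(closed, generators, frequent)
```

    On this toy data, {c} has the same support as {a, c}, so {c} is a generator whose closure is {a, c}. A rule miner could read minimal non-redundant rules off such generator/closure pairs.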

    Detecting Outliers In High-Dimensional Datasets With Mixed Attributes

    Outlier detection has attracted substantial attention in many applications and research areas. Examples include the detection of network intrusions and credit card fraud. Many existing approaches are based on pairwise distances among all points in the dataset. These approaches do not extend easily to current datasets, which usually contain a mix of categorical and continuous attributes and may be scattered over large geographical areas. In addition, current datasets usually have a large number of dimensions. Such datasets tend to be sparse, and traditional concepts such as Euclidean distance or nearest neighbor become unsuitable. We propose ODMAD, a fast outlier detection strategy intended for datasets containing mixed attributes. ODMAD takes into consideration the sparseness of the dataset, and is experimentally shown to be highly scalable with the number of points and the number of attributes in the dataset.