99 research outputs found

    Agents and stream data mining: a new perspective

    Many organizations struggle with the massive amount of data they collect. Today, data do more than serve as the ingredients for churning out statistical reports. They help support efficient operations in many organizations and, to some extent, provide the competitive intelligence organizations need to survive in today's economy. Data mining can't always deliver timely and relevant results because data are constantly changing. However, stream-data processing might be more effective, judging by the Matrix project.

    Mining relationship graphs for effective business objectives

    A modern organization has two types of customer profiles: active and passive. Active customers contribute to the business goals of an organization, while passive customers are potential candidates that can be converted to active ones. Existing KDD techniques focus mainly on past data generated by active customers. The insights discovered apply well to active customers but may scale poorly to passive ones, because there is no attempt to generate the know-how needed to convert passive customers into active ones. We propose an algorithm to discover relationship graphs using both types of profile. Using relationship graphs, an organization can be more effective in realizing its goals.

    Mining multi-level rules with recurrent items using FP'-Tree

    Association rule mining has received broad attention in academic research and wide application in the real world. As a result, many variations exist, and one such variant is the mining of multi-level rules. The mining of multi-level rules has proved useful in discovering important knowledge that conventional algorithms such as Apriori, SETM, and DIC miss. However, existing techniques for mining multi-level rules fail to take into account the recurrence relationship that can occur in a transaction when an atomic item is translated to a higher-level representation. As a result, rules containing recurrent items go unnoticed. In this paper, we attach the notion of 'quantity' to an item and present an algorithm, based on an extension of the FP-Tree, to find association rules with recurrent items at multiple concept levels.

    SCLOPE: An algorithm for clustering data streams of categorical attributes

    Clustering is a difficult problem, especially when we consider the task in the context of a data stream of categorical attributes. In this paper, we propose SCLOPE, a novel algorithm based on CLOPE's intuitive observation about cluster histograms. Unlike CLOPE, however, our algorithm is very fast and operates within the constraints of a data stream environment. In particular, we designed SCLOPE according to the recent CluStream framework. Our evaluation of SCLOPE shows very promising results. It consistently outperforms CLOPE in speed and scalability tests on our data sets while maintaining high cluster purity; it also supports cluster analysis that other algorithms in its class do not.

    A spectroscopy of texts for effective clustering

    For many clustering algorithms, such as k-means, EM, and CLOPE, there is usually a requirement to set some parameters. Often, these parameters directly or indirectly control the number of clusters to return. In the presence of different data characteristics and analysis contexts, it is often difficult for the user to estimate the number of clusters in the data set. This is especially true in text collections such as Web documents, images or biological data. The fundamental question this paper addresses is: "How can we effectively estimate the natural number of clusters in a given text collection?" We propose to use spectral analysis, which analyzes the eigenvalues (not eigenvectors) of the collection, as the solution to the above. We first present the relationship between a text collection and its underlying spectra. We then show how the answer to this question enhances the clustering process. Finally, we conclude with empirical results and related work.

    Modal gain control in a multimode erbium doped fiber amplifier incorporating ring doping

    We theoretically demonstrate the performance of a step-index multimode (two mode-group) erbium-doped fiber amplifier with a localized erbium-doped ring distribution for Space Division Multiplexed (SDM) transmission.

    CrystalClear: Active visualization of association rules

    Effective visualization is an important aspect of active data mining. In the context of association rules, this need has been driven by the large number of rules produced by a run of the algorithm. To address real user needs, the rules need to be summarized and organized so that they can be interpreted and applied in a timely manner. In this paper, we propose two visualization techniques that improve on those used by existing data mining packages. In particular, we address the visualization of "differences" in the set of rules due to incremental changes in the data source. We show that visualization in this aspect is important to active data mining, as it uncovers new insights not possible from inspecting individual data mining results.