497 research outputs found

    Reducing UK-means to k-means

    This paper proposes an optimisation to the UK-means algorithm, which generalises the k-means algorithm to handle objects whose locations are uncertain. The location of each object is described by a probability density function (pdf). The UK-means algorithm needs to compute expected distances (EDs) between each object and the cluster representatives. Evaluating an ED from first principles is a very costly operation, because the pdfs are different and arbitrary, yet UK-means needs to evaluate a large number of EDs. This is a major performance burden of the algorithm. In this paper, we derive a formula for evaluating EDs efficiently. This tremendously reduces the execution time of UK-means, as demonstrated by our preliminary experiments. We also illustrate that this optimised formula effectively reduces the UK-means problem to the traditional clustering algorithm addressed by the k-means algorithm. © 2007 IEEE. The 7th IEEE International Conference on Data Mining (ICDM) Workshops 2007, Omaha, NE, 28-31 October 2007. In Proceedings of the 7th ICDM, 2007, p. 483-48
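
The abstract does not state the formula itself. As a rough illustration of how such a reduction can work, the sketch below assumes the distance is squared Euclidean distance and that each uncertain object is represented by equally weighted sample points (both are assumptions, not details taken from the abstract). Under those assumptions a standard identity applies: the expected squared distance from an object to a point c equals the squared distance from the object's centroid to c plus a per-object constant. The function names are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(samples):
    """Mean position of the sample points that describe one uncertain object."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / n for d in range(dim)]


def expected_sq_distance_brute(samples, c):
    """Expected squared distance to c, averaged over every sample (costly)."""
    return sum(squared_distance(s, c) for s in samples) / len(samples)


def expected_sq_distance_fast(mean, spread, c):
    """Same quantity from the precomputed centroid and spread (cheap)."""
    return squared_distance(mean, c) + spread


# Hypothetical uncertain object: 1000 sample points in 2-D (assumed data).
random.seed(0)
samples = [(random.gauss(2.0, 1.0), random.gauss(-1.0, 0.5)) for _ in range(1000)]

# Computed once per object, independent of any cluster representative.
mean = centroid(samples)
spread = expected_sq_distance_brute(samples, mean)

# A hypothetical cluster representative.
c = (0.5, 3.0)
brute = expected_sq_distance_brute(samples, c)
fast = expected_sq_distance_fast(mean, spread, c)
print(abs(brute - fast) < 1e-9)
```

Because the spread term does not depend on c, it cannot change which representative is nearest to an object. Under these assumptions, assigning uncertain objects to clusters gives the same result as running ordinary k-means on the object centroids.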

    Effective pattern discovery for text mining

    Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based (or phrase-based) approaches should perform better than term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery, which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on the RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.

    Personalized Temporal Medical Alert System

    The continuously increasing needs in telemedicine and healthcare accentuate the need for well-adapted medical alert systems. Such alert systems may be used by a variety of patients and medical actors, and should allow monitoring of a wide range of medical variables. This paper proposes Tempas, a personalized temporal alert system. It facilitates customized alert configuration by using linguistic trends. The trend detection algorithm is based on data normalization, time series segmentation, and segment classification. It improves on the state of the art by treating both irregular and regular time series appropriately, thanks to the introduction of an observation-variable valid time. Alert detection is enriched with quality and applicability measures, which allow personalized tuning of the system to help reduce false negative and false positive alerts.

    TF-IDF Based Contextual Post-Filtering Recommendation Algorithm in Complex Interactive Situations of Online to Offline: An Empirical Study

    O2O (online to offline) accelerates the integration of online and offline and promotes the upgrading of industrial structure and consumption patterns, but it also brings an information overload problem. This paper develops a contextual post-filtering recommendation algorithm based on TF-IDF, which improves on existing algorithms. Combining contextual association probability and contextual universal importance, a contextual preference prediction model is constructed to adjust the initial score of the traditional recommendation combined with item category preference, generating the final result. An example from the catering industry shows that the proposed algorithm is more effective than the improved algorithm.