3 research outputs found
Effective pattern discovery for text mining
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance
Automatic pattern-taxonomy extraction for web mining
In this paper, we propose a model for discovering frequent sequential patterns, phrases, which can be used as profile descriptors of documents. It is indubitable that we can obtain numerous phrases using data mining algorithms. However, it is difficult to use these phrases effectively for answering what users want. Therefore, we present a pattern taxonomy extraction model which performs the task of extracting descriptive frequent sequential patterns by pruning the meaningless ones. The model then is extended and tested by applying it to the information filtering system. The results of the experiment show that pattern-based methods outperform the keyword-based methods. The results also indicate that removal of meaningless patterns not only reduces the cost of computation but also improves the effectiveness of the system. <br /
A pattern language for evolution reuse in component-based software architectures
Context: Modern software systems are prone to a continuous evolution under frequently varying requirements and changes in operational environments. Architecture-Centric Software Evolution (ACSE) enables changes in a system’s structure and behaviour while maintaining a global view of the software to address evolution-centric trade-offs. Lehman’s law of continuing change demands for long-living and continuously evolving architectures to prolong the productive life and economic value of software. Also some industrial research shows that evolution reuse can save approximately 40% effort of change implementation in ACSE process. However, a systematic review of existing research suggests a lack of solution(s) to support a continuous integration of
reuse knowledge in ACSE process to promote evolution-off-the-shelf in software architectures.
Objectives: We aim to unify the concepts of software repository mining and software evolution to discover evolution-reuse knowledge that can be shared and reused to guide ACSE.
Method: We exploit repository mining techniques (also architecture change mining) that investigates architecture change logs to discover change operationalisation and patterns. We apply
software evolution concepts (also architecture change execution) to support pattern-driven reuse in ACSE. Architecture change patterns support composition and application of a pattern language that exploits patterns and their relations to express evolution-reuse knowledge. Pattern language composition is enabled with a continuous discovery of patterns from architecture change logs and
formalising relations among discovered patterns. Pattern language application is supported with an incremental selection and application of patterns to achieve reuse in ACSE. The novelty of the research lies with a framework PatEvol that supports a round-trip approach for a continuous acquisition (mining) and application (execution) of reuse knowledge to enable ACSE. Prototype
support enables customisation and (semi-) automation for the evolution process.
Results: We evaluated the results based on the ISO/IEC 9126 - 1 quality model and a case study based validation of the architecture change mining and change execution processes. We observe consistency and reusability of change support with pattern-driven architecture evolution. Change patterns support efficiency for architecture evolution process but lack a fine-granular
change implementation. A critical challenge lies with the selection of appropriate patterns to form a pattern language during evolution.
Conclusions: The pattern language itself continuously evolves with an incremental discovery of new patterns from change logs over time. A systematic identification and resolution of change anti-patterns define the scope for future research