4 research outputs found

    Automatic identification of bottleneck tasks for business process management using fusion-based text clustering

    The arrival of the industrial big data era offers unprecedented opportunities for machine learning to intelligently uncover hidden tasks and restore the entire underlying process for business process modelling. While recent studies, e.g., on process mining and ontologies, have advanced the research agenda of business process modelling and management, identifying a bottleneck task automatically needs more in-depth research. In this paper, a text mining-based bottleneck task identification approach is proposed. Firstly, to extract tasks from documents of different lengths, a dynamic sliding window is introduced to the biterm topic model. The sliding window size is adjusted according to document length during the biterm selection process to ensure that the two words in a biterm come from the same context. Secondly, a fusion-based clustering algorithm is studied to uncover business tasks. The improved biterm topic model and the Doc2vec model are each used to train a document vector, and a distance is calculated from each. The linear fusion of these two distances is used as the clustering metric. Thirdly, the temporal frequency of each task in different periods is calculated to show the timeline and abnormal occurrence of tasks and thereby identify bottleneck tasks. The proposed approach is evaluated using a data set that records the execution of a multi-year multidisciplinary student design project. The experimental results show that the approach can effectively identify bottleneck tasks without manual intervention.
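
    The three steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the window-sizing rule, the distance functions, or the fusion weight, so `window_size` (a clamped fraction of document length), the use of cosine distance for both vectors, the weight `alpha`, and all function names are hypothetical. The topic and Doc2vec vectors are assumed to be already trained and are passed in as plain lists.

```python
import math
from collections import Counter, defaultdict


def window_size(doc_len, ratio=0.5, min_w=2, max_w=15):
    """Hypothetical sizing rule: the window grows with document length
    and is clamped to [min_w, max_w]. The paper's actual rule is not
    stated in the abstract."""
    return max(min_w, min(max_w, math.ceil(ratio * doc_len)))


def extract_biterms(tokens, ratio=0.5, min_w=2, max_w=15):
    """Return unordered word pairs whose positions fall in one window."""
    w = window_size(len(tokens), ratio, min_w, max_w)
    biterms = []
    for i in range(len(tokens)):
        # Pair token i with the following tokens inside the same window.
        for j in range(i + 1, min(i + w, len(tokens))):
            biterms.append(tuple(sorted((tokens[i], tokens[j]))))
    return biterms


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def fused_distance(topic_u, topic_v, emb_u, emb_v, alpha=0.5):
    """Linear fusion of a topic-vector distance and an embedding distance.
    alpha is an assumed weight, not a value taken from the paper."""
    return (alpha * cosine_distance(topic_u, topic_v)
            + (1 - alpha) * cosine_distance(emb_u, emb_v))


def task_frequency_by_period(events):
    """Count how often each task label occurs in each period.
    events is an iterable of (period, task_label) pairs."""
    freq = defaultdict(Counter)
    for period, task in events:
        freq[task][period] += 1
    return freq
```

    The fused distance could then be supplied as a precomputed pairwise matrix to any clustering routine that accepts one, and the per-period counts inspected for unusual peaks. How an occurrence is judged abnormal is not specified in the abstract, so it is left out here.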

    Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering

    This paper presents a comprehensive survey of meta-heuristic optimization algorithms in text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods because of their success in solving machine learning problems, especially text clustering problems. This paper reviews all of the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. The main procedures of text clustering and critical discussions are also given. The review reports the advantages and disadvantages of these methods and recommends potential future research paths. The main keywords considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.

    Revisiting the challenges and surveys in text similarity matching and detection methods

    The massive amount of information from the internet has revolutionized the field of natural language processing. One of the challenges is estimating the similarity between texts. This has remained an open research problem, although various studies have proposed new methods over the years. This paper surveys and traces the primary studies in the field of text similarity. The aim is to give a broad overview of existing issues, applications, and methods of text similarity research. The paper identifies four issues and several applications of text similarity matching. It classifies current studies into intrinsic, extrinsic, and hybrid approaches. It then identifies the methods and classifies them into lexical-similarity, syntactic-similarity, semantic-similarity, structural-similarity, and hybrid. Furthermore, the study analyzes and discusses method improvement, current limitations, and open challenges on this topic for future research directions.
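
    As a concrete instance of the lexical-similarity category named in this abstract, the sketch below computes Jaccard similarity over word sets. The choice of Jaccard, the whitespace tokenization, and the function name are assumptions made for illustration; the abstract does not say which lexical measures the survey covers.

```python
def jaccard_similarity(text_a, text_b):
    """Share of distinct lowercase words common to both texts.
    Two empty texts are treated as identical (similarity 1.0)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(jaccard_similarity("the cat sat", "the cat ran"))  # 2 shared / 4 distinct = 0.5
```

    A measure like this looks only at surface word overlap, so it would score paraphrases with different vocabulary as dissimilar, which is the kind of gap semantic-similarity methods are meant to address.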