
    Pattern mining under different conditions

    New requirements and demands on pattern mining arise in modern applications that conventional methods cannot fulfill. For example, in scientific research, scientists are more interested in unknown knowledge, which usually hides in significant but infrequent patterns. However, existing itemset mining algorithms are designed for very frequent patterns. Furthermore, scientists need to repeat an experiment many times to ensure reproducibility. A series of datasets is generated at once and awaits clustering; each dataset can contain an unknown number of clusters with various densities and shapes. Using existing clustering algorithms is time-consuming because parameter tuning is necessary for each dataset. Many scientific datasets are also extremely noisy: they contain considerably more noise points than in-cluster data points, whereas most existing clustering algorithms can only handle noise up to a moderate level. Temporal pattern mining is also important in scientific research. Existing temporal pattern mining algorithms only consider point-based events. However, most activities in the real world are interval-based, with a starting and an ending timestamp. This thesis develops novel pattern mining algorithms for various data mining tasks under different conditions. The first part of this thesis investigates the problem of mining less frequent itemsets in transactional datasets. In contrast to existing frequent itemset mining algorithms, this part focuses on itemsets that occur infrequently. The algorithms NIIMiner, RaCloMiner, and LSCMiner are proposed to identify such itemsets efficiently. NIIMiner uses the negative itemset tree to extract all patterns whose support is below a given threshold in a top-down, depth-first manner. RaCloMiner combines existing bottom-up frequent itemset mining algorithms with a top-down itemset mining algorithm to achieve better performance in mining less frequent patterns. LSCMiner investigates the problem of mining less frequent closed patterns. The second part of this thesis studies the problem of interval-based temporal pattern mining in the stream environment. Interval-based temporal patterns are sequential patterns in which each event carries a starting and an ending timestamp. Existing approaches lack the ability to handle both interval-based events and stream data. A novel interval-based temporal pattern mining algorithm for stream data is described in this part. The last part of this thesis studies new problems in clustering on numeric datasets. The first problem tackled in this part is shape alternation adaptability in clustering. In applications such as scientific data analysis, scientists need to deal with a series of datasets generated from one experiment, and cluster sizes and shapes differ across those datasets. A kNN density-based clustering algorithm, kadaClus, is proposed to provide shape alternation adaptability so that users do not need to tune parameters for each dataset. The second problem studied in this part is clustering in an extremely noisy dataset. Many real-world datasets contain considerably more noise points than in-cluster data points. A novel clustering algorithm, kenClus, is proposed to identify clusters of arbitrary shape in extremely noisy datasets. Both clustering algorithms are kNN-based and require only one parameter, k. In each part, the efficiency and effectiveness of the presented techniques are thoroughly analyzed. Extensive experiments on synthetic and real-world datasets are conducted to show the benefits of the proposed algorithms over conventional approaches.
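
    To make the notion of a "less frequent itemset" concrete, the sketch below enumerates every itemset whose support falls below a threshold by brute force. It is not NIIMiner, RaCloMiner, or LSCMiner, whose internals the abstract does not give; the function name infrequent_itemsets, the max_size cap, the toy transactions, and the choice to ignore itemsets that never occur are all assumptions made for illustration.

```python
from itertools import combinations


def infrequent_itemsets(transactions, max_support, max_size=2):
    """Brute-force illustration (hypothetical helper, not the thesis's algorithm).

    Returns a dict mapping each itemset (as a sorted tuple) to its support,
    keeping only itemsets that occur at least once but in fewer than
    `max_support` transactions. Itemsets larger than `max_size` are skipped.
    """
    # Collect the distinct items so candidates are generated in a stable order.
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            wanted = set(candidate)
            # Support = number of transactions containing every item.
            support = sum(1 for t in transactions if wanted <= t)
            # Assumption: itemsets with zero support are not reported.
            if 0 < support < max_support:
                result[candidate] = support
    return result


# Toy data, invented for this example.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "d"},
]
rare = infrequent_itemsets(transactions, max_support=2)
```

    With these four transactions and a threshold of 2, the itemsets kept are ('d',), ('a', 'd'), and ('b', 'c'), each with support 1. The exhaustive enumeration grows exponentially with the number of items, which is the cost that dedicated top-down or hybrid miners aim to avoid.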

    A model for context awareness for mobile applications using multiple-input sources

    Context-aware computing enables mobile applications to discover and benefit from valuable context information, such as user location, time of day and current activity. However, determining the users’ context throughout their daily activities is one of the main challenges of context-aware computing. With the increasing number of built-in mobile sensors and other input sources, existing context models do not effectively handle context information related to personal user context. The objective of this research was to develop an improved context-aware model to support the context awareness needs of mobile applications. The most complete existing context-aware model was selected as the basis for the proposed model and was modified to address the shortcomings of existing models in dealing with context information related to personal user context. The proposed model supports four context dimensions, namely Physical, User Activity, Health and User Preferences. A prototype called CoPro was developed, based on the proposed model, to demonstrate the effectiveness of the model. Several experiments were designed and conducted to determine whether CoPro was effective, reliable and capable. CoPro was considered effective as it produced low-level context as well as inferred context. The reliability of the model was confirmed by evaluating CoPro using Quality of Context (QoC) metrics such as Accuracy, Freshness, Certainty and Completeness. CoPro was also found to be capable of dealing with the limitations of the mobile computing platform, such as limited processing power. The research determined that the proposed context-aware model can be used to successfully support context awareness in mobile applications. Design recommendations were proposed, and future work will involve converting the CoPro prototype into middleware in the form of an API to provide easier access to context awareness support in mobile applications.