695 research outputs found

    Data Mining Techniques for Complex User-Generated Data

    Get PDF
    Nowadays, the amount of collected information is continuously growing in a variety of different domains. Data mining techniques are powerful instruments to effectively analyze these large data collections and extract hidden and useful knowledge. Vast amount of User-Generated Data (UGD) is being created every day, such as user behavior, user-generated content, user exploitation of available services and user mobility in different domains. Some common critical issues arise for the UGD analysis process such as the large dataset cardinality and dimensionality, the variable data distribution and inherent sparseness, and the heterogeneous data to model the different facets of the targeted domain. Consequently, the extraction of useful knowledge from such data collections is a challenging task, and proper data mining solutions should be devised for the problem under analysis. In this thesis work, we focus on the design and development of innovative solutions to support data mining activities over User-Generated Data characterised by different critical issues, via the integration of different data mining techniques in a unified frame- work. Real datasets coming from three example domains characterized by the above critical issues are considered as reference cases, i.e., health care, social network, and ur- ban environment domains. Experimental results show the effectiveness of the proposed approaches to discover useful knowledge from different domains

    SEQUEST: Mining frequent subsequences using DMA strips

    Get PDF
    Sequential patterns exist in data such as DNA string databases, occurrences of recurrent illness, etc. In this study, we present an algorithm, SEQUEST, to mine frequent subsequences from sequential patterns. The challenges of mining a very large database of sequences is computationally expensive and require large memory space. SEQUEST uses a Direct Memory Access Strips (DMA-Strips) structure to efficiently generate candidate subsequences. DMA-Strips structure provides direct access to each item to be manipulated and thus is optimized for speed and space performance. In addition, the proposed technique uses a hybrid principle of frequency counting by the vertical join approach and candidate generation by structure guided method. The structure guided method is adapted from the TMG approach used for enumerating subtrees in our previous work [8]. Experiments utilizing very large databases of sequences which compare our technique with the existing technique, PLWAP [4], demonstrate the effectiveness of our proposed technique

    Workshop sensing a changing world : proceedings workshop November 19-21, 2008

    Get PDF

    Enhancing the Prediction of Missing Targeted Items from the Transactions of Frequent, Known Users

    Get PDF
    The ability for individual grocery retailers to have a single view of its customers across all of their grocery purchases remains elusive, and is considered the “holy grail” of grocery retailing. This has become increasingly important in recent years, especially in the UK, where competition has intensified, shopping habits and demographics have changed, and price sensitivity has increased. Whilst numerous studies have been conducted on understanding independent items that are frequently bought together, there has been little research conducted on using this knowledge of frequent itemsets to support decision making for targeted promotions. Indeed, having an effective targeted promotions approach may be seen as an outcome of the “holy grail”, as it will allow retailers to promote the right item, to the right customer, using the right incentives to drive up revenue, profitability, and customer share, whilst minimising costs. Given this, the key and original contribution of this study is the development of the market target (mt) model, the clustering approach, and the computer-based algorithm to enhance targeted promotions. Tests conducted on large scale consumer panel data, with over 32000 customers and 51 million individual scanned items per year, show that the mt model and the clustering approach successfully identifies both the best items, and customers to target. Further, the algorithm segregates customers into differing categories of loyalty, in this case it is four, to enable retailers to offer customised incentives schemes to each group, thereby enhancing customer engagement, whilst preventing unnecessary revenue erosion. The proposed model is compared with both a recently published approach, and the cross-sectional shopping patterns of the customers on the consumer scanner panel. Tests show that the proposed approach outperforms the other approach in that it significantly reduces the probability of having “false negatives” and “false positives” in the target customer set. Tests also show that the customer segmentation approach is effective, in that customers who are classed as highly loyal to a grocery retailer, are indeed loyal, whilst those that are classified as “switchers” do indeed have low levels of loyalty to the selected grocery retailer. Applying the mt model to other fields has not only been novel but yielded success. School attendance is improved with the aid of the mt model being applied to attendance data. In this regard, an action research study, involving the proposed mt model and approach, conducted at a local UK primary school, has resulted in the school now meeting the required attendance targets set by the government, and it has halved its persistent absenteeism for the first time in four years. In medicine, the mt model is seen as a useful tool that could rapidly uncover associations that may lead to new research hypotheses, whilst in crime prevention, the mt value may be used as an effective, tangible, efficiency metric that will lead to enhanced crime prevention outcomes, and support stronger community engagement. Future work includes the development of a software program for improving school attendance that will be offered to all schools, while further progress will be made on demonstrating the effectiveness of the mt value as a tangible crime prevention metric

    Cross-chain collaboration in the fast moving consumer goods supply chain

    Get PDF

    SPARC 2016 Salford postgraduate annual research conference book of abstracts

    Get PDF

    Semantic discovery and reuse of business process patterns

    Get PDF
    Patterns currently play an important role in modern information systems (IS) development and their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential of providing a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns as well as their reuse
    • …
    corecore