452 research outputs found

    Rule Mining and Sequential Pattern Based Predictive Modeling with EMR Data

    Electronic medical record (EMR) data is collected on a daily basis at hospitals and other healthcare facilities to track patients’ health status, including conditions, treatments (medications, procedures), diagnostics (labs) and associated healthcare operations. Besides being useful for individual patient care and hospital operations (e.g., billing, triaging), EMRs can also be exploited for secondary data analyses to glean discriminative patterns that hold across patient cohorts for different phenotypes. These patterns in turn can yield high-level insights into disease progression with interventional potential. In this dissertation, using a large-scale, realistic EMR dataset of over one million patients visiting University of Kentucky healthcare facilities, we explore data mining and machine learning methods for association rule (AR) mining and predictive modeling, with mood and anxiety disorders as use cases. Our first work analyzes existing quantitative measures of rule interestingness to assess how well they align with a practicing psychiatrist’s sense of novelty/surprise regarding ARs identified from EMRs. Our second effort involves mining causal ARs with depression and anxiety disorders as target conditions, using matching methods that account for computationally identified confounding attributes. Our final effort involves an efficient implementation (via GPUs) of contrast pattern mining and its application to predictive modeling for mental conditions using various representational methods and recurrent neural networks. Overall, we demonstrate the effectiveness of rule mining methods in secondary analyses of EMR data for identifying causal associations and building predictive models for diseases.
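The abstract above refers to quantitative measures of rule interestingness without naming them. As a minimal sketch, the three most common textbook measures for a rule A -> B (support, confidence and lift) can be computed as below. The toy records and the condition labels are hypothetical and are not drawn from the dissertation's dataset, and the dissertation may evaluate other measures.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]


def support(records: List[Record], itemset: FrozenSet[str]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if itemset <= r)
    return hits / len(records)


def confidence(records: List[Record], antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Estimated P(consequent | antecedent)."""
    base = support(records, antecedent)
    if base == 0.0:
        return 0.0
    return support(records, antecedent | consequent) / base


def lift(records: List[Record], antecedent: FrozenSet[str],
         consequent: FrozenSet[str]) -> float:
    """Confidence divided by the consequent's baseline support.

    A value above 1 means the antecedent makes the consequent more likely
    than it is in the data overall.
    """
    cons = support(records, consequent)
    if cons == 0.0:
        return 0.0
    return confidence(records, antecedent, consequent) / cons


# Hypothetical toy data: each record is the set of codes seen for one patient.
records: List[Record] = [
    frozenset({"insomnia", "depression"}),
    frozenset({"insomnia", "depression", "anxiety"}),
    frozenset({"insomnia"}),
    frozenset({"anxiety"}),
]

A = frozenset({"insomnia"})
B = frozenset({"depression"})

print(round(support(records, A | B), 3))    # 0.5
print(round(confidence(records, A, B), 3))  # 0.667
print(round(lift(records, A, B), 3))        # 1.333
```

A lift above 1 only signals co-occurrence beyond chance; it says nothing about causation, which is why the abstract treats causal rule mining as a separate effort.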

    A MULTI-METHOD DESIGN TO INVESTIGATE THE ROLES OF READING STRATEGY USE AND READING INTEREST IN COMPREHENSION OF ENGLISH EXPOSITORY TEXTS FOR EIGHTH GRADERS IN THE EFL CONTEXT

    This study aimed to address three research gaps revealed in previous studies on L2 reading comprehension and L2 reading strategy use: (a) a restricted use of methodology in assessing L2 reading strategies, (b) inadequate attention to the role of reading interest in L2 reading comprehension, and (c) a lack of comprehensive understanding about the relationships between reading strategy use and reading interest in L2 reading. A multi-method design was adopted to assess L2 reading strategy use and L2 reading interest. The assessment methods for strategy use included think-aloud protocols and an L2 reading strategy questionnaire, the Cognitive-Metacognitive Strategy Questionnaire. To quantify the data from the think-aloud protocols, three scoring procedures were developed based on the frequency counts of the strategy coding system: (1) Quantity of Total Strategy Use, (2) Quality of Total Strategy Use and (3) Sophistication of Strategy Use. In addition, the readers' reading interest was measured by semi-structured interviews and two interest scales: the Situational Interest Questionnaire and the Interest Experience Scale. Based on the multiple assessments with 36 participants, the study examined (1) the specific L2 reading strategies employed by eighth graders in Taiwan and how the results from different strategy assessments corresponded to each other, (2) the sources of L2 reading interest for the eighth graders, and (3) how L2 reading strategy use and reading interest interacted with each other to influence L2 reading comprehension. The results indicated that the L2 readers utilized three clusters of reading strategies during comprehension: (1) textbase comprehension strategies, such as translation and paraphrasing, (2) situation model construction strategies, such as elaboration, summarization and drawing inferences, and (3) metacognitive monitoring strategies.
The study also found that the Sophistication of Strategy Use measure had the most satisfactory validity among the strategy measures. The degree of sophistication in strategy use was more strongly associated with the readers' text recalls than the quantity of total strategy use was, indicating that how intentionally and carefully the readers applied each strategy played a significant role in improving reading comprehension. Moreover, the study found several content characteristics that positively influenced L2 readers' interest in the text: relevance, importance, novelty and familiarity of the ideas contained in the text. Furthermore, the case analyses of three readers' profiles showed that reading interest was closely related to the depth of the readers' strategic engagement. The less proficient L2 reader, Alice, possessed high reading interest and attempted to employ more higher-order situation model construction strategies during reading. By contrast, the proficient L2 reader, Stella, did not intend to comprehend the text in depth and, because of low reading interest, used the strategies at a superficial level. These findings present a dynamic picture of the intertwined relationship between strategy use and reading interest in L2 reading comprehension.

    Information dynamics: patterns of expectation and surprise in the perception of music

    This is a postprint of an article submitted for consideration in Connection Science © 2009 [copyright Taylor & Francis]; Connection Science is available online at: http://www.tandfonline.com/openurl?genre=article&issn=0954-0091&volume=21&issue=2-3&spage=8

    Reducing gaps in quantitative association rules: A genetic programming free-parameter algorithm

    The extraction of useful information for decision making is a challenge in many different domains. Association rule mining is one of the most important techniques in this field, discovering relationships of interest among patterns. Although the mining of association rules is an area of great interest for many researchers, the search for well-grouped continuous values is still a challenge: the aim is to discover rules whose patterns do not cover unnecessary ranges of values. Existing algorithms for mining association rules in continuous domains are mainly based on a non-deterministic search, requiring a high number of parameters to be optimised. These parameters hinder the mining process, and data mining experts who want to use the algorithms must first understand how they work. We therefore present a grammar-guided genetic programming algorithm that does not require as many parameters as other existing approaches and enables the discovery of quantitative association rules comprising small-size gaps. The algorithm is verified over a varied set of data, comparing the results to other association rule mining algorithms from several paradigms. Additionally, some resulting rules from different paradigms are analysed, demonstrating the effectiveness of our model for reducing gaps in numerical features.

    Enhancing the Prediction of Missing Targeted Items from the Transactions of Frequent, Known Users

    The ability of individual grocery retailers to have a single view of their customers across all of their grocery purchases remains elusive, and is considered the “holy grail” of grocery retailing. This has become increasingly important in recent years, especially in the UK, where competition has intensified, shopping habits and demographics have changed, and price sensitivity has increased. Whilst numerous studies have been conducted on understanding independent items that are frequently bought together, there has been little research on using this knowledge of frequent itemsets to support decision making for targeted promotions. Indeed, having an effective targeted promotions approach may be seen as an outcome of the “holy grail”, as it will allow retailers to promote the right item, to the right customer, using the right incentives to drive up revenue, profitability, and customer share, whilst minimising costs. Given this, the key and original contribution of this study is the development of the market target (mt) model, the clustering approach, and the computer-based algorithm to enhance targeted promotions. Tests conducted on large-scale consumer panel data, with over 32,000 customers and 51 million individual scanned items per year, show that the mt model and the clustering approach successfully identify both the best items and the best customers to target. Further, the algorithm segregates customers into differing categories of loyalty (four in this case) to enable retailers to offer customised incentive schemes to each group, thereby enhancing customer engagement whilst preventing unnecessary revenue erosion. The proposed model is compared with both a recently published approach and the cross-sectional shopping patterns of the customers on the consumer scanner panel.
Tests show that the proposed approach outperforms the other approach in that it significantly reduces the probability of having “false negatives” and “false positives” in the target customer set. Tests also show that the customer segmentation approach is effective, in that customers who are classed as highly loyal to a grocery retailer are indeed loyal, whilst those classified as “switchers” do indeed have low levels of loyalty to the selected grocery retailer. Applying the mt model to other fields has been both novel and successful. School attendance improved when the mt model was applied to attendance data. In this regard, an action research study involving the proposed mt model and approach, conducted at a local UK primary school, has resulted in the school now meeting the attendance targets set by the government, and it has halved its persistent absenteeism for the first time in four years. In medicine, the mt model is seen as a useful tool that could rapidly uncover associations that may lead to new research hypotheses, whilst in crime prevention, the mt value may be used as an effective, tangible efficiency metric that will lead to enhanced crime prevention outcomes and support stronger community engagement. Future work includes the development of a software program for improving school attendance that will be offered to all schools, while further progress will be made on demonstrating the effectiveness of the mt value as a tangible crime prevention metric.

    Design: One, but in different forms

    This overview paper defends an augmented, cognitively oriented generic-design hypothesis: there are both significant similarities between the design activities implemented in different situations and crucial differences between these and other cognitive activities; yet characteristics of a design situation (related to the design process, the designers, and the artefact) introduce specificities in the corresponding cognitive activities and structures that are used, and in the resulting designs. We thus augment the classical generic-design hypothesis with that of different forms of designing. We review the data available in the cognitive design research literature and propose a series of candidates underlying such forms of design, outlining a number of directions requiring further elaboration.

    Two new approaches to evaluate association rules

    viii, 85 leaves : ill. ; 29 cm
    Data mining aims to discover interesting and unknown patterns in large-volume data. Association rule mining is one of the major data mining tasks, which attempts to find inherent relationships among data items in an application domain, such as supermarket basket analysis. An essential post-process in an association rule mining task is the evaluation of association rules by measures of their interestingness. Different interestingness measures have been proposed and studied. Given an association rule mining task, measures are assessed against a set of user-specified properties. However, in practice, given the subjectivity and inconsistencies in property specifications, it is a non-trivial task to make appropriate measure selections. In this work, we propose two novel approaches to assess interestingness measures. Our first approach utilizes the analytic hierarchy process to quantitatively capture domain-dependent requirements on properties, which are later used in assessing measures. This approach not only eliminates any inconsistencies in an end user’s property specifications through consistency checking but also is invariant to the number of association rules. Our second approach dynamically evaluates association rules according to a composite and collective effect of multiple measures. It interactively takes snapshots of the end user’s domain-dependent requirements in evaluating association rules. In essence, our approach uses neural networks along with back-propagation learning to capture the relative importance of measures in evaluating association rules. Case studies and simulations have been conducted to show the effectiveness of our two approaches.
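The first approach in the abstract above relies on the analytic hierarchy process (AHP) and its consistency checking. The sketch below shows, under stated assumptions, how priority weights and a consistency ratio are commonly derived from a pairwise comparison matrix. It uses the row geometric mean approximation instead of the principal eigenvector, the three compared properties and their judgments are hypothetical, and none of it is taken from the thesis itself.

```python
import math
from typing import List

# Saaty's random consistency index for matrix sizes 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix: List[List[float]]) -> List[float]:
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix: List[List[float]], weights: List[float]) -> float:
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1).

    lambda_max is estimated as the mean of (A w)_i / w_i.
    A CR below roughly 0.1 is conventionally treated as acceptable.
    """
    n = len(matrix)
    if n <= 2:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    a_times_w = [sum(matrix[i][j] * weights[j] for j in range(n))
                 for i in range(n)]
    lambda_max = sum(a_times_w[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments over three properties of an interestingness measure:
# entry [i][j] states how much more important property i is than property j.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = ahp_weights(comparisons)
cr = consistency_ratio(comparisons, weights)
print([round(w, 3) for w in weights])  # [0.571, 0.286, 0.143]
print(abs(cr) < 1e-9)                  # True: these judgments are fully consistent
```

When a user's judgments contradict one another (for example, A over B, B over C, but C over A), the ratio rises, which is the signal a consistency check would use to ask the user to revise the comparisons.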
