273 research outputs found

    Towards Role Based Hypothesis Evaluation for Health Data Mining

    Data mining researchers have long been concerned with the application of tools to facilitate and improve data analysis on large, complex data sets. The current challenge is to make data mining and knowledge discovery systems applicable to a wider range of domains, among them health. Early work was performed over transactional, retail-based data sets, but the attraction of finding previously unknown knowledge in the ever-increasing amounts of data collected from the health domain is an emerging area of interest and specialisation. The problem is finding a solution that is flexible enough to allow for generalised application whilst being specific enough to provide functionality that caters for the nuances of each role within the domain. The need for a more granular approach to problem solving in other areas of information technology has resulted in the use of role-based solutions. This paper discusses the progress to date in developing a role-oriented solution to the problem of providing for the diverse requirements of health domain data miners and defining the foundation for determining what constitutes an interesting discovery in an area as complex as health.

    Mining Closed Itemsets for Coherent Rules: An Inference Analysis Approach

    Previous studies have shown that a frequent itemset mining algorithm should mine the closed itemsets, because the result is a compact yet complete result set and the mining is more efficient. However, the most recent closed itemset mining algorithms rely on candidate maintenance combined with a test paradigm, which is expensive in both runtime and space usage when the support threshold is low or the itemsets get long. Here we present PEPP with inference analysis, a capable approach for mining closed sequences for coherent rules without candidate maintenance. It implements a novel sequence closure checking format with inference analysis, based on Sequence Graph protruding by an approach called Parallel Edge Projection and Pruning, referred to in brief as PEPP. We describe a novel inference analysis approach to prune patterns, which tends to derive coherent rules. A thorough evaluation on sparse and dense real-life data sets showed that PEPP with inference analysis outperforms older algorithms: it uses less memory and is faster than the algorithms frequently cited in the literature.

    On the Application of Data Mining to Official Data

    Retrieving valuable knowledge and statistical patterns from official data has a great potential in supporting strategic policy making. Data Mining (DM) techniques are well-known for providing flexible and efficient analytical tools for data processing. In this paper, we provide an introduction to applications of DM to official statistics and flag the important issues and challenges. Considering recent advancements in software projects for DM, we propose intelligent data control system design and specifications as an example of DM application in official data processing.
    Keywords: Data mining, Official data, Intelligent data control system

    Data Mining and Official Statistics: The Past, the Present and the Future

    With the increasing availability of large databases under the purview of National Statistical Institutes, the application of data mining techniques to official statistics is now a hot topic, far more important than ever before. This article presents a thorough review of the work published to date on the application of data mining in official statistics and identifies the techniques that have been explored. In addition, it flags the importance of data mining to official statistics and summarises the challenges that have hindered its development over the last two decades.

    Selecting inflation indicators under an inflation targeting regime: evidence from the MCL method

    This paper seeks to fill a gap in the literature by analyzing inflation in Poland, one of only two transition economies that have adopted a strict inflation-targeting policy. The paper also introduces a new method for selecting inflation indicators. Consistent with the earlier literature, empirical results find a strong link between the producer price index and consumer price index in Poland. This shows the importance of the manufacturing sector in determining the price level in the country. Overall, wages, broad money supply and the exchange rate are good indicators of inflation.
    Keywords: inflation; Poland; MCL method

    Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments

    Mobile computers can be equipped with wireless communication devices that enable users to access data services from any location. In wireless communication, the server-to-client (downlink) communication bandwidth is much higher than the client-to-server (uplink) communication bandwidth. This asymmetry makes the dissemination of data to client machines a desirable approach. However, dissemination of data by broadcasting may induce high access latency when the number of broadcast data items is large. In this paper, we propose two methods aiming to reduce client access latency of broadcast data. Our methods are based on analyzing the broadcast history (i.e., the chronological sequence of items that have been requested by clients) using data mining techniques. With the first method, the data items in the broadcast disk are organized in such a way that items requested one after another are placed close to each other. The second method focuses on improving the cache hit ratio in order to decrease the access latency. It enables clients to prefetch data from the broadcast disk based on rules extracted from previous data request patterns. The proposed methods are implemented on a Web log to estimate their effectiveness. Performance experiments show that the proposed rule-based methods are effective in improving system performance in terms of both the average latency and the cache hit ratio of mobile clients.

    Clustering of streaming time series is meaningless
