
    Extreme Data Mining: Inference from Small Datasets

    Neural networks have been applied successfully in many fields. However, satisfactory results are obtained only under large-sample conditions. With small training sets, performance may be poor, or the learning task may not be accomplished at all. This deficiency severely limits the applications of neural networks. The main reason small datasets cannot provide enough information is that gaps exist between samples, and even the domain of the samples cannot be ensured. Several computational intelligence techniques have been proposed to overcome the limits of learning from small datasets. We have the following goals: i. To discuss the meaning of small in the context of inferring from small datasets. ii. To overview computational intelligence solutions for this problem. iii. To illustrate the introduced concepts with a real-life application.

    Meta Analysis of Empirical Deterrence Studies: an explorative contest

    A sample of 200 studies empirically analyzing deterrence in some way is evaluated. Various methods of data mining (stepwise regressions, Extreme Bounds Analysis, Bayesian Model Averaging, manual and naive selections) are used to explore the influence of various variables on the results of each study. The preliminary results of these methods are tested against each other in a competition of methodology to evaluate their performance in forecasting and fitting the data and to conclude which methods should be favored in an upcoming extensive meta-analysis. It appears that restrictive methods (which select fewer variables) are to be preferred when predicting data ex ante, and less parsimonious methods (which select more variables) when data has to be fitted ex post. In the former case forward stepwise regression or Bayesian Model Selection performs very well, whereas backward stepwise regression and Extreme Bounds Analysis are to be preferred in the latter case.
    Keywords: meta analysis, data mining, deterrence, criminometrics

    The miracle of the Septuagint and the promise of data mining in economics

    This paper argues that the sometimes-conflicting results of a modern revisionist literature on data mining in econometrics reflect different approaches to solving the central problem of model uncertainty in a science of non-experimental data. The literature has entered an exciting phase, with theoretical development, methodological reflection, considerable technological strides on the computing front and interesting empirical applications providing momentum for this branch of econometrics. The organising principle for this discussion of data mining is a philosophical spectrum that sorts the various econometric traditions according to their epistemological assumptions about the underlying data-generating process (DGP), starting with nihilism at one end and reaching claims of encompassing the DGP at the other end; call it the DGP-spectrum. In the course of exploring this spectrum the reader will encounter various Bayesian, specific-to-general (S-G) as well as general-to-specific (G-S) methods. To set the stage for this exploration the paper starts with a description of data mining, its potential risks and a short section on potential institutional safeguards to these problems.
    Keywords: data mining, model selection, automated model selection, general to specific modelling, extreme bounds analysis, Bayesian model selection

    Astroinformatics, data mining and the future of astronomical research

    Astronomy, like many other scientific disciplines, is facing a true data deluge which is bound to change both the praxis and the methodology of everyday research work. The emerging field of astroinformatics, while on the one hand appearing crucial to face the technological challenges, is on the other hand opening exciting perspectives for new astronomical discoveries through the implementation of advanced data mining procedures. The complexity of astronomical data and the variety of scientific problems, however, call for innovative algorithms and methods as well as for an extreme usage of ICT technologies.
    Comment: To appear in the Proceedings of the 2-nd International Conference on Frontiers on diagnostic technologie

    Zugzwangs in chess studies

    Van der Heijden’s ENDGAME STUDY DATABASE IV, HHDBIV, is the definitive collection of 76,132 chess studies. The zugzwang position or zug, one in which the side to move would prefer not to, is a frequent theme in the literature of chess studies. In this third data-mining of HHDBIV, we report on the occurrence of sub-7-man zugs there as discovered by the use of CQL and Nalimov endgame tables (EGTs). We also mine those Zugzwang Studies in which a zug more significantly appears in both its White-to-move (wtm) and Black-to-move (btm) forms. We provide some illustrative and extreme examples of zugzwangs in studies.

    Post-processing of association rules.

    In this paper, we situate and motivate the need for a post-processing phase to the association rule mining algorithm when plugged into the knowledge discovery in databases process. Major research effort has already been devoted to optimising the initially proposed mining algorithms. When it comes to effectively extrapolating the most interesting knowledge nuggets from the standard output of these algorithms, one is faced with an extreme challenge, since it is not uncommon to be confronted with a vast amount of association rules after running the algorithms. The sheer multitude of generated rules often clouds the perception of the interpreters. Rightful assessment of the usefulness of the generated output introduces the need to effectively deal with different forms of data redundancy and data being plainly uninteresting. In order to do so, we will give a tentative overview of some of the main post-processing tasks, taking into account the efforts that have already been reported in the literature.
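
    The kind of post-processing this abstract describes can be illustrated with a minimal sketch. It is not the authors' method: the `Rule` type, the `prune_rules` helper, the lift threshold, and the toy rules are all hypothetical, and the redundancy test used here (a rule is dropped when a more general rule with the same consequent has at least the same confidence) is only one common choice among several.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """A mined association rule (hypothetical structure for illustration)."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    confidence: float
    lift: float


def prune_rules(rules: List[Rule], min_lift: float = 1.0) -> List[Rule]:
    """Drop uninteresting rules, then drop redundant ones."""
    # Step 1: a lift at or below the threshold suggests the rule is uninteresting.
    interesting = [r for r in rules if r.lift > min_lift]

    # Step 2: a rule is treated as redundant when a strictly more general rule
    # (proper-subset antecedent, same consequent) is at least as confident.
    kept = []
    for r in interesting:
        redundant = any(
            o.consequent == r.consequent
            and o.antecedent < r.antecedent
            and o.confidence >= r.confidence
            for o in interesting
        )
        if not redundant:
            kept.append(r)
    return kept


# Toy input, invented for this example.
rules = [
    Rule(frozenset({"bread"}), frozenset({"butter"}), 0.80, 1.6),
    Rule(frozenset({"bread", "milk"}), frozenset({"butter"}), 0.78, 1.5),
    Rule(frozenset({"milk"}), frozenset({"eggs"}), 0.30, 0.9),
]
pruned = prune_rules(rules)
for r in pruned:
    print(sorted(r.antecedent), "->", sorted(r.consequent))
```

    In this toy run the second rule is discarded because the more general first rule already covers it, and the third is discarded for its low lift.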

    Data Mining and Life Science: A Survey

    As we move into the age of digital information, the problem of data overload looms worryingly ahead. Our ability to analyze and understand immense datasets lags far behind our ability to gather and store the data. A new generation of computational techniques and tools is required to support the extraction of useful knowledge from the rapidly increasing volumes of data. These techniques and tools are the focus of the emerging field of Knowledge Discovery in Databases (KDD), also called data mining. Data mining is highly visible in fields such as marketing, e-commerce and e-business, and its use is spreading to other sectors and industries as well. Among the sectors that are just discovering data mining are medicine and public health. This research paper provides a survey of current data mining/KDD techniques for healthcare.