11 research outputs found

    Analyzing supersaturated designs by means of an information based criterion

    Neural networks for prediction of trauma victim's outcome. Comparison with the TRISS and Revised Trauma Score

    Analyzing supersaturated designs with entropic measures

    A supersaturated design is a design with fewer runs than effects to be estimated. In this paper, we propose a method, based on an information-theoretic approach, for screening out the important factors from a large set of potentially active variables. Three entropy measures (Rényi entropy, Tsallis entropy and Havrda–Charvát entropy) are combined with the information gain measure to identify the significant factors from data under generalized linear models. The performance of the proposed method, and a comparison of the three entropy measures, are assessed through simulation experiments. A noteworthy contribution of this paper is the use of generalized linear models for analyzing data from supersaturated designs, which, to the best of our knowledge, has not yet been studied.
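
For readers unfamiliar with the three entropy measures named in this abstract, the sketch below computes each one for a discrete probability distribution using their standard textbook definitions. It is only an illustration: the abstract does not describe how the paper combines these entropies with information gain or with generalized linear models, so none of that is reproduced here, and the function names are hypothetical.

```python
import math


def _validate(p, order):
    """Check that p is a probability vector and the order is admissible."""
    if any(x < 0 for x in p) or not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("p must be non-negative and sum to 1")
    if order <= 0 or order == 1:
        raise ValueError("order must be positive and different from 1")


def _power_sum(p, order):
    """Sum of p_i ** order over the outcomes with positive probability."""
    return sum(x ** order for x in p if x > 0)


def renyi_entropy(p, alpha):
    """Renyi entropy: log(sum p_i^alpha) / (1 - alpha), natural logarithm."""
    _validate(p, alpha)
    return math.log(_power_sum(p, alpha)) / (1 - alpha)


def tsallis_entropy(p, q):
    """Tsallis entropy: (1 - sum p_i^q) / (q - 1)."""
    _validate(p, q)
    return (1 - _power_sum(p, q)) / (q - 1)


def havrda_charvat_entropy(p, alpha):
    """Havrda-Charvat entropy: (sum p_i^alpha - 1) / (2^(1 - alpha) - 1)."""
    _validate(p, alpha)
    return (_power_sum(p, alpha) - 1) / (2 ** (1 - alpha) - 1)


if __name__ == "__main__":
    fair_coin = [0.5, 0.5]
    print(renyi_entropy(fair_coin, 2))
    print(tsallis_entropy(fair_coin, 2))
    print(havrda_charvat_entropy(fair_coin, 2))
```

The Tsallis and Havrda–Charvát forms differ only in their normalizing constant; the latter is scaled so that a fair coin has entropy 1. All three tend to the Shannon entropy (in the appropriate units) as the order approaches 1, which is why the order 1 itself is excluded.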

    Supersaturated plans for variable selection in large databases

    Over the last decades, the collection and storage of data has become massive with the advance of technology, and variable selection has become a fundamental tool for high-dimensional statistical modelling problems. In this study we combine data mining techniques, metaheuristics and experimental designs to determine the most relevant variables for classification in regression problems when the observations and labels of a large database are available. We propose a database-driven scheme for the encryption of specific fields of a database in order to select an optimal supersaturated design consisting of those variables of a large database that have been found to influence the response significantly. The proposed design selection approach is promising, since it retrieves an optimal supersaturated plan using a very small percentage of the available runs, which makes the statistical analysis of a large database computationally feasible and affordable.