28 research outputs found

    Regularized model learning in EDAs for continuous and multi-objective optimization

    Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs), and it determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with an appealing variable selection property that results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of the EDA scales logarithmically with the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization, and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired by multi-dimensional Bayesian network classifiers.
It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

    Global plant characterisation and distribution with evolution and climate

    Since Arrhenius published seminal work in 1921, research interest in the description of plant traits and grouped characteristics of plant species has grown, underpinning diversity in trophic levels. Geographic exploration and diversity studies prior to and after 1921 culminated in biological, chemical and computer-simulated approaches describing rudiments of growth patterns within dynamic conditions of Earth. This thesis has two parts: classical theory and multidisciplinary fusion to give mathematical strength to characterising plant species in space and time. Individual plant species occurrences are used to obtain a Species-Area Relationship. The use of both Boolean and logic-based mathematics is then integrated to describe classical methods and propose fuzzy logic control to predict species ordination. Having demonstrated a lack of significance between species and area for the data modelled in this thesis, a logic-based approach is taken. Mamdani and T-S-K fuzzy system stability is verified by application to individual plant occurrences, validated by a multiple interfaced data portal. Quantitative mathematical models are differentiated with a genetic programming approach, enabling visualisation of multi-objective dispersal of plant strategies, plant metabolism and life-forms within the water-energy dynamic of a fixed time-scale scenario. The distributions of plant characteristics are functionally enriched through the use of Gaussian process models. A generic framework of a Geographic Information System is used to visualise distributions, and it is noted that such systems can be used to assist in the design and implementation of policies. The study has made use of field-based data, and the application of mathematical methods is shown to be appropriate and generative in the description of characteristics of plant species, with the aim of applying plant strategies, life-forms and photosynthetic types to a global framework.
Novel application of fuzzy logic and related mathematical methods to plant distribution and characteristics has been shown on a global scale. Quantification of the uncertainty gives novel insight through consequent trophic levels of biological systems, with great relevance to mathematical and geographic subject development. The informative value of Z matrices of plant distribution is increased, substantiating the value of sustainability and conservation policy to ecosystems and the human populations dependent upon them for their needs. Key words: sustainability, conservation policy, Boolean and logic-based, fuzzy logic, genetic programming, multi-objective dispersal, strategies, metabolism, life-forms

    SIS 2017. Statistics and Data Science: new challenges, new generations

    The 2017 SIS Conference aims to highlight the crucial role of Statistics in Data Science. In this new domain of ‘meaning’ extracted from data, the increasing amount of data produced and available in databases has brought new challenges. These involve different fields: statistics, machine learning, information and computer science, optimization, and pattern recognition. Together, these fields contribute considerably to the analysis of ‘Big data’, open data, and relational and complex data, both structured and unstructured. The aim is to collect contributions from the different domains of Statistics on high-dimensional data quality validation, sample extraction, dimension reduction, pattern selection, data modelling, hypothesis testing, and confirming conclusions drawn from the data.

    Solving Multi-objective Integer Programs using Convex Preference Cones

    This survey has two objectives: first, to identify individuals who were victims of some type of crime and the manner in which it occurred; second, to measure the effectiveness of the various competent authorities once individuals reported the crime they suffered. In addition, the ENVEI seeks to investigate the perceptions that citizens have of the justice institutions and the rule of law in Mexico

    Fuzzy EOQ Model with Trapezoidal and Triangular Functions Using Partial Backorder

    The fuzzy EOQ model is an EOQ model that can estimate costs from existing information. Trapezoidal fuzzy functions can be used to estimate these costs, and a trapezoidal membership function has several points that carry a membership value. The TR̃C value obtained from the trapezoidal fuzzy model is higher than the usual TRC value of the EOQ model. This paper aims to determine the optimal amount of inventory in the company, namely the optimal Q and optimal V; using the partial backorder model yields the optimal Q and V, that is, the optimal number of units for each order. The effect of the EOQ model on inventory is examined closely using the fuzzy EOQ model with triangular and trapezoidal membership functions and partial backorder. The optimal Q and optimal V values of the fuzzy models increase because the trapezoidal and triangular membership functions take different values depending on the requirements of each membership function. A fuzzy model can therefore help the company estimate its costs for the next term

    Essentials of Business Analytics
