
6,894 research outputs found

Population Subset Selection for the Use of a Validation Dataset for Overfitting Control in Genetic Programming

Author: Fernández-Blanco Enrique
Fernández-Lozano Carlos
Pazos A.
Rivero Daniel
Publication venue: 'Informa UK Limited'
Publication date: 31/07/2019

Genetic Programming (GP) is a technique that can solve different problems through the evolution of mathematical expressions. However, its tendency to overfit the data is one of the main obstacles to applying it. The use of a validation dataset is a common way to prevent overfitting in many Machine Learning (ML) techniques, including GP. There is, however, one key point that differentiates GP from other ML techniques: instead of training a single model, GP evolves a population of models. The validation dataset can therefore be used in several ways, because any of those evolved models could be evaluated on it. This work explores the possibility of using the validation dataset not only on the training-best individual but also on a subset made up of the training-best individuals of the population. The study was conducted with 5 well-known databases on regression or classification tasks. In most cases, the results point to an improvement when the validation dataset is used on a subset of the population instead of only on the training-best individual. This also reduces the number of nodes and, consequently, the complexity of the expressions.
Funding: Xunta de Galicia (ED431G/01, ED431D 2017/16, ED431C 2018/49, ED431D 2017/23); Instituto de Salud Carlos III (PI17/0182)
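
The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_by_validation`, the subset size `k`, and the toy one-parameter models are all assumptions made for illustration, and the abstract does not say how the subset size is chosen or at which point of the evolution the validation check is applied.

```python
def select_by_validation(population, train_error, val_error, k):
    """Pick the model with the lowest validation error among the k
    models that have the lowest training error (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Rank the whole population by training error and keep the k best.
    training_best = sorted(population, key=train_error)[:k]
    # Among those candidates, the validation dataset decides.
    return min(training_best, key=val_error)


def make_error(data):
    """Mean squared error of the toy model y = a * x on the given data."""
    def error(a):
        return sum((a * x - y) ** 2 for x, y in data) / len(data)
    return error


# Toy data (assumed): noisy training points, clean validation points.
train = [(1.0, 2.1), (2.0, 3.9)]
val = [(3.0, 6.0), (4.0, 8.0)]

# Each "individual" is just a slope a; a real GP individual is an expression tree.
population = [1.9, 1.97, 2.0, 2.5, 3.0]
train_error = make_error(train)
val_error = make_error(val)

# k=1 reproduces the usual scheme: only the training-best model is considered.
best_k1 = select_by_validation(population, train_error, val_error, k=1)
# k=3 lets validation choose among the three training-best models.
best_k3 = select_by_validation(population, train_error, val_error, k=3)
print(best_k1, best_k3)
```

With `k=1` the sketch returns the model that fits the noisy training points most closely, while `k=3` allows a slightly worse training fit to win because it generalizes better to the validation points.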

Repositorio da Universidade da Coruña

Design Optimization Utilizing Dynamic Substructuring and Artificial Intelligence Techniques

Author: Akcay Perdahcioglu D.
Boer A. de
Ellenbroek M.H.M.
Hoogt P.J.M. van der
Publication date: 01/01/2008

In mechanical and structural systems, resonance may cause large strains and stresses which can lead to the failure of the system. Since it is often not possible to change the frequency content of the external load excitation, the phenomenon can only be avoided by updating the design of the structure. In this paper, a design optimization strategy based on the integration of the Component Mode Synthesis (CMS) method with numerical optimization techniques is presented. For reasons of numerical efficiency, a Finite Element (FE) model is represented by a surrogate model which is a function of the design parameters. The surrogate model is obtained in four steps. First, the reduced FE models of the components are derived using the CMS method. Then the components are assembled to obtain the entire structural response. Afterwards, the dynamic behavior is determined for a number of design parameter settings. Finally, the surrogate model representing the dynamic behavior is obtained. In this research, the surrogate model is built with Backpropagation Neural Networks and is then optimized using Genetic Algorithms and the Sequential Quadratic Programming method. The application of the introduced techniques is demonstrated on a simple test problem.

University of Twente Research Information

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California


Local search: A guide for the information retrieval practitioner

Author: Andrew MacFarlane
Andrew Tuson
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

There are a number of combinatorial optimisation problems in information retrieval for which the use of local search methods is worthwhile. The purpose of this paper is to show how local search can be used to solve some well-known tasks in information retrieval (IR), to show how previous research in the field is piecemeal, bereft of a structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to solve IR problems. We provide a query-based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide to the pitfalls and problems for IR practitioners who wish to use local search to solve their research issues, and gives practical advice on the use of such methods. The query-based taxonomy is a novel structure which the IR practitioner can use to examine the use of local search in IR.
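
For readers unfamiliar with the term, a generic local search can be sketched as plain hill climbing over a binary selection vector. This is only an illustration of the general technique and not a method taken from the paper: the function name `hill_climb`, the one-bit-flip neighbourhood, and the toy fitness function with made-up term weights are all assumptions.

```python
def hill_climb(start, fitness, max_steps=100):
    """Steepest-ascent hill climbing over tuples of 0/1 values.

    At each step, every one-bit-flip neighbour is scored and the best
    one is taken if it strictly improves on the current solution.
    """
    current = tuple(start)
    current_score = fitness(current)
    for _ in range(max_steps):
        # Neighbourhood: all solutions that differ from current in one position.
        neighbours = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(neighbours, key=fitness)
        best_score = fitness(best)
        if best_score <= current_score:
            break  # local optimum: no neighbour is strictly better
        current, current_score = best, best_score
    return current, current_score


# Hypothetical per-term contributions to some retrieval quality score.
weights = [3, -1, 2, -4, 1]


def toy_fitness(selection):
    """Sum the weights of the selected terms (stand-in for a real IR metric)."""
    return sum(w for w, bit in zip(weights, selection) if bit)


solution, score = hill_climb([0, 0, 0, 0, 0], toy_fitness)
print(solution, score)
```

In a real experiment the fitness function would be an evaluation measure computed on a test collection, which is where the issues of statistical significance mentioned in the abstract arise.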

City Research Online

Crossref

Machine Learning for Fluid Mechanics

Author: Brunton Steven
Koumoutsakos Petros
Noack Bernd
Publication venue: 'Annual Reviews'
Publication date: 04/01/2020

The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of the history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications.
Comment: To appear in the Annual Reviews of Fluid Mechanics, 202

arXiv.org e-Print Archive