Search CORE

1,547 research outputs found

Revisiting the Sequential Symbolic Regression Genetic Programming

Author: Miranda Luis F.
Oliveira Luiz Otavio V.B.
Otero Fernando E.B.
Pappa Gisele L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/07/2016
Field of study

Sequential Symbolic Regression (SSR) is a technique that recursively induces functions over the error of the current solution, concatenating them in an attempt to reduce the error of the resulting model. As proof of concept, the method was previously evaluated in one-dimensional problems and compared with canonical Genetic Programming (GP) and Geometric Semantic Genetic Programming (GSGP). In this paper we revisit SSR exploring the method behaviour in higher dimensional, larger and more heterogeneous datasets. We discuss the difficulties arising from the application of the method to more complex problems, e.g., overfitting, along with suggestions to overcome them. An experimental analysis was conducted comparing SSR to GP and GSGP, showing SSR solutions are smaller than those generated by the GSGP with similar performance and more accurate than those generated by the canonical GP

Kent Academic Repository

Recommended from our members

State-of-the-art on research and applications of machine learning in the building life cycle

Author: Hong T
Luo X
Wang Z
Zhang W
Publication venue: eScholarship, University of California
Publication date: 01/04/2020
Field of study

Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building to be used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science

eScholarship - University of California

Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning

Author: Ao Xiang
He Jia
He Qing
Pan Feiyang
Tu Dandan
Xue Hongyan
Yu Shuo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/05/2023
Field of study

In the field of quantitative trading, it is common practice to transform raw historical stock data into indicative signals for the market trend. Such signals are called alpha factors. Alphas in formula forms are more interpretable and thus favored by practitioners concerned with risk. In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together. However, most traditional alpha generators mine alphas one by one separately, overlooking the fact that the alphas would be combined later. In this paper, we propose a new alpha-mining framework that prioritizes mining a synergistic set of alphas, i.e., it directly uses the performance of the downstream combination model to optimize the alpha generator. Our framework also leverages the strong exploratory capabilities of reinforcement learning~(RL) to better explore the vast search space of formulaic alphas. The contribution to the combination models' performance is assigned to be the return used in the RL process, driving the alpha generator to find better alphas that improve upon the current set. Experimental evaluations on real-world stock market data demonstrate both the effectiveness and the efficiency of our framework for stock trend forecasting. The investment simulation results show that our framework is able to achieve higher returns compared to previous approaches.Comment: Accepted by KDD '23, ADS trac

arXiv.org e-Print Archive

A Framework Based on Symbolic Regression Coupled with eXtended Physics-Informed Neural Networks for Gray-Box Learning of Equations of Motion from Data

Author: Karniadakis George Em
Karttunen Mikko
Kiyani Elham
Shukla Khemraj
Publication venue
Publication date: 18/05/2023
Field of study

We propose a framework and an algorithm to uncover the unknown parts of nonlinear equations directly from data. The framework is based on eXtended Physics-Informed Neural Networks (X-PINNs), domain decomposition in space-time, but we augment the original X-PINN method by imposing flux continuity across the domain interfaces. The well-known Allen-Cahn equation is used to demonstrate the approach. The Frobenius matrix norm is used to evaluate the accuracy of the X-PINN predictions and the results show excellent performance. In addition, symbolic regression is employed to determine the closed form of the unknown part of the equation from the data, and the results confirm the accuracy of the X-PINNs based approach. To test the framework in a situation resembling real-world data, random noise is added to the datasets to mimic scenarios such as the presence of thermal noise or instrument errors. The results show that the framework is stable against significant amount of noise. As the final part, we determine the minimal amount of data required for training the neural network. The framework is able to predict the correct form and coefficients of the underlying dynamical equation when at least 50\% data is used for training

arXiv.org e-Print Archive

An Integrated Low-Cost Agent-Based Environment for Simulating And Validating the Design of Interesting Behaviours for Robots

Author: Aslam Syed Kaleem
Publication venue
Publication date: 18/05/2022
Field of study

Bangor University Research Portal

ReFixar: Multi-version Reasoning for Automated Repair of Regression Errors

Author: Le QL
Le XBD
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/02/2022
Field of study

Software programs evolve naturally as part of the ever-changing customer needs and fast-paced market. Software evolution, however, often introduces regression bugs, which un-duly break previously working functionalities of the software. To repair regression bugs, one needs to know when and where a bug emerged from, e.g., the bug-inducing code changes, to narrow down the search space. Unfortunately, existing state-of-the-art automated program repair (APR) techniques have not yet fully exploited this information, rendering them less efficient and effective to navigate through a potentially large search space containing many plausible but incorrect solutions. In this work, we revisit APR on repairing regression errors in Java programs. We empirically show that existing state-of-the-art APR techniques do not perform well on regression bugs due to their algorithm design and lack of knowledge on bug inducing changes. We subsequently present ReFixar, a novel repair technique that leverages software evolution history to generate high quality patches for Java regression bugs. The key novelty that empowers ReFixar to more efficiently and effectively traverse the search space is two-fold: (1) A systematic way for multi-version reasoning to capture how a software evolves through its history, and (2) A novel search algorithm over a set of generic repair templates, derived from the principle of incorrectness logic and informed by both past bug fixes and their bug-inducing code changes; this enables ReFixar to achieve a balance of both genericity and specificity, i.e., generic common fix patterns of bugs and their specific contexts. We compare ReFixar against the state-of-the-art APR techniques on a data set of 51 real regression bugs from 28 large real-world programs. Experiments show that ReFixar significantly outperforms the best baseline by a large margin, i.e., ReFixar can fix correctly 24 bugs while the best baseline can only correctly fix 9 bugs

UCL Discovery

Systems for AutoML Research

Author: Gijsbers Pieter
Publication venue: Eindhoven University of Technology
Publication date: 19/05/2022
Field of study

Pure OAI Repository

Past Before Future: A Comprehensive Review on Software Defined Networks Road Map

Author: Windhya Rankothge
Publication venue: Global Journals Inc. (US)
Publication date: 15/01/2019
Field of study

Software Defined Networking (SDN) is a paradigm that moves out the network switch2019;s control plane (routing protocols) from the switch and leaves only the data plane (user traffic) inside the switch. Since the control plane has been decoupled from hardware and given to a logically centralized software application called a controller; network devices become simple packet forwarding devices that can be programmed via open interfaces. The SDN2019;s concepts: decoupled control logic and programmable networks provide a range of benefits for management process and has gained significant attention from both academia and industry. Since the SDN field is growing very fast, it is an active research area. This review paper discusses the state of art in SDN, with a historic perspective of the field by describing the SDN paradigm, architecture and deployments in detail

Global Journal of Computer Science and Technology (GJCST)

Discovering Causal Relations and Equations from Data

Author: Balaguer-Ballester Emili
Camps-Valls Gustau
Diaz Emiliano
Gerhardus Andreas
Martius Georg
Ninad Urmi
Runge Jakob
Varando Gherardo
Vinuesa Ricardo
Zanna Laure
Publication venue
Publication date: 21/05/2023
Field of study

Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.Comment: 137 page

arXiv.org e-Print Archive