    Experimental Study on 164 Algorithms Available in Software Tools for Solving Standard Non-Linear Regression Problems

    In the specialized literature, researchers can find a large number of proposals for solving regression problems that come from different research areas. However, researchers tend to use only proposals from the area in which they are experts. This paper analyses the performance of a large number of the available regression algorithms from some of the best-known and most widely used software tools, in order to help non-expert users from other areas properly solve their own regression problems, and to help specialized researchers develop well-founded future proposals by properly comparing and identifying algorithms that will enable them to focus on significant further developments. To sum up, we have analyzed 164 algorithms that come from 14 main families (Neural Networks, Support Vector Machines, Regression Trees, Rule-Based Methods, Stacking, Random Forests, Model Trees, Generalized Linear Models, Nearest Neighbor methods, Partial Least Squares and Principal Component Regression, Multivariate Adaptive Regression Splines, Bagging, Boosting, and other methods), available in 6 software tools, over 52 datasets. A new measure has also been proposed to show the goodness of each algorithm with respect to the others. Finally, a statistical analysis by non-parametric tests has been carried out over all the algorithms and on the best 30 algorithms, both with and without bagging. Results show that the algorithms from the Random Forest, Model Tree and Support Vector Machine families get the best positions in the rankings obtained by the statistical tests when bagging is not considered. In addition, the use of bagging techniques significantly improves the performance of the algorithms without an excessive increase in computational time. This work was supported in part by the University of Córdoba under the project PPG2019-UCOSOCIAL-03, and in part by the Spanish Ministry of Science, Innovation and Universities under Grant TIN2015-68454-R and Grant TIN2017-89517-P.
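
The abstract above compares algorithms through rankings from non-parametric tests but does not name the tests or define its new measure. As a rough illustration only, the sketch below computes the per-dataset average ranks that rank-based tests such as the Friedman test start from. The function name `average_ranks`, the input layout, and the error values are assumptions made for this example; this is not the paper's proposed measure.

```python
from typing import Dict, List


def average_ranks(errors: Dict[str, List[float]]) -> Dict[str, float]:
    """Average rank of each algorithm across datasets (rank 1 = lowest error).

    errors[name][j] is the error of algorithm `name` on dataset j.
    Tied algorithms on a dataset share the mean of the ranks they occupy.
    """
    names = list(errors)
    n_datasets = len(next(iter(errors.values())))
    totals = {name: 0.0 for name in names}

    for j in range(n_datasets):
        # Sort algorithms by their error on dataset j, best first.
        ordered = sorted(names, key=lambda n: errors[n][j])
        i = 0
        while i < len(ordered):
            # Extend k to cover every algorithm tied with ordered[i].
            k = i
            while (k + 1 < len(ordered)
                   and errors[ordered[k + 1]][j] == errors[ordered[i]][j]):
                k += 1
            shared_rank = ((i + 1) + (k + 1)) / 2.0  # mean of ranks i+1..k+1
            for name in ordered[i:k + 1]:
                totals[name] += shared_rank
            i = k + 1

    return {name: totals[name] / n_datasets for name in names}


# Made-up errors for three hypothetical algorithms on three datasets.
example_errors = {
    "rf": [0.10, 0.20, 0.15],
    "svm": [0.12, 0.18, 0.15],
    "tree": [0.30, 0.25, 0.40],
}
ranks = average_ranks(example_errors)
```

A lower average rank means the algorithm tended to do better across datasets; a significance test would then be applied to these ranks before drawing conclusions.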

    Data-driven design of intelligent wireless networks: an overview and tutorial

    Data science or "data-driven research" is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small, simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware-specific influences. These interactions can lead to a difference between real-world functioning and design-time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader with the necessary datasets and scripts to go through the tutorial steps themselves.

    Data Driven Surrogate Based Optimization in the Problem Solving Environment WBCSim

    Large scale, multidisciplinary, engineering designs are always difficult due to the complexity and dimensionality of these problems. Direct coupling between the analysis codes and the optimization routines can be prohibitively time consuming due to the complexity of the underlying simulation codes. One way of tackling this problem is by constructing computationally cheap(er) approximations of the expensive simulations that mimic the behavior of the simulation model as closely as possible. This paper presents a data driven, surrogate based optimization algorithm that uses a trust region based sequential approximate optimization (SAO) framework and a statistical sampling approach based on design of experiment (DOE) arrays. The algorithm is implemented using techniques from two packages, SURFPACK and SHEPPACK, that provide a collection of approximation algorithms to build the surrogates. Three different DOE techniques, full factorial (FF), Latin hypercube sampling (LHS), and central composite design (CCD), are used to train the surrogates. The results are compared with the optimization results obtained by directly coupling an optimizer with the simulation code. The biggest concern in using the SAO framework based on statistical sampling is the generation of the required database. As the number of design variables grows, the computational cost of generating the required database grows rapidly. A data driven approach is proposed to tackle this situation, where the trick is to run the expensive simulation if and only if a nearby data point does not exist in the cumulatively growing database. Over time the database matures and is enriched as more and more optimizations are performed. Results show that the proposed methodology dramatically reduces the total number of calls to the expensive simulation runs during the optimization process.
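
The rule stated in the abstract above, running the expensive simulation only when no nearby point already exists in a growing database, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the WBCSim implementation: the class name `SimulationDatabase`, the Euclidean distance, the fixed tolerance `tol`, and the linear scan are all choices made for this example, and the abstract does not say how "nearby" is defined.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


class SimulationDatabase:
    """Cumulative store of (design point, response) pairs.

    Hypothetical sketch: 'nearby' is assumed to mean within Euclidean
    distance `tol`; the source abstract does not define it.
    """

    def __init__(self, simulate: Callable[[Point], float], tol: float) -> None:
        self.simulate = simulate  # stand-in for the expensive simulation
        self.tol = tol            # assumed neighbourhood radius
        self.records: List[Tuple[Point, float]] = []
        self.calls = 0            # number of real simulation runs so far

    def evaluate(self, x: Sequence[float]) -> float:
        point = tuple(float(v) for v in x)

        # Look for the closest stored point within the tolerance.
        best_dist = None
        best_value = None
        for stored, value in self.records:
            d = math.dist(stored, point)
            if d <= self.tol and (best_dist is None or d < best_dist):
                best_dist, best_value = d, value

        if best_value is not None:
            return best_value     # reuse the stored response, no simulation

        # No nearby point: run the simulation and grow the database.
        self.calls += 1
        value = self.simulate(point)
        self.records.append((point, value))
        return value


# Cheap quadratic used here in place of an expensive simulation code.
db = SimulationDatabase(lambda p: sum(v * v for v in p), tol=0.1)
first = db.evaluate([1.0, 2.0])    # runs the simulation
second = db.evaluate([1.05, 2.0])  # within tol of a stored point: reused
third = db.evaluate([2.0, 2.0])    # far from stored points: runs again
```

Because the database persists across optimizations, later runs hit stored points more often, which is the effect the abstract describes as the database maturing. A spatial index would replace the linear scan for large databases.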

    VI Workshop on Computational Data Analysis and Numerical Methods: Book of Abstracts

    The VI Workshop on Computational Data Analysis and Numerical Methods (WCDANM) is going to be held on June 27-29, 2019, in the Department of Mathematics of the University of Beira Interior (UBI), Covilhã, Portugal. It is a unique opportunity to disseminate scientific research related to the areas of Mathematics in general, with particular relevance to the areas of Computational Data Analysis and Numerical Methods in theoretical and/or practical fields, using new techniques and giving special emphasis to applications in Medicine, Biology, Biotechnology, Engineering, Industry, Environmental Sciences, Finance, Insurance, Management and Administration. The meeting will provide a forum for discussion and debate of ideas of interest to the scientific community in general. New scientific collaborations among colleagues, namely new collaborations in Masters and PhD projects, are expected to arise from this meeting. The event is open to the entire scientific community (with or without communication/poster).