6 research outputs found

    SUNNY-CP and the MiniZinc Challenge

    Get PDF
    In Constraint Programming (CP) a portfolio solver combines a variety of different constraint solvers for solving a given problem. This fairly recent approach enables to significantly boost the performance of single solvers, especially when multicore architectures are exploited. In this work we give a brief overview of the portfolio solver sunny-cp, and we discuss its performance in the MiniZinc Challenge---the annual international competition for CP solvers---where it won two gold medals in 2015 and 2016. Under consideration in Theory and Practice of Logic Programming (TPLP)Comment: Under consideration in Theory and Practice of Logic Programming (TPLP

    SUNNY-CP: a Portfolio Solver for Constraint Programming

    Get PDF
    In Constraint Programming (CP) a portfolio solver combines a variety of different constraint solvers for solving a given problem. This fairly recent approach enables to significantly boost the performance of single solvers, especially when multicore architectures are exploited. In this work we give a brief overview of the portfolio solver sunny-cp, and we discuss its performance in the last MiniZinc Challenge —the annual international competition for CP solvers— where it won a gold medal

    Universal performance bounds of restart

    Full text link
    As has long been known to computer scientists, the performance of probabilistic algorithms characterized by relatively large runtime fluctuations can be improved by applying a restart, i.e., episodic interruption of a randomized computational procedure followed by initialization of its new statistically independent realization. A similar effect of restart-induced process acceleration could potentially be possible in the context of enzymatic reactions, where dissociation of the enzyme-substrate intermediate corresponds to restarting the catalytic step of the reaction. To date, a significant number of analytical results have been obtained in physics and computer science regarding the effect of restart on the completion time statistics in various model problems, however, the fundamental limits of restart efficiency remain unknown. Here we derive a range of universal statistical inequalities that offer constraints on the effect that restart could impose on the completion time of a generic stochastic process. The corresponding bounds are expressed via simple statistical metrics of the original process such as harmonic mean hh, median value mm and mode MM, and, thus, are remarkably practical. We test our analytical predictions with multiple numerical examples, discuss implications arising from them and important avenues of future work.Comment: 12 pages, 2 figure

    sunny-as2: Enhancing SUNNY for Algorithm Selection

    Get PDF
    SUNNY is an Algorithm Selection (AS) technique originally tailored for Constraint Programming (CP). SUNNY enables to schedule, from a portfolio of solvers, a subset of solvers to be run on a given CP problem. This approach has proved to be effective for CP problems, and its parallel version won many gold medals in the Open category of the MiniZinc Challenge -- the yearly international competition for CP solvers. In 2015, the ASlib benchmarks were released for comparing AS systems coming from disparate fields (e.g., ASP, QBF, and SAT) and SUNNY was extended to deal with generic AS problems. This led to the development of sunny-as2, an algorithm selector based on SUNNY for ASlib scenarios. A preliminary version of sunny-as2 was submitted to the Open Algorithm Selection Challenge (OASC) in 2017, where it turned out to be the best approach for the runtime minimization of decision problems. In this work, we present the technical advancements of sunny-as2, including: (i) wrapper-based feature selection; (ii) a training approach combining feature selection and neighbourhood size configuration; (iii) the application of nested cross-validation. We show how sunny-as2 performance varies depending on the considered AS scenarios, and we discuss its strengths and weaknesses. Finally, we also show how sunny-as2 improves on its preliminary version submitted to OASC

    On the enhancement of Big Data Pipelines through Data Preparation, Data Quality, and the distribution of Optimisation Problems

    Get PDF
    Nowadays, data are fundamental for companies, providing operational support by facilitating daily transactions. Data has also become the cornerstone of strategic decision-making processes in businesses. For this purpose, there are numerous techniques that allow to extract knowledge and value from data. For example, optimisation algorithms excel at supporting decision-making processes to improve the use of resources, time and costs in the organisation. In the current industrial context, organisations usually rely on business processes to orchestrate their daily activities while collecting large amounts of information from heterogeneous sources. Therefore, the support of Big Data technologies (which are based on distributed environments) is required given the volume, variety and speed of data. Then, in order to extract value from the data, a set of techniques or activities is applied in an orderly way and at different stages. This set of techniques or activities, which facilitate the acquisition, preparation, and analysis of data, is known in the literature as Big Data pipelines. In this thesis, the improvement of three stages of the Big Data pipelines is tackled: Data Preparation, Data Quality assessment, and Data Analysis. These improvements can be addressed from an individual perspective, by focussing on each stage, or from a more complex and global perspective, implying the coordination of these stages to create data workflows. The first stage to improve is the Data Preparation by supporting the preparation of data with complex structures (i.e., data with various levels of nested structures, such as arrays). Shortcomings have been found in the literature and current technologies for transforming complex data in a simple way. Therefore, this thesis aims to improve the Data Preparation stage through Domain-Specific Languages (DSLs). Specifically, two DSLs are proposed for different use cases. While one of them is a general-purpose Data Transformation language, the other is a DSL aimed at extracting event logs in a standard format for process mining algorithms. The second area for improvement is related to the assessment of Data Quality. Depending on the type of Data Analysis algorithm, poor-quality data can seriously skew the results. A clear example are optimisation algorithms. If the data are not sufficiently accurate and complete, the search space can be severely affected. Therefore, this thesis formulates a methodology for modelling Data Quality rules adjusted to the context of use, as well as a tool that facilitates the automation of their assessment. This allows to discard the data that do not meet the quality criteria defined by the organisation. In addition, the proposal includes a framework that helps to select actions to improve the usability of the data. The third and last proposal involves the Data Analysis stage. In this case, this thesis faces the challenge of supporting the use of optimisation problems in Big Data pipelines. There is a lack of methodological solutions that allow computing exhaustive optimisation problems in distributed environments (i.e., those optimisation problems that guarantee the finding of an optimal solution by exploring the whole search space). The resolution of this type of problem in the Big Data context is computationally complex, and can be NP-complete. This is caused by two different factors. On the one hand, the search space can increase significantly as the amount of data to be processed by the optimisation algorithms increases. This challenge is addressed through a technique to generate and group problems with distributed data. On the other hand, processing optimisation problems with complex models and large search spaces in distributed environments is not trivial. Therefore, a proposal is presented for a particular case in this type of scenario. As a result, this thesis develops methodologies that have been published in scientific journals and conferences.The methodologies have been implemented in software tools that are integrated with the Apache Spark data processing engine. The solutions have been validated through tests and use cases with real datasets
    corecore