69 research outputs found

    The need for open source software in machine learning

    Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not used, since existing implementations are not openly shared, resulting in software with low usability and weak interoperability. We argue that this situation can be significantly improved by increasing incentives for researchers to publish their software under an open source model. Additionally, we outline the problems authors are faced with when trying to publish algorithmic implementations of machine learning methods. We believe that a resource of peer-reviewed software accompanied by short articles would be highly valuable to both the machine learning and the general scientific community.

    Replicability is not Reproducibility: Nor is it Good Science

    At various machine learning conferences, at various times, there have been discussions arising from the inability to replicate the experimental results published in a paper. There seems to be a widespread view that we need to do something to address this problem, as it is essential to the advancement of our field. The most compelling argument would seem to be that reproducibility of experimental results is the hallmark of science. Therefore, given that most of us regard machine learning as a scientific discipline, being able to replicate experiments is paramount. I want to challenge this view by separating the notion of reproducibility, a generally desirable property, from replicability, its poor cousin. I claim there are important differences between the two. Reproducibility requires changes; replicability avoids them. Although reproducibility is desirable, I contend that the impoverished version, replicability, is one not worth having.

    OpenML: networked science in machine learning

    Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems. We discuss how OpenML relates to other examples of networked science and what benefits it brings for machine learning research, individual scientists, as well as students and practitioners. Comment: 12 pages, 10 figures

    The Importance of Reproduction in Evidence Based Policing: A Comment

    In the following comment, the author examines the importance of reproduction in evidence-based policing. Moreover, she argues that failure to reproduce studies, including through the use of varied methodologies, is antithetical to the development of a solid evidence base upon which to ground effective and efficient community safety practices.

    JCLAL: A Java framework for active learning

    Active learning has become an important area of research owing to the increasing number of real-world problems which contain labelled and unlabelled examples at the same time. JCLAL is a Java Class Library for Active Learning which has an architecture that follows strong principles of object-oriented design. It is easy to use, and it allows developers to adapt, modify and extend the framework according to their needs. The library offers a variety of active learning methods that have been proposed in the literature. The software is available under the GPL license.

    PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization

    In this paper, we present a pure-Python open-source library, called PyPop7, for black-box optimization (BBO). It provides a unified and modular interface for more than 60 versions and variants of different black-box optimization algorithms, particularly population-based optimizers, which can be classified into 12 popular families: Evolution Strategies (ES), Natural Evolution Strategies (NES), Estimation of Distribution Algorithms (EDA), Cross-Entropy Method (CEM), Differential Evolution (DE), Particle Swarm Optimizer (PSO), Cooperative Coevolution (CC), Simulated Annealing (SA), Genetic Algorithms (GA), Evolutionary Programming (EP), Pattern Search (PS), and Random Search (RS). It also provides many examples, interesting tutorials, and full-fledged API documentation. Through this new library, we expect to provide a well-designed platform for benchmarking of optimizers and promote their real-world applications, especially for large-scale BBO. Its source code and documentation are available at https://github.com/Evolutionary-Intelligence/pypop and https://pypop.readthedocs.io/en/latest, respectively. Comment: 5 pages

    Free and Open Source Software underpinning the European Forest Data Centre

    Worldwide, governments are increasingly focusing on free and open source software (FOSS) as a move toward transparency and the freedom to run, copy, study, change and improve the software. The European Commission (EC) is also supporting the development of FOSS [...]. In addition to the financial savings, FOSS contributes to scientific knowledge freedom in computational science (CS) and is increasingly rewarded in the science-policy interface within the emerging paradigm of open science. Since complex computational science applications may be affected by software uncertainty, FOSS may help to mitigate part of the impact of software errors through CS community-driven open review, correction and evolution of scientific code. The continental scale of EC science-based policy support implies wide networks of scientific collaboration. Thematic information systems may also benefit from this approach within reproducible integrated modelling. This is supported by the EC strategy on FOSS: "for the development of new information systems, where deployment is foreseen by parties outside of the EC infrastructure, [F]OSS will be the preferred choice and in any case used whenever possible". The aim of this contribution is to highlight how a continental scale information system may exploit and integrate FOSS technologies within the transdisciplinary research underpinning such a complex system. A European example is discussed where FOSS innervates both the structure of the information system itself and the inherent transdisciplinary research for modelling the data and information which constitute the system content. [...]