54 research outputs found

    Human-in-the-Loop AI Reviewing: Feasibility, Opportunities, and Risks

    Get PDF
    The promise of AI for academic work is bewitching and easy to envisage, but the risks involved are often hard to detect and usually not readily exposed. In this opinion piece, we explore the feasibility, opportunities, and risks of using large language models (LLMs) for reviewing academic submissions, while keeping the human in the loop. We experiment with GPT-4 in the role of a reviewer to demonstrate the opportunities and the risks we experience and ways to mitigate them. The reviews are structured according to a conference review form with the dual purpose of evaluating submissions for editorial decisions and providing authors with constructive feedback according to predefined criteria, which include contribution, soundness, and presentation. We demonstrate feasibility by evaluating and comparing LLM reviews with human reviews, concluding that current AI-augmented reviewing is sufficiently accurate to alleviate the burden of reviewing but not completely and not for all cases. We then enumerate the opportunities of AI-augmented reviewing and present open questions. Next, we identify the risks of AI-augmented reviewing, highlighting bias, value misalignment, and misuse. We conclude with recommendations for managing these risks

    Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar

    Get PDF
    Automatic machine learning is an important problem in the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks. In the spirit of reproducible research we make our data, models, and code publicly available.Comment: ICML Workshop on Automated Machine Learnin

    DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering

    Full text link
    Automatic machine learning (AutoML) is an area of research aimed at automating machine learning (ML) activities that currently require human experts. One of the most challenging tasks in this field is the automatic generation of end-to-end ML pipelines: combining multiple types of ML algorithms into a single architecture used for end-to-end analysis of previously-unseen data. This task has two challenging aspects: the first is the need to explore a large search space of algorithms and pipeline architectures. The second challenge is the computational cost of training and evaluating multiple pipelines. In this study we present DeepLine, a reinforcement learning based approach for automatic pipeline generation. Our proposed approach utilizes an efficient representation of the search space and leverages past knowledge gained from previously-analyzed datasets to make the problem more tractable. Additionally, we propose a novel hierarchical-actions algorithm that serves as a plugin, mediating the environment-agent interaction in deep reinforcement learning problems. The plugin significantly speeds up the training process of our model. Evaluation on 56 datasets shows that DeepLine outperforms state-of-the-art approaches both in accuracy and in computational cost

    Multiscale Representations for Manifold-Valued Data

    Get PDF
    We describe multiscale representations for data observed on equispaced grids and taking values in manifolds such as the sphere S2S^2, the special orthogonal group SO(3)SO(3), the positive definite matrices SPD(n)SPD(n), and the Grassmann manifolds G(n,k)G(n,k). The representations are based on the deployment of Deslauriers--Dubuc and average-interpolating pyramids "in the tangent plane" of such manifolds, using the ExpExp and LogLog maps of those manifolds. The representations provide "wavelet coefficients" which can be thresholded, quantized, and scaled in much the same way as traditional wavelet coefficients. Tasks such as compression, noise removal, contrast enhancement, and stochastic simulation are facilitated by this representation. The approach applies to general manifolds but is particularly suited to the manifolds we consider, i.e., Riemannian symmetric spaces, such as Sn−1S^{n-1}, SO(n)SO(n), G(n,k)G(n,k), where the ExpExp and LogLog maps are effectively computable. Applications to manifold-valued data sources of a geometric nature (motion, orientation, diffusion) seem particularly immediate. A software toolbox, SymmLab, can reproduce the results discussed in this paper
    • …
    corecore