Active learning for approximation of expensive functions with normal distributed output uncertainty

Author: Couckuyt Ivo
Deschrijver Dirk
Dhaene Tom
van der Herten Joachim
Publication date: 01/01/2016

When approximating a black-box function, active-learning sampling that focuses on regions with non-linear responses tends to improve accuracy. We present the FLOLA-Voronoi method, introduced previously for deterministic responses, and theoretically derive the impact of output uncertainty. The algorithm automatically puts more emphasis on exploration to provide more information to the models.

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Active Learning and Best-Response Dynamics

Author: Balcan Maria-Florina
Berlind Chris
Blum Avrim
Cohen Emma
Patnaik Kaushik
Song Le
Publication date: 25/06/2014

We examine an important setting for engineered systems in which low-power distributed sensors are each making highly noisy measurements of some unknown target function. A center wants to accurately learn this function by querying a small number of sensors, which ordinarily would be impossible due to the high noise rate. The question we address is whether local communication among sensors, together with natural best-response dynamics in an appropriately defined game, can denoise the system without destroying the true signal and allow the center to succeed from only a small number of active queries. By using techniques from game theory and empirical processes, we prove positive (and negative) results on the denoising power of several natural dynamics. We then show experimentally that, when combined with recent agnostic active learning algorithms, this process can achieve low error from very few queries, performing substantially better than active or passive learning without these denoising dynamics, as well as passive learning with denoising.

arXiv.org e-Print Archive

CiteSeerX

Collaborative analysis of multi-gigapixel imaging data using Cytomine

Author: Benjamin Stévens
Carpenter
de Souza
Gilles Louppe
Jean-Michel Begon
Leroi
Louis Wehenkel
Loïc Rollus
Marée
Philipp Kainz
Pierre Geurts
Raphaël Marée
Renaud Hoyoux
Rémy Vandaele
Weekers
Publication venue: Oxford University Press (OUP)
Publication date: 10/01/2016

Motivation: Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries. Results: We developed Cytomine to foster active and distributed collaboration of multidisciplinary teams for large-scale image-based studies. It uses web development methodologies and machine learning to readily organize, explore, share and analyze (semantically and quantitatively) multi-gigapixel imaging data over the internet. We illustrate how it has been used in several biomedical applications.

Central Archive at the University of Reading

Crossref

PubMed Central

Open Repository and Bibliography - Liège

Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

Author: Pardede Eric
Pratama Mahardhika
Za'in Choiru
Publication date: 18/07/2018

Many distributed machine learning frameworks have recently been built to speed up the large-scale data learning process. However, most distributed machine learning in these frameworks still relies on an offline algorithm model, which cannot cope with data stream problems. In fact, large-scale data are mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel Evolving Large-scale Data Stream Analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn large-scale data streams. The Scalable PANFIS framework incorporates an active learning (AL) strategy and two model fusion methods. AL accelerates the distributed learning process to generate an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate the initial model to generate the final model. The final model represents the update of current large-scale data knowledge, which can be used to infer future data. Extensive experiments on this framework are validated by measuring the accuracy and running time of four combinations of Scalable PANFIS and other Spark-based built-in algorithms. The results indicate that Scalable PANFIS with AL trains almost two times faster than Scalable PANFIS without AL. The results also show that the rule merging and voting mechanisms yield similar accuracy in general among Scalable PANFIS algorithms, and that they are generally better than Spark-based algorithms. In terms of running time, Scalable PANFIS training outperforms all Spark-based algorithms when classifying numerous benchmark datasets. Comment: 20 pages, 5 figures

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)