3,037 research outputs found

    Online Tool Condition Monitoring Based on Parsimonious Ensemble+

    Accurate diagnosis of tool wear in the metal turning process remains an open challenge for both scientists and industrial practitioners because of inhomogeneities in workpiece material, nonstationary machining settings to suit production requirements, and nonlinear relations between measured variables and tool wear. Common methodologies for tool condition monitoring still rely on batch approaches, which cannot cope with the fast sampling rate of the metal cutting process. Furthermore, they require a retraining process to be completed from scratch when dealing with a new set of machining parameters. This paper presents an online tool condition monitoring approach based on Parsimonious Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly flexible principle, where both ensemble structure and base-classifier structure can automatically grow and shrink on the fly based on the characteristics of data streams. Moreover, the online feature selection scenario is integrated to actively sample relevant input attributes. The paper presents an advancement of a newly developed ensemble learning algorithm, pENsemble+, where an online active learning scenario is incorporated to reduce operator labelling effort. An ensemble merging scenario is proposed which allows reduction of ensemble complexity while retaining its diversity. Experimental studies utilising real-world manufacturing data streams and comparisons with well-known algorithms were carried out. Furthermore, the efficacy of pENsemble was examined using benchmark concept drift data streams. It has been found that pENsemble+ incurs low structural complexity and results in a significant reduction of operator labelling effort. Comment: this paper has been published by IEEE Transactions on Cybernetics

    Space Warps: I. Crowd-sourcing the Discovery of Gravitational Lenses

    We describe Space Warps, a novel gravitational lens discovery service that yields samples of high purity and completeness through crowd-sourced visual inspection. Carefully produced colour composite images are displayed to volunteers via a web-based classification interface, which records their estimates of the positions of candidate lensed features. Images of simulated lenses, as well as real images which lack lenses, are inserted into the image stream at random intervals; this training set is used to give the volunteers instantaneous feedback on their performance, as well as to calibrate a model of the system that provides dynamical updates to the probability that a classified image contains a lens. Low probability systems are retired from the site periodically, concentrating the sample towards a set of lens candidates. Having divided 160 square degrees of Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) imaging into some 430,000 overlapping 82 by 82 arcsecond tiles and displaying them on the site, we were joined by around 37,000 volunteers who contributed 11 million image classifications over the course of 8 months. This Stage 1 search reduced the sample to 3381 images containing candidates; these were then refined in Stage 2 to yield a sample that we expect to be over 90% complete and 30% pure, based on our analysis of the volunteers' performance on training images. We comment on the scalability of the SpaceWarps system to the wide field survey era, based on our projection that searches of 10^5 images could be performed by a crowd of 10^5 volunteers in 6 days. Comment: 21 pages, 13 figures, MNRAS accepted, minor to moderate changes in this version
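
To illustrate the "dynamical updates to the probability that a classified image contains a lens" described in the abstract above, here is a minimal sketch of a single Bayesian update. It assumes each volunteer is summarised by two rates estimated from the training images: how often they mark a real lens as a lens, and how often they mark a non-lens as a non-lens. The function name, its parameters and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
def update_lens_probability(prior, said_lens, p_true_pos, p_true_neg):
    """Return P(lens | one classification) using Bayes' rule.

    prior       -- current probability that the image contains a lens
    said_lens   -- True if the volunteer marked the image as a lens
    p_true_pos  -- assumed P(volunteer says "lens" | image has a lens)
    p_true_neg  -- assumed P(volunteer says "no lens" | image has no lens)
    """
    if said_lens:
        like_lens = p_true_pos          # correct detection
        like_not = 1.0 - p_true_neg     # false alarm
    else:
        like_lens = 1.0 - p_true_pos    # missed lens
        like_not = p_true_neg           # correct rejection
    evidence = like_lens * prior + like_not * (1.0 - prior)
    return like_lens * prior / evidence


# Hypothetical usage: three volunteers classify the same image in turn.
probability = 0.5
for vote in (True, True, False):
    probability = update_lens_probability(probability, vote, 0.8, 0.9)
```

An image whose probability falls below some chosen threshold after enough classifications could then be retired, which is the role the abstract assigns to low-probability systems.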

    A survey on online active learning

    Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research. Our review aims to provide a comprehensive and up-to-date overview of the field and to highlight directions for future work
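
The stream-based setting described in the abstract above can be sketched as follows: each observation arrives once, and a label is requested only when the current model is uncertain about it. This is a minimal illustration using uncertainty sampling with a small online logistic model. The class `OnlineLogistic`, the function `stream_active_learning`, the `margin` parameter and the synthetic stream are all assumptions made for this example and do not come from the survey.

```python
import math
import random


class OnlineLogistic:
    """Tiny logistic regression trained one example at a time."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clip to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic gradient step on the log loss.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def stream_active_learning(stream, oracle, model, margin=0.2):
    """Query a label only when the prediction is within `margin` of 0.5.

    Returns the number of labels requested from the oracle.
    """
    queried = 0
    for x in stream:
        if abs(model.predict_proba(x) - 0.5) < margin:
            model.update(x, oracle(x))
            queried += 1
    return queried


# Hypothetical usage: one feature, true label is 1 when the feature is positive.
rng = random.Random(0)
stream = [[rng.uniform(-3.0, 3.0)] for _ in range(500)]
model = OnlineLogistic(n_features=1)
n_queried = stream_active_learning(stream, lambda x: 1 if x[0] > 0 else 0, model)
```

The fixed margin is the simplest possible query rule. Approaches that handle concept drift or a labelling budget would replace it with an adaptive criterion.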

    SpaceWarps - I. Crowdsourcing the discovery of gravitational lenses

    We describe SpaceWarps, a novel gravitational lens discovery service that yields samples of high purity and completeness through crowdsourced visual inspection. Carefully produced colour composite images are displayed to volunteers via a web-based classification interface, which records their estimates of the positions of candidate lensed features. Images of simulated lenses, as well as real images which lack lenses, are inserted into the image stream at random intervals; this training set is used to give the volunteers instantaneous feedback on their performance, as well as to calibrate a model of the system that provides dynamical updates to the probability that a classified image contains a lens. Low-probability systems are retired from the site periodically, concentrating the sample towards a set of lens candidates. Having divided 160 deg^2 of Canada-France-Hawaii Telescope Legacy Survey imaging into some 430 000 overlapping 82 by 82 arcsec tiles and displaying them on the site, we were joined by around 37 000 volunteers who contributed 11 million image classifications over the course of eight months. This stage 1 search reduced the sample to 3381 images containing candidates; these were then refined in stage 2 to yield a sample that we expect to be over 90 per cent complete and 30 per cent pure, based on our analysis of the volunteers' performance on training images. We comment on the scalability of the SpaceWarps system to the wide field survey era, based on our projection that searches of 10^5 images could be performed by a crowd of 10^5 volunteers in 6

    Active Object Classification from 3D Range Data with Mobile Robots

    This thesis addresses the problem of how to improve the acquisition of 3D range data with a mobile robot for the task of object classification. Establishing the identities of objects in unknown environments is fundamental for robotic systems and helps enable many abilities such as grasping, manipulation, or semantic mapping. Objects are recognised by data obtained from sensor observations; however, data is highly dependent on viewpoint: the variation in position and orientation of the sensor relative to an object can result in large variation in the perception quality. Additionally, cluttered environments present a further challenge because key data may be missing. These issues are not always solved by traditional passive systems, where data are collected from a fixed navigation process and then fed into a perception pipeline. This thesis considers an active approach to data collection by deciding where it is most appropriate to make observations for the perception task. The core contributions of this thesis are a non-myopic planning strategy to collect data efficiently under resource constraints, and supporting viewpoint prediction and evaluation methods for object classification. Our approach to planning uses Monte Carlo methods coupled with a classifier based on non-parametric Bayesian regression. We present a novel anytime and non-myopic planning algorithm, Monte Carlo active perception, that extends Monte Carlo tree search to partially observable environments and the active perception problem. This is combined with a particle-based estimation process and a learned observation likelihood model that uses Gaussian process regression. To support planning, we present 3D point cloud prediction algorithms and utility functions that measure the quality of viewpoints by their discriminatory ability and effectiveness under occlusion. The utility of viewpoints is quantified by information-theoretic metrics, such as mutual information, and an alternative utility function that exploits learned data is developed for special cases. The algorithms in this thesis are demonstrated in a variety of scenarios. We extensively test our online planning and classification methods in simulation as well as with indoor and outdoor datasets. Furthermore, we perform hardware experiments with different mobile platforms equipped with different types of sensors. Most significantly, our hardware experiments with an outdoor robot are, to our knowledge, the first demonstrations of online active perception in a real outdoor environment. Active perception has broad significance in many applications. This thesis emphasises the advantages of an active approach to object classification and presents its assimilation with a wide range of robotic systems, sensors, and perception algorithms. By demonstration of performance enhancements and diversity, our hope is that the concept of considering perception and planning in an integrated manner will be of benefit in improving current systems that rely on passive data collection
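
As a rough illustration of scoring a viewpoint by mutual information, as mentioned in the abstract above, the sketch below computes the mutual information between the object class and a discrete observation. It is a simplification: the thesis describes a learned likelihood model based on Gaussian process regression, whereas this example assumes a small hand-written likelihood table. The function names and the example tables are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def mutual_information(prior, likelihood):
    """I(class; observation) for one candidate viewpoint.

    prior[c]          -- current belief P(class = c)
    likelihood[c][z]  -- assumed P(observation = z | class = c) at this viewpoint
    """
    n_classes = len(prior)
    n_obs = len(likelihood[0])
    mi = entropy(prior)
    for z in range(n_obs):
        p_z = sum(prior[c] * likelihood[c][z] for c in range(n_classes))
        if p_z == 0:
            continue  # this observation can never occur
        posterior = [prior[c] * likelihood[c][z] / p_z for c in range(n_classes)]
        mi -= p_z * entropy(posterior)  # subtract expected remaining uncertainty
    return mi


# Hypothetical usage: compare a discriminative view with an uninformative one.
belief = [0.5, 0.5]
clear_view = [[1.0, 0.0], [0.0, 1.0]]
occluded_view = [[0.5, 0.5], [0.5, 0.5]]
best = max((clear_view, occluded_view), key=lambda v: mutual_information(belief, v))
```

This scores each viewpoint in isolation, which is a greedy choice. The non-myopic planning described in the abstract would instead evaluate sequences of viewpoints.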

    Detection of HTTPS brute-force attacks in high-speed computer networks

    This thesis presents a review of flow-based network threat detection, with the focus on brute-force attacks against popular web applications, such as WordPress and Joomla. A new dataset was created that consists of benign backbone network traffic and brute-force attacks generated with open-source attack tools. The thesis proposes a method for brute-force attack detection that is based on packet-level characteristics and uses modern machine-learning models. Also, it works with encrypted HTTPS traffic, even without decrypting the payload. More and more network traffic is being encrypted, and it is crucial to update our intrusion detection methods to maintain at least some level of network visibility