Online Tool Condition Monitoring Based on Parsimonious Ensemble+
Accurate diagnosis of tool wear in the metal turning process remains an open
challenge for both scientists and industrial practitioners because of
inhomogeneities in workpiece material, nonstationary machining settings to suit
production requirements, and nonlinear relations between measured variables and
tool wear. Common methodologies for tool condition monitoring still rely on
batch approaches, which cannot cope with the fast sampling rate of the metal
cutting process. Furthermore, they require a retraining process to be completed from
scratch when dealing with a new set of machining parameters. This paper
presents an online tool condition monitoring approach based on Parsimonious
Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly
flexible principle where both ensemble structure and base-classifier structure
can automatically grow and shrink on the fly based on the characteristics of
data streams. Moreover, an online feature selection scenario is integrated to
actively sample relevant input attributes. The paper presents an advancement of a
newly developed ensemble learning algorithm, pENsemble+, in which an online active
learning scenario is incorporated to reduce operator labelling effort. An
ensemble merging scenario is proposed, which allows reduction of ensemble
complexity while retaining its diversity. Experimental studies utilising
real-world manufacturing data streams and comparisons with well-known
algorithms were carried out. Furthermore, the efficacy of pENsemble was
examined using benchmark concept drift data streams. It has been found that
pENsemble+ incurs low structural complexity and results in a significant
reduction of operator labelling effort.
Comment: this paper has been published by IEEE Transactions on Cybernetics
Space Warps: I. Crowd-sourcing the Discovery of Gravitational Lenses
We describe Space Warps, a novel gravitational lens discovery service that
yields samples of high purity and completeness through crowd-sourced visual
inspection. Carefully produced colour composite images are displayed to
volunteers via a web-based classification interface, which records their
estimates of the positions of candidate lensed features. Images of simulated
lenses, as well as real images which lack lenses, are inserted into the image
stream at random intervals; this training set is used to give the volunteers
instantaneous feedback on their performance, as well as to calibrate a model of
the system that provides dynamical updates to the probability that a classified
image contains a lens. Low probability systems are retired from the site
periodically, concentrating the sample towards a set of lens candidates. Having
divided 160 square degrees of Canada-France-Hawaii Telescope Legacy Survey
(CFHTLS) imaging into some 430,000 overlapping 82 by 82 arcsecond tiles and
displaying them on the site, we were joined by around 37,000 volunteers who
contributed 11 million image classifications over the course of 8 months. This
Stage 1 search reduced the sample to 3381 images containing candidates; these
were then refined in Stage 2 to yield a sample that we expect to be over 90%
complete and 30% pure, based on our analysis of the volunteers' performance on
training images. We comment on the scalability of the SpaceWarps system to the
wide field survey era, based on our projection that searches of 10^5 images
could be performed by a crowd of 10^5 volunteers in 6 days.
Comment: 21 pages, 13 figures, MNRAS accepted, minor to moderate changes in
this version
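The abstract above describes a model, calibrated on training images, that updates the probability that a classified image contains a lens. The following is a minimal sketch of one way such an update could work, using Bayes' rule with per-volunteer hit and false-alarm rates. The function names, the two-rate volunteer model, and the retirement threshold are illustrative assumptions, not the Space Warps implementation.

```python
def update_lens_probability(prior, said_lens, p_true_pos, p_false_pos):
    """Bayes update of P(lens) after one volunteer classification.

    p_true_pos:  P(volunteer says "lens" | image has a lens); assumed to be
                 estimated from the simulated-lens training images.
    p_false_pos: P(volunteer says "lens" | image has no lens); assumed to be
                 estimated from the lens-free training images.
    """
    if said_lens:
        like_lens, like_empty = p_true_pos, p_false_pos
    else:
        like_lens, like_empty = 1.0 - p_true_pos, 1.0 - p_false_pos
    evidence = like_lens * prior + like_empty * (1.0 - prior)
    return like_lens * prior / evidence


def classify_image(prior, votes, retire_below=1e-4):
    """Apply votes in order; stop early once the image would be retired.

    votes is a sequence of (said_lens, p_true_pos, p_false_pos) tuples.
    retire_below is a hypothetical threshold chosen for illustration.
    Returns (posterior, retired).
    """
    p = prior
    for said_lens, p_true_pos, p_false_pos in votes:
        p = update_lens_probability(p, said_lens, p_true_pos, p_false_pos)
        if p < retire_below:
            return p, True
    return p, False
```

In this sketch, a skilled volunteer (high hit rate, low false-alarm rate) moves the probability further per classification than an unskilled one, which is why calibration on training images matters.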
A survey on online active learning
Online active learning is a paradigm in machine learning that aims to select
the most informative data points to label from a data stream. The problem of
minimizing the cost associated with collecting labeled observations has gained
a lot of attention in recent years, particularly in real-world applications
where data is only available in an unlabeled form. Annotating each observation
can be time-consuming and costly, making it difficult to obtain large amounts
of labeled data. To overcome this issue, many active learning strategies have
been proposed in the last decades, aiming to select the most informative
observations for labeling in order to improve the performance of machine
learning models. These approaches can be broadly divided into two categories:
static pool-based and stream-based active learning. Pool-based active learning
involves selecting a subset of observations from a closed pool of unlabeled
data, and it has been the focus of many surveys and literature reviews.
However, the growing availability of data streams has led to an increase in the
number of approaches that focus on online active learning, which involves
continuously selecting and labeling observations as they arrive in a stream.
This work aims to provide an overview of the most recently proposed approaches
for selecting the most informative observations from data streams in the
context of online active learning. We review the various techniques that have
been proposed and discuss their strengths and limitations, as well as the
challenges and opportunities that exist in this area of research. Our review
aims to provide a comprehensive and up-to-date overview of the field and to
highlight directions for future work.
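To make the stream-based setting concrete, here is a minimal sketch of one common strategy, uncertainty sampling with a fixed margin: each arriving observation is labelled only if the current model is unsure about it. The `OnlineLogistic` class, the `oracle` callback (standing in for a human annotator), and the `margin` value are assumptions made for illustration, and the survey covers many other query strategies.

```python
import math


class OnlineLogistic:
    """Tiny logistic-regression model trained one example at a time."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic-gradient step on the log-loss.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def run_stream(stream, oracle, model, margin=0.2):
    """Query a label only when the prediction lies within `margin` of 0.5.

    Returns the number of labels requested from the oracle.
    """
    queried = 0
    for x in stream:
        p = model.predict_proba(x)
        if abs(p - 0.5) < margin:
            model.update(x, oracle(x))
            queried += 1
    return queried
```

Each observation is seen once and either queried or discarded immediately, which is the key difference from pool-based selection, where the whole unlabeled set can be ranked before choosing.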
SpaceWarps - I. Crowdsourcing the discovery of gravitational lenses
We describe SpaceWarps, a novel gravitational lens discovery service that yields samples of high purity and completeness through crowdsourced visual inspection. Carefully produced colour composite images are displayed to volunteers via a web-based classification interface, which records their estimates of the positions of candidate lensed features. Images of simulated lenses, as well as real images which lack lenses, are inserted into the image stream at random intervals; this training set is used to give the volunteers instantaneous feedback on their performance, as well as to calibrate a model of the system that provides dynamical updates to the probability that a classified image contains a lens. Low-probability systems are retired from the site periodically, concentrating the sample towards a set of lens candidates. Having divided 160 deg^2 of Canada-France-Hawaii Telescope Legacy Survey imaging into some 430,000 overlapping 82 by 82 arcsec tiles and displaying them on the site, we were joined by around 37,000 volunteers who contributed 11 million image classifications over the course of eight months. This stage 1 search reduced the sample to 3381 images containing candidates; these were then refined in stage 2 to yield a sample that we expect to be over 90 per cent complete and 30 per cent pure, based on our analysis of the volunteers' performance on training images. We comment on the scalability of the SpaceWarps system to the wide field survey era, based on our projection that searches of 10^5 images could be performed by a crowd of 10^5 volunteers in 6 days.
Active Object Classification from 3D Range Data with Mobile Robots
This thesis addresses the problem of how to improve the acquisition of 3D range data with a mobile robot for the task of object classification. Establishing the identities of objects in unknown environments is fundamental for robotic systems and helps enable many abilities such as grasping, manipulation, or semantic mapping. Objects are recognised from data obtained from sensor observations. However, these data are highly dependent on viewpoint: the variation in position and orientation of the sensor relative to an object can result in large variation in the perception quality. Additionally, cluttered environments present a further challenge because key data may be missing. These issues are not always solved by traditional passive systems where data are collected from a fixed navigation process then fed into a perception pipeline. This thesis considers an active approach to data collection by deciding where is most appropriate to make observations for the perception task. The core contributions of this thesis are a non-myopic planning strategy to collect data efficiently under resource constraints, and supporting viewpoint prediction and evaluation methods for object classification. Our approach to planning uses Monte Carlo methods coupled with a classifier based on non-parametric Bayesian regression. We present a novel anytime and non-myopic planning algorithm, Monte Carlo active perception, that extends Monte Carlo tree search to partially observable environments and the active perception problem. This is combined with a particle-based estimation process and a learned observation likelihood model that uses Gaussian process regression. To support planning, we present 3D point cloud prediction algorithms and utility functions that measure the quality of viewpoints by their discriminatory ability and effectiveness under occlusion.
The utility of viewpoints is quantified by information-theoretic metrics, such as mutual information, and an alternative utility function that exploits learned data is developed for special cases. The algorithms in this thesis are demonstrated in a variety of scenarios. We extensively test our online planning and classification methods in simulation as well as with indoor and outdoor datasets. Furthermore, we perform hardware experiments with different mobile platforms equipped with different types of sensors. Most significantly, our hardware experiments with an outdoor robot are to our knowledge the first demonstrations of online active perception in a real outdoor environment. Active perception has broad significance in many applications. This thesis emphasises the advantages of an active approach to object classification and presents its assimilation with a wide range of robotic systems, sensors, and perception algorithms. By demonstration of performance enhancements and diversity, our hope is that the concept of considering perception and planning in an integrated manner will be of benefit in improving current systems that rely on passive data collection.
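As an illustration of scoring viewpoints by mutual information, the sketch below computes I(C; Z) between an object class C and a discrete observation Z, then picks the viewpoint with the highest score. It is a simplified, one-step (myopic) stand-in: the thesis describes non-myopic planning with learned Gaussian process observation models, whereas the discrete likelihood tables and the function names here are assumptions made for illustration.

```python
import math


def mutual_information(prior, likelihood):
    """Return I(C; Z) in bits.

    prior[c]         = P(class c)
    likelihood[c][z] = P(observation z | class c) from one viewpoint
    """
    n_obs = len(likelihood[0])
    # Marginal P(z) = sum over classes of P(c) * P(z | c).
    p_z = [sum(prior[c] * likelihood[c][z] for c in range(len(prior)))
           for z in range(n_obs)]
    mi = 0.0
    for c, p_c in enumerate(prior):
        for z in range(n_obs):
            joint = p_c * likelihood[c][z]
            if joint > 0.0:  # zero-probability terms contribute nothing
                mi += joint * math.log2(likelihood[c][z] / p_z[z])
    return mi


def best_viewpoint(prior, viewpoint_models):
    """Pick the viewpoint name whose observation model is most informative.

    viewpoint_models maps a (hypothetical) viewpoint name to its likelihood table.
    """
    return max(viewpoint_models,
               key=lambda name: mutual_information(prior, viewpoint_models[name]))
```

A viewpoint from which all classes look alike scores zero, while one whose observations separate the classes perfectly scores the full entropy of the prior, so the score rewards discriminatory ability in the sense the abstract describes.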
Detection of HTTPS brute-force attacks in high-speed computer networks
This thesis presents a review of flow-based network threat detection, with a focus on brute-force attacks against popular web applications, such as WordPress and Joomla. A new dataset was created that consists of benign backbone network traffic and brute-force attacks generated with open-source attack tools. The thesis proposes a method for brute-force attack detection that is based on packet-level characteristics and uses modern machine-learning models. The method also works with encrypted HTTPS traffic, without decrypting the payload. More and more network traffic is being encrypted, and it is crucial to update our intrusion detection methods to maintain at least some level of network visibility.