Searching for test data with feature diversity
There is an implicit assumption in software testing that more diverse and
varied test data is needed for effective testing and to achieve different types
and levels of coverage. Generic approaches based on information theory to
measure and thus, implicitly, to create diverse data have also been proposed.
However, if the tester is able to identify features of the test data that are
important for the particular domain or context in which the testing is being
performed, the use of generic diversity measures such as these may be neither
sufficient nor efficient for creating test inputs that show diversity in terms
of these features. Here we investigate different approaches to find data that
are diverse according to a specific set of features, such as length, depth of
recursion, etc. Even though these features will be less general than measures
based on information theory, their use may provide a tester with more direct
control over the type of diversity that is present in the test data. Our
experiments are carried out in the context of a general test data generation
framework that can generate both numerical and highly structured data. We
compare random sampling for feature-diversity to different approaches based on
search and find a hill climbing search to be efficient. The experiments
highlight many trade-offs that need to be taken into account when searching
for diversity. We argue that recurrent test data generation motivates building
statistical models that can then help to more quickly achieve feature
diversity.
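To make the search-based approach concrete, the following is a minimal Python sketch of a hill climbing search that tries to increase a feature-based diversity score over a fixed-size test suite. The feature extraction, mutation operator, and diversity measure below are simplified placeholders chosen for illustration; they are not taken from the paper:

import random

def feature_vector(test_input):
    # Hypothetical features of a (string-valued) test input:
    # its length and a crude proxy for nesting depth.
    return (len(test_input), test_input.count("("))

def diversity(suite):
    # Placeholder diversity score: number of distinct feature vectors
    # in the suite. Real measures would also reward spread within features.
    return len({feature_vector(t) for t in suite})

def mutate(test_input):
    # Hypothetical mutation operator: randomly shrink or grow the input.
    if test_input and random.random() < 0.5:
        return test_input[:-1]
    return test_input + random.choice("ab()")

def hill_climb(suite, iterations=1000):
    # Hill climbing: mutate one test at a time and keep the change
    # only if the feature diversity of the suite does not decrease.
    suite = list(suite)
    best = diversity(suite)
    for _ in range(iterations):
        candidate = list(suite)
        i = random.randrange(len(candidate))
        candidate[i] = mutate(candidate[i])
        score = diversity(candidate)
        if score >= best:
            suite, best = candidate, score
    return suite

# Example: start from ten identical inputs and diversify them.
print(hill_climb(["(a)"] * 10))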
Model Extraction Warning in MLaaS Paradigm
Cloud vendors are increasingly offering machine learning services as part of
their platform and services portfolios. These services enable the deployment of
machine learning models on the cloud that are offered on a pay-per-query basis
to application developers and end users. However, recent work has shown that the
hosted models are susceptible to extraction attacks. Adversaries may launch
queries to steal the model and compromise future query payments or privacy of
the training data. In this work, we present a cloud-based extraction monitor
that can quantify the extraction status of models by observing the query and
response streams of both individual and colluding adversarial users. We present
a novel technique that uses information gain to measure the model learning rate
by users with an increasing number of queries. Additionally, we present an
alternate technique that maintains intelligent query summaries to measure the
learning rate relative to the coverage of the input feature space in the
presence of collusion. Both these approaches have low computational overhead
and can easily be offered as services to model owners to warn them of possible
extraction attacks from adversaries. We present performance results for these
approaches for decision tree models deployed on the BigML MLaaS platform, using
open-source datasets and different adversarial attack strategies.
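As a rough illustration of the monitoring idea only (this is a coverage-style proxy, not the paper's information-gain formulation, and all names here are assumptions), a monitor for a hosted decision tree could track which leaves each user's queries reach and report the covered fraction of the training data as a coarse extraction-status signal:

from collections import defaultdict

class ExtractionMonitor:
    def __init__(self, leaf_sizes):
        # leaf_sizes: {leaf_id: number of training examples routed to that leaf}
        self.leaf_sizes = leaf_sizes
        self.total = sum(leaf_sizes.values())
        self.seen = defaultdict(set)  # user_id -> leaf_ids their queries reached

    def observe(self, user_id, leaf_id):
        # Called for every (query, response) pair served by the hosted model.
        self.seen[user_id].add(leaf_id)

    def extraction_status(self, user_ids):
        # Fraction of the training mass covered by the leaves reached by the
        # given users; passing several ids models colluding adversaries.
        leaves = set().union(*(self.seen[u] for u in user_ids)) if user_ids else set()
        return sum(self.leaf_sizes[l] for l in leaves) / self.total

# Example: two users whose combined queries cover most of a small tree.
monitor = ExtractionMonitor({"leaf_a": 40, "leaf_b": 35, "leaf_c": 25})
monitor.observe("user_1", "leaf_a")
monitor.observe("user_2", "leaf_b")
print(monitor.extraction_status(["user_1"]))            # 0.4
print(monitor.extraction_status(["user_1", "user_2"]))  # 0.75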
k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)
Perhaps the most straightforward classifier in the arsenal of machine
learning techniques is the Nearest Neighbour Classifier -- classification is
achieved by identifying the nearest neighbours to a query example and using
those neighbours to determine the class of the query. This approach to
classification is of particular importance because issues of poor run-time
performance are not such a problem these days, given the computational power
that is available. This paper presents an overview of techniques for Nearest
Neighbour classification, focusing on: mechanisms for assessing similarity
(distance), computational issues in identifying nearest neighbours and
mechanisms for reducing the dimension of the data.
This paper is the second edition of a paper previously published as a
technical report. Sections on similarity measures for time-series, retrieval
speed-up and intrinsic dimensionality have been added. An Appendix is included
providing access to Python code for the key methods.
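The appendix of the paper provides Python code for the key methods; purely as a standalone illustration of the basic idea (not the paper's own code), a minimal k-nearest-neighbour classifier using Euclidean distance and a majority vote could look like this:

import numpy as np

class NearestNeighbourClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # k-NN is a lazy learner: "training" just stores the examples.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        return self

    def predict(self, queries):
        predictions = []
        for q in np.asarray(queries, dtype=float):
            # Euclidean distance from the query to every stored example.
            dists = np.linalg.norm(self.X - q, axis=1)
            # Indices of the k nearest neighbours.
            nearest = np.argsort(dists)[: self.k]
            # Majority vote over the neighbours' class labels.
            labels, counts = np.unique(self.y[nearest], return_counts=True)
            predictions.append(labels[np.argmax(counts)])
        return np.array(predictions)

clf = NearestNeighbourClassifier(k=3).fit(
    [[0, 0], [0, 1], [5, 5], [6, 5], [5, 6]], [0, 0, 1, 1, 1])
print(clf.predict([[0.2, 0.4], [5.5, 5.1]]))  # expected output: [0 1]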
Deep learning approach to Fourier ptychographic microscopy
Convolutional neural networks (CNNs) have gained tremendous success in
solving complex inverse problems. The aim of this work is to develop a novel
CNN framework to reconstruct video sequences of dynamic live cells captured
using a computational microscopy technique, Fourier ptychographic microscopy
(FPM). The unique feature of the FPM is its capability to reconstruct images
with both wide field-of-view (FOV) and high resolution, i.e. a large
space-bandwidth-product (SBP), by taking a series of low resolution intensity
images. For live cell imaging, a single FPM frame contains thousands of cell
samples with different morphological features. Our idea is to fully exploit the
statistical information provided by this large spatial ensemble so as to make
predictions in a sequential measurement, without using any additional temporal
dataset. Specifically, we show that it is possible to reconstruct high-SBP
dynamic cell videos by a CNN trained only on the first FPM dataset captured at
the beginning of a time-series experiment. Our CNN approach reconstructs a
12800×10800 pixel phase image in only ~25 seconds, a 50× speedup compared
to the model-based FPM algorithm. In addition, the CNN further reduces the
required number of images in each time frame by ~6×. Overall, this
significantly improves the imaging throughput by reducing both the acquisition
and computational times. The proposed CNN is based on the conditional
generative adversarial network (cGAN) framework. We further propose a mixed
loss function that combines the standard image domain loss and a weighted
Fourier domain loss, which leads to improved reconstruction of the high
frequency information. Additionally, we also exploit
transfer learning so that our pre-trained CNN can be further optimized to image
other cell types. Our technique demonstrates a promising deep learning approach
to continuously monitor large live-cell populations over an extended time and
gather useful spatial and temporal information with sub-cellular resolution.
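As a sketch of how a mixed loss of the kind described above could be written (the specific norms, weighting, and Fourier-domain comparison used in the paper are not given here, so the choices below are assumptions), using PyTorch:

import torch
import torch.nn.functional as F

def mixed_loss(pred, target, fourier_weight=0.1):
    # pred, target: reconstructed and ground-truth phase images, shape (N, 1, H, W).
    # Standard image-domain loss.
    image_loss = F.l1_loss(pred, target)
    # Fourier-domain loss: compare the magnitudes of the 2-D FFTs of the images,
    # one way to penalise errors in the frequency content of the reconstruction.
    fourier_loss = F.l1_loss(torch.abs(torch.fft.fft2(pred)),
                             torch.abs(torch.fft.fft2(target)))
    return image_loss + fourier_weight * fourier_loss

# Example with random tensors standing in for a batch of phase images.
pred = torch.rand(2, 1, 64, 64, requires_grad=True)
target = torch.rand(2, 1, 64, 64)
print(mixed_loss(pred, target))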