Certified Computation from Unreliable Datasets
A wide range of learning tasks require human input in labeling massive data.
The collected data though are usually low quality and contain inaccuracies and
errors. As a result, modern science and business face the problem of learning
from unreliable data sets.
In this work, we provide a generic approach that is based on
\textit{verification} of only a few records of the data set to guarantee high
quality learning outcomes for various optimization objectives. Our method
identifies small sets of critical records and verifies their validity. We show
that many problems need only a small number of verifications to ensure that
the output of the computation is within a bounded factor of the truth. For any given instance, we provide an
\textit{instance optimal} solution that verifies the minimum possible number of
records to approximately certify correctness. Then using this instance optimal
formulation of the problem we prove our main result: "every function that
satisfies some Lipschitz continuity condition can be certified with a small
number of verifications". We show that the required Lipschitz continuity
condition is satisfied even by some NP-complete problems, which illustrates the
generality and importance of this theorem.
In case this certification step fails, an invalid record will be identified.
Removing these records and repeating until success guarantees that the result
will be accurate and will depend only on the verified records. Surprisingly, as
we show, for several computation tasks more efficient methods are possible.
These methods always guarantee that the produced result is not affected by the
invalid records, since any invalid record that affects the output will be
detected and verified.
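The verify-and-repeat loop described above can be sketched for a simple objective such as the sample mean. This is an illustrative toy, not the paper's algorithm: the influence heuristic, the `is_valid` oracle, and the verification budget are all hypothetical names introduced here.

```python
# Hypothetical sketch of the verify-and-repeat certification loop,
# illustrated for the sample mean. Not the paper's actual algorithm.

def influence(records, i):
    """Change in the mean caused by removing record i."""
    full = sum(records) / len(records)
    rest = [r for j, r in enumerate(records) if j != i]
    return abs(full - sum(rest) / len(rest))

def certified_mean(records, is_valid, budget):
    """Verify the most influential records; drop invalid ones and repeat."""
    records = list(records)
    while True:
        # Rank records by how strongly each one sways the output.
        ranked = sorted(range(len(records)),
                        key=lambda i: influence(records, i), reverse=True)
        invalid = {i for i in ranked[:budget] if not is_valid(records[i])}
        if not invalid:  # certification succeeded
            return sum(records) / len(records)
        # Remove detected invalid records and re-certify.
        records = [r for j, r in enumerate(records) if j not in invalid]

# Example: one corrupted record (1000.0) among benign values near 1.0
data = [1.0, 1.1, 0.9, 1000.0, 1.05]
result = certified_mean(data, is_valid=lambda r: r < 10, budget=2)
```

The loop mirrors the text: an invalid record that sways the output is ranked high, gets verified, and is removed, so the final mean depends only on records that passed verification.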
The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race
Recent studies of social media spam and automation provide anecdotal
evidence of the rise of a new generation of spambots, so-called social
spambots. Here, for the first time, we extensively study this novel phenomenon
on Twitter and we provide quantitative evidence that a paradigm-shift exists in
spambot design. First, we measure Twitter's current capabilities of detecting
the new social spambots. Later, we assess the human performance in
discriminating between genuine accounts, social spambots, and traditional
spambots. Then, we benchmark several state-of-the-art techniques proposed by
the academic literature. Results show that neither Twitter, nor humans, nor
cutting-edge applications are currently capable of accurately detecting the new
social spambots. Our results call for new approaches capable of turning the
tide in the fight against this rising phenomenon. We conclude by reviewing the
latest literature on spambot detection and we highlight an emerging common
research trend based on the analysis of collective behaviors. Insights derived
from both our extensive experimental campaign and survey shed light on the most
promising directions of research and lay the foundations for the arms race
against the novel social spambots. Finally, to foster research on this novel
phenomenon, we make publicly available to the scientific community all the
datasets used in this study.
Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science Track, Perth, Australia, 3-7 April 2017).
Are You Tampering With My Data?
We propose a novel approach towards adversarial attacks on neural networks
(NN), focusing on tampering with the data used for training instead of generating
attacks on trained models. Our network-agnostic method creates a backdoor
during training which can be exploited at test time to force a neural network
to exhibit abnormal behaviour. We demonstrate on two widely used datasets
(CIFAR-10 and SVHN) that a universal modification of just one pixel per image
for all the images of a class in the training set is enough to corrupt the
training procedure of several state-of-the-art deep neural networks causing the
networks to misclassify any images to which the modification is applied. Our
aim is to bring to the attention of the machine learning community the
possibility that even learning-based methods that are personally trained on
public datasets can be subject to attacks by a skillful adversary.
Comment: 18 pages.
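The universal one-pixel modification described above can be sketched in a few lines. The pixel location, value, and target class below are arbitrary illustrative choices, not those used in the paper.

```python
import numpy as np

# Illustrative sketch of the training-set tampering described above: a
# universal one-pixel modification applied to every image of one class.
# Pixel location, value, and class are arbitrary choices for this demo.

def poison_class(images, labels, target_class, pixel=(0, 0), value=255):
    """Set one fixed pixel to a fixed value in every image of target_class."""
    poisoned = images.copy()
    mask = labels == target_class
    poisoned[mask, pixel[0], pixel[1], :] = value  # same pixel for all
    return poisoned

# Tiny toy batch: 4 RGB images of size 2x2
rng = np.random.default_rng(0)
images = rng.integers(0, 255, size=(4, 2, 2, 3), dtype=np.uint8)
labels = np.array([0, 1, 0, 1])
poisoned = poison_class(images, labels, target_class=1)
```

At test time, applying the same pixel modification to any input would act as the trigger for the backdoor learned during training.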
Tailored for Real-World: A Whole Slide Image Classification System Validated on Uncurated Multi-Site Data Emulating the Prospective Pathology Workload.
The standard-of-care diagnostic procedure for suspected skin cancer is microscopic examination of hematoxylin & eosin stained tissue by a pathologist. Areas of high inter-pathologist discordance and rising biopsy rates necessitate higher efficiency and diagnostic reproducibility. We present and validate a deep learning system which classifies digitized dermatopathology slides into 4 categories. The system is developed using 5,070 images from a single lab, and tested on an uncurated set of 13,537 images from 3 test labs, using whole slide scanners manufactured by 3 different vendors. The system's use of deep-learning-based confidence scoring as a criterion to consider the result as accurate yields an accuracy of up to 98%, and makes it adoptable in a real-world setting. Without confidence scoring, the system achieved an accuracy of 78%. We anticipate that our deep learning system will serve as a foundation enabling faster diagnosis of skin cancer, identification of cases for specialist review, and targeted diagnostic classifications.
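The confidence-based triage idea can be sketched as follows: cases below a confidence threshold are deferred (e.g., to a pathologist), and accuracy is reported on the confidently classified remainder. The threshold, the max-score criterion, and the toy data are assumptions for illustration, not the paper's scoring method.

```python
# Minimal sketch of confidence-based triage: defer low-confidence cases,
# report accuracy on the confident subset. Threshold and data are toys.

def triage(probs, labels, threshold=0.9):
    """Split cases into auto-classified vs deferred by max class score."""
    confident, correct = 0, 0
    deferred = []
    for i, p in enumerate(probs):
        score = max(p)
        if score >= threshold:
            confident += 1
            pred = p.index(score)      # predicted class
            correct += pred == labels[i]
        else:
            deferred.append(i)         # send to specialist review
    accuracy = correct / confident if confident else None
    return accuracy, deferred

probs = [[0.97, 0.01, 0.01, 0.01],   # confident, class 0
         [0.55, 0.30, 0.10, 0.05],   # low confidence -> deferred
         [0.02, 0.95, 0.02, 0.01]]   # confident, class 1
labels = [0, 2, 1]
accuracy, deferred = triage(probs, labels)
```

Raising the threshold shrinks the auto-classified set but typically raises its accuracy, which is the trade-off behind the 98% vs 78% figures above.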
On the Numerical Accuracy of Spreadsheets
This paper discusses the numerical precision of five spreadsheets (Calc, Excel, Gnumeric, NeoOffice and Oleo) running on two hardware platforms (i386 and amd64) and on three operating systems (Windows Vista, Ubuntu Intrepid and Mac OS Leopard). The methodology consists of checking the number of correct significant digits returned by each spreadsheet when computing the sample mean, standard deviation, first-order autocorrelation, F statistic in ANOVA tests, linear and nonlinear regression and distribution functions. A discussion about the algorithms for pseudorandom number generation provided by these platforms is also conducted. We conclude that there is no safe choice among the spreadsheets here assessed: they all fail in nonlinear regression and they are not suited for Monte Carlo experiments.
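The "number of correct significant digits" check can be sketched with the standard log-relative-error (LRE) metric used in numerical accuracy studies. The cap at 15 digits and the sample values are illustrative assumptions, not the paper's benchmark data.

```python
import math

# Sketch of the log relative error (LRE), a standard measure of how many
# significant digits a computed value shares with a certified reference.
# The 15-digit cap and example values are illustrative assumptions.

def lre(computed, certified):
    """Approximate number of correct significant digits."""
    if computed == certified:
        return 15.0  # conventional cap near double-precision limits
    if certified == 0:
        return -math.log10(abs(computed))  # absolute error variant
    return -math.log10(abs(computed - certified) / abs(certified))

# A value of pi correct to about 6 significant digits:
digits = lre(3.14159000, 3.14159265)
```

Running a spreadsheet's result for, say, a sample standard deviation through such a function against a certified reference value is how the per-function digit counts in studies like this are obtained.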
Blind Justice: Fairness with Encrypted Sensitive Attributes
Recent work has explored how to train machine learning models which do not
discriminate against any subgroup of the population as determined by sensitive
attributes such as gender or race. To avoid disparate treatment, sensitive
attributes should not be considered. On the other hand, in order to avoid
disparate impact, sensitive attributes must be examined, e.g., in order to
learn a fair model, or to check if a given model is fair. We introduce methods
from secure multi-party computation which allow us to avoid both. By encrypting
sensitive attributes, we show how an outcome-based fair model may be learned,
checked, or have its outputs verified and held to account, without users
revealing their sensitive attributes.
Comment: published at ICML 2018.
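The core idea of computing on encrypted attributes can be sketched with the simplest building block of secure multi-party computation: additive secret sharing. This toy 2-out-of-2 scheme, the choice of modulus, and the aggregate-count example are illustrative assumptions, not the paper's protocol.

```python
import secrets

# Toy 2-out-of-2 additive secret sharing: a sensitive attribute is split
# into random shares so no single party learns it, yet parties can still
# compute an aggregate jointly. Modulus and protocol are illustrative.

PRIME = 2**61 - 1  # public modulus

def share(value):
    """Split `value` into two shares that sum to it mod PRIME."""
    r = secrets.randbelow(PRIME)
    return r, (value - r) % PRIME

def reconstruct(s1, s2):
    return (s1 + s2) % PRIME

# Each user shares a sensitive bit; each party sums its shares locally,
# so a subgroup count is computed without anyone seeing individual bits.
bits = [1, 0, 1, 1]
shares = [share(b) for b in bits]
party1 = sum(s[0] for s in shares) % PRIME
party2 = sum(s[1] for s in shares) % PRIME
total = reconstruct(party1, party2)  # equals sum(bits)
```

Each share alone is uniformly random, so neither party learns any user's attribute; only the combined sums reveal the aggregate needed for a fairness check.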
On the Numerical Accuracy of Spreadsheets
This paper discusses the numerical precision of five spreadsheets (Calc, Excel, Gnumeric, NeoOffice and Oleo) running on two hardware platforms (i386 and amd64) and on three operating systems (Windows Vista, Ubuntu Intrepid and Mac OS Leopard). The methodology consists of checking the number of correct significant digits returned by each spreadsheet when computing the sample mean, standard deviation, first-order autocorrelation, F statistic in ANOVA tests, linear and nonlinear regression and distribution functions. A discussion about the algorithms for pseudorandom number generation provided by these platforms is also conducted. We conclude that there is no safe choice among the spreadsheets here assessed: they all fail in nonlinear regression and they are not suited for Monte Carlo experiments
DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems
Deep learning (DL) defines a new data-driven programming paradigm that
constructs the internal system logic of a crafted neural network through a set
of training data. We have seen wide adoption of DL in many safety-critical
scenarios. However, a plethora of studies have shown that the state-of-the-art
DL systems suffer from various vulnerabilities which can lead to severe
consequences when applied to real-world applications. Currently, the testing
adequacy of a DL system is usually measured by its accuracy on test data.
Given the limited availability of high quality test data, good accuracy
on test data can hardly provide confidence in the testing adequacy
and generality of DL systems. Unlike traditional software systems that have
clear and controllable logic and functionality, the lack of interpretability in
a DL system makes system analysis and defect detection difficult, which could
potentially hinder its real-world deployment. In this paper, we propose
DeepGauge, a set of multi-granularity testing criteria for DL systems, which
aims at rendering a multi-faceted portrayal of the testbed. The in-depth
evaluation of our proposed testing criteria is demonstrated on two well-known
datasets, five DL systems, and with four state-of-the-art adversarial attack
techniques against DL. The potential usefulness of DeepGauge sheds light on the
construction of more generic and robust DL systems.
Comment: The 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE 2018).
Learning Robust Kernel Ensembles with Kernel Average Pooling
Model ensembles have long been used in machine learning to reduce the
variance in individual model predictions, making them more robust to input
perturbations. Pseudo-ensemble methods like dropout have also been commonly
used in deep learning models to improve generalization. However, the
application of these techniques to improve neural networks' robustness against
input perturbations remains underexplored. We introduce Kernel Average Pooling
(KAP), a neural network building block that applies the mean filter along the
kernel dimension of the layer activation tensor. We show that ensembles of
kernels with similar functionality naturally emerge in convolutional neural
networks equipped with KAP and trained with backpropagation. Moreover, we show
that when trained on inputs perturbed with additive Gaussian noise, KAP models
are remarkably robust against various forms of adversarial attacks. Empirical
evaluations on CIFAR10, CIFAR100, TinyImagenet, and Imagenet datasets show
substantial improvements in robustness against strong adversarial attacks such
as AutoAttack, without training on any adversarial examples.
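The mean filter along the kernel dimension described above can be sketched directly on an activation tensor. The window size and edge padding below are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

# Minimal sketch of Kernel Average Pooling: a mean filter of width k
# applied along the kernel (channel) axis of an activation tensor.
# Window size and edge padding are illustrative choices.

def kernel_average_pooling(x, k=3):
    """Mean filter of width k over the channel axis of x (N, C, H, W)."""
    n, c, h, w = x.shape
    pad = k // 2
    # Pad the channel axis so the output keeps the same channel count.
    xp = np.pad(x, ((0, 0), (pad, pad), (0, 0), (0, 0)), mode="edge")
    out = np.stack([xp[:, i:i + k].mean(axis=1) for i in range(c)], axis=1)
    return out

x = np.arange(2 * 4 * 1 * 1, dtype=float).reshape(2, 4, 1, 1)
y = kernel_average_pooling(x, k=3)
```

Averaging adjacent kernels' activations encourages neighbouring kernels to learn similar functions, which is how the ensembles of similar kernels described above emerge under standard backpropagation.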