Identifying Model Weakness with Adversarial Examiner
Machine learning models are usually evaluated according to the average case
performance on the test set. However, this is not always ideal, because in some
sensitive domains (e.g. autonomous driving), it is the worst case performance
that matters more. In this paper, we are interested in systematic exploration
of the input data space to identify the weakness of the model to be evaluated.
We propose to use an adversarial examiner in the testing stage. Different from
the existing strategy to always give the same (distribution of) test data, the
adversarial examiner will dynamically select the next test data to hand out
based on the testing history so far, with the goal being to undermine the
model's performance. This sequence of test data not only helps us understand
the current model, but also serves as constructive feedback to help improve the
model in the next iteration. We conduct experiments on ShapeNet object
classification. We show that our adversarial examiner can successfully put more
emphasis on the weakness of the model, preventing performance estimates from
being overly optimistic.
Comment: To appear in AAAI-2
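The selection loop the abstract describes can be sketched in a few lines. This is an illustrative sketch, not the paper's algorithm: the partition of inputs into groups, the probe size, the optimistic prior for unexplored groups, and a `model` that returns whether the prediction was correct are all assumptions made here for the example.

```python
import random

def adversarial_examiner(model, candidate_pool, n_rounds, n_probes=8):
    """Hand out test inputs one at a time, steering toward regions of the
    input space where the model has failed most often so far."""
    history = []  # (input, correct?) pairs handed out so far
    stats = {}    # group -> (times tested, times failed)

    def group(x):
        return x["group"]  # assumed partition of the input space

    def score(x):
        seen, failed = stats.get(group(x), (0, 0))
        # optimistic prior so unexplored groups still get probed
        return failed / seen if seen else 0.5

    for _ in range(n_rounds):
        probes = random.sample(candidate_pool, n_probes)
        x = max(probes, key=score)  # most failure-prone candidate so far
        correct = model(x)
        seen, failed = stats.get(group(x), (0, 0))
        stats[group(x)] = (seen + 1, failed + (0 if correct else 1))
        history.append((x, correct))
    return history
```

On a model that systematically fails on one region, the examiner quickly concentrates its test budget there, so the resulting accuracy estimate reflects the worst case rather than the average case.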
Can Fingerprints Lie?: Re-Weighing Fingerprint Evidence in Criminal Jury Trials
This article discusses fingerprint evidence and its use in criminal jury trials. It is commonly thought that fingerprints never lie; however, this article reveals the little-known fact that the science of fingerprint identification has never been empirically tested or proven to be reliable. It further exposes the seldom-discussed issue of fingerprint misidentification and latent print examiner error. The article explains the importance of fingerprint evidence and its extensive use in all phases of the criminal justice system. Specifically, the article plays out the dramatic courtroom scenario of incriminating fingerprints being found at a crime scene and matching the accused, all while the defendant strongly claims innocence. The expert opinion testimony of the latent fingerprint examiner becomes seminal to the case and is often received as powerfully persuasive evidence of guilt - virtually guaranteeing conviction.
Notwithstanding the fact that fingerprints are nearly universally accepted as infallible proof of identity in court, defense attorneys are currently urging courts to exclude fingerprint identification evidence from criminal jury trials by arguing that the findings of latent fingerprint examiners are scientifically invalid and legally unreliable. Indeed, despite this nearly universal acceptance, at least one state court in Maryland has refused to admit fingerprint evidence at all. This article dissects the defense’s argument for exclusion and explains the larger debate over the correct application of the evidentiary rules for expert witnesses articulated by the United States Supreme Court in Daubert and Kumho Tire. Although the exclusion of fingerprint evidence may sound outlandish, this article illuminates the substance of the argument and why due process requires judges to closely monitor whether the opinions of latent fingerprint examiners are accurate and reliable evidence to be considered by the jury. Further, this article proposes a special jury instruction that would be given in certain criminal cases to guide jurors in weighing the value of the fingerprint evidence.
Simulated Adversarial Testing of Face Recognition Models
Most machine learning models are validated and tested on fixed datasets. This
can give an incomplete picture of the capabilities and weaknesses of the model.
Such weaknesses can be revealed at test time in the real world. The risks
involved in such failures can be loss of profits, loss of time or even loss of
life in certain critical applications. In order to alleviate this issue,
simulators can be controlled in a fine-grained manner using interpretable
parameters to explore the semantic image manifold. In this work, we propose a
framework for learning how to test machine learning algorithms using simulators
in an adversarial manner in order to find weaknesses in the model before
deploying it in critical scenarios. We apply this model in a face recognition
scenario. We are the first to show that weaknesses of models trained on real
data can be discovered using simulated samples. Using our proposed method, we
can find adversarial synthetic faces that fool contemporary face recognition
models. This demonstrates the fact that these models have weaknesses that are
not measured by commonly used validation datasets. We hypothesize that these
adversarial examples are not isolated, but usually lie in connected components
in the latent space of the simulator. We present a method to find these
adversarial regions, as opposed to the typical adversarial points found in the
adversarial example literature.
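A minimal sketch of this kind of simulator-in-the-loop search follows. It is not the paper's learned method: the `render` and `model_score` callables, the parameter names, and the simple accept-if-better random search are all stand-in assumptions for illustration.

```python
import random

def find_adversarial_params(render, model_score, init_params,
                            steps=200, sigma=0.1):
    """Search over interpretable simulator parameters (e.g. pose angles,
    lighting) for settings where the model's score on the rendered image
    drops, i.e. a candidate failure case."""
    params = dict(init_params)
    best = model_score(render(params))
    for _ in range(steps):
        # perturb every parameter with Gaussian noise
        cand = {k: v + random.gauss(0, sigma) for k, v in params.items()}
        s = model_score(render(cand))
        if s < best:  # lower score = closer to a failure case
            params, best = cand, s
    return params, best
```

Because the search moves through simulator parameters rather than raw pixels, each discovered failure is an interpretable scene configuration, and nearby parameter settings can be probed to map out a connected adversarial region rather than a single point.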
On the Automation and Diagnosis of Visual Intelligence
One of the ultimate goals of computer vision is to equip machines with visual intelligence: the ability to understand a scene at a level that is indistinguishable from a human's. This requires not only detecting the 2D or 3D locations of objects, but also recognizing their semantic categories, or even higher-level interactions. Thanks to decades of vision research as well as recent developments in deep learning, we are closer to this goal than ever. But to keep closing the gap, more research is needed on two themes. One, current models are still far from perfect, so we need a mechanism to keep proposing new, better models to improve performance. Two, while we are pushing for performance, it is also important to do careful analysis and diagnosis of existing models, to make sure we are indeed moving in the right direction.
In this dissertation, I study one of these two research themes for each of several steps in the visual intelligence pipeline. The first part of the dissertation focuses on category-level understanding of 2D images, which is arguably the most critical step in the visual intelligence pipeline as it bridges vision and language. The theme is automating the process of model improvement: in particular, the architecture of neural networks. The second part extends the visual intelligence pipeline along the language side, and focuses on the more challenging language-level understanding of 2D images. The theme also shifts to diagnosis: examining existing models, proposing interpretable models, or building diagnostic datasets. The third part continues the diagnosis theme, this time extending along the vision side, focusing on how incorporating 3D scene knowledge may facilitate the evaluation of image recognition models.
Forensic Science Evidence and the Limits of Cross-Examination
The ability to confront witnesses through cross-examination is conventionally understood as the most powerful means of testing evidence, and one of the most important features of the adversarial trial. Popularly feted, cross-examination was immortalised in John Henry Wigmore’s (1863–1943) famous dictum that it is ‘the greatest legal engine ever invented for the discovery of truth’. Through a detailed review of the cross-examination of a forensic scientist, in the first scientifically-informed challenge to latent fingerprint evidence in Australia, this article offers a more modest assessment of its value. Drawing upon mainstream scientific research and advice, and contrasting scientific knowledge with answers obtained through cross-examination of a latent fingerprint examiner, it illuminates a range of serious and apparently unrecognised limitations of our current procedural arrangements. The article explains the limits of cross-examination and the difficulties trial and appellate judges — and by extension juries — experience when engaging with forensic science evidence.