Look, Listen and Learn
We consider the question: what can be learnt by looking at and listening to a
large number of unlabelled videos? There is a valuable, but so far untapped,
source of information contained in the video itself -- the correspondence
between the visual and the audio streams, and we introduce a novel
"Audio-Visual Correspondence" learning task that makes use of this. Training
visual and audio networks from scratch, without any additional supervision
other than the raw unconstrained videos themselves, is shown to successfully
solve this task, and, more interestingly, result in good visual and audio
representations. These features set the new state-of-the-art on two sound
classification benchmarks, and perform on par with the state-of-the-art
self-supervised approaches on ImageNet classification. We also demonstrate that
the network is able to localize objects in both modalities, as well as perform
fine-grained recognition tasks.
Comment: Appears in: IEEE International Conference on Computer Vision (ICCV) 2017.
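As a rough illustration of the audio-visual correspondence objective described above, the sketch below trains a vision branch and an audio branch to classify whether a frame and an audio snippet come from the same video. The layer sizes, input shapes, and fusion head are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of the "Audio-Visual Correspondence" (AVC) objective: a vision
# subnetwork and an audio subnetwork are trained jointly to decide whether an
# image frame and an audio snippet come from the same video. Shapes and layer
# sizes are illustrative assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn

class AVCNet(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        # Toy vision branch: RGB frame -> embedding
        self.vision = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Toy audio branch: log-spectrogram (1 channel) -> embedding
        self.audio = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Fusion head: concatenated embeddings -> correspond / don't correspond
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, 2)
        )

    def forward(self, frame, spectrogram):
        v = self.vision(frame)
        a = self.audio(spectrogram)
        return self.head(torch.cat([v, a], dim=1))

# Positive pairs come from the same video, negatives from different videos;
# training needs no labels beyond the raw videos themselves.
model = AVCNet()
frames = torch.randn(8, 3, 224, 224)   # batch of video frames
specs = torch.randn(8, 1, 257, 200)    # batch of audio spectrograms
labels = torch.randint(0, 2, (8,))     # 1 = corresponding, 0 = mismatched
loss = nn.CrossEntropyLoss()(model(frames, specs), labels)
```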
Convolutional neural network architecture for geometric matching
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly increases generalization capabilities to
never-before-seen images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset.
Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).
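The sketch below illustrates, under assumed layer sizes and input resolution, the three components the abstract names: feature extraction, a correlation-based matching layer, and a regressor that outputs the parameters of a geometric model (a 6-parameter affine transform here). It is not the paper's exact network.

```python
# Rough sketch of the matching pipeline: CNN features are extracted from both
# images, combined with a correlation ("matching") layer, and a regressor
# predicts the parameters of a geometric model. Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def correlation_layer(feat_a, feat_b):
    """Dense correlation between L2-normalised feature maps.

    feat_a, feat_b: (B, C, H, W) -> (B, H*W, H, W), where each channel holds
    the similarity of one location in image B to every location in image A.
    """
    b, c, h, w = feat_a.shape
    fa = F.normalize(feat_a, dim=1).view(b, c, h * w)      # (B, C, HW)
    fb = F.normalize(feat_b, dim=1).view(b, c, h * w)      # (B, C, HW)
    corr = torch.bmm(fb.transpose(1, 2), fa)               # (B, HW, HW)
    return corr.view(b, h * w, h, w)

class GeometricMatcher(nn.Module):
    def __init__(self, h=15, w=15, n_params=6):
        super().__init__()
        # Toy feature extractor standing in for a pretrained CNN backbone.
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Regressor from the correlation volume to transformation parameters.
        self.regressor = nn.Sequential(
            nn.Conv2d(h * w, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, n_params),
        )

    def forward(self, img_a, img_b):
        fa, fb = self.features(img_a), self.features(img_b)
        return self.regressor(correlation_layer(fa, fb))

# 60x60 inputs give 15x15 feature maps, matching the regressor's assumptions.
params = GeometricMatcher()(torch.randn(2, 3, 60, 60),
                            torch.randn(2, 3, 60, 60))    # -> (2, 6)
```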
End-to-end weakly-supervised semantic alignment
We tackle the task of semantic alignment where the goal is to compute dense
semantic correspondence aligning two images depicting objects of the same
category. This is a challenging task due to large intra-class variation,
changes in viewpoint and background clutter. We present the following three
principal contributions. First, we develop a convolutional neural network
architecture for semantic alignment that is trainable in an end-to-end manner
from weak image-level supervision in the form of matching image pairs. The
outcome is that parameters are learnt from rich appearance variation present in
different but semantically related images without the need for tedious manual
annotation of correspondences at training time. Second, the main component of
this architecture is a differentiable soft inlier scoring module, inspired by
the RANSAC inlier scoring procedure, that computes the quality of the alignment
based on only geometrically consistent correspondences, thereby reducing the
effect of background clutter. Third, we demonstrate that the proposed approach
achieves state-of-the-art performance on multiple standard benchmarks for
semantic alignment.
Comment: In 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018).
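The following is a hedged sketch of a differentiable soft inlier count in the spirit described above: correspondences whose matched locations agree with an estimated affine transform are counted softly, and the sum serves as an alignment score. The Gaussian tolerance and grid handling are illustrative assumptions rather than the paper's exact module.

```python
# Hedged sketch of a differentiable soft inlier count: given a dense correlation
# map and an estimated transform, correlation mass at geometrically consistent
# locations is summed, giving a score that can be maximised with only
# image-level supervision. Details are illustrative assumptions.
import torch
import torch.nn.functional as F

def soft_inlier_score(corr, theta, tolerance=0.1):
    """corr:  (B, H*W, H, W) correlation map between images A and B
       theta: (B, 2, 3) affine transforms mapping A-coordinates into image B
    """
    b, hw, h, w = corr.shape
    # Regular grid of A-coordinates in [-1, 1], mapped into image B by theta.
    grid = F.affine_grid(theta, size=(b, 1, h, w), align_corners=False)  # (B,H,W,2)

    # Coordinates of every candidate match location in image B, also in [-1, 1].
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    cand = torch.stack([xs, ys], dim=-1).view(hw, 2)                     # (HW, 2)

    # Soft geometric-consistency mask: close to 1 where a candidate match lies
    # near the transform's prediction, decaying smoothly with distance.
    pred = grid.view(b, 1, h * w, 2)            # predicted B-location per A-cell
    dist = (cand.view(1, hw, 1, 2) - pred).norm(dim=-1)                  # (B,HW,HW)
    mask = torch.exp(-(dist / tolerance) ** 2).view(b, hw, h, w)

    # Inlier score: correlation mass that is geometrically consistent.
    return (corr * mask).sum(dim=(1, 2, 3))

# Example with identity transforms on a random correlation map.
corr = torch.rand(2, 15 * 15, 15, 15)
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]] * 2)
print(soft_inlier_score(corr, theta).shape)  # torch.Size([2])
```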
Pairwise Quantization
We consider the task of lossy compression of high-dimensional vectors through
quantization. We propose an approach that learns quantization parameters by
minimizing the distortion of scalar products and squared distances between
pairs of points. This is in contrast to previous works that obtain these
parameters through the minimization of the reconstruction error of individual
points. The proposed approach proceeds by finding a linear transformation of
the data that effectively reduces the minimization of the pairwise distortions
to the minimization of individual reconstruction errors. After such
transformation, any of the previously-proposed quantization approaches can be
used. Despite the simplicity of this transformation, the experiments
demonstrate that it achieves considerable reduction of the pairwise distortions
compared to applying quantization directly to the untransformed data.
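To make the recipe concrete, the sketch below measures the distortion of pairwise squared distances under quantization, once on the raw vectors and once after a linear pre-transformation. A toy product quantizer and a random rotation stand in for the off-the-shelf quantizer and for the transformation the paper actually derives.

```python
# Hedged sketch of the evaluation idea: quantize vectors and measure distortion
# of pairwise squared distances rather than per-point reconstruction error.
# The learned linear transformation from the paper is NOT reproduced here; a
# random rotation is used purely as a placeholder to show where it plugs in.
import numpy as np
from sklearn.cluster import KMeans

def product_quantize(x, n_subspaces=4, n_codewords=32, seed=0):
    """Toy product quantizer: an independent k-means codebook per subvector;
    returns the reconstructed (decompressed) vectors."""
    recon = []
    for part in np.array_split(x, n_subspaces, axis=1):
        km = KMeans(n_clusters=n_codewords, n_init=4, random_state=seed).fit(part)
        recon.append(km.cluster_centers_[km.predict(part)])
    return np.hstack(recon)

def pairwise_sqdist_distortion(x, x_hat, n_pairs=2000, seed=0):
    """Mean absolute error of squared distances over random point pairs."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(x), n_pairs)
    j = rng.integers(0, len(x), n_pairs)
    d_true = ((x[i] - x[j]) ** 2).sum(axis=1)
    d_coded = ((x_hat[i] - x_hat[j]) ** 2).sum(axis=1)
    return np.abs(d_true - d_coded).mean()

rng = np.random.default_rng(0)
data = rng.standard_normal((5000, 32)) @ np.diag(np.linspace(3.0, 0.2, 32))

# Baseline: quantize the raw vectors directly.
baseline = pairwise_sqdist_distortion(data, product_quantize(data))

# Recipe from the abstract: apply a linear transform, quantize in that space,
# and map the reconstructions back before comparing pairwise distances.
rotation, _ = np.linalg.qr(rng.standard_normal((32, 32)))
coded = product_quantize(data @ rotation)
transformed = pairwise_sqdist_distortion(data, coded @ rotation.T)
print(f"pairwise distortion  raw: {baseline:.3f}  transformed: {transformed:.3f}")
```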
Clinical trial adaptation by matching evidence in complementary patient sub-groups of auxiliary blinding questionnaire responses
Clinical trial adaptation refers to any adjustment of the trial protocol after the onset of the trial. Such adjustment may take on various forms, including a change in the dose of administered medicines, the frequency of administering an intervention, the number of trial participants, or the duration of the trial, to name just some possibilities. The main goal is to make the process of introducing new medical interventions to patients more efficient, either by reducing the cost or the time associated with evaluating their safety and efficacy. The principal challenge, which is an outstanding research problem, is to be found in the question of how adaptation should be performed so as to minimize the chance of distorting the outcome of the trial. In this paper we propose a novel method for achieving this. Unlike most of the previously published work, our approach focuses on trial adaptation by sample size adjustment, i.e. by reducing the number of trial participants in a statistically informed manner. We adopt a stratification framework recently proposed for the analysis of trial outcomes in the presence of imperfect blinding, based on the administration of a generic auxiliary questionnaire that allows the participants to express their belief concerning the assigned intervention (treatment or control). We show that these data, together with the primary measured variables, can be used to make the probabilistically optimal choice of the particular sub-group a participant should be removed from if trial size reduction is desired. Extensive experiments on a series of simulated trials are used to illustrate the effectiveness of our method.
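Purely as an illustration of the setup, and emphatically not the paper's actual criterion, the sketch below stratifies simulated participants by assigned arm crossed with their blinding-questionnaire response, and picks the sub-group whose single-participant removal perturbs a naive treatment-effect estimate the least.

```python
# Illustrative sketch (NOT the paper's criterion): participants are stratified
# by assigned intervention crossed with their auxiliary blinding-questionnaire
# response; when the trial must shrink, a participant is removed from the
# sub-group whose removal least perturbs a simple treatment-effect estimate.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
trial = pd.DataFrame({
    "arm": rng.choice(["treatment", "control"], n),      # assigned intervention
    "belief": rng.choice(["thinks_treatment", "thinks_control", "unsure"], n),
    "outcome": rng.normal(0.0, 1.0, n),                   # primary measured variable
})

def effect_estimate(df):
    """Naive treatment-effect estimate: difference of arm means."""
    return df.loc[df.arm == "treatment", "outcome"].mean() \
         - df.loc[df.arm == "control", "outcome"].mean()

def best_subgroup_to_shrink(df):
    """Pick the (arm, belief) sub-group whose single-participant removal
    changes the overall effect estimate the least (illustrative criterion)."""
    baseline = effect_estimate(df)
    scores = {}
    for (arm, belief), grp in df.groupby(["arm", "belief"]):
        if len(grp) < 2:
            continue  # never empty a sub-group entirely
        drop_one = df.drop(index=grp.index[0])
        scores[(arm, belief)] = abs(effect_estimate(drop_one) - baseline)
    return min(scores, key=scores.get)

print(best_subgroup_to_shrink(trial))
```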
COVID-19 and science communication: the recording and reporting of disease mortality
The ongoing COVID-19 pandemic has brought science to the fore of public discourse and, considering the complexity of the issues involved, with it also the challenge of effective and informative science communication. This is a particularly contentious topic, in that it is highly emotional in and of itself; sits at the nexus of the decision-making process regarding the handling of the pandemic, which has effected lockdowns, social behaviour measures, business closures, and others; and concerns the recording and reporting of disease mortality. To clarify a point that has caused much controversy and anger in the public debate, the first part of the present article discusses the very fundamentals underlying the issue of causative attribution with regard to mortality, lays out the foundations of the statistical means of mortality estimation, and concretizes these by analysing the recording and reporting practices adopted in England and their widespread misrepresentations. The second part of the article is empirical in nature. I present data and an analysis of how COVID-19 mortality has been reported in the mainstream media in the UK and the USA, including a comparative analysis both across the two countries and across different media outlets. The findings clearly demonstrate a uniform and worrying lack of understanding of the relevant technical subject matter by the media in both countries. Of particular interest is the finding that, with a remarkable regularity (ρ>0.998), the greater the number of articles a media outlet has published on COVID-19 mortality, the greater the proportion of its articles that misrepresented the disease mortality figures.
Crime and punishment: a rethink
Incarceration remains the foremost form of sentence for serious crimes in Western democracies. At the same time, the management of prisons and of the prison population has become a major real-world challenge, with growing concerns about overcrowding, the offenders' well-being, and the failure to achieve the distal desideratum of reduced criminality, all of which have a moral dimension. In no small part motivated by these practical problems, the focus of the present article is on the ethical framework that we use in thinking about and administering criminal justice. I start with an analysis of imprisonment and its permissibility as a punitive tool of justice. In particular, I present a novel argument against punitive imprisonment, showing it to fall short in meeting two key criteria of just punishment, namely (i) that the appropriate individual is being punished, and (ii) that the punishment can be adequately moderated to reflect the seriousness of the crime. The principles that this analysis brings to the fore, rooted in the sentient experience firstly of victims, but also of offenders and of society at large, then lead me to elucidate a broader framework of jurisprudence that I apply more widely. Hence, while rejecting punitive imprisonment, I use its identified shortcomings to argue for the reinstitution of forms of punishment that are, incongruently, presently not seen as permissible, such as corporal punishment and punishments dismissed on the basis of being seen as humiliating. I also present a novel view of capital punishment, which, in contradiction to its name, I reject for punitive aims, but which I argue is permissible on compassionate grounds.
Whole slide pathology image patch based deep classification: an investigation of the effects of the latent autoencoder representation and the loss function form
The analysis of whole-slide pathological images is a major area of deep learning applications in medicine. The automation of disease identification, prevention, diagnosis, and treatment selection from whole-slide images (WSIs) has seen many advances in the last decade due to the progress made in the areas of computer vision and machine learning. The focus of this work is on patch-level to slide-level analysis of WSIs, which is popular in the existing literature. In particular, we investigate the nature of the information content present in images on the local level of individual patches using autoencoding. Driven by our findings at this stage, which raise questions about the use of autoencoders, we next address the challenge posed by what we argue is an overly coarse classification of patches as tumorous and non-tumorous, which leads to the loss of important information. We show that task-specific modifications of the loss function, which take into account the content of individual patches in a more nuanced manner, facilitate a dramatic reduction in the false negative classification rate.
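One plausible instance of such a loss-function modification, offered here only as an assumption about what "more nuanced" could mean rather than the paper's exact formulation, is to supervise each patch with its tumour-area fraction as a soft target instead of a hard binary label.

```python
# Hedged sketch: instead of a hard tumorous / non-tumorous label per patch,
# each patch is supervised with the fraction of its area occupied by tumour,
# so patches with little tumour content contribute in a more nuanced way.
# Using tumour-area fraction as a soft target is an illustrative assumption.
import torch
import torch.nn.functional as F

def soft_patch_loss(logits, tumour_fraction):
    """logits:           (B,) raw patch scores from the classifier
       tumour_fraction:  (B,) fraction of each patch covered by tumour, in [0, 1],
                         used directly as a soft target."""
    return F.binary_cross_entropy_with_logits(logits, tumour_fraction)

# Example: a patch that is 5% tumour is neither a clean negative nor a clean
# positive; the soft target keeps it from being treated as purely non-tumorous.
logits = torch.tensor([2.0, -1.5, 0.3])
fractions = torch.tensor([0.95, 0.05, 0.40])
print(soft_patch_loss(logits, fractions))
```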
Weighted linear fusion of multimodal data - a reasonable baseline?
The ever-increasing demand for reliable inference capable of handling unpredictable challenges of practical application in the real world has made research on information fusion of major importance. There are few fields of application and research where this is more evident than in the sphere of multimedia, which by its very nature inherently involves the use of multiple modalities, be it for learning, prediction, or human-computer interaction. In the development of the most common type, score-level fusion algorithms, it is virtually without exception desirable to have as a reference starting point a simple and universally sound baseline benchmark to which newly developed approaches can be compared. One of the most pervasively used methods is that of weighted linear fusion. It has cemented itself as the default off-the-shelf baseline owing to its simplicity of implementation, interpretability, and surprisingly competitive performance across a wide range of application domains and information source types. In this paper I argue that despite this track record, weighted linear fusion is not a good baseline, on the grounds that there is an equally simple and interpretable alternative, namely quadratic mean-based fusion, which is theoretically more principled and more successful in practice. I argue the former from first principles and demonstrate the latter using a series of experiments on a diverse set of fusion problems: computer vision-based object recognition, arrhythmia detection, and fatality prediction in motor vehicle accidents.
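For concreteness, the sketch below contrasts the two score-level fusion rules discussed in the abstract on a common set of normalized per-modality scores; the particular weights and scores are illustrative.

```python
# Minimal sketch contrasting weighted linear (arithmetic-mean) fusion with
# quadratic-mean (root-mean-square) fusion of per-modality scores. Both assume
# scores already normalised to a common range; weights are illustrative.
import numpy as np

def weighted_linear_fusion(scores, weights):
    """Weighted arithmetic mean of per-modality scores."""
    w = np.asarray(weights) / np.sum(weights)
    return np.sum(w * np.asarray(scores), axis=-1)

def quadratic_mean_fusion(scores, weights):
    """Weighted quadratic (root-mean-square) fusion: a single strongly
    responding modality carries more influence than in the linear rule."""
    w = np.asarray(weights) / np.sum(weights)
    return np.sqrt(np.sum(w * np.asarray(scores) ** 2, axis=-1))

scores = np.array([[0.9, 0.2, 0.4],    # sample 1: one confident modality
                   [0.5, 0.5, 0.5]])   # sample 2: uniformly lukewarm scores
weights = [1.0, 1.0, 1.0]
print(weighted_linear_fusion(scores, weights))   # [0.5, 0.5]
print(quadratic_mean_fusion(scores, weights))    # [~0.58, 0.5]
```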