175 research outputs found

    Look, Listen and Learn

    We consider the question: what can be learnt by looking at and listening to a large number of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video itself -- the correspondence between the visual and the audio streams, and we introduce a novel "Audio-Visual Correspondence" learning task that makes use of this. Training visual and audio networks from scratch, without any supervision other than the raw unconstrained videos themselves, is shown to successfully solve this task and, more interestingly, to result in good visual and audio representations. These features set the new state-of-the-art on two sound classification benchmarks, and perform on par with the state-of-the-art self-supervised approaches on ImageNet classification. We also demonstrate that the network is able to localize objects in both modalities, as well as perform fine-grained recognition tasks. (Appears in: IEEE International Conference on Computer Vision (ICCV) 2017.)
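    A minimal sketch (in PyTorch, not the authors' released code) of the audio-visual correspondence set-up described above: an image subnetwork and an audio subnetwork embed a video frame and a spectrogram, and a small head classifies whether the pair corresponds. All layer sizes and input shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AVCNet(nn.Module):
    """Two-stream network for the audio-visual correspondence (AVC) task."""
    def __init__(self, embed_dim=128):
        super().__init__()
        # Vision branch: RGB frame -> embedding (placeholder conv stack).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim),
        )
        # Audio branch: 1-channel log-spectrogram -> embedding.
        self.audio = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim),
        )
        # Fusion head: does the (frame, audio) pair correspond (1) or not (0)?
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, 2)
        )

    def forward(self, frame, spectrogram):
        v = self.vision(frame)
        a = self.audio(spectrogram)
        return self.head(torch.cat([v, a], dim=1))

# Positive pairs come from the same video/time, negatives pair a frame with
# audio from a different video; training uses plain cross-entropy.
model = AVCNet()
frames = torch.randn(8, 3, 224, 224)
specs = torch.randn(8, 1, 257, 200)     # e.g. ~1 s of log-spectrogram
labels = torch.randint(0, 2, (8,))      # 1 = corresponding pair
loss = nn.CrossEntropyLoss()(model(frames, specs), labels)
```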

    End-to-end weakly-supervised semantic alignment

    We tackle the task of semantic alignment, where the goal is to compute dense semantic correspondences aligning two images depicting objects of the same category. This is a challenging task due to large intra-class variation, changes in viewpoint, and background clutter. We present the following three principal contributions. First, we develop a convolutional neural network architecture for semantic alignment that is trainable in an end-to-end manner from weak image-level supervision in the form of matching image pairs. The outcome is that parameters are learnt from the rich appearance variation present in different but semantically related images, without the need for tedious manual annotation of correspondences at training time. Second, the main component of this architecture is a differentiable soft inlier scoring module, inspired by the RANSAC inlier scoring procedure, that computes the quality of the alignment based only on geometrically consistent correspondences, thereby reducing the effect of background clutter. Third, we demonstrate that the proposed approach achieves state-of-the-art performance on multiple standard benchmarks for semantic alignment. (In 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018).)
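    The soft inlier scoring idea can be illustrated with a short sketch. The version below is my own simplification, not the authors' implementation: correlation scores between source and target positions contribute to the alignment score only where they agree with a candidate affine transform, so clutter matches are down-weighted. The 16x16 grid, the Gaussian soft mask, and the radius are assumptions.

```python
import torch
import torch.nn.functional as F

def soft_inlier_score(correlation, theta, size=16, radius=1.5):
    """correlation: (B, H*W, H, W) source-to-target match scores.
    theta: (B, 2, 3) affine transforms in normalised [-1, 1] coordinates.
    Returns one scalar alignment score per batch element."""
    B = correlation.shape[0]
    H = W = size
    # Where the transform sends each source cell, in target coordinates.
    grid = F.affine_grid(theta, (B, 1, H, W), align_corners=False)  # (B,H,W,2)
    # Coordinates of every target cell, in the same normalised frame.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    target = torch.stack([xs, ys], dim=-1).view(1, 1, H * W, 2)
    pred = grid.view(B, H * W, 1, 2)
    # Soft inlier mask: near 1 close to the predicted location, decaying away.
    dist2 = ((pred - target) ** 2).sum(-1)                     # (B, HW, HW)
    mask = torch.exp(-dist2 / (2 * (radius * 2.0 / size) ** 2))
    corr = correlation.view(B, H * W, H * W)
    return (corr * mask).sum(dim=(1, 2))

# Example with random scores and identity transforms.
corr = torch.rand(2, 16 * 16, 16, 16)
theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]]).repeat(2, 1, 1)
print(soft_inlier_score(corr, theta))
```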

    Convolutional neural network architecture for geometric matching

    We address the problem of determining correspondences between two images in agreement with a geometric model, such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard steps of feature extraction, matching, and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end. Second, we demonstrate that the network parameters can be trained from synthetically generated imagery without the need for manual annotation, and that our matching layer significantly increases generalization to images never seen before. Finally, we show that the same model can perform both instance-level and category-level matching, giving state-of-the-art results on the challenging Proposal Flow dataset. (In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).)
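    As a rough illustration of the three components named above, the PyTorch sketch below wires a small feature extractor, a normalised correlation layer matching all feature positions of the two images, and a regressor that outputs affine parameters. The backbone, feature resolution, and regressor sizes are placeholder assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometricMatcher(nn.Module):
    def __init__(self, n_params=6, feat_size=15):
        super().__init__()
        # Feature extraction (a pretrained backbone in practice; tiny stub here).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(feat_size)
        # Regressor mapping the correlation volume to transform parameters.
        self.regressor = nn.Sequential(
            nn.Conv2d(feat_size * feat_size, 128, 7), nn.ReLU(),
            nn.Conv2d(128, 64, 5), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(n_params),
        )

    @staticmethod
    def correlate(fa, fb):
        # L2-normalise features and correlate every position of A with every
        # position of B, giving a dense matching volume.
        b, c, h, w = fa.shape
        fa = F.normalize(fa, dim=1).reshape(b, c, h * w)
        fb = F.normalize(fb, dim=1).reshape(b, c, h * w)
        corr = torch.bmm(fb.transpose(1, 2), fa)   # (b, h*w, h*w)
        return corr.view(b, h * w, h, w)

    def forward(self, img_a, img_b):
        fa = self.pool(self.features(img_a))
        fb = self.pool(self.features(img_b))
        corr = F.relu(self.correlate(fa, fb))
        return self.regressor(corr)                # predicted affine parameters

theta = GeometricMatcher()(torch.randn(2, 3, 240, 240), torch.randn(2, 3, 240, 240))
```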

    Pairwise Quantization

    We consider the task of lossy compression of high-dimensional vectors through quantization. We propose an approach that learns quantization parameters by minimizing the distortion of scalar products and squared distances between pairs of points. This is in contrast to previous works that obtain these parameters through the minimization of the reconstruction error of individual points. The proposed approach proceeds by finding a linear transformation of the data that effectively reduces the minimization of the pairwise distortions to the minimization of individual reconstruction errors. After such a transformation, any of the previously proposed quantization approaches can be used. Despite the simplicity of this transformation, experiments demonstrate that it achieves a considerable reduction of the pairwise distortions compared to applying quantization directly to the untransformed data.
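    A hedged sketch of how such a reduction could look under one reading of the abstract: for scalar-product distortion, the expected pairwise error is approximately a reconstruction error measured after a linear transform by the square root of the data's second-moment matrix, so any standard quantizer (plain k-means below, for brevity) can then be applied in the transformed space. The specific transform, quantizer, and data are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.linalg import sqrtm
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Anisotropic synthetic database vectors.
X = rng.normal(size=(5000, 32)) @ np.diag(np.linspace(0.2, 3.0, 32))

# Linear transform: square root of the second-moment matrix (assumption).
S = X.T @ X / len(X)
T = np.real(sqrtm(S))
Z = X @ T                                        # transformed data

# Quantise in the transformed space with any off-the-shelf method.
km = KMeans(n_clusters=256, n_init=4, random_state=0).fit(Z)
Z_hat = km.cluster_centers_[km.predict(Z)]
X_hat = Z_hat @ np.linalg.inv(T)                 # map reconstructions back

# Compare pairwise scalar-product distortion against quantising X directly.
km0 = KMeans(n_clusters=256, n_init=4, random_state=0).fit(X)
X_hat0 = km0.cluster_centers_[km0.predict(X)]
idx = rng.integers(0, len(X), size=(2000, 2))
true = np.einsum("ij,ij->i", X[idx[:, 0]], X[idx[:, 1]])
d_pair = np.mean((true - np.einsum("ij,ij->i", X_hat[idx[:, 0]], X_hat[idx[:, 1]])) ** 2)
d_base = np.mean((true - np.einsum("ij,ij->i", X_hat0[idx[:, 0]], X_hat0[idx[:, 1]])) ** 2)
print(f"pairwise distortion: transformed={d_pair:.3f}, direct={d_base:.3f}")
```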

    Clinical trial adaptation by matching evidence in complementary patient sub-groups of auxiliary blinding questionnaire responses

    Clinical trial adaptation refers to any adjustment of the trial protocol after the onset of the trial. Such adjustment may take various forms, including a change in the dose of administered medicines, the frequency of administering an intervention, the number of trial participants, or the duration of the trial, to name just some possibilities. The main goal is to make the process of introducing new medical interventions to patients more efficient, either by reducing the cost or the time associated with evaluating their safety and efficacy. The principal challenge, which remains an outstanding research problem, lies in the question of how adaptation should be performed so as to minimize the chance of distorting the outcome of the trial. In this paper we propose a novel method for achieving this. Unlike most of the previously published work, our approach focuses on trial adaptation by sample size adjustment, i.e. by reducing the number of trial participants in a statistically informed manner. We adopt a stratification framework recently proposed for the analysis of trial outcomes in the presence of imperfect blinding, based on the administration of a generic auxiliary questionnaire that allows the participants to express their belief concerning the assigned intervention (treatment or control). We show that these data, together with the primary measured variables, can be used to make the probabilistically optimal choice of the particular sub-group a participant should be removed from if trial size reduction is desired. Extensive experiments on a series of simulated trials illustrate the effectiveness of our method.
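    The following is a toy illustration only, not the paper's selection criterion: participants are stratified by assigned arm and questionnaire belief, and the stratum whose single-participant reduction perturbs a belief-stratified effect estimate the least is flagged as a candidate for size reduction. The data, strata, and selection rule are all assumptions made for the sake of the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "arm": rng.choice(["treatment", "control"], n),
    "belief": rng.choice(["thinks_treatment", "thinks_control", "unsure"], n),
    "outcome": rng.normal(size=n),
})

def stratified_effect(d):
    # Belief-stratum weighted difference of outcome means between arms.
    effects, weights = [], []
    for _, g in d.groupby("belief"):
        t = g[g.arm == "treatment"].outcome
        c = g[g.arm == "control"].outcome
        if len(t) and len(c):
            effects.append(t.mean() - c.mean())
            weights.append(len(g))
    return np.average(effects, weights=weights)

baseline = stratified_effect(df)
perturbation = {}
for key, g in df.groupby(["arm", "belief"]):
    # Average change in the estimate when one participant of this stratum is
    # dropped (only the first 20 candidates, to keep the example quick).
    deltas = [abs(stratified_effect(df.drop(i)) - baseline) for i in g.index[:20]]
    perturbation[key] = np.mean(deltas)

best = min(perturbation, key=perturbation.get)
print("least disruptive stratum to shrink:", best)
```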

    Crime and punishment : a rethink

    Incarceration remains the foremost form of sentence for serious crimes in Western democracies. At the same time, the management of prisons and of the prison population has become a major real-world challenge, with growing concerns about overcrowding, the offenders’ well-being, and the failure to achieve the distal desideratum of reduced criminality, all of which have a moral dimension. In no small part motivated by these practical problems, the focus of the present article is on the ethical framework that we use in thinking about and administering criminal justice. I start with an analysis of imprisonment and its permissibility as a punitive tool of justice. In particular, I present a novel argument against punitive imprisonment, showing it to fall short of meeting two key criteria of just punishment, namely (i) that the appropriate individual is being punished, and (ii) that the punishment can be adequately moderated to reflect the seriousness of the crime. The principles I argue for, which this analysis brings to the fore and which are rooted in the sentient experience firstly of victims, but also of the offenders and of society at large, then lead me to elucidate the broader framework of jurisprudence that I apply more widely. Hence, while rejecting punitive imprisonment, I use its identified shortcomings to argue for the reinstitution of forms of punishment that are, incongruently, presently not seen as permissible, such as corporal punishment and punishments dismissed on the basis of being seen as humiliating. I also present a novel view of capital punishment, which, in contradiction to its name, I reject for punitive aims but argue is permissible on compassionate grounds.

    COVID-19 and science communication : the recording and reporting of disease mortality

    The ongoing COVID-19 pandemic has brought science to the fore of public discourse and, considering the complexity of the issues involved, with it also the challenge of effective and informative science communication. This is a particularly contentious topic, in that it is highly emotional in and of itself; sits at the nexus of the decision-making process regarding the handling of the pandemic, which has effected lockdowns, social behaviour measures, business closures, and other interventions; and concerns the recording and reporting of disease mortality. To clarify a point that has caused much controversy and anger in the public debate, the first part of the present article discusses the fundamentals underlying the issue of causative attribution of mortality, lays out the foundations of the statistical means of mortality estimation, and concretizes these by analysing the recording and reporting practices adopted in England and their widespread misrepresentations. The second part of the article is empirical in nature. I present data and an analysis of how COVID-19 mortality has been reported in the mainstream media in the UK and the USA, including a comparative analysis both across the two countries and across different media outlets. The findings clearly demonstrate a uniform and worrying lack of understanding of the relevant technical subject matter by the media in both countries. Of particular interest is the finding that, with remarkable regularity (ρ>0.998), the greater the number of articles a media outlet has published on COVID-19 mortality, the greater the proportion of its articles that misrepresented the disease mortality figures.
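    A minimal sketch of the kind of per-outlet correlation described in the final sentence, using placeholder numbers and assuming ρ denotes a rank correlation (the excerpt does not specify which coefficient was used):

```python
from scipy.stats import spearmanr

# Placeholder per-outlet figures, purely for illustration.
articles_per_outlet = [12, 35, 51, 78, 102, 149]
misrepresenting_fraction = [0.25, 0.40, 0.47, 0.55, 0.61, 0.70]

rho, p_value = spearmanr(articles_per_outlet, misrepresenting_fraction)
print(f"rank correlation rho={rho:.3f}, p={p_value:.3g}")
```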

    Weighted linear fusion of multimodal data - a reasonable baseline?

    The ever-increasing demand for reliable inference capable of handling the unpredictable challenges of practical application in the real world has made research on information fusion of major importance. There are few fields of application and research where this is more evident than in the sphere of multimedia, which by its very nature inherently involves the use of multiple modalities, be it for learning, prediction, or human-computer interaction. In the development of the most common type, score-level fusion algorithms, it is virtually without exception desirable to have as a reference starting point a simple and universally sound baseline benchmark against which newly developed approaches can be compared. One of the most pervasively used methods is that of weighted linear fusion. It has cemented itself as the default off-the-shelf baseline owing to its simplicity of implementation, its interpretability, and its surprisingly competitive performance across a wide range of application domains and information source types. In this paper I argue that despite this track record, weighted linear fusion is not a good baseline, on the grounds that there is an equally simple and interpretable alternative, namely quadratic mean-based fusion, which is theoretically more principled and more successful in practice. I argue the former from first principles and demonstrate the latter using a series of experiments on a diverse set of fusion problems: computer vision-based object recognition, arrhythmia detection, and fatality prediction in motor vehicle accidents.
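    The two baselines contrasted above can be written down in a few lines; the sketch below uses illustrative weights and scores and does not reproduce the paper's experimental protocol.

```python
import numpy as np

def weighted_linear_fusion(scores, weights):
    # Conventional baseline: weighted arithmetic mean of per-modality scores.
    w = np.asarray(weights) / np.sum(weights)
    return np.asarray(scores) @ w

def quadratic_mean_fusion(scores, weights):
    # Alternative baseline: weighted quadratic mean (root mean square), which
    # emphasises confident (large) scores more than the linear mean does.
    w = np.asarray(weights) / np.sum(weights)
    return np.sqrt((np.asarray(scores) ** 2) @ w)

# Scores from three modalities for four test samples, all normalised to [0, 1].
scores = np.array([
    [0.90, 0.20, 0.40],
    [0.55, 0.60, 0.50],
    [0.10, 0.15, 0.95],
    [0.70, 0.75, 0.80],
])
weights = [0.5, 0.3, 0.2]
print("linear:   ", weighted_linear_fusion(scores, weights))
print("quadratic:", quadratic_mean_fusion(scores, weights))
```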

    Whole slide pathology image patch based deep classification : an investigation of the effects of the latent autoencoder representation and the loss function form

    The analysis of whole-slide pathological images is a major area of deep learning applications in medicine. The automation of disease identification, prevention, diagnosis, and treatment selection from whole-slide images (WSIs) has seen many advances in the last decade due to the progress made in the areas of computer vision and machine learning. The focus of this work is on the patch-level to slide-level analysis of WSIs, an approach popular in the existing literature. In particular, we investigate the nature of the information content present in images at the local level of individual patches using autoencoding. Driven by our findings at this stage, which raise questions about the use of autoencoders, we next address the challenge posed by what we argue is an overly coarse classification of patches as tumorous and non-tumorous, which leads to the loss of important information. We show that task-specific modifications of the loss function, which take into account the content of individual patches in a more nuanced manner, facilitate a dramatic reduction in the false negative classification rate.
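    One generic way to make the patch-level loss reflect the asymmetry between error types is sketched below: missed tumorous patches (false negatives) are penalised more heavily via class weighting, with an optional per-patch weight standing in for content-dependent terms. This illustrates the general idea only and is not the specific loss forms investigated in the paper.

```python
import torch
import torch.nn.functional as F

def weighted_patch_loss(logits, labels, patch_weights, fn_penalty=4.0):
    """logits: (N, 2) patch scores; labels: (N,) with 1 = tumorous.
    patch_weights: (N,) content-dependent weights (an assumption here);
    fn_penalty scales the cost of missing a tumorous patch."""
    class_weights = torch.tensor([1.0, fn_penalty])   # [non-tumour, tumour]
    per_patch = F.cross_entropy(logits, labels,
                                weight=class_weights, reduction="none")
    return (per_patch * patch_weights).mean()

# Toy usage on random patch logits.
logits = torch.randn(16, 2)
labels = torch.randint(0, 2, (16,))
patch_weights = torch.ones(16)      # e.g. derived from patch content
print(weighted_patch_loss(logits, labels, patch_weights))
```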