18 research outputs found

    Self Paced Deep Learning for Weakly Supervised Object Detection

    In a weakly-supervised scenario, object detectors need to be trained using image-level annotation alone. Since bounding-box-level ground truth is not available, most of the solutions proposed so far are based on an iterative Multiple Instance Learning framework in which the current classifier is used to select the highest-confidence boxes in each image, which are then treated as pseudo-ground truth in the next training iteration. However, the errors of an immature classifier can make the process drift, usually introducing many false positives into the training dataset. To alleviate this problem, we propose in this paper a training protocol based on the self-paced learning paradigm. The main idea is to iteratively select a subset of images and boxes that are the most reliable, and use them for training. While similar strategies have been adopted in the past few years for SVMs and other classifiers, we are the first to show that a self-paced approach can be used with deep-network-based classifiers in an end-to-end training pipeline. The method we propose is built on the fully-supervised Fast-RCNN architecture and can be applied to similar architectures which represent the input image as a bag of boxes. We show state-of-the-art results on Pascal VOC 2007, Pascal VOC 2010 and ILSVRC 2013. On ILSVRC 2013, our results based on a low-capacity AlexNet network outperform even those weakly-supervised approaches which are based on much higher-capacity networks. Comment: To appear at IEEE Transactions on PAM
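The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_pseudo_ground_truth`, the `keep_fraction` schedule and the toy scores are assumptions made for illustration, and the sketch assumes every image has at least one candidate box.

```python
def select_pseudo_ground_truth(scores_per_image, keep_fraction):
    """Pick the top-scoring box per image, then keep the most confident images.

    scores_per_image: dict mapping image id -> list of box confidence scores
                      (hypothetical classifier outputs, one per candidate box).
    keep_fraction:    share of images kept in this round (0 < value <= 1).
    Returns a dict mapping each kept image id -> index of its selected box.
    """
    # Step 1: the highest-confidence box in each image is its candidate.
    candidates = {}
    for image_id, scores in scores_per_image.items():
        best_index = max(range(len(scores)), key=lambda i: scores[i])
        candidates[image_id] = (best_index, scores[best_index])

    # Step 2: rank images by candidate confidence and keep only the top share,
    # so that early rounds train on the most reliable pseudo-ground truth.
    ranked = sorted(candidates, key=lambda k: candidates[k][1], reverse=True)
    n_keep = max(1, round(keep_fraction * len(ranked)))
    return {image_id: candidates[image_id][0] for image_id in ranked[:n_keep]}


# Toy scores (illustrative only): three images with two or three boxes each.
scores = {"img_a": [0.2, 0.9, 0.4], "img_b": [0.3, 0.1], "img_c": [0.6, 0.7]}
easy_round = select_pseudo_ground_truth(scores, keep_fraction=0.34)
later_round = select_pseudo_ground_truth(scores, keep_fraction=1.0)
```

In a self-paced schedule, `keep_fraction` would typically grow from round to round, with the classifier retrained on the selected boxes in between; the retraining itself is outside the scope of this sketch.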

    MR Image Based Approach for Metal Artifact Reduction in X-Ray CT

    For decades, computed tomography (CT) images have been widely used to obtain valuable anatomical information. Metallic implants such as dental fillings cause severe streaking artifacts which significantly degrade the quality of CT images. In this paper, we propose a new method for metal-artifact reduction using complementary magnetic resonance (MR) images. The method exploits the possibilities which arise from the use of emergent trimodality systems. The proposed algorithm corrects reconstructed CT images. The projection data affected by dental fillings are detected, and the missing projections are replaced with data obtained from a corresponding MR image. A simulation study was conducted in order to compare the reconstructed images with images reconstructed through linear interpolation, which is a common metal-artifact reduction technique. The results show that the proposed method is successful in reducing severe metal artifacts without introducing a significant amount of secondary artifacts.
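The linear-interpolation baseline mentioned in this abstract can be sketched for a single projection row as follows. This illustrates only the comparison technique, not the proposed MR-based replacement; the function name `interpolate_metal_trace`, the boolean mask representation and the toy values are assumptions.

```python
def interpolate_metal_trace(projection, metal_mask):
    """Fill metal-affected detector bins by linear interpolation.

    projection: list of measured values along one projection row.
    metal_mask: list of booleans, True where the bin is corrupted by metal.
    Returns a new list with the masked bins replaced.
    """
    filled = list(projection)
    valid = [i for i in range(len(filled)) if not metal_mask[i]]
    if not valid:
        raise ValueError("no unaffected bins to interpolate from")

    for i in range(len(filled)):
        if not metal_mask[i]:
            continue
        # Nearest unaffected bins on either side of the corrupted bin.
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:        # corrupted run touches the left edge
            filled[i] = projection[right]
        elif right is None:     # corrupted run touches the right edge
            filled[i] = projection[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = (1 - weight) * projection[left] + weight * projection[right]
    return filled


# Toy row (illustrative only): bins 2 and 3 are corrupted by metal.
row = [1.0, 2.0, 50.0, 60.0, 5.0, 6.0]
mask = [False, False, True, True, False, False]
repaired = interpolate_metal_trace(row, mask)
```

In the method the abstract proposes, the masked bins would instead be filled with data derived from the corresponding MR image; how that data is obtained is not specified in the abstract and is not modelled here.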

    Evaluating the Role of Content in Subjective Video Quality Assessment

    Video quality as perceived by human observers is the ground truth in Video Quality Assessment (VQA). It depends on many variables, one of them being the content of the video that is being evaluated. Despite the evidence that content has an impact on the quality score a sequence receives from human evaluators, currently available VQA databases mostly comprise sequences which fail to take this into account. In this paper, we aim to identify and analyze differences between human cognitive, affective, and conative responses to a set of videos commonly used for VQA and a set of videos specifically chosen to include content which might affect the judgment of evaluators when assessing perceived video quality. Our findings indicate that considerable differences exist between the two sets on selected factors, which leads us to conclude that videos featuring a different type of content than the currently employed ones might be more appropriate for VQA.

    Advancing marine conservation in European and contiguous seas with the MarCons Action

    Cumulative human impacts have led to the degradation of marine ecosystems and the decline of biodiversity in the European and contiguous seas. Effective conservation measures are urgently needed to reverse these trends. Conservation must entail societal choices, underpinned by human values and worldviews that differ between the countries bordering these seas. Social, economic and political heterogeneity adds to the challenge of balancing conservation with sustainable use of the seas. Comprehensive macro-regional coordination is needed to ensure effective conservation of marine ecosystems and biodiversity of this region. Under the European Union Horizon 2020 framework programme, the MarCons COST action aims to promote collaborative research to support marine management, conservation planning and policy development. This will be achieved by developing novel methods and tools to close knowledge gaps and advance marine conservation science. This action will provide support for the development of macro-regional and national policies through six key actions: to develop tools to analyse cumulative human impacts; to identify critical scientific and technical gaps in conservation efforts; to improve the resilience of the marine environment to global change and biological invasions; to develop frameworks for integrated conservation planning across terrestrial, freshwater, and marine environments; to coordinate marine conservation policy across national boundaries; and to identify effective governance approaches for marine protected area management. Achieving the objectives of these actions will facilitate the integration of marine conservation policy into macro-regional maritime spatial planning agendas for the European and contiguous seas, thereby offsetting the loss of biodiversity and ecosystem services in this region.

    Monitoring the Impact of Large Transport Infrastructure on Land Use and Environment Using Deep Learning and Satellite Imagery

    Large-scale infrastructure, such as the China–Europe Railway Express (CER-Express), which connects countries and regions across Asia and Europe, has a potentially profound effect on land use, as evidenced by changes in land cover along the railway. To ensure sustainable development of such infrastructure and appropriate land administration, effective ways to monitor and assess its impact need to be developed. Remote sensing based on publicly available satellite imagery represents an obvious choice. In the study presented here, we employ a state-of-the-art deep-learning-based approach to automatically detect different types of land cover based on multispectral Sentinel-2 imagery. We then use these data to study the changes in land use in two geopolitically diverse regions of interest (in Serbia and China, with and without CER-Express infrastructure) over the last three years. Our results show that the standard image-patch-based land cover classification approaches suffer a significant drop in performance in our target scenario, in which each pixel needs to be assigned a cover class, but they still validate the applicability of the proposed approach as a remote sensing tool to support the sustainable development of large infrastructure. We discuss in detail the technical limitations of the proposed approach and potential ways in which it can be improved.

    V. Crnojevic, “Mining Web Videos for Video Quality Assessment”

    Correlating estimates of objective measures related to the presence of different coding artifacts with the quality of video as perceived by human observers is a non-trivial task. There is no shortage of data to learn from, thanks to the Internet and websites such as YouTube™. There has, however, been little done in the research community to try to use such resources to advance our understanding of perceived video quality. The difficulty is that it is not easy to obtain the Mean Opinion Score (MOS), a standard measure of perceived video quality, for more than a handful of videos. The paper presents an approach to determining the quality of a relatively large number of videos obtained randomly from YouTube™. Several measures related to motion, saliency and coding artifacts are calculated for the frames of each video. Programmable graphics hardware is used to perform clustering: first, to create an artifacts-related signature of each video; then, to cluster the videos according to their signatures. To obtain an estimate of the video quality, MOS is obtained for representative videos, those closest to the cluster centers. This is then used as an estimate of the quality of all other videos in the cluster. Results based on 2,107 videos containing some 90,000,000 frames are presented in the paper.
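The final step of this abstract, rating one representative video per cluster and reusing its MOS for the rest of the cluster, can be sketched as below. The sketch assumes the clustering has already been done; the function name `propagate_mos`, the squared Euclidean distance, and all signatures, centers and ratings in the example are hypothetical.

```python
def propagate_mos(signatures, labels, centers, rate_video):
    """Assign each video the MOS of its cluster's representative.

    signatures: list of per-video feature tuples (assumed artifact signatures).
    labels:     cluster index of each video, aligned with `signatures`.
    centers:    list of cluster-center tuples, indexed by cluster.
    rate_video: callable returning the MOS for a video index; it stands in
                for a subjective test and is called once per cluster.
    Returns a list of estimated MOS values, one per video.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cluster_mos = {}
    for cluster_id, center in enumerate(centers):
        members = [i for i, label in enumerate(labels) if label == cluster_id]
        if not members:
            continue  # an empty cluster has no representative to rate
        # The representative is the member closest to the cluster center.
        representative = min(
            members, key=lambda i: squared_distance(signatures[i], center)
        )
        cluster_mos[cluster_id] = rate_video(representative)

    return [cluster_mos[label] for label in labels]


# Toy data (illustrative only): four videos in two clusters.
signatures = [(0.1, 0.2), (0.0, 0.0), (0.9, 1.0), (1.0, 1.0)]
labels = [0, 0, 1, 1]
centers = [(0.0, 0.05), (1.0, 1.0)]
ratings = {1: 2.5, 3: 4.0}  # made-up MOS for the two representatives
estimated = propagate_mos(signatures, labels, centers, lambda i: ratings[i])
```

Only the representatives need a subjective rating, which is what makes the approach practical for thousands of videos.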