20,726 research outputs found

    Discriminative Training of Deep Fully-connected Continuous CRF with Task-specific Loss

    Recent work on deep conditional random fields (CRF) has set new records on many vision tasks involving structured prediction. Here we propose a fully-connected deep continuous CRF model for both discrete and continuous labelling problems. We exemplify the usefulness of the proposed model on multi-class semantic labelling (discrete) and robust depth estimation (continuous) problems. In our framework, we model both the unary and the pairwise potential functions as deep convolutional neural networks (CNN), which are jointly learned in an end-to-end fashion. The proposed method retains the main advantage of continuously-valued CRFs, namely a closed-form solution for maximum a posteriori (MAP) inference. To better adapt to different tasks, instead of using the commonly employed maximum likelihood CRF parameter learning protocol, we propose task-specific loss functions for learning the CRF parameters. This enables direct optimization of the quality of the MAP estimates during the course of learning. Specifically, we optimize the multi-class classification loss for the semantic labelling task and Tukey's biweight loss for the robust depth estimation problem. Experimental results on the semantic labelling and robust depth estimation tasks demonstrate that the proposed method compares favorably against both baseline and state-of-the-art methods. In particular, we show that although the proposed deep CRF model is continuously valued, when equipped with a task-specific loss it achieves impressive results even on discrete labelling tasks.
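
    The robust loss named in the abstract above, Tukey's biweight, can be illustrated with a small sketch. This is the textbook scalar form, not the authors' training code: the function names are hypothetical, and the default tuning constant c = 4.685 is the conventional choice for unit-scale residuals, an assumption that the abstract does not state.

```python
def tukey_biweight_loss(residual: float, c: float = 4.685) -> float:
    """Tukey's biweight loss for one residual.

    Grows roughly quadratically near zero and saturates at c**2 / 6
    once |residual| exceeds c, so large outliers stop adding cost.
    The default c is a conventional value, assumed here for illustration.
    """
    if abs(residual) > c:
        return c * c / 6.0
    u = 1.0 - (residual / c) ** 2
    return (c * c / 6.0) * (1.0 - u ** 3)


def tukey_biweight_grad(residual: float, c: float = 4.685) -> float:
    """Derivative of the loss with respect to the residual.

    Equals residual * (1 - (residual / c)**2)**2 inside [-c, c]
    and zero outside, which is what makes the loss robust.
    """
    if abs(residual) > c:
        return 0.0
    u = 1.0 - (residual / c) ** 2
    return residual * u * u


if __name__ == "__main__":
    for r in (0.0, 1.0, 4.0, 20.0):
        print(r, tukey_biweight_loss(r), tukey_biweight_grad(r))
```

    Because the gradient vanishes for residuals larger than c, grossly wrong depth values contribute nothing to a gradient update, unlike with a squared-error loss.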

    Personalizing the design of computer‐based instruction to enhance learning

    This paper reports two studies designed to investigate the effect on learning outcomes of matching individuals’ preferred cognitive styles to computer‐based instructional (CBI) material. Study 1 considered the styles individually as Verbalizer, Imager, Wholist and Analytic. Study 2 considered the bi‐dimensional nature of cognitive styles in order to assess the full ramification of cognitive styles on learning: Analytic/Imager, Analytic/Verbalizer, Wholist/Imager and Wholist/Verbalizer. The mix of images and text, the nature of the text material, use of advance organizers and proximity of information to facilitate meaningful connections between various pieces of information were some of the considerations in the design of the CBI material. In a quasi‐experimental format, students’ cognitive styles were analysed by Cognitive Style Analysis (CSA) software. On the basis of the CSA result, the system defaulted students to either matched or mismatched CBI material by alternating between the two formats. The instructional material had a learning and a test phase. Learning outcome was tested on recall, labelling, explanation and problem‐solving tasks. Comparison of the matched and mismatched instruction did not indicate a significant difference between the groups, but the consistently better performance by the matched group suggests potential for further investigations where the limitations cited in this paper are eliminated. The result did indicate a significant difference between the four cognitive styles, with the Wholist/Verbalizer group performing better than all other cognitive styles. Analysing the difference between cognitive styles on individual test tasks indicated a significant difference on recall, labelling and explanation, suggesting that certain test tasks may suit certain cognitive styles.

    A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images

    Semantic segmentation is the pixel-wise labelling of an image. Since the problem is defined at the pixel level, determining image class labels alone is not sufficient; the labels must also be localised at the original image pixel resolution. Boosted by the extraordinary ability of convolutional neural networks (CNN) to create semantic, high-level and hierarchical image features, a large number of deep learning-based 2D semantic segmentation approaches have been proposed within the last decade. In this survey, we mainly focus on the recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images. We start with an analysis of the public image sets and leaderboards for 2D semantic segmentation, with an overview of the techniques employed in performance evaluation. In examining the evolution of the field, we chronologically categorise the approaches into three main periods, namely the pre- and early deep learning era, the fully convolutional era, and the post-FCN era. We technically analyse the solutions put forward in terms of solving the fundamental problems of the field, such as fine-grained localisation and scale invariance. Before drawing our conclusions, we present a table of methods from all mentioned eras, with a brief summary of each approach that explains its contribution to the field. We conclude the survey by discussing the current challenges of the field and to what extent they have been solved. Comment: Updated with new studie

    Enhanced tracking and recognition of moving objects by reasoning about spatio-temporal continuity.

    A framework for the logical and statistical analysis and annotation of dynamic scenes containing occlusion and other uncertainties is presented. This framework consists of three elements: an object tracker module, an object recognition/classification module, and a reasoning engine for logical consistency, ambiguity and error. The principle behind the object tracker and object recognition modules is to reduce error by increasing ambiguity (by merging objects in close proximity and presenting multiple hypotheses). The reasoning engine deals with error, ambiguity and occlusion in a unified framework to produce a hypothesis that satisfies fundamental constraints on the spatio-temporal continuity of objects. Our algorithm finds a globally consistent model of an extended video sequence that is maximally supported by a voting function based on the output of a statistical classifier. The system produces an annotation that is significantly more accurate than what would be obtained by frame-by-frame evaluation of the classifier output. The framework has been implemented and applied successfully to the analysis of team sports with a single camera. Key words: Visua

    Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

    In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score actions from colour images and optical flow. In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap. In stage 3, sequences of detection boxes most likely to be associated with a single action instance, called action tubes, are constructed by solving two energy maximisation problems via dynamic programming. In the first pass, action paths spanning the whole video are built by linking detection boxes over time using their class-specific scores and their spatial overlap; in the second pass, temporal trimming is performed by ensuring label consistency for all constituting detection boxes. We demonstrate the performance of our algorithm on the challenging UCF-101, J-HMDB-21 and LIRIS-HARL datasets, achieving new state-of-the-art results across the board and significantly increasing detection speed at test time. We achieve a large improvement in action detection performance and report a 20% and 11% gain in mAP (mean average precision) on the UCF-101 and J-HMDB-21 datasets respectively when compared to the state of the art. Comment: Accepted by British Machine Vision Conference 201
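
    The first linking pass described in the abstract above (building a path of detection boxes from per-box scores and spatial overlap by dynamic programming) can be sketched as a Viterbi-style recursion. This is an illustrative simplification, not the authors' implementation: the function names, the `overlap_weight` parameter and the additive energy are assumptions, and every frame is assumed to contain at least one detection.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, class-specific score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_boxes(frames: List[List[Detection]], overlap_weight: float = 1.0) -> List[int]:
    """Return, for each frame, the index of the box on the highest-energy path.

    Assumed energy of a path: sum of box scores plus overlap_weight times
    the IoU between consecutive boxes. Each frame must be non-empty.
    """
    # best[t][j] is the best energy of any path ending at box j in frame t.
    best = [[score for _, score in frames[0]]]
    back: List[List[int]] = []
    for t in range(1, len(frames)):
        cur_best, cur_back = [], []
        for box, score in frames[t]:
            candidates = [
                best[t - 1][i] + overlap_weight * iou(prev_box, box)
                for i, (prev_box, _) in enumerate(frames[t - 1])
            ]
            i_star = max(range(len(candidates)), key=candidates.__getitem__)
            cur_best.append(candidates[i_star] + score)
            cur_back.append(i_star)
        best.append(cur_best)
        back.append(cur_back)
    # Backtrack from the best final box to recover the whole path.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for t in range(len(frames) - 1, 0, -1):
        j = back[t - 1][j]
        path.append(j)
    path.reverse()
    return path
```

    With `overlap_weight` set to zero the recursion simply picks the top-scoring box in each frame; a positive weight favours spatially coherent paths even when an individual box scores slightly lower.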

    Intelligent search strategies based on adaptive Constraint Handling Rules

    The most advanced implementation of adaptive constraint processing with Constraint Handling Rules (CHR) allows the application of intelligent search strategies to solve Constraint Satisfaction Problems (CSP). This presentation compares an improved version of conflict-directed backjumping and two variants of dynamic backtracking with respect to chronological backtracking on some of the AIM instances, which are a benchmark set of random 3-SAT problems. A CHR implementation of a Boolean constraint solver combined with these different search strategies in Java is thus being compared with a CHR implementation of the same Boolean constraint solver combined with chronological backtracking in SICStus Prolog. This comparison shows that the addition of "intelligence" to the search process may reduce the number of search steps dramatically. Furthermore, their Java implementations are in most cases faster than the implementations of chronological backtracking. More specifically, conflict-directed backjumping is even faster than the SICStus Prolog implementation of chronological backtracking, although our Java implementation of CHR lacks the optimisations made in the SICStus Prolog system. To appear in Theory and Practice of Logic Programming (TPLP). Comment: Number of pages: 27 Number of figures: 14 Number of Tables:
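
    The baseline in the abstract above, chronological backtracking on SAT instances with a count of search steps, can be sketched in a few lines. This is a plain Python illustration, not the CHR, Java or SICStus Prolog code being compared; the function names, the DIMACS-style literal encoding and the definition of a "step" as one tentative variable assignment are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means "not x3"


def clause_status(clause: Clause, assignment: Dict[int, bool]) -> Optional[bool]:
    """True if the clause is satisfied, False if falsified, None if undecided."""
    undecided = False
    for lit in clause:
        value = assignment.get(abs(lit))
        if value is None:
            undecided = True
        elif value == (lit > 0):
            return True
    return None if undecided else False


def solve(clauses: List[Clause], num_vars: int) -> Tuple[Optional[Dict[int, bool]], int]:
    """Chronological backtracking over variables 1..num_vars.

    Returns (model or None, number of tentative assignments tried).
    On a conflict the search undoes only the most recent decision,
    which is what backjumping and dynamic backtracking improve on.
    """
    assignment: Dict[int, bool] = {}
    steps = 0

    def search(var: int) -> bool:
        nonlocal steps
        if any(clause_status(c, assignment) is False for c in clauses):
            return False  # conflict: backtrack to the previous decision
        if var > num_vars:
            return True   # every variable assigned, no clause falsified
        for value in (True, False):
            steps += 1
            assignment[var] = value
            if search(var + 1):
                return True
            del assignment[var]
        return False

    found = search(1)
    return (dict(assignment) if found else None), steps
```

    Counting steps this way gives a machine-independent measure of search effort, which is the kind of quantity the abstract reports as dropping sharply under the more informed strategies.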