20,726 research outputs found
Discriminative Training of Deep Fully-connected Continuous CRF with Task-specific Loss
Recent works on deep conditional random fields (CRF) have set new records on
many vision tasks involving structured predictions. Here we propose a
fully-connected deep continuous CRF model for both discrete and continuous
labelling problems. We exemplify the usefulness of the proposed model on
multi-class semantic labelling (discrete) and the robust depth estimation
(continuous) problems.
In our framework, we model both the unary and the pairwise potential
functions as deep convolutional neural networks (CNN), which are jointly
learned in an end-to-end fashion. The proposed method possesses the main
advantage of continuously-valued CRF, which is a closed-form solution for the
Maximum a posteriori (MAP) inference.
To better adapt to different tasks, instead of using the commonly employed
maximum likelihood CRF parameter learning protocol, we propose task-specific
loss functions for learning the CRF parameters.
It enables direct optimization of the quality of the MAP estimates during the
course of learning.
Specifically, we optimize the multi-class classification loss for the
semantic labelling task and the Turkey's biweight loss for the robust depth
estimation problem.
Experimental results on the semantic labelling and robust depth estimation
tasks demonstrate that the proposed method compare favorably against both
baseline and state-of-the-art methods.
In particular, we show that although the proposed deep CRF model is
continuously valued, with the equipment of task-specific loss, it achieves
impressive results even on discrete labelling tasks
Personalizing the design of computerâbased instruction to enhance learning
This paper reports two studies designed to investigate the effect on learning outcomes of matching individualsâ preferred cognitive styles to computerâbased instructional (CBI) material. Study 1 considered the styles individually as Verbalizer, Imager, Wholist and Analytic. Study 2 considered the biâdimensional nature of cognitive styles in order to assess the full ramification of cognitive styles on learning: Analytic/Imager, Analytic/ Verbalizer, Wholist/Imager and the Wholist/Verbalizer. The mix of images and text, the nature of the text material, use of advance organizers and proximity of information to facilitate meaningful connections between various pieces of information were some of the considerations in the design of the CBI material. In a quasiâexperimental format, studentsâ cognitive styles were analysed by Cognitive Style Analysis (CSA) software. On the basis of the CSA result, the system defaulted students to either matched or mismatched CBI material by alternating between the two formats. The instructional material had a learning and a test phase. Learning outcome was tested on recall, labelling, explanation and problemâsolving tasks. Comparison of the matched and mismatched instruction did not indicate significant difference between the groups, but the consistently better performance by the matched group suggests potential for further investigations where the limitations cited in this paper are eliminated. The result did indicate a significant difference between the four cognitive styles with the Wholist/Verbalizer group performing better then all other cognitive styles. Analysing the difference between cognitive styles on individual test tasks indicated significant difference on recall, labelling and explanation, suggesting that certain test tasks may suit certain cognitive styles
A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
Semantic segmentation is the pixel-wise labelling of an image. Since the
problem is defined at the pixel level, determining image class labels only is
not acceptable, but localising them at the original image pixel resolution is
necessary. Boosted by the extraordinary ability of convolutional neural
networks (CNN) in creating semantic, high level and hierarchical image
features; excessive numbers of deep learning-based 2D semantic segmentation
approaches have been proposed within the last decade. In this survey, we mainly
focus on the recent scientific developments in semantic segmentation,
specifically on deep learning-based methods using 2D images. We started with an
analysis of the public image sets and leaderboards for 2D semantic
segmantation, with an overview of the techniques employed in performance
evaluation. In examining the evolution of the field, we chronologically
categorised the approaches into three main periods, namely pre-and early deep
learning era, the fully convolutional era, and the post-FCN era. We technically
analysed the solutions put forward in terms of solving the fundamental problems
of the field, such as fine-grained localisation and scale invariance. Before
drawing our conclusions, we present a table of methods from all mentioned eras,
with a brief summary of each approach that explains their contribution to the
field. We conclude the survey by discussing the current challenges of the field
and to what extent they have been solved.Comment: Updated with new studie
Enhanced tracking and recognition of moving objects by reasoning about spatio-temporal continuity.
A framework for the logical and statistical analysis and annotation of dynamic scenes containing occlusion and other uncertainties is presented. This framework consists
of three elements; an object tracker module, an object recognition/classification module and a logical consistency, ambiguity and error reasoning engine. The principle behind the object tracker and object recognition modules is to reduce error by increasing ambiguity (by merging objects in close proximity and presenting multiple
hypotheses). The reasoning engine deals with error, ambiguity and occlusion in a unified framework to produce a hypothesis that satisfies fundamental constraints
on the spatio-temporal continuity of objects. Our algorithm finds a globally consistent model of an extended video sequence that is maximally supported by a voting function based on the output of a statistical classifier. The system results
in an annotation that is significantly more accurate than what would be obtained
by frame-by-frame evaluation of the classifier output. The framework has been implemented
and applied successfully to the analysis of team sports with a single
camera.
Key words: Visua
Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos
In this work, we propose an approach to the spatiotemporal localisation
(detection) and classification of multiple concurrent actions within temporally
untrimmed videos. Our framework is composed of three stages. In stage 1,
appearance and motion detection networks are employed to localise and score
actions from colour images and optical flow. In stage 2, the appearance network
detections are boosted by combining them with the motion detection scores, in
proportion to their respective spatial overlap. In stage 3, sequences of
detection boxes most likely to be associated with a single action instance,
called action tubes, are constructed by solving two energy maximisation
problems via dynamic programming. While in the first pass, action paths
spanning the whole video are built by linking detection boxes over time using
their class-specific scores and their spatial overlap, in the second pass,
temporal trimming is performed by ensuring label consistency for all
constituting detection boxes. We demonstrate the performance of our algorithm
on the challenging UCF101, J-HMDB-21 and LIRIS-HARL datasets, achieving new
state-of-the-art results across the board and significantly increasing
detection speed at test time. We achieve a huge leap forward in action
detection performance and report a 20% and 11% gain in mAP (mean average
precision) on UCF-101 and J-HMDB-21 datasets respectively when compared to the
state-of-the-art.Comment: Accepted by British Machine Vision Conference 201
Intelligent search strategies based on adaptive Constraint Handling Rules
The most advanced implementation of adaptive constraint processing with
Constraint Handling Rules (CHR) allows the application of intelligent search
strategies to solve Constraint Satisfaction Problems (CSP). This presentation
compares an improved version of conflict-directed backjumping and two variants
of dynamic backtracking with respect to chronological backtracking on some of
the AIM instances which are a benchmark set of random 3-SAT problems. A CHR
implementation of a Boolean constraint solver combined with these different
search strategies in Java is thus being compared with a CHR implementation of
the same Boolean constraint solver combined with chronological backtracking in
SICStus Prolog. This comparison shows that the addition of ``intelligence'' to
the search process may reduce the number of search steps dramatically.
Furthermore, the runtime of their Java implementations is in most cases faster
than the implementations of chronological backtracking. More specifically,
conflict-directed backjumping is even faster than the SICStus Prolog
implementation of chronological backtracking, although our Java implementation
of CHR lacks the optimisations made in the SICStus Prolog system. To appear in
Theory and Practice of Logic Programming (TPLP).Comment: Number of pages: 27 Number of figures: 14 Number of Tables:
- âŠ