Measuring the Accuracy of Object Detectors and Trackers
The accuracy of object detectors and trackers is most commonly evaluated by
the Intersection over Union (IoU) criterion. To date, most approaches are
restricted to axis-aligned or oriented boxes and, as a consequence, many
datasets are only labeled with boxes. Nevertheless, axis-aligned or oriented
boxes cannot accurately capture an object's shape. To address this, a number of
densely segmented datasets have started to emerge in both the object detection
and the object tracking communities. However, evaluating the accuracy of object
detectors and trackers that are restricted to boxes on densely segmented data
is not straightforward. To close this gap, we introduce the relative
Intersection over Union (rIoU) accuracy measure. The measure normalizes the IoU
with the optimal box for the segmentation to generate an accuracy measure that
ranges between 0 and 1 and allows a more precise measurement of accuracies.
Furthermore, it enables an efficient and easy way to understand scenes and the
strengths and weaknesses of an object detection or tracking approach. We
show how the new measure can be efficiently calculated and present an
easy-to-use evaluation framework. The framework is tested on the DAVIS and the
VOT2016 segmentations and has been made available to the community.
Comment: 10 pages, 7 figures
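The normalization described in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: it assumes axis-aligned boxes only, a binary mask given as a list of rows, and a brute-force search for the best box in place of the efficient calculation the abstract mentions. The function names `box_mask_iou`, `best_box_iou`, and `relative_iou` are hypothetical.

```python
from itertools import combinations_with_replacement


def box_mask_iou(mask, box):
    """IoU between a binary mask (list of rows) and an inclusive box (r0, c0, r1, c1)."""
    r0, c0, r1, c1 = box
    box_area = (r1 - r0 + 1) * (c1 - c0 + 1)
    mask_area = sum(sum(row) for row in mask)
    inter = sum(mask[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
    union = box_area + mask_area - inter
    return inter / union if union else 0.0


def best_box_iou(mask):
    """Highest IoU any axis-aligned box can reach on this mask (exhaustive search).

    Assumption: brute force is only practical for tiny masks; it stands in
    for the efficient computation referred to in the abstract.
    """
    rows, cols = len(mask), len(mask[0])
    best = 0.0
    for r0, r1 in combinations_with_replacement(range(rows), 2):
        for c0, c1 in combinations_with_replacement(range(cols), 2):
            best = max(best, box_mask_iou(mask, (r0, c0, r1, c1)))
    return best


def relative_iou(mask, box):
    """IoU of `box` divided by the best achievable box IoU, so the optimum scores 1."""
    best = best_box_iou(mask)
    return box_mask_iou(mask, box) / best if best else 0.0


# An L-shaped object: no axis-aligned box can reach an IoU of 1 on it.
l_shape = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
score_best_box = relative_iou(l_shape, (0, 0, 2, 0))  # left column
score_full_box = relative_iou(l_shape, (0, 0, 2, 2))  # whole 3x3 grid
```

On this L-shaped mask the plain IoU of the best box is only 0.6, while its relative score is 1, which is the effect the normalization is meant to achieve.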
Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders
Convolutional autoencoders have emerged as popular methods for unsupervised
defect segmentation on image data. Most commonly, this task is performed by
thresholding a pixel-wise reconstruction error based on an ℓ^p distance.
This procedure, however, leads to large residuals whenever the reconstruction
encompasses slight localization inaccuracies around edges. It also fails to
reveal defective regions that have been visually altered when intensity values
stay roughly consistent. We show that these problems prevent these approaches
from being applied to complex real-world scenarios and that they cannot be easily
avoided by employing more elaborate architectures such as variational or
feature matching autoencoders. We propose to use a perceptual loss function
based on structural similarity which examines inter-dependencies between local
image regions, taking into account luminance, contrast and structural
information, instead of simply comparing single pixel values. It achieves
significant performance gains on a challenging real-world dataset of
nanofibrous materials and a novel dataset of two woven fabrics over the
state-of-the-art approaches for unsupervised defect segmentation that use
pixel-wise reconstruction error metrics.
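The contrast between a pixel-wise error and structural similarity can be sketched for a single pair of patches. This is a minimal illustration, not the paper's loss: it assumes flattened patches with intensities in [0, 1], population (biased) statistics, and the commonly used stabilizing constants 0.01 and 0.03 from the standard SSIM definition. The function names `ssim_patch` and `mse_patch` are hypothetical.

```python
def mse_patch(x, y):
    """Mean squared pixel-wise error between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def ssim_patch(x, y, dynamic_range=1.0):
    """Structural similarity of two flattened patches of equal length.

    Combines luminance (means), contrast (variances) and structure
    (covariance) into one score; identical patches score 1.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


striped = [0.0, 1.0, 0.0, 1.0]  # textured patch
flat = [0.5, 0.5, 0.5, 0.5]     # same mean intensity, texture lost

same_score = ssim_patch(striped, striped)
flat_score = ssim_patch(striped, flat)
flat_mse = mse_patch(striped, flat)
```

Here the two patches share the same mean intensity, yet the structural score drops close to zero because the texture is gone. In practice the measure would be evaluated over sliding windows across the whole image rather than on one patch.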
An Overview of Economic Approaches to Information Security Management
The increasing concerns of clients, particularly in online commerce, together with the impact of legislation on information security, have compelled companies to put more resources into information security. As a result, senior managers in many organizations are now expressing a much greater interest in information security. However, the largest body of research related to preventing breaches is technical, focusing on such issues as encryption and access control. In contrast, research related to the economic aspects of information security is small but rapidly growing. The goal of this technical note is twofold: i) to provide the reader with a structured overview of the economic approaches to information security and ii) to identify potential research directions.
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
We propose a weakly-supervised framework for action labeling in video, where
only the order of occurring actions is required during training time. The key
challenge is that the per-frame alignments between the input (video) and label
(action) sequences are unknown during training. We address this by introducing
the Extended Connectionist Temporal Classification (ECTC) framework to
efficiently evaluate all possible alignments via dynamic programming and
explicitly enforce their consistency with frame-to-frame visual similarities.
This protects the model from distractions of visually inconsistent or
degenerated alignments without the need for temporal supervision. We further
extend our framework to the semi-supervised case when a few frames are sparsely
annotated in a video. With less than 1% of labeled frames per video, our method
is able to outperform existing semi-supervised approaches and achieve
comparable performance to that of fully supervised approaches.
Comment: To appear in ECCV 201
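The idea of evaluating all alignments with dynamic programming can be sketched in a simplified form. This is not ECTC itself: it assumes there is no blank label, that every action in the ordered list covers at least one consecutive frame, and it omits the visual-similarity weighting the abstract describes. The function name `total_alignment_probability` and the toy probabilities are hypothetical.

```python
def total_alignment_probability(frame_probs, ordered_actions):
    """Sum the probability of every monotonic frame-to-action alignment.

    frame_probs[t][a] is the model's probability of action `a` at frame `t`.
    ordered_actions is the known order of actions (the weak supervision).
    alpha[k] holds the total probability of all partial alignments that end
    the current frame inside the k-th action of the sequence.
    """
    num_actions = len(ordered_actions)
    alpha = [0.0] * num_actions
    alpha[0] = frame_probs[0][ordered_actions[0]]  # must start in the first action
    for probs in frame_probs[1:]:
        new_alpha = [0.0] * num_actions
        for k, action in enumerate(ordered_actions):
            stay = alpha[k]                          # same action as previous frame
            advance = alpha[k - 1] if k > 0 else 0.0  # moved on from previous action
            new_alpha[k] = probs[action] * (stay + advance)
        alpha = new_alpha
    return alpha[-1]  # must end in the last action


# Three frames, two actions (0 then 1); two valid alignments: 0,0,1 and 0,1,1.
toy_probs = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]
total = total_alignment_probability(toy_probs, [0, 1])
```

The recursion visits each frame and action position once, so the cost grows with the product of the number of frames and the number of actions instead of with the number of possible alignments.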