30,802 research outputs found
Diverse Knowledge Distillation for End-to-End Person Search
Person search aims to localize and identify a specific person from a gallery
of images. Recent methods can be categorized into two groups, i.e., two-step
and end-to-end approaches. The former views person search as two independent
tasks and achieves dominant results using separately trained person detection
and re-identification (Re-ID) models. The latter performs person search in an
end-to-end fashion. Although the end-to-end approaches yield higher inference
efficiency, they largely lag behind those two-step counterparts in terms of
accuracy. In this paper, we argue that the gap between the two kinds of methods
is mainly caused by the Re-ID sub-networks of end-to-end methods. To this end,
we propose a simple yet strong end-to-end network with diverse knowledge
distillation to break the bottleneck. We also design a spatial-invariant
augmentation to assist model to be invariant to inaccurate detection results.
Experimental results on the CUHK-SYSU and PRW datasets demonstrate the
superiority of our method against existing approaches -- it achieves on par
accuracy with state-of-the-art two-step methods while maintaining high
efficiency due to the single joint model. Code is available at:
https://git.io/DKD-PersonSearch.Comment: Accepted to AAAI, 2021. Code is available at:
https://git.io/DKD-PersonSearc
Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
This paper proposes a new hybrid architecture that consists of a deep
Convolutional Network and a Markov Random Field. We show how this architecture
is successfully applied to the challenging problem of articulated human pose
estimation in monocular images. The architecture can exploit structural domain
constraints such as geometric relationships between body joint locations. We
show that joint training of these two model paradigms improves performance and
allows us to significantly outperform existing state-of-the-art techniques
SVS-JOIN : efficient spatial visual similarity join for geo-multimedia
In the big data era, massive amount of multimedia data with geo-tags has been generated and collected by smart devices equipped with mobile communications module and position sensor module. This trend has put forward higher request on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial database. Previous works focused on spatial textual document search problem, rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search similar geo-image pairs in both aspects of geo-location and visual content. Firstly, the definition of SVS-JOIN is proposed and then we present the geographical similarity and visual similarity measurement. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm and visual similarity. Besides, an extension of it named SVS-JOIN G is developed, which utilizes spatial grid strategy to improve the search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
State-of-the-art object detection networks depend on region proposal
algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN
have reduced the running time of these detection networks, exposing region
proposal computation as a bottleneck. In this work, we introduce a Region
Proposal Network (RPN) that shares full-image convolutional features with the
detection network, thus enabling nearly cost-free region proposals. An RPN is a
fully convolutional network that simultaneously predicts object bounds and
objectness scores at each position. The RPN is trained end-to-end to generate
high-quality region proposals, which are used by Fast R-CNN for detection. We
further merge RPN and Fast R-CNN into a single network by sharing their
convolutional features---using the recently popular terminology of neural
networks with 'attention' mechanisms, the RPN component tells the unified
network where to look. For the very deep VGG-16 model, our detection system has
a frame rate of 5fps (including all steps) on a GPU, while achieving
state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS
COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015
competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning
entries in several tracks. Code has been made publicly available.Comment: Extended tech repor
Portunes: analyzing multi-domain insider threats
The insider threat is an important problem in securing information systems. Skilful insiders use attack vectors that yield the greatest chance of success, and thus do not limit themselves to a restricted set of attacks. They may use access rights to the facility where the system of interest resides, as well as existing relationships with employees. To secure a system, security professionals should therefore consider attacks that include non-digital aspects such as key sharing or exploiting trust relationships among employees. In this paper, we present Portunes, a framework for security design and audit, which incorporates three security domains: (1) the security of the computer system itself (the digital domain), (2) the security of the location where the system is deployed (the physical domain) and (3) the security awareness of the employees that use the system (the social domain). The framework consists of a model, a formal language and a logic. It allows security professionals to formally model elements from the three domains in a single framework, and to analyze possible attack scenarios. The logic enables formal specification of the attack scenarios in terms of state and transition properties
- …