Sparse Regression via Range Counting
The sparse regression problem, also known as the best subset selection problem, can be cast as follows: Given a set S of n points in ℝ^d, a point y ∈ ℝ^d, and an integer 2 ≤ k ≤ d, find an affine combination of at most k points of S that is nearest to y. We describe an O(n^{k-1} log^{d-k+2} n)-time randomized (1+ε)-approximation algorithm for this problem with d and ε constant. This is the first algorithm for this problem running in time o(n^k). Its running time is similar to the query time of a data structure recently proposed by Har-Peled, Indyk, and Mahabadi (ICALP'18), while not requiring any preprocessing. Up to polylogarithmic factors, it matches a conditional lower bound relying on a conjecture about affine degeneracy testing. In the special case where k = d = O(1), we provide a simple O_δ(n^{d-1+δ})-time deterministic exact algorithm, for any δ > 0. Finally, we show how to adapt the approximation algorithm for the sparse linear regression and sparse convex regression problems with the same running time, up to polylogarithmic factors.
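To make the problem statement concrete, here is a minimal sketch of the naive exact baseline that enumerates every k-subset of S, which takes on the order of n^k subset evaluations. This is not the algorithm from the abstract; it only illustrates the objective being optimized. The function names `dist_to_affine_hull` and `brute_force_sparse_regression` are hypothetical, and the distance is computed by Gram-Schmidt projection onto the affine hull.

```python
from itertools import combinations
from math import sqrt


def dist_to_affine_hull(points, y):
    """Euclidean distance from y to the affine hull of `points`."""
    base = points[0]
    residual = [yi - bi for yi, bi in zip(y, base)]
    basis = []  # orthonormal basis of the hull's direction space
    for p in points[1:]:
        v = [pi - bi for pi, bi in zip(p, base)]
        # Remove the components already spanned by earlier directions.
        for q in basis:
            c = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # skip affinely dependent points
            basis.append([vi / norm for vi in v])
    # Subtract the projection of (y - base) onto the direction space.
    for q in basis:
        c = sum(ri * qi for ri, qi in zip(residual, q))
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return sqrt(sum(ri * ri for ri in residual))


def brute_force_sparse_regression(S, y, k):
    """Return (distance, subset) minimizing the distance over all k-subsets.

    Assumes len(S) >= k. Checking subsets of size exactly k suffices,
    because the affine hull of k points contains the hull of any fewer
    of those points.
    """
    return min(
        ((dist_to_affine_hull(subset, y), subset)
         for subset in combinations(S, k)),
        key=lambda pair: pair[0],
    )
```

For example, with S = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 5)], y = (3, 0.5, 0) and k = 2, the closest line is the one through the first two points, at distance 0.5.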
PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting
Crowd counting, i.e., estimating the number of people in a crowded area, has
attracted much interest in the research community. Although many attempts have
been reported, crowd counting remains an open real-world problem due to the
vast scale variations in crowd density within the area of interest, and severe
occlusion among the crowd. In this paper, we propose a novel Pyramid
Density-Aware Attention-based network, abbreviated as PDANet, that leverages
attention, pyramid scale feature, and two-branch decoder modules for
density-aware crowd counting. PDANet uses these modules to extract
features at different scales, focus on the relevant information, and suppress
misleading information. We also address the variation of crowdedness levels among
different images with an exclusive Density-Aware Decoder (DAD). For this
purpose, a classifier evaluates the density level of the input features and
then passes them to the corresponding high- and low-crowdedness DAD modules.
Finally, we generate an overall density map by using the sum of the low- and
high-crowdedness density maps as spatial attention. Meanwhile, we employ two
losses to create a precise density map for the input scene. Extensive
evaluations on challenging benchmark datasets demonstrate that the proposed
PDANet outperforms well-known state-of-the-art methods in both counting
accuracy and the quality of the generated density maps.
DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation
In real-world crowd counting applications, the crowd densities vary greatly
in spatial and temporal domains. A detection based counting method will
estimate crowds accurately in low density scenes, while its reliability in
congested areas is downgraded. A regression based approach, on the other hand,
captures the general density information in crowded regions. Without knowing
the location of each person, it tends to overestimate the count in low density
areas. Thus, exclusively using either one of them is not sufficient to handle
all kinds of scenes with varying densities. To address this issue, a novel
end-to-end crowd counting framework, named DecideNet (DEteCtIon and Density
Estimation Network) is proposed. It can adaptively decide the appropriate
counting mode for different locations on the image based on its real density
conditions. DecideNet starts with estimating the crowd density by generating
detection and regression based density maps separately. To capture inevitable
variation in densities, it incorporates an attention module, meant to
adaptively assess the reliability of the two types of estimations. The final
crowd counts are obtained with the guidance of the attention module to adopt
suitable estimations from the two kinds of density maps. Experimental results
show that our method achieves state-of-the-art performance on three challenging
crowd counting datasets.
Comment: CVPR 201
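One plausible reading of the attention-guided fusion described in the DecideNet abstract is a per-pixel convex blend of the two density maps, with the attention value acting as the weight on the detection-based map. The sketch below assumes that form; the abstract does not state the exact formula, and the names `fuse_density_maps` and `crowd_count` are hypothetical. Maps are plain nested lists of the same shape.

```python
def fuse_density_maps(det_map, reg_map, attention):
    """Blend two density maps per pixel: a * det + (1 - a) * reg.

    `attention` holds weights a in [0, 1]; values near 1 trust the
    detection-based map, values near 0 trust the regression-based map.
    This blending rule is an assumption, not taken from the abstract.
    """
    return [
        [a * d + (1.0 - a) * r for d, r, a in zip(det_row, reg_row, att_row)]
        for det_row, reg_row, att_row in zip(det_map, reg_map, attention)
    ]


def crowd_count(density_map):
    """The estimated count is the sum of the density map over all pixels."""
    return sum(sum(row) for row in density_map)
```

With this form, a sparse top row can rely on detections while a congested bottom row falls back to the regression estimate, simply by setting the attention to 1 and 0 in those rows.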
Learning Less is More - 6D Camera Localization via 3D Surface Regression
Popular research areas like autonomous driving and augmented reality have
renewed the interest in image-based camera localization. In this work, we
address the task of predicting the 6D camera pose from a single RGB image in a
given 3D environment. With the advent of neural networks, previous works have
either learned the entire camera localization process, or multiple components
of a camera localization pipeline. Our key contribution is to demonstrate and
explain that learning a single component of this pipeline is sufficient. This
component is a fully convolutional neural network for densely regressing
so-called scene coordinates, defining the correspondence between the input
image and the 3D scene space. The neural network is prepended to a new
end-to-end trainable pipeline. Our system is efficient, highly accurate, robust
in training, and exhibits outstanding generalization capabilities. It exceeds
state-of-the-art consistently on indoor and outdoor datasets. Interestingly,
our approach surpasses existing techniques even without utilizing a 3D model of
the scene during training, since the network is able to discover 3D scene
geometry automatically, solely from single-view constraints.
Comment: CVPR 201
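Scene coordinates pair each image pixel with a 3D point in the scene, so a candidate camera pose can be checked by reprojecting those 3D points and comparing them with the pixels they came from. The sketch below illustrates that check under stated assumptions: a pinhole camera with a single focal length `f` and principal point `c`, a pose (R, t) mapping scene coordinates into the camera frame, and an inlier threshold in pixels. These names and the inlier-counting score are illustrative choices, not details given in the abstract.

```python
from math import hypot


def project(point, R, t, f, c):
    """Project a 3D scene point into the image with a pinhole camera.

    R (3x3 nested list) and t (length 3) map scene coordinates into the
    camera frame. Returns None for points at or behind the camera.
    """
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0.0:
        return None
    return (f * cam[0] / cam[2] + c[0], f * cam[1] / cam[2] + c[1])


def count_inliers(pixels, scene_coords, R, t, f, c, threshold=10.0):
    """Count correspondences whose reprojection error is below threshold."""
    inliers = 0
    for (u, v), X in zip(pixels, scene_coords):
        projected = project(X, R, t, f, c)
        if projected is None:
            continue
        if hypot(projected[0] - u, projected[1] - v) < threshold:
            inliers += 1
    return inliers
```

A pose that agrees with most predicted scene coordinates yields a high inlier count, which is one common way to rank pose hypotheses in correspondence-based localization.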