32 research outputs found
A Laminar Cortical Model for 3D Perception of Slanted and Curved Surfaces and of 2D Images: Development, Attention, and Bistability
A model of laminar visual cortical dynamics proposes how 3D boundary and surface representations of slanted and curved 3D objects and 2D images arise. The 3D boundary representations emerge from interactions between non-classical horizontal receptive field interactions and intracortical and intercortical feedback circuits. Such non-classical interactions contextually disambiguate classical receptive field responses to ambiguous visual cues using cells that are sensitive to angles and disparity gradients within cortical areas V1 and V2. These cells are all variants of bipole grouping cells. Model simulations show how horizontal connections can develop selectively to angles, how slanted surfaces can activate 3D boundary representations that are sensitive to angles and disparity gradients, how 3D filling-in occurs across slanted surfaces, how a 2D Necker cube image can be represented in 3D, and how bistable Necker cube percepts occur. The model also explains data about slant aftereffects and 3D neon color spreading. It shows how habituative transmitters that help to control development also help to trigger bistable 3D percepts and slant aftereffects, and how attention can influence which of these percepts is perceived by propagating along some object boundaries. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-98-1-0108); Defense Advanced Research Projects Agency and the Office of Naval Research (N0014-95-1-0409, N00014-01-1-0624, N00014-95-1-0657); National Science Foundation (IIS-97-20333
Effects of Highlights on Gloss Perception
The perception of a glossy surface in a static monochromatic image can occur when a bright highlight is embedded in a compatible context of shading and a bounding contour. Some images naturally give rise to the impression that a surface has a uniform reflectance, characteristic of a shiny object, even though the highlight may only cover a small portion of the surface. Nonetheless, an observer may adopt an attitude of scrutiny in viewing a glossy surface, whereby the impression of gloss is partial and nonuniform at image regions outside of a highlight. Using a rating scale and small probe points to indicate image locations, the present study investigates differential perception of gloss within a single object. Observers' gloss ratings are not uniform across the surface, but decrease as a function of distance from the highlight. When, by design, the distance from a highlight is uncoupled from the luminance value at corresponding probe points, the decrease in rated gloss correlates more with the distance than with the luminance change. Experiments also indicate that gloss ratings change as a function of estimated surface distance, rather than as a function of image distance. Surface continuity affects gloss ratings, suggesting that apprehension of 3D surface structure is crucial for gloss perception. Air Force Office of Scientific Research (F49620-98-1-0108), Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409), National Science Foundation (IIS-97-20333); Office of Naval Research (N00014-95-1-0657, N00014-01-1-0624); Whitaker Foundation (RG-99-0186
ReLU-QP: A GPU-Accelerated Quadratic Programming Solver for Model-Predictive Control
We present ReLU-QP, a GPU-accelerated solver for quadratic programs (QPs)
that is capable of solving high-dimensional control problems at real-time
rates. ReLU-QP is derived by exactly reformulating the Alternating Direction
Method of Multipliers (ADMM) algorithm for solving QPs as a deep, weight-tied
neural network with rectified linear unit (ReLU) activations. This
reformulation enables the deployment of ReLU-QP on GPUs using standard
machine-learning toolboxes. We evaluate the performance of ReLU-QP across three
model-predictive control (MPC) benchmarks: stabilizing random linear dynamical
systems with control limits, balancing an Atlas humanoid robot on a single
foot, and tracking whole-body reference trajectories on a quadruped equipped
with a six-degree-of-freedom arm. These benchmarks indicate that ReLU-QP is
competitive with state-of-the-art CPU-based solvers for small-to-medium-scale
problems and offers order-of-magnitude speed improvements for larger-scale
problems. Comment: submitted to ICRA 202
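The ADMM-as-ReLU-network idea in this abstract can be illustrated with a minimal sketch. It is not the ReLU-QP formulation itself: it assumes a simpler box-constrained QP (minimize 0.5 x'Px + q'x subject to lo <= x <= hi), runs on the CPU in plain Python, and uses hypothetical names (`clamp_relu`, `invert`, `admm_box_qp`). Each iteration applies the same fixed matrix `W` and then a clamp built from two ReLUs, which is what makes the loop resemble a deep, weight-tied network.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def clamp_relu(v, lo, hi):
    # clip(v, lo, hi) written with two ReLUs (assumes lo <= hi).
    return lo + relu(v - lo) - relu(v - hi)


def invert(m):
    # Gauss-Jordan inverse with partial pivoting (small dense matrices only).
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def admm_box_qp(P, q, lo, hi, rho=1.0, iters=200):
    # Scaled-form ADMM for: min 0.5 x'Px + q'x  s.t.  lo <= x <= hi.
    n = len(q)
    K = [[P[i][j] + (rho if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    W = invert(K)  # fixed "weights", shared by every iteration (layer)
    z = [0.0] * n  # constrained copy of x
    y = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: solve (P + rho I) x = rho (z - y) - q using W.
        rhs = [rho * (z[i] - y[i]) - q[i] for i in range(n)]
        x = [sum(W[i][j] * rhs[j] for j in range(n)) for i in range(n)]
        # z-update: projection onto the box, expressed with ReLUs.
        z = [clamp_relu(x[i] + y[i], lo[i], hi[i]) for i in range(n)]
        # dual update.
        y = [y[i] + x[i] - z[i] for i in range(n)]
    return z
```

For example, with P = [[4, 1], [1, 2]], q = [-8, -2], and bounds 0 <= x <= 1, the iterates approach x = (1, 0.5), which satisfies the optimality conditions of that problem. A GPU implementation would replace the list arithmetic with batched matrix products.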
Exploiting Data and Human Knowledge for Predicting Wildlife Poaching
Poaching continues to be a significant threat to the conservation of wildlife
and the associated ecosystem. Estimating and predicting where the poachers have
committed or would commit crimes is essential to more effective allocation of
patrolling resources. The real-world data in this domain is often sparse, noisy
and incomplete, consisting of a small number of positive data (poaching signs),
a large number of negative data with label uncertainty, and an even larger
number of unlabeled data. Fortunately, domain experts such as rangers can
provide complementary information about poaching activity patterns. However,
this kind of human knowledge has rarely been used in previous approaches. In
this paper, we contribute new solutions to the predictive analysis of poaching
patterns by exploiting both very limited data and human knowledge. We propose
an approach to elicit quantitative information from domain experts through a
questionnaire built upon a clustering-based division of the conservation area.
In addition, we propose algorithms that exploit qualitative and quantitative
information provided by the domain experts to augment the dataset and improve
learning. In collaboration with the World Wide Fund for Nature, we show that
incorporating human knowledge leads to better predictions in a conservation
area in Northeastern China where the charismatic species is the Siberian tiger. The
results show the importance of exploiting human knowledge when learning from
limited data. Comment: COMPASS 201
Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark
Most existing works on few-shot object detection (FSOD) focus on a setting
where both pre-training and few-shot learning datasets are from a similar
domain. However, few-shot algorithms are important in multiple domains; hence
evaluation needs to reflect the broad applications. We propose a Multi-dOmain
Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a
wide range of domains to evaluate FSOD algorithms. We comprehensively analyze
the impacts of freezing layers, different architectures, and different
pre-training datasets on FSOD performance. Our empirical results show several
key factors that have not been explored in previous works: 1) contrary to
previous belief, on a multi-domain benchmark, fine-tuning (FT) is a strong
baseline for FSOD, performing on par with or better than the state-of-the-art (SOTA)
algorithms; 2) utilizing FT as the baseline allows us to explore multiple
architectures, and we found them to have a significant impact on down-stream
few-shot tasks, even with similar pre-training performances; 3) by decoupling
pre-training and few-shot learning, MoFSOD allows us to explore the impact of
different pre-training datasets, and the right choice can boost the performance
of the down-stream tasks significantly. Based on these findings, we list
possible avenues of investigation for improving FSOD performance and propose
two simple modifications to existing algorithms that lead to SOTA performance
on the MoFSOD benchmark. The code is available at
https://github.com/amazon-research/few-shot-object-detection-benchmark. Comment: Accepted at ECCV 202