Perception of Motion and Architectural Form: Computational Relationships between Optical Flow and Perspective
Perceptual geometry refers to interdisciplinary research whose objective is the
study of geometry from the perspective of visual perception and which, in turn,
applies such geometric findings to the ecological study of vision.
Perceptual geometry attempts to answer fundamental questions in perception of
form and representation of space through synthesis of cognitive and biological
theories of visual perception with geometric theories of the physical world.
Perception of form, space, and motion are among the fundamental problems in
vision science. In cognitive and computational models of human perception,
theories for modeling motion are treated separately from models for the
perception of form.
Comment: 10 pages, 13 figures, submitted to and accepted at the DoCEIS'2012
conference: http://www.uninova.pt/doceis/doceis12/home/home.ph
AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving
Unlike humans, who can effortlessly estimate the entirety of objects even
when partially occluded, modern computer vision algorithms still find this
aspect extremely challenging. Amodal perception remains largely untapped for
autonomous driving due to the lack of suitable datasets. The curation of such
datasets is hindered primarily by significant annotation costs and by the
difficulty of mitigating annotator subjectivity when accurately labeling
occluded regions. To address these limitations, we introduce AmodalSynthDrive, a
synthetic multi-task multi-modal amodal perception dataset. The dataset
provides multi-view camera images, 3D bounding boxes, LiDAR data, and odometry
for 150 driving sequences with over 1M object annotations in diverse traffic,
weather, and lighting conditions. AmodalSynthDrive supports multiple amodal
scene understanding tasks, including the newly introduced task of amodal depth
estimation for enhanced spatial understanding. We evaluate several baselines for each of these
tasks to illustrate the challenges and set up public benchmarking servers. The
dataset is available at http://amodalsynthdrive.cs.uni-freiburg.de
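The abstract does not describe the on-disk layout, so the following is only a rough sketch of how a multi-modal driving sequence (multi-view images, LiDAR sweeps, and per-frame annotations) of this kind might be iterated. The directory names, file formats, and label contents are assumptions for illustration, not the dataset's official API.

```python
# Hypothetical per-sequence loader sketch for a multi-modal amodal driving dataset.
# Directory layout, field names, and formats are assumptions, not the official API.
from pathlib import Path
import json

import numpy as np


def load_sequence(root: str, sequence: str):
    """Yield per-frame dicts with camera image paths, LiDAR points, and labels."""
    seq_dir = Path(root) / sequence
    for label_file in sorted((seq_dir / "labels").glob("*.json")):
        frame_id = label_file.stem
        with open(label_file) as f:
            labels = json.load(f)  # assumed: amodal masks, 3D boxes, odometry
        yield {
            "frame_id": frame_id,
            # assumed naming: one image per camera view, e.g. 000123_front.png
            "image_paths": sorted((seq_dir / "images").glob(f"{frame_id}_*.png")),
            # assumed: float32 (x, y, z, intensity) points in a flat binary file
            "lidar": np.fromfile(seq_dir / "lidar" / f"{frame_id}.bin",
                                 dtype=np.float32).reshape(-1, 4),
            "labels": labels,
        }
```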
Abnormal Infant Movements Classification With Deep Learning on Pose-Based Features
The pursuit of early diagnosis of cerebral palsy has been an active research area, with some very promising results using tools such as the General Movements Assessment (GMA). In our previous work, we explored the feasibility of extracting pose-based features from video sequences to automatically classify infant body movement into two categories, normal and abnormal. The classification was based on the GMA, which was carried out on the video data by an independent expert reviewer. In this paper, we extend our previous work by extracting the normalised pose-based feature sets, Histograms of Joint Orientation 2D (HOJO2D) and Histograms of Joint Displacement 2D (HOJD2D), for use in new deep learning architectures. We explore the viability of using these pose-based feature sets for automated classification within a deep learning framework by carrying out extensive experiments on five new deep learning architectures. Experimental results show that the proposed fully connected neural network, FCNet, performed robustly across different feature sets. Furthermore, the proposed convolutional neural network architectures demonstrated excellent performance in handling features of higher dimensionality. We make the code, extracted features, and associated GMA labels publicly available.
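As a rough illustration of the kind of pose-based orientation feature described above, the sketch below computes a per-bone histogram of 2D joint orientations from tracked keypoints over a video clip. The bone list, binning, and normalisation are illustrative assumptions, not the exact HOJO2D definition used in the paper.

```python
# Minimal sketch of a 2D histogram-of-joint-orientation feature, assuming pose
# keypoints are given as (x, y) coordinates per joint per frame. Binning and
# normalisation choices here are assumptions for illustration only.
import numpy as np


def joint_orientation_histogram(poses: np.ndarray, bones, n_bins: int = 8):
    """poses: (T, J, 2) array of joint positions over T frames.
    bones: list of (parent, child) joint index pairs.
    Returns the concatenated, per-bone orientation histograms (L1-normalised)."""
    hists = []
    for parent, child in bones:
        vec = poses[:, child] - poses[:, parent]    # (T, 2) bone vectors
        angles = np.arctan2(vec[:, 1], vec[:, 0])   # bone orientation per frame
        hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
        hists.append(hist / max(hist.sum(), 1))     # normalise each bone's histogram
    return np.concatenate(hists)
```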
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving
Key-point-based scene understanding is fundamental for autonomous driving
applications. At the same time, optical flow plays an important role in many
vision tasks. However, due to an implicit bias toward equal attention on all
points, classic data-driven optical flow estimation methods yield less
satisfactory performance on key points, limiting their use in
key-point-critical, safety-relevant scenarios. To address these issues, we
introduce a points-based modeling method that requires the model to learn
key-point-related priors explicitly. Based on the modeling method, we present
FocusFlow, a framework consisting of 1) a mix loss function that combines a
classic photometric loss with our proposed Conditional Point Control Loss
(CPCL) for diverse point-wise supervision; and 2) a conditioned controlling
model that replaces the conventional feature encoder with our proposed
Condition Control Encoder (CCE). The CCE incorporates a Frame Feature
Encoder (FFE) that extracts features from frames, a Condition Feature Encoder
(CFE) that learns to control the feature extraction behavior of the FFE from
input masks containing key-point information, and fusion modules that transfer
the control information between the FFE and CFE. Our FocusFlow framework shows
outstanding performance with up to +44.5% precision improvement on various key
points such as ORB, SIFT, and even learning-based SiLK, along with exceptional
scalability for most existing data-driven optical flow methods like PWC-Net,
RAFT, and FlowFormer. Notably, FocusFlow yields performance competitive with or
superior to the original models on the whole frame. The source code
will be available at https://github.com/ZhonghuaYi/FocusFlow_official.
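As a rough illustration of the mixed supervision idea described above, the sketch below combines a dense whole-frame flow error with an additional term evaluated only at key-point locations given by a binary mask. The weighting and the masked point term are illustrative assumptions, not the paper's exact CPCL formulation.

```python
# Illustrative sketch of a mixed flow loss: a dense whole-frame term plus a
# penalty restricted to key-point locations. Not the paper's exact CPCL.
import torch


def mixed_flow_loss(pred_flow, gt_flow, keypoint_mask, point_weight=1.0):
    """pred_flow, gt_flow: (B, 2, H, W) flow fields.
    keypoint_mask: (B, 1, H, W) binary mask marking key-point pixels."""
    dense_term = (pred_flow - gt_flow).abs().mean()              # whole-frame error
    point_err = (pred_flow - gt_flow).abs() * keypoint_mask      # key-point pixels only
    point_term = point_err.sum() / keypoint_mask.sum().clamp(min=1)
    return dense_term + point_weight * point_term                # assumed weighting
```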