Drift robust non-rigid optical flow enhancement for long sequences
Dense long-term tracking of a non-rigid object is difficult and remains a
fundamental research problem in the computer vision community. The task often
relies on estimating pairwise correspondences between images over time, so
errors accumulate and cause drift. In this paper, we
introduce a novel optimization framework with an Anchor Patch constraint. It is
intended to significantly reduce overall error on long sequences containing
non-rigidly deformable objects. Our framework can be applied to any dense
tracking algorithm, e.g. optical flow. We demonstrate the success of our
approach by showing significant error reduction on 6 popular optical flow
algorithms applied to a range of real-world nonrigid benchmarks. We also
provide quantitative analysis of our approach given synthetic occlusions and
image noise.
Comment: Preprint version of our paper accepted by Journal of Intelligent and
Fuzzy System
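The drift described in this abstract can be illustrated with a toy simulation. The sketch below is not the paper's method and does not implement the Anchor Patch constraint. It only assumes that each frame-to-frame estimate carries independent zero-mean Gaussian error, and shows that the error of the chained track then grows roughly with the square root of the number of chained pairs. The function name and its parameters are hypothetical.

```python
import random
import statistics


def chained_drift_std(num_pairs, sigma=1.0, trials=2000, seed=0):
    """Empirical std of the accumulated error after chaining `num_pairs`
    pairwise estimates, each with independent N(0, sigma^2) error.

    Assumption: errors are independent and unbiased; real optical flow
    errors are usually correlated, so this is only a lower-order model.
    """
    rng = random.Random(seed)
    final_errors = []
    for _ in range(trials):
        err = 0.0
        for _ in range(num_pairs):
            # Each new frame pair adds its own error to the running track.
            err += rng.gauss(0.0, sigma)
        final_errors.append(err)
    return statistics.pstdev(final_errors)


if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, round(chained_drift_std(n), 2))
```

Under this model, quadrupling the sequence length roughly doubles the expected drift, which is why constraints tied to a fixed reference are attractive for long sequences.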
Video Interpolation using Optical Flow and Laplacian Smoothness
Non-rigid video interpolation is a common computer vision task. In this paper
we present an optical flow approach which adopts a Laplacian Cotangent Mesh
constraint to enhance the local smoothness. Similar to Li et al., our approach
fits a mesh to the image with a resolution up to one vertex per pixel and
uses angle constraints to ensure sensible local deformations between image
pairs. The Laplacian Mesh constraints are expressed wholly inside the optical
flow optimization, and can be applied in a straightforward manner to a wide
range of image tracking and registration problems. We evaluate our approach by
testing on several benchmark datasets, including the Middlebury and Garg et al.
datasets. In addition, we show application of our method for constructing 3D
Morphable Facial Models from dynamic 3D data.
Fly eyes are not still: a motion illusion in Drosophila flight supports parallel visual processing.
Most animals shift gaze by a 'fixate and saccade' strategy, where the fixation phase stabilizes background motion. A logical prerequisite for robust detection and tracking of moving foreground objects, therefore, is to suppress the perception of background motion. In a virtual reality magnetic tether system enabling free yaw movement, Drosophila implemented a fixate and saccade strategy in the presence of a static panorama. When the spatial wavelength of a vertical grating was below the Nyquist wavelength of the compound eyes, flies drifted continuously and gaze could not be maintained at a single location. Because the drift occurs from a motionless stimulus - thus any perceived motion stimuli are generated by the fly itself - it is illusory, driven by perceptual aliasing. Notably, the drift speed was significantly faster than under a uniform panorama, suggesting perceptual enhancement as a result of aliasing. Under the same visual conditions in a rigid-tether paradigm, wing steering responses to the unresolvable static panorama were not distinguishable from those to a resolvable static pattern, suggesting visual aliasing is induced by ego motion. We hypothesized that obstructing the control of gaze fixation also disrupts detection and tracking of objects. Using the illusory motion stimulus, we show that magnetically tethered Drosophila track objects robustly in flight even when gaze is not fixated as flies continuously drift. Taken together, our study provides further support for parallel visual motion processing and reveals the critical influence of body motion on visuomotor processing. Motion illusions can reveal important shared principles of information processing across taxa
Robust Registration of Dynamic Facial Sequences.
Accurate face registration is a key step for several image analysis applications. However, existing registration methods are prone to temporal drift errors or jitter among consecutive frames. In this paper, we propose an iterative rigid registration framework that estimates the misalignment with trained regressors. The input of the regressors is a robust motion representation that encodes the motion between a misaligned frame and the reference frame(s), and enables reliable performance under non-uniform illumination variations. Drift errors are reduced when the motion representation is computed from multiple reference frames. Furthermore, we use the L2 norm of the representation as a cue for performing coarse-to-fine registration efficiently. Importantly, the framework can identify registration failures and correct them. Experiments show that the proposed approach achieves significantly higher registration accuracy than the state-of-the-art techniques in challenging sequences.
The research work of Evangelos Sariyanidi and Hatice Gunes has been partially supported by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref.: EP/L00416X/1).
MFT: Long-Term Tracking of Every Pixel
We propose MFT -- Multi-Flow dense Tracker -- a novel method for dense,
pixel-level, long-term tracking. The approach exploits optical flows estimated
not only between consecutive frames, but also for pairs of frames at
logarithmically spaced intervals. It selects the most reliable sequence of
flows on the basis of estimates of its geometric accuracy and the probability
of occlusion, both provided by a pre-trained CNN. We show that MFT achieves
competitive performance on the TAP-Vid benchmark, outperforming baselines by a
significant margin, and tracking densely orders of magnitude faster than the
state-of-the-art point-tracking methods. The method is insensitive to
medium-length occlusions and it is robustified by estimating flow with respect
to the reference frame, which reduces drift.
Comment: accepted to WACV 2024. Code at https://github.com/serycjon/MF
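The selection idea in this abstract can be sketched for a single 1D point. This is a minimal illustration under stated assumptions, not the authors' implementation: `flow_fn` is a hypothetical stand-in for the flow estimator and the pre-trained CNN, the additive variance score is an assumed reliability measure, and holding the previous position when every candidate is occluded is an assumed fallback.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: flow_fn(src_frame, dst_frame, position) returns
# (displacement, variance, occlusion_probability) for that frame pair.
FlowFn = Callable[[int, int, float], Tuple[float, float, float]]


def candidate_deltas(t: int) -> List[int]:
    """Frame gaps 1, 2, 4, ... not exceeding t, plus t itself
    (the direct link back to reference frame 0)."""
    deltas = []
    d = 1
    while d <= t:
        deltas.append(d)
        d *= 2
    if t not in deltas:
        deltas.append(t)
    return deltas


def track_point(x0: float, num_frames: int, flow_fn: FlowFn,
                occl_thresh: float = 0.5):
    """Track one point, choosing for each frame the candidate chain with
    the lowest accumulated variance among non-occluded candidates."""
    pos = [x0]   # estimated position in every frame
    var = [0.0]  # accumulated variance of each estimate
    for t in range(1, num_frames):
        best = None
        for d in candidate_deltas(t):
            s = t - d  # source frame of this candidate flow
            disp, v, p_occ = flow_fn(s, t, pos[s])
            if p_occ > occl_thresh:
                continue  # skip candidates judged occluded
            total = var[s] + v
            if best is None or total < best[0]:
                best = (total, pos[s] + disp)
        if best is None:
            # Assumed fallback: keep the previous estimate.
            best = (var[t - 1], pos[t - 1])
        var.append(best[0])
        pos.append(best[1])
    return pos, var
```

Because a candidate that links directly to the reference frame adds only one flow's uncertainty, it is preferred whenever it is visible, which keeps error from accumulating along long chains.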
Automatic Structural Scene Digitalization
In this paper, we present an automatic system for the analysis and labeling
of structural scenes, floor plan drawings in Computer-aided Design (CAD)
format. The proposed system applies a fusion strategy to detect and recognize
various components of CAD floor plans, such as walls, doors, windows and other
ambiguous assets. Technically, a general rule-based filter parsing method is
first adopted to extract effective information from the original floor plan.
Then, an image-processing based recovery method is employed to correct
information extracted in the first step. Our proposed method is fully automatic
and real-time. The analysis system provides high accuracy and is also
evaluated on a public website that, on average, achieves more than ten
thousand effective uses per day and reaches a relatively high satisfaction
rate.
Comment: paper submitted to PloS On