Coding local and global binary visual features extracted from video sequences
Binary local features represent an effective alternative to real-valued
descriptors, leading to comparable results for many visual analysis tasks,
while being characterized by significantly lower computational complexity and
memory requirements. When dealing with large collections, a more compact
representation based on global features is often preferred, which can be
obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW)
model. Several applications, including for example visual sensor networks and
mobile augmented reality, require visual features to be transmitted over a
bandwidth-limited network, thus calling for coding techniques that aim at
reducing the required bit budget, while attaining a target level of efficiency.
In this paper we investigate a coding scheme tailored to both local and global
binary features, which aims at exploiting both spatial and temporal redundancy
by means of intra- and inter-frame coding. In this respect, the proposed coding
scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC)
paradigm. That is, visual features are extracted from the acquired content,
encoded at remote nodes, and finally transmitted to a central controller that
performs visual analysis. This is in contrast with the traditional approach, in
which visual content is acquired at a node, compressed and then sent to a
central unit for further processing, according to the Compress-Then-Analyze
(CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of
rate-efficiency curves in the context of two different visual analysis tasks:
homography estimation and content-based retrieval. Our results show that the
novel ATC paradigm based on the proposed coding primitives can be competitive
with CTA, especially in bandwidth-limited scenarios.
Comment: submitted to IEEE Transactions on Image Processing
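As an illustration of the temporal redundancy that inter-frame coding of binary features exploits (this is only a sketch, not the paper's actual codec), the snippet below XORs a binary descriptor with its matched predictor from the previous frame: when the tracked patch changes little, the residual has very few set bits and is therefore cheap to entropy-code.

```python
# Illustrative sketch (not the paper's codec): inter-frame coding of binary
# descriptors can transmit the XOR residual between a descriptor and its match
# in the previous frame instead of the raw descriptor.

def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")

def xor_residual(current: int, predictor: int) -> int:
    """Residual to encode for inter-frame coding."""
    return current ^ predictor

# Toy 32-bit descriptors for the same feature in consecutive frames.
d_prev = 0b1011_0010_1110_0001_0101_1100_0011_1010
d_curr = 0b1011_0010_1110_0001_0101_1100_0011_1000  # one bit flipped

residual = xor_residual(d_curr, d_prev)
# The residual has far fewer set bits than the raw descriptor; that sparsity
# is the redundancy an entropy coder can exploit.
print(bin(residual).count("1"), bin(d_curr).count("1"))
```

The same idea applies intra-frame by predicting each descriptor element from previously coded ones; the abstract covers both modes.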
Semantic Validation in Structure from Motion
The Structure from Motion (SfM) challenge in computer vision is the process
of recovering the 3D structure of a scene from a series of projective
measurements that are calculated from a collection of 2D images, taken from
different perspectives. SfM consists of three main steps: feature detection and
matching, camera motion estimation, and recovery of 3D structure from estimated
intrinsic and extrinsic parameters and features.
A problem encountered in SfM is that scenes lacking texture or with
repetitive features can cause erroneous feature matching between frames.
Semantic segmentation offers a route to validate and correct SfM models by
labelling pixels in the input images with the use of a deep convolutional
neural network. The semantic and geometric properties associated with classes
in the scene can be taken advantage of to apply prior constraints to each class
of object. The SfM pipeline COLMAP and semantic segmentation pipeline DeepLab
were used, together with planar reconstruction of the dense model, to identify
erroneous points that may be occluded from the calculated camera
position, given the semantic label and thus the prior constraint of the
reconstructed plane. Herein, semantic segmentation is integrated into SfM to
apply priors on the 3D point cloud, given the object detection in the 2D input
images. Additionally, the semantic labels of matched keypoints are compared and
inconsistent semantically labelled points discarded. Furthermore, semantic
labels on input images are used for the removal of objects associated with
motion in the output SfM models. The proposed approach is evaluated on a
dataset of 1102 images of a repetitive architecture scene. This project offers
a novel method for the improved validation of 3D SfM models.
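The semantic-consistency check on matched keypoints can be sketched as follows. This is a toy illustration under assumed inputs (per-keypoint labels from a segmentation network and a list of candidate matches), not the actual COLMAP/DeepLab integration:

```python
# Hedged sketch of the semantic filter described above: a keypoint match
# between two images is kept only if both endpoints carry the same semantic
# label. Labels and matches here are toy data, not real pipeline output.

def filter_matches(matches, labels_a, labels_b):
    """Keep only matches whose endpoints share a semantic label.

    matches: list of (idx_a, idx_b) keypoint index pairs
    labels_a, labels_b: per-keypoint class labels for each image
    """
    return [(i, j) for i, j in matches if labels_a[i] == labels_b[j]]

labels_a = ["building", "sky", "road", "building"]
labels_b = ["building", "building", "road", "sky"]
matches = [(0, 0), (1, 1), (2, 2), (3, 3)]  # (kp in image A, kp in image B)

kept = filter_matches(matches, labels_a, labels_b)
print(kept)  # semantically inconsistent matches are discarded
```

In the real pipeline the labels come from DeepLab's per-pixel segmentation sampled at each keypoint location, so label noise near class boundaries is a practical concern this sketch ignores.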
Plant image retrieval using color, shape and texture features
We present a content-based image retrieval system for plant image retrieval, intended especially for the house plant identification problem. A plant image consists of a collection of overlapping leaves and possibly flowers, which makes the problem challenging. We studied the suitability of various well-known color, shape and texture features for this problem, and introduced some new texture matching techniques and shape features. Feature extraction is applied after segmenting the plant region from the background using the max-flow min-cut technique. Results on a database of 380 plant images belonging to 78 different types of plants show the promise of the proposed new techniques
and the overall system: in 55% of the queries, the correct plant image is retrieved among the top-15 results. Furthermore, the accuracy goes up to 73% when a 132-image subset of well-segmented plant images is considered.
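The top-15 retrieval step can be sketched minimally. Here each image is stood in for by a small color histogram and ranked by L1 distance; the actual system combines color, shape and texture features after max-flow/min-cut segmentation, so treat this as an assumed, simplified representation:

```python
# Minimal retrieval sketch: rank database histograms by L1 distance to the
# query and return the top-k indices. Real features are richer than this.

def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def top_k(query, database, k=15):
    """Indices of the k database histograms closest to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: l1_distance(query, database[i]))
    return ranked[:k]

db = [
    [0.7, 0.2, 0.1],  # mostly green foliage
    [0.1, 0.8, 0.1],  # flowering plant
    [0.6, 0.3, 0.1],  # another leafy plant
]
query = [0.65, 0.25, 0.10]
print(top_k(query, db, k=2))  # the two leafy plants rank first
```

A query counts as successful in the reported 55% figure when the correct plant appears anywhere in this top-k list.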
A novel sketch based face recognition in unconstrained video for criminal investigation
Face recognition in video surveillance helps to identify an individual by comparing the facial features of a given photograph or sketch with those in a video, for criminal investigations. Generally, a face sketch is used by the police when the suspect's photo is not available. Manually matching a facial sketch against a suspect's image in a long video is a tedious and time-consuming task. To overcome these drawbacks, this paper proposes an accurate face recognition technique to recognize a person based on his sketch in unconstrained video surveillance. In the proposed method, the surveillance video and the sketch of the suspect are taken as input. Firstly, the input video is converted into frames and summarized using the proposed quality-indexed three-step cross search algorithm. Next, faces are detected by the proposed modified Viola-Jones algorithm. Then, the necessary features are selected using the proposed salp-cat optimization algorithm. Finally, these features are fused with scale-invariant feature transform (SIFT) features, and the Euclidean distance is computed between the feature vectors of the sketch and of each face in the video. The face in the video with the lowest Euclidean distance to the query sketch is considered the suspect's face. The proposed method's performance is analyzed on the Chokepoint dataset, and the system works efficiently with 89.02% precision, 91.25% recall and 90.13% F-measure.
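The final matching step above is a nearest-neighbor search under Euclidean distance. The sketch below uses toy three-dimensional vectors as stand-ins for the fused feature vectors described in the abstract:

```python
# Sketch of the final matching step: the suspect is the detected face whose
# feature vector has the smallest Euclidean distance to the sketch's vector.
# Vectors here are toy stand-ins, not real fused SIFT features.
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def best_match(sketch_vec, face_vecs):
    """Index of the face whose feature vector is closest to the sketch."""
    return min(range(len(face_vecs)),
               key=lambda i: euclidean(sketch_vec, face_vecs[i]))

faces = [[0.9, 0.1, 0.4],   # face 0
         [0.2, 0.8, 0.5],   # face 1
         [0.4, 0.6, 0.7]]   # face 2
sketch = [0.25, 0.75, 0.55]
print(best_match(sketch, faces))
```

In practice a distance threshold would also be needed to report "no match" when the suspect is absent from the video; the abstract does not discuss that case.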
Virtual viewpoint three-dimensional panorama
Conventional panoramic images are known to provide an enhanced field of view in which the scene
always has a fixed appearance. The idea presented in this paper focuses on the use of virtual
viewpoint creation to generate different panoramic images of the same scene with a three-dimensional
component. The three-dimensional effect in the resultant panorama is realized by superimposing a stereo pair of
panoramic images.
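The abstract does not specify how the stereo pair is superimposed; one common superimposition is a red-cyan anaglyph, sketched here on toy RGB pixels purely as an illustration of the general idea, not as the paper's method:

```python
# Illustrative anaglyph superimposition (an assumption, not the paper's exact
# technique): take the red channel from the left-eye panorama and the green
# and blue channels from the right-eye panorama.

def anaglyph(left_px, right_px):
    """Combine one RGB pixel from each eye into an anaglyph pixel."""
    return (left_px[0], right_px[1], right_px[2])

left = [(200, 10, 20), (180, 30, 40)]     # left-eye panorama row
right = [(50, 120, 130), (60, 140, 150)]  # right-eye panorama row

row = [anaglyph(l, r) for l, r in zip(left, right)]
print(row)
```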
Large-scale environment mapping and immersive human-robot interaction for agricultural mobile robot teleoperation
Remote operation is a crucial solution to problems encountered in
agricultural machinery operations. However, traditional video streaming control
methods fall short in overcoming the challenges of single perspective views and
the inability to obtain 3D information. In light of these issues, our research
proposes a large-scale digital map reconstruction and immersive human-machine
remote control framework for agricultural scenarios. In our methodology, a DJI
unmanned aerial vehicle (UAV) was utilized for data collection, and a novel
video segmentation approach based on feature points was introduced. To tackle
texture richness variability, an enhanced Structure from Motion (SfM) using
superpixel segmentation was implemented. This method integrates the open
Multiple View Geometry (openMVG) framework along with Local Features from
Transformers (LoFTR). The enhanced SfM results in a point cloud map, which is
further processed through Multi-View Stereo (MVS) to generate a complete map
model. For control, a closed-loop system utilizing TCP for VR control and
positioning of agricultural machinery was introduced. Our system offers a fully
vision-based immersive control method: upon connection to the local area
network, operators can utilize VR for immersive remote control. The proposed
method enhances both the robustness and convenience of the reconstruction
process, thereby significantly facilitating operators in acquiring more
comprehensive on-site information and engaging in immersive remote control
operations. The code is available at: https://github.com/LiuTao1126/Enhance-SF
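The "video segmentation approach based on feature points" is only named in the abstract; one plausible reading, sketched below under that assumption, is to split the video into segments whenever the fraction of feature points shared with the segment's first frame drops below a threshold. Frame "features" here are toy ID sets, not real tracked keypoints:

```python
# Hedged sketch of feature-point-based video segmentation: start a new segment
# when overlap with the current segment's reference frame falls below a
# threshold. This is an assumed heuristic, not the paper's exact algorithm.

def segment_video(frame_features, min_overlap=0.5):
    """Group consecutive frames into (start, end) segments by feature overlap."""
    segments, start = [], 0
    for i in range(1, len(frame_features)):
        ref = frame_features[start]
        overlap = len(ref & frame_features[i]) / len(ref)
        if overlap < min_overlap:
            segments.append((start, i - 1))
            start = i
    segments.append((start, len(frame_features) - 1))
    return segments

# Toy per-frame sets of tracked feature-point IDs.
frames = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7},
          {8, 9, 10, 11}, {8, 9, 10, 12}]
print(segment_video(frames))
```

Segmenting first keeps each SfM sub-problem visually coherent, which is consistent with the abstract's goal of handling texture-richness variability before reconstruction.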
Research on a modified RANSAC and its applications to ellipse detection from a static image and motion detection from active stereo video sequences
System: new; Report number: Ko 3091; Degree: Doctor of Philosophy (Global Information and Telecommunication Studies); Date conferred: 2010/2/24; Waseda University degree number: Shin 535
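For context, the generic RANSAC skeleton the thesis modifies works as follows; this sketch fits a 2D line (a simpler model than an ellipse, chosen so the minimal sample is just two points) and is not the thesis's modified variant:

```python
# Generic RANSAC skeleton (not the thesis's modification): repeatedly fit a
# model to a minimal random sample and keep the model with the most inliers.
import random

def fit_line(p, q):
    """Line through p and q as (a, b, c) with ax + by + c = 0, unit-normalized."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    n = (a * a + b * b) ** 0.5
    return a / n, b / n, -(a * x1 + b * y1) / n

def ransac_line(points, iters=50, tol=0.1, seed=0):
    """Best line and its inlier count; seeded for repeatability."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        p, q = rng.sample(points, 2)
        if p == q:  # degenerate sample
            continue
        a, b, c = fit_line(p, q)
        inliers = sum(abs(a * x + b * y + c) < tol for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = (a, b, c), inliers
    return best, best_inliers

# Four points on y = x plus one gross outlier.
pts = [(0, 0), (1, 1), (2, 2), (3, 3), (10, -5)]
model, inliers = ransac_line(pts)
print(inliers)  # the outlier is rejected
```

For ellipse detection the minimal sample grows to five points and the model fit becomes a conic fit, but the sample-score-keep loop is the same.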