Segmentation of Moving Object with Uncovered Background, Temporary Poses and GMOB
Video has to be segmented into objects for content-based processing. A number of video object segmentation algorithms have been proposed, both semiautomatic and automatic. Semiautomatic methods add a burden to users and are not suitable for some applications. Automatic segmentation systems are still a challenge, although they are required by many applications. The proposed work aims to identify the gaps present in current segmentation systems and to give possible solutions to overcome those gaps, so that an accurate and efficient video segmentation system can be developed. The proposed system aims to resolve the issues of uncovered background, temporary poses and global motion of background.
Deep-Learning-Based Computer Vision Approach For The Segmentation Of Ball Deliveries And Tracking In Cricket
There has been a significant increase in the adoption of technology in
cricket recently. This trend has created the problem of duplicate work being
done in similar computer vision-based research works. Our research tries to
solve one of these problems by segmenting ball deliveries in a cricket
broadcast using deep learning models, MobileNet and YOLO, thus enabling
researchers to use our work as a dataset for their research. The output from
our research can be used by cricket coaches and players to analyze ball
deliveries which are played during the match. This paper presents an approach
to segment and extract video shots in which only the ball is being delivered.
The video shots are a series of continuous frames that make up the whole scene
of the video. Object detection models are applied to reach a high level of
accuracy in terms of correctly extracting video shots. A proof of concept for
building large datasets of video shots of ball deliveries is proposed, which
paves the way for further processing on those shots for the extraction of
semantics. Ball tracking in these video shots is also done using a separate
RetinaNet model as a sample of the usefulness of the proposed dataset. The
position on the cricket pitch where the ball lands is also extracted by
tracking the ball along the y-axis. The video shot is then classified as a
full-pitched, good-length or short-pitched delivery.
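The final classification step above could be sketched as a simple thresholding of the ball's bounce position along the pitch. This is an illustrative assumption: the function name and the pitch-fraction thresholds below are hypothetical and are not taken from the paper, which does not specify them.

```python
# Hypothetical sketch: classify a delivery's length from the y-coordinate
# where the tracked ball lands on the pitch. The thresholds (0.4, 0.7) are
# illustrative assumptions, not values reported in the paper.

def classify_delivery(bounce_y: float, pitch_length_px: float) -> str:
    """Classify a delivery from where the ball lands along the pitch.

    bounce_y is measured in pixels from the batter's end of the pitch;
    pitch_length_px is the pitch length in the same image coordinates.
    """
    fraction = bounce_y / pitch_length_px  # 0.0 = batter's end, 1.0 = bowler's end
    if fraction < 0.4:      # lands close to the batter
        return "full-pitched"
    elif fraction < 0.7:    # intermediate zone
        return "good-length"
    else:                   # lands far from the batter
        return "short-pitched"

print(classify_delivery(150, 1000))  # → full-pitched
```

In practice the bounce y-coordinate would come from the RetinaNet ball-tracking output mentioned in the abstract, e.g. the frame where the ball's vertical trajectory reverses.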
Reconfigurable FPGA Architecture for Computer Vision Applications in Smart Camera Networks
Smart Camera Networks (SCNs) are an emerging research field representing the
natural evolution of centralized computer vision applications towards fully
distributed and pervasive systems. In this vision, one of the biggest efforts
lies in the definition of a flexible and reconfigurable SCN node architecture
able to remotely update, at runtime, both the application parameters and the
computer vision application being performed. In this respect, we present a
novel SCN node architecture based on a device in which a microcontroller
manages all the network functionality as well as the remote configuration,
while an FPGA implements all the necessary modules of a full computer vision
pipeline. In this work the envisioned architecture is first detailed in
general terms, then a real implementation is presented to show the
feasibility and the benefits of the proposed solution. Finally, performance
evaluation results underline the potential of a hardware/software codesign
approach in achieving flexibility and reduced processing time.
Linking Spatial Video and GIS
Spatial Video is any form of geographically referenced videographic data. The forms in which it is acquired, stored and used vary enormously, as does the standard of accuracy of the spatial data and the quality of the video footage. This research deals with a specific form of Spatial Video where these data have been captured from a moving road-network survey vehicle. The spatial data are GPS sentences, while the video orientation is approximately orthogonal and coincident with the direction of travel.
GIS that use these data are usually bespoke standalone systems or third party extensions to existing platforms. They specialise in using the video as a visual enhancement with limited spatial functionality and interoperability. While enormous amounts of these data exist, they do not have a generalised, cross-platform spatial data structure that is suitable for use within a GIS. The objectives of this research have been to define, develop and implement a novel Spatial Video data structure and demonstrate how this can achieve a spatial approach to the study of video.
This data structure is called a Viewpoint and represents the capture location and geographical extent of each video frame. It is generalised to represent any form or format of Spatial Video. It is shown how a Viewpoint improves on existing data structure methodologies and how it can be theoretically defined in 3D space. A 2D implementation is then developed where Viewpoints are constructed from the spatial and camera parameters of each survey in the study area. A number of problems are defined and solutions provided towards the implementation of a post-processing system to calculate, index and store each video frame Viewpoint in a centralised spatial database.
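A 2D Viewpoint as described above pairs a frame's capture location with the geographical extent it covers. A minimal sketch of such a record is given below; the field names, the polygon representation of the extent, and the point-in-polygon query are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch of a 2D Viewpoint record: each video frame stores its
# GPS-derived capture location plus a polygon approximating the ground area
# visible in that frame. All names and geometry choices here are assumptions.

from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]  # (longitude, latitude)

@dataclass
class Viewpoint:
    frame_index: int          # position of the frame in the video file
    capture_location: Coord   # GPS-derived camera position
    heading_deg: float        # camera orientation / direction of travel
    extent: List[Coord]       # 2D polygon of the frame's visible ground area

    def contains(self, point: Coord) -> bool:
        """Ray-casting point-in-polygon test against the 2D extent."""
        x, y = point
        inside = False
        n = len(self.extent)
        for i in range(n):
            x1, y1 = self.extent[i]
            x2, y2 = self.extent[(i + 1) % n]
            # Toggle on each polygon edge the horizontal ray crosses
            if (y1 > y) != (y2 > y):
                if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                    inside = not inside
        return inside
```

With records like this in a spatial database, a query such as "which frames can see location P" reduces to filtering Viewpoints whose extent contains P, which is the kind of spatial approach to video the research demonstrates.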
From this spatial database a number of geospatial analysis approaches are demonstrated that represent novel ways of using and studying Spatial Video based on the Viewpoint data structure. Also, a unique application is developed where the Viewpoints are used as a spatial control to dynamically access and play video in a location aware system.
While video has to date largely been ignored as a GIS spatial data source, it is shown through this novel Viewpoint implementation and the geospatial analysis demonstrations that this need not be the case anymore.