1,725 research outputs found
Algorithms for trajectory integration in multiple views
PhDThis thesis addresses the problem of deriving a coherent and accurate localization
of moving objects from partial visual information when data are generated by cameras
placed in di erent view angles with respect to the scene. The framework is built around
applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a
geometric-based solution exploits the relationships between corresponding feature points
across views and improves accuracy in object location. Then, we improve the estimation
of objects location with geometric transformations that account for lens distortions.
Additionally, we study the integration of the partial visual information generated by each
individual sensor and their combination into one single frame of observation that considers
object association and data fusion. Our approach is fully image-based, only relies on 2D
constructs and does not require any complex computation in 3D space. We exploit the
continuity and coherence in objects' motion when crossing cameras' elds of view. Additionally,
we work under the assumption of planar ground plane and wide baseline (i.e.
cameras' viewpoints are far apart). The main contributions are: i) the development of a
framework for distributed visual sensing that accounts for inaccuracies in the geometry
of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based
homography estimation; iii) the integration of a polynomial method for correcting inaccuracies
caused by the cameras' lens distortion; iv) a global trajectory reconstruction
algorithm that associates and integrates fragments of trajectories generated by each camera
An automatic visual analysis system for tennis
This article presents a novel video analysis system for coaching tennis players of all levels, which uses computer vision algorithms to automatically edit and index tennis videos into meaningful annotations.
Existing tennis coaching software lacks the ability to automatically index a tennis match into key events, and therefore, a coach who uses existing software is burdened with time-consuming manual video editing. This work aims to explore the effectiveness of a system to automatically detect tennis events. A secondary aim of this work is to explore the bene- fits coaches experience in using an event retrieval system to retrieve the automatically indexed events. It was found that automatic event detection can significantly improve the experience of using video feedback as part of an instructional coaching session. In addition to the automatic detection of key tennis events, player and ball movements are automati- cally tracked throughout an entire match and this wealth of data allows users to find interesting patterns in play. Player and ball movement information are integrated with the automatically detected tennis events, and coaches can query the data to retrieve relevant key points during a match or analyse player patterns that need attention. This coaching software system allows coaches to build advanced queries, which cannot be facilitated with existing video coaching solutions, without tedious manual indexing. This article proves that the event detection algorithms in this work can detect the main events in tennis with an average precision and recall of 0.84 and 0.86, respectively, and can typically eliminate man- ual indexing of key tennis events
Prioritizing Content of Interest in Multimedia Data Compression
Image and video compression techniques make data transmission and storage in digital multimedia systems more efficient and feasible for the system's limited storage and bandwidth. Many generic image and video compression techniques such as JPEG and H.264/AVC have been standardized and are now widely adopted. Despite their great success, we observe that these standard compression techniques are not the best solution for data compression in special types of multimedia systems such as microscopy videos and low-power wireless broadcast systems. In these application-specific systems where the content of interest in the multimedia data is known and well-defined, we should re-think the design of a data compression pipeline. We hypothesize that by identifying and prioritizing multimedia data's content of interest, new compression methods can be invented that are far more effective than standard techniques. In this dissertation, a set of new data compression methods based on the idea of prioritizing the content of interest has been proposed for three different kinds of multimedia systems. I will show that the key to designing efficient compression techniques in these three cases is to prioritize the content of interest in the data. The definition of the content of interest of multimedia data depends on the application. First, I show that for microscopy videos, the content of interest is defined as the spatial regions in the video frame with pixels that don't only contain noise. Keeping data in those regions with high quality and throwing out other information yields to a novel microscopy video compression technique. Second, I show that for a Bluetooth low energy beacon based system, practical multimedia data storage and transmission is possible by prioritizing content of interest. I designed custom image compression techniques that preserve edges in a binary image, or foreground regions of a color image of indoor or outdoor objects. Last, I present a new indoor Bluetooth low energy beacon based augmented reality system that integrates a 3D moving object compression method that prioritizes the content of interest.Doctor of Philosoph
Selected Topics in Bayesian Image/Video Processing
In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework.;Camera shake, motion or defocus during exposure leads to image blur. Single image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution due to inaccurate image prior and estimator. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture(GSM) model, which is estimated from non-blurry images as training data. Our experimental results on a total twelve natural images have shown that more details are restored than previous deblurring algorithms.;In augmented reality, it is a challenging problem to insert virtual content in video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on the building facades from street view video streams. Without knowing camera positions, the geometry model of a building facade is established by using a detection and tracking combined strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmented performance in this automatic VCI system.;Coding efficiency is an important objective in video coding. In recent years, video coding standards have been developing by adding new tools. However, it costs numerous modifications in the complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches without modifying the codec structures. In the third part, an exemplar-based data pruning video compression scheme for intra frame is introduced. Data pruning is used as a pre-processing tool to remove part of video data before they are encoded. At the decoder, missing data is reconstructed by a sparse linear combination of similar patches. The novelty is to create a patch library to exploit similarity of patches. The scheme achieves an average 4% bit rate reduction on some high definition videos
Recommended from our members
Intelligent Side Information Generation in Distributed Video Coding
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding to one where the computational cost is largely incurred by the decoder. This is attractive as the proven theoretical work of Wyner-Ziv (WZ) and Slepian-Wolf (SW) shows that the performance by such a system should be exactly the same as a conventional coder. Despite the solid theoretical foundations, current DVC qualitative and quantitative performance falls short of existing conventional coders and there remain crucial limitations. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of original video frames which are not available at the decoder. Techniques to generate SI have usually been based on linear motion compensated temporal interpolation (LMCTI), though these do not always produce satisfactory SI quality, especially in sequences exhibiting non-linear motion.
This thesis presents an intelligent higher order piecewise trajectory temporal interpolation (HOPTTI) framework for SI generation with original contributions that afford better SI quality in comparison to existing LMCTI-based approaches. The major elements in this framework are: (i) a cubic trajectory interpolation algorithm model that significantly improves the accuracy of motion vector estimations; (ii) an adaptive overlapped block motion compensation (AOBMC) model which reduces both blocking and overlapping artefacts in the SI emanating from the block matching algorithm; (iii) the development of an empirical mode switching algorithm; and (iv) an intelligent switching mechanism to construct SI by automatically selecting the best macroblock from the intermediate SI generated by HOPTTI and AOBMC algorithms. Rigorous analysis and evaluation confirms that significant quantitative and perceptual improvements in SI quality are achieved with the new framework
Recommended from our members
A Novel Multi-View Table Tennis Umpiring Framework
This research investigates the development of a low-cost multi-view umpiring framework, as an alternative to the current expensive systems that are almost exclusively restricted to elite professional sports. Table tennis has been selected as the testbed because, while automating the process is challenging, it has many different complex match elements including the service, return and rallies, which are governed by a strict set of regulations. The focus is mainly on the rally element rather than the whole match. Ball detection and tracking in video frames are undertaken to determine reliably the ball position relative to key reference objects like the table surface and net, and the ball’s flight path is used to determine the rally’s status.
While a low-cost option has benefits, it is technically challenging due to the limited number of cameras and generally low video resolution used. This thesis presents a portable multi-view umpiring framework that identifies each state change in a rally. It makes three significant contributions to knowledge: i) a reliable ball detection strategy that accurately detects the location of the ball in low-resolution sequences; ii) a novel framework for ball tracking using a multi-view system, and iii) a new state-machine based evaluation system for analysing table tennis rallies.
In a series of ten different test scenarios, the system achieved an average of 94% system detection rate and 100% accurate decisions. A test sequence of duration 1 s can be processed in 8 s, leading to a delay of only 7 s, which is considered acceptable for practical purposes. This solution has the potential to reform the way matches are umpired, providing objectivity in resolving disputed decisions. It affords an economic technology for amateur players, while the multi-view facility is extendible to other relevant ball-based sports. Finally, the ball flight path analysis mechanism can be a valuable training tool for skills development
Towards gestural understanding for intelligent robots
Fritsch JN. Towards gestural understanding for intelligent robots. Bielefeld: Universität Bielefeld; 2012.A strong driving force of scientific progress in the technical sciences is the quest for systems that assist humans in their daily life and make their life easier and more enjoyable. Nowadays smartphones are probably the most typical instances of such systems. Another class of systems that is getting increasing attention are intelligent robots. Instead of offering a smartphone touch screen to select actions, these systems are intended to offer a more natural human-machine interface to their users. Out of the large range of actions performed by humans, gestures performed with the hands play a very important role especially when humans interact with their direct surrounding like, e.g., pointing to an object or manipulating it. Consequently, a robot has to understand such gestures to offer an intuitive interface. Gestural understanding is, therefore, a key capability on the way to intelligent robots.
This book deals with vision-based approaches for gestural understanding. Over the past two decades, this has been an intensive field of research which has resulted in a variety of algorithms to analyze human hand motions. Following a categorization of different gesture types and a review of other sensing techniques, the design of vision systems that achieve hand gesture understanding for intelligent robots is analyzed. For each of the individual algorithmic steps – hand detection, hand tracking, and trajectory-based gesture recognition – a separate Chapter introduces common techniques and algorithms and provides example methods. The resulting recognition algorithms are considering gestures in isolation and are often not sufficient for interacting with a robot who can only understand such gestures when incorporating the context like, e.g., what object was pointed at or manipulated.
Going beyond a purely trajectory-based gesture recognition by incorporating context is an important prerequisite to achieve gesture understanding and is addressed explicitly in a separate Chapter of this book. Two types of context, user-provided context and situational context, are reviewed and existing approaches to incorporate context for gestural understanding are reviewed. Example approaches for both context types provide a deeper algorithmic insight into this field of research. An overview of recent robots capable of gesture recognition and understanding summarizes the currently realized human-robot interaction quality.
The approaches for gesture understanding covered in this book are manually designed while humans learn to recognize gestures automatically during growing up. Promising research targeted at analyzing developmental learning in children in order to mimic this capability in technical systems is highlighted in the last Chapter completing this book as this research direction may be highly influential for creating future gesture understanding systems
- …