
    Table tennis event detection and classification

    It is well understood that multiple video cameras and computer vision (CV) technology can be used in sport for match officiating, statistics and player performance analysis. A review of the literature reveals a number of existing solutions, both commercial and theoretical, within this domain. However, these solutions are expensive and often complex to install. The hypothesis for this research states that, by considering only changes in ball motion, automatic event classification is achievable with low-cost monocular video recording devices, without the need for 3-dimensional (3D) positional ball data and representation. The focus of this research is a rigorous empirical study of low-cost, single, consumer-grade video camera solutions applied to table tennis, confirming that monocular CV-based detected ball location data contains sufficient information to enable key match-play events to be recognised and measured. In total, a library of 276 event-based video sequences, using a range of recording hardware, was produced for this research. The research has four key considerations: i) an investigation into an effective recording environment with minimum configuration and calibration, ii) the selection and optimisation of a CV algorithm to detect the ball from the resulting single-source video data, iii) validation of the accuracy of the 2-dimensional (2D) CV data for motion change detection, and iv) the data requirements and processing techniques necessary to automatically detect changes in ball motion and match those to match-play events. Throughout the thesis, table tennis has been chosen as the example sport for observational and experimental analysis since it offers a number of specific CV challenges due to the relatively high ball speed (in excess of 100 km/h) and small ball size (40 mm in diameter). Furthermore, the inherent rules of table tennis show potential for a monocular-based event classification vision system. As the initial stage, a proposed optimum location and configuration of the single camera is defined. Next, the selection of a CV algorithm is critical in obtaining usable ball motion data. It is shown in this research that segmentation processes vary in their ball detection capabilities and location outputs, which ultimately affects the ability of automated event detection and decision-making solutions. Therefore, a comparison of CV algorithms is necessary to establish confidence in the accuracy of the derived location of the ball. As part of the research, a CV software environment has been developed to allow robust, repeatable and direct comparisons between different CV algorithms. An event-based method of evaluating the success of a CV algorithm is proposed. Comparison of CV algorithms is made against the novel Efficacy Metric Set (EMS), producing a measurable Relative Efficacy Index (REI). Within the context of this low-cost, single-camera ball trajectory and event investigation, experimental results show that the Horn-Schunck Optical Flow algorithm, with an REI of 163.5, is the most successful method when compared to a discrete selection of CV detection and extraction techniques gathered from the literature review. Furthermore, evidence-based data from the REI also suggests switching to the Canny edge detector (an REI of 186.4) for segmentation of the ball when in close proximity to the net.
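For readers unfamiliar with the algorithm identified above as the most effective, the following is a minimal sketch of the classical Horn-Schunck iteration in NumPy/SciPy; it illustrates the published technique only, not the thesis's implementation, and the smoothness weight `alpha` and iteration count are assumed values.

```python
# Minimal Horn-Schunck optical flow sketch (illustrative only; parameter
# values and frame handling are assumptions, not the thesis's code).
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate a dense flow field (u, v) between two grayscale frames."""
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)

    # Spatial and temporal image derivatives averaged over both frames.
    kx = 0.25 * np.array([[-1.0, 1.0], [-1.0, 1.0]])
    ky = 0.25 * np.array([[-1.0, -1.0], [1.0, 1.0]])
    kt = 0.25 * np.ones((2, 2))
    Ix = convolve(f1, kx) + convolve(f2, kx)
    Iy = convolve(f1, ky) + convolve(f2, ky)
    It = convolve(f2, kt) - convolve(f1, kt)

    # Kernel for the neighbourhood averages of u and v (smoothness term).
    avg = np.array([[1/12, 1/6, 1/12],
                    [1/6,  0.0, 1/6],
                    [1/12, 1/6, 1/12]])

    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    for _ in range(n_iter):
        u_avg = convolve(u, avg)
        v_avg = convolve(v, avg)
        # Jacobi-style Horn-Schunck update balancing data and smoothness terms.
        num = Ix * u_avg + Iy * v_avg + It
        den = alpha ** 2 + Ix ** 2 + Iy ** 2
        u = u_avg - Ix * num / den
        v = v_avg - Iy * num / den
    return u, v
```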
In addition to and in support of the data generated from the CV software environment, a novel method is presented for producing simultaneous data from 3D marker-based recordings, reduced to 2D and compared directly to the CV output to establish comparative time-resolved data for the ball location. It is proposed here that a continuous scale factor, based on the known dimensions of the ball, is incorporated at every frame. Using this method, comparison results show a mean accuracy of 3.01 mm when applied to a selection of nineteen video sequences and events. This tolerance is within 10% of the diameter of the ball and is accounted for by the limits of image resolution. Further experimental results demonstrate the ability to identify a number of match-play events from a monocular image sequence using a combination of the suggested optimum algorithm and ball motion analysis methods. The results show a promising application of 2D-based CV processing to match-play event classification with an overall success rate of 95.9%. The majority of failures occur when the ball, during returns and services, is partially occluded by either the player or the racket, due to the inherent problem of using a monocular recording device. Finally, the thesis proposes further research and extensions for developing and implementing monocular-based CV processing of motion-based event analysis and classification in a wider range of applications.
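A minimal sketch of the continuous scale-factor idea described above, assuming the ball's known 40 mm diameter and a per-frame estimate of its apparent diameter in pixels; the function and variable names are illustrative, not taken from the thesis.

```python
# Hedged sketch: per-frame pixel-to-millimetre scale factor derived from the
# known 40 mm ball diameter. Names and interfaces are assumptions.
BALL_DIAMETER_MM = 40.0

def mm_per_pixel(ball_diameter_px: float) -> float:
    """Millimetres represented by one pixel in the current frame."""
    return BALL_DIAMETER_MM / ball_diameter_px

def displacement_mm(p1_px, p2_px, ball_diameter_px):
    """Convert a pixel displacement between two ball detections to millimetres."""
    scale = mm_per_pixel(ball_diameter_px)
    dx = (p2_px[0] - p1_px[0]) * scale
    dy = (p2_px[1] - p1_px[1]) * scale
    return (dx ** 2 + dy ** 2) ** 0.5
```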

    Challenges in 3D scanning: Focusing on Ears and Multiple View Stereopsis


    CMB-S4 Science Book, First Edition

    This book lays out the scientific goals to be addressed by the next-generation ground-based cosmic microwave background experiment, CMB-S4, envisioned to consist of dedicated telescopes at the South Pole, the high Chilean Atacama plateau and possibly a northern hemisphere site, all equipped with new superconducting cameras. CMB-S4 will dramatically advance cosmological studies by crossing critical thresholds in the search for the B-mode polarization signature of primordial gravitational waves, in the determination of the number and masses of the neutrinos, in the search for evidence of new light relics, in constraining the nature of dark energy, and in testing general relativity on large scales.

    Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

    In this thesis we give new means for a machine to understand complex and dynamic visual scenes in real time. In particular, we solve the problem of simultaneously reconstructing a certain representation of the world's geometry, the observer's trajectory, and the moving objects' structures and trajectories, with the aid of vision exteroceptive sensors. We proceed by dividing the problem into three main steps: First, we give a solution to the Simultaneous Localization And Mapping (SLAM) problem for monocular vision that is able to perform adequately in the most ill-conditioned situations: those where the observer approaches the scene in a straight line. Second, we incorporate full 3D instantaneous observability by duplicating vision hardware with monocular algorithms. This allows us to avoid some of the inherent drawbacks of classic stereo systems, notably their limited range of 3D observability and the necessity of frequent mechanical calibration. Third, we add detection and tracking of moving objects by making use of this full 3D observability, whose necessity we judge almost inevitable. We choose a sparse, point-based representation of both the world and the moving objects in order to alleviate the computational payload of the image processing algorithms, which are required to extract the necessary geometrical information from the images. This alleviation is further supported by active feature detection and search mechanisms which focus attention on the image regions of highest interest. This focusing is achieved by extensive exploitation of the current knowledge available on the system (all the mapped information), something that we finally highlight as the ultimate key to success.
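As an illustration of how instantaneous 3D observability can be recovered from two calibrated monocular cameras, the sketch below performs standard linear triangulation with OpenCV; it is a generic example under assumed projection matrices and matched points, not the thesis's implementation.

```python
# Generic linear triangulation from two calibrated views (illustrative sketch;
# P1, P2 and the matched pixel coordinates are assumptions).
import numpy as np
import cv2

def triangulate(P1, P2, pts1, pts2):
    """P1, P2: 3x4 projection matrices; pts1, pts2: Nx2 matched pixel coords."""
    pts1 = np.asarray(pts1, dtype=np.float64).T   # 2xN layout expected by OpenCV
    pts2 = np.asarray(pts2, dtype=np.float64).T
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous points
    return (X_h[:3] / X_h[3]).T                      # Nx3 Euclidean points
```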

    Robust and real-time hand detection and tracking in monocular video

    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their device in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to directly interact with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation is video-based gesture recognition, hand detection and hand tracking. Gesture-based interaction allows the user to interact directly with the computer in a natural manner, exploring a virtual reality using nothing but their own body language. In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements. Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to follow the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure. Human hands are highly articulated objects, consisting of finger parts that are connected by joints. As a result, the appearance of a hand can vary greatly, depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model and therefore cannot be used to robustly detect human hands. We therefore developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template-based approach, we probabilistically model the spatial relations between the different hand parts and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the position of the complete hand. To comply with the real-time constraints, we developed techniques to speed up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state of the art in object detection while reducing computational complexity by a factor of 1,000.
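A minimal sketch of the part-to-centroid idea described above: each detected hand part votes for the hand centroid through a learned mean offset, and votes are combined weighted by detection confidence. The part names, offsets and weighting scheme are illustrative assumptions, not the dissertation's actual probabilistic model.

```python
# Hedged voting sketch for estimating a hand centroid from detected parts.
# All offsets, part names and weights are hypothetical.
import numpy as np

# Hypothetical mean offsets (in pixels) from each part to the hand centroid.
PART_OFFSETS = {"thumb_tip": (30, 40), "index_tip": (0, 55), "wrist": (0, -60)}

def estimate_centroid(detections):
    """detections: list of (part_name, (x, y), confidence) tuples."""
    votes, weights = [], []
    for name, (x, y), conf in detections:
        dx, dy = PART_OFFSETS[name]
        votes.append((x + dx, y + dy))      # each part casts a centroid vote
        weights.append(conf)                # weighted by detection confidence
    return tuple(np.average(np.array(votes), axis=0, weights=weights))
```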
Furthermore, we showed that our algorithm can also be used to detect other articulated objects such as persons or animals, and is therefore not restricted to the task of hand detection. Once a hand has been detected, a tracking algorithm can be used to continuously track its position over time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness, and that can also be applied in domains other than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employ methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state of the art in hand tracking, while providing a much lower computational complexity. One of the methods used by our probabilistic tracking algorithm is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in the literature are either too slow to be used in real-time applications, or are not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements and that is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene. The above methods are combined into a hand tracking framework which can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture-based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating-image display has been showcased at several national and international events and trade shows. The research described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator.
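As a generic illustration of probabilistic position tracking under noisy detections (not the dissertation's specific tracker), the sketch below applies a constant-velocity Kalman filter to 2D hand detections; the noise parameters are assumed values.

```python
# Generic constant-velocity Kalman filter over 2D detections; a hedged
# illustration of probabilistic tracking, not the dissertation's method.
import numpy as np

class ConstantVelocityKF:
    def __init__(self, q=1.0, r=10.0):
        self.x = np.zeros(4)                  # state: [px, py, vx, vy]
        self.P = np.eye(4) * 1e3              # state covariance (high initial uncertainty)
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = 1.0   # constant-velocity model
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0  # observe position only
        self.Q = np.eye(4) * q                # process noise
        self.R = np.eye(2) * r                # measurement noise

    def step(self, z):
        """Predict, then update with a detection z = (x, y); returns filtered position."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        y = np.asarray(z, dtype=float) - self.H @ self.x        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)                # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```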

    COrE (Cosmic Origins Explorer) A White Paper

    COrE (Cosmic Origins Explorer) is a fourth-generation full-sky, microwave-band satellite recently proposed to ESA within Cosmic Vision 2015-2025. COrE will provide maps of the microwave sky in polarization and temperature in 15 frequency bands, ranging from 45 GHz to 795 GHz, with an angular resolution ranging from 23 arcmin (45 GHz) to 1.3 arcmin (795 GHz) and sensitivities roughly 10 to 30 times better than PLANCK (depending on the frequency channel). The COrE mission will lead to breakthrough science in a wide range of areas, from primordial cosmology to galactic and extragalactic science. COrE is designed to detect the primordial gravitational waves generated during the epoch of cosmic inflation at more than 3σ for r = (T/S) ≥ 10^{-3}. It will also measure the CMB gravitational lensing deflection power spectrum to the cosmic variance limit on all linear scales, allowing us to probe absolute neutrino masses better than laboratory experiments and down to plausible values suggested by the neutrino oscillation data. COrE will also search for primordial non-Gaussianity with significant improvements over Planck in its ability to constrain the shape (and amplitude) of non-Gaussianity. In the areas of galactic and extragalactic science, COrE will, in its highest frequency channels, provide maps of the galactic polarized dust emission, allowing us to map the galactic magnetic field in areas of diffuse emission not otherwise accessible, in order to probe the initial conditions for star formation. COrE will also map the galactic synchrotron emission thirty times better than PLANCK. This White Paper reviews the COrE science program, our simulations on foreground subtraction, and the proposed instrumental configuration.
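For context, the "cosmic variance limit" mentioned above refers to the irreducible sampling uncertainty on an angular power spectrum $C_\ell$ observed over a sky fraction $f_{\mathrm{sky}}$; the standard expression (quoted for reference, not from the White Paper) is:

```latex
% Standard cosmic-variance limit on an angular power spectrum C_\ell
% observed over a sky fraction f_sky (for reference only).
\frac{\Delta C_\ell}{C_\ell} = \sqrt{\frac{2}{(2\ell + 1)\, f_{\mathrm{sky}}}}
```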