175 research outputs found

    Novel Motion Anchoring Strategies for Wavelet-based Highly Scalable Video Compression

    Full text link
    This thesis investigates new motion anchoring strategies that are targeted at wavelet-based highly scalable video compression (WSVC). We depart from two practices that are deeply ingrained in existing video compression systems. Instead of the commonly used block motion, which has poor scalability attributes, we employ piecewise-smooth motion together with a highly scalable motion boundary description. The combination of this more “physical” motion description together with motion discontinuity information allows us to change the conventional strategy of anchoring motion at target frames to anchoring motion at reference frames, which improves motion inference across time. In the proposed reference-based motion anchoring strategies, motion fields are mapped from reference to target frames, where they serve as prediction references; during this mapping process, disoccluded regions are readily discovered. Observing that motion discontinuities displace with foreground objects, we propose motion-discontinuity driven motion mapping operations that handle traditionally challenging regions around moving objects. The reference-based motion anchoring exposes an intricate connection between temporal frame interpolation (TFI) and video compression. When employed in a compression system, all anchoring strategies explored in this thesis perform TFI once all residual information is quantized to zero at a given temporal level. The interpolation performance is evaluated on both natural and synthetic sequences, where we show favourable comparisons with state-of-the-art TFI schemes. We explore three reference-based motion anchoring strategies. In the first one, the motion anchoring is “flipped” with respect to a hierarchical B-frame structure. We develop an analytical model to determine the weights of the different spatio-temporal subbands, and assess the suitability and benefits of this reference-based WSVC for (highly scalable) video compression. Reduced motion coding cost and improved frame prediction, especially around moving objects, result in improved rate-distortion performance compared to a target-based WSVC. As the thesis evolves, the motion anchoring is progressively simplified to one where all motion is anchored at one base frame; this central motion organization facilitates the incorporation of higher-order motion models, which improve the prediction performance in regions following motion with non-constant velocity

    Psychophysical investigations of visual density discrimination

    Get PDF
    Work in spatial vision is reviewed and a new effect of spatial averaging is reported. This shows that dot separation discriminations are improved if the cue is represented in the intervals within a collection of dots arranged in a lattice, compared to simple 2 dot separation discriminations. This phenomenon may be related to integrative processes that mediate texture density estimation. Four models for density discrimination are described. One involves measurements of spatial filter outputs. Computer simulations show that in principle, density cues can be encoded by a system of four DOG filters with peak sensitivities spanning a range of 3 octaves. Alternative models involve operations performed over representations in which spatial features are made explicit. One of these involves estimations of numerosity or coverage of the texture elements. Another involves averaging of the interval values between adjacent elements. A neural model for measuring the relevant intervals is described. It is argued that in principle the input to a density processor does not require the full sequence of operations in the MIRAGE transformation (eg. Watt and Morgan 1985). In particular, the regions of activity in the second derivative do not need to be interpreted in terms of edges, bars and blobs in order for density estimation to commence. This also implies that explicit coding of texture elements may be unnecessary. Data for density discrimination in regular and random dot patterns are reported. These do not support the coverage and counting models and observed performance shows significant departures from predictions based on an analysis of the statistics of the interval distribution in the stimuli. But this result can be understood in relation to other factors in the interval averaging process, and there is empirical support for the hypothesized method for measuring the intervals. Other experiments show that density is scaled according to stimulus size and possibly perceived depth. It is also shown that information from density analysis can be combined with size estimations to produce highly accurate discriminations of image expansion or object depth changes

    Change blindness: eradication of gestalt strategies

    Get PDF
    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task

    Variational methods and its applications to computer vision

    Get PDF
    Many computer vision applications such as image segmentation can be formulated in a ''variational'' way as energy minimization problems. Unfortunately, the computational task of minimizing these energies is usually difficult as it generally involves non convex functions in a space with thousands of dimensions and often the associated combinatorial problems are NP-hard to solve. Furthermore, they are ill-posed inverse problems and therefore are extremely sensitive to perturbations (e.g. noise). For this reason in order to compute a physically reliable approximation from given noisy data, it is necessary to incorporate into the mathematical model appropriate regularizations that require complex computations. The main aim of this work is to describe variational segmentation methods that are particularly effective for curvilinear structures. Due to their complex geometry, classical regularization techniques cannot be adopted because they lead to the loss of most of low contrasted details. In contrast, the proposed method not only better preserves curvilinear structures, but also reconnects some parts that may have been disconnected by noise. Moreover, it can be easily extensible to graphs and successfully applied to different types of data such as medical imagery (i.e. vessels, hearth coronaries etc), material samples (i.e. concrete) and satellite signals (i.e. streets, rivers etc.). In particular, we will show results and performances about an implementation targeting new generation of High Performance Computing (HPC) architectures where different types of coprocessors cooperate. The involved dataset consists of approximately 200 images of cracks, captured in three different tunnels by a robotic machine designed for the European ROBO-SPECT project.Open Acces

    Visual Tracking in Robotic Minimally Invasive Surgery

    Get PDF
    Intra-operative imaging and robotics are some of the technologies driving forward better and more effective minimally invasive surgical procedures. To advance surgical practice and capabilities further, one of the key requirements for computationally enhanced interventions is to know how instruments and tissues move during the operation. While endoscopic video captures motion, the complex appearance dynamic effects of surgical scenes are challenging for computer vision algorithms to handle with robustness. Tackling both tissue and instrument motion estimation, this thesis proposes a combined non-rigid surface deformation estimation method to track tissue surfaces robustly and in conditions with poor illumination. For instrument tracking, a keypoint based 2D tracker that relies on the Generalized Hough Transform is developed to initialize a 3D tracker in order to robustly track surgical instruments through long sequences that contain complex motions. To handle appearance changes and occlusion a patch-based adaptive weighting with segmentation and scale tracking framework is developed. It takes a tracking-by-detection approach and a segmentation model is used to assigns weights to template patches in order to suppress back- ground information. The performance of the method is thoroughly evaluated showing that without any offline-training, the tracker works well even in complex environments. Finally, the thesis proposes a novel 2D articulated instrument pose estimation framework, which includes detection-regression fully convolutional network and a multiple instrument parsing component. The framework achieves compelling performance and illustrates interesting properties includ- ing transfer between different instrument types and between ex vivo and in vivo data. In summary, the thesis advances the state-of-the art in visual tracking for surgical applications for both tissue and instrument motion estimation. It contributes to developing the technological capability of full surgical scene understanding from endoscopic video

    An exploration of hybrid art and design practice using computer-based design and fabrication tools.

    Get PDF
    The researchers previous experience suggested the use of computer-based design and fabrication tools might enable new models of practice that yield a greater integration between the 3D art and design disciplines. A critical, contextual review was conducted to assess what kinds of objects are being produced by art and design practitioners; what the significant characteristics of these objects might be; and what technological, theoretical and contextual frameworks support their making. A survey of international practitioners was undertaken to establish how practitioners use these tools and engage with other art and design disciplines. From these a formalised system of analysis was developed to derive evaluative criteria for these objects. The researcher developed a curatorial framework for a public exhibition and symposium that explored the direction that art and design practitioners are taking in relation to computer-based tools. These events allowed the researcher to survey existing works, explore future trends, gather audience and peer response and engage the broader community of interest around the field of enquiry. Interviews were conducted with practitioners whose work was included in this exhibition and project stakeholders to reveal patterns and themes relevant to the theoretical framework of this study. A model of the phases that practitioners go through when they integrate computer-based tools into their practice was derived from an existing technology adoption model. Also, a contemporary version of R. Krausss Klein Group was developed that considers developments in the field from the use of digital technologies. This was used to model the context within which the researchers practice is located. The research identifies a form of technologyled- practice and an increased capacity for a transdisciplinary discourse at the intersection of disciplinary domains. This study will be of interest to practitioners from across the 3D art and design disciplines that use computerbased tools

    Vertex classification for non-uniform geometry reduction.

    Get PDF
    Complex models created from isosurface extraction or CAD and highly accurate 3D models produced from high-resolution scanners are useful, for example, for medical simulation, Virtual Reality and entertainment. Often models in general require some sort of manual editing before they can be incorporated in a walkthrough, simulation, computer game or movie. The visualization challenges of a 3D editing tool may be regarded as similar to that of those of other applications that include an element of visualization such as Virtual Reality. However the rendering interaction requirements of each of these applications varies according to their purpose. For rendering photo-realistic images in movies computer farms can render uninterrupted for weeks, a 3D editing tool requires fast access to a model's fine data. In Virtual Reality rendering acceleration techniques such as level of detail can temporarily render parts of a scene with alternative lower complexity versions in order to meet a frame rate tolerable for the user. These alternative versions can be dynamic increments of complexity or static models that were uniformly simplified across the model by minimizing some cost function. Scanners typically have a fixed sampling rate for the entire model being scanned, and therefore may generate large amounts of data in areas not of much interest or that contribute little to the application at hand. It is therefore desirable to simplify such models non-uniformly. Features such as very high curvature areas or borders can be detected automatically and simplified differently to other areas without any interaction or visualization. However a problem arises when one wishes to manually select features of interest in the original model to preserve and create stand alone, non-uniformly reduced versions of large models, for example for medical simulation. To inspect and view such models the memory requirements of LoD representations can be prohibitive and prevent storage of a model in main memory. Furthermore, although asynchronous rendering of a base simplified model ensures a frame rate tolerable to the user whilst detail is paged, no guarantees can be made that what the user is selecting is at the original resolution of the model or of an appropriate LoD owing to disk lag or the complexity of a particular view selected by the user. This thesis presents an interactive method in the con text of a 3D editing application for feature selection from any model that fits in main memory. We present a new compression/decompression of triangle normals and colour technique which does not require dedicated hardware that allows for 87.4% memory reduction and allows larger models to fit in main memory with at most 1.3/2.5 degrees of error on triangle normals and to be viewed interactively. To address scale and available hardware resources, we reference a hierarchy of volumes of different sizes. The distances of the volumes at each level of the hierarchy to the intersection point of the line of sight with the model are calculated and these distances sorted. At startup an appropriate level of the tree is automatically chosen by separating the time required for rendering from that required for sorting and constraining the latter according to the resources available. A clustered navigation skin and depth buffer strategy allows for the interactive visualisation of models of any size, ensuring that triangles from the closest volumes are rendered over the navigation skin even when the clustered skin may be closer to the viewer than the original model. We show results with scanned models, CAD, textured models and an isosurface. This thesis addresses numerical issues arising from the optimisation of cost functions in LoD algorithms and presents a semi-automatic solution for selection of the threshold on the condition number of the matrix to be inverted for optimal placement of the new vertex created by an edge collapse. We show that the units in which a model is expressed may inadvertently affect the condition of these matrices, hence affecting the evaluation of different LoD methods with different solvers. We use the same solver with an automatically calibrated threshold to evaluate different uniform geometry reduction techniques. We then present a framework for non-uniform reduction of regular scanned models that can be used in conjunction with a variety of LoD algorithms. The benefits of non-uniform reduction are presented in the context of an animation system. (Abstract shortened by UMI.)
    • 

    corecore