111 research outputs found

    Image Local Features Description through Polynomial Approximation

    Get PDF
    This work introduces a novel local patch descriptor that remains invariant under varying conditions of orientation, viewpoint, scale, and illumination. The proposed descriptor incorporates polynomials of various degrees to approximate the local patch within the image. Before feature detection and approximation, the image micro-texture is eliminated with a guided image filter, which preserves object edges. Rotation invariance is achieved by aligning the local patch around the Harris corner through the dominant orientation shift algorithm. Weighted threshold histogram equalization (WTHE) is employed to make the descriptor insensitive to illumination changes. The correlation coefficient is used instead of Euclidean distance to improve matching accuracy. The proposed descriptor has been extensively evaluated on Oxford's affine covariant regions dataset and on the absolute and transition tilt dataset. The experimental results show that the proposed descriptor characterizes features more distinctively than state-of-the-art descriptors. © 2013 IEEE. This work was supported by the Qatar National Library.
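    The matching stage, at least, is straightforward to illustrate: candidates are ranked by correlation coefficient rather than Euclidean distance. Below is a minimal Python/NumPy sketch of that generic idea; the function name and shapes are illustrative assumptions, not taken from the paper.

    import numpy as np

    def correlation_match(query, candidates):
        """Rank candidate descriptors by Pearson correlation with the query.

        query:      1-D descriptor vector of length d (illustrative layout)
        candidates: 2-D array, one candidate descriptor per row (n x d)
        Returns candidate indices sorted from best to worst match.
        """
        q = query - query.mean()
        c = candidates - candidates.mean(axis=1, keepdims=True)
        # Pearson correlation of the query against every candidate row.
        corr = (c @ q) / (np.linalg.norm(c, axis=1) * np.linalg.norm(q) + 1e-12)
        return np.argsort(-corr)  # higher correlation = better match

    # Example: match one 64-D descriptor against 100 candidates.
    rng = np.random.default_rng(0)
    ranking = correlation_match(rng.standard_normal(64),
                                rng.standard_normal((100, 64)))

    Unlike Euclidean distance, the correlation coefficient is unaffected by an additive offset or a global rescaling of the descriptor, which is one common motivation for using it under illumination change.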

    Image-Based Rendering Of Real Environments For Virtual Reality

    Get PDF

    Tracking and Mapping in Medical Computer Vision: A Review

    Full text link
    As computer vision algorithms become more capable, their applications in clinical systems will become more pervasive. These applications include diagnostics such as colonoscopy and bronchoscopy, guiding biopsies and minimally invasive interventions and surgery, automating instrument motion, and providing image guidance using pre-operative scans. Many of these applications depend on the specific visual nature of medical scenes and require designing and applying algorithms to perform in this environment. In this review, we provide an update on the field of camera-based tracking and scene mapping in surgery and diagnostics in medical computer vision. We begin by describing our review process, which results in a final list of 515 papers that we cover. We then give a high-level summary of the state of the art and provide relevant background for those who need tracking and mapping for their clinical applications. Next, we review the datasets provided in the field and the clinical needs they serve. We then delve into the algorithmic side and summarize recent developments, which should be especially useful for algorithm designers and for those looking to understand the capability of off-the-shelf methods. We focus on algorithms for deformable environments while also reviewing the essential building blocks of rigid tracking and mapping, since there is a large amount of crossover in methods. Finally, we discuss the current state of tracking and mapping methods along with needs for future algorithms, needs for quantification, and the viability of clinical applications in the field. We conclude that new methods need to be designed or combined to support clinical applications in deformable environments, and that more focus needs to be put into collecting datasets for training and evaluation. Comment: 31 pages, 17 figures
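    For readers new to the area, one of the rigid building blocks the review refers to is feature-based two-frame camera tracking. The sketch below shows a generic version with OpenCV; it is not a method from the review, and the function name and parameter values are illustrative.

    import cv2
    import numpy as np

    def relative_pose(img1, img2, K):
        """Estimate relative camera rotation/translation between two frames.

        img1, img2: grayscale uint8 images; K: 3x3 camera intrinsic matrix.
        Returns (R, t), with translation known only up to scale.
        """
        orb = cv2.ORB_create(2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
        # RANSAC rejects outlier correspondences, which matters in medical
        # scenes with specular highlights and tissue deformation.
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                       threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
        return R, t

    Deformable tracking, the review's main focus, replaces this rigid two-frame model with per-point or model-based motion, but typically reuses the same detection, matching, and robust-estimation stages.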

    A Deep Moving-camera Background Model

    Full text link
    In video analysis, background models have many applications, such as background/foreground separation, change detection, anomaly detection, tracking, and more. However, while learning such a model in a video captured by a static camera is a fairly solved task, success with a Moving-camera Background Model (MCBM) has been far more modest due to algorithmic and scalability challenges that arise from the camera motion. Thus, existing MCBMs are limited in their scope and their supported camera-motion types. These hurdles have also impeded the use, in this unsupervised task, of end-to-end solutions based on deep learning (DL). Moreover, existing MCBMs usually model the background either on the domain of a typically large panoramic image or in an online fashion. Unfortunately, the former creates several problems, including poor scalability, while the latter prevents the recognition and leveraging of cases where the camera revisits previously seen parts of the scene. This paper proposes a new method, called DeepMCBM, that eliminates all the aforementioned issues and achieves state-of-the-art results. Concretely, we first identify the difficulties associated with joint alignment of video frames in general and in a DL setting in particular. Next, we propose a new strategy for joint alignment that lets us use a spatial transformer net with neither a regularization term nor any form of specialized (and non-differentiable) initialization. Coupled with an autoencoder conditioned on unwarped robust central moments (obtained from the joint alignment), this yields an end-to-end, regularization-free MCBM that supports a broad range of camera motions and scales gracefully. We demonstrate DeepMCBM's utility on a variety of videos, including ones beyond the scope of other methods. Our code is available at https://github.com/BGU-CS-VIL/DeepMCBM . Comment: 26 pages, 5 figures. To be published in ECCV 2022
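    The joint-alignment idea is easiest to see in miniature. The following is a toy sketch, not the DeepMCBM architecture: a small PyTorch spatial transformer regresses one affine warp per frame, and simple per-pixel moments of the aligned stack then act as a crude background model. All layer sizes and names here are invented for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class JointAligner(nn.Module):
        """Minimal spatial-transformer-style joint aligner (illustrative only)."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 8, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(8, 16, 5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.theta = nn.Linear(16, 6)
            # Start at the identity transform so early training is stable.
            self.theta.weight.data.zero_()
            self.theta.bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

        def forward(self, frames):                      # frames: (N, 3, H, W)
            theta = self.theta(self.features(frames)).view(-1, 2, 3)
            grid = F.affine_grid(theta, frames.shape, align_corners=False)
            return F.grid_sample(frames, grid, align_corners=False)

    frames = torch.rand(8, 3, 64, 64)                   # a toy video clip
    aligned = JointAligner()(frames)
    # Per-pixel mean and variance of the aligned stack: a crude background
    # image plus a variability map (a stand-in for robust central moments).
    background, variability = aligned.mean(0), aligned.var(0)

    The paper's actual contribution lies in training such an aligner without regularization or specialized initialization, and in conditioning an autoencoder on robust central moments rather than the plain mean and variance used above.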

    An investigation into web-based panoramic video virtual reality with reference to the virtual zoo.

    Get PDF
    Panoramic image Virtual Reality (VR) is a 360-degree image that has been interpreted as a kind of VR allowing users to navigate, view, hear, and have remote access to a virtual environment. Panoramic video VR builds on this, where filming is done in the real world to create a highly dynamic and immersive environment. This is proving to be a very attractive technology and has introduced many possible applications, but it still presents a number of challenges, which are considered in this research. An initial literature survey identified limitations in panoramic video to date: the technology (e.g. filming and stitching) and the design of effective navigation methods. In particular, there is a tendency for users to become disoriented during way-finding. In addition, an effective interface design to embed contextual information is required.

    The research identified the need for a controllable test environment in order to evaluate the production of the video and the optimal way of presenting and navigating within the scene. Computer Graphics (CG) simulation scenes were developed to establish a method of capturing, editing and stitching the video under controlled conditions. In addition, a novel navigation method, named the “image channel”, was proposed and integrated within this environment, replacing hotspots: the traditional navigational jumps between locations. Initial user testing indicated that the production was appropriate and significantly improved user perception of position and orientation over jump-based navigation. The interface design combined with the environment view alone was sufficient for users to understand their location, without the need to augment the view with an on-screen map.

    Having established optimal methods for building and improving the technology, the research sought a natural, complex, and dynamic real environment for testing. The web-based virtual zoo (World Association of Zoos and Aquariums) was selected as an ideal production: its purpose is to allow people to get close to animals in their natural habitat, and it created particular interest in developing a system for knowledge delivery, raising protection concerns, and entertaining visitors: all key roles of a zoo. The design method established from CG was then used to develop a film rig and production unit for filming a real animal habitat: the Formosan rock monkey in Taiwan. A web-based panoramic video of this habitat was built and tested through user-experience testing and expert interviews. The results were essentially identical to the testing done in the prototype environment, validating the production, which also succeeded in attracting users back to the site repeatedly. The research has contributed new knowledge through improvements to the production process, improvements to presentation and navigation within panoramic videos through the proposed image channel method, and a demonstration that a web-based virtual zoo can be improved with this technology to help address the considerable pressures of animal extinction and habitat degradation that affect humans. Directions for further study were identified. The research was sponsored by Taiwan's Government, and Twycross Zoo UK was a collaborator.
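    The viewing core of any web-based panoramic video player (independent of the thesis's image channel contribution, which is not reproduced here) is a lookup from a desired viewing direction into the equirectangular frame. A minimal Python/NumPy sketch, with illustrative function and parameter names:

    import numpy as np

    def equirect_view(pano, yaw, pitch, fov=90.0, size=400):
        """Extract a perspective view from an equirectangular panorama.

        pano: H x W x 3 image covering 360 x 180 degrees; yaw, pitch, and
        fov are in degrees. Returns a size x size view in that direction.
        """
        H, W = pano.shape[:2]
        f = (size / 2) / np.tan(np.radians(fov) / 2)
        u, v = np.meshgrid(np.arange(size) - size / 2,
                           np.arange(size) - size / 2)
        # Rays through a virtual image plane, rotated by pitch then yaw.
        x, y, z = u, v, np.full_like(u, f, dtype=float)
        cp, sp = np.cos(np.radians(pitch)), np.sin(np.radians(pitch))
        y, z = cp * y - sp * z, sp * y + cp * z
        cy, sy = np.cos(np.radians(yaw)), np.sin(np.radians(yaw))
        x, z = cy * x + sy * z, -sy * x + cy * z
        lon = np.arctan2(x, z)
        lat = np.arcsin(y / np.sqrt(x * x + y * y + z * z))
        # Map longitude/latitude back to panorama pixel coordinates.
        px = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
        py = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
        return pano[py, px]

    Calling equirect_view(pano, yaw=90, pitch=0) pans the view a quarter turn to the right; a player repeats this lookup for every video frame as the user navigates.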