
    Quality metrics for sensor images

    Methods are needed for evaluating the quality of augmented visual displays (AVID). Computational quality metrics will help summarize, interpolate, and extrapolate the results of human performance tests with displays. The FLM Vision group at NASA Ames has been developing computational models of visual processing and using them to develop computational metrics for similar problems. For example, display modeling systems use metrics to compare proposed displays, halftone optimization methods use metrics to evaluate the difference between the halftone and the original, and image compression methods minimize the predicted visibility of compression artifacts. The visual discrimination models take as input two arbitrary images A and B and compute an estimate of the probability that a human observer will report that A is different from B. If A is an image that one desires to display and B is the actual displayed image, such an estimate can be regarded as an image quality metric reflecting how well B approximates A. There are additional complexities associated with evaluating the quality of radar- and IR-enhanced displays for AVID tasks. One important problem is whether intruding obstacles are detectable in such displays. Although the discrimination model can handle detection situations by making B the original image A plus the intrusion, this makes the inappropriate assumption that the observer knows where the intrusion will be. Effects of signal uncertainty need to be added to our models. A pilot needs to make decisions rapidly, so the models must predict not just the probability of a correct decision, but the probability of a correct decision by the time the decision needs to be made. That is, the models need to predict latency as well as accuracy. Luce and Green have generated models for auditory detection latencies; similar models are needed for visual detection. Most image quality models are designed for static imagery. Watson has been developing a general spatial-temporal vision model to optimize video compression techniques. These models need to be adapted and calibrated for AVID applications.
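
    A minimal sketch of the kind of discrimination metric described above, assuming a contrast-sensitivity weighting in the frequency domain, Minkowski pooling, and a Weibull psychometric link; the CSF coefficients, pixels_per_degree, and beta are illustrative placeholders, not values from the NASA Ames models.

    import numpy as np

    def csf(f):
        # Illustrative contrast sensitivity function over spatial
        # frequency f in cycles/degree (coefficients are placeholders).
        return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

    def discrimination_probability(img_a, img_b, pixels_per_degree=32.0, beta=4.0):
        # Estimate P(observer reports A differs from B): weight the
        # Fourier spectrum of the difference image by the CSF, pool the
        # weighted error with a Minkowski norm into a d'-like scalar,
        # and map it to probability with a Weibull psychometric link.
        diff = np.fft.fft2(img_a.astype(float) - img_b.astype(float))
        h, w = img_a.shape
        fy = np.fft.fftfreq(h) * pixels_per_degree   # cycles/degree
        fx = np.fft.fftfreq(w) * pixels_per_degree
        freq = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
        weighted = np.abs(diff) * csf(freq) / (h * w)
        d_prime = np.sum(weighted ** beta) ** (1.0 / beta)
        return 1.0 - np.exp(-(d_prime ** beta))      # Weibull link

    Extending such a metric to the detection-under-uncertainty and latency questions raised above would require additional machinery beyond this sketch.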

    Linear Facial Expression Transfer With Active Appearance Models

    The problem of transferring facial expressions from one person's face to another's has long been of interest to the movie industry and the computer graphics community. In recent years, with the proliferation of online image and video collections and web applications, such as Google Street View, the question of preserving privacy through face de-identification has gained interest in the computer vision community. In this paper, we focus on the problem of real-time dynamic facial expression transfer using an Active Appearance Model framework. We provide a theoretical foundation for a generalisation of two well-known expression transfer methods and demonstrate the improved visual quality of the proposed linear extrapolation transfer method on examples of face swapping and expression transfer using the AVOZES data corpus. Realistic talking faces can be generated in real time at low computational cost.
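
    The generalisation developed in the paper is not reproduced here, but the classic linear-offset idea it extends can be sketched in a few lines, assuming AAM shape/appearance parameter vectors are available for the source's neutral and expressive faces and for the target's neutral face; the gain argument is a hypothetical knob for the extrapolation variant.

    import numpy as np

    def transfer_expression(src_neutral, src_expr, tgt_neutral, gain=1.0):
        # The expression is encoded as the displacement of the source's
        # AAM parameters from its neutral face; adding that offset to
        # the target's neutral parameters transfers it. A gain above 1
        # gives a linear extrapolation variant.
        delta = src_expr - src_neutral
        return tgt_neutral + gain * delta

    The returned parameter vector would then be pushed back through the target's Active Appearance Model to synthesise the transferred face.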

    Placement, visibility and coverage analysis of dynamic pan/tilt/zoom camera sensor networks

    Multi-camera vision systems have important applications in a number of fields, including robotics and security. One interesting problem for such systems is to determine the effect of camera placement on the quality of service provided by a network of Pan/Tilt/Zoom (PTZ) cameras with respect to a specific image processing application. The goal of this work is to investigate how to place a team of PTZ cameras, potentially used for collaborative tasks such as surveillance, and to analyze the dynamic coverage they can provide. Computational Geometry approaches to various formulations of sensor placement problems have been shown to offer very elegant solutions; however, they often involve unrealistic assumptions about real-world sensors, such as infinite sensing range and infinite rotational speed. Other solutions to camera placement have attempted to account for the constraints of real-world computer vision applications, but offer solutions that are approximations over a discrete problem space. A contribution of this work is a camera placement algorithm that leverages Computational Geometry principles over a continuous problem space, using a model of dynamic camera coverage that is simple yet representative. This offers a balance between accounting for real-world application constraints and keeping the problem tractable.
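
    A toy dynamic-coverage model in this spirit, capturing the finite sensing range and finite rotational speed the abstract highlights; the wedge-shaped field of view and all parameters are illustrative assumptions, not the paper's model.

    import math

    def time_to_cover(cam_xy, cam_pan, target_xy, fov, max_range, pan_speed):
        # Time (s) the camera needs to bring the target into view, or
        # None if the target lies beyond the finite sensing range.
        dx = target_xy[0] - cam_xy[0]
        dy = target_xy[1] - cam_xy[1]
        if math.hypot(dx, dy) > max_range:
            return None                                # finite sensing range
        bearing = math.atan2(dy, dx)
        off = abs((bearing - cam_pan + math.pi) % (2 * math.pi) - math.pi)
        excess = max(0.0, off - fov / 2)               # rotation still needed
        return excess / pan_speed                      # finite rotational speed

    A placement algorithm can then score candidate camera positions by the worst-case or average of this response time over the region to be covered.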

    Dynamic MLP for MRI Reconstruction

    As convolutional neural networks (CNNs) have become the most successful reconstruction technique for accelerated Magnetic Resonance Imaging (MRI), they are reaching their limit on image quality, especially in sharpness. Further improvement often comes at massive computational cost, hindering practicability in the clinical setting. MRI reconstruction is essentially a deconvolution problem, which demands long-distance information that is difficult for CNNs with small convolution kernels to capture. A multi-layer perceptron (MLP) can model such long-distance information, but it requires a fixed input size, while the clinical setting demands reconstruction of images at flexible resolutions. In this paper, we propose a hybrid CNN and MLP reconstruction strategy, featuring a dynamic MLP (dMLP) that accepts arbitrary image sizes. Experiments were conducted using 3D multi-coil MRI. Our results suggest the proposed dMLP improves image sharpness compared to its pure CNN counterpart, at a minor additional cost in GPU memory and computation time. We further compared the proposed dMLP with CNNs using large kernels, and studied pure MLP-based reconstruction using a stack of 1D dMLPs as well as its CNN counterpart using only 1D convolutions. We observed that the enlarged receptive field noticeably improves image quality, while simply using a CNN with a large kernel leads to difficulties in training. Notably, the pure MLP-based method was outperformed by CNN-involved methods, matching observations in other computer vision tasks on natural images.
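
    The paper's actual dMLP design is not detailed in this abstract. One simple way an MLP can be made to accept arbitrary image widths is to resample each row to a fixed length, mix it with a learned MLP, and resample back, sketched below in PyTorch; the fixed_len and hidden sizes, the row-wise factorization, and the residual connection are all assumptions for illustration, not the published architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicRowMLP(nn.Module):
        # Mixes information along the width axis for inputs of any width
        # by resampling each row to a fixed length first.
        def __init__(self, fixed_len=256, hidden=512):
            super().__init__()
            self.fixed_len = fixed_len
            self.mlp = nn.Sequential(
                nn.Linear(fixed_len, hidden),
                nn.GELU(),
                nn.Linear(hidden, fixed_len),
            )

        def forward(self, x):                    # x: (B, C, H, W), any W
            b, c, h, w = x.shape
            rows = x.reshape(b * c * h, 1, w)    # each row as a 1D signal
            rows = F.interpolate(rows, size=self.fixed_len, mode="linear",
                                 align_corners=False)
            rows = rows + self.mlp(rows)         # residual long-range mixing
            rows = F.interpolate(rows, size=w, mode="linear",
                                 align_corners=False)
            return rows.reshape(b, c, h, w)

    Interleaving such row-wise (and, transposed, column-wise) mixing layers with ordinary convolutions would give one plausible hybrid CNN/MLP design of the kind the abstract describes.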