
    Fast intra prediction in the transform domain

    In this paper, we present a fast intra prediction method based on separating the transformed coefficients. The prediction block can be obtained from the transformed and quantized neighboring block that yields minimum distortion, with the DC and AC coefficients treated independently. Two prediction methods are proposed: full block search prediction (FBSP) and edge based distance prediction (EBDP), both of which find the best matched transformed coefficients on additional neighboring blocks. Experimental results show that the use of transform coefficients greatly enhances the efficiency of intra prediction whilst keeping complexity low compared to H.264/AVC.
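The idea of choosing DC and AC predictors independently can be illustrated with a minimal sketch. This is not the authors' FBSP or EBDP algorithm: the flat coefficient layout (DC first, AC after), the squared-error distortion, and the helper names `sse` and `predict_coefficients` are all assumptions made for illustration.

```python
def sse(a, b):
    """Sum of squared differences between two coefficient lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_coefficients(current, neighbors):
    """Pick the DC and the AC predictor independently (hypothetical helper).

    Each block is a flat list of quantized transform coefficients with the
    DC term at index 0 and the AC terms after it (assumed layout).
    """
    cur_dc, cur_ac = current[0], current[1:]
    # The neighbor with the closest DC term supplies the DC prediction.
    best_dc = min(neighbors, key=lambda n: (n[0] - cur_dc) ** 2)
    # A possibly different neighbor supplies the AC prediction.
    best_ac = min(neighbors, key=lambda n: sse(n[1:], cur_ac))
    return [best_dc[0]] + list(best_ac[1:])


# Toy data: one current block and three candidate neighboring blocks.
current = [100, 4, -3, 1]
neighbors = [[98, 10, 0, 0], [60, 4, -2, 1], [120, 0, 0, 0]]

prediction = predict_coefficients(current, neighbors)
residual = [c - p for c, p in zip(current, prediction)]
print(prediction, residual)
```

In this toy case the DC term comes from the first neighbor and the AC terms from the second, which leaves a smaller residual than either neighbor would alone. A real codec would also have to signal or derive the chosen neighbors at the decoder, which the sketch does not cover.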

    Decoding visual information from high-density diffuse optical tomography neuroimaging data

    BACKGROUND: Neural decoding could be useful in many ways, from serving as a neuroscience research tool to providing a means of augmented communication for patients with neurological conditions. However, applications of decoding are currently constrained by the limitations of traditional neuroimaging modalities. Electrocorticography requires invasive neurosurgery, magnetic resonance imaging (MRI) is too cumbersome for uses like daily communication, and alternatives like functional near-infrared spectroscopy (fNIRS) offer poor image quality. High-density diffuse optical tomography (HD-DOT) is an emerging modality that uses denser optode arrays than fNIRS to combine logistical advantages of optical neuroimaging with enhanced image quality. Despite the resulting promise of HD-DOT for facilitating field applications of neuroimaging, decoding of brain activity as measured by HD-DOT has yet to be evaluated. OBJECTIVE: To assess the feasibility and performance of decoding with HD-DOT in visual cortex. METHODS AND RESULTS: To establish the feasibility of decoding at the single-trial level with HD-DOT, a template matching strategy was used to decode visual stimulus position. A receiver operating characteristic (ROC) analysis was used to quantify the sensitivity, specificity, and reproducibility of binary visual decoding. Mean areas under the curve (AUCs) greater than 0.97 across 10 imaging sessions in a highly sampled participant were observed. ROC analyses of decoding across 5 participants established both reproducibility in multiple individuals and the feasibility of inter-individual decoding (mean AUCs > 0.7), although decoding performance varied between individuals. Phase-encoded checkerboard stimuli were used to assess more complex, non-binary decoding with HD-DOT. Across 3 highly sampled participants, the phase of a 60° wide checkerboard wedge rotating 10° per second through 360° was decoded with a within-participant error of 25.8±24.7°. Decoding between participants was also feasible based on permutation-based significance testing. CONCLUSIONS: Visual stimulus information can be decoded accurately, reproducibly, and across a range of detail (for both binary and non-binary outcomes) at the single-trial level (without needing to block-average test data) using HD-DOT data. These results lay the foundation for future studies of more complex decoding with HD-DOT and applications in clinical populations.
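A rough sketch of binary template-matching decoding followed by an ROC summary is given below. The abstract does not specify the similarity measure or how templates are built, so the Pearson correlation, the toy four-value "images", and the function names `decode_score` and `auc` are assumptions rather than the study's pipeline.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length, non-constant lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def decode_score(trial, template_a, template_b):
    """Positive scores favor condition A, negative scores favor B."""
    return pearson(trial, template_a) - pearson(trial, template_b)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve: the probability that a positive trial
    outscores a negative one, with ties counted as one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Toy templates and single trials (invented values, not HD-DOT data).
template_left = [1.0, 0.0, 0.0, 0.0]
template_right = [0.0, 0.0, 0.0, 1.0]
left_trials = [[0.9, 0.1, 0.0, 0.2], [0.7, 0.3, 0.1, 0.0]]
right_trials = [[0.1, 0.0, 0.2, 0.8], [0.2, 0.1, 0.0, 0.9]]

left_scores = [decode_score(t, template_left, template_right) for t in left_trials]
right_scores = [decode_score(t, template_left, template_right) for t in right_trials]
print(auc(left_scores, right_scores))
```

Each trial is scored on its own, without averaging across trials, which mirrors the single-trial framing of the abstract. The perfectly separable toy data says nothing about performance on real recordings.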

    Fusion-Based Versatile Video Coding Intra Prediction Algorithm with Template Matching and Linear Prediction

    The new generation video coding standard Versatile Video Coding (VVC) has adopted many novel technologies to improve compression performance, and consequently, remarkable results have been achieved. In practical applications, less data, in terms of bitrate, would reduce the burden on the sensors and improve their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper. Specifically, to better predict areas with similar texture information, we propose a fusion-based adaptive template matching method, which directly takes the error between reference and objective templates into account. Furthermore, to better utilize the correlation between reference pixels and the pixels to be predicted, we propose a fusion-based linear prediction method, which can compensate for the deficiency of single linear prediction. We implemented our algorithm on top of the VVC Test Model (VTM) 9.1. Compared with VVC, our proposed fusion-based algorithm reduces bitrate by 0.89%, 0.84%, and 0.90% on average for the Y, Cb, and Cr components, respectively. In addition, compared with some other existing works, our algorithm showed superior performance in bitrate savings.

    Distributed Video Coding for Multiview and Video-plus-depth Coding


    Image analysis using visual saliency with applications in hazmat sign detection and recognition

    Visual saliency is the perceptual process that makes attractive objects stand out from their surroundings in the low-level human visual system. Visual saliency has been modeled as a preprocessing step of the human visual system for selecting the important visual information from a scene. We investigate bottom-up visual saliency using spectral analysis approaches. We present separate and composite model families that generalize existing frequency domain visual saliency models. We propose several frequency domain visual saliency models to generate saliency maps using new spectrum processing methods and an entropy-based saliency map selection approach. A group of saliency map candidates is then obtained by inverse transform. A final saliency map is selected among the candidates by minimizing the entropy of the saliency map candidates. The proposed models based on the separate and composite model families are also extended to various color spaces. We develop an evaluation tool for benchmarking visual saliency models. Experimental results show that the proposed models are more accurate and efficient than most state-of-the-art visual saliency models in predicting eye fixation. We use the above visual saliency models to detect the location of hazardous material (hazmat) signs in complex scenes. We develop a hazmat sign location detection and content recognition system using visual saliency. Saliency maps are employed to extract salient regions that are likely to contain hazmat sign candidates, and a Fourier descriptor based contour matching method is then used to locate the border of hazmat signs in these regions. This visual saliency based approach is able to increase the accuracy of sign location detection, reduce the number of false positive objects, and speed up the overall image analysis process. We also propose a color recognition method to interpret the color inside the detected hazmat sign. Experimental results show that our proposed hazmat sign location detection method is capable of detecting and recognizing projectively distorted, blurred, and shaded hazmat signs at various distances. In other work we investigate error concealment for scalable video coding (SVC). When video compressed with SVC is transmitted over loss-prone networks, the decompressed video can suffer severe visual degradation across multiple frames. In order to enhance the visual quality, we propose an inter-layer error concealment method using motion vector averaging and slice interleaving to deal with burst packet losses and error propagation. Experimental results show that the proposed error concealment methods outperform two existing methods.
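The entropy-based selection step for saliency maps can be sketched as follows. The abstract only states that the final map minimizes entropy among the candidates, so the histogram-based Shannon entropy, the 8-bin quantization, the assumption that values lie in [0, 1], and the names `entropy` and `select_map` are illustrative choices.

```python
from math import log2


def entropy(saliency_map, bins=8):
    """Shannon entropy in bits of the histogram of a map with values in [0, 1]."""
    counts = [0] * bins
    values = [v for row in saliency_map for v in row]
    for v in values:
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        counts[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts if c)


def select_map(candidates):
    """Return the candidate saliency map with the lowest entropy."""
    return min(candidates, key=entropy)


# Toy candidates: one map with a single sharp peak, one with spread-out values.
focused = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
diffuse = [[0.1, 0.3, 0.5], [0.7, 0.9, 0.2], [0.4, 0.6, 0.8]]

chosen = select_map([diffuse, focused])
print(chosen is focused)
```

The intuition is that a map concentrating its energy on a few locations has a peaked histogram and therefore lower entropy than a map that is uniformly noisy.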

    Analog parallel processor solutions for video encoding

    This thesis deals with Cellular Nonlinear Network (CNN) analog parallel processor networks and their implementations in current video coding standards. The target applications are low-power video encoders within 3rd generation mobile terminals. The video codecs of such mobile terminals are defined by either the MPEG-4/H.263 or H.264 video standard. All of these standards are based on the block-based hybrid approach. As block-based motion estimation (ME) is responsible for most of the power consumption of such hybrid video encoders, this thesis deals mostly with low-power ME implementations. Low-power solutions are introduced at both the algorithmic and hardware levels. On the algorithmic level, the introduced implementations are derived from a segmentation algorithm, which has previously been partly realized. The first introduced algorithm reduces the computational complexity of ME within an object-based MPEG-4 encoder. The use of this algorithm enables a 60% drop in the power consumption of Full Search ME. The second algorithm calculates a near-optimal block-size partition for H.264 motion estimation. With this algorithm, the use of computationally complex Lagrange optimization in H.264 ME is not required. The third algorithm reduces the shape bit-rate of an object-based MPEG-4 encoder. On the hardware level, a CNN-type ME architecture is introduced. The architecture includes connections and circuitry to fully realize block-based ME. The analog ME implemented with this architecture is capable of lower power than comparable digital realizations. A 9×9 test chip has also been realized. Additionally implemented is a digital predictive ME realization that takes advantage of the introduced partition algorithm. Although the IC layout of the ME algorithm was drawn, the design was verified on an FPGA.
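For context, the Full Search block-matching motion estimation that the thesis uses as its power baseline can be sketched in software as below. This is the generic textbook form with a sum of absolute differences (SAD) cost, not the thesis's CNN-based or predictive implementation; the frame layout and the names `sad`, `full_search`, and `frame_with_patch` are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref. Frames are lists of rows."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, search):
    """Test every displacement within +/- search and return (cost, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= h
                    and 0 <= bx + dx and bx + dx + n <= w):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def frame_with_patch(x0, y0, size=6):
    """Build a toy frame of zeros containing one 2 x 2 textured patch."""
    frame = [[0] * size for _ in range(size)]
    patch = [[5, 6], [7, 8]]
    for y in range(2):
        for x in range(2):
            frame[y0 + y][x0 + x] = patch[y][x]
    return frame


ref = frame_with_patch(1, 1)   # patch position in the reference frame
cur = frame_with_patch(2, 2)   # patch has moved by (+1, +1) in the current frame
result = full_search(cur, ref, bx=2, by=2, n=2, search=2)
print(result)
```

The nested loops evaluate (2·search + 1)² candidates per block, each costing n² absolute differences, which is consistent with the abstract's point that ME dominates encoder power consumption.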

    Ultrafast Ultrasound Imaging

    Among medical imaging modalities, such as computed tomography (CT) and magnetic resonance imaging (MRI), ultrasound imaging stands out due to its temporal resolution. Owing to the nature of medical ultrasound imaging, it has been used for not only observation of the morphology of living organs but also functional imaging, such as blood flow imaging and evaluation of the cardiac function. Ultrafast ultrasound imaging, which has recently become widely available, significantly increases the opportunities for medical functional imaging. Ultrafast ultrasound imaging typically enables imaging frame-rates of up to ten thousand frames per second (fps). Due to the extremely high temporal resolution, this enables visualization of rapid dynamic responses of biological tissues, which cannot be observed and analyzed by conventional ultrasound imaging. This Special Issue includes various studies of improvements to the performance of ultrafast ultrasoun