81 research outputs found

    Enhanced low bitrate H.264 video coding using decoder-side super-resolution and frame interpolation

    Advanced inter-prediction modes have recently been introduced in the literature to improve the coding performance of both the H.264 and High Efficiency Video Coding standards. Decoder-side motion analysis and motion vector derivation have been proposed to reduce the coding cost of motion information. Here, we introduce enhanced skip and direct modes for H.264 coding using decoder-side super-resolution (SR) and frame interpolation. P- and B-frames are downsampled and H.264 encoded at lower resolution (LR). The reconstructed LR frames are then super-resolved using decoder-side motion estimation. Alternatively for B-frames, bidirectional true motion estimation is performed to synthesize a B-frame from its reference frames. For P-frames, bicubic interpolation of the LR frame is used as an alternative to SR reconstruction. A rate-distortion optimal mode selection algorithm is developed to decide, for each macroblock (MB), which of the two reconstructions to use as the skip/direct mode prediction. Simulations indicate an average peak signal-to-noise ratio (PSNR) improvement of 1.04 dB, or a 23.0% bitrate reduction, at low bitrates when compared with the H.264 standard. The PSNR gains reach as high as 3.00 dB for inter-predicted frames and 3.78 dB when only B-frames are considered. Decoded videos exhibit significantly better visual quality as well. This research was supported by TUBITAK Career Grant 108E201. Publisher's Version
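    The rate-distortion optimal mode selection described above chooses, per macroblock, between the super-resolved and the interpolated reconstruction. A minimal sketch of such a Lagrangian decision follows; the function names, the one-bit signaling cost, and the lambda value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rd_cost(original, prediction, rate_bits, lam):
    """Lagrangian cost: SSD distortion plus lambda-weighted rate."""
    distortion = float(np.sum((original.astype(float) - prediction.astype(float)) ** 2))
    return distortion + lam * rate_bits

def choose_mode(original_mb, sr_mb, interp_mb, lam=10.0):
    """Pick the skip/direct prediction with the lower Lagrangian cost.
    Assumes signaling the choice costs roughly one bit per macroblock."""
    cost_sr = rd_cost(original_mb, sr_mb, rate_bits=1, lam=lam)
    cost_interp = rd_cost(original_mb, interp_mb, rate_bits=1, lam=lam)
    return "super-resolution" if cost_sr <= cost_interp else "interpolation"
```

    Larger `lam` values shift the decision toward cheaper-to-signal modes; here both candidates cost the same rate, so the choice reduces to distortion.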

    Rate-distortion and complexity optimized motion estimation for H.264 video coding

    The H.264 video coding standard supports several inter-prediction coding modes that use macroblock (MB) partitions with variable block sizes. Rate-distortion (R-D) optimal selection of both the motion vectors (MVs) and the coding mode of each MB is essential for an H.264 encoder to achieve superior coding efficiency. Unfortunately, searching for the optimal MVs of each possible subblock incurs a heavy computational cost. In this paper, in order to reduce the computational burden of integer-pel motion estimation (ME) without sacrificing coding performance, we propose a joint R-D and complexity optimization framework. Within this framework, we develop a simple method that determines, for each MB, which partitions are likely to be optimal. The MV search is carried out for only the selected partitions, thus reducing the complexity of the ME step. The mode selection criterion is based on a measure of spatiotemporal activity within the MB. The procedure minimizes the coding loss at a given level of computational complexity, either for the full video sequence or for each single frame. For the latter case, the algorithm provides a tight upper bound on the worst-case complexity/execution time of the ME module. Simulation results show that the algorithm speeds up integer-pel ME by a factor of up to 40 with less than 0.2 dB loss in coding efficiency. Publisher's Version
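    The partition pre-selection idea can be sketched as follows: a simple spatiotemporal activity measure gates which MB partition sizes are actually searched. The activity measure and the threshold values here are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def spatiotemporal_activity(mb, prev_mb):
    """Toy activity measure: spatial gradient energy plus temporal difference."""
    mb = mb.astype(float)
    spatial = np.abs(np.diff(mb, axis=0)).sum() + np.abs(np.diff(mb, axis=1)).sum()
    temporal = np.abs(mb - prev_mb.astype(float)).sum()
    return spatial + temporal

def candidate_partitions(activity, low=500.0, high=5000.0):
    """Map activity to the short list of MB partitions worth searching."""
    if activity < low:       # static, smooth area: a large partition suffices
        return ["16x16"]
    if activity < high:      # moderate motion or detail
        return ["16x16", "16x8", "8x16"]
    return ["16x16", "16x8", "8x16", "8x8"]  # busy area: search all sizes
```

    Restricting the MV search to the returned list is what yields the complexity reduction; the thresholds would be tuned to meet a target complexity budget.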

    Hierarchical quantization indexing for wavelet and wavelet packet image coding

    In this paper, we introduce the quantization index hierarchy, which is used for efficient coding of quantized wavelet and wavelet packet coefficients. A hierarchical classification map is defined in each wavelet subband, which describes the quantized data through a series of index classes. Going from the bottom to the top of the tree, neighboring coefficients are combined to form classes that represent some statistics of the quantization indices of these coefficients. Higher levels of the tree are constructed iteratively by repeating this class assignment to partition the coefficients into larger subsets. The class assignments are optimized using a rate-distortion cost analysis. The optimized tree is coded hierarchically from top to bottom by coding the class membership information at each level of the tree. Context-adaptive arithmetic coding is used to improve coding efficiency. The developed algorithm produces PSNR results that are better than those of the state-of-the-art wavelet-based and wavelet packet-based coders in the literature. This research was supported by Isik University BAP-05B302 Grant. Publisher's Version
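    The bottom-up construction can be sketched as follows: each level of the tree stores a per-block statistic of the level below. The 2x2 block size and the choice of the maximum magnitude as the statistic are illustrative assumptions (the paper optimizes its class definitions via rate-distortion analysis), and the sketch assumes a square, power-of-two-sized subband.

```python
import numpy as np

def build_hierarchy(indices):
    """Return levels from finest (the indices themselves) to a single root class.
    Assumes a square input whose side is a power of two."""
    levels = [np.abs(np.asarray(indices))]
    while levels[-1].shape[0] > 1:
        cur = levels[-1]
        h, w = cur.shape
        # Combine neighboring coefficients: 2x2 max-pooling as the class statistic.
        pooled = cur.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
        levels.append(pooled)
    return levels
```

    Coding then proceeds top-down: a coarse class value of zero tells the decoder that the whole subset below it is insignificant, so the finer levels need not be coded.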

    Spherical coding algorithm for wavelet image compression

    PubMed ID: 19342336. In the recent literature, there exist many high-performance wavelet coders that use different spatially adaptive coding techniques in order to exploit the spatial energy compaction property of the wavelet transform. Two crucial issues in adaptive methods are the level of flexibility and the coding efficiency achieved while modeling different image regions and allocating bitrate within the wavelet subbands. In this paper, we introduce the "spherical coder," which provides a new adaptive framework for handling these issues in a simple and effective manner. The coder uses local energy as a direct measure to differentiate between parts of the wavelet subband and to decide how to allocate the available bitrate. As local energy becomes available at finer resolutions, i.e., in smaller windows, the coder automatically updates its decisions about how to spend the bitrate. We use a hierarchical set of variables to specify and code the local energy up to the highest resolution, i.e., the energy of individual wavelet coefficients. The overall scheme is nonredundant, meaning that the subband information is conveyed using this equivalent set of variables without the need for any side parameters. Despite its simplicity, the algorithm produces PSNR results that are competitive with the state-of-the-art coders in the literature. Publisher's Version. Author Post Print
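    The coarse-to-fine energy description can be sketched as a traversal that reports the energy of each window at every resolution, down to individual coefficients. This toy version ignores how the energies are quantized and entropy coded, and assumes a square subband with power-of-two side length.

```python
import numpy as np

def energy_tree(subband):
    """Yield (window_origin, window_size, energy) from coarse to fine."""
    sub = np.asarray(subband, dtype=float)
    n = sub.shape[0]
    size = n
    while size >= 1:
        for i in range(0, n, size):
            for j in range(0, n, size):
                window = sub[i:i + size, j:j + size]
                yield (i, j), size, float((window ** 2).sum())
        size //= 2
```

    Because each window's energy equals the sum of its children's energies, coding one child energy implicitly constrains its sibling, which is what makes the hierarchical description nonredundant.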

    Görüntü ayrıştırma için özgün süperpiksel bölütleme algoritmaları geliştirilmesi (Development of novel superpixel segmentation algorithms for image parsing)

    Superpixels are widely used in image segmentation and parsing problems. In scene labeling, the image is initially divided into visually consistent small pieces by a superpixel algorithm; the superpixels are then parsed into different classes. In this project, segmentation and labeling are considered together from a global perspective, and novel approaches are proposed for the different steps of image parsing. In particular, several methods are developed for alternative segmentation, feature extraction, class-likelihood computation, and contextual modeling of superpixels. First, the effect of different segmentation methods and parameters on labeling accuracy is thoroughly tested. Next, superpixel feature selection and coding, and the modeling of class-label likelihood computation, are investigated. Finally, a generalized contextual modeling framework is developed for the fusion of alternative segmentation results. The proposed methods are tested and optimized on several semantic image databases. In addition, in the final phase of the project, this work is adapted to the problem of land cover classification from satellite images. Simulation results show that substantial improvements in image labeling accuracy can be achieved by accurately combining the complementary information coming from different segmentation methods. TÜBİTAK
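    The basic fusion step can be sketched as follows: each segmentation assigns every pixel the class-likelihood vector of its superpixel, and the fused label is the argmax of the accumulated likelihoods. The project's generalized contextual model is richer than this; the sketch shows only the simplest combination rule, with hypothetical inputs.

```python
import numpy as np

def fuse_labelings(seg_maps, sp_likelihoods):
    """seg_maps: list of HxW superpixel-id maps, one per segmentation.
    sp_likelihoods: parallel list of dicts, superpixel id -> class-likelihood vector.
    Returns an HxW map of fused class labels."""
    h, w = seg_maps[0].shape
    n_classes = len(next(iter(sp_likelihoods[0].values())))
    acc = np.zeros((h, w, n_classes))
    for seg, lik in zip(seg_maps, sp_likelihoods):
        for sp_id, vec in lik.items():
            # Spread the superpixel's likelihood vector over its pixels.
            acc[seg == sp_id] += vec
    return acc.argmax(axis=2)
```

    Because the segmentations disagree on boundaries, each pixel accumulates evidence from differently shaped regions, which is where the complementary information comes from.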

    3-D Mesh geometry compression with set partitioning in the spectral domain

    This paper presents the development of a highly efficient progressive 3-D mesh geometry coder based on the region-adaptive transform of the spectral mesh compression method. A hierarchical set partitioning technique, originally proposed for the efficient compression of wavelet transform coefficients in high-performance wavelet-based image coding methods, is proposed for the efficient compression of the coefficients of this transform. Experiments confirm that the proposed coder employing such a region-adaptive transform achieves a compression performance rarely matched by other state-of-the-art 3-D mesh geometry compression algorithms. A new, high-performance fixed spectral basis method is also proposed to reduce the computational complexity of the transform. Many-to-one mappings are employed to relate the coded irregular mesh region to a regular mesh whose basis is used. To prevent loss of compression performance due to the low-pass nature of such mappings, transitions are made from transform-based coding to spatial coding on a per-region basis at high coding rates. Experimental results show the performance advantage of the newly proposed fixed spectral basis method over the original fixed spectral basis method in the literature, which employs one-to-one mappings. This work was supported in part by the Scientific and Technological Research Council of Turkey, and conducted under Project 106E064. Publisher's Version
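    The set partitioning principle borrowed from wavelet image coding can be illustrated on a 1-D coefficient list: sets are tested against a threshold and split only when they contain a significant coefficient, so large insignificant sets cost a single bit. This is a generic sketch of the significance-test idea, not the paper's coder, and the binary splitting rule is an assumption.

```python
def significant(coeffs, threshold):
    """A set is significant if any coefficient magnitude reaches the threshold."""
    return any(abs(c) >= threshold for c in coeffs)

def partition_pass(coeffs, threshold):
    """Return indices of coefficients significant at this threshold,
    splitting significant sets in half until single coefficients remain."""
    found = []
    stack = [list(range(len(coeffs)))]
    while stack:
        idx = stack.pop()
        vals = [coeffs[i] for i in idx]
        if not significant(vals, threshold):
            continue  # the whole set is described with one 'insignificant' bit
        if len(idx) == 1:
            found.append(idx[0])
        else:
            mid = len(idx) // 2
            stack.extend([idx[:mid], idx[mid:]])
    return sorted(found)
```

    A progressive coder runs this pass with halving thresholds, which is what yields the embedded, rate-scalable bitstream.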

    PL-GAN: Path loss prediction using generative adversarial networks

    Accurate prediction of path loss is essential for the design and optimization of wireless communication networks. Existing path loss prediction methods typically suffer from a trade-off between accuracy and computational efficiency. In this paper, we present a deep learning based approach with clear advantages over the existing ones. The proposed method uses the Generative Adversarial Network (GAN) technique to predict the path loss map of a target area from the satellite image or the height map of the area. It produces the path loss map of the entire target area in a single inference, with accuracy close to that produced by ray tracing simulations. The method is tested at a 900 MHz transmission frequency; the trained model and source code are publicly available on a GitHub page.
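    For context on the quantity being predicted: the classical free-space path loss (FSPL) formula gives the loss in the absence of obstacles, and is a standard closed-form baseline at the 900 MHz test frequency. This is textbook material, not code from the paper's repository.

```python
import math

def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20*log10(d) + 20*log10(f) + 20*log10(4*pi/c)."""
    c = 299_792_458.0  # speed of light, m/s
    return (20 * math.log10(distance_m)
            + 20 * math.log10(freq_hz)
            + 20 * math.log10(4 * math.pi / c))
```

    Real urban path loss deviates strongly from FSPL because of buildings and terrain, which is exactly the structure the GAN learns from satellite images or height maps.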

    RSS-based wireless LAN indoor localization and tracking using deep architectures

    Wireless Local Area Network (WLAN) positioning is a challenging task indoors due to environmental constraints and the unpredictable behavior of signal propagation, even at a fixed location. The aim of this work is to develop deep learning-based approaches for indoor localization and tracking using Received Signal Strength (RSS). The study proposes Multi-Layer Perceptron (MLP), One- and Two-Dimensional Convolutional Neural Network (1D CNN and 2D CNN), and Long Short-Term Memory (LSTM) deep network architectures for WLAN indoor positioning, based on actual RSS measurements collected from an existing WLAN infrastructure in a mobile user scenario. Results obtained with these deep architectures are compared against existing WLAN positioning algorithms, with the Root Mean Square Error (RMSE) as the assessment criterion. The proposed LSTM Model 2 achieved a dynamic positioning RMSE of 1.73 m, outperforming probabilistic WLAN algorithms such as Memoryless Positioning (RMSE: 10.35 m) and the Nonparametric Information (NI) filter with variable acceleration (RMSE: 5.2 m) in the same experimental environment. ECSEL Joint Undertaking; European Union's H2020 Framework Programme (H2020/2014-2020) Grant; National Authority TUBITAK
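    The classical fingerprinting idea that the deep models improve upon, together with the RMSE criterion used for assessment, can be sketched as follows. The weighted choice of k and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def knn_locate(rss, fingerprints, positions, k=3):
    """Estimate position as the mean of the k closest RSS fingerprints.
    fingerprints: N x A matrix of stored RSS vectors (A access points);
    positions: N x 2 matrix of the corresponding survey locations."""
    dists = np.linalg.norm(fingerprints - rss, axis=1)
    nearest = np.argsort(dists)[:k]
    return positions[nearest].mean(axis=0)

def rmse(estimates, truths):
    """Root mean square of the Euclidean positioning errors."""
    errs = np.linalg.norm(np.asarray(estimates) - np.asarray(truths), axis=1)
    return float(np.sqrt((errs ** 2).mean()))
```

    Unlike this memoryless baseline, the LSTM exploits the temporal correlation of consecutive RSS measurements along a trajectory, which is where the tracking gain comes from.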

    Proposing a CNN method for primary and permanent tooth detection and enumeration on pediatric dental radiographs

    OBJECTIVE: In this paper, we aimed to evaluate the performance of a deep learning system for automated tooth detection and numbering on pediatric panoramic radiographs. STUDY DESIGN: YOLO V4, a Convolutional Neural Network (CNN) based object detection model, was used for automated tooth detection and numbering. A total of 4545 pediatric panoramic X-ray images, annotated in labelImg, were used to train and test the YOLO algorithm. RESULTS AND CONCLUSIONS: The model was successful in detecting and numbering both primary and permanent teeth on pediatric panoramic radiographs, with a mean average precision (mAP) of 92.22%, a mean average recall (mAR) of 94.44%, and a weighted F1 score of 0.91. The proposed CNN method yielded high and fast performance for automated tooth detection and numbering on pediatric panoramic radiographs. Automatic tooth detection could help dental practitioners save time and could also serve as a pre-processing tool for the detection of dental pathologies.
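    The precision, recall, and F1 figures quoted in this and the following abstract follow the standard detection-metric definitions, which can be computed from true-positive, false-positive, and false-negative counts. These are the generic formulas, not the papers' evaluation code.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

    In object detection, a prediction counts as a true positive only when its box overlaps a ground-truth tooth above an IoU threshold and the class (tooth number) matches.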

    A deep learning approach to permanent tooth germ detection on pediatric panoramic radiographs

    Purpose: The aim of this study was to assess the performance of a deep learning system for permanent tooth germ detection on pediatric panoramic radiographs. Materials and Methods: In total, 4518 anonymized panoramic radiographs of children between 5 and 13 years of age were collected. YOLOv4, a convolutional neural network (CNN)-based object detection model, was used to automatically detect permanent tooth germs. Panoramic images of children, processed in LabelImg, were used to train and test the YOLOv4 algorithm. True-positive, false-positive, and false-negative rates were calculated, and a confusion matrix was used to evaluate the performance of the model. Results: The YOLOv4 model, which detected permanent tooth germs on pediatric panoramic radiographs, achieved an average precision of 94.16% and an F1 score of 0.90, indicating high detection performance. The average YOLOv4 inference time was 90 ms. Conclusion: The detection of permanent tooth germs on pediatric panoramic X-rays using a deep learning-based approach may facilitate the early diagnosis of tooth deficiency or supernumerary teeth, and may help dental practitioners find more accurate treatment options while saving time and effort.