34 research outputs found

    Efficient Motion Estimation and Mode Decision Algorithms for Advanced Video Coding

    The H.264/AVC video compression standard achieved significant improvements in coding efficiency, but the computational complexity of the H.264/AVC encoder is very high. Most of the encoder's complexity comes from variable block size motion estimation (ME) and rate-distortion optimized (RDO) mode decision. This dissertation proposes three methods to reduce the computation of motion estimation. First, the computation of each distortion measure is reduced by a novel two-step edge-based partial distortion search (TS-EPDS) algorithm. In this algorithm, the macroblock is divided into sub-blocks, and the calculation order of the partial distortions is determined by the edge strength of the sub-blocks. Second, an early termination algorithm is developed that features an adaptive threshold based on the statistical characteristics of the rate-distortion (RD) cost of the current block and of previously processed blocks and modes. Third, the dissertation presents a novel adaptive search area selection method that uses the previously computed motion vector differences (MVDs). In H.264/AVC intra coding, DC mode is used to predict regions with no unified direction; because all predicted pixel values are the same, smoothly varying regions are not well de-correlated. This dissertation proposes an improved DC prediction (IDCP) mode based on the distance between the predicted and reference pixels. In addition, signaling the nine prediction modes in intra 4x4 and 8x8 block units requires many overhead bits. To reduce the number of overhead bits, an intra mode bit rate reduction method is suggested. This dissertation also proposes an enhanced algorithm to estimate the most probable mode (MPM) of each block. The MPM is derived from the prediction mode directions of neighboring blocks, which are weighted according to their positions.
This dissertation also suggests a fast enhanced cost function for mode decision in the intra encoder. The enhanced cost function uses the sum of absolute Hadamard-transformed differences (SATD) and the mean absolute deviation of the residual block to estimate the distortion part of the cost function. A threshold-based count of large coefficients is also used for estimating the bit-rate part
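
The SATD term mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a 4x4 block and an unnormalized Hadamard transform; the function names `matmul4` and `satd_4x4` are hypothetical, and the abstract does not say how the dissertation scales SATD or combines it with the mean absolute deviation, so neither is shown here.

```python
# 4x4 Hadamard matrix with +1/-1 entries (assumed, unnormalized form).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd_4x4(cur, pred):
    """Sum of absolute Hadamard-transformed differences for one 4x4 block."""
    # Residual block: current pixels minus predicted pixels.
    diff = [[cur[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    # Two-dimensional transform: H * diff * H^T.
    h_transposed = [list(row) for row in zip(*H4)]
    transformed = matmul4(matmul4(H4, diff), h_transposed)
    # Sum the magnitudes of all transform coefficients.
    return sum(abs(value) for row in transformed for value in row)


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    brighter = [[11] * 4 for _ in range(4)]
    print(satd_4x4(flat, flat))      # identical blocks give zero distortion
    print(satd_4x4(brighter, flat))  # constant offset lands in the DC coefficient
```

Practical encoders often apply a scaling factor to this sum; that detail is left out because the abstract does not specify one.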

    A Deep Learning Approach for Spatiotemporal-Data-Driven Traffic State Estimation

    The past decade witnessed rapid developments in traffic data sensing technologies in the form of roadside detector hardware, vehicle on-board units, and pedestrian wearable devices. The growing magnitude and complexity of the available traffic data has fueled the demand for data-driven models that can handle large-scale inputs. In the recent past, deep-learning-powered algorithms have become the state of the art for various data-driven applications. In this research, three applications of deep learning algorithms for traffic state estimation were investigated. Firstly, network-wide traffic parameter estimation was explored. An attention-based multi-encoder-decoder (Att-MED) neural network architecture was proposed and trained to predict freeway traffic speed up to 60 minutes ahead. Att-MED was designed to encode multiple traffic input sequences: short-term, daily, and weekly cyclic behavior. The proposed network produced an average prediction accuracy of 97.5%, which was superior to the compared baseline models. In addition to improving the output performance, the model's attention weights enhanced the model interpretability. This research additionally explored the utility of low-penetration connected probe-vehicle data for network-wide traffic parameter estimation and prediction on freeways. A novel sequence-to-sequence recurrent graph network (Seq2Seq GCN-LSTM) was designed. It was then trained to estimate and predict traffic volume and speed for a 60-minute future time horizon. The proposed methodology generated volume and speed predictions with an average accuracy of 90.5% and 96.6%, respectively, outperforming the investigated baseline models. The proposed method demonstrated robustness against perturbations caused by the probe vehicle fleet's low penetration rate. Secondly, the application of deep learning for road weather detection using roadside CCTVs was investigated. 
A Vision Transformer (ViT) was trained for simultaneous rain and road surface condition classification. Next, a Spatial Self-Attention (SSA) network was designed to consume the individual detection results, interpret the spatial context, and modify the collective detection output accordingly. The sequential module improved the accuracy of the stand-alone Vision Transformer as measured by the F1-score, raising the total accuracy for both tasks to 96.71% and 98.07%, respectively. Thirdly, a real-time video-based traffic incident detection algorithm was developed to enhance the utilization of the existing roadside CCTV network. The methodology automatically identified the main road regions in video scenes and investigated static vehicles around those areas. The developed algorithm was evaluated using a dataset of roadside videos. The incidents were detected with 85.71% sensitivity and 11.10% false alarm rate with an average delay of 27.53 seconds. In general, the research proposed in this dissertation maximizes the utility of pre-existing traffic infrastructure and emerging probe traffic data. It additionally demonstrated deep learning algorithms' capability of modeling complex spatiotemporal traffic data. This research illustrates that advances in the deep learning field continue to have a high applicability potential in the traffic state estimation domain
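
The abstract above says Att-MED encodes short-term, daily, and weekly input sequences. A rough sketch of how such inputs might be assembled from a single speed series follows. Everything here is an assumption for illustration: the function name `build_cyclic_inputs`, the parameter names, the choice to sample daily and weekly values at the same time of day as the target step, and the sampling interval are not taken from the dissertation.

```python
def build_cyclic_inputs(series, t, window, steps_per_day, n_days, n_weeks):
    """Return (short_term, daily, weekly) input lists for target index t.

    series        : list of speed readings at a fixed sampling interval
    t             : index of the time step to be predicted
    window        : number of most recent readings in the short-term input
    steps_per_day : readings per day (e.g. 288 for 5-minute data, assumed)
    n_days        : how many previous days to sample at the same time of day
    n_weeks       : how many previous weeks to sample at the same time of day
    """
    steps_per_week = 7 * steps_per_day
    # The oldest sample we need to reach back to.
    earliest = t - max(window, n_days * steps_per_day, n_weeks * steps_per_week)
    if earliest < 0 or t > len(series):
        raise ValueError("not enough history for the requested inputs")

    # Most recent readings immediately before the target step.
    short_term = series[t - window:t]
    # Same time of day on each of the previous n_days, oldest first.
    daily = [series[t - d * steps_per_day] for d in range(n_days, 0, -1)]
    # Same time of day and weekday in each of the previous n_weeks, oldest first.
    weekly = [series[t - w * steps_per_week] for w in range(n_weeks, 0, -1)]
    return short_term, daily, weekly


if __name__ == "__main__":
    # Toy series whose values equal their indices, with 4 readings per day.
    toy = list(range(100))
    print(build_cyclic_inputs(toy, t=60, window=3, steps_per_day=4,
                              n_days=2, n_weeks=2))
```

Each of the three lists would then feed its own encoder in a multi-encoder design; the network itself is outside the scope of this sketch.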

    Development of a Queue Warning System Utilizing ATM Infrastructure System Development and Field-Testing

    MnDOT has already deployed an extensive infrastructure for Active Traffic Management (ATM) on I-35W and I-94, with plans to expand to other segments of the Twin Cities freeway network. The ATM system includes intelligent lane control signals (ILCS) spaced every half mile over every lane to warn motorists of incidents or hazards on the roadway ahead. This project developed two separate systems that can identify lane-specific shockwave or queuing conditions on the freeway and use existing ILCS to warn motorists upstream for rear-end collision prevention. The two systems were field tested at two locations in the ATM-equipped network that have a high frequency of rear-end collisions. These locations experience significantly different traffic-flow conditions, allowing for the development and testing of two different approaches to the same problem. The I-94 westbound segment in downtown Minneapolis is known for its high crash rate due to rapidly evolving shockwaves, while the I-35W southbound segment north of the TH-62 interchange experiences longstanding queues extending into the freeway mainline. The Minnesota Traffic Observatory developed the I-94 Queue Warning system while the University of Michigan, under contract, developed the I-35W system. Prior to the I-94 installation, based on data collected in 2013, there were 11.9 crashes per million vehicle miles of travel (MVMT) and 111.8 near crashes per MVMT. In the first three months of the system’s deployment, event frequency fell to 9.34 crashes per MVMT and 51.8 near crashes per MVMT, a 22% decrease in crashes and a 54% decrease in near crashes. The I-35W system did not undergo a similarly thorough evaluation, but for most of the lane segments involved, it showed that queue warning messages help reduce the speed variance near the queue locations and the speed difference between upstream and downstream locations. This also indicated a satisfactory level of compliance from travelers
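
The reported 22% and 54% reductions can be checked directly from the rates quoted in the abstract above. The short sketch below does this, assuming the 2013 baseline and the post-deployment rates are expressed on the same per-MVMT basis; the function name `percent_decrease` is just illustrative.

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Rates quoted in the abstract (events per million vehicle miles of travel).
crash_drop = percent_decrease(11.9, 9.34)
near_crash_drop = percent_decrease(111.8, 51.8)

if __name__ == "__main__":
    print(f"crashes:      {crash_drop:.1f}% decrease")       # about 21.5%
    print(f"near crashes: {near_crash_drop:.1f}% decrease")  # about 53.7%
```

Rounded to whole percentages, these agree with the 22% and 54% figures in the abstract.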

    Investigating low-bitrate, low-complexity H.264 region of interest techniques in error-prone environments

    The H.264/AVC video coding standard leverages advanced compression methods to provide a significant increase in performance over previous CODECs in terms of picture quality, bitrate, and flexibility. The specification itself provides several profiles and levels that allow customization through the use of various advanced features. In addition to these features, several new video coding techniques have been developed since the standard's inception. One such technique known as Region of Interest (RoI) coding has been in existence since before H.264's formalization, and several means of implementing RoI coding in H.264 have been proposed. Region of Interest coding operates under the assumption that one or more regions of a sequence have higher priority than the rest of the video. One goal of RoI coding is to provide a decrease in bitrate without significant loss of perceptual quality, and this is particularly applicable to low complexity environments, if the proper implementation is used. Furthermore, RoI coding may allow for enhanced error resilience in the selected regions if desired, making RoI suitable for both low-bitrate and error-prone scenarios. The goal of this thesis project was to examine H.264 Region of Interest coding as it applies to such scenarios. A modified version of the H.264 JM Reference Software was created in which all non-Baseline profile features were removed. Six low-complexity RoI coding techniques, three targeting rate control and three targeting error resilience, were selected for implementation. Error and distortion modeling tools were created to enhance the quality of experimental data. Results were gathered by varying a range of coding parameters including frame size, target bitrate, and macroblock error rates. Methods were then examined based on their rate-distortion curves, ability to achieve target bitrates accurately, and per-region distortions where applicable
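
One common way to give a region of interest higher priority, shown here only as a generic sketch and not as any of the six techniques the thesis implemented, is to assign a lower quantization parameter (QP) to macroblocks that overlap the region and a higher QP elsewhere. The function name `roi_qp_map`, its parameters, and the fixed offsets are hypothetical; the 16x16 macroblock size and the 0 to 51 QP range are those of H.264.

```python
MB_SIZE = 16          # H.264 macroblock size in luma samples
QP_MIN, QP_MAX = 0, 51  # valid H.264 QP range for 8-bit video


def roi_qp_map(width, height, roi, base_qp, roi_delta, bg_delta):
    """Build a per-macroblock QP map for one frame.

    roi       : (x, y, w, h) rectangle in pixels marking the region of interest
    base_qp   : frame-level QP before any adjustment
    roi_delta : amount subtracted from base_qp inside the RoI (finer quantization)
    bg_delta  : amount added to base_qp outside the RoI (coarser quantization)
    """
    rx, ry, rw, rh = roi
    mb_cols = (width + MB_SIZE - 1) // MB_SIZE
    mb_rows = (height + MB_SIZE - 1) // MB_SIZE
    qp_map = []
    for row in range(mb_rows):
        qp_row = []
        for col in range(mb_cols):
            x0, y0 = col * MB_SIZE, row * MB_SIZE
            # A macroblock belongs to the RoI if the two rectangles overlap.
            overlaps = (x0 < rx + rw and x0 + MB_SIZE > rx and
                        y0 < ry + rh and y0 + MB_SIZE > ry)
            qp = base_qp - roi_delta if overlaps else base_qp + bg_delta
            # Clamp to the range the standard allows.
            qp_row.append(max(QP_MIN, min(QP_MAX, qp)))
        qp_map.append(qp_row)
    return qp_map


if __name__ == "__main__":
    for qp_row in roi_qp_map(64, 32, roi=(16, 0, 16, 16),
                             base_qp=30, roi_delta=4, bg_delta=4):
        print(qp_row)
```

Lowering QP inside the region spends more bits there, so a real rate-control scheme would also need to balance the two offsets against the target bitrate.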

    System-on-Chip design of a high performance low power full hardware cabac encoder in H.264/AVC

    Ph.D., Doctor of Philosophy

    2019 EC3 July 10-12, 2019 Chania, Crete, Greece
