27 research outputs found

    End-to-end security for video distribution


    Analysing and quantifying visual experience in medical imaging

    Healthcare professionals increasingly view medical images and videos in a variety of environments. The perception and interpretation of medical visual information across all specialties, career stages, and practice settings are critical to patient care and safety. However, medical images and videos are not self-explanatory and thus need to be interpreted by humans, who are prone to errors caused by the inherent limitations of the human visual system. It is essential to understand how medical experts perceive visual content, and to use this knowledge to develop new solutions that improve clinical practice. Progress has been made in the literature towards such understanding; however, studies remain limited. This thesis investigates two aspects of human visual experience in medical imaging: visual quality assessment and visual attention. Visual quality assessment is important because diverse visual signal distortions may arise in medical imaging and affect the perceptual quality of visual content, and therefore potentially impact diagnostic accuracy. We adapted existing qualitative and quantitative methods to evaluate the quality of distorted medical videos. We also analysed the impact of medical specialty on visual perception and found significant differences between specialty groups; for example, sonographers were in general more bothered by visual distortions than radiologists. Visual attention has been studied in medical imaging using eye-tracking technology. In this thesis, we first investigated gaze allocation by radiologists analysing two-view mammograms and then assessed the impact of expertise and experience on gaze behaviour. We also evaluated to what extent state-of-the-art visual attention models can predict radiologists’ gaze behaviour and showed the limitations of existing models. This thesis provides new experimental designs and statistical processes to evaluate the perception of medical images and videos, which can be used to optimise the visual experience of image readers in clinical practice.
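    A minimal sketch of the kind of between-group comparison described above, assuming hypothetical per-observer annoyance ratings for distorted videos; the nonparametric test via scipy is an illustrative choice, not necessarily the statistical process adopted in the thesis:

```python
# Hedged illustration: comparing distortion-annoyance ratings between two
# specialty groups with a nonparametric test. The ratings below are invented
# placeholders, not data from the thesis.
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical mean opinion scores (1 = not bothered, 5 = very bothered)
sonographers = np.array([4.2, 3.8, 4.5, 4.0, 3.9, 4.4])
radiologists = np.array([3.1, 2.8, 3.5, 3.0, 3.3, 2.9])

# Two-sided Mann-Whitney U test: do the two groups' ratings come from
# distributions with different locations?
stat, p_value = mannwhitneyu(sonographers, radiologists, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Significant difference between specialty groups")
```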

    The doctoral research abstracts. Vol:10 2016 / Institute of Graduate Studies, UiTM

    Foreword: Congratulations to the Institute of Graduate Studies on its continuous efforts to publish the 10th issue of the Doctoral Research Abstracts, which showcases research carried out in disciplines ranging from science and technology to business and administration, social science and humanities. This issue captures the novelty of research contributed by seventy (70) PhD graduands receiving their scrolls at UiTM’s 85th Convocation. As of October 2016, UiTM has produced 138 PhD graduates this year, up from 125 in the previous year (2015). This shows that UiTM is on a positive trajectory towards its target of 1,200 PhD graduates by 2020. To the 70 doctorates, I would like it to be known that you have most certainly done UiTM proud by journeying through the scholarly world with its endless challenges and obstacles, and by persevering right till the very end. This convocation should not be regarded as the end of your highest scholarly achievement and contribution to the body of knowledge, but rather as the beginning of embarking on more innovative research, built on the knowledge gained during this academic journey, for the community and country. This year marks UiTM’s 60th anniversary, and we have been producing many good-quality graduates who have had a major impact on the socio-economic development of the country and the bumiputeras. As alumni of UiTM, we hold you dear to our hearts. We sincerely wish you all the best and may the Almighty guide you to a path of excellence and success. As you leave the university as alumni, we hope a new relationship will be fostered between you and the faculty in soaring UiTM to greater heights. “UiTM Sentiasa di Hati Ku” / Prof Emeritus Dato’ Dr Hassan Said, Vice Chancellor, Universiti Teknologi MARA

    A computational model of visual attention.

    Visual attention is a process by which the Human Visual System (HVS) selects the most important information from a scene. Visual attention models are computational or mathematical models developed to predict this information. The performance of state-of-the-art visual attention models is limited in terms of prediction accuracy and computational complexity. In spite of a significant amount of active research in this area, modelling visual attention is still an open research challenge. This thesis proposes a novel computational model of visual attention that achieves higher prediction accuracy with low computational complexity. A new bottom-up visual attention model based on in-focus regions is proposed. To develop the model, an image dataset is created by capturing images with in-focus and out-of-focus regions. The Discrete Cosine Transform (DCT) spectrum of these images is investigated qualitatively and quantitatively to discover the key frequency coefficients that correspond to the in-focus regions. The model detects these key coefficients by formulating a novel relation between the in-focus and out-of-focus regions in the frequency domain. These frequency coefficients are used to detect the salient in-focus regions. The simulation results show that this attention model achieves good prediction accuracy with low complexity. The prediction accuracy of the proposed in-focus visual attention model is further improved by incorporating the sensitivity of the HVS towards the image centre and human faces. Moreover, the computational complexity is further reduced by using the Integer Cosine Transform (ICT). The model is parameter-tuned using a hill-climbing approach to optimise accuracy. The performance has been analysed qualitatively and quantitatively using two large image datasets with eye-tracking fixation ground truth. The results show that the model achieves higher prediction accuracy with lower computational complexity compared to state-of-the-art visual attention models. The proposed model is useful for predicting human fixations in computationally constrained environments, mainly in applications such as perceptual video coding, image quality assessment, object recognition and image segmentation.
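    A minimal sketch of the general idea of frequency-domain focus detection, in which high-frequency DCT energy serves as a proxy for in-focus, and hence potentially salient, regions. The block size, the low-frequency band split and the use of scipy's DCT are assumptions for illustration, not the thesis's specific coefficient-selection rule:

```python
# Hedged illustration: block-wise DCT high-frequency energy as a crude
# in-focus measure. Not the thesis's coefficient selection rule.
import numpy as np
from scipy.fft import dctn

def focus_map(gray: np.ndarray, block: int = 16, low_band: int = 4) -> np.ndarray:
    """Return a per-block map where high values suggest in-focus regions."""
    h, w = gray.shape
    rows, cols = h // block, w // block
    fmap = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            patch = gray[r*block:(r+1)*block, c*block:(c+1)*block].astype(float)
            coeffs = dctn(patch, norm="ortho")
            total = np.sum(coeffs**2) + 1e-9
            low = np.sum(coeffs[:low_band, :low_band]**2)   # low-frequency energy
            fmap[r, c] = (total - low) / total              # high-frequency ratio
    return fmap

# Usage: threshold the map to obtain a binary in-focus / saliency mask
# mask = focus_map(image) > 0.2
```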

    End to end Multi-Objective Optimisation of H.264 and HEVC Codecs

    All multimedia devices now incorporate video CODECs that comply with international video coding standards such as H.264/MPEG-4 AVC and the newer High Efficiency Video Coding standard (HEVC), otherwise known as H.265. Although the standard CODECs have been designed to include algorithms with optimal efficiency, a large number of coding parameters can be used to fine-tune their operation within known constraints such as available computational power, bandwidth and consumer QoS requirements. With so many parameters involved, determining which of them play a significant role in providing optimal quality of service within given constraints is a challenge in itself. How to select the values of those significant parameters so that the CODEC performs optimally under the given constraints is a further important question to be answered. This thesis proposes a framework that uses machine learning algorithms to model the performance of a video CODEC based on the significant coding parameters. Means of modelling both Encoder and Decoder performance are proposed. We define objective functions that can be used to model the performance-related properties of a CODEC, i.e., video quality, bit-rate and CPU time. We show that these objective functions can be practically utilised in video Encoder/Decoder designs, in particular in their performance optimisation within given operational and practical constraints. A multi-objective optimisation framework based on Genetic Algorithms is thus proposed to optimise the performance of a video codec. The framework is designed to jointly minimise the CPU time and bit-rate and to maximise the quality of the compressed video stream. The thesis presents the use of this framework in the performance modelling and multi-objective optimisation of the most widely used video coding standard in practice at present, H.264, and the latest video coding standard, H.265/HEVC. When a communication network is used to transmit video, performance-related parameters of the communication channel will impact the end-to-end performance of the video CODEC. Network delays and packet loss will affect the quality of the video received at the decoder via the communication channel, i.e., even if a video CODEC is optimally configured, network conditions will make the experience sub-optimal. Given the above, the thesis proposes the design, integration and testing of a novel approach to simulating a wired network and the use of the UDP protocol for the transmission of video data. This network is subsequently used to simulate the impact of packet loss and network delays on optimally coded video, based on the framework previously proposed for the modelling and optimisation of video CODECs. The quality of received video under different levels of packet loss and network delay is simulated, drawing conclusions about the impact on transmitted video based on its content and features.
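    A minimal sketch of the kind of Pareto-based search implied here, assuming hypothetical parameter ranges and placeholder surrogate objectives; in the thesis the objectives (quality, bit-rate, CPU time) are learned models of a real encoder and the search is performed with a Genetic Algorithm framework:

```python
# Hedged illustration of multi-objective search over codec parameters.
# Parameter ranges and surrogate objective functions are invented placeholders.
import random

PARAM_SPACE = {"qp": range(20, 45), "ref_frames": range(1, 5), "search_range": range(8, 65)}

def random_config():
    return {k: random.choice(list(v)) for k, v in PARAM_SPACE.items()}

def evaluate(cfg):
    """Return (bit-rate, cpu_time, -quality): all three to be minimised."""
    bitrate = 1000.0 / cfg["qp"]                          # placeholder surrogate
    cpu = cfg["ref_frames"] * cfg["search_range"] * 0.1   # placeholder surrogate
    quality = 60.0 - cfg["qp"] * 0.8                      # placeholder surrogate
    return (bitrate, cpu, -quality)

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(population):
    scored = [(cfg, evaluate(cfg)) for cfg in population]
    return [c for c, s in scored
            if not any(dominates(s2, s) for _, s2 in scored if s2 != s)]

def mutate(cfg):
    child = dict(cfg)
    key = random.choice(list(PARAM_SPACE))
    child[key] = random.choice(list(PARAM_SPACE[key]))
    return child

# Simple evolutionary loop: keep the non-dominated set, refill by mutation.
population = [random_config() for _ in range(40)]
for _ in range(30):
    front = pareto_front(population)
    population = front + [mutate(random.choice(front)) for _ in range(40 - len(front))]

print("Non-dominated codec configurations:", pareto_front(population)[:5])
```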

    Attention Driven Solutions for Robust Digital Watermarking Within Media

    As digital technologies have dramatically expanded within the last decade, content recognition now plays a major role in the control of media. Of the systems currently available, digital watermarking provides a robust, maintainable solution to enhance media security. The two main properties of digital watermarking, imperceptibility and robustness, are complementary to each other, but by employing visual attention based mechanisms within the watermarking framework, highly robust watermarking solutions are obtainable while also maintaining high media quality. This thesis firstly provides suitable bottom-up saliency models for raw images and video. The image and video saliency algorithms are estimated directly within the wavelet domain for enhanced compatibility with the watermarking framework. By combining colour, orientation and intensity contrasts for the image model and globally compensated object motion in the video model, novel wavelet-based visual saliency algorithms are provided. The work extends these saliency models into a unique visual attention-based watermarking scheme by increasing the watermark weighting parameter within visually uninteresting regions. An increase in watermark robustness of up to 40% against various filtering attacks, JPEG2000 and H.264/AVC compression is obtained while maintaining media quality, as verified by various objective and subjective evaluation tools. As most video sequences are stored in an encoded format, this thesis also studies watermarking schemes within the compressed domain. Firstly, the work provides a compressed-domain saliency model formulated directly within the HEVC codec, utilising coding decisions such as block partition size, residual magnitude, intra-frame angular prediction mode and motion vector difference magnitude. Large computational savings, of 50% or greater, are obtained compared with existing methodologies, as the saliency maps are generated from partially decoded bitstreams. Finally, the saliency maps formulated within the compressed HEVC domain are studied within the watermarking framework. A joint encoder scheme and a frame-domain watermarking scheme are both proposed, embedding data into the quantised transform residual data or the wavelet coefficients, respectively, in regions that exhibit low visual salience.
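    A minimal sketch of attention-modulated embedding in the wavelet domain, assuming a precomputed saliency map and a simple additive pseudo-random watermark in the level-1 detail coefficients; PyWavelets is used for the transform, and the embedding rule and strength values are illustrative rather than the thesis's scheme:

```python
# Hedged illustration: scale watermark strength inversely with saliency so
# that visually uninteresting regions carry a stronger watermark.
import numpy as np
import pywt

def embed_watermark(image: np.ndarray, saliency: np.ndarray,
                    key: int = 42, base_strength: float = 4.0) -> np.ndarray:
    """image, saliency: 2-D arrays of equal shape, saliency in [0, 1]."""
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "haar")

    # Pseudo-random +/-1 watermark, reproducible from the key
    rng = np.random.default_rng(key)
    wm = rng.choice([-1.0, 1.0], size=cH.shape)

    # Down-sample saliency to the sub-band resolution and invert it:
    # low saliency -> large weight -> stronger embedding.
    sal = saliency[::2, ::2][:cH.shape[0], :cH.shape[1]]
    weight = base_strength * (1.0 - sal)

    cH_marked = cH + weight * wm
    return pywt.idwt2((cA, (cH_marked, cV, cD)), "haar")

# Usage (any grayscale image and a saliency map of the same size):
# marked = embed_watermark(img, sal_map)
```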

    Audio/Video Transmission over IEEE 802.11e Networks: Retry Limit Adaptation and Distortion Estimation

    This thesis focuses on audio and video transmission over wireless networks adopting the IEEE 802.11x family of standards. In particular, it addresses four issues: adaptive retransmission, the comparison of video quality indexes for retry-limit adaptation purposes, the estimation of distortion, and the joint adaptation of the maximum number of retransmissions for voice and video flows.
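    A minimal sketch of the general idea of distortion-driven retry-limit adaptation: packets whose loss is estimated to cause more decoder-side distortion are granted more MAC-layer retransmissions, up to a cap. The mapping, thresholds and limits below are an invented heuristic, not the adaptation algorithm studied in the thesis:

```python
# Hedged illustration: map an estimated per-packet distortion contribution to
# an 802.11e retry limit. Thresholds and limits are illustrative only.
def retry_limit(distortion_if_lost: float,
                min_retries: int = 2, max_retries: int = 7) -> int:
    """distortion_if_lost: estimated distortion increase at the decoder if this
    packet is lost (e.g. larger for I-frame slices than for B-frame slices)."""
    worst_case = 100.0  # assumed worst-case distortion used for normalisation
    ratio = min(max(distortion_if_lost / worst_case, 0.0), 1.0)
    return round(min_retries + ratio * (max_retries - min_retries))

# Example: an I-frame packet gets more retries than a B-frame packet
print(retry_limit(80.0))   # close to the maximum
print(retry_limit(5.0))    # close to the minimum
```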