Medical imaging analysis with artificial neural networks
Given that neural networks have been widely reported in the medical imaging research community, we provide a focused literature survey of recent neural network developments in computer-aided diagnosis, in medical image segmentation and edge detection for visual content analysis, and in medical image registration for pre-processing and post-processing. The aims are to increase awareness of how neural networks can be applied to these areas and to provide a foundation for further research and practical development. Representative techniques and algorithms are explained in detail through examples illustrating: (i) how a known neural network with a fixed structure and training procedure can be applied to a medical imaging problem; (ii) how medical images can be analysed, processed, and characterised by neural networks; and (iii) how neural networks can be extended to solve problems relevant to medical imaging. The concluding section highlights comparisons among the many neural network applications, providing a global view of computational intelligence with neural networks in medical imaging.
Image Compression Techniques: A Survey in Lossless and Lossy algorithms
The bandwidth of communication networks has increased continuously as a result of technological advances. However, the introduction of new services and the expansion of existing ones have resulted in even higher demand for bandwidth, which explains the many efforts currently being invested in data compression. The primary goal of this work is to develop techniques for coding information sources such as speech, image, and video so as to reduce the number of bits required to represent a source without significantly degrading its quality. With the large increase in the generation of digital image data, there has been a correspondingly large increase in research activity in image compression. The goal is to represent an image in the fewest bits without losing the essential information content. Images carry three main types of information: redundant, irrelevant, and useful. Redundant information is the deterministic part, which can be reproduced without loss from other information contained in the image. Irrelevant information consists of details beyond the limits of perceptual significance (i.e., psychovisual redundancy). Useful information is the part that is neither redundant nor irrelevant. Decompressed images are usually observed by humans, so their fidelity is subject to the capabilities and limitations of the human visual system. This paper surveys various image compression techniques, their limitations, and their compression rates, and highlights current research in medical image compression.
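The lossless removal of redundant (deterministic) information described above can be illustrated with a simple run-length encoder, which collapses repeated pixel values and reconstructs them exactly. This is a generic textbook sketch, not tied to any specific technique in the surveyed literature:

```python
def rle_encode(pixels):
    """Run-length encode a sequence: collapse runs of identical values."""
    if not pixels:
        return []
    runs = []
    current, count = pixels[0], 1
    for p in pixels[1:]:
        if p == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = p, 1
    runs.append((current, count))
    return runs

def rle_decode(runs):
    """Invert the encoding: lossless reconstruction of the original data."""
    return [value for value, count in runs for _ in range(count)]

# A flat image row compresses well because its information is redundant.
row = [255, 255, 255, 255, 0, 0, 128]
encoded = rle_encode(row)   # [(255, 4), (0, 2), (128, 1)]
assert rle_decode(encoded) == row
```

Lossy schemes go further by also discarding the irrelevant (perceptually insignificant) part, which run-length coding alone cannot do.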
Neural-network-aided automatic modulation classification
Automatic modulation classification (AMC) is a pattern-matching problem that significantly impacts diverse telecommunication systems, with important applications in both military and civilian contexts. Although its appearance in the literature is far from novel, recent developments in machine learning have triggered renewed interest in this area of research.
In the first part of this thesis, an AMC system is studied where, in addition to the typical point-to-point setup of one receiver and one transmitter, a second transmitter is also present, which is considered an interfering device. A convolutional neural network (CNN) is used for classification. In addition to studying the effect of interference strength, we propose a modification attempting to leverage some of the debilitating results of interference, and also study the effect of signal quantisation upon classification performance.
Subsequently, we assess a cooperative AMC setting, in which the receiver features multiple antennas and receives different versions of the same signal from a single-antenna transmitter. By combining data from the different antennas, this cooperative approach is shown to yield notable performance improvements over the established baseline.
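One simple way to combine evidence across antennas is soft voting: average the per-antenna class-probability vectors and take the arg-max. This is a hypothetical sketch for illustration only; the thesis does not specify this exact fusion rule:

```python
def fuse_soft_votes(antenna_probs):
    """Average per-antenna class-probability vectors and pick the arg-max.

    antenna_probs: one probability vector per antenna, all over the same
    ordered set of candidate modulation classes.
    """
    n_classes = len(antenna_probs[0])
    avg = [sum(p[i] for p in antenna_probs) / len(antenna_probs)
           for i in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Three antennas, three candidate modulations (e.g. BPSK, QPSK, 16-QAM):
probs = [
    [0.6, 0.3, 0.1],   # antenna 1 favours class 0
    [0.2, 0.5, 0.3],   # antenna 2 favours class 1
    [0.5, 0.4, 0.1],   # antenna 3 favours class 0
]
assert fuse_soft_votes(probs) == 0  # averaging resolves in favour of class 0
```

Even this crude rule shows why multiple antennas help: independent fading across antennas means an error at one antenna is often outvoted by the others.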
Finally, the cooperative scenario is expanded to a more complicated setting in which a realistic geographic distribution of four receiving nodes is modelled, and the decision-making mechanism regarding the identity of a signal resides in a fusion centre independent of the receivers, connected to them over finite-bandwidth backhaul links. In addition to the usual concerns of classification accuracy and inference time, data reduction methods of various types (including “trained” lossy compression) are implemented with the objective of minimising the data load placed upon the backhaul links.
Video coding for compression and content-based functionality
The lifetime of this research project has seen two dramatic developments in the area of digital video coding. The first has been the progress of compression research leading to a factor of two improvement over existing standards, much wider deployment possibilities and the development of the new international ITU-T Recommendation H.263. The second has been a radical change in the approach to video content production with the introduction of the content-based coding concept and the addition of scene composition information to the encoded bit-stream. Content-based coding is central to the latest international standards efforts from the ISO/IEC MPEG working group.
This thesis reports on extensions to existing compression techniques that exploit a priori knowledge about scene content. Existing standardised block-based compression techniques were extended with work on arithmetic entropy coding and intra-block prediction, which form part of the H.263 and MPEG-4 specifications respectively. Object-based coding techniques were developed within a collaborative simulation model, known as SIMOC, then extended with ideas on grid motion vector modelling and vector accuracy confidence estimation. An improved confidence measure for encouraging motion smoothness is proposed.
Object-based coding ideas, together with those from other model-based and layer-based coding approaches, influenced the development of content-based coding within MPEG-4. This standard made considerable progress in the newly adopted field of content-based video coding, defining normative techniques for arbitrary shape and texture coding. The means to generate this information for the content to be coded, the analysis problem, was intentionally left unspecified. Further research in this area therefore concentrated on video segmentation and analysis techniques to exploit the benefits of content-based coding for generic frame-based video. The work reported here introduces the use of a clustering algorithm on raw data features to provide an initial segmentation of video data and subsequent tracking of those image regions through video sequences. Collaborative video analysis frameworks from COST 211quat and MPEG-4, combining results from many other segmentation schemes, are also introduced.
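A clustering pass over raw per-pixel features of this kind can be illustrated with a minimal k-means loop on scalar intensities. This is a simplified generic sketch; the thesis's actual feature set and clustering algorithm are not reproduced here:

```python
def kmeans_1d(values, centres, iters=10):
    """Minimal k-means on scalar features (e.g. pixel intensities)."""
    for _ in range(iters):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in centres]
        for v in values:
            idx = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[idx].append(v)
        # Update step: each centre moves to its cluster mean.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    labels = [min(range(len(centres)), key=lambda i: abs(v - centres[i]))
              for v in values]
    return centres, labels

# Two intensity populations: a dark region and a bright region.
pixels = [10, 12, 11, 200, 205, 198]
centres, labels = kmeans_1d(pixels, centres=[0, 255])
assert labels == [0, 0, 0, 1, 1, 1]
```

The resulting labels give an initial partition of the frame into regions, which a tracker can then follow through subsequent frames.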
A robust approach for the segmentation and classification of medical images
Image segmentation is a vital process in various fields, including robotics, object recognition, and medical imaging. In medical imaging, accurate segmentation of brain tissues from MRI images is crucial for diagnosing and treating brain disorders such as Alzheimer’s
disease, epilepsy, schizophrenia, multiple sclerosis, and cancer.
This thesis proposes an automatic fuzzy method for brain MRI segmentation.
Firstly, the proposed method aims to improve the efficiency of the Fuzzy C-Means (FCM) algorithm by reducing the need for manual intervention in cluster initialization and in determining the number of clusters. For this purpose, we introduce an adaptive split-merge technique that divides the image into several homogeneous regions using a multi-threshold method based on entropy information. During the merge process, a new distance metric is introduced to combine regions that are both highly similar within the merged region and well separated from the others. The cluster centers and number obtained from the adaptive split-merge step serve as the initial parameters for the FCM algorithm. The resulting fuzzy partitions are evaluated using a newly proposed validity index.
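The core FCM iteration that the split-merge step initialises can be sketched as follows. This is a generic textbook implementation on scalar data with fuzzifier m = 2; the thesis's adaptive initialisation, spatial extensions, and validity index are not reproduced:

```python
def fcm(values, centres, m=2.0, iters=20, eps=1e-9):
    """Minimal Fuzzy C-Means on scalar data (e.g. MRI voxel intensities)."""
    c = len(centres)
    n = len(values)
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)).
        u = []
        for v in values:
            d = [abs(v - ck) + eps for ck in centres]
            u.append([1.0 / sum((d[i] / d[k]) ** (2 / (m - 1))
                                for k in range(c))
                      for i in range(c)])
        # Centre update: mean of the data weighted by u_ij^m.
        centres = [sum((u[j][i] ** m) * values[j] for j in range(n))
                   / sum(u[j][i] ** m for j in range(n))
                   for i in range(c)]
    return centres, u

# Two well-separated intensity modes; centres converge near their means.
intensities = [20, 22, 21, 80, 82, 79]
centres, memberships = fcm(intensities, centres=[0, 100])
```

Unlike hard k-means, every voxel keeps a graded membership in every cluster, which is what the validity index and the later spatial extensions operate on.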
Secondly, we present a novel method to address the challenge of noisy pixels in the FCM algorithm by incorporating spatial information. Specifically, we assign a crucial role to the central pixel in the clustering process, provided it is not corrupted by noise; if it is, its influence is reduced. Furthermore, we propose a novel quantitative metric for replacing the central pixel with one of its neighbors when doing so improves the segmentation result in terms of compactness and separation. To evaluate the effectiveness of the proposed method, a thorough comparison with existing clustering techniques is conducted, considering cluster validity functions, segmentation accuracy, and tissue segmentation accuracy. The evaluation comprises comprehensive qualitative and quantitative assessments, providing strong evidence of the superior performance of the proposed approach.
Image coding employing vector quantisation
The work described in this thesis is concerned with the coding of digitised images employing vector quantisation (VQ). A new VQ-based coding system, named Directional Classified Gain-Shape Vector Quantisation (DCGSVQ), has been developed. It combines vector quantisation with transform coding techniques and exploits various properties of the human visual system (HVS), such as frequency sensitivity, the masking effect, and orientation sensitivity, to produce reconstructed images with good subjective quality at low bit rates (0.48 bits per pixel).
A content classifier, operating in the spatial domain, is employed to classify each image block of 8x8 pixels into one of several classes which represent various image patterns (edges in various directions, monotone areas, complex texture, etc.). Then a classified gain-shape vector quantiser is employed in the cosine domain to encode vectors of AC transform coefficients, while using either a scalar quantiser or a gain-shape vector quantiser to encode the DC coefficients. A new vector configuration strategy for defining AC vectors in the cosine domain has been proposed to better adapt the system to the local statistics of the image blocks. Accordingly, the AC coefficients are first weighted by an equivalent modulation transfer function (MTF) that represents the filtering characteristics of the HVS, and then they are grouped into directional vectors according to their direction in the cosine domain. An optional simple method for feature enhancement, based on inherent properties of the proposed strategy, has also been proposed enabling further image processing at the receiver.
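Gain-shape quantisation, the building block of the scheme above, factors each vector into a scalar gain (its Euclidean norm) and a unit-norm shape that is matched against a shape codebook. The following is a minimal sketch with a toy 2-D codebook; the DCGSVQ content classifier, directional vector grouping, and MTF weighting are omitted:

```python
import math

def gain_shape_quantise(vec, shape_codebook):
    """Split vec into gain (norm) and shape (unit vector), then pick the
    codebook shape with maximum inner product against the shape."""
    gain = math.sqrt(sum(x * x for x in vec))
    if gain == 0:
        return 0.0, 0
    shape = [x / gain for x in vec]
    best = max(range(len(shape_codebook)),
               key=lambda i: sum(s * c
                                 for s, c in zip(shape, shape_codebook[i])))
    return gain, best

def reconstruct(gain, index, shape_codebook):
    """Decoder side: rescale the chosen codebook shape by the gain."""
    return [gain * c for c in shape_codebook[index]]

# Toy shape codebook: horizontal, vertical, and diagonal unit vectors.
codebook = [[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]]
gain, idx = gain_shape_quantise([3.0, 3.0], codebook)
assert idx == 2  # the diagonal shape matches best
```

Separating gain from shape lets the gain be coded with a cheap scalar quantiser while the codebook only has to cover directions, which keeps it small.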
A new algorithm for designing the various DCGSVQ codebooks has been developed in two steps. First, a new general-purpose algorithm for classified VQ (CVQ) codebook design has been developed as an alternative to the empirical methods proposed in the literature. The new algorithm provides a simple and systematic method for codebook design and considerably reduces the total number of mathematical operations required. We have named this new algorithm Classified Nearest Neighbour Clustering (CNNC). A fast search algorithm has also been developed to further reduce computational effort during codebook design.
Secondly, a new optimisation criterion, more suitable for shape codebook design, has been developed and employed within the CNNC algorithm to design classified shape codebooks for the DCGSVQ. We have named this algorithm modified CNNC. The new algorithm designs the various shape codebooks simultaneously, giving the designer full freedom to assign more importance to certain classes of vectors or to certain training vectors. The DCGSVQ system has been shown to outperform full-search VQ, CVQ, and transform coding CVQ (TC-CVQ), producing subjectively better coded images with higher signal-to-noise ratio (SNR) figures at various bit rates.
To further improve the perceived quality of coded images, a new postprocessing algorithm that can be applied at the decoder without increasing the bit rate has been developed. The proposed algorithm is based on various characteristics of the signal spectrum and the noise spectrum, and exploits various properties of the HVS. It is a general-purpose algorithm that can be applied to block-coded images produced by various systems, such as VQ, transform coding (TC), and Block Truncation Coding (BTC). The algorithm is modular and can be applied adaptively depending on the quality of the block-coded image.
The last theme of this work has been the identification of useful fidelity criteria for image quality assessment. Quality predictors, in the form of subjectively weighted error measures, were sought such that a smooth functional relationship exists between them and quality ratings made by human viewers. Quality predictors incorporating simplified models of the HVS have been proposed and tested on a large set of VQ-coded images. Two such predictors have been shown to be better suited to image quality assessment than the commonly used mean square error (MSE) measure.
Highly efficient low-level feature extraction for video representation and retrieval.
Witnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm,
which runs in real time while maintaining high precision and recall in the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates a hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking video clips with a limited lexicon of related keywords.
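A temporal-analysis pass of this kind can be illustrated with a simple frame-difference detector that flags shot boundaries as key frames. This is a hypothetical sketch: the thesis works on compressed-domain prediction features, which are replaced here by raw pixel intensity differences:

```python
def detect_key_frames(frames, threshold):
    """Mark frame i as a key frame when its mean absolute difference
    from frame i-1 exceeds the threshold (a likely shot boundary)."""
    keys = [0]  # the first frame always starts a shot
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        mad = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mad > threshold:
            keys.append(i)
    return keys

# Four 4-pixel "frames": a static shot, then a cut to a brighter shot.
frames = [[10, 10, 10, 10],
          [11, 10, 12, 10],
          [200, 200, 210, 205],
          [201, 199, 211, 204]]
assert detect_key_frames(frames, threshold=50) == [0, 2]
```

Working on compressed-domain features instead of decoded pixels avoids full decoding, which is what makes real-time operation on large databases feasible.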
Energy efficient hardware acceleration of multimedia processing tools
The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents of modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be tackled effectively at the algorithmic level in order to design re-usable optimised hardware acceleration cores.
To support these conclusions, the work in this thesis focuses on two of the basic enabling technologies behind mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism, and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature.
The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. This thesis therefore also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution-search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is itself amenable to hardware acceleration. Another is an early-exit mechanism that achieves large reductions in the search space. Results show an improvement over state-of-the-art algorithms, with future potential for even greater savings.
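The observation that the DCT is a CMM instance can be made concrete: an N-point DCT-II is a fixed N×N coefficient matrix applied to the input vector, so any CMM optimisation (such as shared-subexpression adder networks) applies to it. The following is a plain software sketch of that equivalence, not a hardware design:

```python
import math

def dct_matrix(n):
    """Constant coefficient matrix of the orthonormal N-point DCT-II."""
    mat = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        mat.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n)])
    return mat

def cmm(matrix, vec):
    """Generic constant-matrix multiplication: y = M x."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

M = dct_matrix(4)
# A constant input has all its energy in the DC coefficient.
coeffs = cmm(M, [5.0, 5.0, 5.0, 5.0])
assert abs(coeffs[0] - 10.0) < 1e-9            # DC = 5 * sqrt(4) = 10
assert all(abs(c) < 1e-9 for c in coeffs[1:])  # AC terms vanish
```

In hardware, because the matrix entries are constants, the multipliers can be replaced by fixed shift-and-add networks, which is exactly the design space the genetic-programming search explores.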
Dynamic adaptation of streamed real-time E-learning videos over the internet
Even though e-learning is becoming increasingly popular in the academic environment,
the quality of synchronous e-learning video is still substandard, and significant work needs to be
done to improve it. The improvements have to be brought about taking into consideration both
the network requirements and the psychophysical aspects of the human visual system.
One of the problems with synchronous e-learning video is that mostly a head-and-shoulders view
of the instructor is transmitted. This presentation can be made more interesting by
transmitting shots from different angles and zoom levels. Unfortunately, the transmission of such
multi-shot videos will increase packet delay, jitter and other artifacts caused by frequent
changes of the scenes. To some extent these problems may be reduced by controlled reduction
of the quality of video so as to minimise uncontrolled corruption of the stream. Hence, there is a
need for controlled streaming of a multi-shot e-learning video in response to the changing
availability of the bandwidth, while utilising the available bandwidth to the maximum.
The quality of transmitted video can be improved by removing the redundant background
data and utilising the available bandwidth for sending high-resolution foreground information.
While a number of schemes exist to identify and remove the background from the foreground,
very few studies exist on the identification and separation of the two based on the understanding
of the human visual system. Research has been carried out to define foreground and background
in the context of e-learning video on the basis of human psychology. The results have been
utilised to propose methods for improving the transmission of e-learning videos.
In order to transmit the video sequence efficiently this research proposes the use of Feed-
Forward Controllers that dynamically characterise the ongoing scene and adjust the streaming
of video based on the availability of the bandwidth. In order to satisfy a number of receivers
connected by varied bandwidth links in a heterogeneous environment, the use of Multi-Layer
Feed-Forward Controller has been researched. This controller dynamically characterises the
complexity (number of Macroblocks per frame) of the ongoing video sequence and combines it
with the knowledge of availability of the bandwidth to various receivers to divide the video
sequence into layers in an optimal way before transmitting it into the network.
The Single-layer Feed-Forward Controller inputs the complexity (Spatial Information and
Temporal Information) of the on-going video sequence along with the availability of bandwidth
to a receiver and adjusts the resolution and frame rate of individual scenes to transmit the
sequence optimised to give the most acceptable perceptual quality within the bandwidth
constraints.
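The single-layer controller's decision logic can be sketched as a feed-forward mapping from scene complexity and available bandwidth to a (resolution, frame-rate) operating point. This is a hypothetical illustration with an invented bit-rate model and quality ladder; the thesis's actual Spatial/Temporal Information measures and parameters are not reproduced:

```python
def select_operating_point(complexity, bandwidth_kbps):
    """Pick the best (resolution, fps) pair whose estimated bit-rate
    fits the available bandwidth.

    complexity: estimated bits per pixel per frame for the current scene
    (higher for detailed or fast-moving shots) -- a stand-in for the
    spatial/temporal complexity measures used by the controller.
    """
    # Candidate operating points, best quality first (hypothetical ladder).
    candidates = [
        ((640, 480), 30),
        ((640, 480), 15),
        ((320, 240), 30),
        ((320, 240), 15),
        ((160, 120), 15),
    ]
    for (w, h), fps in candidates:
        est_kbps = w * h * fps * complexity / 1000.0
        if est_kbps <= bandwidth_kbps:
            return (w, h), fps
    return candidates[-1]  # fall back to the lowest operating point

# Over a 200 kbps link, a moderately complex scene drops to QVGA at 30 fps.
assert select_operating_point(complexity=0.05, bandwidth_kbps=200) == ((320, 240), 30)
```

The multi-layer variant would run a similar selection per receiver group and split the sequence into layers accordingly, rather than picking a single operating point.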
The performance of the Feed-Forward Controllers has been evaluated under simulated
conditions, and they have been found to effectively regulate the streaming of real-time e-learning
videos, providing perceptually improved video quality within the constraints of the
available bandwidth.