Towards the optimal Bayes classifier using an extended self-organising map
In this paper, we propose an extended self-organising learning scheme in which both the distance measure and the neighbourhood function are replaced by the neurons' posterior probabilities. Weights are updated within a limited, fixed-size neighbourhood of the winner. Each unit converges to one component of a mixture distribution of the input samples, so that an optimal pattern classifier can be formed. The proposed learning scheme can also be used to train other forms of unsupervised networks, such as radial-basis-function networks. An application example on textured image segmentation is presented.
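The core idea of replacing a distance-based neighbourhood with posterior probabilities can be illustrated with a minimal sketch. The function below is a hypothetical one-dimensional version (the name, update rule details and learning rate are assumptions, not the paper's exact scheme): each unit is treated as one Gaussian mixture component, and its update strength toward an input is its posterior responsibility rather than a neighbourhood kernel.

```python
import math

def posterior_som_step(weights, variances, priors, x, lr=0.1):
    """One illustrative update of a posterior-based SOM (1-D sketch).

    Each unit (weight, variance, prior) is read as one Gaussian mixture
    component; the update strength is the unit's posterior responsibility
    for x, replacing the usual distance-based neighbourhood function.
    """
    # likelihood of x under each unit's Gaussian
    lik = [p * math.exp(-(x - w) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
           for w, v, p in zip(weights, variances, priors)]
    total = sum(lik)
    # posterior responsibilities (Bayes' rule)
    post = [l / total for l in lik]
    # move each unit toward x in proportion to its posterior
    return [w + lr * r * (x - w) for w, r in zip(weights, post)]
```

Under this reading, the unit whose component best explains the input receives nearly all of the update, which is how each unit can converge to one mixture component.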
Training Bayesian networks for image segmentation
We are concerned with the problem of image segmentation, in which each pixel is assigned to one of a predefined finite number of classes. In Bayesian image analysis, this requires fusing local predictions for the class labels with a prior model of segmentations. Markov random fields (MRFs) have been used to incorporate some of this prior knowledge, but this is not entirely satisfactory, as inference in MRFs is NP-hard. The multiscale quadtree model of Bouman and Shapiro (1994) is an attractive alternative, as it is a tree-structured belief network in which inference can be carried out in linear time (Pearl 1988). It is a hierarchical model in which the bottom-level nodes are pixels and higher levels correspond to downsampled versions of the image. The conditional-probability tables (CPTs) in the belief network encode the knowledge of how the levels interact. In this paper we discuss two methods of learning the CPTs from training data: (a) maximum likelihood via the EM algorithm and (b) conditional maximum likelihood (CML). Segmentations obtained using networks trained by CML show a statistically significant improvement in performance on synthetic images. We also demonstrate the methods on a real-world outdoor-scene segmentation task.
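The linear-time inference that makes the quadtree model attractive can be sketched on a toy one-level tree: a single parent node with four child pixels. The function below (names and the two-state setup are illustrative assumptions, not the paper's implementation) runs the upward/downward message passes of exact tree inference given a parent prior, a CPT, and per-pixel evidence likelihoods.

```python
import math

def leaf_posteriors(prior, cpt, evidence):
    """Exact posteriors in a one-level quadtree: one parent, four children.

    prior[s]       : P(parent = s)
    cpt[s][c]      : P(child = c | parent = s)
    evidence[i][c] : local likelihood P(obs_i | child_i = c)
    """
    n_states = len(prior)
    # upward pass: child i sends lambda_i(s) = sum_c cpt[s][c] * evidence[i][c]
    lam = [[sum(cpt[s][c] * ev[c] for c in range(n_states))
            for s in range(n_states)] for ev in evidence]
    # parent posterior is proportional to prior times product of messages
    parent = [prior[s] * math.prod(l[s] for l in lam) for s in range(n_states)]
    z = sum(parent)
    parent = [p / z for p in parent]
    # downward pass: child posterior marginalises over the parent state
    posts = []
    for i, ev in enumerate(evidence):
        post = []
        for c in range(n_states):
            # divide out child i's own upward message before conditioning
            p = sum(parent[s] * cpt[s][c] / lam[i][s] if lam[i][s] > 0 else 0.0
                    for s in range(n_states)) * ev[c]
            post.append(p)
        z = sum(post)
        posts.append([p / z for p in post])
    return parent, posts
```

Stacking such parent/child layers gives the full quadtree, and each pass remains linear in the number of nodes; EM and CML learning of the CPTs both build on these same messages.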
A Hybrid Enhanced Independent Component Analysis Approach for Segmentation of Brain Magnetic Resonance Image
Medical imaging and analysis play a crucial role in diagnosis and treatment planning. The anatomical complexity of the human brain makes imaging and analysis very difficult. Despite huge advances in medical imaging procedures, accurate segmentation and classification of brain abnormalities remain challenging and daunting tasks. The challenge is especially visible for brain tumors because of the varied possible shapes, locations and image intensities of different tumor types. In this paper we present a method for automated segmentation of brain tumors from magnetic resonance images. An enhanced and modified Gaussian mixture model combined with an independent component analysis segmentation approach is employed to segment brain tumors in magnetic resonance images. The segmentation results are validated using standard segmentation evaluation parameters.
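The Gaussian-mixture component of such intensity-based MR segmentation is standard enough to sketch. The code below is a minimal 1-D EM fit (the paper's "enhanced and modified" GMM and the ICA stage are not reproduced here; initialisation and iteration count are illustrative choices).

```python
import math

def em_gmm_1d(data, k=2, iters=50):
    """Minimal EM for a 1-D Gaussian mixture over pixel intensities."""
    # crude initialisation: spread the means over the data range
    lo, hi = min(data), max(data)
    mu = [lo + (i + 1) * (hi - lo) / (k + 1) for i in range(k)]
    var = [1.0] * k
    pi = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in data:
            p = [pi[j] * math.exp(-(x - mu[j]) ** 2 / (2 * var[j])) /
                 math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            z = sum(p)
            resp.append([pj / z for pj in p])
        # M-step: re-estimate mixing weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            pi[j] = nj / len(data)
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, data)) / nj, 1e-6)
    return pi, mu, var
```

Each pixel is then assigned to the component with the highest responsibility, which yields the tissue/tumor segmentation that the evaluation parameters assess.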
Object-based video representations: shape compression and object segmentation
Object-based video representations are considered to be useful for easing the process of multimedia content production and enhancing user interactivity in multimedia productions. Object-based video presents several new technical challenges, however.
Firstly, as with conventional video representations, compression of the video data is a
requirement. For object-based representations, it is necessary to compress the shape of
each video object as it moves in time. This amounts to the compression of moving
binary images. This is achieved by the use of a technique called context-based
arithmetic encoding. The technique is applied to rectangular pixel blocks and is thus consistent with the standard tools of video compression. The block-based application also facilitates exploitation of temporal redundancy in the sequence of binary shapes. For the first time, context-based arithmetic encoding is used in conjunction with motion compensation to provide inter-frame compression. The method described in this thesis has been thoroughly tested throughout the MPEG-4 core-experiment process and, owing to favourable results, has been adopted as part of the MPEG-4 video standard.
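The modelling half of context-based arithmetic encoding can be illustrated compactly. MPEG-4 CAE uses a ten-pixel template of causal neighbours to index a probability table; the sketch below shrinks this to a three-pixel context (an illustrative reduction, not the standard's template) and simply collects per-context bit statistics from a binary shape mask, which is what an arithmetic coder would then consume.

```python
def context_counts(mask):
    """Toy context model for binary shape coding.

    For each pixel, a context is formed from three causal neighbours
    (left, up, up-left) and the 0/1 occurrences of the pixel value are
    counted per context. Skewed counts mean low entropy, i.e. the
    arithmetic coder can code the mask in few bits.
    """
    h, w = len(mask), len(mask[0])
    counts = {}  # context index -> [count of 0s, count of 1s]
    for y in range(h):
        for x in range(w):
            left = mask[y][x - 1] if x > 0 else 0
            up = mask[y - 1][x] if y > 0 else 0
            upleft = mask[y - 1][x - 1] if x > 0 and y > 0 else 0
            ctx = (left << 2) | (up << 1) | upleft
            c = counts.setdefault(ctx, [0, 0])
            c[mask[y][x]] += 1
    return counts
```

For inter-frame coding, the same idea applies with the context template reaching into the motion-compensated previous shape, so temporally stable regions yield highly skewed (cheap) contexts.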
The second challenge lies in the acquisition of the video objects. Under normal conditions, a video sequence is captured as a sequence of frames, with no inherent information about which objects are in the sequence, let alone the shape of each object. Some means of segmenting semantic objects from general video sequences is therefore required. Several image analysis tools may help here, and video object tracking algorithms are expected to be particularly important. A new tracking algorithm is developed based on piecewise-polynomial motion representations and statistical estimation tools, e.g. the expectation-maximisation method and the minimum description length principle.
Multi-Modal Learning For Adaptive Scene Understanding
Modern robotic systems typically possess sensors of different modalities. Segmenting the scenes observed by a robot into a discrete set of classes is a central requirement for autonomy. Equally, when a robot navigates through an unknown environment, it is often necessary to adjust the parameters of the scene segmentation model to maintain the same level of accuracy in changing situations. This thesis explores efficient means of adaptive semantic scene segmentation in an online setting using multiple sensor modalities. First, we devise a novel conditional random field (CRF) inference method for scene segmentation that incorporates global constraints, enforcing that particular sets of nodes be assigned the same class label. To do this efficiently, the CRF is formulated as a relaxed quadratic program whose maximum a posteriori (MAP) solution is found using a gradient-based optimization approach. These global constraints are useful, since they can encode a priori information about the final labeling. The new formulation also reduces the dimensionality of the original image-labeling problem. The proposed model is employed in an urban street-scene understanding task: camera data is used for the CRF-based semantic segmentation, while global constraints are derived from 3D laser point clouds. Second, an approach is proposed to learn the CRF parameters without manually labeled training data. The model parameters are estimated by optimizing a novel loss function using self-supervised reference labels, obtained from camera and laser information with a minimal amount of human supervision. Third, an approach is proposed that performs this parameter optimization while increasing the model's robustness to non-stationary data distributions over long trajectories. We adopt stochastic gradient descent with a learning rate that can grow or diminish as appropriate, gaining adaptability to changes in the data distribution.
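The relaxed-QP view of MAP inference can be sketched in a few lines. In the sketch below (a hypothetical stand-in: the function name, the exponentiated-gradient optimiser and all parameters are assumptions, not the thesis's exact solver), each node carries a relaxed label distribution on the probability simplex, and gradient ascent on the unary-plus-pairwise quadratic objective is performed with multiplicative updates so the simplex constraint is maintained automatically.

```python
import math

def relaxed_map(unary, edges, pairwise, steps=200, lr=0.5):
    """Sketch of MAP inference via a relaxed quadratic program.

    unary[i][k]    : score for node i taking label k
    edges          : list of (i, j) node pairs
    pairwise[k][l] : compatibility of labels k and l across an edge
    Node marginals q[i] live on the simplex and are updated by
    exponentiated gradient ascent; argmax rounding gives the labels.
    """
    n, K = len(unary), len(unary[0])
    q = [[1.0 / K] * K for _ in range(n)]
    for _ in range(steps):
        # gradient of the quadratic objective w.r.t. each q[i]
        grad = [[unary[i][k] for k in range(K)] for i in range(n)]
        for i, j in edges:
            for k in range(K):
                grad[i][k] += sum(pairwise[k][l] * q[j][l] for l in range(K))
                grad[j][k] += sum(pairwise[l][k] * q[i][l] for l in range(K))
        # multiplicative update keeps each q[i] on the simplex
        for i in range(n):
            w = [q[i][k] * math.exp(lr * grad[i][k]) for k in range(K)]
            z = sum(w)
            q[i] = [x / z for x in w]
    return [max(range(K), key=lambda k: q[i][k]) for i in range(n)]
```

A global "these nodes share one label" constraint fits naturally here by tying several nodes to a single q vector, which is also how such constraints shrink the dimensionality of the labeling problem.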
Dynamical models and machine learning for supervised segmentation
This thesis is concerned with the problem of how to outline regions of interest in medical images, when
the boundaries are weak or ambiguous and the region shapes are irregular. The focus on machine learning
and interactivity leads to a common theme of the need to balance conflicting requirements. First,
any machine learning method must strike a balance between how much it can learn and how well it
generalises. Second, interactive methods must balance minimal user demand with maximal user control.
To address the problem of weak boundaries, methods of supervised texture classification are investigated
that do not use explicit texture features. These methods enable prior knowledge about the image to
benefit any segmentation framework. A chosen dynamic contour model, based on probabilistic boundary
tracking, combines these image priors with efficient modes of interaction. We show the benefits of the
texture classifiers over intensity and gradient-based image models, in both classification and boundary
extraction.
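One minimal example of texture classification without explicit texture features is nearest-neighbour matching on raw intensity patches: the classifier sees only pixel values, never hand-crafted filter responses. The sketch below is a hypothetical illustration of that idea (function name and parameters invented for the example), not the thesis's actual classifiers.

```python
def patch_knn_segment(image, seeds, patch=1, k=3):
    """Classify every pixel by k-NN matching of its raw intensity patch
    against patches taken at user-labelled seed pixels.

    image : 2-D list of intensities
    seeds : dict mapping (y, x) -> class label
    patch : half-width, so each patch is (2*patch+1)^2 raw pixels
    """
    h, w = len(image), len(image[0])

    def patch_at(y, x):
        # raw pixel patch, with borders clamped to the image
        return tuple(image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                     for dy in range(-patch, patch + 1)
                     for dx in range(-patch, patch + 1))

    train = [(patch_at(y, x), lbl) for (y, x), lbl in seeds.items()]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            p = patch_at(y, x)
            # k nearest labelled patches by squared Euclidean distance
            near = sorted(train,
                          key=lambda t: sum((a - b) ** 2
                                            for a, b in zip(p, t[0])))[:k]
            labels = [lbl for _, lbl in near]
            row.append(max(set(labels), key=labels.count))
        out.append(row)
    return out
```

Because the prior knowledge lives entirely in the labelled examples, a per-pixel classifier of this kind can feed its class probabilities into any segmentation framework, including contour-based ones.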
To address the problem of irregular region shape, we devise a new type of statistical shape model
(SSM) that does not use explicit boundary features or assume high-level similarity between region
shapes. First, the models are used for shape discrimination, to constrain any segmentation framework
by way of regularisation. Second, the SSMs are used for shape generation, allowing probabilistic segmentation
frameworks to draw shapes from a prior distribution. The generative models also include
novel methods to constrain shape generation according to information from both the image and user
interactions.
The shape models are first evaluated in terms of discrimination capability, and shown to outperform
other shape descriptors. Experiments also show that the shape models can benefit a standard type of
segmentation algorithm by providing shape regularisers. We finally show how to exploit the shape
models in supervised segmentation frameworks, and evaluate their benefits in user trials.
Learning to Read by Spelling: Towards Unsupervised Text Recognition
This work presents a method for visual text recognition without using any paired supervisory data. We formulate the text recognition task as one of aligning the conditional distribution of strings predicted from given text images with lexically valid strings sampled from target corpora. This enables fully automated, unsupervised learning from just line-level text images and unpaired text-string samples, obviating the need for large aligned datasets. We present a detailed analysis of various aspects of the proposed method, namely: (1) the impact of the length of training sequences on convergence, (2) the relation between character frequencies and the order in which they are learnt, (3) the generalisation ability of our recognition network to inputs of arbitrary lengths, and (4) the impact of varying the text corpus on recognition accuracy. Finally, we demonstrate excellent text recognition accuracy on both synthetically generated text images and scanned images of real printed books, using no labelled training examples.
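The "aligning distributions of strings" objective is adversarial in the paper; a crude but concrete stand-in for the same intuition is to compare character n-gram statistics of the predicted strings against those of the unpaired corpus. The sketch below is only an illustrative substitute for that alignment signal (the function and its total-variation statistic are assumptions, not the paper's loss).

```python
from collections import Counter

def ngram_alignment_loss(predicted, corpus, n=2):
    """Total variation distance between character n-gram distributions
    of predicted strings and of lexically valid corpus strings.
    Zero means the predicted text is statistically indistinguishable
    from real text at the n-gram level."""
    def dist(strings):
        c = Counter(g for s in strings
                    for g in (s[i:i + n] for i in range(len(s) - n + 1)))
        total = sum(c.values())
        return {g: v / total for g, v in c.items()}

    p, q = dist(predicted), dist(corpus)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))
```

A recogniser whose outputs drive such a statistic toward zero is being pushed toward lexically valid strings without ever seeing an image paired with its transcription, which is the unsupervised premise of the work.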
An automated method for tendon image segmentation on ultrasound using grey-level co-occurrence matrix features and hidden Gaussian Markov random fields
Background: Despite knowledge of qualitative changes that occur on ultrasound in tendinopathy, there is currently no objective and reliable means to quantify the severity or prognosis of tendinopathy on ultrasound.
Objective: The primary objective of this study is to produce a quantitative and automated means of inferring potential structural changes in tendinopathy by developing and implementing an algorithm which performs a texture-based segmentation of tendon ultrasound (US) images.
Method: A model-based segmentation approach is used which combines Gaussian mixture models, Markov random field theory and grey-level co-occurrence matrix (GLCM) features. The algorithm is trained and tested on 49 longitudinal B-mode ultrasound images of Achilles tendons, labelled as tendinopathic (24) or healthy (25). Hyperparameters are tuned using a training set of 25 images to optimise a decision-tree-based classification of the images from texture class proportions. The remaining test images are then segmented and classified using the decision tree.
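The GLCM features at the heart of the method are straightforward to compute. The sketch below builds the co-occurrence matrix for one pixel offset and derives two of the classic Haralick texture features (the study's exact feature set, offsets and quantisation are not specified here, so these particular choices are illustrative).

```python
def glcm_features(img, levels=8, dx=1, dy=0):
    """Grey-level co-occurrence matrix for one (dx, dy) offset, plus two
    common Haralick features. img is a 2-D list of quantised grey levels
    in range(levels)."""
    h, w = len(img), len(img[0])
    glcm = [[0.0] * levels for _ in range(levels)]
    total = 0
    # count co-occurring grey-level pairs at the given offset
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                glcm[img[y][x]][img[y2][x2]] += 1
                total += 1
    # normalise to joint probabilities
    glcm = [[v / total for v in row] for row in glcm]
    contrast = sum(p * (i - j) ** 2
                   for i, row in enumerate(glcm) for j, p in enumerate(row))
    homogeneity = sum(p / (1 + abs(i - j))
                      for i, row in enumerate(glcm) for j, p in enumerate(row))
    return glcm, contrast, homogeneity
```

Computed over local windows, such features give each pixel a texture descriptor that the Gaussian-mixture/MRF model can then segment into texture classes.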
Results: Our approach successfully detects a difference between the texture profiles of tendinopathic and healthy tendons, with 22/24 of the test images accurately classified using a simple texture-proportion cut-off threshold. Results for the tendinopathic images are also collated to gain insight into the topology of the structural changes that occur with tendinopathy. Distinct textures that are predominantly present in tendinopathic tendons appear most commonly near the transverse boundary of the tendon, though there is large variability among diseased tendons.
Conclusion: The GLCM-based segmentation of tendons under ultrasound resulted in distinct segmentations between healthy and tendinopathic tendons, and provides a potential tool for objectively quantifying damage in tendinopathy.