
    Data comparison schemes for Pattern Recognition in Digital Images using Fractals

    Pattern recognition in digital images is a common problem, with applications in, for example, remote sensing, electron microscopy, medical imaging, seismic imaging and astrophysics. Although this subject has been researched for over twenty years, there is still no general solution comparable to the human cognitive system, which can recognise a pattern subject to arbitrary orientation and scale. The application of Artificial Neural Networks (ANNs) can in principle provide a very general solution, provided suitable training schemes are implemented. However, this approach raises some major issues in practice. First, the CPU time required to train an ANN for a grey-level or colour image can be very large, especially if the object has a complex structure with no clear geometrical features, such as those that arise in remote sensing applications. Secondly, the core and file-space memory required to represent large images and their associated data leads to a number of problems, making the use of virtual memory paramount. The primary goal of this research has been to assess methods of image data compression for pattern recognition using a range of different compression methods. In particular, this research has resulted in the design and implementation of a new algorithm for general pattern recognition based on fractal image compression. This approach has for the first time allowed the pattern recognition problem to be solved in a way that is invariant to rotation and scale. It allows both ANNs and correlation to be used, subject to appropriate pre- and post-processing techniques for digital image processing, an aspect for which a dedicated programmer's workbench has been developed using X-Designer.
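
    The abstract does not detail the fractal scheme; for orientation, the core of fractal (PIFS) image compression is a search that approximates each "range" block of an image by a contracted, intensity-adjusted "domain" block from the same image. Below is a minimal sketch of that matching step; the block sizes, the non-overlapping domain pool, and all names are illustrative assumptions, not the thesis's algorithm, and the rotation- and scale-invariant recognition built on top of such fractal codes is omitted.

```python
import numpy as np

def encode_block(range_block, domain_pool):
    """Find the domain block and affine grey-level map (s, o) that best
    approximates a range block: r ~ s * d + o, in the least-squares sense."""
    best = (np.inf, None, 0.0, 0.0)
    r = range_block.ravel().astype(float)
    for idx, d_block in enumerate(domain_pool):
        d = d_block.ravel().astype(float)
        var = d.var()
        # least-squares contrast (s) and brightness (o)
        s = ((d - d.mean()) * (r - r.mean())).mean() / var if var > 0 else 0.0
        o = r.mean() - s * d.mean()
        err = ((s * d + o - r) ** 2).mean()
        if err < best[0]:
            best = (err, idx, s, o)
    return best  # (error, domain index, contrast, brightness)

def fractal_code(image, rsize=8):
    """Encode an image as one (domain index, s, o) triple per range block."""
    h, w = image.shape
    dsize = 2 * rsize
    assert h % dsize == 0 and w % dsize == 0  # for brevity of the sketch
    # domain pool: non-overlapping blocks of twice the size, downsampled
    domains = [image[y:y + dsize, x:x + dsize][::2, ::2]
               for y in range(0, h, dsize)
               for x in range(0, w, dsize)]
    return [encode_block(image[y:y + rsize, x:x + rsize], domains)
            for y in range(0, h, rsize)
            for x in range(0, w, rsize)]
```

    Recognition then compares fractal codes (domain indices and affine parameters) of a template and a scene rather than raw pixels, which is where the invariance argument of the thesis enters; the comparison scheme itself is not reproduced here.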

    Visual learning induces changes in resting-state fMRI multivariate pattern of information

    When measured with functional magnetic resonance imaging (fMRI) in the resting state (R-fMRI), spontaneous activity is correlated between brain regions that are anatomically and functionally related. Learning and/or task performance can induce modulation of the resting synchronization between brain regions. Moreover, at the neuronal level, spontaneous brain activity can replay patterns evoked by a previously presented stimulus. Here we test whether visual learning/task performance can induce a change in the patterns of coded information in R-fMRI signals, consistent with a role of spontaneous activity in representing task-relevant information. Human subjects underwent R-fMRI before and after perceptual learning on a novel visual shape orientation discrimination task. Task-evoked fMRI patterns to trained versus novel stimuli were recorded after learning was completed, and before the second R-fMRI session. Using multivariate pattern analysis on task-evoked signals, we found patterns that discriminated between trained and novel visual stimuli in several cortical regions: visual cortex (V3/V3A/V7); precuneus and inferior parietal lobule, within the default mode network; and intraparietal sulcus, within the dorsal attention network. The accuracy of classification was strongly correlated with behavioral performance. Next, we measured multivariate patterns in R-fMRI signals before and after learning. The frequency and similarity of resting states representing the task/visual stimuli increased post-learning in the same cortical regions recruited by the task. These findings support a representational role of spontaneous brain activity.
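
    The abstract does not name the classifier behind the multivariate pattern analysis; a common setup, shown below purely as an illustrative sketch with assumed scikit-learn components and synthetic data, is a cross-validated linear classifier applied to region-of-interest voxel patterns.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def mvpa_accuracy(patterns, labels, n_folds=5):
    """Cross-validated decoding of stimulus class (e.g. trained vs. novel)
    from multi-voxel activity patterns of one region of interest.

    patterns : (n_trials, n_voxels) array of task-evoked fMRI patterns
    labels   : (n_trials,) array, e.g. 0 = novel stimulus, 1 = trained
    """
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
    return cross_val_score(clf, patterns, labels, cv=n_folds).mean()

# illustrative use with synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 200))    # 80 trials, 200 voxels
y = rng.integers(0, 2, size=80)   # two stimulus classes
print(mvpa_accuracy(X, y))        # ~0.5 (chance) for unstructured noise
```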

    Automatic video segmentation employing object/camera modeling techniques

    Practically established video compression and storage techniques still process video sequences as rectangular images without further semantic structure. However, humans watching a video sequence immediately recognize acting objects as semantic units. This semantic object separation is currently not reflected in the technical system, making it difficult to manipulate the video at the object level. The realization of object-based manipulation will introduce many new possibilities for working with videos, such as composing new scenes from pre-existing video objects or enabling user interaction with the scene. Moreover, object-based video compression, as defined in the MPEG-4 standard, can provide high compression ratios because the foreground objects can be sent independently from the background. In the case that the scene background is static, the background views can even be combined into a large panoramic sprite image, from which the current camera view is extracted. This results in a higher compression ratio, since the sprite image for each scene only has to be sent once. A prerequisite for employing object-based video processing is automatic (or at least user-assisted semi-automatic) segmentation of the input video into semantic units, the video objects. This segmentation is a difficult problem because the computer does not have the vast amount of pre-knowledge that humans subconsciously use for object detection. Thus, even the simple definition of the desired output of a segmentation system is difficult. The subject of this thesis is to provide segmentation algorithms that are applicable to common video material and that are computationally efficient.

    The thesis is conceptually separated into three parts. In Part I, an automatic segmentation system for general video content is described in detail. Part II introduces object models as a tool to incorporate user-defined knowledge about the objects to be extracted into the segmentation process. Part III concentrates on the modeling of camera motion in order to relate the observed camera motion to real-world camera parameters.

    The segmentation system described in Part I is based on a background-subtraction technique. The pure background image that this technique requires is synthesized from the input video itself. Sequences that contain rotational camera motion can also be processed, since the camera motion is estimated and the input images are aligned into a panoramic scene background. This approach is fully compatible with the MPEG-4 video-encoding framework, so the segmentation system can easily be combined with an object-based MPEG-4 video codec. After an introduction to the theory of projective geometry in Chapter 2, which is required for the derivation of camera-motion models, the estimation of camera motion is discussed in Chapters 3 and 4. It is important that the camera-motion estimation is not influenced by foreground object motion; at the same time, the estimation should provide motion parameters accurate enough that all input frames can be combined seamlessly into a background image. The core motion estimation is based on a feature-based approach in which the motion parameters are determined with a robust-estimation algorithm (RANSAC) in order to distinguish the camera motion from simultaneously visible object motion. Our experiments showed that the robustness of the original RANSAC algorithm in practice does not reach the theoretically predicted performance. An analysis of the problem revealed that this is caused by numerical instabilities that can be significantly reduced by a modification that we describe in Chapter 4.
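
    For orientation, a minimal version of the robust-estimation step reads as follows. This is the textbook RANSAC loop for an eight-parameter homography, with a generic direct-linear-transform solver and a fixed iteration count and pixel threshold as illustrative choices; it is not the numerically stabilized variant of Chapter 4.

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: homography from >= 4 point correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=500, thresh=2.0, rng=None):
    """Fit a homography while ignoring outliers (moving foreground points)."""
    if rng is None:
        rng = np.random.default_rng()
    n = len(src)
    best_H, best_inliers = None, np.zeros(n, bool)
    for _ in range(iters):
        sample = rng.choice(n, 4, replace=False)
        H = homography_dlt(src[sample], dst[sample])
        p = np.c_[src, np.ones(n)] @ H.T          # forward transfer
        proj = p[:, :2] / p[:, 2:3]
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    if best_inliers.sum() >= 4:                   # refine on all inliers
        best_H = homography_dlt(src[best_inliers], dst[best_inliers])
    return best_H, best_inliers
```
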
    The synthesis of static-background images is discussed in Chapter 5. In particular, we present a new algorithm for the removal of the foreground objects from the background image, such that a pure scene background remains. The proposed algorithm is optimized to synthesize the background even for difficult scenes in which the background is only visible for short periods of time. The problem is solved by clustering the image content of each region over time, such that each cluster comprises static content. Furthermore, the algorithm exploits the fact that the times at which foreground objects appear in an image region are similar to the corresponding times of neighboring image areas. The reconstructed background could be used directly as the sprite image in an MPEG-4 video coder. However, we have discovered that the counterintuitive approach of splitting the background into several independent parts can reduce the overall amount of data. In the case of general camera motion, the construction of a single sprite image is even impossible. In Chapter 6, a multi-sprite partitioning algorithm is presented, which separates the video sequence into a number of segments for which independent sprites are synthesized. The partitioning is computed in such a way that the total area of the resulting sprites is minimized, while simultaneously satisfying additional constraints. These include a limited sprite-buffer size at the decoder and the restriction that the image resolution in the sprite should never fall below the input-image resolution. The described multi-sprite approach is fully compatible with the MPEG-4 standard, but provides three advantages. First, arbitrary rotational camera motion can be processed. Second, the coding cost for transmitting the sprite images is lower. Finally, the quality of the decoded sprite images is better than in previously proposed sprite-generation algorithms.

    Segmentation masks for the foreground objects are computed with a change-detection algorithm that compares the pure background image with the input images. A particular problem in the change detection is image misregistration: since the change detection compares co-located pixels in the camera-motion-compensated images, a small error in the motion estimation can introduce segmentation errors because non-corresponding pixels are compared. We approach this problem in Chapter 7 by integrating into the segmentation algorithm risk maps that identify pixels for which misregistration would probably result in errors. For these image areas, the change-detection algorithm is modified to disregard the difference values of the pixels marked in the risk map. This modification significantly reduces the number of false object detections in fine-textured image areas. The algorithmic building blocks described above can be combined into a segmentation system in various ways, depending on whether camera motion has to be considered or whether real-time execution is required. These different systems and example applications are discussed in Chapter 8.
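
    A drastically simplified stand-in for the background synthesis and change detection just described is the per-pixel temporal median of the aligned frames followed by thresholding. The sketch below (with an optional risk-map exclusion loosely in the spirit of Chapter 7) is an assumption-laden illustration, not the clustering algorithm of Chapter 5.

```python
import numpy as np

def synthesize_background(frames):
    """Per-pixel temporal median of camera-motion-compensated frames.

    frames : (n_frames, height, width) array of aligned grey-level images.
    The median recovers a pixel's background value only if the background
    is visible there in at least half of the frames; the thesis's
    clustering approach also handles backgrounds visible for short periods."""
    return np.median(frames, axis=0)

def change_detection_mask(frame, background, thresh=20.0, risk_map=None):
    """Foreground mask from the difference to the synthesized background.

    risk_map : optional boolean array marking pixels (e.g. near strong
    gradients) where small misregistration produces large differences;
    those pixels are excluded from the decision."""
    diff = np.abs(frame.astype(float) - background)
    mask = diff > thresh
    if risk_map is not None:
        mask &= ~risk_map
    return mask
```
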
    Part II of the thesis extends the described segmentation system to consider object models in the analysis. Object models allow the user to specify which objects should be extracted from the video. In Chapters 9 and 10, a graph-based object model is presented in which the features of the main object regions are summarized in the graph nodes, and the spatial relations between these regions are expressed by the graph edges. The segmentation algorithm is extended by an object-detection algorithm that searches the input image for the user-defined object model. We provide two object-detection algorithms: the first is specific to cartoon sequences and uses an efficient sub-graph matching algorithm, whereas the second processes natural video sequences. With the object-model extension, the segmentation system can be directed to extract individual objects, even if the input sequence comprises many objects.

    Chapter 11 proposes an alternative approach to incorporating object models into a segmentation algorithm. The chapter describes a semi-automatic segmentation algorithm in which the user coarsely marks the object and the computer refines this marking to the exact object boundary. Afterwards, the object is tracked automatically through the sequence. In this algorithm, the object model is defined as the texture along the object contour. This texture is extracted in the first frame and then used during the object tracking to localize the original object. The core of the algorithm uses a graph representation of the image and a newly developed algorithm for computing shortest circular paths in planar graphs. The proposed algorithm is faster than the currently known algorithms for this problem, and it can also be applied to many related problems such as shape matching.
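
    The abstract names the circular shortest-path problem without detail. As context, a naive baseline for the typical contour-tracking use resamples the image into a polar strip around the marked object, picks one radius per angle with bounded steps, and forces the contour to close by trying every start radius; the dynamic program below illustrates that baseline. The polar setup and step constraint are assumptions, and the thesis's algorithm solves the planar-graph version faster than this O(radii^2 x angles) loop.

```python
import numpy as np

def circular_shortest_path(cost):
    """Cheapest closed contour through a polar cost strip.

    cost : (n_angles, n_radii) array; entry [j, r] is the cost of placing
    the contour at radius r for angular position j. Between neighboring
    angles the radius may change by at most one step, and the contour must
    return to its start radius (the circular constraint; for simplicity
    this sketch allows no radius step across the seam itself)."""
    n_ang, n_rad = cost.shape
    best_total = np.inf
    for start in range(n_rad):           # fix the radius at the seam
        d = np.full(n_rad, np.inf)
        d[start] = cost[0, start]
        for j in range(1, n_ang):        # DP over the remaining angles
            up = np.r_[np.inf, d[:-1]]
            down = np.r_[d[1:], np.inf]
            d = np.minimum(d, np.minimum(up, down)) + cost[j]
        best_total = min(best_total, d[start])
    return best_total
```
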
    Part III of the thesis elaborates on different techniques to derive information about the physical 3-D world from the camera motion. In the segmentation system, we employ camera-motion estimation, but the obtained parameters have no direct physical meaning. Chapter 12 discusses an extension of the camera-motion estimation that factorizes the motion parameters into physically meaningful parameters (rotation angles, focal length) using camera autocalibration techniques. A distinctive feature of the algorithm is that it can process camera motion that spans several sprites, by employing the multi-sprite technique described above. Consequently, the algorithm can be applied to arbitrary rotational camera motion. For the analysis of video sequences, it is often required to determine and follow the position of objects. Clearly, the object position in image coordinates provides little information if the viewing direction of the camera is not known. Chapter 13 provides a new algorithm to deduce the transformation between image coordinates and real-world coordinates for the special application of sport-video analysis. In sport videos, the camera view can be derived from markings on the playing field. For this reason, we employ a model of the playing field that describes the arrangement of lines. After detecting significant lines in the input image, a combinatorial search is carried out to establish correspondences between lines in the input image and lines in the model. The algorithm requires no information about the specific color of the playing field, and it is very robust to occlusions and poor lighting conditions. Moreover, the algorithm is generic in the sense that it can be applied to any type of sport by simply exchanging the model of the playing field.

    In Chapter 14, we again consider panoramic background images and particularly focus on their visualization. Apart from the planar background sprites discussed previously, a frequently used visualization technique for panoramic images is projection onto a cylinder surface, which is then unwrapped into a rectangular image. However, the disadvantage of this approach is that the viewer has no good sense of orientation in the panoramic image, because all directions are shown at the same time. In order to provide a more intuitive presentation of wide-angle views, we have developed a visualization technique specialized for the case of indoor environments. We present an algorithm to determine the 3-D shape of the room in which the image was captured or, more generally, to compute a complete floor plan if several panoramic images captured in each of the rooms are provided. Based on the obtained 3-D geometry, a graphical model of the rooms is constructed in which the walls are displayed with textures extracted from the panoramic images. This representation enables virtual walk-throughs in the reconstructed rooms and therefore provides a better orientation for the user.

    Summarizing, we can conclude that all segmentation techniques employ some definition of foreground objects. These definitions are either explicit, using object models as in Part II of this thesis, or implicit, as in the background synthesis of Part I. The results of this thesis show that implicit descriptions, which extract their definition from the video content, work well when the sequence is long enough to extract this information reliably. However, high-level semantics are difficult to integrate into segmentation approaches that are based on implicit models; instead, such semantics should be added as post-processing steps. Explicit object models, on the other hand, apply semantic pre-knowledge at early stages of the segmentation. Moreover, they can be applied to short video sequences or even still pictures, since no background model has to be extracted from the video. The definition of a general object-modeling technique that is widely applicable and that also enables accurate segmentation remains an important yet challenging problem for further research.

    Registration of pre-operative lung cancer PET/CT scans with post-operative histopathology images

    Non-invasive imaging modalities used in the diagnosis of lung cancer, such as Positron Emission Tomography (PET) or Computed Tomography (CT), currently provide insufficient information about the cellular make-up of the lesion microenvironment, unless they are compared against the gold standard of histopathology. The aim of this retrospective study was to build a robust imaging framework for registering in vivo and post-operative scans from lung cancer patients, in order to have a global, pathology-validated multimodality map of the tumour and its surroundings. Initial experiments were performed on tissue-mimicking phantoms, to test different shape reconstruction methods. The choice of interpolator and the slice thickness were found to affect the algorithm's output, in terms of overall volume and local feature recovery. In the second phase of the study, nine lung cancer patients referred for radical lobectomy were recruited. Resected specimens were inflated with agar, sliced at 5 mm intervals, and each cross-section was photographed. The tumour area was delineated on the block-face pathology images and on the pre-operative PET/CT scans. Airway segments were also added to the reconstructed models, to act as anatomical fiducials. Binary shapes were pre-registered by aligning their minimal bounding box axes, and subsequently transformed using rigid registration. In addition, histopathology slides were matched to the block-face photographs using a moving least squares algorithm. A two-step validation process was used to evaluate the performance of the proposed method against manual registration carried out by experienced consultants. In two out of three cases, experts rated the results generated by the algorithm as the best output, suggesting that the developed framework outperforms the current standard practice.
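
    The study pre-registers binary shapes by aligning their minimal bounding box axes; a closely related and easy-to-sketch variant aligns centroids and principal (inertia) axes of the binary volumes, producing the initial rigid transform that a subsequent rigid registration refines. The code below is an illustrative sketch of that idea, not the study's implementation.

```python
import numpy as np

def principal_axes(binary_volume):
    """Centroid and principal axes of a binary shape (3-D boolean array)."""
    pts = np.argwhere(binary_volume).astype(float)
    centroid = pts.mean(axis=0)
    # eigenvectors of the coordinate covariance ~ axes of an oriented box
    _, vecs = np.linalg.eigh(np.cov((pts - centroid).T))
    return centroid, vecs

def prealign(moving, fixed):
    """Rigid transform (R, t) mapping the moving shape's axes onto the
    fixed shape's axes: x_fixed ~ R @ x_moving + t. The sign ambiguity of
    the eigenvectors is left for the subsequent rigid registration to fix."""
    c_m, a_m = principal_axes(moving)
    c_f, a_f = principal_axes(fixed)
    R = a_f @ a_m.T
    if np.linalg.det(R) < 0:     # avoid an accidental reflection
        a_f[:, 0] *= -1
        R = a_f @ a_m.T
    t = c_f - R @ c_m
    return R, t
```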

    Image Registration Workshop Proceedings

    Automatic image registration has often been considered a preliminary step for higher-level processing, such as object recognition or data fusion. But with the unprecedented amounts of data which are being, and will continue to be, generated by newly developed sensors, automatic image registration has itself become an important research topic. This workshop presents a collection of very high quality work which has been grouped in four main areas: (1) theoretical aspects of image registration; (2) applications to satellite imagery; (3) applications to medical imagery; and (4) image registration for computer vision research.

    A multidisciplinary approach to the study of shape and motion processing and representation in rats

    During my PhD I investigated how shape and motion information are processed by the rat visual system, so as to establish how advanced the representation of higher-order visual information is in this species and, ultimately, to understand to what extent rats can present a valuable alternative to monkeys, as experimental models, in vision studies. Specifically, in my thesis work, I have investigated: 1) the possible visual strategies underlying shape recognition; and 2) the ability of rat visual cortical areas to represent motion and shape information. My work comprised two different but complementary experimental approaches: psychophysical measurements of the rat's recognition ability and strategy, and in vivo extracellular recordings in anaesthetized animals passively exposed to various (static and moving) visual stimuli. The first approach involved training the rats on an invariant object recognition task, i.e. to tolerate different ranges of transformations in the object's appearance, and applying an image classification technique known as Bubbles to reveal the visual strategy the animals were able to adopt, under different conditions of stimulus discriminability, in order to perform the task. The second approach involved electrophysiological exploration of different visual areas in the rat's cortex, in order to investigate putative functional hierarchies (or streams of processing) in the computation of motion and shape information. Results show, on one hand, that rats are able, under conditions of high stimulus discriminability, to adopt a shape-based, view-invariant, multi-featural recognition strategy; on the other hand, the functional properties of neurons recorded from different visual areas suggest the presence of a putative shape-based, ventral-like stream of processing in the rat's visual cortex. The general purpose of my work has been to unveil the neural mechanisms that underlie object recognition, with the goal of eventually 1) relating my findings on rats to those on more visually advanced species, such as human and non-human primates, and 2) collecting enough biological data to support the artificial simulation of visual recognition processes, which still presents an important scientific challenge.
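
    For readers unfamiliar with it, the Bubbles technique probes recognition by revealing the stimulus only through randomly placed Gaussian apertures and then asking, per pixel, whether visibility of that pixel predicts a correct response. A minimal sketch follows; the aperture count and width are illustrative, not the parameters used in the thesis.

```python
import numpy as np

def bubbles_mask(shape, n_bubbles=15, sigma=8.0, rng=None):
    """Random mask of Gaussian apertures ('bubbles'), values in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.zeros(shape)
    for cy, cx in zip(rng.integers(0, h, n_bubbles),
                      rng.integers(0, w, n_bubbles)):
        mask = np.maximum(mask,
                          np.exp(-((yy - cy)**2 + (xx - cx)**2) / (2 * sigma**2)))
    return mask

def classification_image(masks, correct):
    """Diagnostic map: mean mask on correct trials minus mean on errors.
    Pixels whose visibility drives correct responses stand out."""
    masks = np.asarray(masks)
    correct = np.asarray(correct, bool)
    return masks[correct].mean(axis=0) - masks[~correct].mean(axis=0)
```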

    Kinetic Analysis of dynamic MP4A PET Scans of Human Brain using Voxel based Nonlinear Least Squares Fitting

    Dynamic PET (Positron Emission Tomography) involving a number of radiotracers is an established technique for in vivo estimation of biochemical parameters in the human brain, such as the overall metabolic rate and certain receptor concentrations or enzyme activities. 11C-labeled methyl-4-piperidyl acetate (MP4A) and methyl-4-piperidyl propionate (MP4P) are established radiotracers for measuring the activity of acetylcholinesterase (AChE), which relates to the functionality of the cholinergic system. MP4A kinetic analysis without arterial blood sampling employs a reference-tissue-based "irreversible tracer model". Implementations can be region based or voxel based, in the second case providing parametric images of k3, which is an indicator of AChE activity. This work introduces an implementation of voxel-based kinetic analysis using weighted nonlinear least squares (NLS) fitting, which is fast enough for standard PCs. The entire workflow leading from reconstructed PET scans to parametric images of k3, including normalization and correction for patient movement, has been automated. Image preprocessing has been redefined, and fixed masks are no longer required. A focus of this work is error estimation of k3 at the voxel and regional level. A formula is derived for voxel-based estimation of the random error; it is based on residual weighted squared differences and has been successfully validated against simulated data. The reference curves turned out to be the main source of errors in regional mean values of k3. Major improvements were achieved in this area by switching from fixed to adaptive putamen masks and raising their volume from 5.4 to 12.5 or 16 ml. Also, a method for correcting reference curves obtained from non-ideal reference tissues is presented. For the improved implementation, the random error of the mean k3 of a number of cerebral regions has been assessed based on PET studies of 12 human subjects, by splitting them into two independent data sets at the sinogram level. According to this sample, absolute standard errors of 0.0012 in most cortex regions and 0.0053 in the hippocampus are induced by noise of the voxel-based activity curves, while errors of approximately 0.0025 and 0.0050 are induced by noise of the reference curves. Different types of systematic as well as noise-induced bias have been investigated by simulations; their combined effect on the computed k3 was found to be below 3 percent. The implementation is available as a module of the VINCI software package and has been used in clinical studies on Parkinson's disease and Alzheimer's dementia.
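
    The abstract does not state the operational equation. Purely as an illustration of the fitting machinery, the sketch below performs a weighted least-squares fit of a voxel time-activity curve against a reference-tissue curve, using a simplified, Patlak-like irreversible term in place of the true MP4A model (which is nonlinear in k3); the weighting and the covariance-based error estimate carry over unchanged to the real nonlinear model, since scipy's curve_fit handles both identically. All names are placeholders.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.integrate import cumulative_trapezoid

def fit_voxel(t, c_voxel, c_ref, weights):
    """Weighted NLS fit of one voxel's time-activity curve.

    t, c_voxel, c_ref : 1-D arrays (frame mid-times, voxel curve, reference
    curve); weights : per-frame weights, e.g. inverse noise variances.
    Simplified model: C(t) = r1 * Cref(t) + k3 * integral of Cref,
    where the trapping term plays the role of k3.
    Returns (r1, k3) and their standard errors from the covariance matrix,
    i.e. the kind of voxel-level random-error estimate discussed above."""
    int_ref = cumulative_trapezoid(c_ref, t, initial=0.0)

    def model(t_, r1, k3):
        return r1 * c_ref + k3 * int_ref

    p, cov = curve_fit(model, t, c_voxel, p0=(1.0, 0.01),
                       sigma=1.0 / np.sqrt(weights))
    return p, np.sqrt(np.diag(cov))
```

    A parametric k3 image then results from applying such a fit to every voxel's curve.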

    Segmenting the male pelvic organs from limited angle images with application to ART

    Prostate cancer is the second leading cause of cancer deaths in men, and external beam radiotherapy is a common method for treating prostate cancer. In a clinically state-of-the-art radiotherapy protocol, CT images are taken at treatment time and are used to properly position the patient with respect to the treatment device. In adaptive radiotherapy (ART), this image is used to approximate the actual radiation dose delivered to the patient and to track the progress of therapy. Doing so, however, requires that the male pelvic organs of interest be segmented and that correspondence be established between the images (registration), such that the cumulative delivered dose can be accumulated in a reference coordinate system. Because a typical prostate radiotherapy treatment is delivered over 30-40 daily fractions, a large non-therapeutic radiation dose is delivered to the patient from daily imaging. In the interest of reducing this dose, gantry-mounted limited angle imaging devices have been developed, which reduce dose at the expense of image quality. However, in the male pelvis, such limited angle images are not suitable for the ART process using traditional methods. In this work, a patient-specific deformation model is developed that is sufficient for use with limited angle images. This model is learned from daily CT images taken during the first several treatment fractions. Limited angle imaging can then be used for the remaining fractions at decreased dose. When the parameters of this model are set, it provides segmentation of the prostate, bladder, and rectum, correspondence between the images, and a CT-like image that can be used for dose accumulation. However, intra-patient deformation in the male pelvis is complex, and quality deformation models cannot be developed from a reasonable number of training images using traditional methods. This work addresses this issue by partitioning the deformation to be explained into independent sub-models that explain deformation due to articulation, deformation near the skin, deformation of the prostate, bladder, and rectum, and any residual deformation. It is demonstrated that a model that segments the prostate with accuracy comparable to inter-expert variation can be developed from 16 daily images.
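
    The abstract does not specify the statistical machinery; a common way to learn a patient-specific deformation model from a handful of daily CTs is principal component analysis on vectorized deformation fields, with one such sub-model per deformation source (articulation, skin region, organs, residual). A hedged sketch of one sub-model, with all names illustrative:

```python
import numpy as np

class DeformationModel:
    """Low-dimensional deformation model learned via PCA from training
    deformation fields (e.g., daily-CT-to-reference registrations).

    fields : (n_train, n_params) array, each row one vectorized
    displacement field (or the parameters of one sub-model)."""

    def __init__(self, fields, n_modes=3):
        self.mean = fields.mean(axis=0)
        # principal modes from the SVD of the centered training matrix
        _, s, vt = np.linalg.svd(fields - self.mean, full_matrices=False)
        self.modes = vt[:n_modes]
        self.scales = s[:n_modes] / np.sqrt(max(len(fields) - 1, 1))

    def synthesize(self, coeffs):
        """Deformation for mode coefficients given in units of std. dev."""
        return self.mean + (np.asarray(coeffs) * self.scales) @ self.modes
```

    Fitting the mode coefficients to a new limited-angle image would then amount to optimizing an image-match term over this low-dimensional space, which in turn yields the organ segmentations and CT-like image carried by the model.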

    Autocalibrating vision guided navigation of unmanned air vehicles via tactical monocular cameras in GPS denied environments

    This thesis presents a novel robotic navigation strategy using a conventional tactical monocular camera, proving the feasibility of using a monocular camera as the sole proximity-sensing, object-avoidance, mapping, and path-planning mechanism to fly and navigate small to medium scale unmanned rotary-wing aircraft autonomously. The range measurement strategy is scalable, self-calibrating, and indoor-outdoor capable; it is biologically inspired by the key adaptive mechanisms for depth perception and pattern recognition found in humans and intelligent animals (particularly bats), and it is designed for operation in previously unknown, GPS-denied environments. The thesis proposes novel electronics, aircraft, systems, procedures, and algorithms that come together to form airborne systems which measure absolute ranges from a monocular camera via passive photometry, mimicking human-pilot-like judgement. The research is intended to bridge the gap between practical GPS coverage and the precision localization and mapping problem in small aircraft. In the context of this study, several robotic platforms, airborne and ground alike, have been developed, some of which have been integrated in real-life field trials for experimental validation. Although the emphasis is on miniature robotic aircraft, this research has been tested on, and found compatible with, tactical vests and helmets, and it can be used to augment the reliability of many other types of proximity sensors.