439 research outputs found

    Disambiguating Multi-Modal Scene Representations Using Perceptual Grouping Constraints

    In its early stages, the visual system is subject to considerable ambiguity and noise, which severely limits the performance of early vision algorithms. This article presents feedback mechanisms between early visual processes, such as perceptual grouping, stereopsis and depth reconstruction, that allow the system to reduce this ambiguity and improve the early representation of visual information. In the first part, the article proposes a local perceptual grouping algorithm that, in addition to commonly used geometric information, makes use of a novel multi-modal measure between local edge/line features. The grouping information is then used to: 1) disambiguate stereopsis by enforcing that stereo matches preserve groups; and 2) correct the reconstruction error due to image pixel sampling using linear interpolation over the groups. The integration of mutual feedback between early vision processes is shown to considerably reduce ambiguity and noise without the need for global constraints.

    Design CNN On Bone Spine Segmention TO Methodes Image Processing

    This thesis proposes a deep learning approach to bone segmentation in abdominal CNN+PG. Segmentation is a common initial step in medical image analysis and is often fundamental for computer-aided detection and diagnosis systems. The extraction of bones in PG is a challenging task: done manually by experts, it is a time-consuming process, and it does not yet have a broadly recognized automatic solution. The method presented is based on a convolutional neural network, inspired by the U-Net and trained end-to-end, that performs a semantic segmentation of the data. The training dataset is made up of 21 abdominal PG+CNN, each one containing between 0 and 255 2D transversal images. Those images are in full resolution, 4*4*50 voxels, and each voxel is classified by the network into one of the following classes: background, femoral bones, hips, sacrum, sternum, spine and ribs. The output is therefore a bone mask in which the bones are recognized and divided into six different classes. On the testing dataset, labeled by experts, the best model achieves a Dice coefficient of 0.8980, averaged over all bone classes. This work demonstrates, to the best of my knowledge for the first time, the feasibility of automatic bone segmentation and classification for PG using a convolutional neural network.
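
    The Dice coefficient reported above measures the overlap between a predicted mask and an expert-labeled mask for one class, and the abstract averages it over the bone classes. The sketch below shows the standard definition on flat label sequences. The function names, the integer label encoding and the handling of a class that is absent from both masks are assumptions made for illustration, not details taken from the thesis.

```python
def dice_coefficient(pred, truth, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) for one class over two flat label sequences."""
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    total = pred_count + truth_count
    if total == 0:
        # Assumed convention: a class missing from both masks counts as full agreement.
        return 1.0
    return 2.0 * overlap / total


def mean_dice(pred, truth, labels):
    """Average the per-class Dice coefficient over the given class labels."""
    return sum(dice_coefficient(pred, truth, label) for label in labels) / len(labels)


# Hypothetical toy masks: 0 = background, 1 and 2 = two bone classes.
predicted = [0, 1, 1, 2, 2, 2]
expert = [0, 1, 2, 2, 2, 1]
print(mean_dice(predicted, expert, labels=[1, 2]))
```

    Background is left out of the average here because the abstract averages over bone classes only; whether the thesis computes the score per image or over the whole test set is not stated.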

    The verbal transformation effect: an exploration of the perceptual organization of speech

    Six experiments investigated the influence of several grouping cues within the framework of the Verbal Transformation Effect (VTE, Experiments 1 to 4) and Phonemic Transformation Effect (PTE, Experiments 5 and 6), where listening to a repeated word (VTE) or sequence of vowels (PTE) produces verbal transformations (VTs). In Experiment 1, the influence of F0 frequency and lateralization cues (ITDs) was investigated in terms of the pattern of VTs. As the lateralization difference increased between two repeating sequences, the number of forms was significantly reduced, with the fewest forms reported in the dichotic condition. Experiment 2 explored whether or not the propensity to report more VTs on high pitch was due to the task demands of monitoring two sequences at once. The number of VTs reported was higher when listeners were asked to attend to one sequence only, suggesting smaller attentional constraints on the task requirements. In Experiment 3, consonant-vowel transitions were edited out from two sets of six stimulus words with ‘strong’ and ‘weak’ formant transitions, respectively. Listeners reported more forms in the spliced-out than in the unedited case for the strong-transition words, but not for those with weak transitions. A similar trend was observed for the F0 contour manipulation used in Experiment 4, where listeners reported more VTs and forms for words following a discontinuous F0 contour. In Experiments 5 and 6, the role of F0 frequency and ITD cues was investigated further using a related phenomenon, the PTE. Although these manipulations had relatively little effect on the number of VTs and forms reported, they did influence the particular forms heard. In summary, the current experiments confirmed that it is possible to successfully investigate auditory grouping cues within the VTE framework and that, in agreement with recent studies, the results can be attributed to the perceptual re-grouping of speech sounds.

    Cluster validity in clustering methods

    Geometric and photometric affine invariant image registration

    This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs represent connectivities by extracted contours. After matching, we refine the search for correspondences by using a maximum likelihood robust algorithm. We have evaluated the system over synthetic and real data. The method is endemic to propagation of errors introduced by approximations in the system. BAE Systems; Selex Sensors and Airborne System
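
    For a planar curve (x(t), y(t)), the textbook affine arc length is the integral of |x'y'' - x''y'|^(1/3) dt, which is unchanged by area-preserving affine maps. The sketch below approximates it on a uniformly sampled curve with central finite differences. It only illustrates the metric named in the abstract; the function name, the sampling scheme and the trapezoid integration are assumptions and do not describe the thesis's descriptor.

```python
def affine_arc_length(xs, ys, h):
    """Approximate the affine arc length of a curve sampled at uniform parameter step h.

    Derivatives use central differences, so the integral covers the interior
    samples only (the first and last samples serve as difference stencils).
    """
    integrand = []
    for i in range(1, len(xs) - 1):
        dx = (xs[i + 1] - xs[i - 1]) / (2 * h)
        dy = (ys[i + 1] - ys[i - 1]) / (2 * h)
        ddx = (xs[i + 1] - 2 * xs[i] + xs[i - 1]) / h ** 2
        ddy = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / h ** 2
        # Cube root of the absolute determinant of first and second derivatives.
        integrand.append(abs(dx * ddy - ddx * dy) ** (1.0 / 3.0))
    # Trapezoid rule over consecutive interior samples.
    return sum((a + b) * h / 2 for a, b in zip(integrand, integrand[1:]))


# Hypothetical check on the parabola y = x^2 / 2, where the integrand equals 1.
step = 0.01
ts = [i * step for i in range(201)]
print(affine_arc_length(ts, [t * t / 2 for t in ts], step))
```

    On this parabola the integrand is constant, so the result is simply the length of the interior parameter interval, which makes it a convenient sanity check.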

    Dandelion Weed Detection and Recognition for a Weed Removal Robot

    Current research in agricultural weeding automation attempts to develop accurate methods of distinguishing between crop and weed. Consequently, the use of computer vision has become a cornerstone of these endeavours. Some recent methods employ pattern recognition techniques that involve hierarchical feature groupings and generally apply some form of machine learning. Furthermore, using convolutional neural networks (CNNs), many techniques implement complex architectures that not only classify but also detect and locate objects. These detection problems generally involve datasets taken under artificial or controlled lighting conditions where foreground elements (i.e. weed and crop) are easily distinguishable from the background (usually soil) by virtue of their distinct hue and textures. Plant overlap is generally limited to being between foreground elements. The research in this thesis addresses the challenges overlooked by agricultural weeding by focusing on weeding in lawn grass with two distinct approaches. First, a pattern recognition methodology is developed to distinguish dandelion weed centers from grass using the morphological attributes of binary (black-and-white) regions. This method is tested in lab settings with both artificial weeds and grass. However, practical limitations include fragile performance in real-world applications in the field and a heavy reliance on parameter calibration. Next, a machine-learning approach is developed to address the shortcomings of the prior approach as well as to deal with the challenges specific to weeding in a domestic setting. A five-step process involving CNN structures proves successful at accurately detecting dandelion weeds within grass and other lawn vegetation. Extensive tests have been carried out on a wide array of real-world images, and the results demonstrate that the developed algorithm can detect and recognize dandelions in the grass within a reasonable range of natural lighting conditions.
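
    The first approach in the abstract relies on morphological attributes of binary regions. As a rough illustration of what such attributes can look like, the sketch below labels 4-connected foreground regions in a binary mask and reports each region's area, bounding box and extent (area divided by bounding-box area). The choice of attributes, the connectivity and the function name are assumptions; the abstract does not say which attributes the thesis uses.

```python
from collections import deque


def region_attributes(mask):
    """Return area, bounding box and extent for each 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            top, bottom, left, right = r, r, c, c
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            box_area = (bottom - top + 1) * (right - left + 1)
            regions.append({
                "area": area,
                "bbox": (top, left, bottom, right),
                "extent": area / box_area,
            })
    return regions


# Hypothetical 3x4 mask with a square blob and an L-shaped blob.
example_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
for region in region_attributes(example_mask):
    print(region)
```

    Thresholds on attributes such as these would need calibration per setting, which is consistent with the reliance on parameter calibration that the abstract names as a limitation of this approach.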