46 research outputs found

    Fast marching over the 2D Gabor magnitude domain for tongue body segmentation

    Author name used in this publication: David Zhang. Version of Record. Published

    Visual Speech Recognition

    In recent years, visual speech recognition has received more attention from researchers than in the past. We began working in this field because of the lack of visual processing research on recognizing Arabic vocabulary. Audio speech recognition is concerned with the acoustic characteristics of the signal, but there are many situations in which the audio signal is weak or does not exist, a point discussed in Chapter 2. The visual recognition process focuses on features extracted from video of the speaker, which are then classified using several techniques. The most important feature to extract is motion: by segmenting the motion of the speaker's lips, an algorithm can process it in such a way as to recognize the spoken word. Motion segmentation is not the only problem facing the speech recognition process, however. Segmenting the lips themselves is an earlier step, so to segment lip motion we must first segment the lips, and a new approach for lip segmentation is proposed in this thesis. Sometimes the motion feature needs another feature to support recognition of the spoken word. This thesis therefore proposes a second new algorithm that performs motion segmentation using the Abstract Difference Image computed from an image series, supported by correlation for registering the images in the series, to recognize ten words in Arabic, the numbers from “one” to “ten”. The algorithm also uses the Hu-invariant set of features to describe the Abstract Difference Image, and uses three different recognition methods to recognize the words. The CLAHE method is used as a filtering technique to handle lighting problems. Our algorithm, which extracts difference details from a series of images to recognize the word, achieved an overall result of 55.8%, an adequate result for our algorithm when integrated into an audio-visual system.
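The difference-image and Hu-feature steps mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the "Abstract Difference Image" is an accumulated absolute difference between consecutive frames, which the abstract does not confirm. The function names `accumulated_difference` and `hu_first_two` are hypothetical, only the first two of the seven Hu invariants are computed, and the registration, CLAHE, and classification stages are not shown.

```python
def accumulated_difference(frames):
    """Sum of absolute differences between consecutive frames.

    Each frame is a list of rows of grey levels. This is an assumed
    reading of the abstract's difference image, not the thesis code.
    """
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                acc[y][x] += abs(curr[y][x] - prev[y][x])
    return acc


def hu_first_two(image):
    """First two Hu invariants from normalized central moments."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        return 0.0, 0.0  # no motion energy, so the moments are undefined
    # Intensity-weighted centroid of the image.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized by m00 ** (1 + (p + q) / 2).
        mu = sum((x - cx) ** p * (y - cy) ** q * v
                 for y, row in enumerate(image) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```

Because the moments are taken about the centroid, the two values do not change when the same motion pattern appears at a different position in the frame, which is one reason such features suit a mouth region whose location varies between recordings.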

    Visual Tracking of Instruments in Minimally Invasive Surgery

    Reducing access trauma has been a focal point for modern surgery, and tackling the challenges that arise from new operating techniques and instruments is an exciting and open area of research. Lack of awareness and control from indirect manipulation and visualization has created a need to augment the surgeon's understanding and perception of how their instruments interact with the patient's anatomy, but current methods of achieving this are inaccurate and difficult to integrate into the surgical workflow. Visual methods have the potential to recover the position and orientation of the instruments directly in the reference frame of the observing camera, without the need to introduce additional hardware to the operating room or perform complex calibration steps. This thesis explores how this problem can be solved by fusing coarse region features and fine-scale point features to recover both the rigid and articulated degrees of freedom of laparoscopic and robotic instruments using only images provided by the surgical camera. Extensive experiments on different image features are used to determine suitable representations for reliable and robust pose estimation. Using this information, a novel framework is presented which estimates 3D pose with a region matching scheme while using frame-to-frame optical flow to account for challenges due to symmetry in the instrument design. The kinematic structure of articulated robotic instruments is also used to track the movement of the head and claspers. The robustness of this method was evaluated on calibrated ex-vivo images and in-vivo sequences, and comparative studies were performed with state-of-the-art kinematic-assisted tracking methods.

    Multiscale Centerline Extraction Based on Regression and Projection onto the Set of Elongated Structures

    Automatically extracting linear structures from images is a fundamental low-level vision problem with numerous applications in different domains. Centerline detection and radial estimation are the first crucial steps in most Computer Vision pipelines aiming to reconstruct linear structures. Existing techniques rely either on hand-crafted filters, designed to respond to ideal profiles of the linear structure, or on classification-based approaches, which automatically learn to detect centerline points from data. Hand-crafted methods are the most accurate when the content of the image fulfills the ideal model they rely on. However, they lose accuracy in the presence of noise or when the linear structures are irregular and deviate from the ideal case. Machine learning techniques can alleviate this problem. However, they are mainly based on a classification framework. In this thesis, we show that classification is not the best formalism to solve the centerline detection problem. In fact, since the appearance of a centerline point is very similar to the points immediately next to it, the output of a classifier trained to detect centerlines presents low localization accuracy and double responses on the body of the linear structure. To solve this problem, we propose a regression-based formulation for centerline detection. We rely on the distance transform of the centerlines to automatically learn a function whose local maxima correspond to centerline points. The output of our method can be used to directly estimate the location of the centerline, by a simple Non-Maximum Suppression operation, or it can be used as input to a tracing pipeline to reconstruct the graph of the linear structure. In both cases, our method gives more accurate results than state-of-the-art techniques on challenging 2D and 3D datasets. Our method relies on features extracted by means of convolutional filters. 
    In order to process large amounts of data efficiently, we introduce a general filter bank approximation scheme. In particular, we show that a generic filter bank can be approximated by a linear combination of a smaller set of separable filters. Thanks to this method, we can greatly reduce the computation time of the convolutions without loss of accuracy. Our approach is general, and we demonstrate its effectiveness by applying it to different Computer Vision problems, such as linear structure detection and image classification with Convolutional Neural Networks. We further improve our regression-based method for centerline detection by taking advantage of contextual image information. We adopt a multiscale iterative regression approach to efficiently include a large image context in our algorithm. Compared to previous approaches, we use context both in the spatial domain and in the radial one. In this way, our method is also able to return an accurate estimate of the radii of the linear structures. The idea of using regression can also be beneficial for solving other related Computer Vision problems. For example, we show an improvement over previous work when applying it to boundary and membrane detection. Finally, we focus on the particular geometric properties of the linear structures. We observe that most methods for detecting them treat each pixel independently and do not model the strong relation that exists between neighboring pixels. As a consequence, their output is geometrically inconsistent. In this thesis, we address this problem by considering the projection of the score map returned by our regressor onto the set of all geometrically admissible ground truth images. We propose an efficient patch-wise approximation scheme to compute the projection. Moreover, we provide conditions under which the projection is exact. We demonstrate the advantage of our method by applying it to four different problems.
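The regression formulation in this abstract (a learned function whose local maxima mark centerline points, followed by Non-Maximum Suppression) can be illustrated on a 1D profile. This is a minimal sketch under stated assumptions: the linear decay of the target with distance, the cutoff `d_max`, the threshold, and both function names are hypothetical choices for illustration, and the learned regressor itself is replaced by its ideal target.

```python
def regression_target(length, centerline, d_max=3):
    """Score that is 1 at centerline points and decays linearly to 0.

    `centerline` is a non-empty list of indices. The linear decay is an
    assumed shape; the abstract only says the target is derived from the
    distance transform of the centerlines.
    """
    target = []
    for x in range(length):
        dist = min(abs(x - c) for c in centerline)  # 1D distance transform
        target.append(max(0.0, 1.0 - dist / d_max))
    return target


def non_maximum_suppression(scores, threshold=0.5):
    """Keep indices at or above threshold that are local maxima.

    On a flat plateau only the first index is kept.
    """
    peaks = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if s >= threshold and s > left and s >= right:
            peaks.append(i)
    return peaks
```

For example, `non_maximum_suppression(regression_target(14, [3, 10]))` recovers the two centerline positions, because the target has a single sharp maximum at each of them. A classifier that scores every point on the body of the structure equally high gives no such maximum, which is the localization problem the abstract describes.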

    Upper airways segmentation using principal curvatures

    This dissertation proposes a new approach to segmenting the upper airways. The approach extracts curvilinear structures, in both 2D and 3D images, based on principal curvatures. Among the main novelties is a new stopping criterion for the propagation of the contrast enhancement algorithm (a multiscale top-hat morphological operator). The same stopping criterion is also used to stop the anisotropic diffusion algorithms. In addition, a new criterion is proposed for selecting the principal curvatures that make up the curvilinear structures, based on the criteria proposed by Steger, Deng et al., and Armande et al. Furthermore, a new non-maximum suppression algorithm is proposed that reduces discontinuities along the borders of curvilinear structures. To extract the edges of the curvilinear structures, a linking algorithm is used that includes a new distance criterion to reduce the appearance of gaps in the final structure. Finally, based on the results obtained, a morphological algorithm is used to close the gaps, and a region growing algorithm is applied to obtain the final upper airways segmentation.
    Doctor of Engineering (Doctor en Ingeniería). Doctorate
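As a rough illustration of what principal curvatures mean for an image, the sketch below approximates them at one pixel by the eigenvalues of the finite-difference Hessian of the intensity. This is a common approximation and an assumption here, not the selection criterion proposed in the dissertation. The function names `hessian_at` and `principal_curvatures` are hypothetical, and no smoothing or scale selection is applied.

```python
import math


def hessian_at(image, y, x):
    """Second derivatives of intensity at an interior pixel (central differences)."""
    ixx = image[y][x + 1] - 2 * image[y][x] + image[y][x - 1]
    iyy = image[y + 1][x] - 2 * image[y][x] + image[y - 1][x]
    ixy = (image[y + 1][x + 1] - image[y + 1][x - 1]
           - image[y - 1][x + 1] + image[y - 1][x - 1]) / 4.0
    return ixx, ixy, iyy


def principal_curvatures(ixx, ixy, iyy):
    """Eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]], smallest first."""
    mean = (ixx + iyy) / 2.0
    radius = math.hypot((ixx - iyy) / 2.0, ixy)
    return mean - radius, mean + radius
```

On a bright vertical ridge, one eigenvalue is strongly negative (across the ridge) and the other is close to zero (along it). That contrast between the two values is what curvature-based detectors of curvilinear structures look for.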