20 research outputs found

    The research for shape-based visual recognition of object categories

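    Abstract: Visual object category recognition aims to identify objects of a particular category in images, and shape-based object category recognition is currently one of the hot topics in computer vision research. The diversity of object poses in real images and the complexity of the environment pose great challenges for shape extraction and recognition. Drawing on research findings about biological visual mechanisms, this thesis studies algorithms for shape-based object category recognition. The main contributions are as follows: 1. It studies the visual mechanisms related to shape cognition and analyzes the physiological basis of the holistic nature of shape perception and its physiological models. Building on this holistic nature, it establishes a framework for a shape-based object category recognition system. The framework stresses both the role of holism in bottom-up feature processing and the role of holistic constraints in top-down recognition. 2. Inspired by the integration-field model of biological vision, the thesis proposes a three-stage contour detection algorithm. Stage 1 uses a structure-adaptive filter to smooth... Categorical object detection addresses determining the number of instances of a particular object category in an image, and localizing those instances in space and scale. The shape-based visual recognition of object categories is one of the hot topics in computer vision. The diversity of target poses and the complexity of the environment in real images pose huge challenges to shape extraction and obj... Degree: Doctor of Engineering. Department and major: School of Information Science and Technology, Department of Automation, Control Theory and Control Engineering. Student ID: 2322006015337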

    Untethered human motion recognition for a multimodal interface

    Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (p. 55-58). This thesis used machine learning techniques to extract useful information about human body articulations. First, it presents a learning approach to model non-linear constraints; a support vector classifier is trained from motion capture data to model the boundary of the space of valid poses. Next, it proposes a system that incorporates body tracking and gesture recognition for an untethered human-computer interface. The detection step utilizes an SVM to identify periods of gesture activity. The classification step uses gesture-specific Hidden Markov Models (HMMs) to determine which gesture was performed at any time period, and to extract the parameters of those gestures. Several experiments were performed to verify the effectiveness of these techniques with encouraging results. by Teresa H. Ko. M.Eng.
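
The classification step described in this abstract, choosing the gesture whose HMM best explains an observation sequence, can be sketched roughly as follows. This is only an illustration under stated assumptions: observations are discrete symbols, and the two models (`wave`, `point`) and all their probabilities are hypothetical values set by hand. The abstract does not give the thesis's actual features, parameters, or training procedure.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the scaled forward algorithm."""
    n = len(start)
    alpha = list(start)  # state distribution before the first observation
    log_lik = 0.0
    for t, symbol in enumerate(obs):
        if t > 0:
            # Propagate the state distribution through the transition matrix.
            alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) for s in range(n)]
        # Weight each state by how likely it is to emit the observed symbol.
        alpha = [alpha[s] * emit[s][symbol] for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik


def classify(obs, models):
    """Pick the gesture whose model assigns the highest likelihood to obs."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical models as (start, transition, emission) tuples.
# Assumed symbols: 0 = hand left, 1 = hand right, 2 = hand forward.
MODELS = {
    "wave": (
        [0.5, 0.5],
        [[0.2, 0.8], [0.8, 0.2]],
        [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]],
    ),
    "point": (
        [1.0, 0.0],
        [[0.5, 0.5], [0.0, 1.0]],
        [[0.4, 0.4, 0.2], [0.05, 0.05, 0.9]],
    ),
}
```

Calling `classify(sequence, MODELS)` returns the name of the best-scoring model. Working in log space with per-step rescaling keeps long sequences from underflowing.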

    Fast, Accurate and Consistent Modeling of Drainage and Surrounding Terrain

    We propose an automated approach to modeling drainage channels and, more generally, linear features that lie on the terrain, from multiple images. It produces models of the features and of the surrounding terrain that are accurate and consistent, and it requires only minimal human intervention. We take advantage of geometric constraints and photometric knowledge. First, rivers flow downhill and lie at the bottom of valleys whose floors tend to be either V- or U-shaped. Second, the drainage pattern appears in gray-level images as a network of linear features that can be visually detected. Many approaches have explored individual facets of this problem. Ours unifies these elements in a common framework. We accurately model terrain and features as 3-dimensional objects from several information sources that may be in error and inconsistent with one another. This approach allows us to generate models that are faithful to sensor data, internally consistent and consistent with physical constraints. We have proposed generic models that have been applied to the specific task at hand. We show that the constraints can be expressed in a computationally effective way and, therefore, enforced while initializing the models and then fitting them to the data. Furthermore, these techniques are general enough to work on other features that are constrained by predictable forces.
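
The first constraint in this abstract, that rivers flow downhill, can be illustrated on a single elevation profile sampled along a channel. The sketch below is not the authors' formulation. It swaps in the pool-adjacent-violators algorithm, which computes the least-squares closest non-increasing profile, and `project_downhill` is a hypothetical name.

```python
def project_downhill(elevations):
    """Least-squares projection of a profile onto non-increasing sequences.

    Uses the pool-adjacent-violators algorithm: whenever a downstream
    sample is higher than the upstream block before it, the two are
    merged and replaced by their mean.
    """
    blocks = []  # each block is [sum_of_values, count]
    for z in elevations:
        blocks.append([z, 1])
        # Merge while the previous block's mean is lower than the last one's,
        # which would mean water flowing uphill.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    profile = []
    for total, count in blocks:
        profile.extend([total / count] * count)
    return profile
```

For example, `project_downhill([5, 4, 4.5, 3])` flattens the uphill step between the second and third samples to their common mean and leaves the other samples unchanged.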

    Recalage géométrique avec plusieurs prototypes

    Projet SYNTIM. We describe a general-purpose method for the accurate and robust interpretation of a data set of p-dimensional points by several deformable prototypes. This method is based on the fusion of two algorithms: a Generalization of the Iterative Closest Point (GICP) to different types of deformations for registration purposes, and a fuzzy clustering algorithm (FCM) to handle several prototypes. Our method always converges monotonically to the nearest local minimum of a mean-square distance metric, and experiments show that the convergence is fast during the first few iterations. Therefore, we propose a scheme for choosing the initial solution so that the method converges to an "interesting" local minimum. The method presented is very generic and can be applied to shapes or objects in a p-dimensional space; to many shape patterns such as polyhedra, quadrics, polynomial functions and snakes; and to many possible shape deformations such as rigid displacements, similitudes, affine and homographic transforms. Consequently, our method has important applications in registration with an ideal model prior to shape inspection, i.e. to interpret 2D or 3D sensed data obtained from calibrated or uncalibrated sensors. Experimental results illustrate some capabilities of our method.
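
The alternation described in this abstract, closest-point matching combined with fuzzy memberships over several prototypes, can be sketched roughly as follows. This is a heavily simplified illustration, not the authors' GICP: it assumes 2D points, prototypes given as finite point sets, and translation as the only allowed deformation. The name `fit_prototypes` and its parameters are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def fit_prototypes(data, prototypes, iterations=20, m=2.0, eps=1e-12):
    """Alternate closest-point matching, fuzzy memberships and translation updates.

    Returns (shifts, memberships): one translation per prototype and, for
    each data point, its membership in each prototype as computed in the
    last iteration (before that iteration's translation update).
    """
    shifts = [(0.0, 0.0) for _ in prototypes]
    memberships = []
    for _ in range(iterations):
        # Apply the current translation to every prototype.
        moved = [[(x + tx, y + ty) for x, y in proto]
                 for proto, (tx, ty) in zip(prototypes, shifts)]
        # ICP-style step: closest point of each prototype to each data point.
        matches = [[min(shape, key=lambda q: sq_dist(p, q)) for shape in moved]
                   for p in data]
        # FCM-style step: memberships from inverse squared distances.
        memberships = []
        for p, row in zip(data, matches):
            inv = [(1.0 / (sq_dist(p, q) + eps)) ** (1.0 / (m - 1.0)) for q in row]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Weighted least-squares translation update for each prototype.
        for k in range(len(prototypes)):
            weights = [u[k] ** m for u in memberships]
            wsum = sum(weights)
            dx = sum(w * (p[0] - row[k][0])
                     for w, p, row in zip(weights, data, matches)) / wsum
            dy = sum(w * (p[1] - row[k][1])
                     for w, p, row in zip(weights, data, matches)) / wsum
            shifts[k] = (shifts[k][0] + dx, shifts[k][1] + dy)
    return shifts, memberships
```

Like the method in the abstract, this sketch only reaches a nearby local minimum, so the starting position of each prototype matters.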

    Facial soft tissue segmentation

    The importance of the face for socio-ecological interaction places high demands on any surgical intervention on the facial musculo-skeletal system. Bones and soft-tissues are of major importance for any facial surgical treatment to guarantee an optimal functional and aesthetic result. For this reason, surgeons want to pre-operatively plan, simulate and predict the outcome of the surgery, allowing for shorter operation times and improved quality. Accurate simulation requires exact segmentation of the facial tissues. Thus semi-automatic segmentation techniques are required. This thesis proposes semi-automatic methods for segmentation of the facial soft-tissues, such as muscles, skin and fat, from CT and MRI datasets, using a Markov Random Fields (MRF) framework. Due to image noise, artifacts, weak edges and multiple objects of similar appearance in close proximity, it is difficult to segment the object of interest by using image information alone. Segmentations would leak at weak edges into neighboring structures that have a similar intensity profile. To overcome this problem, additional shape knowledge is incorporated in the energy function, which can then be minimized using Graph-Cuts (GC). Incremental approaches that incorporate additional prior shape knowledge are presented. The proposed approaches are not object specific and can be applied to segment any class of objects, whether anatomical or non-anatomical, from medical or non-medical image datasets, whenever a statistical model is present. In the first approach a 3D mean shape template is used as shape prior, which is integrated into the MRF based energy function. Here, the shape knowledge is encoded into the data and smoothness terms of the energy function, which constrains the segmented parts to a reasonable shape.
    In the second approach, to improve handling of shape variations naturally found in the population, the fixed shape template is replaced by a more robust 3D statistical shape model based on Probabilistic Principal Component Analysis (PPCA). The advantage of using Probabilistic PCA is that it allows reconstructing the optimal shape and computing the remaining variance of the statistical model from partial information. By using an iterative method, the statistical shape model is then refined using image based cues to get a better fit of the statistical model to the patient's muscle anatomy. These image cues are based on the segmented muscle, edge information and intensity likelihood of the muscle. Here, a linear shape update mechanism is used to fit the statistical model to the image based cues. In the third approach, the shape refinement step is further improved by using a non-linear shape update mechanism in which each vertex of the 3D mesh of the statistical model incurs a non-linear penalty depending on the remaining variability of that vertex. The non-linear shape update mechanism provides a more accurate shape update and helps in a finer shape fitting of the statistical model to the image based cues in areas where the shape variability is high. Finally, a unified approach is presented to segment the relevant facial muscles and the remaining facial soft-tissues (skin and fat). One soft-tissue layer is removed at a time: first the head is separated from the non-head regions, followed by the skin. In the next step, bones are removed from the dataset, followed by the separation of the brain and non-brain regions as well as the removal of air cavities. Afterwards, facial fat is segmented using the standard Graph-Cuts approach. After separating the important anatomical structures, finally, a 3D fixed shape template mesh of the facial muscles is used to segment the relevant facial muscles. The proposed methods are tested on the challenging example of segmenting the masseter muscle.
    The datasets were noisy, with almost all possessing mild to severe imaging artifacts such as high-density artifacts caused by e.g. dental fillings and dental implants. Qualitative and quantitative experimental results show that, by incorporating prior shape knowledge, leaking can be effectively constrained to obtain better segmentation results.
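
The idea of adding a shape term to an MRF energy so that a segmentation does not leak into a neighboring structure of similar intensity can be shown on a toy 1D example. The sketch below is not the thesis's method. It assumes a binary labeling of a chain of samples, hypothetical weights and intensity means, and a per-sample prior probability of foreground standing in for the shape template. It also swaps Graph-Cuts for dynamic programming, which finds an exact minimizer on a chain.

```python
def segment_chain(intensity, prior, fg_mean=1.0, bg_mean=0.0,
                  smooth=0.5, shape_weight=1.0):
    """Minimise data + shape + smoothness energy over binary labels on a chain."""

    def unary(i, label):
        mean = fg_mean if label else bg_mean
        data = (intensity[i] - mean) ** 2
        # Shape term: penalise labels that disagree with the prior.
        shape = shape_weight * ((1.0 - prior[i]) if label else prior[i])
        return data + shape

    n = len(intensity)
    if n == 0:
        return []
    cost = [unary(0, 0), unary(0, 1)]
    back = []  # back[i - 1][label] = best previous label for sample i
    for i in range(1, n):
        new_cost, choices = [], []
        for label in (0, 1):
            # Switching labels between neighbours costs `smooth`.
            candidates = [cost[prev] + (smooth if prev != label else 0.0)
                          for prev in (0, 1)]
            best = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best] + unary(i, label))
            choices.append(best)
        cost = new_cost
        back.append(choices)
    # Trace the optimal labeling back from the last sample.
    label = 0 if cost[0] <= cost[1] else 1
    labels = [label]
    for choices in reversed(back):
        label = choices[label]
        labels.append(label)
    labels.reverse()
    return labels
```

With `shape_weight=0.0` the labeling follows intensity alone and spreads over every bright sample. A sufficiently large `shape_weight` confines the foreground to the region the prior supports.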