39 research outputs found

    Structural graph matching using the EM algorithm and singular value decomposition

    This paper describes an efficient algorithm for inexact graph matching. The method is purely structural, that is, it uses only the edge or connectivity structure of the graph and does not draw on node or edge attributes. We make two contributions: 1) commencing from a probability distribution for matching errors, we show how the problem of graph matching can be posed as maximum-likelihood estimation using the apparatus of the EM algorithm; and 2) we cast the recovery of correspondence matches between the graph nodes in a matrix framework. This allows one to efficiently recover correspondence matches using the singular value decomposition. We experiment with the method on both real-world and synthetic data. Here, we demonstrate that the method offers comparable performance to more computationally demanding methods.
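
The abstract above mentions recovering correspondences with the singular value decomposition. The following is a minimal sketch of one well-known way to do this, not the paper's actual procedure: take a matrix of soft match weights, replace its singular values with ones, and accept a pair when its entry is the largest in both its row and its column. The function name `svd_correspondences` and the example weights are hypothetical.

```python
import numpy as np

def svd_correspondences(weights):
    """Return (row, col) pairs that dominate both their row and their column
    once the singular values of `weights` have been replaced by ones."""
    u, _, vt = np.linalg.svd(weights, full_matrices=False)
    ortho = u @ vt  # same singular vectors, all singular values set to 1
    matches = []
    for i in range(ortho.shape[0]):
        j = int(np.argmax(ortho[i]))
        # keep the pair only if it is also the best entry in its column
        if int(np.argmax(ortho[:, j])) == i:
            matches.append((i, j))
    return matches

# Hypothetical soft match weights between three nodes of each graph
weights = np.array([[0.1, 0.9, 0.2],
                    [0.8, 0.1, 0.1],
                    [0.2, 0.1, 0.7]])
matches = svd_correspondences(weights)
```

In an EM setting, a matrix like `weights` would typically be re-estimated at every iteration; that step is omitted here.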

    Structural matching by discrete relaxation

    This paper describes a Bayesian framework for performing relational graph matching by discrete relaxation. Our basic aim is to draw on this framework to provide a comparative evaluation of a number of contrasting approaches to relational matching. Broadly speaking, there are two main aspects to this study. First, we focus on the issue of how relational inexactness may be quantified. We illustrate that several popular relational distance measures can be recovered as specific limiting cases of the Bayesian consistency measure. The second aspect of our comparison concerns the way in which structural inexactness is controlled. We investigate three different realizations of the matching process which draw on contrasting control models. The main conclusion of our study is that the active process of graph-editing outperforms the alternatives in terms of its ability to effectively control a large population of contaminating clutter.

    Pattern search in planar subdivisions (Busca de padrões em subdivisões planares)

    Advisor: André Luiz Pires Guedes. Dissertation (master's), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 2004. Includes bibliography. Abstract: Graph sub-isomorphism is a widely used approach to pattern search problems, but it is NP-complete. Research effort is therefore needed on approximate solutions, or on solutions that work for special cases of the problem. Planar subdivisions can be regarded as a special case of graphs because, in addition to vertices and edges, they carry a more rigid topology regarding the order of the edges, which gives rise to the concept of a face. This work presents a linear algorithm for pattern search in planar subdivisions. The patterns to be searched are also treated as subdivisions, so this is a sub-isomorphism problem. The algorithm is based on a hybrid representation between the dual and the region adjacency graph (RAG) to represent the patterns, so that it incurs no additional storage cost. The patterns are then searched for in the search subdivision using a region-growing algorithm. This work also presents a comparative study of the data structures most commonly used to store planar subdivisions.

    Mean field theories and differential identities for multispecies Ising models and exponential random graph models

    This work is concerned with the mean field theories (MFTs) of multispecies Ising models and various probabilistic ensembles of graphs known as exponential random graph models (ERGMs). The MFT is a universal approximation, in which the true Hamiltonian of the model is linearised by introducing self-consistent mean fields, and it turns out that the mean field self-consistency equations can be obtained as low viscosity solutions of certain viscous partial differential equations (PDEs) that arise from differential identities obeyed by the Helmholtz free energies of the so-called mean field models, whose exact thermodynamic solutions coincide with the mean field self-consistency equations. Thermodynamic equations of state are obtained for the multicomponent analogue of the Curie-Weiss (CW) model and analysed in detail for the 2-component case. This analysis largely extends the preceding works by providing a good orientation in the parameter space of the 2-component CW model, which reveals that, unlike the original CW model, the 2-component model admits critical points in nonzero fields and can exhibit any number from one to four of (meta)stable macrostates. The results are confirmed by Monte-Carlo (MC) simulations, and some applications of the model are discussed. The above discussion is followed by the mean field analysis of a particular class of ERGMs, known as homogeneous Markov random graphs. Such models are precisely described by the MFT at large sizes due to their infinite-dimensional nature, and this work provides a simple unified approach to study such models at the macroscopic level, reveals a possibility of (meta)stable macrostates of moderate connectance at arbitrarily low temperatures, and gives a general result relating the order of the interactions with the maximum number of (meta)stable macrostates. 
MC simulations show that, as expected, the theory seems to be exact for macroscopic observables of large homogeneous ERGMs, but often fails at the microscopic level or for the models of small size. The results suggest that the long-tailed distributions, common in real-world networks, cannot be fully explained by the spontaneous symmetry breaking in homogeneous ERGMs, and for this reason, the heterogeneous ERGM, based on the multicomponent CW model, is introduced and discussed
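
To make the idea of a mean field self-consistency equation concrete, here is a minimal sketch for the original single-species Curie-Weiss model only, whose standard self-consistency condition is m = tanh(β(Jm + h)). It does not reproduce the multicomponent or ERGM equations of the thesis. The function name, the parameter names (`coupling` for J, `field` for h) and the plain fixed-point iteration are assumptions made for illustration.

```python
import math

def curie_weiss_magnetization(beta, coupling, field, m0=0.5,
                              tol=1e-12, max_iter=10_000):
    """Solve m = tanh(beta * (coupling * m + field)) by fixed-point iteration.

    The result depends on the starting guess m0 when several solutions exist.
    """
    m = m0
    for _ in range(max_iter):
        m_next = math.tanh(beta * (coupling * m + field))
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m  # last iterate if the tolerance was not reached

# Zero field: below the critical temperature (beta * coupling > 1) a nonzero
# magnetization appears; above it the iteration settles at zero.
m_low_temp = curie_weiss_magnetization(beta=2.0, coupling=1.0, field=0.0)
m_high_temp = curie_weiss_magnetization(beta=0.5, coupling=1.0, field=0.0)
```

Starting from a negative `m0` at low temperature gives the mirror-image solution, a simple instance of several macrostates coexisting.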

    On-line Chinese character recognition.

    By Jian-Zhuang Liu. Thesis (Ph.D.), Chinese University of Hong Kong, 1997. Includes bibliographical references (p. 183-196). Microfiche. Ann Arbor, Mich.: UMI, 1998. 3 microfiches; 11 x 15 cm.

    XVII. Magyar Számítógépes Nyelvészeti Konferencia (17th Hungarian Conference on Computational Linguistics)

    From user-generated text to insight: context-aware measurement of social impacts and interactions using natural language processing

    Recent improvements in information and communication technologies have contributed to an increasingly globalized and connected world. The digital data that are created as the result of people's online activities and interactions consist of different types of personal and social information that can be used to extract and understand people's implicit or explicit beliefs, ideas, and biases. This thesis leverages methods and theories from natural language processing and social sciences to study and analyze the manifestations of various attributes and signals, namely social impacts, personal values, and moral traits, in user-generated texts. This work provides a comprehensive understanding of people's viewpoints, social values, and interactions and makes the following contributions. First, we present a study that combines review mining and impact assessment to provide an extensive discussion on different types of impact that information products, namely documentary films, can have on people. We first establish a novel impact taxonomy and demonstrate that, with a rigorous analysis of user-generated texts and a theoretically grounded codebook, classification schema, and prediction model, we can detect multiple types of (self-reported) impact in texts and show that people's language can help in gaining insights about their opinions, socio-cultural information, and emotional states. Furthermore, the results of our analyses show that documentary films can shift people's perceptions and cognitions regarding different societal issues, e.g., climate change, and using a combination of informative features (linguistic, syntactic, and psychological), we can predict impact in sentences with high accuracy. Second, we investigate the relationship between principles of human morality and the expression of stances in user-generated text data, namely tweets. 
More specifically, we first introduce and expand the Moral Foundations Dictionary and operationalize moral values to enhance the measurement of social effects. In addition, we provide a detailed explanation of how morality and stance are associated in user-generated texts. Through extensive analysis, we show that discussions related to various social issues have distinctive moral and lexical profiles, and leveraging moral values as an additional feature can lead to measurable improvements in prediction accuracy of stance analysis. Third, we utilize the representation of emotional and moral states in texts to study people's interactions in two different social networks. Moreover, we first expand the analysis of structural balance to include direction and multi-level balance assessment (triads, subgroups, and the whole network). Our results show that analyzing different levels of networks and using various linguistic cues can grant a more inclusive view of people and the stability of their interactions; we found that, unlike sentiments, moral statuses in discussions stay balanced throughout the networks even in the presence of tension. Overall, this thesis aims to contribute to the emerging field of "social" NLP and broadens the scope of research in it by (1) utilizing a combination of novel taxonomies, datasets, and tools to examine user-generated texts and (2) providing more comprehensive insights about human language, cultures, and experiences.

    Contribution to graph matching for content-based image retrieval (Contribution en appariement de graphes pour la recherche d'images par le contenu)

    This thesis falls within the general framework of structural pattern recognition. It focuses in particular on modeling patterns with graphs. The use of graphs is motivated by their twofold benefit: they can model all the objects of a given pattern and all the inter-object relations needed for recognition. A typical example used in this thesis is content-based image retrieval (CBIR). However, the techniques presented in this thesis have a wider scope than CBIR. Representing images as graphs calls for graph matching algorithms to compare images and detect similarity between them. Moreover, searching an image database requires reorganizing the database beforehand to make retrieval easier, which leads us to use classification techniques for images represented as graphs. First, we propose a new algorithm for matching a query graph to a model graph. The basic idea is to divide the correspondence search into several phases (K). At the end of each phase, the set of correspondences is extracted, evaluated, and finally compared with the one whose matching cost is minimal. Second, we propose a new algorithm for identifying a representative, called the Median Graph, among a set of graphs. The median graph plays a crucial role in classifying and reorganizing an image database that uses graphs to represent its content. Finally, we propose a content-based image retrieval system that uses graphs to represent image content together with the two algorithms described above. 
In general, the results presented in this thesis show the potential value of using graphs to represent patterns. These results appear to validate the choice of graphs as a replacement for classical data structures, namely vectors. Moreover, the results obtained clearly show that the algorithms developed in this thesis could play a major role as a similarity measurement tool in a space as complex as that of graphs.
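
The notion of a median graph chosen among a set of graphs can be illustrated with the usual set-median definition: the member whose summed distance to all other members is smallest. The sketch below assumes that definition and is not the thesis's algorithm. The helper names and the toy distance (number of edges present in exactly one of two graphs over shared node labels) are hypothetical stand-ins for a real graph matching cost.

```python
def edge_distance(g, h):
    """Toy distance: number of edges that appear in exactly one graph."""
    return len(g ^ h)

def set_median(graphs, distance):
    """Return (index, total) of the graph with the smallest summed distance."""
    totals = [sum(distance(g, h) for h in graphs) for g in graphs]
    best = min(range(len(graphs)), key=totals.__getitem__)
    return best, totals[best]

# Three small graphs given as sets of undirected edges (hypothetical data)
graphs = [
    frozenset({(0, 1), (1, 2)}),
    frozenset({(0, 1), (1, 2), (2, 3)}),
    frozenset({(0, 1), (2, 3)}),
]
median_index, median_total = set_median(graphs, edge_distance)
```

Here the second graph is selected, since it differs from each of the others by a single edge. The search costs a quadratic number of distance evaluations, which matters when each evaluation is itself a graph matching problem.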

    Geometric, Feature-based and Graph-based Approaches for the Structural Analysis of Protein Binding Sites : Novel Methods and Computational Analysis

    This thesis considers protein binding sites. To enable the extraction of information from the space of protein binding sites, these binding sites must be mapped onto a mathematical space. This can be done by mapping binding sites onto vectors, graphs or point clouds. To impose a structure on this mathematical space, a distance measure is required, which is introduced in this thesis. This distance measure can then be used to extract information by means of data mining techniques.