    Enrichissement de lexiques sémantiques approvisionnés par les foules : le système WISIGOTH appliqué à Wiktionary

    Semantic lexical resources are a mainstay of various NLP applications. However, comprehensive and reliable resources rarely exist or are often not freely available. In this paper we discuss the context of lexical resource building and the problems of evaluation. We present Wiktionary, a freely available and collaboratively built multilingual dictionary, and we propose a semi-automatic approach based on random walks for enriching its synonymy network, which uses endogenous and exogenous data. We then propose a validation "by crowds". Finally, we present an implementation of this system called WISIGOTH.

    Clustering Sets of Objects Using Concepts-Objects Bipartite Graphs

    In this paper we deal with data stated in the form of a binary relation between objects and properties. We propose an approach for clustering the objects and labeling them with characteristic subsets of properties. The approach is based on a parallel between formal concept analysis and graph clustering. The problem is made tricky by the fact that, in general, no partitioning of the objects can be associated with a partitioning of the properties. Indeed, a relevant partition of objects may exist even when none exists for properties. In order to obtain a conceptual clustering of the objects, we work with a bipartite graph relating objects to formal concepts. Experiments on artificial benchmarks and real examples show the effectiveness of the method, more particularly the fact that the results remain stable when an increasing number of properties are shared between objects of different clusters.

    A Parallel between Extended Formal Concept Analysis and Bipartite Graphs Analysis

    The paper offers a parallel between two approaches to conceptual clustering, namely formal concept analysis (augmented with the introduction of new operators) and bipartite graph analysis. It is shown that a formal concept (as defined in formal concept analysis) corresponds to the idea of a maximal bi-clique, while a "conceptual world" (defined through a Galois connection associated with the new operators) is a disconnected sub-graph in a bipartite graph. The parallel between formal concept analysis and bipartite graph analysis is further exploited by considering "approximation" methods on both sides. This suggests new ideas for providing simplified views of datasets.
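
    The correspondence between formal concepts and maximal bi-cliques stated in this abstract can be illustrated on a toy context. The sketch below is not taken from the paper: the context, the function names and the brute-force enumeration are assumptions chosen for readability, and the enumeration is only practical for very small inputs.

```python
from itertools import combinations

def common_properties(context, objects):
    """Properties shared by every object in `objects` (all properties if empty)."""
    result = set().union(*context.values())
    for obj in objects:
        result &= context[obj]
    return result

def objects_having(context, props):
    """Objects that possess every property in `props`."""
    return {obj for obj, ps in context.items() if set(props) <= ps}

def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    objs = list(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_properties(context, subset)
            extent = objects_having(context, intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

def is_maximal_biclique(context, extent, intent):
    """Check that extent x intent is fully connected and cannot be enlarged."""
    complete = all(intent <= context[obj] for obj in extent)
    no_more_objects = objects_having(context, intent) == set(extent)
    no_more_props = common_properties(context, extent) == set(intent)
    return complete and no_more_objects and no_more_props

# Hypothetical toy context: object -> set of properties.
context = {
    "o1": {"p1", "p2"},
    "o2": {"p1", "p2", "p3"},
    "o3": {"p3"},
}
concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent), is_maximal_biclique(context, extent, intent))
```

    In the bipartite graph whose edges link each object to its properties, every pair returned by `formal_concepts` passes the `is_maximal_biclique` check. The concept with an empty intent is a degenerate case with no edges.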

    Comparing and Fusing Terrain Network Information

    Terrain networks (or complex networks) are a type of relational information encountered in many fields. In order to properly answer questions pertaining to the comparison or merging of such networks, a method that takes into account the underlying structure of graphs is proposed. The effectiveness of the method is illustrated using real linguistic data networks and artificial networks, in particular

    Invariants and variability of synonymy networks: Self mediated agreement by confluence

    Edges of graphs that model real data can be seen as judgements on whether pairs of objects are in relation with each other or not. So, one can evaluate the similarity of two graphs with a measure of agreement between judges classifying pairs of vertices into two categories (connected or not connected). When applied to synonymy networks, such measures demonstrate a surprisingly low agreement between various resources of the same language. This seems to suggest that the judgements on synonymy of lexemes of the same lexicon radically differ from one dictionary editor to another. In fact, even a strong disagreement between edges does not necessarily mean that graphs model a completely different reality: although their edges seem to disagree, synonymy resources may, at a coarser grain level, outline similar semantics. To investigate this hypothesis, we relied on shared common properties of real-world data networks to look at the graphs at a more global level by using random walks. They enabled us to reveal a much better agreement between dense zones than between edges of synonymy graphs. These results suggest that although synonymy resources may disagree at the level of judgements on single pairs of words, they may nevertheless convey essentially similar semantic information.
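
    The idea of treating two graphs as two judges who label each pair of vertices as connected or not can be sketched with Cohen's kappa. The abstract does not name the agreement measure it uses, so kappa is an assumption here, as are the function name and the toy graphs.

```python
from itertools import combinations

def edge_agreement_kappa(vertices, edges_a, edges_b):
    """Cohen's kappa between two undirected graphs over the same vertex set.

    Each graph is a judge labelling every unordered vertex pair as
    connected or not connected. Requires at least two vertices.
    """
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    pairs = [frozenset(p) for p in combinations(sorted(vertices), 2)]
    n = len(pairs)

    # Observed agreement: pairs on which both graphs give the same label.
    same = sum(1 for p in pairs if (p in a) == (p in b))
    observed = same / n

    # Agreement expected by chance, given each graph's edge density.
    density_a = sum(1 for p in pairs if p in a) / n
    density_b = sum(1 for p in pairs if p in b) / n
    expected = density_a * density_b + (1 - density_a) * (1 - density_b)

    if expected == 1.0:
        # Both graphs are empty or both are complete: agreement is total.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical toy graphs on four vertices.
vertices = ["a", "b", "c", "d"]
graph_1 = [("a", "b"), ("c", "d")]
graph_2 = [("a", "c"), ("b", "d")]
print(edge_agreement_kappa(vertices, graph_1, graph_1))
print(edge_agreement_kappa(vertices, graph_1, graph_2))
```

    A pairwise measure of this kind only compares individual edges, which is the level at which the abstract reports low agreement. It says nothing about whether the two graphs share the same dense zones.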

    Mesurer la similarité structurelle entre réseaux lexicaux

    In this paper, we compare the topological structure of lexical networks with a method based on random walks. Instead of characterising pairs of vertices according only to whether they are connected or not, we measure their structural proximity by evaluating the relative probability of reaching one vertex from the other via a short random walk. This proximity between vertices is the basis on which we can compare the topological structure of lexical networks, because it outlines the similar dense zones of the graphs.
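
    The notion of proximity through a short random walk can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it uses a plain uniform walk on an undirected graph and returns raw probabilities, without whatever normalisation the authors apply to obtain a "relative" probability. The graph and the function names are hypothetical.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def walk_distribution(adj, start, steps):
    """Probability of standing on each vertex after `steps` uniform steps from `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, prob in dist.items():
            # The walker picks one neighbour uniformly at random.
            share = prob / len(adj[node])
            for neighbour in adj[node]:
                nxt[neighbour] += share
        dist = dict(nxt)
    return dist

# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
adj = build_adjacency(edges)
dist = walk_distribution(adj, "a", 2)
for vertex in sorted(dist):
    print(vertex, round(dist[vertex], 3))
```

    After two steps from `a`, the probability mass stays mostly inside the first triangle, so `b` is closer to `a` than `e` is, even though neither relation could be read off a single edge test for `a` and `e`.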

    Nueva población de Cynara tournefortii Boiss. & Reut. (Compositae) en Andalucía (S. España)

    New record for Cynara tournefortii Boiss. & Reut. (Compositae) in Andalusia (S. Spain). Key words: Cynara tournefortii, Compositae, chorology, conservation, S. Spain.

    Application of remote sensing techniques to the study of internal waves in the Strait of Gibraltar

    The generation and propagation of internal waves is one of the most interesting oceanographic processes in the Strait of Gibraltar. In this paper, radar (ASAR) and ocean colour images (MODIS and MERIS) are used to characterize this phenomenon. Processing instantaneous colour images allowed the relationship between the physical processes of the internal waves and their biological implications to be analysed. During internal wave generation, MODIS and MERIS images show chlorophyll maximum structures in the coastal areas of Camarinal Sill. When the waves are located in the Alborán Sea, the colour images show chlorophyll maxima associated with the wave fronts. The results seem to indicate that a suction of coastal water takes place during internal wave generation and that this chlorophyll-rich water enters the Alborán Sea travelling along with the internal waves. Peer reviewed.

    Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary

    The lack of large-scale, freely available and durable lexical resources, and its consequences for NLP, are widely acknowledged, but attempts to cope with the usual bottlenecks preventing their development often result in dead ends. This article introduces a language-independent, semi-automatic and endogenous method for enriching lexical resources, based on collaborative editing and random walks through existing lexical relationships, and shows how this approach enables us to overcome recurrent impediments. It compares the impact of using different data sources and similarity measures on the task of improving synonymy networks. Finally, it defines an architecture for applying the presented method to Wiktionary and explains how it has been implemented.