1,346 research outputs found

    Automatic Normalization and Annotation for Discovering Semantic Mappings

    Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names), it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a “meaning” with schema labels. However, the accuracy of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and word abbreviations. In this work, we address this problem by proposing a method for schema label normalization that increases the number of comparable labels. Unlike other solutions, the method semi-automatically expands abbreviations and annotates compound terms with minimal manual effort. We empirically show that our normalization method helps in identifying similarities among schema elements of different data sources, thus improving schema matching accuracy.

    Uncertainty in data integration systems: automatic generation of probabilistic relationships

    This paper proposes a method for the automatic discovery of probabilistic relationships in the environment of data integration systems. Dynamic data integration systems extend the architecture of current data integration systems by modeling uncertainty at their core. Our method is based on probabilistic word sense disambiguation (PWSD), which makes it possible to automatically perform lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) of the schemata of a given set of data sources to be integrated. From the annotated schemata and the relationships defined in the thesaurus, we derive the probabilistic lexical relationships among schema elements. Lexical relationships, as well as structural relationships, are collected in the Probabilistic Common Thesaurus (PCT).

    Robust Subgraph Generation Improves Abstract Meaning Representation Parsing

    The Abstract Meaning Representation (AMR) is a representation for open-domain rich semantics, with potential use in fields like event extraction and machine translation. Node generation, typically done using a simple dictionary lookup, is currently an important limiting factor in AMR parsing. We propose a small set of actions that derive AMR subgraphs by transformations on spans of text, which allows for more robust learning of this stage. Our set of construction actions generalizes better than the previous approach, and can be learned with a simple classifier. We improve on the previous state-of-the-art result for AMR parsing, boosting end-to-end performance by 3 F1 on both the LDC2013E117 and LDC2014T12 datasets. Comment: To appear in ACL 201

    Melis: an incremental method for the lexical annotation of domain ontologies

    In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is the incremental process: the more schemas are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the system performs when annotating new schemas. MELIS has been tested as a component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of MELIS as a standalone tool and as a component integrated in MOMIS.

    Semantic Web Services Provisioning

    Semantic Web Services constitute an important research area, where various underlying frameworks, such as WSMO and OWL-S, define Semantic Web ontologies to describe Web services, so they can be automatically discovered, composed, and invoked. Service discovery has been traditionally interpreted as a functional filter in current Semantic Web Services frameworks, frequently performed by Description Logics reasoners. However, semantic provisioning has to be performed taking Quality-of-Service (QOS) into account, defining user preferences that enable QOS-aware Semantic Web Service selection. Nowadays, the research focus is on QOS-aware processes, so current proposals are developing the field by providing QOS support to semantic provisioning, especially in selection processes. These processes lead to optimization problems, where the best service among a set of services has to be selected, so Description Logics cannot be used in this context. Furthermore, user preferences have to be semantically defined so they can be used within selection processes. There are several proposals that extend Semantic Web Services frameworks allowing QOS-aware semantic provisioning. However, the proposed selection techniques are tightly coupled with their proposed extensions, most of them being implemented ad hoc. Thus, there is a semantic gap between functional descriptions (usually using WSMO or OWL-S) and user preferences, which are specific to each proposal, using different ontologies or even non-semantic descriptions, and depending on its corresponding ad hoc selection technique. In this report, we give an overview of the most important Semantic Web Services frameworks, showing a comparison between them. Then, a thorough analysis of state-of-the-art proposals on QOS-aware semantic provisioning and user preference descriptions is presented, discussing their applicability, advantages, and defects. Results from this analysis motivate our research work, which has already materialized in two early contributions.

    Toward automatic extraction of expressive elements from motion pictures: tempo

    This paper addresses the challenge of bridging the semantic gap that exists between the simplicity of features that can currently be computed in automated content indexing systems and the richness of semantics in user queries posed for media search and retrieval. It proposes a unique computational approach to the extraction of expressive elements of motion pictures for deriving high-level semantics of the stories portrayed, thus enabling rich video annotation and interpretation. The approach is motivated and directed by the existing cinematic conventions known as film grammar. As a first step toward demonstrating its effectiveness, it uses the attributes of motion and shot length to define and compute a novel measure of the tempo of a movie. Tempo flow plots are defined and derived for a number of full-length movies, and edge analysis is performed, leading to the extraction of dramatic story sections and events signaled by their unique tempo. The results confirm tempo as a useful high-level semantic construct in its own right and a promising component of others such as the rhythm, tone or mood of a film. In addition to the development of this computable tempo measure, a study is conducted on the usefulness of biasing it toward either of its constituents, namely, motion or shot length. Finally, a refinement is made to the shot length normalizing mechanism, driven by the peculiar characteristics of the shot length distribution exhibited by movies. Results of these additional studies, and possible applications and limitations, are discussed.

    Recovering Grammar Relationships for the Java Language Specification

    Grammar convergence is a method that helps discover relationships between different grammars of the same language or of different language versions. The key element of the method is the operational, transformation-based representation of those relationships. Given input grammars for convergence, they are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study, where we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. This method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort, which was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort becomes reproducible and transparent.

    Overview of BioASQ 2023: The eleventh BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

    This is an overview of the eleventh edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2023. BioASQ is a series of international challenges promoting advances in large-scale biomedical semantic indexing and question answering. This year, BioASQ consisted of new editions of the two established tasks b and Synergy, and a new task (MedProcNER) on semantic annotation of clinical content in Spanish with medical procedures, which have a critical role in medical practice. In this edition of BioASQ, 28 competing teams submitted the results of more than 150 distinct systems in total for the three different shared tasks of the challenge. As in previous editions, most of the participating systems achieved competitive performance, suggesting the continuous advancement of the state of the art in the field. Comment: 24 pages, 12 tables, 3 figures. CLEF2023. arXiv admin note: text overlap with arXiv:2210.0685

    An Intelligent Online Shopping Guide Based On Product Review Mining

    This position paper describes ongoing work on a novel recommendation framework for assisting online shoppers in choosing the most desired products, in accordance with requirements input in natural language. Existing feature-based Shopping Guidance Systems fail when the customer lacks domain expertise. This framework enables the customer to use natural language in the query text to retrieve preferred products interactively. In addition, it is intelligent enough to allow a customer to use objective and subjective terms when querying, or even the purpose of the purchase, to filter for the expected products.