4 research outputs found

    Ontology Ranking: Finding the Right Ontologies on the Web

    Ontology search, the process of finding ontologies or ontological terms in an ontology collection for user-defined queries, is an important task that facilitates ontology reuse in ontology engineering. Ontology reuse is desired to avoid the tedious process of building an ontology from scratch and to limit the design of several competing ontologies that represent similar knowledge. Since many organisations in both the private and public sectors are publishing their data in RDF, they increasingly need to find or design ontologies for data annotation and/or integration. In general, multiple ontologies represent a given domain; therefore, finding the best matching ontologies or their terms is required to facilitate manual or dynamic ontology selection for both ontology design and data annotation. Ranking is a crucial component of the ontology retrieval process: it aims to list the ‘relevant’ ontologies or their terms as high as possible in the search results to reduce human intervention. Most existing ontology ranking techniques inherit one or more information retrieval ranking parameters. They linearly combine the values of these parameters for each ontology to compute a relevance score against a user query, and rank the results in descending order of that score. A significant aspect of achieving an effective ontology ranking model is to develop novel metrics and dynamic techniques that can optimise the relevance score of the most relevant ontology for a user query. In this thesis, we present extensive research in ontology retrieval and ranking, where several research gaps in the existing literature are identified and addressed. First, we begin the thesis with a review of the literature and propose a taxonomy of Semantic Web data (i.e., ontologies and linked data) retrieval approaches. That allows us to identify potential research directions in the field. In the remainder of the thesis, we address several of the identified shortcomings in the ontology retrieval domain. We develop a framework for the empirical and comparative evaluation of different ontology ranking solutions, which has not been studied in the literature so far. Second, we propose an effective relationship-based concept retrieval framework and a concept ranking model that uses a learning-to-rank approach, which addresses the limitation of the existing linear ranking models. Third, we propose RecOn, a framework that helps users find the best matching ontologies for a multi-keyword query. There, the relevance score of an ontology to the query is computed by formulating and solving the ontology recommendation problem as a linear and an optimisation problem. Finally, the thesis also reports on an extensive comparative evaluation of our proposed solutions against several other state-of-the-art techniques using real-world ontologies. This thesis will be useful for researchers and practitioners interested in ontology search, both for its methods and for its performance benchmarks on ranking approaches to ontology search.
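
The linear ranking scheme that the abstract above attributes to most existing techniques (a weighted sum of ranking parameters, sorted in descending order) might look roughly like the following. This is only an illustrative sketch: the feature names (`text_match`, `popularity`, `coverage`), the weights, and the candidate ontologies are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class OntologyFeatures:
    """Hypothetical per-ontology ranking parameters, each assumed to lie in [0, 1]."""
    name: str
    text_match: float   # assumed: how well labels match the query terms
    popularity: float   # assumed: how often the ontology is reused
    coverage: float     # assumed: fraction of query terms covered


# Hypothetical, hand-picked weights; a real system would tune or learn these.
WEIGHTS = {"text_match": 0.5, "popularity": 0.2, "coverage": 0.3}


def relevance_score(onto, weights=WEIGHTS):
    """Linear combination of the feature values for one ontology."""
    return sum(w * getattr(onto, feature) for feature, w in weights.items())


def rank(ontologies, weights=WEIGHTS):
    """Return ontologies sorted by descending relevance score."""
    return sorted(ontologies, key=lambda o: relevance_score(o, weights), reverse=True)


# Made-up candidates for a single query.
candidates = [
    OntologyFeatures("onto_a", text_match=0.9, popularity=0.1, coverage=0.4),
    OntologyFeatures("onto_b", text_match=0.6, popularity=0.8, coverage=0.7),
    OntologyFeatures("onto_c", text_match=0.2, popularity=0.9, coverage=0.1),
]
ranked_names = [o.name for o in rank(candidates)]
```

With fixed weights like these, the ordering depends entirely on the manual weight choice, which is the limitation that the abstract says learning-to-rank models are meant to address.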

    Automatically assembling a full census of an academic field

    The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is prohibitively expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method constructs a complete census of the field within a few minutes, and achieves over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction.
    Comment: 11 pages, 6 figures, 2 tables
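
The precision and recall figures quoted in the abstract above compare an automatically extracted roster with a trusted reference roster. A minimal sketch of that comparison is given below, assuming each faculty member can be reduced to a comparable string key. The function name and the placeholder rosters are hypothetical and do not come from the paper.

```python
def precision_recall(extracted, reference):
    """Compare an extracted roster against a hand-curated reference roster.

    Precision: share of extracted names that appear in the reference.
    Recall: share of reference names that were extracted.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder data: one false positive ("office_staff") and one missed name ("faculty_4").
reference_roster = {"faculty_1", "faculty_2", "faculty_3", "faculty_4"}
crawled_roster = {"faculty_1", "faculty_2", "faculty_3", "office_staff"}

precision, recall = precision_recall(crawled_roster, reference_roster)
```

In practice, name matching would also need normalisation (initials, accents, titles), which this sketch leaves out.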

    A structural and quantitative analysis of the web of linked data and its components to perform retrieval data

    This research consists of a quantitative and structural analysis of the Web of Linked Data, with the aim of improving data search across different sources. To obtain quantitative metrics of the Web of Linked Data, statistical techniques are applied. For the structural analysis, we perform a Social Network Analysis (SNA). To form a picture of the Web of Linked Data that can be analysed, we rely on the Linking Open Data (LOD) cloud diagram. This is an online catalogue of datasets whose information has been published using Linked Data techniques. The datasets are published in a language called Resource Description Framework (RDF), which creates links between them so that the information can be reused. The goal of obtaining a quantitative and structural analysis of the Web of Linked Data is to improve data searches. For that purpose, we make use of the Schema.org markup language and the Linked Open Vocabularies (LOV) project. Schema.org is a set of tags intended to let webmasters mark up their own web pages with microdata. Microdata is used to help search engines and other web tools better understand the information these pages contain. LOV is a catalogue for registering the vocabularies used by the datasets of the Web of Linked Data. Its goal is to provide easy access to those vocabularies. In this research, we develop a study for obtaining data from the Web of Linked Data using the sources mentioned above with “ontology matching” techniques. In our case, we first map Schema.org to LOV, and then LOV to the Web of Linked Data. An SNA of LOV has also been carried out. The goal of that analysis is to obtain a quantitative and qualitative picture of LOV. Knowing this, we can draw conclusions such as which vocabularies are the most used, or whether or not they are specialised in a particular field. These can be used to filter datasets or reuse information.

    A Taxonomy of Semantic Web Data Retrieval Techniques
