
    Pragmatic Ontology Evolution: Reconciling User Requirements and Application Performance

    Increasingly, organizations are adopting ontologies to describe their large catalogues of items. These ontologies need to evolve regularly in response to changes in the domain and the emergence of new requirements. An important step of this process is the selection of candidate concepts to include in the new version of the ontology. This operation needs to take into account a variety of factors and, in particular, reconcile user requirements and application performance. Current ontology evolution methods focus either on ranking concepts according to their relevance or on preserving compatibility with existing applications. However, they do not take into consideration the impact of the ontology evolution process on the performance of computational tasks (in this work we focus on instance tagging, similarity computation, generation of recommendations, and data clustering). In this paper, we propose the Pragmatic Ontology Evolution (POE) framework, a novel approach for selecting from a group of candidates a set of concepts able to produce a new version of a given ontology that i) is consistent with a set of user requirements (e.g., max number of concepts in the ontology), ii) is parametrised with respect to a number of dimensions (e.g., topological considerations), and iii) effectively supports relevant computational tasks. Our approach also supports users in navigating the space of possible solutions by showing how certain choices, such as limiting the number of concepts or privileging trendy concepts rather than historical ones, would affect application performance. An evaluation of POE on the real-world scenario of the evolving Springer Nature taxonomy for editorial classification yielded excellent results, demonstrating a significant improvement over alternative approaches.

    The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas

    Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 26K topics and 226K semantic relationships. It was created by applying the Klink-2 algorithm to a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO, we have developed the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. Users can use the portal to rate topics and relationships, suggest missing relationships, and visualise sections of the ontology. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various communities engaged with scholarly data.

    Forecasting the Spreading of Technologies in Research Communities

    Technologies such as algorithms, applications and formats are an important part of the knowledge produced and reused in the research process. Typically, a technology is expected to originate in the context of a research area and then spread and contribute to several other fields. For example, Semantic Web technologies have been successfully adopted by a variety of fields, e.g., Information Retrieval, Human-Computer Interaction, Biology, and many others. Unfortunately, the spreading of technologies across research areas may be a slow and inefficient process, since it is easy for researchers to be unaware of potentially relevant solutions produced by other research communities. In this paper, we hypothesise that it is possible to learn typical technology propagation patterns from historical data and to exploit this knowledge i) to anticipate where a technology may be adopted next and ii) to alert relevant stakeholders about emerging and relevant technologies in other fields. To do so, we propose the Technology-Topic Framework, a novel approach which uses a semantically enhanced technology-topic model to forecast the propagation of technologies to research areas. A formal evaluation of the approach on a set of technologies in the Semantic Web and Artificial Intelligence areas has produced excellent results, confirming the validity of our solution.

    Klink-2: integrating multiple web sources to generate semantic topic networks

    The amount of scholarly data available on the web is steadily increasing, enabling different types of analytics which can provide important insights into research activity. In order to make sense of and explore this large-scale body of knowledge, we need an accurate, comprehensive and up-to-date ontology of research topics. Unfortunately, human-crafted classifications do not satisfy these criteria, as they evolve too slowly and tend to be too coarse-grained. Current automated methods for generating ontologies of research areas also present a number of limitations: i) they do not consider the rich set of indirect statistical and semantic relationships which can help to understand the relation between two topics (e.g., the fact that two research areas are associated with a similar set of venues or technologies); ii) they do not distinguish between different kinds of hierarchical relationships; and iii) they are not able to effectively handle ambiguous topics characterized by a noisy set of relationships. In this paper we present Klink-2, a novel approach which improves on our earlier work on automatic generation of semantic topic networks and addresses the aforementioned limitations by taking advantage of a variety of knowledge sources available on the web. In particular, Klink-2 analyses networks of research entities (including papers, authors, venues, and technologies) to infer three kinds of semantic relationships between topics. It also identifies ambiguous keywords (e.g., “ontology”) and separates them into the appropriate distinct topics (e.g., “ontology/philosophy” vs. “ontology/semantic web”). Our experimental evaluation shows that the ability of Klink-2 to integrate a high number of data sources and to generate topics with accurate contextual meaning yields significant improvements over other algorithms in terms of both precision and recall.

    Improving Open Science Using Linked Open Data: CONICET Digital Use Case

    Scientific publication services are changing drastically: researchers demand intelligent search services to discover and relate scientific publications. Publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. In this paper, we present the ongoing work to publish a subset of scientific publications of CONICET Digital as Linked Open Data. The objective of this work is to improve the retrieval and reuse of data through Semantic Web technologies and Linked Data in the domain of scientific publications. To achieve these goals, Semantic Web standards and reference RDF schemas (Dublin Core, FOAF, VoID, etc.) have been taken into account. The conversion and publication process is guided by the methodological guidelines for publishing government linked data. We also outline how these data can be linked to other datasets on the web of data, such as DBLP, Wikidata and DBpedia. Finally, we show some examples of queries that answer questions that CONICET Digital does not initially support.
    Affiliation: Zárate, Marcos Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Centro para el Estudio de Sistemas Marinos; Argentina
    Affiliation: Carlos Buckle. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina
    Affiliation: Mazzanti, Renato. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina
    Affiliation: Samec, Gustavo Daniel. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina

    Semantic learning webs

    By 2020, microprocessors will likely be as cheap and plentiful as scrap paper, scattered by the millions into the environment, allowing us to place intelligent systems everywhere. This will change everything around us, including the nature of commerce, the wealth of nations, and the way we communicate, work, play, and live. This will give us smart homes, cars, TVs, jewellery, and money. We will speak to our appliances, and they will speak back. Scientists also expect the Internet will wire up the entire planet and evolve into a membrane consisting of millions of computer networks, creating an “intelligent planet.” The Internet will eventually become a “Magic Mirror” that appears in fairy tales, able to speak with the wisdom of the human race.
    Michio Kaku, Visions: How Science Will Revolutionize the Twenty-First Century, 1998

    If the semantic web needed a symbol, a good one to use would be a Navaho dream-catcher: a small web, lovingly hand-crafted, [easy] to look at, and rumored to catch dreams; but really more of a symbol than a reality.
    Pat Hayes, Catching the Dreams, 2002

    Though it is almost impossible to envisage what the Web will be like by the end of the next decade, we can say with some certainty that it will have continued its seemingly unstoppable growth. Given the investment of time and money in the Semantic Web (Berners-Lee et al., 2001), we can also be sure that some form of semanticization will have taken place. This might be superficial, accomplished simply through the addition of loose forms of meta-data mark-up, or more principled, grounded in ontologies and formalised by means of emerging semantic web standards, such as RDF (Lassila and Swick, 1999) or OWL (McGuinness and van Harmelen, 2003). Whatever the case, the addition of semantic mark-up will make at least part of the Web more readily accessible to humans and their software agents and will facilitate agent interoperability. If current research is successful, there will also be a plethora of e-learning platforms making use of a varied menu of reusable educational material or learning objects. For the learner, the semanticized Web will, in addition, offer rich seams of diverse learning resources over and above the course materials (or learning objects) specified by course designers. For instance, the annotation registries, which provide access to marked-up resources, will enable more focussed, ontologically-guided (or semantic) search. This much is already in development. But we can go much further. Semantic technologies make it possible not only to reason about the Web as if it is one extended knowledge base but also to provide a range of additional educational semantic web services such as summarization, interpretation or sense-making, structure-visualization, and support for argumentation.