18 research outputs found

    The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas

    Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 26K topics and 226K semantic relationships. It was created by applying the Klink-2 algorithm to a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO we have developed the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. Users can use the portal to rate topics and relationships, suggest missing relationships, and visualise sections of the ontology. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various communities engaged with scholarly data.

    On tasks and soft skills in operations and supply chain management: analysis and evidence from the O*NET database

    Purpose: The purpose of the study is to identify the soft skills and abilities that are crucial to success in the fields of operations management (OM) and supply chain management (SCM), using the O*NET database and the classification of a set of professional figures integrating values for task skills and abilities needed to operate successfully in these professions. Design/methodology/approach: The study used the O*NET database to identify the soft skills and abilities required for success in OM and SCM industries. Correlation analysis was conducted to determine the tasks required for the job roles and their characteristics in terms of abilities and soft skills. ANOVA analysis was used to validate the findings. The study aims to help companies define specific assessments and tests for OM and SCM roles to measure individual attitudes and correlate them with the job position. Findings: As a result of the work, a set of soft skills and abilities was defined that, through correlation analysis, explains a large number of the activities required to work in the operations and SCM (OSCM) environment. Research limitations/implications: The work is inherently affected by the database used for the professional figures mapped and the scores that are attributed within O*NET to the analyzed elements. Practical implications: The information resulting from this study can help companies develop specific assessments and tests for the roles of OM and SCM to measure individual attitudes and correlate them with the requirements of the job position. The study aims to address the need to identify soft skills in the human sphere and determine which of them have the most significant impact on the OM and SCM professions. Originality/value: The originality of this study lies in its approach to identifying the set of soft skills and abilities that determine success in the OM and SCM industries.
The study used the O*NET database to correlate the tasks required for specific job roles with their corresponding soft skills and abilities. Furthermore, the study used ANOVA analysis to validate the findings in other sectors mapped by the same database. The identified soft skills and abilities can help companies develop specific assessments and tests for OM and SCM roles to measure individual attitudes and correlate them with the requirements of the job position. In addressing the necessity for enhanced clarity in the domain of human factors, this study contributes to identifying key success factors. Subsequent research can further investigate their practical application within companies to formulate targeted growth strategies and make appropriate resource selections for vacant positions.

    Modular Design Patterns for Hybrid Learning and Reasoning Systems: a taxonomy, patterns and use cases

    The unification of statistical (data-driven) and symbolic (knowledge-driven) methods is widely recognised as one of the key challenges of modern AI. Recent years have seen a large number of publications on such hybrid neuro-symbolic AI systems. That rapidly growing literature is highly diverse and mostly empirical, and is lacking a unifying view of the large variety of these hybrid systems. In this paper we analyse a large body of recent literature and we propose a set of modular design patterns for such hybrid, neuro-symbolic systems. We are able to describe the architecture of a very large number of hybrid systems by composing only a small set of elementary patterns as building blocks. The main contributions of this paper are: 1) a taxonomically organised vocabulary to describe both processes and data structures used in hybrid systems; 2) a set of 15+ design patterns for hybrid AI systems, organised in a set of elementary patterns and a set of compositional patterns; 3) an application of these design patterns in two realistic use-cases for hybrid AI systems. Our patterns reveal similarities between systems that were not recognised until now. Finally, our design patterns extend and refine Kautz' earlier attempt at categorising neuro-symbolic architectures. Comment: 20 pages, 22 figures, accepted for publication in the International Journal of Applied Intelligenc

    Understanding future skills: requirements for better data

    Deliverable 6.3 focuses on the data necessary for a comprehensive analysis of skills for digitalisation. Reliable data is needed to make appropriate decisions for the New Skills Agenda for Europe, national initiatives, and VET systems. Qualitative assessments of Tasks 6.1 to 6.4 are contrasted with quantitative WP3 data to identify gaps in data, indicators and measures that support monitoring of skill requirements. The main outcome is that there are still gaps in European data on skills that leave stakeholders partially blindfolded when looking at changes in skill demand and resulting needs for adaptations of skill supply. The report formulates requirements for the improvement of data.

    Infrastructure and cities ontologies

    The creation and use of ontologies has become increasingly relevant for complex systems in recent years. This is because of the growing number of use cases that rely on real-world integration of disparate systems, the need for semantic congruence across boundaries and the expectations of users for conceptual clarity within evolving domains or systems of interest. These needs are evident in most spheres of research involving complex systems, but they are particularly apparent in infrastructure and cities where traditionally siloed and sectoral approaches have dominated, undermining the potential for integration to solve societal challenges such as net zero, resilience to climate change, equity and affordability. This paper reports on findings of a literature review on infrastructure and city ontologies and puts forward some hypotheses inferred from the literature findings. The hypotheses are discussed with reference to the literature and provide avenues for further research on (a) belief systems that underpin non-top-level ontologies and the potential for interference from them, (b) the need for a small number of top-level ontologies and translation mechanisms between them and (c) clarity on the role of standards and information systems in the adaptability and quality of data sets using ontologies. A gap is also identified in the extent to which ontologies can support more complex automated coupling and data transformation when dealing with different scales.

    Ontology-Driven Semantic Data Integration in Open Environment

    Collaborative intelligence in the context of information management can be defined as "a shared intelligence that results from the collaboration between various information systems". In open environments, these collaborating information systems can be heterogeneous, dynamic and loosely-coupled. Information systems in open environment can also possess a certain degree of autonomy. The integration of data residing in various heterogeneous information systems is essential in order to drive the intelligence efficiently and accurately. Because of the heterogeneous, loosely-coupled, and dynamic nature of open environment, the integration between these information systems at the data level is not efficient. Several approaches and models have been proposed in order to perform the task of data integration. Many of the existing approaches for data integration are designed for closed environment, tightly-coupled systems and enterprise data integration. They make explicit, or implicit, assumptions about the semantic structure of the data. Because of the heterogeneous and loosely-coupled nature of open environment, such assumptions are deemed unintuitive. Data integration approaches based on models that are extensional in nature are also inadequate for open environment. This is because they do not account for the dynamic nature of open environment. The need for an adequate model for describing data integration systems in open environment is quite evident. Intensional modeling is found to be an adequate and natural choice for modeling in open environment. This is because it addresses the dynamic and loosely-coupled nature of open environment. In this work, an intensional model for the conceptualization is presented. This model is based on the theory of Properties, Relations and Propositions (PRP). The proposed description takes the concepts, relations, and properties as primitive and, as such, irreducible entities.
Formal intensional accounts of both Ontology and Ontological Commitment are also proposed in light of the intensional model for conceptualization. An intensional model for ontology-driven mediated data integration in open environment is also proposed. The proposed model accounts for the dynamic nature of open environment and also intensionally describes the information of data sources. The interface between global and local ontologies and the formal intensional semantics of the query answering are then described.

    Developing and Validating Open Source Tools for Advanced Neuroimaging Research

    Almost all scientific research relies on software. This is particularly true for research that uses neuroimaging technologies, such as functional magnetic resonance imaging (fMRI). These technologies generate massive amounts of data per participant, which must be processed and analyzed using specialized software. A large portion of these tools are developed by teams of researchers, rather than trained software developers. In this kind of ecosystem, where the majority of software creators are scientists, rather than trained programmers, it becomes more important than ever to rely on community-based development, which may explain why most of this software is open source. It is in the development of this kind of research-oriented, open source software that I have focused much of my graduate training, as is reflected in this dissertation. One software package I have helped to develop and maintain is tedana, a Python library for denoising multi-echo fMRI data. In chapter 2, I describe this library in a short, published software paper. Another library I maintain as the primary developer is NiMARE, a Python library for performing neuroimaging meta-analyses and derivative analyses, such as automated annotation and functional decoding. In chapter 3, I present NiMARE in a hybrid software paper with embedded tutorial code exhibiting the functionality of the library. This paper is currently hosted as a Jupyter book that combines narrative content and code snippets that can be executed online. In addition to research software development, I have focused my graduate work on performing reproducible, open fMRI research. To that end, chapter 4 is a replication and extension of a recent paper on multi-echo fMRI denoising methods (Power et al., 2018a). This replication was organized as a registered report, in which the introduction and methods were submitted for peer review before the analyses were performed.
Finally, chapter 5 is a conclusion to the dissertation, in which I reflect on the work I have done and the skills I have developed throughout my training.

    Deciphering the functional organization of molecular networks via graphlets-based methods and network embedding techniques

    Advances in capturing technologies have yielded a massive production of large-scale molecular data that describe different aspects of cellular functioning. These data are often modeled as networks, in which nodes are molecular entities, and the edges connecting them represent their relationships. These networks are a valuable source of biological information, but they need to be untangled by new algorithms to reveal the information hidden in their wiring patterns. State-of-the-art approaches for deciphering these complex networks are based on graphlets and network embeddings. This thesis focuses on the development of novel algorithms to overcome the limitations of the current graphlet and network embedding methodologies in the field of biology. Graphlets are a powerful tool for characterizing the local wiring patterns of molecular networks. However, current graphlet-based methods are mostly applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that represent the probability of an interaction occurring in the cell. This probabilistic information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. To address this challenge, we introduce probabilistic graphlets, a novel approach that can capture the local wiring patterns of weighted networks and uncover hidden probabilistic relationships between molecular entities. We use probabilistic graphlets to generalize the graphlet methods and apply these to the probabilistic representation of real-world molecular interactions. We show that probabilistic graphlets robustly uncover relevant biological information from the molecular networks. Furthermore, we demonstrate that probabilistic graphlets exhibit a higher sensitivity to identifying condition-specific functions compared to their unweighted counterparts.
Network embedding algorithms learn a low-dimensional vectorial representation for each gene in the network while preserving the structural information of the molecular network. Currently available embedding approaches strictly focus on clustering the genes’ embedding vectors and interpreting such clusters to reveal the hidden information of the biological networks. Thus, we investigate new perspectives and methods that go beyond gene-centric approaches. First, we shift the exploration of the embedding space’s functional organization from the genes to their functions. We introduce the Functional Mapping Matrix and apply it to investigate the changes in the organization of cancer and control network embedding spaces from a functional perspective. We demonstrate that our methodology identifies novel cancer-related functions and genes that the currently available methods for gene-centric analyses cannot identify. Finally, we go even further and switch the perspective from the organization of the embedded entities (genes and functions) in the embedding space to the space itself. We annotate axes of the network embedding spaces of six species with both functional annotations and genes. We demonstrate that the embedding space axes represent coherent cellular functions and offer a functional fingerprint of the cell’s functional organization. Moreover, we show that the analysis of the axes reveals new functional evolutionary connections between species.