Coordinating virus research: The Virus Infectious Disease Ontology
The COVID-19 pandemic prompted immense work on the investigation of the SARS-CoV-2 virus. Rapid, accurate, and consistent interpretation of generated data is thereby of fundamental concern. Ontologies (structured, controlled vocabularies) are designed to support consistency of interpretation, and thereby to prevent the development of data silos. This paper describes how ontologies are serving this purpose in the COVID-19 research domain, by following principles of the Open Biological and Biomedical Ontology (OBO) Foundry and by reusing existing ontologies such as the Infectious Disease Ontology (IDO) Core, which provides terminological content common to investigations of all infectious diseases. We report here on the development of an IDO extension, the Virus Infectious Disease Ontology (VIDO), a reference ontology covering viral infectious diseases. We motivate term and definition choices, showcase reuse of terms from existing OBO ontologies, illustrate how ontological decisions were motivated by relevant life science research, and connect VIDO to the Coronavirus Infectious Disease Ontology (CIDO). We next use terms from these ontologies to annotate selections from life science research on SARS-CoV-2, highlighting how ontologies employing a common upper-level vocabulary may be seamlessly interwoven. Finally, we outline future work, including bacteria and fungus infectious disease reference ontologies currently under development, and cite uses of VIDO and CIDO in host-pathogen data analytics, electronic health record annotation, and ontology conflict-resolution projects.
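The annotation step described above can be sketched as a simple lookup from text mentions to ontology term identifiers. The CURIEs below are placeholders, not real VIDO/IDO/CIDO IDs, and real annotation pipelines use more sophisticated concept recognition; this is only a minimal illustration of the idea.

```python
# Minimal sketch of ontology-based text annotation.
# The label -> CURIE index below uses placeholder IDs, not released terms.
TERM_INDEX = {
    "virus": "VIDO:0000001",              # placeholder ID
    "infectious disease": "IDO:0000002",  # placeholder ID
    "SARS-CoV-2": "CIDO:0000003",         # placeholder ID
}

def annotate(text: str) -> list[tuple[str, str]]:
    """Return (label, CURIE) pairs for every indexed term found in text."""
    lowered = text.lower()
    hits = []
    for label, curie in TERM_INDEX.items():
        if label.lower() in lowered:
            hits.append((label, curie))
    return hits

annotations = annotate("SARS-CoV-2 is a virus causing an infectious disease.")
```

Because every annotator that shares the same term index emits the same identifiers, downstream datasets annotated this way can be merged without re-mapping vocabularies, which is the data-silo-prevention argument made above.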
A Survey on Forensics and Compliance Auditing for Critical Infrastructure Protection
The broadening dependency and reliance that modern societies have on essential services
provided by Critical Infrastructures is increasing the relevance of their trustworthiness. However, Critical
Infrastructures are attractive targets for cyberattacks, due to the potential for considerable impact, not just
at the economic level but also in terms of physical damage and even loss of human life. Complementing
traditional security mechanisms, forensics and compliance audit processes play an important role in ensuring
Critical Infrastructure trustworthiness. Compliance auditing contributes to checking if security measures are
in place and compliant with standards and internal policies. Forensics assist the investigation of past security
incidents. Since these two areas significantly overlap, in terms of data sources, tools and techniques, they can
be merged into unified Forensics and Compliance Auditing (FCA) frameworks. In this paper, we survey the
latest developments, methodologies, challenges, and solutions addressing forensics and compliance auditing
in the scope of Critical Infrastructure Protection. This survey focuses on relevant contributions, capable of
tackling the requirements imposed by massively distributed and complex Industrial Automation and Control
Systems, in terms of handling large volumes of heterogeneous data (that can be noisy, ambiguous, and
redundant) for analytic purposes, with adequate performance and reliability. The achieved results produced
a taxonomy in the field of FCA whose key categories denote the relevant topics in the literature. Also, the
collected knowledge resulted in the establishment of a reference FCA architecture, proposed as a generic
template for a converged platform. These results are intended to guide future research on forensics and
compliance auditing for Critical Infrastructure Protection.
A knowledge graph-supported information fusion approach for multi-faceted conceptual modelling
It has become progressively more evident that a single data source is unable to comprehensively capture the
variability of a multi-faceted concept, such as product design, driving behaviour or human trust, which has
diverse semantic orientations. Therefore, multi-faceted conceptual modelling is often conducted based on multi-sourced data covering indispensable aspects, and information fusion is frequently applied to cope with the high
dimensionality and data heterogeneity. The consideration of intra-facets relationships is also indispensable. In
this context, a knowledge graph (KG), which can aggregate the relationships of multiple aspects by semantic
associations, was exploited to facilitate the multi-faceted conceptual modelling based on heterogeneous and
semantic-rich data. Firstly, rules of fault mechanism are extracted from the existing domain knowledge repository, and node attributes are extracted from multi-sourced data. Through abstraction and tokenisation of
existing knowledge repository and concept-centric data, rules of fault mechanism were symbolised and integrated with the node attributes, which served as the entities for the concept-centric knowledge graph (CKG).
Subsequently, the transformation of process data to a stack of temporal graphs was conducted under the CKG
backbone. Lastly, the graph convolutional network (GCN) model was applied to extract temporal and attribute
correlation features from the graphs, and a temporal convolution network (TCN) was built for conceptual
modelling using these features. The effectiveness of the proposed approach and the close synergy between the
KG-supported approach and multi-faceted conceptual modelling is demonstrated and substantiated in a case
study using real-world data
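The graph-convolution step in the pipeline above can be illustrated with a single propagation layer. The paper's actual GCN/TCN architecture, features, and data are not specified here; this is a toy sketch in plain Python (no deep-learning framework) of one normalised propagation step, ReLU(D^-1 (A + I) X W).

```python
# One toy GCN propagation step over a small graph, using plain lists.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(adj, X, W):
    """ReLU(D^-1 (A + I) X W): aggregate neighbour attributes, then transform."""
    n = len(adj)
    # add self-loops so each node keeps its own attributes
    A = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    # row-normalise by degree (mean aggregation)
    A = [[v / sum(row) for v in row] for row in A]
    H = matmul(matmul(A, X), W)
    return [[max(0.0, v) for v in row] for row in H]

# 3-node chain graph with 2-dimensional node attributes
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability
H = gcn_layer(adj, X, W)
```

In the approach above, one such graph is built per time step under the CKG backbone, and the per-step node features extracted this way form the sequence that the TCN consumes.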
Documenting Knowledge Graph Embedding and Link Prediction using Knowledge Graphs
In recent years, sub-symbolic learning, i.e., Knowledge Graph Embedding (KGE) over Knowledge Graphs (KGs), has gained significant attention in various downstream tasks (e.g., Link Prediction (LP)). These techniques learn a latent vector representation of a KG's semantic structure to infer missing links. Nonetheless, KGE models remain a black box, and the decision-making process behind them is not clear. Thus, the trustworthiness and reliability of the models' outcomes have been challenged. While many state-of-the-art approaches provide data-driven frameworks to address these issues, they do not always provide a complete understanding, and the interpretations are not machine-readable. That is why, in this work, we extend a hybrid interpretable framework, InterpretME, to the field of KGE models, especially translation distance models, which include TransE, TransH, TransR, and TransD. The experimental evaluation on various benchmark KGs supports the validity of this approach, which we term Trace KGE. Trace KGE, in particular, contributes to increased interpretability and understanding of the otherwise opaque behavior of KGE models.
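The translation distance family mentioned above shares one core idea, easiest to see in TransE: a relation is a translation vector, so for a true triple the head embedding plus the relation embedding should land near the tail embedding. The embeddings below are toy hand-picked values, not trained ones.

```python
# TransE scoring sketch: smaller ||h + r - t|| means a more plausible triple.
def transe_score(h, r, t):
    """L2 distance between the translated head (h + r) and the tail t."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

head = [0.1, 0.3]
rel  = [0.2, 0.1]
tail_true  = [0.3, 0.4]   # equals head + rel, so the triple scores as plausible
tail_false = [0.9, 0.0]   # far from head + rel

s_true = transe_score(head, rel, tail_true)    # ~0 up to float error
s_false = transe_score(head, rel, tail_false)
```

TransH, TransR, and TransD keep this translation principle but first project entities onto relation-specific hyperplanes or spaces, which is what lets them model one-to-many and many-to-many relations that plain TransE handles poorly.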
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be protected against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users are likely to invalidate the suggested recourse after it is noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used codebase for counterfactual explanation and algorithmic recourse algorithms, and the vast array of evaluation measures in the literature, make it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and suggests new solutions for generating realistic and robust counterfactual explanations for algorithmic recourse.
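The basic counterfactual search underlying recourse can be sketched as gradient steps on an input until a fixed classifier flips its decision. The thesis performs this search in a generative model's latent space; the plain input space and the hand-set logistic weights below are simplifying assumptions to keep the example self-contained.

```python
import math

# Toy recourse sketch: nudge a denied input along the classifier's gradient
# until the predicted probability crosses the decision threshold.
W = [1.5, -2.0]   # hypothetical trained weights of a logistic model
B = -0.5

def predict(x):
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(x, target=0.5, lr=0.1, steps=200):
    x = list(x)
    for _ in range(steps):
        p = predict(x)
        if p > target:
            break
        # gradient of the sigmoid output w.r.t. x is p * (1 - p) * W
        g = p * (1.0 - p)
        x = [xi + lr * g * wi for xi, wi in zip(x, W)]
    return x

x0 = [0.0, 1.0]          # initially denied: predict(x0) < 0.5
x_cf = counterfactual(x0)
```

The realism and robustness problems discussed above arise precisely because this unconstrained search can land on points that are off the data manifold or that flip back under small implementation noise.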
Digital approaches to construction compliance checking: Validating the suitability of an ecosystem approach to compliance checking
The lifecycle of the built environment is governed by complex regulations, requirements and standards. Ensuring compliance with these requirements is a complicated process, affecting the entire supply chain and often incurring significant costs, delay and uncertainty. Many of the processes, and elements within these processes, are formalised and supported by varying levels of digitisation and automation. These range from energy simulation and geometric checking to building information modelling (BIM)-based checking.
However, there are currently no unifying standards or integrating technologies to tie regulatory efforts together and enable the widespread adoption of automated compliance processes. This has left many current technical approaches, while advanced and robust, isolated. Nevertheless, the increasing maturity of asset datasets and information models means that integration of data and tools is now feasible. This paper proposes and validates a new approach to solving the problem of automated compliance checking through the use of an ecosystem of compliance checking services.
This work has identified a clear research gap: how automated compliance checking in the construction sector can move beyond sole reliance on BIM data and tightly coupled integration with software tools, to provide a system extensible enough to integrate the isolated software elements currently used within compliance checking processes.
To test this approach, an architecture for an ecosystem of compliance services will be specified. To validate this architecture, a prototype version will be developed and validated against requirements derived from the weaknesses of current approaches.
This validation found that a distributed ecosystem can perform accurately and successfully, whilst providing advantages in terms of scalability and extensibility. This approach provides a route to the increased adoption of automated compliance checking, overcoming the issues of relying on a single computer system or application to perform all aspects of the process.
The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.
Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.
Semantic models and knowledge graphs as manufacturing system reconfiguration enablers
A Reconfigurable Manufacturing System (RMS) provides a cost-effective approach for manufacturers to adapt to fluctuating market demands by reconfiguring assets through automated analysis of asset utilization and resource allocation. Achieving this automation necessitates a clear understanding, formalization, and documentation of asset capabilities and capacity utilization. This paper introduces a unified model employing semantic modeling to delineate the manufacturing sector's capabilities, capacity, and reconfiguration potential. The model illustrates the integration of these three components to facilitate efficient system reconfiguration. Additionally, semantic modeling allows for the capture of historical experiences, thus enhancing long-term system reconfiguration through a knowledge graph. Two use cases are presented: capability matching and reconfiguration solution recommendation based on the proposed model. A thorough explication of the methodology and outcomes is provided, underscoring the advantages of this approach in terms of heightened efficiency, diminished costs, and augmented productivity.
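The capability-matching use case above can be sketched in its simplest form: a task's required capabilities are checked against each asset's declared capabilities via set containment. The asset and capability names below are illustrative placeholders, not terms from the paper's semantic model.

```python
# Minimal capability-matching sketch over declared asset capabilities.
# Names are hypothetical; a real system would query the knowledge graph.
ASSETS = {
    "robot_arm_1": {"pick", "place", "screw"},
    "cnc_mill_2": {"mill", "drill"},
    "robot_arm_3": {"pick", "place", "weld"},
}

def match_assets(required: set[str]) -> list[str]:
    """Return assets whose declared capability set covers all requirements."""
    return sorted(name for name, caps in ASSETS.items() if required <= caps)

candidates = match_assets({"pick", "place"})
```

A semantic model extends this beyond exact string matching: subsumption in the ontology lets an asset declaring a specialised capability satisfy a request for its more general parent, and the knowledge graph of past reconfigurations can rank the remaining candidates.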
The DO-KB Knowledgebase: a 20-year journey developing the disease open science ecosystem.
In 2003, the Human Disease Ontology (DO, https://disease-ontology.org/) was established at Northwestern University. In the intervening 20 years, the DO has expanded to become a highly utilized disease knowledge resource. Serving as the nomenclature and classification standard for human diseases, the DO provides a stable, etiology-based structure integrating mechanistic drivers of human disease. Over the past two decades the DO has grown from a collection of clinical vocabularies into an expertly curated semantic resource of over 11,300 common and rare diseases linking disease concepts through more than 37,000 vocabulary cross mappings (v2023-08-08). Here, we introduce the recently launched DO Knowledgebase (DO-KB), which expands the DO's representation of the diseaseome and enhances the findability, accessibility, interoperability and reusability (FAIR) of disease data through a new SPARQL service and new Faceted Search Interface. The DO-KB is an integrated data system, built upon the DO's semantic disease knowledge backbone, with resources that expose and connect the DO's semantic knowledge with disease-related data across Open Linked Data resources. This update includes descriptions of efforts to assess the DO's global impact and improvements to data quality and content, with emphasis on changes in the last two years.
A clinical decision support system for detecting and mitigating potentially inappropriate medications
Background: Medication errors are a leading cause of preventable harm to patients. In older adults, the impact of ageing on the therapeutic effectiveness and safety of drugs is a significant concern, especially for those over 65. Consequently, certain medications, called Potentially Inappropriate Medications (PIMs), can be dangerous in the elderly and should be avoided. For health professionals and patients, tackling PIMs can be time-consuming and error-prone, as the criteria underlying the definition of PIMs are complex and subject to frequent updates. Moreover, the criteria are not available in a representation that health systems can interpret and reason with directly.
Objectives: This thesis aims to demonstrate the feasibility of using an ontology/rule-based approach in a clinical knowledge base to identify potentially inappropriate medications (PIMs). It also shows how constraint solvers can be used effectively to suggest alternative medications and administration schedules to resolve or minimise the undesirable side effects of PIMs.
Methodology: To address these objectives, we propose a novel integrated approach using formal rules to represent the PIM criteria and inference engines to perform the reasoning, presented in the context of a Clinical Decision Support System (CDSS). The approach aims to detect, resolve, or minimise the undesirable side effects of PIMs through an ontology (knowledge base) and inference engines incorporating multiple reasoning approaches.
Contributions: The main contribution lies in the framework to formalise PIMs, including the steps required to define guideline requisites and create inference rules that detect inappropriate medications and propose alternative drugs. No formalisation of the selected guideline (the Beers Criteria) can be found in the literature, and hence this thesis provides a novel ontology for it. Moreover, our process of minimising undesirable side effects offers a novel approach that enhances and optimises the drug rescheduling process, providing a more accurate way to minimise the effect of drug interactions in clinical practice.
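The rule-based detection step described above can be sketched as follows. The drug names, age threshold, and alternatives below are illustrative placeholders, not clinical guidance, and the real system encodes the Beers Criteria as ontology-backed inference rules rather than a flat table.

```python
# Hedged sketch of rule-based PIM detection with alternative suggestions.
# All rule entries are hypothetical placeholders, not medical advice.
PIM_RULES = [
    # (drug, minimum age at which it is flagged, suggested alternative)
    ("drug_a", 65, "drug_a_alternative"),
    ("drug_b", 65, "drug_b_alternative"),
]

def check_pims(age: int, medications: list[str]) -> list[tuple[str, str]]:
    """Return (flagged_drug, suggested_alternative) pairs for a patient."""
    flags = []
    for drug, min_age, alternative in PIM_RULES:
        if age >= min_age and drug in medications:
            flags.append((drug, alternative))
    return flags

flags = check_pims(age=72, medications=["drug_a", "drug_c"])
```

The constraint-solving part of the thesis goes further than this lookup: given the flagged drugs, a solver searches for substitutions and administration schedules that jointly satisfy interaction and timing constraints.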