Natural Language Processing for the identification of Human factors in aviation accidents causes: An application to the SHEL methodology
Accidents in aviation are rare events. From them, aviation safety management systems derive fast and effective remedial actions by analysing the root causes of accidents, most of which prove to be human factors. Since the current standard relies on manual classification performed by trained staff, no technical standards have yet been defined for automated human factors identification. This paper addresses this issue, proposing machine learning techniques that leverage state-of-the-art Natural Language Processing technologies. The techniques are then adapted to the Software Hardware Environment Liveware (SHEL) standard accident causality model and tested on a set of real accidents. The computational results show the accuracy and effectiveness of the proposed methodology. Furthermore, applying the methodology to real documents checked by experts yields an estimated reduction of at least 30% in the time needed for human factors identification compared to the standard manual methods.
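The idea of automatically mapping accident-report text onto SHEL categories can be illustrated with a deliberately minimal sketch. The keyword lists and example sentence below are illustrative assumptions, not the paper's actual features, which come from trained NLP models rather than hand-picked word lists.

```python
# Toy sketch of SHEL-category tagging for accident-report sentences.
# The four categories follow the SHEL model named in the abstract;
# the keyword sets are hypothetical stand-ins for a learned classifier.

SHEL_KEYWORDS = {
    "Software": {"procedure", "checklist", "regulation", "manual"},
    "Hardware": {"instrument", "lever", "display", "engine"},
    "Environment": {"weather", "visibility", "turbulence", "night"},
    "Liveware": {"fatigue", "communication", "workload", "crew"},
}

def tag_shel(sentence: str) -> list[str]:
    """Return the SHEL categories whose keywords appear in the sentence."""
    words = set(sentence.lower().replace(",", " ").split())
    return [cat for cat, kws in SHEL_KEYWORDS.items() if words & kws]

print(tag_shel("Crew fatigue and poor visibility at night"))
# → ['Environment', 'Liveware']
```

A real pipeline would replace the keyword match with a trained text classifier, but the interface (sentence in, list of SHEL categories out) is the same.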
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node-embedding learning on heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology and for its applicability in the real-world
context of biomedical knowledge discovery.
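The core mechanism the abstract describes, biasing random walks by the type of the previously traversed edge, can be sketched as below. The tiny graph, edge types, and matrix values are illustrative assumptions; edge2vec learns the transition matrix with an EM procedure before running walks, and then feeds the walks to a skip-gram model to obtain node embeddings.

```python
# Minimal sketch of an edge-type-biased random walk: the probability of
# traversing an edge depends on the type of the previous edge via a
# transition matrix M. Graph and weights here are toy assumptions.
import random

# adjacency: node -> list of (neighbour, edge_type)
graph = {
    "gene1": [("protein1", "encodes"), ("gene2", "coexpression")],
    "protein1": [("gene1", "encodes"), ("drug1", "inhibition")],
    "gene2": [("gene1", "coexpression")],
    "drug1": [("protein1", "inhibition")],
}
types = ["encodes", "coexpression", "inhibition"]
# M[i][j]: weight of taking a type-j edge right after a type-i edge
M = {t: {u: 1.0 for u in types} for t in types}
M["encodes"]["inhibition"] = 3.0  # favour gene -> protein -> drug chains

def walk(start: str, length: int, rng: random.Random) -> list[str]:
    """Generate one biased walk of `length` steps from `start`."""
    path, prev_type, node = [start], None, start
    for _ in range(length):
        nbrs = graph[node]
        weights = [1.0 if prev_type is None else M[prev_type][t]
                   for _, t in nbrs]
        node, prev_type = rng.choices(nbrs, weights=weights)[0]
        path.append(node)
    return path

print(walk("gene1", 4, random.Random(0)))
```

In the full method, many such walks per node are collected and treated as "sentences" for word2vec-style embedding training.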
Construction contract risk identification based on knowledge-augmented language model
Contract review is an essential step in construction projects to prevent
potential losses. However, the current methods for reviewing construction
contracts lack effectiveness and reliability, leading to time-consuming and
error-prone processes. While large language models (LLMs) have shown promise in
revolutionizing natural language processing (NLP) tasks, they struggle with
domain-specific knowledge and addressing specialized issues. This paper
presents a novel approach that leverages LLMs with construction contract
knowledge to emulate the process of contract review by human experts. Our
tuning-free approach incorporates construction contract domain knowledge to
enhance language models for identifying construction contract risks. Building
the domain knowledge base in natural language facilitates practical
implementation. We evaluated our method on real construction contracts and
achieved solid performance. Additionally, we investigated how large language
models employ logical reasoning during the task, and we provide insights and
recommendations for future research.
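The tuning-free, knowledge-augmented pattern the abstract describes can be sketched as retrieval plus prompt assembly: pull the natural-language knowledge entries most relevant to a clause and prepend them to the model's prompt. The knowledge entries, clause, and overlap-based scoring below are illustrative assumptions, and no real LLM is called.

```python
# Sketch of knowledge-augmented prompting for contract risk review:
# retrieve relevant domain-knowledge entries, then build an LLM prompt.
# All entries and the ranking heuristic are hypothetical stand-ins.

knowledge_base = [
    "Clauses that shift all differing-site-condition risk to the "
    "contractor are a common source of claims.",
    "Liquidated damages must be a genuine pre-estimate of loss.",
    "Payment terms longer than 60 days increase cash-flow risk.",
]

def retrieve(clause: str, kb: list[str], top_k: int = 2) -> list[str]:
    """Rank knowledge entries by simple word overlap with the clause."""
    cw = set(clause.lower().split())
    scored = sorted(kb, key=lambda e: -len(cw & set(e.lower().split())))
    return scored[:top_k]

def build_prompt(clause: str) -> str:
    """Prepend retrieved domain knowledge to the review instruction."""
    context = "\n".join(retrieve(clause, knowledge_base))
    return (f"Domain knowledge:\n{context}\n\n"
            f"Identify risks in this clause:\n{clause}")

print(build_prompt("The contractor bears all risk of site conditions."))
```

A production system would use embedding-based retrieval rather than word overlap, but the prompt structure, knowledge first, clause second, is the essential part of the pattern.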
The potential of a data centred approach & knowledge graph data representation in chemical safety and drug design
Big Data pervades nearly all areas of the life sciences, yet the analysis of large integrated data sets remains a major challenge. Moreover, the field of life sciences is highly fragmented and, consequently, so are its data, knowledge, and standards. This, in turn, makes integrated data analysis and knowledge gathering across sub-fields a demanding task. At the same time, the integration of various research angles and data types is crucial for modelling the complexity of organisms and biological processes in a holistic manner. This is especially true in the context of drug development and chemical safety assessment, where computational methods can provide solutions for the urgent need for fast, effective, and sustainable approaches. Such computational methods, in turn, require the development of methodologies suitable for an integrated and data-centred Big Data view. Here we discuss Knowledge Graphs (KG) as a solution to a data-centred analysis approach for drug and chemical development and safety assessment. KGs are knowledge bases, data analysis engines, and knowledge discovery systems all in one, allowing them to be used for everything from simple data retrieval, through meta-analysis, to complex predictive and knowledge discovery systems. KGs therefore have immense potential to advance the data-centred approach and the re-usability and informativity of data. Furthermore, they can improve the power of analysis and the complexity of modelled processes, all while providing knowledge in a natively human-understandable network data model. (c) 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
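The network data model the abstract advocates can be illustrated with the simplest possible knowledge-graph representation: facts as subject-predicate-object triples, queried by pattern matching. The triples below are illustrative assumptions, not drawn from any real chemical-safety resource.

```python
# Minimal sketch of a triple-based knowledge graph and wildcard queries.
# The facts are hypothetical examples for illustration only.

triples = [
    ("aspirin", "inhibits", "COX1"),
    ("aspirin", "treats", "pain"),
    ("COX1", "expressed_in", "stomach"),
]

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

print(match(s="aspirin"))       # every fact about aspirin
print(match(p="expressed_in"))  # every expression relation
```

Real KG stacks (RDF stores, property graphs) add typing, provenance, and scalable indexing on top, but data retrieval reduces to exactly this kind of pattern match.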
Ethosight: A Reasoning-Guided Iterative Learning System for Nuanced Perception based on Joint-Embedding & Contextual Label Affinity
Traditional computer vision models often require extensive manual effort for
data acquisition, annotation and validation, particularly when detecting subtle
behavioral nuances or events. The difficulty in distinguishing routine
behaviors from potential risks in real-world applications, such as
differentiating routine shopping from potential shoplifting, further
complicates the process. Moreover, these models may demonstrate high false
positive rates and imprecise event detection when exposed to real-world
scenarios that differ significantly from the conditions of the training data.
To overcome these hurdles, we present Ethosight, a novel zero-shot computer
vision system. Ethosight initiates with a clean slate based on user
requirements and semantic knowledge of interest. Using localized label affinity
calculations and a reasoning-guided iterative learning loop, Ethosight infers
scene details and iteratively refines the label set. Reasoning mechanisms can
be derived from large language models such as GPT-4, symbolic reasoners such as
OpenNARS (Wang, 2006; Wang, 2013), or hybrid systems.
Our evaluations demonstrate Ethosight's efficacy across 40 complex use cases,
spanning domains such as health, safety, and security. Detailed results and
case studies within the main body of this paper and an appendix underscore a
promising trajectory towards enhancing the adaptability and resilience of
computer vision models in detecting and extracting subtle and nuanced
behaviors.
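The localized label-affinity step the abstract describes can be sketched as scoring candidate labels by cosine similarity between a joint image/text embedding of the scene and an embedding of each label, then keeping high-affinity labels for the next reasoning iteration. The toy vectors below stand in for real joint-embedding (e.g. CLIP-style) outputs and are illustrative assumptions.

```python
# Sketch of label-affinity scoring with a threshold filter. The vectors
# are hypothetical stand-ins for joint image/text embeddings.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

scene = [0.9, 0.1, 0.3]            # stand-in for the scene's embedding
labels = {
    "shopping": [0.8, 0.2, 0.3],
    "shoplifting": [0.1, 0.9, 0.2],
    "loitering": [0.4, 0.5, 0.5],
}

affinities = {lab: cosine(scene, vec) for lab, vec in labels.items()}
# keep labels above an (assumed) affinity threshold, highest first
kept = [lab for lab, s in sorted(affinities.items(), key=lambda kv: -kv[1])
        if s > 0.7]
print(kept)
# → ['shopping', 'loitering']
```

In the full system, the kept labels are handed to the reasoning component (an LLM or symbolic reasoner), which proposes a refined label set for the next iteration of the loop.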