1,623 research outputs found
Semantic-guided predictive modeling and relational learning within industrial knowledge graphs
The ubiquitous availability of data in today’s manufacturing environments, mainly driven by the extended usage of software and built-in sensing capabilities in automation systems, enables companies to embrace more advanced predictive modeling and analysis in order to optimize processes and usage of equipment. While the potential insight gained from such analysis is high, it often remains untapped, since integration and analysis of data silos from different production domains requires high manual effort and is therefore not economic. Addressing these challenges, digital representations of production equipment, so-called digital twins, have emerged, leading the way to semantic interoperability across systems in different domains. From a data modeling point of view, digital twins can be seen as industrial knowledge graphs, which serve as the semantic backbone of manufacturing software systems and data analytics. Due to the prevalent, historically grown, and scattered manufacturing software system landscape, which comprises numerous proprietary information models, data sources are highly heterogeneous. Therefore, there is an increasing need for semi-automatic support in data modeling, enabling end-user engineers to model their domain and maintain a unified semantic knowledge graph across the company. Once data modeling and integration are done, further challenges arise, since there has been little research on how knowledge graphs can contribute to the simplification and abstraction of statistical analysis and predictive modeling, especially in manufacturing.
In this thesis, new approaches for modeling and maintaining industrial knowledge graphs with a focus on the application of statistical models are presented. First, concerning data modeling, we discuss requirements from several existing standard information models and analytic use cases in the manufacturing and automation system domains and derive a fragment of the OWL 2 language that is expressive enough to cover the required semantics for a broad range of use cases. The prototypical implementation enables domain end-users, i.e., engineers, to extend the base ontology model with intuitive semantics. Furthermore, it supports efficient reasoning and constraint checking via translation to rule-based representations. Based on these models, we propose an architecture for the end-user-facilitated application of statistical models using ontological concepts and ontology-based data access paradigms.
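The translation of ontology axioms into rule-based representations can be illustrated with a minimal forward-chaining sketch over subclass-style rules. The class names, individuals, and rule format below are illustrative assumptions, not the thesis's actual implementation:

```python
# Minimal forward chaining over rules of the form "if X is-a A then X is-a B",
# mimicking how simple OWL 2 subclass axioms translate to rules.
# All class and individual names are illustrative.

rules = [
    ("Robot", "Equipment"),   # Robot subClassOf Equipment
    ("Equipment", "Asset"),   # Equipment subClassOf Asset
]

facts = {("r1", "Robot"), ("s1", "Sensor")}

def forward_chain(facts, rules):
    """Apply rules until no new class-membership facts can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for individual, cls in list(facts):
                if cls == premise and (individual, conclusion) not in facts:
                    facts.add((individual, conclusion))
                    changed = True
    return facts

closure = forward_chain(facts, rules)
print(sorted(closure))
```

Constraint checking fits the same loop: a rule whose conclusion is a designated "violation" class flags inconsistent individuals instead of deriving new memberships.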
In addition, we present an approach for domain-knowledge-driven preparation of predictive models in terms of feature selection and show how schema-level reasoning in the OWL 2 language can be employed for this task within knowledge graphs of industrial automation systems. A production cycle time prediction model in an example application scenario serves as a proof of concept and demonstrates that axiomatized domain knowledge about features can yield performance competitive with purely data-driven feature selection. In the case of high-dimensional data with small sample size, we show that graph kernels of domain ontologies can provide additional information on the degree of variable dependence. Furthermore, a special application of feature selection to graph-structured data is presented, and we develop a method that incorporates domain constraints derived from meta-paths in knowledge graphs into a branch-and-bound pattern enumeration algorithm.
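The interplay of constraint-based pruning and pattern enumeration can be sketched in miniature. The features, their grouping, and the single-group constraint below are invented stand-ins for the thesis's meta-path-derived constraints, not its actual algorithm:

```python
# Branch-and-bound style enumeration of feature patterns, pruning branches
# that violate a domain constraint. The constraint here (all features of a
# pattern must come from one group) is anti-monotone: once violated, no
# extension of the pattern can satisfy it, so the whole branch is cut.

items = ["temp", "pressure", "speed", "torque"]
group = {"temp": "thermal", "pressure": "thermal",
         "speed": "drive", "torque": "drive"}

def allowed(pattern):
    # Illustrative domain constraint derived from feature groupings.
    return len({group[i] for i in pattern}) <= 1

def enumerate_patterns(prefix=(), start=0, out=None):
    if out is None:
        out = []
    for i in range(start, len(items)):
        candidate = prefix + (items[i],)
        if not allowed(candidate):
            continue  # prune this branch: no extension can become valid
        out.append(candidate)
        enumerate_patterns(candidate, i + 1, out)
    return out

patterns = enumerate_patterns()
print(patterns)
```

Only 6 of the 15 non-empty feature subsets survive the constraint, and the pruned branches are never materialized, which is what makes such enumeration tractable on larger feature sets.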
Lastly, we discuss maintenance of facts in large-scale industrial knowledge graphs, focused on latent variable models for the automated population and completion of missing facts. State-of-the-art approaches cannot deal with time-series data in the form of events, which naturally occur in industrial applications. Therefore, we present an extension of knowledge graph embedding learning that incorporates data in the form of event logs. Finally, we design several use case scenarios of missing information and evaluate our embedding approach on data coming from a real-world factory environment.
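The completion task behind such embedding models can be illustrated with a TransE-style scoring function, one standard knowledge graph embedding approach (the thesis's concrete model and its event-log extension are not reproduced here; entities, relation, and embedding values below are toy assumptions):

```python
# TransE-style scoring of candidate facts: a triple (h, r, t) is plausible
# when head embedding + relation embedding lands close to the tail embedding.
# Embedding values are hand-picked toy numbers; a real model learns them
# from the observed graph (and, per the thesis, from event logs).

def score(h, r, t, emb):
    # Negative L1 distance ||h + r - t||; higher means more plausible.
    return -sum(abs(hv + rv - tv)
                for hv, rv, tv in zip(emb[h], emb[r], emb[t]))

emb = {
    "motor_7":     [0.0, 1.0],
    "line_3":      [1.0, 1.0],
    "line_9":      [3.0, 0.0],
    "installedIn": [1.0, 0.0],   # relation vector
}

# Rank candidate tails for the completion query (motor_7, installedIn, ?).
candidates = ["line_3", "line_9"]
ranked = sorted(candidates,
                key=lambda t: score("motor_7", "installedIn", t, emb),
                reverse=True)
print(ranked)
```

Ranking candidate tails for a query with a missing object is exactly the "population of missing facts" setting: the top-ranked candidates become recommendations to operators.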
We draw the conclusion that industrial knowledge graphs are a powerful tool that can be used by end-users in the manufacturing domain for data modeling and model validation.
They are especially suited to facilitating the application of statistical models in conjunction with background domain knowledge by providing information about features upfront. Furthermore, relational learning approaches showed great potential to semi-automatically infer missing facts and to provide recommendations to production operators on how to keep stored facts in sync with the real world.
Graph Kernels and Applications in Bioinformatics
In recent years, machine learning has emerged as an important discipline. However, despite the popularity of machine learning techniques, data in the form of discrete structures are not fully exploited. For example, when data appear as graphs, the common choice is the transformation of such structures into feature vectors. This procedure, though convenient, does not always effectively capture topological relationships inherent to the data; therefore, the power of the learning process may be insufficient. In this context, the use of kernel functions for graphs arises as an attractive way to deal with such structured objects.
On the other hand, several entities in computational biology applications, such as gene products or proteins, may be naturally represented by graphs. Hence, the demanding need for algorithms that can deal with structured data poses the question of whether the use of kernels for graphs can outperform existing methods to solve specific computational biology problems. In this dissertation, we address the challenges involved in solving two specific problems in computational biology, in which the data are represented by graphs.
First, we propose a novel approach for protein function prediction by modeling proteins as graphs. For each of the vertices in a protein graph, we propose the calculation of evolutionary profiles, which are derived from multiple sequence alignments from the amino acid residues within each vertex. We then use a shortest path graph kernel in conjunction with a support vector machine to predict protein function. We evaluate our approach under two instances of protein function prediction, namely, the discrimination of proteins as enzymes, and the recognition of DNA binding proteins. In both cases, our proposed approach achieves better prediction performance than existing methods.
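The shortest-path kernel idea can be sketched in an unlabeled form: each graph is summarized by the multiset of shortest-path lengths between all vertex pairs, and two graphs are compared by counting matching lengths. The protein graphs in the dissertation additionally carry vertex information (evolutionary profiles), which this toy sketch omits:

```python
# A minimal (unlabeled) shortest-path graph kernel. Graphs are adjacency
# dicts; BFS gives all shortest-path lengths since edges are unweighted.
from collections import deque, Counter

def bfs_lengths(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def sp_histogram(adj):
    # Histogram of shortest-path lengths over all ordered vertex pairs.
    hist = Counter()
    for u in adj:
        for v, d in bfs_lengths(adj, u).items():
            if d > 0:
                hist[d] += 1
    return hist

def sp_kernel(adj1, adj2):
    # Count pairs of paths with matching length across the two graphs.
    h1, h2 = sp_histogram(adj1), sp_histogram(adj2)
    return sum(h1[d] * h2[d] for d in h1)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(sp_kernel(triangle, path))
```

Because this function is symmetric and positive semi-definite over the path-length feature map, its Gram matrix can be fed directly to an SVM with a precomputed kernel, which is the coupling used in the dissertation's classification experiments.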
Second, we propose two novel semantic similarity measures for proteins based on the gene ontology. The first measure directly works on the gene ontology by combining the pairwise semantic similarity scores between sets of annotating terms for a pair of input proteins. The second measure estimates protein semantic similarity using a shortest path graph kernel to take advantage of the rich semantic knowledge contained within ontologies. Our comparison with other methods shows that our proposed semantic similarity measures are highly competitive and the latter one outperforms state-of-the-art methods. Furthermore, our two methods are intrinsic to the gene ontology, in the sense that they do not rely on external sources to calculate similarities.
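One common way to combine pairwise term scores into a protein-level similarity is a best-match average; the sketch below uses that aggregation with an invented term-level similarity table, since the dissertation's exact combination rule and GO-based term measure are not reproduced here:

```python
# Best-match average: each annotating term of protein A is matched to its
# most similar term on protein B (and vice versa), then the best-match
# scores are averaged. Term IDs and similarity values are toy stand-ins
# for a GO-based term-level measure.

def best_match_average(terms_a, terms_b, term_sim):
    best_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))

# Symmetric toy similarity table keyed by unordered term pairs.
sim_table = {
    frozenset({"GO:1"}): 1.0,
    frozenset({"GO:2"}): 1.0,
    frozenset({"GO:3"}): 1.0,
    frozenset({"GO:1", "GO:2"}): 0.6,
    frozenset({"GO:1", "GO:3"}): 0.2,
    frozenset({"GO:2", "GO:3"}): 0.5,
}
term_sim = lambda a, b: sim_table[frozenset({a, b})]

protein_p = ["GO:1", "GO:2"]
protein_q = ["GO:2", "GO:3"]
sim_pq = best_match_average(protein_p, protein_q, term_sim)
print(sim_pq)
```

Averaging in both directions keeps the measure symmetric and prevents a protein with many annotations from dominating the score.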
Targeting occupant feedback using digital twins: Adaptive spatial-temporal thermal preference sampling to optimize personal comfort models
Collecting intensive longitudinal thermal preference data from building occupants is emerging as an innovative means of characterizing the performance of buildings and the people who use them. These techniques have occupants give subjective feedback using smartphones or smartwatches frequently over the course of days or weeks. The intention is that the data will be collected with high spatial and temporal diversity to best characterize a building and the occupants' preferences. In reality, however, leaving occupants to respond in an ad hoc or fixed-interval way creates unneeded survey fatigue and redundant data. This paper outlines a scenario-based (virtual experiment) method for optimizing data sampling using a smartwatch to achieve comparable accuracy in a personal thermal preference model with fewer data. This method uses BIM-extracted spatial data and Graph Neural Network-based (GNN) modeling to find regions of similar comfort preference and to identify the best scenarios for triggering the occupant to give feedback. The method is compared against two baseline scenarios, conventional zoning and a generic 4x4 square meter grid, using two field-based data sets. The results show that the proposed Build2Vec method has an 18-23% higher overall sampling quality than the spaces-based and square-grid-based sampling methods. The Build2Vec method also performs similarly to the baselines when redundant occupant feedback points are removed, but with better scalability potential.
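The sampling idea of triggering feedback where it adds the most information can be sketched with a greedy diversity heuristic over region embeddings. The 2-D "embeddings", region names, and farthest-point rule below are toy assumptions, not the Build2Vec method itself:

```python
# Greedy diversity-driven sampling: request the next occupant feedback in
# the region whose embedding is farthest from all regions already sampled,
# so nearly redundant regions are visited last. The 2-D coordinates stand
# in for learned spatial/comfort embeddings.
import math

regions = {
    "desk_A":  (0.0, 0.0),
    "desk_B":  (0.1, 0.1),   # nearly redundant with desk_A
    "meeting": (3.0, 0.0),
    "kitchen": (0.0, 4.0),
}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def select_sampling_order(regions, k, start):
    """Farthest-point (max-min) selection of k regions, beginning at start."""
    chosen = [start]
    while len(chosen) < k:
        best = max(
            (r for r in regions if r not in chosen),
            key=lambda r: min(dist(regions[r], regions[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen

order = select_sampling_order(regions, 3, "desk_A")
print(order)
```

With a budget of three prompts, the near-duplicate region is the one skipped, which is the redundancy-reduction effect the paper's sampling optimization targets.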
Interaction-Aware Personalized Vehicle Trajectory Prediction Using Temporal Graph Neural Networks
Accurate prediction of vehicle trajectories is vital for advanced driver assistance systems and autonomous vehicles. Existing methods mainly rely on generic trajectory predictions derived from large datasets, overlooking the personalized driving patterns of individual drivers. To address this gap, we propose an approach for interaction-aware personalized vehicle trajectory prediction that incorporates temporal graph neural networks. Our method utilizes Graph Convolutional Networks (GCN) and Long Short-Term Memory (LSTM) to model the spatio-temporal interactions between target vehicles and their surrounding traffic. To personalize the predictions, we establish a pipeline that leverages transfer learning: the model is initially pre-trained on a large-scale trajectory dataset and then fine-tuned for each driver using their specific driving data. We employ human-in-the-loop simulation to collect personalized naturalistic driving trajectories and corresponding surrounding vehicle trajectories. Experimental results demonstrate the superior performance of our personalized GCN-LSTM model, particularly for longer prediction horizons, compared to its generic counterpart. Moreover, the personalized model outperforms individual models created without pre-training, emphasizing the significance of pre-training on a large dataset to avoid overfitting. By incorporating personalization, our approach enhances trajectory prediction accuracy.
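The pre-train/fine-tune pipeline can be shown in miniature with a one-parameter model: fit on plentiful "generic" data first, then continue training from those weights on a small "driver-specific" set. This only sketches the transfer-learning workflow, not the GCN-LSTM architecture; all data values are invented:

```python
# Transfer learning in miniature: a one-parameter linear model y = w * x
# is pre-trained on generic data, then fine-tuned on a few driver samples.

def sgd(w, data, lr, epochs):
    """Plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# Generic driving data follows y = 2x; the individual driver follows y = 2.5x.
generic = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
driver = [(1.0, 2.5), (2.0, 5.0)]

w = sgd(0.0, generic, lr=0.05, epochs=200)        # pre-training
w_personal = sgd(w, driver, lr=0.05, epochs=50)   # fine-tuning

print(round(w, 2), round(w_personal, 2))
```

Starting the fine-tuning from the pre-trained weight rather than from scratch is what lets the personalized model exploit the two driver samples without overfitting them, mirroring the paper's observation about pre-training on a large dataset.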
Proceedings of the 2020 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory
In 2020, the annual joint workshop of Fraunhofer IOSB and the Lehrstuhl für Interaktive Echtzeitsysteme (Chair of Interactive Real-Time Systems) took place. From July 27 to 31, the doctoral students of both institutions presented the status of their research on topics such as AI, machine learning, computer vision, usage control, and metrology. The results of these presentations are collected in this volume as technical reports.
- …