496 research outputs found
Machine Learning Algorithm for the Scansion of Old Saxon Poetry
Several scholars have designed tools for the automatic scansion of poetry in many languages, but none of these tools deal with Old Saxon or Old English. This project is a first attempt to create such a tool for these languages. We implemented a Bidirectional Long Short-Term Memory (BiLSTM) model to perform the automatic scansion of Old Saxon and Old English poems. Since this model uses supervised learning, we manually annotated the Heliand manuscript and used the resulting corpus as the labeled dataset to train the model. In the evaluation, the algorithm reached 97% accuracy and a 99% weighted average for precision, recall, and F1 score. In addition, we tested the model on verses from the Old Saxon Genesis and from The Battle of Brunanburh, and we observed that the model predicted almost all Old Saxon metrical patterns correctly but misclassified the majority of the Old English input verses.
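The abstract gives no implementation details, but a supervised BiLSTM scansion model is essentially a sequence tagger: each syllable is encoded as a feature vector and assigned a metrical label. The stdlib-only sketch below illustrates the forward pass of such a bidirectional LSTM tagger with hypothetical syllable features, invented labels, and untrained random weights; it shows the architecture only and is not the authors' actual model.

```python
import math
import random

random.seed(0)

def matvec(W, x):
    """Matrix-vector product over plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

class LSTMCell:
    """Minimal LSTM cell; one weight matrix per gate, acting on [x; h]."""
    def __init__(self, in_dim, hid_dim):
        self.hid = hid_dim
        def rand_w():
            return [[random.uniform(-0.5, 0.5) for _ in range(in_dim + hid_dim)]
                    for _ in range(hid_dim)]
        self.Wi, self.Wf, self.Wo, self.Wg = rand_w(), rand_w(), rand_w(), rand_w()

    def step(self, x, h, c):
        z = x + h                                       # concatenate input and state
        i = sigmoid(matvec(self.Wi, z))                 # input gate
        f = sigmoid(matvec(self.Wf, z))                 # forget gate
        o = sigmoid(matvec(self.Wo, z))                 # output gate
        g = [math.tanh(v) for v in matvec(self.Wg, z)]  # candidate cell values
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

def run_direction(cell, xs):
    """Run the cell over a sequence, collecting hidden states."""
    h, c = [0.0] * cell.hid, [0.0] * cell.hid
    states = []
    for x in xs:
        h, c = cell.step(x, h, c)
        states.append(h)
    return states

def bilstm_tag(xs, fwd, bwd, W_out, labels):
    """Concatenate forward and backward states, then score each position."""
    forward = run_direction(fwd, xs)
    backward = list(reversed(run_direction(bwd, list(reversed(xs)))))
    tags = []
    for f, b in zip(forward, backward):
        scores = matvec(W_out, f + b)
        tags.append(labels[scores.index(max(scores))])
    return tags

# Hypothetical per-syllable features: [is_long_vowel, alliterates, word_initial]
verse = [[1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 0, 0]]
labels = ["lift", "drop"]            # simplified metrical positions
HID = 4
fwd, bwd = LSTMCell(3, HID), LSTMCell(3, HID)
W_out = [[random.uniform(-0.5, 0.5) for _ in range(2 * HID)] for _ in labels]
tags = bilstm_tag(verse, fwd, bwd, W_out, labels)
```

In practice the gate weights and output layer would be trained on the annotated Heliand corpus rather than initialized randomly; the point here is only the bidirectional pass and per-position labeling.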
Continuous Rationale Management
Continuous Software Engineering (CSE) is a software life cycle model open to frequent changes in requirements or technology. During CSE, software developers continuously make decisions on the requirements and design of the software or the development process. They establish essential decision knowledge, which they need to document and share so that it supports the evolution and changes of the software. The management of decision knowledge is called rationale management. Rationale management provides an opportunity to support the change process during CSE.
However, rationale management is not well integrated into CSE. The overall goal of this dissertation is to provide workflows and tool support for continuous rationale management. The dissertation contributes an interview study with industry practitioners, which investigates rationale management problems, current practices, and features that practitioners consider beneficial for continuous rationale management. Problems of rationale management in practice are threefold:
First, documenting decision knowledge intrudes on the development process and requires additional effort.
Second, the large amount of distributed decision knowledge documentation is difficult to access and use.
Third, the documented knowledge can be of low quality, e.g., outdated, which impedes its use.
The dissertation contributes a systematic mapping study on recommendation and classification approaches to treat the rationale management problems.
The major contribution of this dissertation is a validated approach for continuous rationale management consisting of the ConRat life cycle model extension and the comprehensive ConDec tool support. To reduce intrusiveness and additional effort, ConRat integrates rationale management activities into existing workflows, such as requirements elicitation, development, and meetings. ConDec integrates into standard development tools instead of providing a separate tool. ConDec enables lightweight capturing and use of decision knowledge from various artifacts and reduces the developers' effort through automatic text classification, recommendation, and nudging mechanisms for rationale management. To enable access and use of distributed decision knowledge documentation, ConRat defines a knowledge model of decision knowledge and other artifacts. ConDec instantiates the model as a knowledge graph and offers interactive knowledge views with useful tailoring, e.g., transitive linking. To operationalize high quality, ConRat introduces the rationale backlog, the definition of done for knowledge documentation, and metrics for intra-rationale completeness and decision coverage of requirements and code. ConDec implements these agile concepts for rationale management and a knowledge dashboard. ConDec also supports consistent changes through change impact analysis.
The dissertation shows the feasibility, effectiveness, and user acceptance of ConRat and ConDec in six case study projects in an industrial setting. In addition, it comprehensively analyses the rationale documentation created in the projects. The validation indicates that ConRat and ConDec benefit CSE projects. Based on the dissertation, continuous rationale management should become a standard part of CSE, like automated testing or continuous integration.
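The abstract states that ConDec instantiates decision knowledge and other artifacts as a knowledge graph and computes metrics such as decision coverage of requirements. As a rough, hypothetical illustration of how such a metric can be defined (not ConDec's actual implementation), the sketch below models artifacts as typed graph nodes and counts a requirement as covered if a documented decision is reachable within a maximum link distance:

```python
from collections import deque

# Hypothetical knowledge graph: node -> (artifact type, linked nodes)
graph = {
    "REQ-1":  ("requirement", ["CODE-1"]),
    "REQ-2":  ("requirement", []),
    "CODE-1": ("code", ["REQ-1", "DEC-1"]),
    "DEC-1":  ("decision", ["CODE-1"]),
}

def decision_coverage(graph, max_distance=2):
    """Fraction of requirements with a decision within max_distance links (BFS)."""
    requirements = [n for n, (t, _) in graph.items() if t == "requirement"]
    covered = 0
    for req in requirements:
        queue = deque([(req, 0)])
        seen = {req}
        while queue:
            node, dist = queue.popleft()
            if graph[node][0] == "decision":
                covered += 1
                break
            if dist < max_distance:
                for nxt in graph[node][1]:
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
    return covered / len(requirements)

coverage = decision_coverage(graph)  # REQ-1 reaches DEC-1 in 2 links; REQ-2 does not
```

An uncovered requirement like `REQ-2` is exactly the kind of item that would surface in a rationale backlog for follow-up documentation.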
3D Design Review Systems in Immersive Environments
Design reviews play a crucial role in the development process, ensuring the quality and effectiveness of designs in various industries. However, traditional design review methods face challenges in effectively understanding and communicating complex 3D models. Immersive technologies, particularly Head-Mounted Displays (HMDs), offer new opportunities to enhance the design review process. In this thesis, we investigate the use of immersive environments, specifically HMDs, for 3D design reviews. We begin with a systematic literature review to understand the current state of employing HMDs in industry for design reviews. As part of this review, we utilize a detailed taxonomy from the literature to categorize and analyze existing approaches. Additionally, we present four iterations of an immersive design review system developed during the author's industry experience. Two of these iterations are evaluated through case studies involving domain experts, including engineers, designers, and clients. A formal semi-structured focus group is conducted to gain further insights into traditional design review practices. The outcomes of these evaluations and the focus group discussions are thoroughly discussed. Based on the literature review and the focus group findings, we uncover a new challenge associated with using HMDs in immersive design reviews: asynchronous and remote collaboration. Unlike traditional design reviews, where participants view the same section on a shared screen, HMDs allow independent exploration of areas of interest, leading to a shift from synchronous to asynchronous communication. Consequently, important feedback may be missed as the lead designer is disconnected from the users' perspectives. To address this challenge, we collaborate with a domain expert to develop a prototype that utilizes heatmap visualization to display the distribution of 3D gaze data. This prototype enables lead designers to quickly identify reviewed areas and missed regions.
The study incorporates the Design Critique approach and provides valuable insights into different heatmap visualization variants (top-view projection, object-based, and volume-based). Furthermore, a list of well-defined requirements is outlined for future spatio-temporal visualization applications aimed at integration into existing workflows. Overall, this thesis contributes to the understanding and improvement of immersive design review systems, particularly in the context of utilizing HMDs. It offers insights into the current state of employing HMDs for design reviews, utilizes a taxonomy from the literature to analyze existing approaches, highlights challenges associated with asynchronous collaboration, and proposes a prototype solution with heatmap visualization to address the identified challenge.
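The prototype's implementation is not described in code, but the core of the top-view-projection heatmap variant can be sketched as binning 3D gaze hit points into a 2D grid after dropping the vertical axis. The stdlib-only sketch below uses made-up coordinates and grid parameters, purely to illustrate the idea:

```python
def top_view_heatmap(gaze_points, bounds, grid_size):
    """Project 3D gaze hits (x, y, z) onto the ground plane (x, z) and bin them.

    bounds = (min_x, max_x, min_z, max_z); returns grid[row][col] hit counts.
    """
    min_x, max_x, min_z, max_z = bounds
    rows, cols = grid_size
    grid = [[0] * cols for _ in range(rows)]
    for x, _y, z in gaze_points:          # the vertical axis y is dropped
        col = min(int((x - min_x) / (max_x - min_x) * cols), cols - 1)
        row = min(int((z - min_z) / (max_z - min_z) * rows), rows - 1)
        grid[row][col] += 1
    return grid

# Hypothetical gaze samples from an HMD session (metres)
samples = [(0.2, 1.6, 0.3), (0.25, 1.5, 0.35), (1.8, 1.7, 1.9), (0.21, 1.6, 0.31)]
heat = top_view_heatmap(samples, bounds=(0.0, 2.0, 0.0, 2.0), grid_size=(4, 4))
```

A lead designer could then threshold such a grid: cells with high counts are heavily reviewed areas, while zero-count cells mark missed regions.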
The text classification pipeline: Starting shallow, going deeper
Text Classification (TC) is an increasingly relevant and crucial subfield of Natural Language Processing (NLP), tackled in this PhD thesis from a computer science and engineering perspective. In this field, too, the exceptional success of deep learning has sparked a boom over the past ten years. Text retrieval and categorization, information extraction, and summarization all rely heavily on TC. The literature has presented numerous datasets, models, and evaluation criteria. Although languages such as Arabic, Chinese, and Hindi are employed in several works, from a computer science perspective the most widely used and cited language in the TC literature is English, and it is also the language mainly referenced in the rest of this thesis. Even though numerous machine learning techniques have shown outstanding results, a classifier's effectiveness depends on its capability to comprehend intricate relations and non-linear correlations in texts. To achieve this level of understanding, attention must be paid not only to the architecture of a model but also to the other stages of the TC pipeline. Within NLP, a range of text representation techniques and model designs have emerged, including large language models, which can turn massive amounts of text into vector representations that effectively capture semantically significant information. Of crucial interest is the fact that this field has been investigated by numerous communities, including data mining, linguistics, and information retrieval; these communities frequently overlap but mostly conduct their research separately. Bringing researchers from these groups together to improve the multidisciplinary comprehension of the field is one of the objectives of this dissertation, which also examines text mining from both a traditional and a modern perspective.
This thesis covers the whole TC pipeline in detail; its main contribution is to investigate how every element of the pipeline affects the final performance of a TC model. The pipeline is discussed end to end, covering both traditional and recent deep learning-based models: the State-Of-The-Art (SOTA) benchmark datasets used in the literature, text preprocessing, text representation, machine learning models for TC, evaluation metrics, and current SOTA results. In each chapter of this dissertation, I go over one of these steps, covering both the technical advancements and my most significant recent findings from experiments and novel models. The advantages and disadvantages of the various options are listed, along with a thorough comparison of the approaches. Each chapter closes with my contributions: experimental evaluations and discussions of the results I obtained during my three-year PhD course. These experiments and analyses, one per element of the TC pipeline, are the main contributions I provide, extending the basic knowledge of a regular survey on TC.
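As a concrete anchor for the "shallow" end of the pipeline described above, the stdlib-only sketch below strings together the classic stages (tokenization, TF-IDF representation, a nearest-centroid classifier with cosine similarity) on a toy corpus. It is a generic illustration of the pipeline, not one of the thesis's models; the corpus and labels are invented.

```python
import math
from collections import Counter

def tokenize(text):
    """Preprocessing stage: lowercase and keep alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]

class TfidfNearestCentroid:
    """Representation (TF-IDF) + model (cosine to per-class centroids)."""
    def fit(self, docs, labels):
        tokenized = [tokenize(d) for d in docs]
        self.vocab = sorted({t for toks in tokenized for t in toks})
        n = len(docs)
        df = Counter(t for toks in tokenized for t in set(toks))
        self.idf = {t: math.log(n / df[t]) for t in self.vocab}
        vectors = [self._vectorize(toks) for toks in tokenized]
        self.centroids = {}
        for label in set(labels):
            members = [v for v, l in zip(vectors, labels) if l == label]
            self.centroids[label] = [sum(col) / len(members)
                                     for col in zip(*members)]
        return self

    def _vectorize(self, tokens):
        tf = Counter(tokens)
        return [tf[t] * self.idf[t] for t in self.vocab]

    def predict(self, text):
        v = self._vectorize([t for t in tokenize(text) if t in self.idf])
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a)) or 1.0
            nb = math.sqrt(sum(x * x for x in b)) or 1.0
            return dot / (na * nb)
        return max(self.centroids, key=lambda l: cosine(v, self.centroids[l]))

docs = ["the team scored a late goal", "the striker joined the team",
        "parliament passed the vote", "the election vote was counted"]
labels = ["sport", "sport", "politics", "politics"]
clf = TfidfNearestCentroid().fit(docs, labels)
prediction = clf.predict("a goal for the team")
```

A deep model would replace the TF-IDF vectors with learned embeddings and the centroid rule with a trained network, but the surrounding pipeline stages (datasets, preprocessing, evaluation) stay the same, which is precisely the thesis's point.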
Education in the Digital Transformation
The coronavirus pandemic and the temporary shift from in-person to remote teaching it forced have greatly accelerated the digitalization of education. Both the positive and the negative aspects of this development became even more apparent than before. While universities managed the transition with comparatively little friction, the difficulties were far more evident at schools. Despite all adversities, one thing seems clear: the temporary changes will have lasting effects. A complete return to the status quo ante is hardly conceivable anymore. Against this background, two questions define the dual character of the theme of the 29th annual conference of the Gesellschaft für Medien in der Wissenschaft (GMW). First: how does education 'function' in the digital transformation currently under way, and what challenges arise? And second: is education itself possibly undergoing a transformation? This conference volume brings together contributions on these and further questions.
Recent Developments in Recommender Systems: A Survey
In this technical survey, we comprehensively summarize the latest
advancements in the field of recommender systems. The objective of this study
is to provide an overview of the current state-of-the-art in the field and
highlight the latest trends in the development of recommender systems. The
study starts with a comprehensive summary of the main taxonomy of recommender
systems, including personalized and group recommender systems, and then delves
into the category of knowledge-based recommender systems. In addition, the
survey analyzes the robustness, data bias, and fairness issues in recommender
systems, summarizing the evaluation metrics used to assess the performance of
these systems. Finally, the study provides insights into the latest trends in
the development of recommender systems and highlights the new directions for
future research in the field.
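To make the personalized branch of the taxonomy concrete, here is a minimal, generic item-based collaborative filtering sketch (cosine similarity over co-rating users); the users, items, and ratings are invented for illustration and do not come from the survey:

```python
import math

# Hypothetical user -> {item: rating} data
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob":   {"A": 4, "B": 5, "C": 5},
    "carol": {"A": 1, "C": 2},
}

def item_similarity(i, j):
    """Cosine similarity between two items over users who rated both."""
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    ni = math.sqrt(sum(ratings[u][i] ** 2 for u in common))
    nj = math.sqrt(sum(ratings[u][j] ** 2 for u in common))
    return dot / (ni * nj)

def recommend(user):
    """Score each unrated item by similarity-weighted ratings of rated items."""
    rated = ratings[user]
    items = {i for r in ratings.values() for i in r}
    scores = {}
    for candidate in items - rated.keys():
        scores[candidate] = sum(item_similarity(candidate, i) * r
                                for i, r in rated.items())
    return max(scores, key=scores.get) if scores else None

best = recommend("alice")   # C is alice's only unrated item
```

Knowledge-based recommenders, discussed in the survey, would instead match explicit item attributes against user requirements; the scoring loop above is where that constraint logic would go.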
Geographic information extraction from texts
A large volume of unstructured text containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although significant progress has been made in geographic information extraction from texts, unsolved challenges and issues remain, ranging from methods, systems, and data to applications and privacy. This workshop therefore provides a timely opportunity to discuss recent advances, new ideas, and concepts, and to identify research gaps in geographic information extraction.
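One of the simplest techniques in this space, gazetteer-based toponym recognition, can be sketched in a few lines. The place list and sentence below are invented for illustration; real systems add disambiguation and geocoding on top of this matching step.

```python
# Hypothetical gazetteer of known place names (lowercased)
GAZETTEER = {"new york", "paris", "rio de janeiro"}
MAX_WORDS = max(len(name.split()) for name in GAZETTEER)

def extract_toponyms(text):
    """Greedy longest-match lookup of token n-grams against the gazetteer."""
    tokens = [t.strip(".,;:!?") for t in text.split()]
    found, i = [], 0
    while i < len(tokens):
        # try the longest candidate span first, then shrink
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            span = tokens[i:i + n]
            if " ".join(span).lower() in GAZETTEER:
                found.append(" ".join(span))
                i += n
                break
        else:
            i += 1
    return found

places = extract_toponyms("She flew from New York to Paris last week.")
```

This purely lexical approach illustrates both the method and its open issues: it cannot decide which "Paris" is meant, which is exactly the kind of unresolved challenge the workshop targets.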
A Reinforcement Learning-based Framework for Proactive Supply Chain Risk Identification
Over the past few decades, global supply chains (GSCs) have expanded significantly with the widespread adoption of digital technologies and improved trade policies. A GSC is a network of organisations or individuals across the world involved in producing and delivering goods and services to customers. While this globalisation and the use of global technologies have increased the efficiency of supply chain operations, they have also exposed those operations to various additional uncertainties and risk types that can negatively impact them. Thus, for GSCs to function properly, such uncertainties must be managed, and supply chain risk management is therefore critical to the smooth operation of GSCs. The first task in supply chain risk management is risk identification, where risk managers identify the risk events that may negatively impact their operations for further analysis. It is crucial that risk identification is undertaken in a timely manner so that risk managers can be proactive in managing the possible impacts of the identified risks. This task can be done manually, which is tedious and time-consuming; however, with the increased sophistication and capability of artificial intelligence (AI), AI algorithms have the potential to enhance the efficacy and efficiency of this task.
A review of the existing literature detailed in this thesis highlights that while AI has been widely employed in different disciplines, it has shortcomings specific to risk identification in supply chains. In other words, the majority of existing risk identification techniques in supply chain risk management are either reactive or predictive in nature: they either identify risk events after they occur or predict future occurrences of known risk events based on their past patterns of occurrence. However, as emphasised in this thesis, for the supply chain risk identification process to be effective and comprehensive, it has to be proactive rather than reactive or predictive. A proactive risk identification technique aims to identify in advance known or unknown risk events that have the potential to occur and negatively impact an activity. The resulting analysis assists the risk manager in performing risk analysis and risk evaluation on the identified risks before developing plans to manage them. Existing literature on supply chain risk identification lacks techniques to achieve this aim.
To address this gap in the literature, this thesis develops a framework, namely Reinforcement Learning-based Supply Chain Risk Identification, which assists risk managers in automatically and accurately identifying risk events that have the potential to impact their operations and brings these events to their attention for further follow-up. Adopting a science and engineering research approach, four prototype frameworks are developed that identify the risk events of interest to the risk manager, extract related news articles on these risk events, and analyse them before recommending the most important articles to the risk manager for follow-up actions. The functionality and viability of these prototypes are validated by experiments and demonstrated through a supply chain case study to highlight their effectiveness.
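The thesis's actual reinforcement learning formulation is not given in the abstract, so as a loose, hypothetical illustration of why RL fits proactive monitoring, the sketch below casts the choice of which news source to scan as a multi-armed bandit: an epsilon-greedy agent learns from simulated scan outcomes which source most often yields risk-relevant articles. All source names and probabilities are invented.

```python
import random

random.seed(42)

# Hypothetical probability that one scan of each news source yields a relevant article
SOURCES = {"trade-news": 0.8, "weather-feed": 0.3, "local-press": 0.1}

def epsilon_greedy_monitor(steps=2000, epsilon=0.1):
    """Learn per-source value estimates from simulated scan rewards."""
    names = list(SOURCES)
    counts = {s: 0 for s in names}      # how often each source was scanned
    values = {s: 0.0 for s in names}    # running mean reward per source
    for _ in range(steps):
        if random.random() < epsilon:                  # explore a random source
            source = random.choice(names)
        else:                                          # exploit the best estimate
            source = max(names, key=lambda s: values[s])
        # reward 1 if the scan surfaced a risk-relevant article, else 0
        reward = 1.0 if random.random() < SOURCES[source] else 0.0
        counts[source] += 1
        values[source] += (reward - values[source]) / counts[source]
    return counts, values

counts, values = epsilon_greedy_monitor()
```

The exploration term is what makes the behaviour proactive rather than purely predictive: the agent keeps probing low-yield sources, so a previously unknown risk channel that starts producing relevant articles would eventually be picked up.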