webLyzard technology gmbh
102 research outputs found
Hybrid AI Models for Structured Mobility Prediction in Metropolitan Areas
This paper introduces hybrid AI models for structured mobility prediction in metropolitan areas, focusing on Vienna, to guide citizens toward greener transportation options. The AI-CENTIVE project explores how AI can identify effective incentives by forecasting future trips using a combination of traditional machine learning and modern deep learning architectures. Trained on a dataset of commuter trips from the Ummadum app, the models predict transport mode, time, origin, destination, distance, and duration. The most accurate predictions trigger notifications suggesting sustainable alternatives. The evaluation of various hybrid architectures revealed that a graph convolutional network that uses statistical patterns achieved the best performance on the analyzed dataset. The presented research contributes to leveraging AI to promote sustainable mobility through targeted incentivization.
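A minimal sketch of what such a hybrid architecture could look like is given below, assuming PyTorch and a transport-mode classification target; the zone graph, layer sizes, statistical feature vector, and five transport modes are illustrative assumptions, not the AI-CENTIVE implementation.

```python
# Hypothetical hybrid model: one graph convolution over a zone graph combined
# with a statistical trip-feature vector, classifying the transport mode.
import torch
import torch.nn as nn

class HybridTripPredictor(nn.Module):
    def __init__(self, n_zones, zone_feat_dim, stat_feat_dim, n_modes=5, hidden=64):
        super().__init__()
        self.gcn = nn.Linear(zone_feat_dim, hidden)           # one GCN layer: A_hat @ X @ W
        self.stat_mlp = nn.Sequential(nn.Linear(stat_feat_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(2 * hidden, n_modes)      # predicts the transport mode

    def forward(self, zone_feats, adj_norm, origin_idx, stat_feats):
        # adj_norm: normalized adjacency matrix of the zone graph (n_zones x n_zones)
        zone_emb = torch.relu(self.gcn(adj_norm @ zone_feats))
        origin_emb = zone_emb[origin_idx]                      # embedding of each trip's origin zone
        stat_emb = self.stat_mlp(stat_feats)                   # e.g. historical mode shares, hour of day
        return self.classifier(torch.cat([origin_emb, stat_emb], dim=-1))

# Toy usage with random data: 10 zones, 2 trips
model = HybridTripPredictor(n_zones=10, zone_feat_dim=8, stat_feat_dim=6)
logits = model(torch.randn(10, 8), torch.eye(10), torch.tensor([3, 7]), torch.randn(2, 6))
```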
An Efficient Workflow Towards Improving Classifiers in Low-Resource Settings with Synthetic Data
The correct classification of the 17 Sustainable Development Goals (SDGs) proposed by the United Nations (UN) remains a challenging and compelling prospect due to the Shared Task's imbalanced dataset. This paper presents an efficient method for creating a baseline using RoBERTa and data augmentation that offers good overall performance on this imbalanced dataset. Notably, even though the alignment between synthetic gold and real gold data was only marginally better than what would be expected by chance alone, the final scores remained competitive.
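As a rough illustration of such a baseline, the sketch below fine-tunes RoBERTa with Hugging Face transformers on a training set that mixes real and synthetic examples; the placeholder texts, label ids, and hyperparameters are assumptions, not the Shared Task setup.

```python
# Fine-tune a 17-class RoBERTa classifier on real + synthetic SDG abstracts.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=17)

# Real and LLM-generated synthetic examples merged into one training set (placeholder data).
train = Dataset.from_dict({
    "text": ["Abstract on clean water access ...", "Synthetic abstract on climate action ..."],
    "label": [5, 12],
}).map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length", max_length=256))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sdg-roberta", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train,
)
trainer.train()
```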
Visualizing Large Language Models: A Brief Survey
This paper explores the current landscape of visualizing large language models (LLMs). The main objective is threefold. Firstly, we investigate how LLM-specific techniques such as prompt engineering, instruction tuning, or guidance can be visualized. Secondly, LLM causality, interpretability, and explainability are examined through visualization. Finally, we showcase the role of visualization in illuminating the integration of multiple modalities. We are interested in papers that present visualization systems rather than those that merely use visualization to showcase a part of their work. Our survey aims to synthesize the state of the art in LLM visualization, offering a compact resource for exploring future research avenues.
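As one concrete example of the kind of visualization surveyed here, the sketch below plots a single attention head of a small open model as a token-by-token heatmap; the choice of GPT-2 and of layer and head indices is an arbitrary assumption for illustration.

```python
# Plot one attention head of GPT-2 as a heatmap over the input tokens.
import matplotlib.pyplot as plt
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)

inputs = tokenizer("Visualization helps explain language models", return_tensors="pt")
attentions = model(**inputs).attentions            # tuple: one tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

weights = attentions[0][0, 0].detach().numpy()     # layer 0, batch 0, head 0
plt.imshow(weights, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.xlabel("attended-to token")
plt.ylabel("attending token")
plt.show()
```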
Scouting out the Border: Leveraging Explainable AI to Generate Synthetic Training Data for SDG Classification
This paper discusses the use of synthetic training data for training and optimizing a DistilBERT-based classifier for the SwissText 2024 Shared Task, which focused on the classification of the United Nations' Sustainable Development Goals (SDGs) in scientific abstracts. The proposed approach uses Large Language Models (LLMs) to generate synthetic training data based on the test data provided by the shared task organizers. We then train a classifier on the synthetic dataset, evaluate the system on gold standard data, and use explainable AI to extract problematic features that caused incorrect classifications. Generating synthetic data that demonstrates the use of these problematic features within the correct class helps the system learn from its past mistakes. An evaluation demonstrates that the suggested approach significantly improves classification performance, yielding the best result for Shared Task 1 according to the accuracy metric.
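A condensed sketch of this feedback loop is given below, assuming Hugging Face pipelines and simple leave-one-out token occlusion as a stand-in for the explainability step; the base checkpoint, the prompt, and the helper functions are illustrative assumptions rather than the system described in the paper.

```python
# Identify tokens that drove a wrong prediction, then build a prompt asking an
# LLM to generate new examples that use those tokens in their correct class.
from transformers import pipeline

# A fine-tuned SDG classifier would be loaded here; the base checkpoint is a placeholder.
clf = pipeline("text-classification", model="distilbert-base-uncased", top_k=None)

def misleading_tokens(text, true_label, k=3):
    """Rank tokens by how much removing them lowers the wrongly predicted score."""
    scores = {s["label"]: s["score"] for s in clf([text])[0]}
    predicted = max(scores, key=scores.get)
    if predicted == true_label:
        return []
    tokens = text.split()
    drops = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        new_score = {s["label"]: s["score"] for s in clf([reduced])[0]}[predicted]
        drops.append((scores[predicted] - new_score, tok))
    return [tok for _, tok in sorted(drops, reverse=True)[:k]]

def synthetic_prompt(tokens, true_label):
    # Hypothetical prompt passed to an LLM to generate new training abstracts.
    return (f"Write a scientific abstract clearly about {true_label} "
            f"that naturally uses the terms: {', '.join(tokens)}.")
```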
Large Language Models versus Foundation Models for Assessing the Future-Readiness of Skills
Automation, offshoring and the emerging "gig economy" further accelerate changes in the job market, leading to significant shifts in required skills. As automation and technology continue to advance, new technical proficiencies such as data analysis, artificial intelligence, and machine learning become increasingly valuable. Recent research, for example, estimates that 60% of occupations contain a significant portion of automatable skills.
The "Future of Work" project uses scientific literature, experts and deep learning to estimate the automatability and offshorability of skills, both of which are assumed to impact their future-readiness. This article investigates the performance of two deep learning methods for propagating expert and literature assessments on automatability and offshorability to yet unseen skills: (i) a Large Language Model (ChatGPT) with few-shot learning and a heuristic that maps results to the target variables, and (ii) foundation models (BERT, DistilBERT) trained on a gold standard dataset. An evaluation on expert data provides initial insights into the systems' performance and outlines the strengths and weaknesses of both approaches.
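A stripped-down sketch of approach (i) is shown below: a few-shot prompt and a heuristic that maps the model's free-text answer onto a numeric automatability score. The prompt wording, the rating scale, and the call_llm() placeholder are assumptions; any chat-completion API could fill that role.

```python
# Few-shot prompt plus a heuristic mapping from free text to a numeric score.
FEW_SHOT = """Rate how automatable each skill is (low / medium / high).
Skill: manual data entry -> high
Skill: psychotherapy -> low
Skill: {skill} ->"""

SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion request to ChatGPT or a similar model."""
    raise NotImplementedError

def automatability(skill: str) -> float:
    answer = call_llm(FEW_SHOT.format(skill=skill)).strip().lower()
    # Heuristic mapping: fall back to the middle of the scale on unexpected output.
    return next((v for k, v in SCORE.items() if k in answer), 0.5)
```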
Framing Few-Shot Knowledge Graph Completion with Large Language Models
Knowledge Graph Completion (KGC) from text involves identifying known or unknown entities (nodes) as well as relations (edges) among these entities. Recent work has started to explore the use of Large Language Models (LLMs) for entity detection and relation extraction, due to their Natural Language Understanding (NLU) capabilities. However, LLM performance varies across models and depends on the quality of the prompt engineering. We examine specific relation extraction cases and compile a small corpus of examples collected from well-known resources. We provide a set of annotations and identify various issues that occur when using different LLMs for this task. As LLMs will remain a focal point of future KGC research, we conclude with suggestions for improving the KGC process.
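The sketch below illustrates the kind of prompt-and-parse step involved: an LLM is asked to emit one triple per line, and the reply is parsed into (head, relation, tail) tuples. The prompt format and the pipe-separated reply convention are illustrative assumptions, not the annotation scheme used in the paper.

```python
# Prompt template for relation extraction and a parser for the LLM's reply.
PROMPT = """Extract entity relations from the text below.
Return one triple per line as: head | relation | tail

Text: {text}
Triples:"""

def extract_triples(llm_reply: str):
    triples = []
    for line in llm_reply.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))    # (head entity, relation, tail entity)
    return triples

# Parsing a well-formed reply
reply = "Marie Curie | born in | Warsaw\nMarie Curie | awarded | Nobel Prize in Physics"
print(extract_triples(reply))
```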
Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications
Sentic computing relies on well-defined affective models of different complexity: polarity to distinguish positive and negative sentiment, for example, or more nuanced models to capture expressions of human emotions. When used to measure communication success, even the most granular affective model combined with sophisticated machine learning approaches may not fully capture an organisation's strategic positioning goals. Such goals often deviate from the assumptions of standardised affective models. While certain emotions such as Joy and Trust typically represent desirable brand associations, specific communication goals formulated by marketing professionals often go beyond such standard dimensions. For instance, the brand manager of a television show may consider fear or sadness to be desired emotions for its audience. This article introduces expansion techniques for affective models, combining common and commonsense knowledge available in knowledge graphs with language models and affective reasoning, improving coverage and consistency as well as supporting domain-specific interpretations of emotions. An extensive evaluation compares the performance of different expansion techniques: (i) a quantitative evaluation based on the revisited Hourglass of Emotions model to assess performance on complex models that cover multiple affective categories, using manually compiled gold standard data, and (ii) a qualitative evaluation of a domain-specific affective model for television programme brands. The results of these evaluations demonstrate that the introduced techniques support a variety of embeddings and pre-trained models. The paper concludes with a discussion on applying this approach to other scenarios where affective model resources are scarce.
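One building block of such an expansion can be sketched with embedding similarity alone, assuming sentence-transformers embeddings and a cosine threshold: candidate terms close to the centroid of a seed category are added to it. The model name and threshold are assumptions; the full approach additionally draws on knowledge graphs and affective reasoning.

```python
# Expand a seed affective category with candidates whose embeddings are close
# to the category centroid (cosine similarity above a threshold).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def expand(seed_terms, candidate_terms, threshold=0.5):
    centroid = model.encode(seed_terms).mean(axis=0)
    cand_vecs = model.encode(candidate_terms)
    sims = cand_vecs @ centroid / (
        np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(centroid))
    return [t for t, s in zip(candidate_terms, sims) if s >= threshold]

# e.g. expanding a domain-specific "Joy" category for a TV programme brand
print(expand(["joy", "delight", "excitement"], ["thrill", "boredom", "suspense"]))
```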
Classifying News Media Coverage for Corruption Risks Management with Deep Learning and Web Intelligence
A substantial number of international corporations have been affected by corruption. The research presented in this paper introduces the Integrity Risks Monitor, an analytics dashboard that applies Web Intelligence and Deep Learning to English- and German-language documents for the tasks of (i) tracking and visualizing past corruption management gaps and their respective impacts, (ii) understanding present and past integrity issues, and (iii) supporting companies in analyzing news media to identify and mitigate integrity risks. We then discuss the design, implementation, training and evaluation of classification components capable of identifying English documents covering the integrity topic of corruption. Domain experts created a gold standard dataset compiled from Anglo-American media coverage of corruption cases, which has been used for training and evaluating the classifiers. The experiments performed to evaluate the classifiers draw upon popular text classification algorithms such as Naïve Bayes, Support Vector Machines (SVM) and Deep Learning architectures (LSTM, BiLSTM, CNN) that build on different word embeddings and document representations. They also demonstrate that although classical machine learning approaches such as Naïve Bayes struggle with the diversity of the media coverage on corruption, state-of-the-art Deep Learning models perform sufficiently well in the project's context.
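For the classical baselines, a scikit-learn sketch along these lines would use TF-IDF document representations with Naïve Bayes and a linear SVM; the toy documents and binary labels below are placeholders for the expert-built gold standard.

```python
# TF-IDF + Naïve Bayes / linear SVM baselines for corruption-vs-other coverage.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

texts = ["Executives indicted over bribery scheme ...",
         "Quarterly earnings exceeded analyst expectations ..."]
labels = [1, 0]   # 1 = covers corruption, 0 = other coverage

for clf in (MultinomialNB(), LinearSVC()):
    pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    pipe.fit(texts, labels)
    print(type(clf).__name__, pipe.predict(["prosecutors investigate kickbacks"]))
```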
Introducing Orbis: An Extendable Evaluation Pipeline for Named Entity Linking Drill-Down Analysis
Most current evaluation tools focus solely on benchmarking and comparative evaluations and thus only provide aggregated statistics such as precision, recall and F1-measure to assess overall system performance. They do not offer comprehensive analyses down to the level of individual annotations. This paper introduces Orbis, an extendable evaluation pipeline framework developed to allow visual drill-down analyses of individual entities computed by annotation services, in the context of the text in which they appear and in reference to the entities specified in the gold standard.
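To illustrate the difference from purely aggregated scores, the sketch below matches each predicted entity annotation against the gold standard individually and only then derives precision and recall; the Annotation record and the exact-span matching rule are simplifying assumptions, not Orbis internals.

```python
# Per-annotation comparison against the gold standard, then aggregate metrics.
from collections import namedtuple

Annotation = namedtuple("Annotation", "start end entity_uri")

def drill_down(predicted, gold):
    gold_set = set(gold)
    results = [(ann, "correct" if ann in gold_set else "wrong or spurious")
               for ann in predicted]          # keep per-annotation detail, not just counts
    tp = sum(1 for _, status in results if status == "correct")
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return results, precision, recall

preds = [Annotation(0, 6, "http://dbpedia.org/resource/Vienna")]
gold = [Annotation(0, 6, "http://dbpedia.org/resource/Vienna"),
        Annotation(20, 27, "http://dbpedia.org/resource/Austria")]
print(drill_down(preds, gold))
```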
Improving Named Entity Linking Corpora Quality
Gold standard corpora and competitive evaluations play a key role in benchmarking named entity linking (NEL) performance and driving the development of more sophisticated NEL systems. The quality of the corpora and evaluation metrics used is crucial in this process. We therefore assess the quality of three popular evaluation corpora, identifying four major issues which affect these gold standards: (i) the use of different annotation styles, (ii) incorrect and missing annotations, (iii) Knowledge Base evolution, and (iv) differences in annotating co-occurrences. This paper addresses these issues by formalizing NEL annotations and corpus versioning, which allows standardizing corpus creation, supports corpus evolution, and paves the way for the use of lenses to automatically transform between different corpus configurations. In addition, the use of clearly defined scoring rules and evaluation metrics ensures better comparability of evaluation results.
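The formalization idea can be sketched as explicit annotation records plus a "lens" function that maps one annotation style onto another; the field names and the longest-match lens below are illustrative assumptions.

```python
# Formalized NEL annotations and a simple lens between annotation styles.
from dataclasses import dataclass

@dataclass(frozen=True)
class NELAnnotation:
    start: int
    end: int
    surface_form: str
    kb_uri: str          # target entity in the knowledge base
    kb_version: str      # supports tracking knowledge base evolution

def longest_match_lens(annotations):
    """Lens: keep only the longest annotation among overlapping ones."""
    kept = []
    for ann in sorted(annotations, key=lambda a: a.end - a.start, reverse=True):
        if all(ann.end <= k.start or ann.start >= k.end for k in kept):
            kept.append(ann)
    return sorted(kept, key=lambda a: a.start)
```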