14 research outputs found

    Lessons learned from challenging data science case studies

    In this chapter, we revisit the conclusions and lessons learned from the chapters presented in Part II of this book and analyze them systematically. The goal of the chapter is threefold: first, it serves as a directory to the individual chapters, allowing readers to identify which chapters to focus on when they are interested either in a certain stage of the knowledge discovery process or in a certain data science method or application area. Second, the chapter serves as a digested, systematic summary of data science lessons that are relevant for data science practitioners. Lastly, we reflect on how a broader public perceives the methods and tools covered in this book and venture an outlook on the future developments they will influence.

    Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models

    The recent performance leap of Large Language Models (LLMs) opens up new opportunities across numerous industrial applications and domains. However, erroneous generations, such as false predictions, misinformation, and hallucinations produced by LLMs, have also raised severe concerns about the trustworthiness of LLMs, especially in safety-, security-, and reliability-sensitive scenarios, potentially hindering real-world adoption. While uncertainty estimation has shown its potential for interpreting the prediction risks of general machine learning (ML) models, little is known about whether and to what extent it can help explore an LLM's capabilities and counteract its undesired behavior. To bridge the gap, in this paper we initiate an exploratory study on the risk assessment of LLMs through the lens of uncertainty. In particular, we experiment with twelve uncertainty estimation methods and four LLMs on four prominent natural language processing (NLP) tasks to investigate to what extent uncertainty estimation techniques can help characterize the prediction risks of LLMs. Our findings validate the effectiveness of uncertainty estimation for revealing LLMs' uncertain or non-factual predictions. Beyond general NLP tasks, we also conduct extensive experiments with four LLMs for code generation on two datasets and find that uncertainty estimation can potentially uncover buggy programs generated by LLMs. Insights from our study shed light on the future design and development of reliable LLMs, facilitating further research toward enhancing their trustworthiness.
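    To make the general idea concrete, here is a minimal sketch (not one of the twelve methods evaluated in the paper) of two simple sequence-level uncertainty signals computed from the per-token probabilities an LLM assigns to its own generated output; higher values suggest a riskier generation. The function names and example data are hypothetical.

```python
# Minimal sketch: sequence-level uncertainty from per-token probabilities.
import numpy as np

def mean_token_entropy(token_distributions):
    """Average entropy (nats) of the model's next-token distributions."""
    entropies = []
    for probs in token_distributions:
        p = np.asarray(probs, dtype=float)
        p = p / p.sum()                               # guard against rounding drift
        entropies.append(-np.sum(p * np.log(p + 1e-12)))
    return float(np.mean(entropies))

def mean_negative_log_likelihood(chosen_token_probs):
    """Average negative log-probability of the tokens actually generated."""
    p = np.asarray(chosen_token_probs, dtype=float)
    return float(-np.mean(np.log(p + 1e-12)))

# Hypothetical example: three generation steps over a tiny 4-token vocabulary.
dists = [[0.7, 0.1, 0.1, 0.1], [0.4, 0.3, 0.2, 0.1], [0.9, 0.05, 0.03, 0.02]]
chosen = [0.7, 0.4, 0.9]
print(mean_token_entropy(dists), mean_negative_log_likelihood(chosen))
```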

    Mathematical Modelling and Machine Learning Methods for Bioinformatics and Data Science Applications

    Mathematical modeling is routinely used in the physical and engineering sciences to help understand complex systems and optimize industrial processes. Mathematical modeling differs from Artificial Intelligence in that it does not rely exclusively on collected data to describe an industrial phenomenon or process; instead, it is based on fundamental laws of physics or engineering that lead to systems of equations able to represent all the variables characterizing the process. Conversely, Machine Learning methods require large amounts of data to find solutions, remaining detached from the problem that generated them and trying to infer the behavior of the object, material, or process under examination from observed samples. Mathematics allows us to formulate complex models effectively and creatively, describing nature and physics. Together with the potential of Artificial Intelligence and modern data collection techniques, a new way of dealing with practical problems becomes possible. Inserting equations derived from the physical world into data-driven models can greatly enrich the information content of the sampled data, making it possible to simulate very complex phenomena with drastically reduced computation times. Combined approaches will constitute a breakthrough in cutting-edge applications, providing precise and reliable tools for predicting phenomena in biological macro- and microsystems, for biotechnological applications, and for medical diagnostics, particularly in the field of precision medicine.
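    One common way to insert physical equations into a data-driven model is to add a physics-residual term to the training loss. The sketch below assumes a hypothetical first-order decay law dy/dt = -k*y and a simple exponential model; it illustrates the combined-loss idea only, not the specific models discussed in this work.

```python
# Minimal sketch: data-fitting loss augmented with a physics residual
# so the fitted curve also respects an assumed law, here dy/dt = -k*y.
import numpy as np

def combined_loss(params, t, y_obs, lam=1.0):
    """Data loss + physics-residual loss for the model y = a * exp(-k * t)."""
    a, k = params
    y_pred = a * np.exp(-k * t)
    data_loss = np.mean((y_pred - y_obs) ** 2)         # fit the observed samples
    dydt = np.gradient(y_pred, t)                      # numerical derivative
    physics_residual = dydt + k * y_pred               # ~0 wherever the ODE holds
    physics_loss = np.mean(physics_residual ** 2)
    return data_loss + lam * physics_loss

# Hypothetical noisy measurements of a decaying quantity.
t = np.linspace(0.0, 5.0, 50)
y_obs = 2.0 * np.exp(-0.8 * t) + 0.05 * np.random.randn(t.size)
print(combined_loss((2.0, 0.8), t, y_obs))
```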

    Design patterns for resource-constrained automated deep-learning methods

    We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges, which set the task of optimizing both model accuracy and search efficiency under tight time and computing constraints. We propose structured empirical evaluations as the most promising avenue to obtain design principles for deep-learning systems in the absence of strong theoretical support. From these evaluations, we distill relevant patterns that give rise to neural network design recommendations. In particular, we establish (a) that very wide fully connected layers learn meaningful features faster; we illustrate (b) how the lack of pretraining in audio processing can be compensated by architecture search; we show (c) that in text processing, deep-learning-based methods only pull ahead of traditional methods for short texts of fewer than a thousand characters under tight resource limitations; and lastly, we present (d) evidence that in very data- and computing-constrained settings, hyperparameter tuning of more traditional machine-learning methods outperforms deep-learning systems.
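    Finding (d) can be illustrated with a small, budget-limited randomized hyperparameter search over a traditional model; the dataset, estimator, and search space below are assumptions for illustration, not the challenge's actual setup.

```python
# Minimal sketch: under a tight compute budget, tune a traditional model
# with a small randomized search before reaching for deep learning.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [2, 3, 4],
        "learning_rate": [0.03, 0.1, 0.3],
    },
    n_iter=10,          # small budget: only 10 sampled configurations
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```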

    Annual Report on Research and Transfer 2019 (Jahresbericht Forschung und Transfer 2019)

    Annual research report 2019 of the Hochschule Konstanz Technik, Wirtschaft und Gestaltung (Konstanz University of Applied Sciences)