
5 research outputs found

Instance Selection Mechanisms for Human-in-the-Loop Systems in Few-Shot Learning

Author: Blumenstiel Benedikt
Hemmer Patrick
Jakubik Johannes
Vössing Michael
Publication venue: AIS Electronic Library (AISeL)
Publication date: 17/01/2022

Business analytics and machine learning have become essential success factors for various industries, with the downside of cost-intensive gathering and labeling of data. Few-shot learning addresses this challenge and reduces data gathering and labeling costs by learning novel classes from very few labeled examples. In this paper, we design a human-in-the-loop (HITL) system for few-shot learning and analyze an extensive range of mechanisms that can be used to acquire human expert knowledge for instances that have an uncertain prediction outcome. We show that the acquisition of human expert knowledge significantly improves few-shot model performance at negligible labeling effort. We validate our findings in various experiments on a benchmark dataset in computer vision and on real-world datasets. We further demonstrate the cost-effectiveness of HITL systems for few-shot learning. Overall, our work aims at supporting researchers and practitioners in effectively adapting machine learning models to novel classes at reduced costs.

arXiv.org e-Print Archive

AIS Electronic Library (AISeL)
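
The abstract above describes routing instances with an uncertain prediction outcome to a human expert. A minimal sketch of one such selection mechanism follows, assuming predictive entropy as the uncertainty measure and a fixed labeling budget. Both choices, and the names `predictive_entropy` and `select_for_expert`, are illustrative assumptions and are not taken from the paper, which analyzes a range of mechanisms.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_expert(prob_rows, budget):
    """Return indices of the `budget` most uncertain instances, most uncertain first.

    Hypothetical helper: entropy ranking is only one possible mechanism.
    """
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: predictive_entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:budget]


# Made-up class probabilities for four instances and three classes.
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, most uncertain
]

to_label = select_for_expert(predictions, budget=2)
print(to_label)
```

With a budget of two, the near-uniform and the second-flattest distributions are the ones sent to the expert; the confident predictions stay with the model.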

What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation

Author: Blumenstiel Benedikt
Jakubik Johannes
Kühne Hilde
Vössing Michael
Publication date: 27/06/2023

While semantic segmentation has seen tremendous improvements in the past, significant labeling effort is still necessary, and generalization to classes that were not present during training remains limited. To address this problem, zero-shot semantic segmentation makes use of large self-supervised vision-language models, allowing zero-shot transfer to unseen classes. In this work, we build a benchmark for Multi-domain Evaluation of Semantic Segmentation (MESS), which allows a holistic analysis of performance across a wide range of domain-specific datasets such as medicine, engineering, earth monitoring, biology, and agriculture. To do this, we reviewed 120 datasets, developed a taxonomy, and classified the datasets according to that taxonomy. We select a representative subset consisting of 22 datasets and propose it as the MESS benchmark. We evaluate eight recently published models on the proposed MESS benchmark and analyze characteristics that influence the performance of zero-shot transfer models. The toolkit is available at https://github.com/blumenstiel/MESS

arXiv.org e-Print Archive

Instance Selection Mechanisms for Human-in-the-Loop Systems in Few-Shot Learning

Author: Blumenstiel Benedikt
Hemmer Patrick
Jakubik Johannes
Vössing Michael
Publication date: 14/07/2022

arXiv.org e-Print Archive

TensorBank: Tensor Lakehouse for Foundation Model Training

Author: Behrendt Michael
Blumenstiel Benedikt
Civitarese Daniel Salles
Freitag Marcus
Hamann Hendrik
Kienzler Romeo
Kimura Daiki
Mukkavilli S. Karthik
Nagy Zoltan Arnold
Schmude Johannes
Simumba Naomi
Publication date: 07/09/2023

Storing and streaming high-dimensional data for foundation model training became a critical requirement with the rise of foundation models beyond natural language. In this paper, we introduce TensorBank, a petabyte-scale tensor lakehouse capable of streaming tensors from Cloud Object Store (COS) to GPU memory at wire speed based on complex relational queries. We use Hierarchical Statistical Indices (HSI) for query acceleration. Our architecture allows tensors to be addressed directly at block level using HTTP range reads. Once in GPU memory, data can be transformed using PyTorch transforms. We provide a generic PyTorch dataset type with a corresponding dataset factory translating relational queries and requested transformations as an instance. By making use of the HSI, irrelevant blocks can be skipped without reading them, as those indices contain statistics on their content at different hierarchical resolution levels. This is an opinionated architecture powered by open standards and making heavy use of open-source technology. Although hardened for production use with geospatial-temporal data, this architecture generalizes to other use cases such as computer vision, computational neuroscience, biological sequence analysis, and more.

arXiv.org e-Print Archive
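
The TensorBank abstract above mentions skipping irrelevant blocks via hierarchical statistics and addressing the remaining blocks with HTTP range reads. The sketch below illustrates that idea under stated assumptions: the statistics are taken to be per-node minimum and maximum values, the hierarchy is a simple tree, and blocks have a fixed byte size. None of these details, nor the names `IndexNode`, `matching_blocks`, and `byte_range_header`, come from the paper; this is not the TensorBank API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndexNode:
    """One node of a hypothetical hierarchical statistical index."""
    min_value: float
    max_value: float
    block_id: Optional[int] = None          # set on leaf nodes only
    children: List["IndexNode"] = field(default_factory=list)


def matching_blocks(node, lo, hi):
    """Collect leaf block ids whose [min, max] range overlaps the query [lo, hi]."""
    if node.max_value < lo or node.min_value > hi:
        return []                           # whole subtree skipped without reading it
    if node.block_id is not None:
        return [node.block_id]
    found = []
    for child in node.children:
        found.extend(matching_blocks(child, lo, hi))
    return found


def byte_range_header(block_id, block_size):
    """Build an HTTP Range header for one fixed-size block (end offset is inclusive)."""
    start = block_id * block_size
    end = start + block_size - 1
    return {"Range": f"bytes={start}-{end}"}


# Made-up two-level index over four blocks.
root = IndexNode(0, 100, children=[
    IndexNode(0, 40, children=[
        IndexNode(0, 20, block_id=0),
        IndexNode(21, 40, block_id=1),
    ]),
    IndexNode(41, 100, children=[
        IndexNode(41, 70, block_id=2),
        IndexNode(71, 100, block_id=3),
    ]),
])

blocks = matching_blocks(root, lo=60, hi=80)
headers = [byte_range_header(b, block_size=1024) for b in blocks]
print(blocks, headers)
```

In this toy index, the query for values between 60 and 80 prunes the first coarse node entirely, so only blocks 2 and 3 would be requested from object storage.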

Designing a Human-in-the-Loop System for Object Detection in Floor Plans

Author: Bartos Andrea
Blumenstiel Benedikt
Hemmer Patrick
Jakubik Johannes
Mohr Kamilla
Vössing Michael
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 28/06/2022

In recent years, companies in the Architecture, Engineering, and Construction (AEC) industry have started exploring how artificial intelligence (AI) can reduce time-consuming and repetitive tasks. One use case that can benefit from the adoption of AI is the determination of quantities in floor plans. This information is required for several planning and construction steps. Currently, the task requires companies to invest a significant amount of manual effort. Either digital floor plans are not available for existing buildings, or the formats cannot be processed due to lack of standardization. In this paper, we therefore propose a human-in-the-loop approach for the detection and classification of symbols in floor plans. The developed system calculates a measure of uncertainty for each detected symbol, which is used to acquire the knowledge of human experts for those symbols that are difficult to classify. We evaluate our approach with a real-world dataset provided by an industry partner and find that the selective acquisition of human expert knowledge enhances the model’s performance by up to 12.9%, resulting in an overall prediction accuracy of 92.1% on average. We further design a pipeline for the generation of synthetic training data that allows the system to be adapted to new construction projects with minimal manual effort. Overall, our work supports professionals in the AEC industry on their journey to the data-driven generation of business value.

Association for the Advancement of Artificial Intelligence: AAAI Publications