80 research outputs found

Municipal wastewater treatment by microsieving, microfiltration and forward osmosis : Concepts and potentials

Author: HEY TOBIAS
Publication venue: Department of Chemical Engineering, Lund University
Publication date: 11/11/2016

Conventional wastewater treatment plants are designed for treating manmade wastewater (e.g., from households and industries) and to protect the environment (e.g., receiving water bodies) and humans from adverse effects. The objective of this work was to investigate the feasibility of treating municipal wastewater without a biological treatment step by applying different separation processes, such as microsieving, microfiltration and forward osmosis. The scope of this work was to treat municipal wastewater with a lower area demand while meeting the Swedish wastewater discharge requirements and allowing for the integration of the new separation techniques with existing full-scale wastewater treatment plants. To achieve these goals, pilot-plant and bench-scale studies were conducted using raw municipal wastewater on-site at a full-scale wastewater treatment plant. Two different treatment concepts were identified to be feasible for municipal wastewater treatment based on the experimental findings. The first concept comprised coagulation and anionic flocculation before microsieving with subsequent microfiltration. The second concept only included microsieving and forward osmosis. Both concepts were evaluated for their specific electricity, energy and area demands, including sludge treatment, and were compared with five existing conventional wastewater treatment plants. Both concepts complied with the Swedish wastewater discharge demands for only small- and medium-sized wastewater treatment plants because up to only 65% of the nitrogen was retained. Nevertheless, both concepts achieved high retentions, with ≥96% for biochemical oxygen demand, ≥94% for chemical oxygen demand, and ≥99% for total phosphorus. Furthermore, the evaluation of both concepts showed that the specific electricity demand was 30% lower than the average specific electricity demand for 105 traditional Swedish wastewater treatment plants with population sizes of 1 500-10 000. In addition, the specific area demand could be reduced by at least 73% for existing wastewater treatment plants supporting the same population or a population of equivalent magnitude. Moreover, the results indicated that the new method had positive effects on electricity and energy due to the increased biogas potential compared to conventional wastewater treatment.

Lund University Publications

Carbon utilisation for extended nitrogen removal and resource savings

Author: Hey Tobias
Publication venue: Lund University (Media-Tryck)
Publication date: 01/01/2013

A full-scale in-line primary sludge hydrolysis experiment was conducted in one of four primary settlers at the Klagshamn Wastewater Treatment Plant (WWTP) to test whether the wastewater quality can be improved in terms of providing easily accessible carbon for possible pre-denitrification and the reduction of external carbon sources. The amount of easily accessible carbon produced, in the form of volatile fatty acid (VFA), alkalinity and ammonium concentrations, was measured throughout the entire full-scale experiment at the outlet of the hydrolysis tank and that of the ordinary primary settler, which served as a reference line. VFA concentrations were measured in wastewater and hydrolysate samples using three analytical methods: the 5 and 8 pH point titration methods and gas chromatography. A calibrated model was established to fit data regarding the Klagshamn WWTP's annual activated sludge operation of its secondary settler and wastewater composition. For modelling purposes and due to the small amount of data available, a linear regression method was established and used to complete the annual data set of the wastewater entering the Klagshamn WWTP. The full-scale data were incorporated into the calibrated model to simulate different scenarios of the activated sludge process with the purpose of saving energy (electricity) and resources (ethanol). Furthermore, an environmental (CO2 emissions) and economic evaluation was performed based on the data gathered from the full-scale experiment. A VFA concentration of 43 mg COD_VFA∙l⁻¹ with no release of ammonium was achieved in the full-scale hydrolysis experiment; this amount was shown, by simulation, to substitute for 50% of the concentration of ethanol currently used. The amount of ethanol saved represents an equivalent electricity saving of 19 MWh for ethanol production, and the operation of fewer nitrification zones, while still maintaining full nitrification over two summer months, could ensure an additional electricity saving of 177 MWh. The evaluation and comparison of the results obtained using the three techniques showed that the 5 pH point titrimetric method was adequate and sufficiently accurate in this context to monitor VFA concentrations below 100 mg∙l⁻¹ at an alkalinity of 300 mg CaCO3∙l⁻¹. The method can be easily implemented in the routine laboratory of the WWTP, and the measured VFA concentrations are equivalent to those obtained by gas chromatography. For the Klagshamn WWTP, the modelling results and further evaluations showed that in-line primary sludge hydrolysis can decrease the dependence on external carbon utilization and can thereby reduce chemical costs and carbon dioxide emissions.

Lund University Publications

INDIRECT: Intent-driven Requirements-to-code Traceability

Author: Hey Tobias
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2019

Traceability information is important for software maintenance, change impact analysis, software reusability, and other software engineering tasks. However, manually generating this information is costly. State-of-the-art automation approaches suffer from imprecision and domain dependence. I propose INDIRECT, an intent-driven approach to automated requirements-to-code traceability. It combines natural language understanding and program analysis to generate intent models for both requirements and source code. INDIRECT then learns a mapping between the two intent models. I expect that using the two intent models as the basis for the mapping yields a more precise and general approach. The intent models contain information such as the semantics of the statements, underlying concepts, and relations between them. The generation of the requirements intent model is divided into smaller subtasks by using iterative natural language understanding. Likewise, the intent model for source code is built iteratively by identifying and understanding semantically related source code chunks.

Crossref

KITopen

Coreference Resolution for Software Architecture Documentation

Author: Dao Quang Nhat
Hey Tobias
Publication venue: Karlsruher Institut für Technologie
Publication date: 05/07/2022

Software architecture documentation plays an important role in software development. It contains much important information about rationale and design decisions. Consequently, many activities deal with the documentation for various reasons, e.g., to extract information or to keep different forms of documentation consistent. These activities often involve automatic processing of the documentation, e.g., traceability link recovery (TLR). However, automatic processing can run into problems when the documentation contains coreferences. A coreference occurs when two or more mentions refer to the same entity. These mentions can differ from one another and lead to ambiguity, e.g., when they are pronouns. To address this problem, this thesis proposes two contributions to coreference resolution in software architecture documentation. The first contribution is to examine the performance of existing coreference resolution models on software architecture documentation. The second contribution is to divide coreference resolution into many more specific kinds of resolution, such as pronoun resolution, abbreviation resolution, etc. For each combination of specific resolutions, we have a specific approach. To evaluate the work of this thesis, we first look at how the approaches perform at coreference resolution in software architecture documentation. Here, Hobbs+Naive, a combination of Hobbs' algorithm and naive non-pronoun resolution, achieves an F1-score of 63%. StanfordCoreNLP_Deterministic, a deterministic coreference resolution system from Stanford CoreNLP, achieves 59%. Next, we examine how well the approaches resolve coreferences for a specific activity, namely TLR. StanfordCoreNLP_Deterministic achieves an F1-score of 63%, while Hobbs+Naive achieves 59% in this respect. Since pronoun coreferences are one of the biggest problems in TLR, we finally also evaluate how the approaches perform at pronoun resolution. In this case, the combination with Hobbs' algorithm as the pronoun resolution model achieves an F1-score of 74%, while StanfordCoreNLP_Neural achieves only 71%. In summary, the combination approaches perform better at coreference resolution in software architecture documentation. They also perform better than the existing model approaches at pronoun resolution for TLR. Nevertheless, the existing model approaches are superior at coreference resolution for TLR.

KITopen

Automatische Wiederherstellung von Nachverfolgbarkeit zwischen Anforderungen und Quelltext (Automatic Recovery of Traceability between Requirements and Source Code)

Author: Hey Tobias
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 21/09/2023

A comprehensive understanding of the relationships between software development artifacts plays a decisive role in the efficient development and maintenance of software systems. Traceability of these relationships makes it possible, for example, to retrace past design decisions or to account for the impact of changes. However, creating and maintaining this traceability information manually involves high manual effort and thus potentially high costs, since human expertise is usually required to understand the relationships. As a result, this information is unavailable in most software projects. If traceability information between software artifacts could be generated automatically, the development and maintenance of a large number of software systems could be made more efficient. Existing approaches to automatically recovering trace links between requirements and source code are unable to bridge the semantic gap between the artifacts. Their precision at acceptable recall levels is too low for use in practice. The FTLR approach presented in this thesis aims to improve the automatic recovery of trace links between requirements and source code by means of a semantic similarity comparison. To this end, FTLR uses pretrained fastText word embeddings to represent semantics. It also exploits structural information in the requirements and the source code by mapping at the level of parts of the requirements and of the source code rather than at the artifact level. This mapping uses the Word Mover's Distance, which enables a semantic similarity comparison that is not distorted by aggregation. The trace links themselves are then determined by a majority vote over all fine-grained relations of an artifact, in order to identify the predominant aspects and, where applicable, to ignore irrelevant relations. An experiment on six benchmark datasets showed that using the Word Mover's Distance leads to a significant improvement in the identification of trace links compared to a simple, aggregated vector mapping. Likewise, mapping at the fine-grained level with subsequent aggregation by majority vote showed significant improvements over direct mapping at the artifact level. To further increase FTLR's precision, an approach for filtering irrelevant parts of requirements is used. It is based on classifying the requirement elements with a language-model-based classifier. Crucial for its use in FTLR is applicability to unseen projects. The presented classifier, NoRBERT, uses transfer learning to fine-tune large pretrained BERT language models for requirements classification. This enables NoRBERT to achieve promising results on unseen projects. The approach achieved a mapping quality of up to 89.8% in F1-score on unseen projects. Determining whether a requirement element contains no functional aspects makes it possible to filter out irrelevant parts of the requirements before processing by FTLR. A comparison of FTLR's performance with and without such a requirement element filter showed that filtering yields a significant gain in F1-score. With the filter, FTLR achieves F1-scores of up to 55.5% and a mean average precision of 59.6%. In addition to representing semantics with word embeddings pretrained exclusively on natural language text, bimodal language models were investigated for use in FTLR. These language models, pretrained on large dual corpora consisting of source code methods and their natural language documentation, achieve promising results on related software engineering tasks such as code search or bug localization. To investigate their suitability for automatically recovering trace links between requirements and source code, two ways of integrating the bimodal language model UniXcoder into FTLR were developed. In a comparison on five datasets for recovering trace links between requirements and source code, no performance gain was found from using this kind of model over the more lightweight word embeddings. Finally, FTLR's performance was compared with existing approaches to unsupervised automatic recovery of trace links between requirements and source code. The comparison shows that, on projects containing exclusively object-oriented source code, FTLR achieves a higher average precision and a higher F1-score than existing approaches. However, the results also make clear that, especially on large projects, all existing approaches, including FTLR, are still far from the mapping quality needed to fully automate trace link recovery in practice.
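The Word Mover's Distance mentioned in this abstract compares two texts word by word instead of averaging their embeddings first. The following is a minimal sketch of the cheaper "relaxed" lower bound of that distance, not the thesis's implementation: the function names and the two-dimensional toy vectors in `TOY_EMBEDDINGS` are assumptions made purely for illustration, standing in for pretrained fastText vectors.

```python
import math
from collections import Counter

# Hypothetical 2-D vectors standing in for pretrained word embeddings.
TOY_EMBEDDINGS = {
    "user": (1.0, 0.0),
    "customer": (0.9, 0.1),
    "login": (0.0, 1.0),
    "authenticate": (0.1, 0.9),
    "report": (-1.0, 0.0),
}

def euclidean(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relaxed_wmd(doc_a, doc_b, embeddings):
    """One-sided relaxed Word Mover's Distance from doc_a to doc_b.

    Each word in doc_a moves all of its (frequency-normalized) weight to
    its nearest word in doc_b. Words without an embedding are ignored.
    """
    words_a = [w for w in doc_a if w in embeddings]
    words_b = [w for w in doc_b if w in embeddings]
    if not words_a or not words_b:
        return math.inf
    weights = Counter(words_a)
    total = sum(weights.values())
    return sum(
        (count / total)
        * min(euclidean(embeddings[w], embeddings[x]) for x in words_b)
        for w, count in weights.items()
    )

def relaxed_wmd_symmetric(doc_a, doc_b, embeddings):
    """Take the larger of both directions, the tighter of the two bounds."""
    return max(
        relaxed_wmd(doc_a, doc_b, embeddings),
        relaxed_wmd(doc_b, doc_a, embeddings),
    )

requirement = ["user", "login"]
related_method = ["customer", "authenticate"]
unrelated_method = ["report"]
print(relaxed_wmd_symmetric(requirement, related_method, TOY_EMBEDDINGS))
print(relaxed_wmd_symmetric(requirement, unrelated_method, TOY_EMBEDDINGS))
```

In this toy setup the requirement ends up closer to the method that uses synonyms than to the unrelated one, even though they share no words. The exact Word Mover's Distance would instead solve an optimal transport problem over the same word-to-word distances.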

KITopen

Knowledge-based Sense Disambiguation of Multiword Expressions in Requirements Documents

Author: Hey Tobias
Keim Jan
Tichy Walter F.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 02/11/2021

Understanding the meaning and the senses of expressions is essential to analyze natural language requirements. Disambiguation of expressions in their context is needed to prevent misinterpretations. Current knowledge-based disambiguation approaches only focus on senses of single words and miss out on linking the shared meaning of expressions consisting of multiple words. As these expressions are common in requirements, we propose a sense disambiguation approach that is able to detect and disambiguate multiword expressions. We use a two-tiered approach to be able to use different techniques for detection and disambiguation. Initially, a conditional random field detects multiword expressions. Afterwards, the approach disambiguates these expressions and retrieves the corresponding senses using a knowledge-based approach. The knowledge-based approach has the benefit that only the knowledge base has to be exchanged to adapt the approach to new domains and knowledge. Our approach is able to detect multiword expressions with an F1-score of 88.4% in an evaluation on 997 requirement sentences. The sense disambiguation achieves up to 57% F1-score.

KITopen

Recovering Trace Links Between Software Documentation And Code

Author: Corallo Sophie
Fuchß Dominik
Hey Tobias
Keim Jan
Koziolek Anne
Telge Tobias
Publication date: 18/12/2023

Introduction: Software development involves creating various artifacts at different levels of abstraction, and establishing relationships between them is essential. Traceability link recovery (TLR) automates this process, enhancing software quality by aiding tasks like maintenance and evolution. However, automating TLR is challenging due to semantic gaps resulting from different levels of abstraction. While automated TLR approaches exist for requirements and code, architecture documentation lacks tailored solutions, hindering the preservation of architecture knowledge and design decisions. Methods: This paper presents our approach TransArC for TLR between architecture documentation and code, using component-based architecture models as intermediate artifacts to bridge the semantic gap. We create transitive trace links by combining the existing approach ArDoCo for linking architecture documentation to models with our novel approach ArCoTL for linking architecture models to code. Results: We evaluate our approaches with five open-source projects, comparing our results to baseline approaches. The model-to-code TLR approach achieves an average F1-score of 0.98, while the documentation-to-code TLR approach achieves a promising average F1-score of 0.82, significantly outperforming baselines. Conclusion: Combining two specialized approaches with an intermediate artifact shows promise for bridging the semantic gap. In future research, we will explore further possibilities for such transitive approaches.

KITopen

A Taxonomy for Design Decisions in Software Architecture Documentation

Author: Hey Tobias
Keim Jan
Koziolek Anne
Sauer Bjarne
Publication date: 16/08/2022

A software system is the result of all design decisions that were made during development and maintenance. Documentation, such as software architecture documentation, captures a variety of different design decisions. Classifying the kinds of design decisions facilitates various downstream tasks by enabling more targeted analyses. In this paper, we propose a taxonomy for design decisions in software architecture documentation to primarily support consistency checking. Existing taxonomies about design decisions have different purposes and do not fit well because they are too coarse. We take an iterative approach, starting with an initial taxonomy based on literature and considerations regarding consistency checking. Then, we mine open-source repositories to extract 17 software architecture documentations that we use to refine the taxonomy. We evaluate the resulting taxonomy with regard to purpose, structure, and application. Additionally, we explore the automatic identification and classification of design decisions in software architecture documentation according to the taxonomy. We apply different machine learning techniques, such as Logistic Regression, Decision Trees, Random Forests, and BERT, to the 17 software architecture documentations. The evaluation yields an F1-score of up to 92.1% for identifying design decisions and an F1-score of up to 55.2% for the classification of the kind of design decision.

KITopen

Improving Traceability Link Recovery Using Fine-grained Requirements-to-Code Relations

Author: Chen Fei
Hey Tobias
Tichy Walter F.
Weigelt Sebastian
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 26/11/2021

Traceability information is a fundamental prerequisite for many essential software maintenance and evolution tasks, such as change impact and software reusability analyses. However, manually generating traceability information is costly and error-prone. Therefore, researchers have developed automated approaches that utilize textual similarities between artifacts to establish trace links. These approaches tend to achieve low precision at reasonable recall levels, as they are not able to bridge the semantic gap between high-level natural language requirements and code. We propose to overcome this limitation by leveraging fine-grained, method- and sentence-level similarities between the artifacts for traceability link recovery. Our approach uses word embeddings and a Word Mover's Distance-based similarity to bridge the semantic gap. The fine-grained similarities are aggregated according to the artifacts' structure and participate in a majority vote to retrieve coarse-grained, requirement-to-class trace links. In a comprehensive empirical evaluation, we show that our approach is able to outperform state-of-the-art unsupervised traceability link recovery approaches. Additionally, we illustrate the benefits of fine-grained structural analyses to word embedding-based trace link generation.
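The aggregation step described in this abstract, where fine-grained method-level matches vote on coarse-grained requirement-to-class links, can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function `majority_vote_links`, the distance threshold, the rule that each method casts a single vote for its closest requirement, and all identifiers in the example data are hypothetical.

```python
from collections import Counter, defaultdict

def majority_vote_links(distances, method_to_class, threshold):
    """Aggregate method-level matches into requirement-to-class links.

    distances: {(method, requirement): distance}, lower means more similar.
    method_to_class: {method: class containing that method}.
    threshold: matches with a larger distance cast no vote (assumed rule).
    """
    # Step 1: each method picks its closest requirement, if close enough.
    best = {}
    for (method, requirement), dist in distances.items():
        if dist <= threshold and (method not in best or dist < best[method][1]):
            best[method] = (requirement, dist)

    # Step 2: every method casts one vote on behalf of its class.
    votes = defaultdict(Counter)
    for method, (requirement, _) in best.items():
        votes[method_to_class[method]][requirement] += 1

    # Step 3: per class, keep the requirement(s) with the most votes.
    links = set()
    for cls, counter in votes.items():
        top = max(counter.values())
        links.update((req, cls) for req, n in counter.items() if n == top)
    return links

# Hypothetical fine-grained distances between methods and requirements.
example_distances = {
    ("Login.check", "R1"): 0.2, ("Login.check", "R2"): 0.9,
    ("Login.reset", "R1"): 0.3, ("Login.reset", "R2"): 0.8,
    ("Login.log", "R1"): 0.7, ("Login.log", "R2"): 0.4,
    ("Report.render", "R1"): 0.95, ("Report.render", "R2"): 0.35,
}
example_classes = {
    "Login.check": "Login", "Login.reset": "Login",
    "Login.log": "Login", "Report.render": "Report",
}
print(sorted(majority_vote_links(example_distances, example_classes, 0.5)))
```

In the example, the single `Login.log` match to R2 is outvoted by the two methods matching R1, so the class `Login` is linked only to R1. This shows how a vote can suppress an incidental fine-grained match that a direct aggregate over the whole class might have kept.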

KITopen