11 research outputs found
Contrastive Learning for API Aspect Analysis
We present a novel approach - CLAA - for API aspect detection in API reviews
that utilizes transformer models trained with a supervised contrastive loss
objective function. We evaluate CLAA using performance and impact analysis. For
performance analysis, we utilized a benchmark dataset on developer discussions
collected from Stack Overflow and compare the results to those obtained using
state-of-the-art transformer models. Our experiments show that contrastive
learning can significantly improve the performance of transformer models in
detecting aspects such as Performance, Security, Usability, and Documentation.
For impact analysis, we performed an empirical study and a developer study. On
200 randomly selected and manually labeled online reviews, CLAA achieved 92%
accuracy while the SOTA baseline achieved 81.5%. According to our developer study
involving 10 participants, the use of 'Stack Overflow + CLAA' resulted in
increased accuracy and confidence during API selection. Replication package:
https://github.com/shahariar-shibli/Contrastive-Learning-for-API-Aspect-Analysis
Comment: Accepted in the 38th IEEE/ACM International Conference on Automated
Software Engineering (ASE2023)
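The supervised contrastive objective mentioned in the abstract can be illustrated with a minimal NumPy sketch. It follows the widely used supervised contrastive formulation (pull same-label embeddings together, push others apart) and is not taken from the CLAA replication package; the function name, the temperature value, and the toy embeddings are assumptions made for illustration.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    Hypothetical helper: the name and default temperature are illustrative.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # L2-normalise so dot products become cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        raise ValueError("no anchor has a same-label positive in this batch")
    # Log-softmax over all other samples (self excluded from the denominator).
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / pos_counts[has_pos]
    return float(-mean_log_prob_pos.mean())

# Toy batch: two tight clusters of 2-D "review embeddings".
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1]))  # labels match clusters
print(supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1]))  # labels cut across clusters
```

The loss is lower when the labels agree with the cluster structure, which is the signal a transformer encoder would be trained on under this kind of objective.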
Assorted, Archetypal and Annotated Two Million (3A2M) Cooking Recipes Dataset based on Active Learning
Cooking recipes allow individuals to exchange culinary ideas and provide food
preparation instructions. Due to a lack of adequate labeled data, categorizing
raw recipes found online to the appropriate food genres is a challenging task
in this domain. Utilizing the knowledge of domain experts to categorize recipes
could be a solution. In this study, we present a novel dataset of two million
culinary recipes labeled in respective categories leveraging the knowledge of
food experts and an active learning technique. To construct the dataset, we
collect the recipes from the RecipeNLG dataset. Then, we employ three human
experts whose trustworthiness score is higher than 86.667% to categorize 300K
recipes by their Named Entity Recognition (NER) entries and assign each to one
of nine categories: bakery, drinks, non-veg, vegetables, fast food, cereals,
meals, sides and fusion. Finally, we categorize the remaining 1900K recipes
using an active learning method with a blend of Query-by-Committee and Human In The Loop
(HITL) approaches. There are more than two million recipes in our dataset, each
of which is categorized and has a confidence score linked with it. For the 9
genres, the Fleiss Kappa score of this massive dataset is roughly 0.56026. We
believe that the research community can use this dataset to perform various
machine learning tasks such as recipe genre classification, recipe generation
of a specific genre, new recipe creation, etc. The dataset can also be used to
train and evaluate the performance of various NLP tasks such as named entity
recognition, part-of-speech tagging, semantic role labeling, and so on. The
dataset will be available upon publication: https://tinyurl.com/3zu4778y
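The Fleiss Kappa score reported above measures agreement among a fixed number of raters across categories. A minimal sketch of the standard computation follows; the function name and the toy rating table are assumptions for illustration and do not reproduce the dataset's actual annotations.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i, j] is the number of
    raters who assigned item i to category j (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("every item must be rated by the same number of raters")
    # Share of all assignments that fall into each category.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    # Per-item agreement: fraction of rater pairs that agree.
    P_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar = P_i.mean()       # observed agreement
    P_e = np.sum(p_j**2)     # agreement expected by chance
    return float((P_bar - P_e) / (1 - P_e))

# Toy table: 4 recipes, 3 raters, 3 of the genres (illustrative only).
toy_counts = [
    [3, 0, 0],  # all three raters chose the first genre
    [2, 1, 0],
    [0, 3, 0],
    [0, 1, 2],
]
print(fleiss_kappa(toy_counts))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance.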
Towards Automated Recipe Genre Classification using Semi-Supervised Learning
Sharing cooking recipes is a great way to exchange culinary ideas and provide
instructions for food preparation. However, categorizing raw recipes found
online into appropriate food genres can be challenging due to a lack of
adequate labeled data. In this study, we present a dataset named the
"Assorted, Archetypal, and Annotated Two Million Extended (3A2M+) Cooking
Recipe Dataset" that contains two million culinary recipes labeled in
respective categories with extended named entities extracted from recipe
descriptions. This collection of data includes various features such as title,
NER, directions, and extended NER, as well as nine different labels
representing genres including bakery, drinks, non-veg, vegetables, fast food,
cereals, meals, sides, and fusions. The proposed pipeline named 3A2M+ extends
the size of the Named Entity Recognition (NER) list to address missing named
entities like heat, time or process from the recipe directions using two NER
extraction tools. The 3A2M+ dataset provides a comprehensive solution for
various challenging recipe-related tasks, including classification, named
entity recognition, and recipe generation. Furthermore, we applied
traditional machine learning, deep learning and pre-trained language models to
classify the recipes into their corresponding genres and achieved an overall
accuracy of 98.6%. Our investigation indicates that the title feature played a
more significant role in classifying the genre
Design and development of vertical axis wind turbine
A wind turbine is a device that converts kinetic energy from the wind into electrical power. A Vertical Axis Wind Turbine (VAWT) is a type of wind turbine in which the main rotor shaft is set vertically, so it can capture wind from any direction. The aim of this work is to develop a theoretical model for the design and performance of a Darrieus-type vertical axis wind turbine for small-scale energy applications. A small three-bladed prototype turbine is constructed and its performance at low wind velocity is investigated. The model is based on the NACA 0018 airfoil, and light wood is used as the blade material. The full-scale Vertical Axis Wind Turbine is 36 inches high and 24 inches in diameter, with a blade chord length of 3.937 inches and a blade height of 24 inches. A 100 watt, 24 volt brushless DC motor is used to measure output power. The rotational speed of the blades and the electric power output at the corresponding speed are measured with a tachometer and a wattmeter. The power curves show the relation between the rotational speed of the turbine and the power produced for a range of wind speeds. This approach points toward developing vertical axis wind turbines with better performance to meet the increasing power demand
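The quantities behind such power curves can be sketched with the textbook relations P_wind = 0.5 * rho * A * v^3 and tip speed ratio = omega * R / v. The sketch below assumes a straight-bladed rotor whose swept area is diameter times blade height and sea-level air density of 1.225 kg/m^3; neither assumption is stated in the abstract. The function names, wind speed, RPM, and wattmeter reading are hypothetical.

```python
import math

INCH = 0.0254  # metres per inch

def swept_area(diameter_m, blade_height_m):
    """Swept area of a straight-bladed VAWT (assumed geometry): D * H."""
    return diameter_m * blade_height_m

def wind_power(wind_speed, area, air_density=1.225):
    """Kinetic power in the wind crossing `area`, in watts."""
    return 0.5 * air_density * area * wind_speed**3

def tip_speed_ratio(rpm, diameter_m, wind_speed):
    """Blade tip speed divided by wind speed."""
    omega = 2.0 * math.pi * rpm / 60.0  # rad/s
    return omega * (diameter_m / 2.0) / wind_speed

def power_coefficient(electrical_power, wind_speed, area, air_density=1.225):
    """Fraction of the available wind power delivered as electrical power."""
    return electrical_power / wind_power(wind_speed, area, air_density)

# Dimensions from the abstract: 24 inch diameter, 24 inch blade height.
diameter = 24 * INCH
area = swept_area(diameter, 24 * INCH)

# Hypothetical operating point, not a measurement from the study.
v, rpm, measured_watts = 5.0, 200.0, 3.0
print(wind_power(v, area))
print(tip_speed_ratio(rpm, diameter, v))
print(power_coefficient(measured_watts, v, area))
```

Plotting the power coefficient against the tip speed ratio for several wind speeds is a common way to compare rotor designs independently of size.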
Comparison of real-time NOx emission measurements from two heavy-duty diesel engines
This study investigated the real-time NOx emissions from a heavy-duty diesel truck and a bulk carrier ship. The test road vehicle was driven on a combination of flat and hilly routes from Brisbane to Toowoomba that covered urban, rural and motorway driving. On-board ship emissions were measured at sea from the port of Gladstone to Newcastle. NOx emissions from both engines were compared and analysed to understand the influence of engine parameters as well as route variables and power transmission on NOx emissions. Results from these measurements show that truck NOx emissions increased with engine power and speed; however, significant NOx emissions were observed during idling, when little power is produced. In contrast, ship NOx emissions followed an approximately cubic relationship with engine RPM, which is completely different from that of the truck
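One simple way to check for an approximately cubic dependence on engine RPM is to fit a power law y = a * x^b in log-log space and inspect the exponent b. The sketch below does this on synthetic data generated for illustration; the function name, the RPM range, and the coefficient are assumptions, and no values come from the study.

```python
import numpy as np

def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit returns the highest-degree coefficient first: [slope, intercept].
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(log_a)), float(b)

# Synthetic example: emissions proportional to RPM cubed with mild noise.
rng = np.random.default_rng(0)
rpm = np.linspace(40.0, 120.0, 25)
nox = 2e-6 * rpm**3 * rng.normal(1.0, 0.02, size=rpm.size)

a, b = fit_power_law(rpm, nox)
print(a, b)  # an exponent b close to 3 indicates a cubic trend
```

An exponent far from 3, or a poor fit in log-log space, would suggest a different functional form, as the abstract reports for the truck.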
Optimisation of driving-parameters and emissions of a diesel-vehicle using principal component analysis (PCA)
Light-duty diesel vehicles contribute significantly to urban air pollution. Laboratory-based standard driving test cycles do not take into account external driving factors, which greatly affect vehicle emissions in real-world driving emission (RDE) measurements. As a result, RDE tests yield higher emission levels than the standard approaches. In the current study, an RDE measurement campaign has been conducted in Brisbane city traffic using a portable emission measurement system (PEMS). Thirty drivers with a wide variety of driving experience participated, driving a Hyundai iLoad van on a custom test route. RDEs and driving parameters were recorded during each trip. Principal component analysis (PCA) was applied to investigate the relationship between driving dynamics and vehicle emissions. The impact of different trips, driving time, and driving experience on driving behaviour and emissions was also examined. Route familiarity, traffic density, and driving experience have a strong impact on driving behaviour and emissions. Drivers' responses to changing traffic, unknown routes, and vehicles vary significantly, which results in a high volume of transient events (frequent acceleration and deceleration). Transient events are very common in city driving and correlate strongly with vehicle emissions
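As a rough illustration of how PCA relates driving parameters to emissions, the sketch below standardises a trip-by-variable matrix and extracts principal components with a singular value decomposition. The variable names and the synthetic data are assumptions for illustration; they are not the parameters or measurements recorded in the campaign.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA on standardised columns via SVD.

    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Standardise so variables with different units contribute equally.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    components = Vt[:n_components]   # rows are loading vectors
    scores = Xs @ components.T       # trips projected onto the components
    return scores, components, explained[:n_components]

# Synthetic data: 30 trips, three hypothetical variables.
rng = np.random.default_rng(1)
accel_events = rng.normal(50.0, 10.0, size=30)                 # transient events per trip
nox = 0.02 * accel_events + rng.normal(0.0, 0.05, size=30)     # tied to transients
mean_speed = rng.normal(35.0, 5.0, size=30)                    # independent of both

scores, components, explained = pca(np.column_stack([accel_events, nox, mean_speed]))
print(explained)
print(components[0])  # loadings of the first component
```

Variables that load strongly on the same component, such as the transient-event count and NOx in this synthetic example, vary together across trips.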