277 research outputs found
Modelling tree biomass using direct and additive methods with point cloud deep learning in a temperate mixed forest
ABSTRACT: Airborne laser scanning (ALS) data has been widely used for total aboveground tree biomass (AGB) modelling; however, there is less research focusing on estimating specific tree biomass components (wood, branches, bark, and foliage). Knowledge about these biomass components is essential for carbon accounting, understanding forest nutrient cycling, and other applications. In this study, we compare additive AGB estimation (sum of estimated components) with direct AGB estimation using deep neural network (DNN) and random forest (RF) models. We utilise two point cloud DNNs: point-based Dynamic Graph Convolutional Neural Network (DGCNN) and Octree-based Convolutional Neural Network (OCNN). DNN and RF models were trained using a dataset comprising 2336 sample plots from a mixed temperate forest in New Brunswick, Canada. Results indicate that additive AGB models perform similarly to direct models in terms of coefficient of determination (R²) and root-mean-square error (RMSE), and reduce the mean absolute percentage error (MAPE) by 22% on average. Compared to RF, the DNNs provided a small improvement in performance, with OCNN explaining 5% more variation in the data (R² = 0.76) and reducing MAPE by 20% on average. Overall, this study showcases the effectiveness of additive tree AGB models and highlights the potential of DNNs for enhanced AGB estimation. To further improve DNN performance, we recommend using larger training datasets, implementing hyperparameter optimization, and incorporating additional data such as multispectral imagery.
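As a rough illustration of the comparison described in the abstract above, the sketch below sums per-component predictions into an additive AGB estimate and scores it with R², RMSE, and MAPE. The function names and the toy numbers are assumptions made for illustration; they are not the study's code or data.

```python
import math

def additive_agb(wood, branches, bark, foliage):
    """Sum per-plot component predictions into an additive AGB estimate."""
    return [w + br + bk + f for w, br, bk, f in zip(wood, branches, bark, foliage)]

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

# Hypothetical per-plot values (toy numbers, not taken from the study).
observed_agb = [100.0, 200.0, 300.0]
predicted_agb = additive_agb(
    wood=[60.0, 120.0, 190.0],
    branches=[20.0, 40.0, 50.0],
    bark=[10.0, 20.0, 30.0],
    foliage=[5.0, 10.0, 20.0],
)

scores = {
    "r2": r2(observed_agb, predicted_agb),
    "rmse": rmse(observed_agb, predicted_agb),
    "mape": mape(observed_agb, predicted_agb),
}
```

A direct model would be scored with the same three functions, using its single AGB prediction per plot in place of `predicted_agb`.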
Machine Learning Approaches for Semantic Segmentation on Partly-Annotated Medical Images
Semantic segmentation of medical images plays a crucial role in assisting medical practitioners in providing accurate and swift diagnoses; nevertheless, deep neural networks require extensive labelled data to learn and generalise appropriately. This is a major issue in medical imagery because most of the datasets are not fully annotated. Training models with partly-annotated datasets generates plenty of predictions that belong to correct unannotated areas but are categorised as false positives; as a result, standard segmentation metrics and objective functions do not work correctly, affecting the overall performance of the models. In this thesis, the semantic segmentation of partly-annotated medical datasets is extensively and thoroughly studied. The general objective is to improve the segmentation results of medical images via innovative supervised and semi-supervised approaches. The main contributions of this work are the following. Firstly, a new metric, specifically designed for this kind of dataset, can provide a reliable score to partly-annotated datasets with positive expert feedback in their generated predictions by exploiting all the confusion matrix values except the false positives. Secondly, an innovative approach is proposed for generating better pseudo-labels when applying co-training with the disagreement selection strategy. This method expands the pixels in disagreement utilising the combined predictions as a guide. Thirdly, original attention mechanisms based on disagreement are designed for two cases: intra-model and inter-model. These attention modules leverage the disagreement between layers (from the same or different model instances) to enhance the overall learning process and generalisation of the models. Lastly, innovative deep supervision methods improve the segmentation results by training neural networks one subnetwork at a time following the order of the supervision branches.
The methods are thoroughly evaluated on several histopathological datasets, showing significant improvements.
Advanced analytical methods for fraud detection: a systematic literature review
The developments of the digital era demand new ways of producing goods and rendering
services. This fast-paced evolution in the companies implies a new approach from the
auditors, who must keep up with the constant transformation. With the dynamic
dimensions of data, it is important to seize the opportunity to add value to the companies.
The need to apply more robust methods to detect fraud is evident.
In this thesis the use of advanced analytical methods for fraud detection will be
investigated through an analysis of the existing literature on this topic.
Both a systematic review of the literature and a bibliometric approach will be applied to
the most appropriate database to measure the scientific production and current trends.
This study intends to contribute to the academic research that has been conducted, in
order to centralize the existing information on this topic.
Learning representations for effective and explainable software bug detection and fixing
Software has an integral role in modern life; hence software bugs, which undermine software quality and reliability, have substantial societal and economic implications. The advent of machine learning and deep learning in software engineering has led to major advances in bug detection and fixing approaches, yet they fall short of desired precision and recall. This shortfall arises from the absence of a 'bridge,' known as learning code representations, that can transform information from source code into a suitable representation for effective processing via machine and deep learning.
This dissertation builds such a bridge. Specifically, it presents solutions for effectively learning code representations using four distinct methods (context-based, testing results-based, tree-based, and graph-based), thus improving bug detection and fixing approaches, as well as providing developers insight into the foundational reasoning. The experimental results demonstrate that using learning code representations can significantly enhance explainable bug detection and fixing, showcasing the practicability and meaningfulness of the approaches formulated in this dissertation toward improving software quality and reliability.
Resilient and Scalable Forwarding for Software-Defined Networks with P4-Programmable Switches
Traditional networking devices support only fixed features and limited configurability.
Network softwarization leverages programmable software and hardware platforms to remove those limitations.
In this context, the concept of programmable data planes makes it possible to program the packet processing pipeline of networking devices directly and to create custom control plane algorithms.
This flexibility enables the design of novel networking mechanisms where the status quo struggles to meet high demands of next-generation networks like 5G, Internet of Things, cloud computing, and industry 4.0.
P4 is the most popular technology to implement programmable data planes.
However, programmable data planes, and in particular, the P4 technology, emerged only recently.
Thus, P4 support for some well-established networking concepts is still lacking and several issues remain unsolved due to the different characteristics of programmable data planes in comparison to traditional networking.
The research of this thesis focuses on two open issues of programmable data planes.
First, it develops resilient and efficient forwarding mechanisms for the P4 data plane, as there are no satisfactory state-of-the-art best practices yet.
Second, it enables BIER in high-performance P4 data planes.
BIER is a novel, scalable, and efficient transport mechanism for IP multicast traffic that so far has only very limited support on high-performance forwarding platforms.
The main results of this thesis are published as 8 peer-reviewed and one post-publication peer-reviewed publication. The results cover the development of suitable resilience mechanisms for P4 data planes, the development and implementation of resilient BIER forwarding in P4, and the extensive evaluations of all developed and implemented mechanisms. Furthermore, the results contain a comprehensive P4 literature study.
Two more peer-reviewed papers contain additional content that is not directly related to the main results.
They implement congestion avoidance mechanisms in P4 and develop a scheduling concept to find cost-optimized load schedules based on day-ahead forecasts.
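The BIER forwarding mentioned in the abstract above can be pictured with a small sketch. It follows the general bit-indexed replication idea from the BIER architecture (RFC 8279): each packet carries a bitstring of destination routers, and a forwarding table maps each bit to a neighbor and a forwarding bitmask. This is a plain Python model under simplifying assumptions (a single sub-domain and set, a hypothetical table layout, hypothetical neighbor names), not the thesis's P4 implementation.

```python
def bier_forward(bitstring, bift, local_bit=None):
    """Replicate a BIER packet toward its destinations.

    bitstring: int whose set bits identify the destination routers.
    bift: dict mapping bit position -> (neighbor, forwarding_bitmask);
          this table layout is an assumption made for illustration.
    local_bit: bit position of this router, if it is itself a destination.
    Returns (deliver_locally, [(neighbor, bitstring_for_copy), ...]).
    """
    remaining = bitstring
    deliver_locally = False
    copies = []

    # Deliver a local copy if this router's own bit is set, then clear it.
    if local_bit is not None and remaining & (1 << local_bit):
        deliver_locally = True
        remaining &= ~(1 << local_bit)

    while remaining:
        # Position of the lowest set bit still to be served.
        k = (remaining & -remaining).bit_length() - 1
        entry = bift.get(k)
        if entry is None:
            # No route to this destination: drop the bit and move on.
            remaining &= ~(1 << k)
            continue
        neighbor, mask = entry
        # The copy carries only the destinations reached via this neighbor.
        copies.append((neighbor, remaining & mask))
        # Clear the served bits (and bit k itself, to guarantee progress).
        remaining &= ~(mask | (1 << k))

    return deliver_locally, copies


# Hypothetical topology: bits 0 and 1 are reached via "A", bit 2 via "B",
# and bit 3 is this router itself.
example_bift = {0: ("A", 0b0011), 1: ("A", 0b0011), 2: ("B", 0b0100)}
example_result = bier_forward(0b1111, example_bift, local_bit=3)
```

Because the bits served by one neighbor are cleared after each copy, every neighbor receives at most one copy of the packet no matter how many destinations lie behind it, which is what keeps this style of multicast forwarding free of per-flow state in the core.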
Security and Privacy for Modern Wireless Communication Systems
This reprint focuses on the latest protocol research, software/hardware development and implementation, and system architecture design in addressing emerging security and privacy issues for modern wireless communication networks. Relevant topics include, but are not limited to, the following: deep-learning-based security and privacy design; covert communications; information-theoretical foundations for advanced security and privacy techniques; lightweight cryptography for power constrained networks; physical layer key generation; prototypes and testbeds for security and privacy solutions; encryption and decryption algorithms for low-latency constrained networks; security protocols for modern wireless communication networks; network intrusion detection; physical layer design with security consideration; anonymity in data transmission; vulnerabilities in security and privacy in modern wireless communication networks; challenges of security and privacy in node–edge–cloud computation; security and privacy design for low-power wide-area IoT networks; security and privacy design for vehicle networks; security and privacy design for underwater communications networks.
LIPIcs, Volume 261, ICALP 2023, Complete Volume
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Large Language Models for Software Engineering: A Systematic Literature Review
Large Language Models (LLMs) have significantly impacted numerous domains,
notably including Software Engineering (SE). Nevertheless, a well-rounded
understanding of the application, effects, and possible limitations of LLMs
within SE is still in its early stages. To bridge this gap, our systematic
literature review takes a deep dive into the intersection of LLMs and SE, with
a particular focus on understanding how LLMs can be exploited in SE to optimize
processes and outcomes. Through a comprehensive review approach, we collect and
analyze a total of 229 research papers from 2017 to 2023 to answer four key
research questions (RQs). In RQ1, we categorize and provide a comparative
analysis of different LLMs that have been employed in SE tasks, laying out
their distinctive features and uses. For RQ2, we detail the methods involved in
data collection, preprocessing, and application in this realm, shedding light
on the critical role of robust, well-curated datasets for successful LLM
implementation. RQ3 allows us to examine the specific SE tasks where LLMs have
shown remarkable success, illuminating their practical contributions to the
field. Finally, RQ4 investigates the strategies employed to optimize and
evaluate the performance of LLMs in SE, as well as the common techniques
related to prompt optimization. Armed with insights drawn from addressing the
aforementioned RQs, we sketch a picture of the current state-of-the-art,
pinpointing trends, identifying gaps in existing research, and flagging
promising areas for future study.
Single-cell analysis of cell competition using quantitative microscopy and machine learning
Cell competition is a widely conserved, fundamental biological quality control mechanism. The cell competition assay of MDCK wild-type versus mutant MDCK Scribble-knockdown (ScribKD) relies on a mechanical mechanism of competition, which posits that the emergence of compressing stresses within the tissue at high confluency drives the competitive outcome. According to this mechanism, proliferating wild-type cells out-compete mutant ScribKD cells, resulting in their apoptosis and apical extrusion. Previous studies show that there is an increased division rate of wild-type cells in neighbourhoods with high numbers of ScribKD cells, but what still remains a mystery is whether this is a cause or consequence of increased apoptosis in the “loser” cell population. This project also interrogated the competitive assay of wild-type versus RasV12, which is hypothesized to operate on a biochemical mechanism and results in the apical extrusion (but not apoptosis) of the loser RasV12 population. For both these mechanisms of competition it is still unknown which population of cells is driving the winner/loser outcome. Is the winner cell proliferation prompting the loser cell demise? Or is an autonomous loser elimination prompting a subsequent winner cell proliferation?
In my research, I have employed multi-modal, time-lapse microscopy to image competition assays continuously for several days. These data were then segmented into wild-type or mutant instances using a Convolutional Neural Network (CNN) that can differentiate between the cell types, after which they were tracked across cellular generations using a Bayesian multi-object tracker. A conjugate analysis of fluorescent cell-cycle indicator probes was then utilised to automatically identify key time points of cellular fate commitment using deep-learning image classification. A spatio-temporal analysis was then conducted in order to quantify any correlation between wild-type proliferation and mutant cell demise. For the case of wild-type versus ScribKD, there was no clear evidence for the wild-type cell mitoses directly impacting upon the ScribKD cell apoptotic elimination. Instead, a subsequent analysis found that a more subtle mechanism of pre-emptive, local density increases around the apoptosis site appeared to be determining the eventual ScribKD fate. On the other hand, there was clear evidence of a direct impact of wild-type mitoses on the subsequent apical extrusion and competitive elimination of RasV12 cells. Both of these conclusions agree with the prevailing classification of cell competition types: mechanical interactions are more diffuse and occur over a larger spatio-temporal domain, whereas biochemical interactions are constrained to nearest neighbour cells. These analyses further quantified the hypothesized density dependency of ScribKD elimination on a single-cell scale and offered a potential new understanding of RasV12 extrusion. Most interestingly, it appears that there is a clear biophysical mechanism to the elimination in the biochemical RasV12 cell competition. This suggests that perhaps a new semantic approach is needed in the field of cell competition in order to accurately classify different mechanisms of elimination.
Implant Global and Local Hierarchy Information to Sequence based Code Representation Models
Source code representation with deep learning techniques is an important
research field. There have been many studies that learn sequential or
structural information for code representation. But sequence-based models and
non-sequence models both have their limitations. Researchers attempt to
incorporate structural information into sequence-based models, but they only mine
part of token-level hierarchical structure information. In this paper, we
analyze how the complete hierarchical structure influences the tokens in code
sequences and abstract this influence as a property of code tokens called
hierarchical embedding. The hierarchical embedding is further divided into
statement-level global hierarchy and token-level local hierarchy. Furthermore,
we propose the Hierarchy Transformer (HiT), a simple but effective sequence
model to incorporate the complete hierarchical embeddings of source code into a
Transformer model. We demonstrate the effectiveness of hierarchical embedding
on learning code structure with an experiment on a variable scope detection task.
Further evaluation shows that HiT outperforms SOTA baseline models and shows
stable training efficiency on three source code-related tasks involving
classification and generation tasks across 8 different datasets. Comment: Accepted by ICPC 202