Improving SeNA-CNN by Automating Task Recognition
Catastrophic forgetting arises when a neural network is not
capable of preserving a previously learned task when learning a new task.
Several methods have already been proposed to mitigate this problem in
artificial neural networks. In this paper we propose to improve upon
our previous state-of-the-art method, SeNA-CNN, so as to enable the
automatic recognition at test time of the task to be solved, and we experimentally
show that it achieves excellent results. The experiments show
the learning of up to 4 different tasks with a single network, without
forgetting how to solve previously learned tasks.
Automatic Detection of Epileptic Seizures in Neonatal Intensive Care Units through EEG, ECG and Video Recordings: A Survey
In Neonatal Intensive Care Units (NICUs), the early detection of neonatal seizures is of utmost importance for a timely, effective and efficient clinical intervention. Continuous video electroencephalography (v-EEG) is the gold standard for monitoring neonatal seizures, but it requires specialized equipment and expert staff available around the clock. The purpose of this study is to present an overview of the main Neonatal Seizure Detection (NSD) systems developed during the last ten years that implement Artificial Intelligence techniques to detect and report the temporal occurrence of neonatal seizures. Expert systems based on the analysis of EEG, ECG and video recordings are investigated, and their usefulness as support tools for the medical staff in detecting and diagnosing neonatal seizures in NICUs is evaluated. EEG-based NSD systems show better performance than systems based on other signals. Recently, ECG analysis, particularly the related HRV analysis, has emerged as a promising marker of brain damage. Moreover, video analysis could be helpful to identify inconspicuous but pathological movements. This study highlights possible future developments of NSD systems: a multimodal approach that exploits and combines the results of the EEG, ECG and video approaches, and a system able to automatically characterize etiologies, might provide additional support to clinicians in seizure diagnosis.
Artificial Intelligence Applied to Supply Chain Management and Logistics: Systematic Literature Review
The growing impact of automation and artificial intelligence (AI) on supply chain management and
logistics is remarkable. This technological advance has the potential to significantly transform the
handling and transport of goods. The implementation of these technologies has boosted efficiency,
predictive capabilities and the simplification of operations. However, it has also raised critical
questions about AI-based decision-making. To this end, a systematic literature review was carried
out, offering a comprehensive view of this phenomenon, with a specific focus on management. The
aim is to provide insights that can guide future research and decision-making in the logistics and
supply chain management sectors. The articles that form the chapters of this thesis present
detailed methodologies and transparent results, reinforcing the credibility of the research for
researchers and managers. This contributes to a deeper understanding of the impact of technology
on logistics and supply chain management. This research offers valuable information for both
academics and professionals in the logistics sector, revealing innovative solutions and strategies
made possible by automation. However, continuous development requires vigilance, adaptation,
foresight and a rapid problem-solving capacity. This research not only sheds light on the current
panorama, but also offers a glimpse into the future of logistics in a world where artificial
intelligence is set to prevail.
Towards Robust and Deployable Gesture and Activity Recognisers
Smartphones and wearables have become an extension of one's self, with gestures providing quick access to command execution, and activity tracking helping users log their daily life. Recent research in gesture recognition shows that common events, such as a user re-wearing or readjusting their smartwatch, deteriorate recognition accuracy significantly. Further, the available state-of-the-art deep learning models for gesture or activity recognition are too large and computationally heavy to be deployed and run continuously in the background. This problem of engineering robust yet deployable gesture recognisers for use in wearables is open-ended. This thesis provides a review of known approaches in machine learning and human activity recognition (HAR) for addressing model robustness. This thesis also proposes variations of convolution-based models for use with raw or spectrogram sensor data. Finally, a cross-validation-based evaluation approach for quantifying individual and situational variabilities is used to demonstrate that, with an application-oriented design, models can be made two orders of magnitude smaller while improving on both recognition accuracy and robustness.
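The user-based cross-validation idea above can be sketched as a leave-one-user-out split. The tuple format, toy data, and helper names below are illustrative assumptions, not the thesis's actual code:

```python
def leave_one_user_out_splits(samples):
    """Yield (held_out_user, train, test) splits, holding out each user once.

    `samples` is a list of (user_id, features, label) tuples; this grouping
    is a hypothetical data layout chosen for illustration.
    """
    users = sorted({user for user, _, _ in samples})
    for held_out in users:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test

# Toy data: two users, two samples each.
data = [("u1", [0.1], "tap"), ("u1", [0.2], "swipe"),
        ("u2", [0.3], "tap"), ("u2", [0.4], "swipe")]
splits = list(leave_one_user_out_splits(data))
# Each user is held out exactly once.
assert [u for u, _, _ in splits] == ["u1", "u2"]
```

Evaluating a model on such splits measures how well it generalises to unseen individuals, which is one way to quantify the individual variability the thesis discusses.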
Towards fully automated analysis of sputum smear microscopy images
Sputum smear microscopy is used for diagnosis and treatment monitoring of pulmonary tuberculosis (TB). Automation of image analysis can make this technique less laborious and more consistent. This research employs artificial intelligence to improve automation of Mycobacterium tuberculosis (Mtb) cell detection, bacterial load quantification, and phenotyping from fluorescence microscopy images.
I first introduce a non-learning, computer vision (CV) approach for bacteria detection, employing a ridge-based method that uses the Hessian matrix to detect ridges of Mtb bacteria, complemented by geometric analysis. The effectiveness of this approach is
assessed through a custom metric using the Hu moment vector. Results demonstrate lower performance relative to literature metrics, motivating the need for deep learning (DL) to capture bacterial morphology.
Subsequently, I develop an automated pipeline for detection, classification, and counting of bacteria using DL techniques. Firstly, Cycle-GANs transfer labels from labelled to unlabeled fields of view (FOVs). Pre-trained DL models are used for subsequent classification and regression tasks. An ablation study confirms pipeline efficacy, with a count error within 5%.
For downstream analysis, microscopy slides are divided into tiles, each of which is sequentially cropped and magnified. A subsequent filtering stage eliminates non-salient FOVs by applying pre-trained DL models along with a novel method that employs dual convolutional neural network (CNN)-based encoders for feature extraction: one encoder is dedicated to learning bacterial appearance, and the other focuses on bacterial shape, both of which feed into a bottleneck of a smaller CNN classifier network. The proposed model outperforms others in accuracy, yields no false positives, and excels across decision thresholds.
Mtb cell lipid content and length may be related to antibiotic tolerance, underscoring the need to locate bacteria within paired FOV images stained separately for cell identification and lipid detection, and to measure bacterial dimensions. I employ a UNet-like model for precise bacterial localization. By combining CNNs and feature descriptors, my method automates reporting of both lipid content and cell length. Application of the approaches described here may assist clinical TB care and therapeutics research.
Towards Efficient Ice Surface Localization From Hockey Broadcast Video
Using computer vision-based technology in ice hockey has recently been embraced as it allows for the automatic collection of analytics. This data would be too expensive and time-consuming to otherwise collect manually. The insights gained from these analytics allow for a more in-depth understanding of the game, which can influence coaching and management decisions. A fundamental component of automatically deriving analytics from hockey broadcast video is ice rink localization. In broadcast video of hockey games, the camera pans, tilts, and zooms to follow the play. To compensate for this motion and get the absolute locations of the players and puck on the ice, an ice rink localization pipeline must find the perspective transform that maps each frame to an overhead view of the rink.
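The frame-to-overhead mapping described above is a perspective transform (a planar homography). A minimal sketch of applying one to a pixel coordinate, where the matrix is a toy stand-in for an inferred rink transform:

```python
import numpy as np

def warp_point(H, x, y):
    """Apply a 3x3 perspective transform H to pixel (x, y).

    Homogeneous coordinates are used: multiply, then divide by the
    third component to return to 2D rink coordinates.
    """
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Toy homography: pure scaling, standing in for a real frame-to-rink mapping.
H = np.array([[0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
print(warp_point(H, 200.0, 100.0))  # -> (100.0, 50.0)
```

In an actual pipeline the matrix would come from the localization network per frame, and the same transform would map every player and puck detection to the overhead rink model.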
The lack of publicly available datasets makes it difficult to perform research into ice rink localization. A novel annotation tool and dataset are presented; the dataset includes 7,721 frames from National Hockey League game broadcasts.
Since ice rink localization is a component of a full hockey analytics pipeline, it is important that these methods be as efficient as possible to reduce the run time. Small neural networks that reduce inference time while maintaining high accuracy can be used as an intermediate step to perform ice rink localization by segmenting the lines from the playing surface.
Ice rink localization methods tend to infer the camera calibration of each frame in a broadcast sequence individually. This results in perturbations in the output of the pipeline, as there is no consideration of the camera calibrations of the frames before and after in the sequence. One way to reduce the noise in the output is to add a post-processing step after the ice has been localized to smooth the camera parameters and closely simulate the camera’s motion. Several methods for extracting the pan, tilt, and zoom from the perspective transform matrix are explored. The camera parameters obtained from the inferred perspective transform can be smoothed to give a visually coherent video output. Deep neural networks have allowed for the development of architectures that can perform several tasks at once. A basis for networks that can regress the ice rink localization parameters and simultaneously smooth them is presented.
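The post-processing smoothing step can be illustrated with a simple centered moving average over a per-frame camera parameter; the thesis explores more elaborate smoothing and multi-task variants, so this is only a sketch:

```python
def smooth(series, window=3):
    """Centered moving average over per-frame camera parameters (e.g. pan).

    Windows are truncated at the sequence boundaries so the output has
    the same length as the input.
    """
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

pan = [10.0, 10.0, 13.0, 10.0, 10.0]   # a jittery per-frame pan estimate
print(smooth(pan))  # -> [10.0, 11.0, 11.0, 11.0, 10.0]
```

Applying this to the pan, tilt, and zoom extracted from each frame's transform damps the per-frame perturbations and yields the visually coherent motion the abstract describes.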
This research provides several approaches for improving ice rink localization methods. Specifically, the analytics pipelines can become faster and provide better results visually. This can allow for improved insight into hockey games, which can increase the performance of the hockey team with reduced cost.
Combiner and HyperCombiner Networks: Rules to Combine Multimodality MR Images for Prostate Cancer Localisation
One of the distinct characteristics in radiologists' reading of
multiparametric prostate MR scans, using reporting systems such as PI-RADS
v2.1, is to score individual types of MR modalities, T2-weighted,
diffusion-weighted, and dynamic contrast-enhanced, and then combine these
image-modality-specific scores using standardised decision rules to predict the
likelihood of clinically significant cancer. This work aims to demonstrate that
it is feasible for low-dimensional parametric models to model such decision
rules in the proposed Combiner networks, without compromising the accuracy of
predicting radiologic labels: First, it is shown that either a linear mixture
model or a nonlinear stacking model is sufficient to model PI-RADS decision
rules for localising prostate cancer. Second, parameters of these (generalised)
linear models are proposed as hyperparameters, to weigh multiple networks that
independently represent individual image modalities in the Combiner network
training, as opposed to end-to-end modality ensemble. A HyperCombiner network
is developed to train a single image segmentation network that can be
conditioned on these hyperparameters during inference, for much improved
efficiency. Experimental results based on data from 850 patients, for the
application of automating radiologist labelling multi-parametric MR, compare
the proposed combiner networks with other commonly-adopted end-to-end networks.
Using the added advantages of obtaining and interpreting the modality combining
rules, in terms of the linear weights or odds-ratios on individual image
modalities, three clinical applications are presented for prostate cancer
segmentation, including modality availability assessment, importance
quantification and rule discovery.
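The linear mixture rule described above can be sketched as a weighted combination of per-modality probability maps; the function and weights below are illustrative, not the paper's implementation:

```python
import numpy as np

def combine(prob_maps, weights):
    """Linear mixture of per-modality cancer-probability maps.

    prob_maps: (M, H, W) array, one map per MR modality (e.g. T2w, DWI, DCE).
    weights:   (M,) mixture weights, the interpretable hyperparameters.
    Weights are normalised so the combined map stays in [0, 1].
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, np.asarray(prob_maps), axes=1)

t2w = np.full((2, 2), 0.8)   # toy per-modality probability maps
dwi = np.full((2, 2), 0.4)
dce = np.full((2, 2), 0.2)
combined = combine([t2w, dwi, dce], weights=[2.0, 1.0, 1.0])
print(combined[0, 0])  # -> approximately 0.55 (= 0.5*0.8 + 0.25*0.4 + 0.25*0.2)
```

The appeal of such a low-dimensional model is exactly what the abstract notes: the fitted weights (or odds ratios, in the nonlinear stacking variant) are directly interpretable as the importance of each modality.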
Preclinical risk of bias assessment and PICO extraction using natural language processing
Drug development starts with preclinical studies which test the efficacy and
toxicology of potential candidates in living animals, before proceeding to
clinical trials examined on human subjects. Many drugs shown to be effective
in preclinical animal studies fail in clinical trials, indicating potential
reproducibility issues and translation failure. To obtain less biased research
findings, systematic reviews are performed to collate all relevant evidence from
publications. However, systematic reviews are time-consuming and
researchers have advocated the use of automation techniques to speed up the
process and reduce human effort. Good progress has been made in
implementing automation tools into reviews of clinical trials, while tools
developed for preclinical systematic reviews remain scarce. Tools for preclinical
systematic reviews should be designed specifically because preclinical
experiments differ from clinical trials. In this thesis, I explore natural language
processing models for facilitating two stages in preclinical systematic reviews:
risk of bias assessment and PICO extraction.
There are a range of measures used to reduce bias in animal experiments and
many checklist criteria require the reporting of those measures in publications.
In the first part of the thesis, I implement several binary classification models
to indicate the reporting of random allocation to groups, blinded assessment
of outcome, conflict of interests, compliance of animal welfare regulations, and
statement of animal exclusions in preclinical publications. I compare traditional
machine learning classifiers with several text representation methods,
convolutional/recurrent/hierarchical neural networks, and propose two
strategies to adapt BERT models to long documents. My findings indicate that
neural networks and BERT-based models achieve better performance than
traditional classifiers and rule-based approaches. The attention mechanism
and hierarchical architecture in neural networks do not improve performance
but are useful for extracting relevant words or sentences from publications to
inform users’ judgement. The advantages of the transformer structure are
hindered when documents are long and computing resources are limited.
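One common way to adapt fixed-length transformer models such as BERT to long documents is to split the token sequence into overlapping windows; this is a generic sketch, and the thesis's two strategies may differ in detail:

```python
def chunk_tokens(tokens, max_len=512, stride=256):
    """Split a long token sequence into overlapping fixed-length windows.

    max_len mirrors BERT's 512-token input limit; the stride controls
    the overlap between consecutive windows so no span is lost at a
    chunk boundary. Per-chunk predictions would then be aggregated
    (e.g. by max or mean) into a document-level label.
    """
    if len(tokens) <= max_len:
        return [tokens]
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks

doc = list(range(1000))          # stand-in for a tokenised publication
chunks = chunk_tokens(doc)
print([len(c) for c in chunks])  # -> [512, 512, 488]
```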
In literature retrieval and citation screening of published evidence, the key
elements of interest are Population, Intervention, Comparator and Outcome,
which compose the framework of PICO. In the second part of the thesis, I first
apply several question answering models based on attention flows and
transformers to extract phrases describing intervention or method of induction
of disease models from clinical abstracts and preclinical full texts. For
preclinical datasets describing multiple interventions or induction methods in
the full texts, I apply additional unsupervised information retrieval methods to
extract relevant sentences. The question answering models achieve good
performance when the text is at abstract-level and contains only one
intervention or induction method, while for truncated documents with multiple
PICO mentions, the performance is less satisfactory. Considering this
limitation, I then collect preclinical abstracts with finer-grained PICO
annotations and develop named entity recognition models for extraction of
preclinical PICO elements including Species, Strain, Induction, Intervention,
Comparator and Outcome. I decompose PICO extraction into two independent
tasks: 1) PICO sentence classification, and 2) PICO element detection. For
PICO extraction, BERT-based models pre-trained on biomedical corpora
outperform recurrent networks, and the conditional probabilistic module only
shows advantages in recurrent networks. A self-training strategy applied to
enlarge the training set from unlabelled abstracts yields better performance for
PICO elements which lack a sufficient number of instances.
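The self-training strategy can be sketched as one round of pseudo-labelling confident predictions on unlabelled data; the classifier callbacks and toy data here are placeholders, not the thesis's models:

```python
def self_train(model_fit, model_predict, labelled, unlabelled, threshold=0.9):
    """One self-training round: pseudo-label confident unlabelled examples.

    `model_fit`/`model_predict` stand in for any classifier API. Examples
    whose predicted confidence clears the threshold are added to the
    training set, and the model is refit on the enlarged set.
    """
    model = model_fit(labelled)
    pseudo = []
    for x in unlabelled:
        label, confidence = model_predict(model, x)
        if confidence >= threshold:
            pseudo.append((x, label))
    return model_fit(labelled + pseudo), pseudo

# Trivial classifier: always predicts the majority label of its training set.
def fit(data):
    labels = [y for _, y in data]
    return max(set(labels), key=labels.count)

def predict(model, x):
    return model, 0.95 if x > 0 else 0.5  # confident only on positive inputs

labelled = [(1, "pos"), (2, "pos"), (-1, "neg")]
model, pseudo = self_train(fit, predict, labelled, [3, -2])
print(pseudo)  # -> [(3, 'pos')]
```

In the thesis's setting the "examples" would be unlabelled abstracts and the labels PICO entity annotations, which is where the gain for rare element types comes from.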
Experimental results demonstrate the possibilities of facilitating preclinical risk
of bias assessment and PICO extraction by natural language processing.
Artificial Intelligence for detection and prevention of mold contamination in tomato processing
This study investigates the use of computer vision coupled with artificial intelligence to detect
mold in tomatoes during the drying process.
Mold presence in tomatoes poses threats to human health and the food industry as it leads to
several issues beyond appearance. It is primarily caused by fungi that spread rapidly over the
tomato surface, compromising their quality, and potentially producing toxins that can harm
human health.
The experimental aim of this work focused on the issue of wastage and loss within the food
industry. When tomatoes succumb to mold, they become unsuitable for consumption, resulting
in a loss of food and resources. Considering that tomato production requires resources such as
land, water, energy, and time, wasting tomatoes due to mold also represents a waste of these
valuable resources.
The goal was to evaluate the mold detection capabilities of an object detection algorithm,
particularly in its early stages, to facilitate preventative measures. This experimental analysis
entailed training the algorithm with an extensive array of images, encompassing a variety of
healthy and spoiled tomatoes of different shapes, types, textures and drying stages. The chosen
object detection algorithm, YOLOv7, is convolutional neural network-based and was utilized
for image labeling and training epochs. Evaluation metrics, including precision and recall,
were utilized to assess the algorithm's performance.
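Precision and recall can be illustrated with a toy detection example; real object-detection evaluation (e.g. for YOLOv7) matches predicted boxes to ground truth by IoU rather than exact identity, so this is a simplified sketch:

```python
def precision_recall(true_boxes, predicted_boxes):
    """Toy precision/recall over sets of detections.

    Precision = TP / (TP + FP): how many predictions were correct.
    Recall    = TP / (TP + FN): how many true objects were found.
    Detections are compared by exact identity here for simplicity.
    """
    tp = len(set(true_boxes) & set(predicted_boxes))
    fp = len(set(predicted_boxes) - set(true_boxes))
    fn = len(set(true_boxes) - set(predicted_boxes))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

truth = {"mold_1", "mold_2", "mold_3", "mold_4"}  # hypothetical mold regions
preds = {"mold_1", "mold_2", "mold_5"}            # detector output
print(precision_recall(truth, preds))  # -> (0.6666666666666666, 0.5)
```

For mold screening, recall is typically the safety-critical metric (missed mold reaches the consumer), while precision governs how much good product is needlessly discarded.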
The implementation of artificial intelligence in the future has significant potential for
enhancing food production processes by streamlining mold identification. Prompt mold
detection would expedite segregation of contaminated products, thus reducing the risk of toxin
dissemination and preserving the quality of uncontaminated food. This approach could
minimize food waste and resource inefficiencies linked to discarding significant product
amounts. Furthermore, integrating computer vision in the HACCP (Hazard Analysis Critical
Control Points) context could enhance food safety protocols via accurate and prompt
detection. By prioritizing prevention, this technology offers a promising chance to optimize
quality, efficiency, and sustainability of future food production processes.
Flexible Automation and Intelligent Manufacturing: The Human-Data-Technology Nexus
This is an open access book. It gathers the first volume of the proceedings of the 31st edition of the International Conference on Flexible Automation and Intelligent Manufacturing, FAIM 2022, held on June 19 – 23, 2022, in Detroit, Michigan, USA. Covering four thematic areas including Manufacturing Processes, Machine Tools, Manufacturing Systems, and Enabling Technologies, it reports on advanced manufacturing processes and innovative materials for 3D printing, applications of machine learning, artificial intelligence and mixed reality in various production sectors, as well as important issues in human-robot collaboration, including methods for improving safety. Contributions also cover strategies to improve quality control, supply chain management and training in the manufacturing industry, and methods supporting circular supply chains and sustainable manufacturing. All in all, this book provides academicians, engineers and professionals with extensive information on both scientific and industrial advances in the converging fields of manufacturing, production, and automation.