12 research outputs found
Language Models for Hierarchical Classification of Radiology Reports with Attention Mechanisms, BERT and GPT-4
Radiology reports are a valuable source of textual information used to improve clinical care and support research. In recent years, deep learning techniques have been shown to be effective in classifying radiology reports. This article investigates the use of deep learning techniques with attention mechanisms to achieve better performance in the classification of radiology reports.We focus on various Natural Language Processing approaches, such as LSTM with Attention, BERT, and GPT-4, evaluated on a chest tomography report dataset regarding neoplastic diseases collected from an Italian hospital. In particular, we compare the results with a previous machine learning system, showing that models based on attention mechanisms can achieve higher performance. The Attention Mechanism allows us to identify the most relevant bits of text used by the model to make its predictions. We show that our model achieves state-of-the-art results on the hierarchical classification of radiology reports. Moreover, we evaluate the performance of GPT-4 on the classification of these reports in a zero-shot setup through prompt engineering, showing interesting results even with a small context and a non-English language. Our findings suggest that deep learning techniques with attention mechanisms may be successful in the classification of radiology reports even in non-English languages for which it is not possible to leverage on large text corpus
Recurrent Neural Networks for Daily Estimation of COVID-19 Prognosis with Uncertainty Handling
Most ML-based applications for COVID-19 assess the general conditions of a patient trained and tested on cohorts of patients collected over a short period of time and are capable of providing an alarm a few days in advance, helping clinicians in emergency situations, monitor hospitalised patients and identify potentially critical situations at an early stage. However, the pandemic continues to evolve due to new variants, treatments, and vaccines; considering datasets over short periods could not capture this aspect. In addition, these applications often avoid dealing with the uncertainty associated with the prediction provided by machine learning models, potentially causing costly mistakes. In this work, we present a system based on Recurrent Neural Networks (RNN) for the daily estimate of the prognosis of COVID-19 patients that is built and tested using data collected over a long period of time. Our system achieves high predictive performance and uses an algorithm to effectively determine and discard those patients for whom RNN cannot predict the prognosis with sufficient confidence
Statistical analysis of the electricity market: relationship between spot prices and intraday prices in the light of recent reforms
reservedAnalisi della relazione tra prezzi spot e prezzi infragiornalieri alla luce delle recenti riforme del mercato elettrico italiano. Stima di modelli per serie storiche per la previsione dei prezzi spot utilizzando i prezzi infragiornalieri come variabili esplicative
Distilling Knowledge with a Teacher’s Multitask Model for Biomedical Named Entity Recognition
Single-task models (STMs) struggle to learn sophisticated representations from a finite set of annotated data. Multitask learning approaches overcome these constraints by simultaneously training various associated tasks, thereby learning generic representations among various tasks by sharing some layers of the neural network architecture. Because of this, multitask models (MTMs) have better generalization properties than those of single-task learning. Multitask model generalizations can be used to improve the results of other models. STMs can learn more sophisticated representations in the training phase by utilizing the extracted knowledge of an MTM through the knowledge distillation technique where one model supervises another model during training by using its learned generalizations. This paper proposes a knowledge distillation technique in which different MTMs are used as the teacher model to supervise different student models. Knowledge distillation is applied with different representations of the teacher model. We also investigated the effect of the conditional random field (CRF) and softmax function for the token-level knowledge distillation approach, and found that the softmax function leveraged the performance of the student model compared to CRF. The result analysis was also extended with statistical analysis by using the Friedman test
A Comparative Analysis on the use of Autoencoders for Robot Security Anomaly Detection
While robots are more and more deployed among people in public spaces, the impact of cyber-security attacks is significantly increasing. Most of consumer and professional robotic systems are affected by multiple vulnerabilities and the research in this field is just started. This paper addresses the problem of automatic detection of anomalous behaviors possibly coming from cyber-security attacks. The proposed solution is based on extracting system logs from a set of internal variables of a robotic system, on transforming such data into images, and on training different Autoencoder architectures to classify robot behaviors to detect anomalies. Experimental results in two different scenarios (autonomous boats and social robots) show effectiveness and general applicability of the proposed method
Deep learning for classification of radiology reports with a hierarchical schema
Radiological reports are a valuable source of textual information, which can be exploited to improve clinical care and to support research. Such information can be extracted and put into a structured form using machine learning techniques. Some of them rely not only on the classification labels but also on the manual annotation of relevant snippets, which is a time consuming job and requires domain experts. In this paper, we apply deep learning techniques and in particular Long Short Term Memory (LSTM) networks to perform such a task relying only on the classification labels. We focus on the classification of chest computed tomography reports in Italian according to a classification schema proposed for this task by the radiologists of Spedali Civili di Brescia. Each report is classified according to such schema using a combination of neural network classifiers. The resulting system is a novel classification system, which we compare to a previous system based on standard machine learning techniques which used annotations of relevant snippets
Machine Learning Models for Predicting Short-Long Length of Stay of COVID-19 Patients
During 2020 and 2021, managing limited healthcare resources and hospital beds has been a fundamental aspect of the fight against the COVID-19 pandemic. Predicting in advance the length of stay, and in particular identifying whether a patient is going to stay in the hospital longer or less than a week, can provide important support in handling resources allocation. However, there have been significant changes in terms of containment measures, virus diffusion, new treatments, vaccines, and new variants of SARS-CoV-2 during the last period. These changes pose several conceptual drift issues that can limit the usefulness of machine learning in this context. In this work, we present a machine learning system trained and tested using data from more than 6000 hospitalised patients in northern Italy, distributed over almost two years of pandemic. We show how machine learning can be effective even by analysing data over this long period of time, also exploiting a model that predicts the patient's outcome in terms of discharge or death. Furthermore, learning from data that also consider deceased patients is a common issue in predicting the length of stay because they have severe conditions similar to patients with a long stay period, but may actually have a very short duration of hospitalisation. For this purpose, we present a method for handling data from alive and deceased patients, exploiting more patient records, increasing the robustness of the model and its performance in this task. Finally, we investigate the features that are most relevant to the prediction of the simplified length of stay
Goal Recognition as a Deep Learning Task:the GRNet Approach
Recognising the goal of an agent from a trace of observations is an important task with many applications. The state-of-the-art approach to goal recognition (GR) relies on the application of automated planning techniques. We study an alternative approach, called GRNet, where GR is formulated as a classification task addressed by machine learning. GRNet is primarily aimed at solving GR instances more accurately and more quickly by learning how to solve them in a given domain, which is specified by a set of propositions and a set of action names. The goal classification instances in the domain are solved by a Recurrent Neural Network (RNN). The only information required as input of the trained RNN is a trace of action labels, each one indicating just the name of an observed action. A run of the RNN processes a trace of observed actions to compute how likely it is that each domain proposition is part of the agent's goal, for the problem instance under consideration. These predictions are then aggregated to choose one of the candidate goals.
An experimental analysis confirms that GRNet achieves good performance in terms of both goal classification accuracy and runtime, obtaining better results w.r.t. a state-of-the-art GR system over the considered benchmarks. Moreover, such a state-of-the-art system and GRNet can be combined achieving higher performance than with each of the two integrated systems alone
Learning General Policies for Planning through GPT Models
Transformer-based architectures, such as T5, BERT and GPT, have demonstrated revolutionary capabilities in Natural Language Processing. Several studies showed that deep learning models using these architectures not only possess remarkable linguistic knowledge, but they also exhibit forms of factual knowledge, common sense, and even programming skills. However, the scientific community still debates about their reasoning capabilities, which have been recently tested in the context of automated AI planning; the literature presents mixed results, and the prevailing view is that current transformer-based models may not be adequate for planning. In this paper, we address this challenge differently. We introduce a GPT-based model customised for planning (PLANGPT) to learn a general policy for classical planning by training the model from scratch with a dataset of solved planning instances. Once PLANGPT has been trained for a domain, it can be used to generate a solution plan for an input problem instance in that domain. Our training procedure exploits automated planning knowledge to enhance the performance of the trained model. We build and evaluate our GPT model with several planning domains, and we compare its performance w.r.t. other recent deep learning techniques for generalised planning, demonstrating the effectiveness of the proposed approach
Recurrent Neural Networks for Daily Estimation of COVID-19 Prognosis with Uncertainty Handling
Most ML-based applications for COVID-19 assess the general conditions of a patient trained and tested on cohorts of patients collected over a short period of time and are capable of providing an alarm a few days in advance, helping clinicians in emergency situations, monitor hospitalised patients and identify potentially critical situations at an early stage. However, the pandemic continues to evolve due to new variants, treatments, and vaccines; considering datasets over short periods could not capture this aspect. In addition, these applications often avoid dealing with the uncertainty associated with the prediction provided by machine learning models, potentially causing costly mistakes. In this work, we present a system based on Recurrent Neural Networks (RNN) for the daily estimate of the prognosis of COVID-19 patients that is built and tested using data collected over a long period of time. Our system achieves high predictive performance and uses an algorithm to effectively determine and discard those patients for whom RNN cannot predict the prognosis with sufficient confidence