121 research outputs found
A multi-level methodology for the automated translation of a coreference resolution dataset: an application to the Italian language
In the last decade, the demand for readily accessible corpora has touched all areas of natural language processing, including
coreference resolution. However, it is one of the least considered sub-fields in recent developments. Moreover, almost all
existing resources are only available for the English language. To overcome this lack, this work proposes a methodology to
create a corpus for coreference resolution in Italian exploiting knowledge of annotated resources in other languages.
Starting from OntoNotes, the methodology translates and refines English utterances to obtain utterances respecting Italian
grammar, dealing with language-specific phenomena and preserving coreference and mentions. A quantitative and qualitative
evaluation is performed to assess the well-formedness of generated utterances, considering readability, grammaticality,
and acceptability indexes. The results have confirmed the effectiveness of the methodology in generating a good
dataset for coreference resolution starting from an existing one. The goodness of the dataset is also assessed by training a
coreference resolution model based on the BERT language model, achieving promising results. Even if the methodology
has been tailored for the English and Italian languages, it has a general basis that is easily extendable to other languages by adapting a
small number of language-dependent rules to generalize most of the linguistic phenomena of the language under
examination.
A Comprehensive Star Rating Approach for Cruise Ships Based on Interactive Group Decision Making with Personalized Individual Semantics
This article proposes a comprehensive star rating approach for cruise ships that combines
subjective and objective evaluation. First, an index system for the star
rating of cruise ships is established. Then, a modified TOPSIS is adopted to process objective data and obtain
star ratings for the basic cruise indicators and service capabilities of cruise ships. Next, the concept of a
distributed linguistic star rating function (DLSRF) is defined to analyze the subjective evaluations from
experts and users. A novel weight calculation method based on interactive group decision making
is then presented to assign the importance of the main indicators. In particular, to enable decision
makers to deal effectively with the uncertainty in this star rating process, the approach adopts the personalized
individual semantics (PIS) model. Finally, data on nine cruise ships are collected to obtain their final
star rating results, and some suggestions for improving cruise service capabilities and star indicators
are put forward.
National Natural Science Foundation of China (NSFC) 71971135, 72001134, 72071056
China Scholarship Council 202108310183
Innovative Talent Training Project of Graduate Students in Shanghai Maritime University of China 2021YBR00
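To make the objective part of the rating concrete, the following is a minimal sketch of the classic TOPSIS procedure. The abstract refers to a modified TOPSIS whose details are not given here, so this is the textbook variant, not the authors' method. The function names, the example criteria, and the `to_stars` mapping from closeness score to stars are all hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Classic TOPSIS: one closeness score in [0, 1] per alternative (row).

    Assumes no criterion column is all zeros and that the alternatives
    are not all identical (otherwise a division by zero occurs).
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal / anti-ideal points: max is best for benefit criteria, min for cost.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal point
        d_neg = math.dist(row, anti)    # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores

def to_stars(score, levels=5):
    """Hypothetical mapping of a closeness score to a 1..levels star rating."""
    return min(levels, int(score * levels) + 1)

# Two illustrative ships: criterion 0 is a benefit, criterion 1 is a cost.
scores = topsis([[9, 100], [3, 200]], weights=[0.5, 0.5], benefit=[True, False])
stars = [to_stars(s) for s in scores]
```

In this toy input the first ship is better on both criteria, so it coincides with the ideal point and receives the top rating, while the second coincides with the anti-ideal point.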
Mining Temporal Association Rules with Temporal Soft Sets
This work was partially supported by the National Natural Science Foundation of China (grant no. 11301415), the Shaanxi Provincial Key Research and Development Program (grant no. 2021SF-480), and the Natural Science Basic Research Plan in Shaanxi Province of China (grant no. 2018JM1054).
Traditional association rule extraction may run into some difficulties due to ignoring the temporal aspect of the collected data.
Particularly, it happens in many cases that some item sets are frequent during specific time periods, although they are not frequent
in the whole data set. In this study, we make an effort to enhance conventional rule mining by introducing temporal soft sets. We
define temporal granulation mappings to induce granular structures for temporal transaction data. Using this notion, we define
temporal soft sets and their Q-clip soft sets to establish a novel framework for mining temporal association rules. A number of
useful characterizations and results are obtained, including a necessary and sufficient condition for fast identification of strong
temporal association rules. By combining temporal soft sets with NegNodeset-based frequent item set mining techniques, we
develop the negFIN-based soft temporal association rule mining (negFIN-STARM) method to extract strong temporal association
rules. Numerical experiments are conducted on commonly used data sets to show the feasibility of our approach. Moreover,
comparative analysis demonstrates that the newly proposed method achieves higher execution efficiency than three well-known
approaches in the literature.
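The motivating observation, that an item set can be frequent within a specific time period without being frequent overall, can be illustrated with a small sketch. It is a plain counting approach, not the negFIN-STARM method or the temporal soft set framework of the paper; the `granule` function stands in for a temporal granulation mapping, and the transactions are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

def temporal_supports(transactions, granule, size=2):
    """Support of every item set of the given size inside each time granule."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for timestamp, items in transactions:
        g = granule(timestamp)          # map the timestamp to its period
        totals[g] += 1
        for itemset in combinations(sorted(set(items)), size):
            counts[g][itemset] += 1
    return {g: {s: c / totals[g] for s, c in counts[g].items()} for g in totals}

def global_support(transactions, itemset):
    """Support of one item set over the whole data set, ignoring time."""
    hits = sum(1 for _, items in transactions if set(itemset) <= set(items))
    return hits / len(transactions)

# Hypothetical data: (month, purchased items).
data = [
    (6, ["cone", "ice"]), (7, ["cone", "ice", "milk"]),
    (1, ["milk", "bread"]), (2, ["milk", "bread"]),
    (3, ["bread", "ice"]), (11, ["milk"]),
    (12, ["bread", "milk"]), (4, ["milk", "bread"]),
]
season = lambda month: "summer" if month in (6, 7, 8) else "rest"

by_period = temporal_supports(data, season)
summer_support = by_period["summer"][("cone", "ice")]
overall_support = global_support(data, ("cone", "ice"))
```

Here the pair ("cone", "ice") appears in every summer transaction but in only a quarter of all transactions, so a global minimum-support threshold of, say, 0.5 would miss it.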
Classification of health deterioration by geometric invariants
The authors are grateful to the Operational Programme "Development of the Internal Grant Agency of the University of Hradec Kralove", reg. no. CZ.02.2.69/0.0/0.0/19_073/0016949, project no. IGRA-TYM-2021008 (investigators: Damian Busovsky and Katerina Voglova). This study was also possible thanks to the project TP01010032 "The Centre of Creative Activities and Knowledge Transfer at University Hradec Kralove." This project was co-financed by the state budget of the Technology Agency of the Czech Republic under the GAMA 2 Programme. Furthermore, the authors are grateful to the Excellence project PrF UHK 2215/2023-2024 for its financial support.
Background and Objectives: Prediction of patient deterioration is essential in medical care, and its automation may reduce the risk of patient death. The precise monitoring of a patient's medical state requires devices placed on the body, which may cause discomfort. Our approach is based on the processing of long-term ballistocardiography data, which were measured using a sensory pad placed under the patient's mattress.
Methods: The investigated dataset was obtained via long-term measurements in retirement homes and intensive care units (ICU). Data were measured unobtrusively using a measuring pad equipped with piezoceramic sensors. The proposed approach focused on the processing methods of the measured ballistocardiographic signals, Cartan curvature (CC), and Euclidean arc length (EAL).
Results: For analysis, 218,979 normal and 216,259 aberrant 2-second samples were collected and classified using a convolutional neural network. Experiments using cross-validation with expert threshold and data length revealed the accuracy, sensitivity, and specificity of the proposed method to be 86.51 [...]
Conclusions: The proposed method provides a unique approach for the early detection of health concerns in an unobtrusive manner. In addition, the suitability of EAL over the CC was determined.
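As a rough illustration of one of the geometric invariants named in the abstract, the sketch below computes a discrete Euclidean arc length over fixed-length windows of a sampled signal. It assumes the signal is treated as a planar curve (t, x(t)) with uniform sampling; the paper's actual curve construction, preprocessing, and sampling rate are not stated here, so the function names and parameters are hypothetical.

```python
import math

def euclidean_arc_length(samples, dt):
    """Polyline length of the curve (t, x(t)) sampled every dt seconds."""
    return sum(math.hypot(dt, b - a) for a, b in zip(samples, samples[1:]))

def window_features(signal, rate_hz, seconds=2.0):
    """Arc length of each non-overlapping window of the given duration."""
    size = int(rate_hz * seconds)
    dt = 1.0 / rate_hz
    return [
        euclidean_arc_length(signal[i:i + size], dt)
        for i in range(0, len(signal) - size + 1, size)
    ]

# A triangle-shaped toy signal: two segments, each with dt = 4 and |dx| = 3.
toy_length = euclidean_arc_length([0.0, 3.0, 0.0], dt=4.0)

# A flat toy signal sampled at 2 Hz, split into 2-second windows.
flat_features = window_features([0.0] * 8, rate_hz=2)
```

A flat signal yields an arc length equal to the elapsed time within the window, and any movement of the signal increases it, which is why such a quantity can serve as an input feature for a classifier.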
Best Practices of Convolutional Neural Networks for Question Classification
Question Classification (QC) is of primary importance in question answering systems,
since it enables extraction of the correct answer type. State-of-the-art solutions for short text
classification have obtained remarkable results with Convolutional Neural Networks (CNNs). However,
implementing such models requires choices that are usually based on subjective experience or on the few
works comparing different settings for general text classification, whereas specific solutions should be
identified for the QC task, depending on language and dataset size. Therefore, this work aims at
suggesting best practices for QC using CNNs. Different datasets were employed: (i) a multilingual
set of labelled questions to evaluate the dependence of optimal settings on language; (ii) a large,
widely used dataset for validation and comparison. Numerous experiments were executed, to perform
a multivariate analysis, for evaluating statistical significance and influence on QC performance of all
the factors (regarding text representation, architectural characteristics, and learning hyperparameters)
and some of their interactions, and for finding the most appropriate strategies for QC. Results show
the influence of CNN settings on performance. Optimal settings were found depending on language.
Tests on different data validated the optimization performed, and confirmed the transferability of
the best settings. Comparisons with configurations suggested by previous works show that the settings
optimized here achieve the best classification accuracy. These findings can suggest the best choices to
configure a CNN for QC.
An ELECTRA-Based Model for Neural Coreference Resolution
In recent years, coreference resolution has received a considerable performance boost by exploiting
different pre-trained Neural Language Models, from BERT to SpanBERT to Longformer. This work is
aimed at assessing, for the first time, the impact of the ELECTRA model on this task, motivated by the experimental
evidence of an improved contextual representation and better performance on different downstream tasks.
In particular, ELECTRA has been employed as the representation layer in an established neural coreference
architecture able to determine entity mentions among spans of text and to best cluster them. The architecture
itself has been optimized: i) by simplifying the modality of representation of spans of text while still considering
both the context in which they appear and their entire content, ii) by maximizing both the number and length
of input textual segments to better exploit the improved contextual representation power of ELECTRA,
iii) by maximizing the number of spans of text to be processed, since they potentially represent mentions,
while preserving computational efficiency. Experimental results on the OntoNotes dataset have shown the effectiveness
of this solution from both a quantitative and qualitative perspective, and also with respect to other
state-of-the-art models, thanks to a more proficient token and span representation. The results also hint at
the possible use of this solution for low-resource languages, simply requiring a pre-trained version of
ELECTRA instead of language-specific models trained to handle either spans of text or long documents.
A New Italian Cultural Heritage Data Set: Detecting Fake Reviews With BERT and ELECTRA Leveraging the Sentiment
Consiglio Nazionale delle Ricerche-CARI-CARE-ITALY within the CRUI CARE Agreement
Automated detection of pain levels using deep feature extraction from shutter blinds-based dynamic-sized horizontal patches with facial images
Pain intensity classification using facial images is a challenging problem in computer vision research.
This work proposed a patch and transfer learning-based model to classify various pain intensities
using facial images. The input facial images were segmented into dynamic-sized horizontal patches
or “shutter blinds”. A lightweight deep network DarkNet19 pre-trained on ImageNet1K was used
to generate deep features from the shutter blinds and the undivided resized segmented input facial
image. The most discriminative features were selected from these deep features using iterative
neighborhood component analysis, which were then fed to a standard shallow fine k-nearest neighbor
classifier for classification using tenfold cross-validation. The proposed shutter blinds-based model
was trained and tested on datasets derived from two public databases—University of Northern
British Columbia-McMaster Shoulder Pain Expression Archive Database and Denver Intensity of
Spontaneous Facial Action Database—which both comprised four pain intensity classes that had
been labeled by human experts using validated facial action coding system methodology. Our shutter
blinds-based classification model attained more than 95% overall accuracy rates on both datasets.
The excellent performance suggests that the automated pain intensity classification model can be
deployed to assist doctors in the non-verbal detection of pain using facial images in various situations
(e.g., non-communicative patients or during surgery). This system can facilitate timely detection and
management of pain.
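The patch division step can be pictured with a small sketch. The abstract does not say how the dynamic patch sizes are chosen, so the code below simply assumes that the image is cut into horizontal strips for several strip counts and that the undivided image is kept as well; `horizontal_patches`, `shutter_blinds`, and `strip_counts` are hypothetical names, and feature extraction with DarkNet19 is not shown.

```python
def horizontal_patches(image, n_strips):
    """Split an image (a list of pixel rows) into n_strips horizontal strips."""
    height = len(image)
    # Strip boundaries spread as evenly as possible over the image height.
    bounds = [round(i * height / n_strips) for i in range(n_strips + 1)]
    return [image[top:bottom] for top, bottom in zip(bounds, bounds[1:])]

def shutter_blinds(image, strip_counts=(2, 3)):
    """Return the whole image followed by its strips for each strip count."""
    patches = [image]
    for n in strip_counts:
        patches.extend(horizontal_patches(image, n))
    return patches

# A toy 6x2 "image" whose pixel values equal the row index.
toy_image = [[r, r] for r in range(6)]
patches = shutter_blinds(toy_image)
patch_heights = [len(p) for p in patches]
```

Each patch would then be passed to the pre-trained network separately, and the resulting feature vectors concatenated before feature selection.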