105 research outputs found

    Doctor of Philosophy

    The use of the various complementary and alternative medicine (CAM) modalities for the management of chronic illnesses is widespread and still on the rise. Unfortunately, tools to support consumers in seeking information on the efficacy of these treatments are sparse and incomplete. The goals of this work were to understand consumers' needs in acquiring CAM information, to assess currently available information resources, and to investigate informatics methods that could provide a foundation for the development of CAM information resources. This dissertation consists of four studies. The first was a quantitative study that aimed to assess the feasibility of delivering CAM-drug interaction information through a web-based application. This study achieved an 85% participation rate, and 33% of participating patients reported using CAMs with potential interactions with their conventional treatments. The next study aimed to assess online CAM information resources that provide information on drug-herb interactions to consumers. None of the sites scored high on the combination of completeness and accuracy, and all were written above the reading level recommended by the US Department of Health and Human Services. The third study investigated information-seeking behaviors for CAM information using an existing cohort of cancer survivors. The study showed that patients in the cohort continued to use CAM well into survivorship. Patients felt very much on their own in dealing with issues outside of direct treatment, which often resulted in a search for options and CAM use. Finally, a study was conducted to investigate two methods to semi-automatically extract CAM treatment relations from the biomedical literature. The methods rely on SemMedDB, a database of semantic relations extracted from PubMed abstracts. This study demonstrated that SemMedDB can be used to reduce manual effort, but review of the extracted sentences is still necessary because of the two methods' low mean precisions of 23.7% and 26.4%. In summary, this dissertation provided greater insight into consumer information needs for CAM. Our findings provide an opportunity to leverage existing resources to improve the information-seeking experience for consumers through high-quality online tools, potentially moving them beyond reliance on anecdotal evidence when making decisions about CAM.
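    As an illustration of the relation-extraction step, the sketch below queries a local copy of SemMedDB for predications whose subject is a CAM substance and whose predicate denotes treatment. The connection settings, the example CAM term, and the table and column names follow the publicly documented SemMedDB schema but should be read as assumptions, not as the dissertation's exact pipeline.

```python
# Minimal sketch: pull candidate CAM treatment relations from a local SemMedDB
# MySQL dump for later manual review. Table/column names follow the public
# SemMedDB schema (PREDICATION table) but are assumptions here, as are the
# connection settings and the example CAM term.
import pymysql

CAM_TERM = "Ginkgo biloba"          # hypothetical CAM substance of interest
TREATMENT_PREDICATES = ("TREATS", "PREVENTS", "AFFECTS")

conn = pymysql.connect(host="localhost", user="semmed", password="semmed",
                       database="semmeddb")
try:
    with conn.cursor() as cur:
        placeholders = ", ".join(["%s"] * len(TREATMENT_PREDICATES))
        cur.execute(
            f"""SELECT PMID, SUBJECT_NAME, PREDICATE, OBJECT_NAME
                FROM PREDICATION
                WHERE SUBJECT_NAME = %s AND PREDICATE IN ({placeholders})""",
            (CAM_TERM, *TREATMENT_PREDICATES),
        )
        for pmid, subj, pred, obj in cur.fetchall():
            # Each row is only a candidate relation; given the reported
            # ~24-26% precision, the source sentences still need manual review.
            print(f"PMID {pmid}: {subj} {pred} {obj}")
finally:
    conn.close()
```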

    Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision

    Credibility signals represent a wide range of heuristics typically used by journalists and fact-checkers to assess the veracity of online content. Automating credibility signal extraction, however, is very challenging: it requires training high-accuracy, signal-specific extractors, while no sufficiently large datasets annotated with all credibility signals currently exist. This paper investigates whether large language models (LLMs) can be prompted effectively with a set of 18 credibility signals to produce weak labels for each signal. We then aggregate these potentially noisy labels using weak supervision in order to predict content veracity. We demonstrate that our approach, which combines zero-shot LLM credibility signal labeling and weak supervision, outperforms state-of-the-art classifiers on two misinformation datasets without using any ground-truth labels for training. We also analyse the contribution of the individual credibility signals towards predicting content veracity, which provides valuable new insights into their role in misinformation detection.
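    A minimal sketch of the aggregation step is shown below: each credibility signal is treated as a labeling function whose zero-shot LLM output votes on veracity, and a generative label model combines the votes. The signal names, the llm_label() stub, and the use of snorkel's LabelModel are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch: turn per-signal LLM judgements into weak labels and aggregate
# them with a weak-supervision label model. The signal list, the llm_label()
# stub, and the choice of snorkel's LabelModel are assumptions, not the
# paper's exact pipeline.
import numpy as np
from snorkel.labeling.model import LabelModel

SIGNALS = ["clickbait_headline", "lack_of_evidence", "expert_citation"]  # 3 of 18, for brevity

def llm_label(text: str, signal: str) -> int:
    """Stub: prompt an LLM 'Does this text exhibit <signal>?' and map the
    answer to 1 (signal present / not credible), 0 (absent), -1 (abstain)."""
    raise NotImplementedError("replace with a zero-shot LLM call")

def build_label_matrix(texts):
    # One column per credibility signal; -1 means the LLM abstained.
    return np.array([[llm_label(t, s) for s in SIGNALS] for t in texts])

# L has shape (n_documents, n_signals). The label model is fit without any
# ground-truth labels and predicts a veracity class per document
# (1 = misinformation, 0 = credible).
# texts = [...]                       # unlabeled documents
# L = build_label_matrix(texts)
# label_model = LabelModel(cardinality=2, verbose=False)
# label_model.fit(L_train=L, n_epochs=500, seed=0)
# veracity = label_model.predict(L)
```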

    Mediterranean developed coasts: what future for the foredune restoration?

    The feasibility and efficacy of soft-engineering foredune restoration approaches still lack insight from research and monitoring activities, especially in areas where dunes are under persistent human disturbance. We evaluated the efficacy of Mediterranean foredune restoration in dune areas freely accessible to tourists. Foredunes were reconstructed using only sand already available at nearby sites and consolidated through the planting of seedlings of native ecosystem-engineer species and foredune focal species. We monitored transplanted and spontaneous seedlings for one year to assess their mortality and growth in relation to the distance from the closest beach access, either formal or informal, as a proxy of human disturbance. We also tested whether species differing in their ecology (i.e., affinity to a given habitat) and growth form responded differently to human disturbance. The relationships between seedling mortality and growth and the distance from the closest beach access were tested through Generalized Linear Mixed Models. We found a clear spatial pattern of seedling survival and growth, both of which decreased with proximity to the closest beach access. Only invasive alien plants and erect leafy species performed better at shorter distances from beach accesses. In dune areas with a strong tourist vocation, foredune restoration should be coupled with the implementation of integrated management plans aimed at optimising the relationship between protection and use. Management plans should not rely only on passive conservation measures; rather, they should include educational activities to stimulate pro-environmental behaviour, increase the acceptance of behaviour rules and no-entry zones, and actively engage stakeholders in long-term conservation.
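    For readers who want to reproduce this style of analysis, the sketch below fits a linear mixed model relating seedling growth to distance from the nearest beach access, with restoration plot as a random factor. The column names and the use of statsmodels are assumptions, and the binary mortality outcome would instead require a binomial GLMM (e.g. glmer in R's lme4).

```python
# Minimal sketch of the distance-effect analysis: growth as a function of
# distance to the nearest beach access, with plot as a grouping (random)
# factor. Column names and the statsmodels linear mixed model are assumptions;
# a binary mortality response would need a binomial GLMM instead.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("seedlings.csv")   # hypothetical monitoring table

# Fixed effects: distance to access and species group; random intercept per plot.
model = smf.mixedlm("growth_cm ~ distance_m + species_group", df,
                    groups=df["plot_id"])
result = model.fit()
print(result.summary())
```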

    Testing the performance of an innovative markerless technique for quantitative and qualitative gait analysis

    Gait abnormalities such as high stride and step frequency/cadence (SF, strides/second; CAD, steps/second), stride variability (SV), and low harmony may increase the risk of injuries and be a sentinel of medical conditions. This research presents a new markerless, video-based technology for quantitative and qualitative gait analysis. 86 healthy individuals (mean age 32 years) performed a 90-s test on a treadmill at a self-selected walking speed. We measured SF and CAD with a photoelectric sensor system; we then calculated the average ± standard deviation (SD) and the within-subject coefficient of variation (CV) of SF as an index of SV. We also recorded a 60 fps video of the participant. With a custom-designed web-based video analysis software, we performed a spectral analysis of the brightness over time for each pixel of the image, which yielded the frequency content of the video. The two main frequency contents (F1 and F2) from this analysis should reflect the forcing/dominant variables, i.e., SF and CAD. A harmony index (HI) was then calculated, which should reflect the proportion of pixels of the image that move consistently with F1 or its supraharmonics. The higher the HI value, the less variable the gait. The correspondences SF-F1 and CAD-F2 were evaluated with both paired t-tests and correlation, and the relationship between SV and HI with correlation. SF and CAD were not significantly different from, and highly correlated with, F1 (0.893 ± 0.080 Hz vs. 0.895 ± 0.084 Hz, p < 0.001, r2 = 0.99) and F2 (1.787 ± 0.163 Hz vs. 1.791 ± 0.165 Hz, p < 0.001, r2 = 0.97). The SV was 1.84% ± 0.66% and was significantly and moderately correlated with HI (0.082 ± 0.028, p < 0.001, r2 = 0.13). The innovative video-based technique of global, markerless gait analysis proposed in our study accurately identifies the main frequency contents and the variability of gait in healthy individuals, thus providing a time-efficient, low-cost means to quantitatively and qualitatively study human locomotion.
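    The core of the method is a per-pixel spectral analysis of brightness over time. The numpy sketch below shows one way to recover the dominant frequency per pixel and a harmony-index-like proportion of pixels locked to the global dominant frequency; the array layout, tolerance, and harmonic set are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, not the authors' software: per-pixel FFT of brightness over
# time, extraction of the dominant frequency, and a harmony-index-like score.
import numpy as np

def gait_spectrum(frames: np.ndarray, fps: float = 60.0, tol: float = 0.05):
    """frames: (T, H, W) grayscale brightness values from the treadmill video."""
    T = frames.shape[0]
    # FFT along the time axis for every pixel; zero out the DC component.
    spectrum = np.abs(np.fft.rfft(frames, axis=0))
    freqs = np.fft.rfftfreq(T, d=1.0 / fps)
    spectrum[0] = 0.0
    # Dominant frequency per pixel, and the global dominant frequency F1
    # (expected to match the stride frequency).
    pixel_peak = freqs[np.argmax(spectrum, axis=0)]           # (H, W)
    f1 = freqs[np.argmax(spectrum.sum(axis=(1, 2)))]
    # Harmony-index-like score: fraction of pixels whose peak sits on F1 or
    # one of its first few supraharmonics, within a small tolerance.
    harmonics = f1 * np.arange(1, 5)
    locked = np.min(np.abs(pixel_peak[..., None] - harmonics), axis=-1) < tol
    return f1, locked.mean()

# Example with synthetic data: 90 s at 60 fps, 0.9 Hz dominant oscillation.
t = np.arange(90 * 60) / 60.0
frames = 0.5 + 0.5 * np.sin(2 * np.pi * 0.9 * t)[:, None, None] * np.ones((1, 32, 32))
print(gait_spectrum(frames))   # -> (0.9, 1.0) for this perfectly harmonic signal
```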

    ASSET: a dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations

    In order to simplify a sentence, human editors perform multiple rewriting transformations: they split it into several shorter sentences, paraphrase words (i.e. replace complex words or phrases with simpler synonyms), reorder components, and/or delete information deemed unnecessary. Despite this varied range of possible text alterations, current models for automatic sentence simplification are evaluated using datasets that focus on a single transformation, such as lexical paraphrasing or splitting. This makes it impossible to understand the ability of simplification models in more realistic settings. To alleviate this limitation, this paper introduces ASSET, a new dataset for assessing sentence simplification in English. ASSET is a crowdsourced multi-reference corpus where each simplification was produced by executing several rewriting transformations. Through quantitative and qualitative experiments, we show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task. Furthermore, we motivate the need for developing better methods of automatic evaluation using ASSET, since we show that current popular metrics may not be suitable when multiple simplification transformations are performed.
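    For readers who want to experiment with the corpus, the sketch below loads ASSET from the Hugging Face hub and iterates over the multiple references per original sentence. The dataset identifier, configuration name, split names, and field names are assumptions based on the publicly hosted version and may differ from the authors' own release.

```python
# Minimal sketch: iterate over ASSET's multi-reference simplifications via the
# Hugging Face hub. The dataset id, config, split, and field names ("original",
# "simplifications") are assumptions about the hosted version of the corpus.
from datasets import load_dataset

asset = load_dataset("asset", "simplification", split="validation")
for example in asset.select(range(3)):
    print("ORIGINAL:", example["original"])
    for ref in example["simplifications"]:
        # Each original sentence comes with several crowdsourced references,
        # each potentially combining splitting, paraphrasing, and deletion.
        print("  REF:", ref)
```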

    Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification

    Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results demonstrated that these methods can even improve performance on some classification tasks. This paper complements the existing research by investigating how these techniques influence classification performance and computation costs compared to full fine-tuning when applied to multilingual text classification tasks (genre, framing, and persuasion technique detection, with different input lengths, numbers of predicted classes, and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of their efficacy across different training scenarios (training on the original multilingual data, on translations into English, and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of parameter-efficient fine-tuning techniques, particularly to complex multilingual and multilabel classification tasks.
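    As a concrete illustration of the parameter-efficient side of such a comparison, the sketch below wraps a multilingual encoder with a LoRA configuration using the Hugging Face peft library. The base checkpoint, number of labels, rank, and target modules are illustrative assumptions, not the paper's exact hyperparameters.

```python
# Minimal sketch: attach LoRA adapters to a multilingual classifier with the
# Hugging Face peft library. Checkpoint, label count, rank, alpha, and target
# modules are illustrative assumptions, not the paper's configuration.
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=9)           # e.g. 9 genre classes (assumed)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["query", "value"],           # attention projections in XLM-R
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()               # only the LoRA weights train

# The wrapped model can then be trained with the usual transformers Trainer and
# compared, on accuracy and compute cost, with full fine-tuning of the base model.
```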

    Solution of the End Problem of a Liquid-Filled Cylindrical Acoustic Waveguide Using a Biorthogonality Principle

    This paper treats the forced motion of an isothermal, Newtonian liquid in a sem

    Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting

    A popular application of machine translation (MT) is gisting: MT output is consumed as is to make sense of text in a foreign language. Evaluation of the usefulness of MT for gisting is surprisingly uncommon. The classical method uses reading comprehension questionnaires (RCQ), in which informants are asked to answer professionally written questions in their language about a foreign text that has been machine-translated into their language. Recently, gap filling (GF), a form of cloze testing, has been proposed as a cheaper alternative to RCQ. In GF, certain words are removed from reference translations and readers are asked to fill the gaps left using the machine-translated text as a hint. This paper reports, for the first time, a comparative evaluation, using both RCQ and GF, of translations from multiple MT systems for the same foreign texts, and a systematic study of the effect of variables such as gap density, gap-selection strategies, and document context in GF. The main findings of the study are: (a) both RCQ and GF clearly identify MT as useful, (b) global RCQ and GF rankings for the MT systems are mostly in agreement, (c) GF scores vary very widely across informants, making comparisons among MT systems hard, and (d) unlike RCQ, which is framed around documents, GF evaluation can be framed at the sentence level. These findings support the use of GF as a cheaper alternative to RCQ.
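    To make the GF protocol concrete, the sketch below generates a cloze item from a reference translation by blanking content words at a given gap density. The density value, the tiny stop-word list, and the blank marker are illustrative assumptions, not the gap-selection strategies actually studied in the paper.

```python
# Minimal sketch: build a gap-filling (cloze) item from a reference translation.
# Gap density, the stop-word list, and the blank marker are illustrative
# assumptions, not the gap-selection strategies evaluated in the paper.
import random

STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "was"}

def make_gap_filling_item(reference: str, density: float = 0.2, seed: int = 0):
    rng = random.Random(seed)
    tokens = reference.split()
    # Only content words are candidates for blanking.
    candidates = [i for i, tok in enumerate(tokens)
                  if tok.lower().strip(".,;:") not in STOP_WORDS]
    n_gaps = max(1, round(density * len(candidates)))
    gapped = set(rng.sample(candidates, n_gaps))
    answers = [tokens[i] for i in sorted(gapped)]
    cloze = " ".join("_____" if i in gapped else tok for i, tok in enumerate(tokens))
    return cloze, answers

cloze, answers = make_gap_filling_item(
    "The committee approved the proposal after a short debate")
print(cloze)    # e.g. "The committee approved the _____ after a short debate"
print(answers)  # the removed words the informant must recover using the MT output
```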

    Probing for idiomaticity in vector space models

    Contextualised word representation models have been successfully used to capture different word usages, and they may be an attractive alternative for representing idiomaticity in language. In this paper, we propose probing measures to assess whether some of the expected linguistic properties of noun compounds, especially those related to idiomatic meanings, and their dependence on context and sensitivity to lexical choice, are readily available in some standard and widely used representations. For that, we constructed the Noun Compound Senses Dataset, which contains noun compounds and their paraphrases, in context-neutral and context-informative naturalistic sentences, in two languages: English and Portuguese. Results obtained using four types of probing measures with models such as ELMo, BERT, and some of their variants indicate that idiomaticity is not yet accurately represented by contextualised models.
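    One of the simplest probes of this kind compares a compound's contextualised representation with that of its paraphrase. The sketch below does this with a BERT encoder from the transformers library; the checkpoint, the mean-pooling choice, and the example sentence pair are illustrative assumptions, not the paper's exact probing measures.

```python
# Minimal sketch of a similarity-based probe: compare the contextualised
# representation of a sentence containing a noun compound with that of a
# paraphrased version. Checkpoint, mean pooling, and the example pair are
# illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)              # mean-pooled sentence vector

# An idiomatic compound should drift further from its literal paraphrase than a
# compositional one does.
compound   = embed("He is a night owl who does his best work after midnight.")
paraphrase = embed("He is a person who stays up late and does his best work after midnight.")
cosine = torch.nn.functional.cosine_similarity(compound, paraphrase, dim=0)
print(f"cosine similarity: {cosine.item():.3f}")
```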

    Categorising fine-to-coarse grained misinformation: an empirical study of COVID-19 infodemic

    The spread of COVID-19 misinformation over social media has already drawn the attention of many researchers. According to Google Scholar, about 26,000 COVID-19-related misinformation studies have been published to date. Most of these studies focus on (1) detecting and/or (2) analysing the characteristics of COVID-19-related misinformation. However, the study of social behaviours related to misinformation is often neglected. In this paper, we introduce a fine-grained annotated dataset of misinformation tweets that includes annotation of social behaviours (e.g. commenting on or questioning the misinformation). The dataset not only allows the analysis of social behaviours but is also suitable for both evidence-based and non-evidence-based misinformation classification tasks. In addition, we introduce leave-claim-out validation in our experiments and demonstrate that misinformation classification performance can differ significantly when models are applied to real-world, unseen misinformation.
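    Leave-claim-out validation amounts to grouping tweets by the claim they discuss and ensuring that no claim appears in both the training and test folds. The sketch below does this with scikit-learn's GroupKFold; the file and column names and the fold count are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of leave-claim-out validation: group tweets by claim id so a
# claim never appears in both train and test. File/column names and the fold
# count are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import GroupKFold

df = pd.read_csv("covid_misinfo_tweets.csv")    # hypothetical annotated dataset
X, y, groups = df["tweet_text"], df["label"], df["claim_id"]

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(gkf.split(X, y, groups)):
    # All tweets about a given claim fall entirely in train or entirely in test,
    # so the classifier is always evaluated on claims it has never seen.
    assert set(groups.iloc[train_idx]).isdisjoint(groups.iloc[test_idx])
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test tweets")
```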