507 research outputs found

    Mitigating Health Data Poverty: Generative Approaches versus Resampling for Time-series Clinical Data

    Several approaches have been developed to mitigate algorithmic bias stemming from health data poverty, where minority groups are underrepresented in training datasets. Augmenting the minority class using resampling (such as SMOTE) is a widely used approach due to the simplicity of the algorithms. However, these algorithms decrease data variability and may introduce correlations between samples, which has motivated generative approaches based on GANs. Generating high-dimensional, time-series, authentic data that provide wide distribution coverage of the real data remains a challenging task for both resampling and GAN-based approaches. In this work we propose the CA-GAN architecture, which addresses some of the shortcomings of the current approaches, and we provide a detailed comparison with both SMOTE and WGAN-GP, using a high-dimensional, time-series, real dataset of 3343 hypotensive Caucasian and Black patients. We show that our approach is better at both generating authentic data of the minority class and remaining within the original distribution of the real data.

    Generating Medical Prescriptions with Conditional Transformer

    Access to real-world medication prescriptions is essential for medical research and healthcare quality improvement. However, access to real medication prescriptions is often limited due to the sensitive nature of the information expressed. Additionally, manually labelling these instructions for training and fine-tuning Natural Language Processing (NLP) models can be tedious and expensive. We introduce a novel task-specific model architecture, Label-To-Text-Transformer (LT3), tailored to generate synthetic medication prescriptions based on provided labels, such as a vocabulary list of medications and their attributes. LT3 is trained on a set of around 2K lines of medication prescriptions extracted from the MIMIC-III database, allowing the model to produce valuable synthetic medication prescriptions. We evaluate LT3's performance by contrasting it with a state-of-the-art Pre-trained Language Model (PLM), T5, analysing the quality and diversity of generated texts. We deploy the generated synthetic data to train the SpacyNER model for the Named Entity Recognition (NER) task over the n2c2-2018 dataset. The experiments show that the model trained on synthetic data can achieve a 96-98% F1 score at Label Recognition on Drug, Frequency, Route, Strength, and Form. LT3 code and data will be shared at https://github.com/HECTA-UoM/Label-To-Text-Transformer. Comment: Accepted to: Workshop on Synthetic Data Generation with Generative AI (SyntheticData4ML Workshop) at NeurIPS 202

    Large Language Models and Control Mechanisms Improve Text Readability of Biomedical Abstracts

    Biomedical literature often uses complex language and inaccessible professional terminology, so simplification plays an important role in improving public health literacy. Applying Natural Language Processing (NLP) models to automate such tasks allows for quick and direct accessibility for lay readers. In this work, we investigate the ability of state-of-the-art large language models (LLMs) on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (PLABA). The methods applied include domain fine-tuning and prompt-based learning (PBL) on: 1) Encoder-decoder models (T5, SciFive, and BART), 2) Decoder-only GPT models (GPT-3.5 and GPT-4) from OpenAI and BioGPT, and 3) Control-token mechanisms on BART-based models. We used a range of automatic evaluation metrics, including BLEU, ROUGE, SARI, and BERTscore, and also conducted human evaluations. BART-Large with Control Token (BART-L-w-CT) mechanisms reported the highest SARI score of 46.54 and T5-base reported the highest BERTscore of 72.62. In human evaluation, BART-L-w-CTs achieved a better simplicity score than T5-Base (2.9 vs. 2.2), while T5-Base achieved a better meaning preservation score than BART-L-w-CTs (3.1 vs. 2.6). We also categorised the system outputs with examples, hoping this will shed some light for future research on this task. Our code, fine-tuned models, and data splits are available at https://github.com/HECTA-UoM/PLABA-MU. Comment: working paper

    Multiplicity dependence of light (anti-)nuclei production in p–Pb collisions at √sNN = 5.02 TeV

    The measurement of the deuteron and anti-deuteron production in the rapidity range −1 < y < 0 as a function of transverse momentum and event multiplicity in p–Pb collisions at √sNN = 5.02 TeV is presented. (Anti-)deuterons are identified via their specific energy loss dE/dx and via their time-of-flight. Their production in p–Pb collisions is compared to pp and Pb–Pb collisions and is discussed within the context of thermal and coalescence models. The ratio of integrated yields of deuterons to protons (d/p) shows a significant increase as a function of the charged-particle multiplicity of the event, starting from values similar to those observed in pp collisions at low multiplicities and approaching those observed in Pb–Pb collisions at high multiplicities. The mean transverse particle momenta are extracted from the deuteron spectra and the values are similar to those obtained for p and Λ particles. Thus, deuteron spectra do not follow mass ordering. This behaviour is in contrast to the trend observed for non-composite particles in p–Pb collisions. In addition, the production of the rare ³He and anti-³He nuclei has been studied. The spectrum corresponding to all non-single diffractive p–Pb collisions is obtained in the rapidity window −1 < y < 0 and the pT-integrated yield dN/dy is extracted. It is found that the yields of protons, deuterons, and ³He, normalised by the spin degeneracy factor, follow an exponential decrease with mass number.

    Measurement of inclusive J/$\psi$ pair production cross section in pp collisions at $\sqrt{s} = 13$ TeV

    The production cross section of inclusive J/$\psi$ pairs in pp collisions at a centre-of-mass energy $\sqrt{s} = 13$ TeV is measured with ALICE. The measurement is performed for J/$\psi$ in the rapidity interval $2.5 < y < 4.0$ and for transverse momentum $p_{\rm T} > 0$. The production cross section of inclusive J/$\psi$ pairs is reported to be $10.3 \pm 2.3\,{\rm (stat.)} \pm 1.3\,{\rm (syst.)}$ nb in this kinematic interval. The contribution from non-prompt J/$\psi$ (i.e. originating from beauty-hadron decays) to the inclusive sample is evaluated. The results are discussed and compared with data.

    Inclusive and multiplicity dependent production of electrons from heavy-flavour hadron decays in pp and p-Pb collisions

    Measurements of the production of electrons from heavy-flavour hadron decays in pp collisions at $\sqrt{s} = 13$ TeV at midrapidity with the ALICE detector are presented down to a transverse momentum ($p_{\rm T}$) of 0.2 GeV/$c$ and up to $p_{\rm T} = 35$ GeV/$c$, which is the largest momentum range probed for inclusive electron measurements in ALICE. In p-Pb collisions, the production cross section and the nuclear modification factor of electrons from heavy-flavour hadron decays are measured in the $p_{\rm T}$ range $0.5 < p_{\rm T} < 26$ GeV/$c$ at $\sqrt{s_{\rm NN}} = 8.16$ TeV. The nuclear modification factor is found to be consistent with unity within the statistical and systematic uncertainties. In both collision systems, first measurements of the yields of electrons from heavy-flavour hadron decays in different multiplicity intervals normalised to the multiplicity-integrated yield (self-normalised yield) at midrapidity are reported as a function of the self-normalised charged-particle multiplicity estimated at midrapidity. The self-normalised yields in pp and p-Pb collisions grow faster than linearly with the self-normalised multiplicity. A strong $p_{\rm T}$ dependence is observed in pp collisions, where the yield of high-$p_{\rm T}$ electrons increases faster as a function of multiplicity than that of low-$p_{\rm T}$ electrons. The measurement in p-Pb collisions shows no $p_{\rm T}$ dependence within uncertainties. The self-normalised yields in pp and p-Pb collisions are compared with measurements of other heavy-flavour, light-flavour, and strange particles, and with Monte Carlo simulations.

    Observation of medium-induced yield enhancement and acoplanarity broadening of low-$p_\mathrm{T}$ jets from measurements in pp and central Pb-Pb collisions at $\sqrt{s_{\rm NN}}=5.02$ TeV

    The ALICE Collaboration reports the measurement of semi-inclusive distributions of charged-particle jets recoiling from a high transverse momentum (high $p_{\rm T}$) hadron trigger in proton-proton and central Pb-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV. A data-driven statistical method is used to mitigate the large uncorrelated background in central Pb-Pb collisions. Recoil jet distributions are reported for jet resolution parameter $R=0.2$, 0.4, and 0.5 in the range $7 < p_{\rm T,jet} < 140$ GeV/$c$ and trigger-recoil jet azimuthal separation $\pi/2 < \Delta\varphi < \pi$. The measurements exhibit a marked medium-induced jet yield enhancement at low $p_{\rm T}$ and at large azimuthal deviation from $\Delta\varphi\sim\pi$. The enhancement is characterized by its dependence on $\Delta\varphi$, which has a slope that differs from zero by $4.7\sigma$. Comparisons to model calculations incorporating different formulations of jet quenching are reported. These comparisons indicate that the observed yield enhancement arises from the response of the QGP medium to jet propagation.

    Probing the Chiral Magnetic Wave with charge-dependent flow measurements in Pb-Pb collisions at the LHC

    The Chiral Magnetic Wave (CMW) phenomenon is essential to provide insights into the strong interaction in QCD, the properties of the quark-gluon plasma, and the topological characteristics of the early universe, offering a deeper understanding of fundamental physics in high-energy collisions. Measurements of the charge-dependent anisotropic flow coefficients are studied in Pb-Pb collisions at center-of-mass energy per nucleon-nucleon collision $\sqrt{s_{\mathrm{NN}}} = 5.02$ TeV to probe the CMW. In particular, the slope of the normalized difference in elliptic ($v_{2}$) and triangular ($v_{3}$) flow coefficients of positively and negatively charged particles as a function of their event-wise normalized number difference is reported for inclusive and identified particles. The slope $r_{3}^{\rm Norm}$ is found to be larger than zero and to have a magnitude similar to $r_{2}^{\rm Norm}$, thus pointing to a large background contribution for these measurements. Furthermore, $r_{2}^{\rm Norm}$ can be described by a blast wave model calculation that incorporates local charge conservation. In addition, using the event shape engineering technique yields a fraction of CMW ($f_{\rm CMW}$) contribution to this measurement which is compatible with zero. This measurement provides the very first upper limit for $f_{\rm CMW}$, and in the 10-60% centrality interval it is found to be 26% (38%) at 95% (99.7%) confidence level.

    Measurement of the Cross Sections of $\Xi^0_{c}$ and $\Xi^+_{c}$ Baryons and of the Branching-Fraction Ratio BR($\Xi^0_{c} \rightarrow \Xi^- e^+ \nu_{e}$)/BR($\Xi^0_{c} \rightarrow \Xi^-\pi^+$) in pp collisions at 13 TeV

    The $p_T$-differential cross sections of prompt charm-strange baryons $\Xi_c^0$ and $\Xi_c^+$ were measured at midrapidity ($|y| < 0.5$) in proton-proton (pp) collisions at a center-of-mass energy $\sqrt{s} = 13$ TeV with the ALICE detector at the LHC. The $\Xi_c^0$ baryon was reconstructed via both the semileptonic decay ($\Xi^- e^+ \nu_e$) and the hadronic decay ($\Xi^-\pi^+$) channels. The $\Xi_c^+$ baryon was reconstructed via the hadronic decay ($\Xi^-\pi^+\pi^+$) channel. The branching-fraction ratio BR($\Xi_c^0 \rightarrow \Xi^- e^+ \nu_e$)/BR($\Xi_c^0 \rightarrow \Xi^-\pi^+$) = $1.38 \pm 0.14\,{\rm (stat)} \pm 0.22\,{\rm (syst)}$ was measured with a total uncertainty reduced by a factor of about 3 with respect to the current world average reported by the Particle Data Group. The transverse momentum ($p_T$) dependence of the $\Xi_c^0$- and $\Xi_c^+$-baryon production relative to the $D^0$ meson and to the $\Sigma_c^{0,+,++}$- and $\Lambda_c^+$-baryon production are reported. The baryon-to-meson ratio increases toward low $p_T$ up to a value of approximately 0.3. The measurements are compared with various models that take different hadronization mechanisms into consideration. The results provide stringent constraints to these theoretical calculations and additional evidence that different processes are involved in charm hadronization in electron-positron ($e^+e^-$) and hadronic collisions.