16 research outputs found

    The time is now: Achieving FH paediatric screening across Europe – The Prague Declaration

    Review. Familial hypercholesterolaemia (FH) is severely under-recognized, under-diagnosed and under-treated in Europe, leading to a significantly higher risk of premature cardiovascular disease in those affected. FH is an inherited condition that causes very high cholesterol and affects 1 in 300 individuals regardless of age, race, sex, and lifestyle, making it the most common inherited metabolic disorder in the world and a non-modifiable cardiovascular disease risk factor.

    Slovene Natural Language Inference Dataset SI-NLI

    SI-NLI (Slovene Natural Language Inference Dataset) contains 5,937 human-created Slovene sentence pairs (premise and hypothesis) that are manually labeled as "entailment", "contradiction", or "neutral". We created the dataset using sentences that appear in the Slovenian reference corpus ccKres (http://hdl.handle.net/11356/1034). Annotators were tasked with modifying the hypothesis in a candidate pair so that it reflects one of the labels. The dataset is balanced because the annotators created three modifications (entailment, contradiction, neutral) for each candidate sentence pair. The dataset is split into train, validation, and test sets of 4,392, 547, and 998 pairs, respectively. We used Slovenian pre-trained language models to create the splits, thereby ensuring that difficult and easy instances are evenly distributed across all three subsets. The dataset is released in a tabular TSV format. The README.txt file contains a description of the attributes. Only the hypothesis and premise are given in the test set (i.e., no annotations), since SI-NLI is integrated into the Slovene evaluation framework SloBENCH (https://slobench.cjvt.si/). If you use the dataset to train your models, please consider submitting the test set predictions to SloBENCH to get the evaluation score and see how it compares to others.
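As a rough illustration of the tabular layout described above, the following sketch checks that the reported split sizes add up to the stated total and tallies labels from TSV text. The column names `premise`, `hypothesis`, and `label`, as well as the placeholder rows, are assumptions made for this example; the dataset's README.txt documents the actual attributes.

```python
import csv
import io
from collections import Counter

# Split sizes as reported in the dataset description.
SPLIT_SIZES = {"train": 4392, "validation": 547, "test": 998}

# Hypothetical placeholder rows; the column names are assumed, not taken
# from the dataset's README.txt.
SAMPLE_TSV = (
    "premise\thypothesis\tlabel\n"
    "premise A\thypothesis A1\tentailment\n"
    "premise A\thypothesis A2\tcontradiction\n"
    "premise A\thypothesis A3\tneutral\n"
)


def count_labels(tsv_text):
    """Count how often each label occurs in tab-separated NLI rows."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return Counter(row["label"] for row in reader)


if __name__ == "__main__":
    print(sum(SPLIT_SIZES.values()))   # total number of sentence pairs
    print(count_labels(SAMPLE_TSV))    # label distribution in the sample
```

A label tally like this is one simple way to confirm the balance the description mentions, since each candidate pair should contribute one row per label.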

    A simplified nasopharyngeal swab collection procedure for minimizing patient discomfort while retaining sample quality

    A nasopharyngeal swab (NPS) is the most frequently collected sample type when molecular diagnosis of respiratory viruses, including SARS-CoV-2, is required. An optimal collection technique would provide sufficient sample quality for the diagnostic process and would minimize the discomfort felt by the patient. This study compares a simplified NPS collection procedure with only one rotation of the swab to a more standard procedure with five rotations. Swabs were collected from 76 healthy volunteers by the same healthcare professional on 2 consecutive days at a similar hour to minimize variability. The Ubiquitin C (UBC) copy number per sample was measured by real-time quantitative PCR, and patient discomfort was assessed by questionnaire. No statistically significant difference (p = 0.15) was observed in the UBC copy number per sample between an NPS collected with one rotation (5.2 ± 0.6 log UBC copies/sample) and one collected with five rotations (5.3 ± 0.5 log UBC copies/sample). However, a statistically significant difference was observed in discomfort between the two procedures, the five-rotation procedure being much more uncomfortable. Additional analysis of the results showed a weak correlation between discomfort and the number of human cells recovered (Spearman's rho = 0.202) and greater discomfort in younger people. The results of this study show that an NPS collected with one slow rotation has the same quality as an NPS collected with five rotations. Moreover, the one-rotation procedure is shorter and, most importantly, less unpleasant for patients.

    Performance of nasopharyngeal swab and saliva in detecting Delta and Omicron SARS-CoV-2 variants

    A prospective cohort study was conducted during the Delta and Omicron severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) epidemic waves using paired nasopharyngeal swab (NPS or NP swab) and saliva samples taken from 624 participants. The study aimed to assess whether any differences could be observed between participants from the two waves and whether molecular diagnostic performance differed between the two sample types. Samples were transported immediately to the laboratory to ensure the highest possible sample quality without any freezing and thawing steps before processing. Nucleic acids from saliva and NPS were prospectively extracted and SARS-CoV-2 was detected using a real-time reverse-transcription polymerase chain reaction. All observed results were statistically analyzed. Although the results obtained with NP swabs and saliva agreed overall, higher viral loads were observed in NP swabs regardless of the day of specimen collection in both SARS-CoV-2 epidemic waves. No significant difference could be observed between the two epidemic waves characterized by Delta or Omicron SARS-CoV-2. Notably, Delta infection resulted in higher viral loads in both NP swabs and saliva and in more symptoms, including rhinorrhea, cough, and dyspnea, whereas Omicron wave patients more frequently reported sore throat. An increase in the mean log RNA of SARS-CoV-2 was observed with the number of expressed symptoms in both waves; however, the difference was not significant. Data confirmed that results from saliva were concordant with those from NP swabs, although saliva proved to be a challenging sample with frequent inhibitions that required substantial retesting.

    Corpus extraction tool LIST 1.2

    The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that can be imported into Microsoft Excel or similar statistical processing software. Version 1.2 adds support for Gigafida 2.0 in XML format and fixes a bug that prevented the extraction of character-level n-grams from normalized forms in the GOS 1.0 corpus.
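To make the idea of character-level n-gram extraction with CSV output concrete, here is a minimal Python sketch. It is not LIST's implementation (LIST is a Java program whose internals are not described here); the function names and the two-column output layout are assumptions chosen for illustration.

```python
import csv
import io
from collections import Counter


def char_ngrams(word, n):
    """Return all contiguous character n-grams of a word, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_frequencies(words, n):
    """Count character n-grams over a list of word forms."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


def to_csv(counts):
    """Serialize counts as CSV text, most frequent first, ties alphabetical."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ngram", "frequency"])  # assumed header names
    for ngram, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([ngram, freq])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(ngram_frequencies(["mama", "tata"], 2)))
```

The same counting could be applied to normalized forms instead of surface forms by passing a different word list.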

    Corpus extraction tool LIST 1.0

    The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that can be imported into Microsoft Excel or similar statistical processing software.

    Morphological lexicon Sloleks 2.0

    Sloleks is the reference morphological lexicon for the Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approximately 100,000 of the most frequent Slovenian lemmas, their inflected or derivative word forms and the corresponding grammatical description. Lemmatization rules, part-of-speech categorization and the set of feature-value pairs follow the JOS morphosyntactic specifications. In addition to grammatical information, each word form is annotated with its absolute corpus frequency and its compliance with the reference language standard. Sloleks 2.0 includes accents automatically assigned using neural networks (Krsnik 2017) and partially manually corrected, as well as automatically generated IPA and SAMPA transcriptions of lemmas and word forms. The canonical version is encoded in XML, against the Sloleks LMF DTD. The resource is also available as a TSV file in the MULTEXT-East format, with wordform, lemma, MSD and frequency columns, also mapped to Universal Dependencies features. References: Kaja Dobrovoljc, Simon Krek and Tomaž Erjavec, 2017: The Sloleks Morphological Lexicon and its Future Development. In (Vojko Gorjanc, Polona Gantar, Iztok Kosem and Simon Krek, eds.): Dictionary of Modern Slovene: Problems and Solutions. Ljubljana University Press, Faculty of Arts. https://e-knjige.ff.uni-lj.si/znanstvena-zalozba/catalog/download/2/1/47-1 Krsnik, Luka. Napovedovanje naglasa slovenskih besed z metodami strojnega učenja: magistrsko delo: magistrski program druge stopnje Računalništvo in informatika. Ljubljana: [L. Krsnik], 2017. http://eprints.fri.uni-lj.si/3978
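The TSV variant mentioned above could be consumed along the following lines. This is a sketch under stated assumptions: it assumes exactly four tab-separated columns in the order wordform, lemma, MSD, frequency, and the sample rows (including the MSD tags and frequency values) are invented for illustration rather than taken from the lexicon.

```python
from collections import defaultdict

# Hypothetical rows; the column order and all values are assumptions.
SAMPLE = (
    "miza\tmiza\tNcfsn\t120\n"
    "mize\tmiza\tNcfsg\t80\n"
    "hiša\thiša\tNcfsn\t200\n"
)


def parse_line(line):
    """Split one TSV line into a record with an integer frequency."""
    wordform, lemma, msd, frequency = line.rstrip("\n").split("\t")
    return {
        "wordform": wordform,
        "lemma": lemma,
        "msd": msd,
        "frequency": int(frequency),
    }


def forms_by_lemma(tsv_text):
    """Group parsed word-form records under their lemma."""
    grouped = defaultdict(list)
    for line in tsv_text.splitlines():
        if line.strip():  # skip blank lines
            record = parse_line(line)
            grouped[record["lemma"]].append(record)
    return dict(grouped)


if __name__ == "__main__":
    for lemma, records in forms_by_lemma(SAMPLE).items():
        total = sum(r["frequency"] for r in records)
        print(lemma, [r["wordform"] for r in records], total)
```

If the real file carries extra columns, such as the Universal Dependencies mapping, the unpacking in `parse_line` would need to be adjusted accordingly.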

    Morphological lexicon Sloleks 3.0

    Sloleks is a reference morphological lexicon of Slovene that was developed to be used in various NLP applications and language manuals. It contains Slovene lemmas, their inflected or derivative word forms and the corresponding grammatical description. In addition to the approx. 100,000 entries already available in Sloleks 2.0 (http://hdl.handle.net/11356/1230), Sloleks 3.0 contains approx. 265,000 newly generated entries for the most frequent lemmas in Gigafida 2.0 (http://hdl.handle.net/11356/1320) that were not included in previous versions of Sloleks. For verbs, adjectives, adverbs, and common nouns, the lemmas were checked manually by three annotators and included in Sloleks only if confirmed as legitimate by at least one annotator. No manual checking was performed on proper nouns. Lemmatization rules, part-of-speech categorization and the set of feature-value pairs follow the MULTEXT-East morphosyntactic specifications for Slovenian (https://nl.ijs.si/ME/V6/msd/html/msd-sl.html). In addition to grammatical information, each word form is annotated with its absolute corpus frequency and its compliance with the reference language standard. Most entries also contain information on their morphological patterns (see http://hdl.handle.net/11356/1411 for more on morphological patterns). The lexicon also includes accentuated word forms automatically generated through neural networks (Krsnik 2017). For the 100,000 entries from Sloleks 2.0, the accentuated forms were manually corrected, whereas the accentuated forms for the other 265,000 entries are fully automatic. IPA and SAMPA phonetic transcriptions were generated automatically using an improved G2P system for Slovene developed within the RSDO project (see https://github.com/clarinsi/slovene_g2p).
    Version 3.0 is encoded in XML, but unlike 2.0, which used the LMF format, the new version uses a custom XML format developed for the morphological lexicon by the Centre for Language Resources and Technologies of the University of Ljubljana (see the included .xsd files and "00README.txt" for details). Reference: Krsnik, Luka. Napovedovanje naglasa slovenskih besed z metodami strojnega učenja: magistrsko delo: magistrski program druge stopnje Računalništvo in informatika. Ljubljana: [L. Krsnik], 2017. http://eprints.fri.uni-lj.si/3978

    Thesaurus of Modern Slovene 1.0 (ELEXIS)

    Slovar sopomenk sodobne slovenščine 1.0. This is an automatically created Slovene thesaurus from Slovene data available in a comprehensive English–Slovenian dictionary, a monolingual dictionary, and a corpus. A network analysis on the bilingual dictionary word co-occurrence graph was used, together with additional information from the distributional thesaurus data available as part of the Sketch Engine tool and extracted from the 1.2 billion word Gigafida corpus and the monolingual dictionary. See also: http://hdl.handle.net/11356/116