11 research outputs found

    CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data

    In recent years, the field of document understanding has progressed a lot. A significant part of this progress has been possible thanks to the use of language models pretrained on large amounts of documents. However, pretraining corpora used in the domain of document understanding are single domain, monolingual, or nonpublic. Our goal in this paper is to propose an efficient pipeline for creating a big-scale, diverse, multilingual corpus of PDF files from all over the Internet using Common Crawl, as PDF files are the most canonical types of documents as considered in document understanding. We analysed extensively all of the steps of the pipeline and proposed a solution which is a trade-off between data quality and processing time. We also share a CCpdf corpus in the form of an index of PDF files along with a script for downloading them, which produces a collection useful for language model pretraining. The dataset and tools published with this paper offer researchers the opportunity to develop even better multilingual language models. Comment: Accepted at ICDAR 202

    Immune-escape mutations and stop-codons in HBsAg develop in a large proportion of patients with chronic HBV infection exposed to anti-HBV drugs in Europe

    Background: HBsAg immune-escape mutations can favor HBV-transmission also in vaccinated individuals, promote immunosuppression-driven HBV-reactivation, and increase fitness of drug-resistant strains. Stop-codons can enhance HBV oncogenic-properties. Furthermore, as a consequence of the overlapping structure of HBV genome, some immune-escape mutations or stop-codons in HBsAg can derive from drug-resistance mutations in RT. This study is aimed at gaining insight in prevalence and characteristics of immune-associated escape mutations, and stop-codons in HBsAg in chronically HBV-infected patients experiencing nucleos(t)ide analogues (NA) in Europe. Methods: This study analyzed 828 chronically HBV-infected European patients exposed to ≥ 1 NA, with detectable HBV-DNA and with an available HBsAg-sequence. The immune-associated escape mutations and the NA-induced immune-escape mutations sI195M, sI196S, and sE164D (resulting from drug-resistance mutations rtM204V, rtM204I, and rtV173L) were retrieved from literature and examined. Mutations were defined as an amino acid substitution with respect to a genotype A or D reference sequence. Results: At least one immune-associated escape mutation was detected in 22.1% of patients with rising temporal-trend. By multivariable-analysis, genotype-D correlated with higher selection of ≥ 1 immune-associated escape mutation (OR[95%CI]:2.20[1.32-3.67], P = 0.002). In genotype-D, the presence of ≥ 1 immune-associated escape mutations was significantly higher in drug-exposed patients with drug-resistant strains than with wild-type virus (29.5% vs 20.3%, P = 0.012). Result confirmed by ana

    Evaluation of Clinical Biomarkers Related to CD4 Recovery in HIV-Infected Patients—5-Year Observation

    Human Immunodeficiency Virus infection leads to the impairment of immune system function. Even long-term antiretroviral therapy uncommonly leads to the normalization of CD4 count and CD4:CD8 ratio. The aim of this study was to evaluate possible clinical biomarkers which may be related to CD4 and CD4:CD8 ratio recovery among HIV-infected patients with long-term antiretroviral therapy. The study included 68 HIV-infected patients undergoing sustained antiretroviral treatment for a minimum of 5 years. Clinical biomarkers such as age, gender, advancement of HIV infection, coinfections, comorbidities and applied ART regimens were analyzed in relation to the rates of CD4 and CD4:CD8 increase and normalization rates. The results showed that higher rates of CD4 normalization are associated with younger age (p = 0.034), higher CD4 count (p = 0.034) and starting the therapy during acute HIV infection (p = 0.012). Higher rates of CD4:CD8 ratio normalization are correlated with higher CD4 cell count (p = 0.022), high HIV viral load (p = 0.006) and acute HIV infection (p = 0.013). We did not observe statistically significant differences in CD4 recovery depending on gender, HCV/HBV coinfections, comorbidities and opportunistic infections. The obtained results advocate for current recommendations of introducing antiretroviral therapy as soon as possible, preferably during acute HIV infection, since it increases the chances of sufficient immune reconstruction

    Combined analysis of the prevalence of drug-resistant Hepatitis B virus in antiviral therapy-experienced patients in Europe (CAPRE)

    Background: European guidelines recommend treatment of chronic hepatitis B virus infection (CHB) with the nucleos(t)ide analogs (NAs) entecavir or tenofovir. However, many European CHB patients have been exposed to other NAs, which are associated with therapy failure and resistance. The CAPRE study was performed to gain insight in prevalence and characteristics of NA resistance in Europe. Methods: A survey was performed on genotypic resistance testing results acquired during routine monitoring of CHB patients with detectable serum hepatitis B virus DNA in European tertiary referral centers. Results: Data from 1568 patients were included. The majority (73.8%) were exposed to lamivudine monotherapy. Drug-resistant strains were detected in 52.7%. The most frequently encountered primary mutation was M204V/I (48.7%), followed by A181T/V (3.8%) and N236T (2.6%). In patients exposed to entecavir (n = 102), full resistance was present in 35.3%. Independent risk factors for resistance were age, viral load, and lamivudine exposure (

    Combined Analysis of the Prevalence of Drug-Resistant Hepatitis B Virus in Antiviral Therapy-Experienced Patients in Europe (CAPRE)

    European guidelines recommend treatment of chronic hepatitis B virus infection (CHB) with the nucleos(t)ide analogs (NAs) entecavir or tenofovir. However, many European CHB patients have been exposed to other NAs, which are associated with therapy failure and resistance. The CAPRE study was performed to gain insight in prevalence and characteristics of NA resistance in Europe. A survey was performed on genotypic resistance testing results acquired during routine monitoring of CHB patients with detectable serum hepatitis B virus DNA in European tertiary referral centers. Data from 1568 patients were included. The majority (73.8%) were exposed to lamivudine monotherapy. Drug-resistant strains were detected in 52.7%. The most frequently encountered primary mutation was M204V/I (48.7%), followed by A181T/V (3.8%) and N236T (2.6%). In patients exposed to entecavir (n = 102), full resistance was present in 35.3%. Independent risk factors for resistance were age, viral load, and lamivudine exposure (P < .001). These findings support resistance testing in cases of apparent NA therapy failure. This survey highlights the impact of exposure to lamivudine and adefovir on development of drug resistance and cross-resistance. Continued use of these NAs needs to be reconsidered at a pan-European level

    Immune-escape mutations and stop-codons in HBsAg develop in a large proportion of patients with chronic HBV infection exposed to anti-HBV drugs in Europe

    Background: HBsAg immune-escape mutations can favor HBV-transmission also in vaccinated individuals, promote immunosuppression-driven HBV-reactivation, and increase fitness of drug-resistant strains. Stop-codons can enhance HBV oncogenic-properties. Furthermore, as a consequence of the overlapping structure of HBV genome, some immune-escape mutations or stop-codons in HBsAg can derive from drug-resistance mutations in RT. This study is aimed at gaining insight in prevalence and characteristics of immune-associated escape mutations, and stop-codons in HBsAg in chronically HBV-infected patients experiencing nucleos(t)ide analogues (NA) in Europe. Methods: This study analyzed 828 chronically HBV-infected European patients exposed to ≥ 1 NA, with detectable HBV-DNA and with an available HBsAg-sequence. The immune-associated escape mutations and the NA-induced immune-escape mutations sI195M, sI196S, and sE164D (resulting from drug-resistance mutations rtM204V, rtM204I, and rtV173L) were retrieved from literature and examined. Mutations were defined as an amino acid substitution with respect to a genotype A or D reference sequence. Results: At least one immune-associated escape mutation was detected in 22.1% of patients with rising temporal-trend. By multivariable-analysis, genotype-D correlated with higher selection of ≥ 1 immune-associated escape mutation (OR[95%CI]:2.20[1.32-3.67], P = 0.002). In genotype-D, the presence of ≥ 1 immune-associated escape mutations was significantly higher in drug-exposed patients with drug-resistant strains than with wild-type virus (29.5% vs 20.3%, P = 0.012). Result confirmed by analysing drug-naive patients (29.5% vs 21.2%, P = 0.032). Strong correlation was observed between sP120T and rtM204I/V (P < 0.001), and their co-presence determined an increased HBV-DNA. At least one NA-induced immune-escape mutation occurred in 28.6% of patients, and their selection correlated with genotype-A (OR[95%CI]:2.03[1.32-3.10], P = 0.001). Finally, stop-codons are present in 8.4% of patients also at HBsAg-positions 172 and 182, described to enhance viral oncogenic-properties. Conclusions: Immune-escape mutations and stop-codons develop in a large fraction of NA-exposed patients from Europe. This may represent a potential threat for horizontal and vertical HBV transmission also to vaccinated persons, and fuel drug-resistance emergence

    Immune-escape mutations and stop-codons in HBsAg develop in a large proportion of patients with chronic HBV infection exposed to anti-HBV drugs in Europe

    Background: HBsAg immune-escape mutations can favor HBV-transmission also in vaccinated individuals, promote immunosuppression-driven HBV-reactivation, and increase fitness of drug-resistant strains. Stop-codons can enhance HBV oncogenic-properties. Furthermore, as a consequence of the overlapping structure of HBV genome, some immune-escape mutations or stop-codons in HBsAg can derive from drug-resistance mutations in RT. This study is aimed at gaining insight in prevalence and characteristics of immune-associated escape mutations, and stop-codons in HBsAg in chronically HBV-infected patients experiencing nucleos(t)ide analogues (NA) in Europe. Methods: This study analyzed 828 chronically HBV-infected European patients exposed to ≥ 1 NA, with detectable HBV-DNA and with an available HBsAg-sequence. The immune-associated escape mutations and the NA-induced immune-escape mutations sI195M, sI196S, and sE164D (resulting from drug-resistance mutations rtM204V, rtM204I, and rtV173L) were retrieved from literature and examined. Mutations were defined as an amino acid substitution with respect to a genotype A or D reference sequence. Results: At least one immune-associated escape mutation was detected in 22.1% of patients with rising temporal-trend. By multivariable-analysis, genotype-D correlated with higher selection of ≥ 1 immune-associated escape mutation (OR[95%CI]:2.20[1.32–3.67], P = 0.002). In genotype-D, the presence of ≥ 1 immune-associated escape mutations was significantly higher in drug-exposed patients with drug-resistant strains than with wild-type virus (29.5% vs 20.3%, P = 0.012). Result confirmed by analysing drug-naïve patients (29.5% vs 21.2%, P = 0.032). Strong correlation was observed between sP120T and rtM204I/V (P

    Immune-escape mutations and stop-codons in HBsAg develop in a large proportion of patients with chronic HBV infection exposed to anti-HBV drugs in Europe

    Background: HBsAg immune-escape mutations can favor HBV-transmission also in vaccinated individuals, promote immunosuppression-driven HBV-reactivation, and increase fitness of drug-resistant strains. Stop-codons can enhance HBV oncogenic-properties. Furthermore, as a consequence of the overlapping structure of HBV genome, some immune-escape mutations or stop-codons in HBsAg can derive from drug-resistance mutations in RT. This study is aimed at gaining insight in prevalence and characteristics of immune-associated escape mutations, and stop-codons in HBsAg in chronically HBV-infected patients experiencing nucleos(t)ide analogues (NA) in Europe. Methods: This study analyzed 828 chronically HBV-infected European patients exposed to ≥ 1 NA, with detectable HBV-DNA and with an available HBsAg-sequence. The immune-associated escape mutations and the NA-induced immune-escape mutations sI195M, sI196S, and sE164D (resulting from drug-resistance mutations rtM204V, rtM204I, and rtV173L) were retrieved from literature and examined. Mutations were defined as an amino acid substitution with respect to a genotype A or D reference sequence. Results: At least one immune-associated escape mutation was detected in 22.1% of patients with rising temporal-trend. By multivariable-analysis, genotype-D correlated with higher selection of ≥ 1 immune-associated escape mutation (OR[95%CI]:2.20[1.32–3.67], P = 0.002). In genotype-D, the presence of ≥ 1 immune-associated escape mutations was significantly higher in drug-exposed patients with drug-resistant strains than with wild-type virus (29.5% vs 20.3%, P = 0.012). Result confirmed by analysing drug-naïve patients (29.5% vs 21.2%, P = 0.032). Strong correlation was observed between sP120T and rtM204I/V (P