4 research outputs found

    Systematic Characterizations of Text Similarity in Full Text Biomedical Publications

    Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text articles are becoming increasingly available, yet the similarities among them have not been systematically studied. Here, we quantitatively investigated the full text similarity of biomedical publications in PubMed Central. 72,011 full text articles from PubMed Central (PMC) were parsed to generate three different datasets: full texts, sections, and paragraphs. Text similarity comparisons were performed on these datasets using the text similarity algorithm eTBLAST. We measured the frequency of similar text pairs and compared it among different datasets. We found that high abstract similarity can be used to predict high full text similarity with a specificity of 20.1% (95% CI [17.3%, 23.1%]) and sensitivity of 99.999%. Abstract similarity and full text similarity have a moderate correlation (Pearson correlation coefficient: -0.423) when the similarity ratio is above 0.4. Among pairs of articles in PMC, method sections are found to be the most repetitive (frequency of similar pairs, methods: 0.029, introduction: 0.0076, results: 0.0043). In contrast, among a set of manually verified duplicate articles, results are the most repetitive sections (frequency of similar pairs, results: 0.94, methods: 0.89, introduction: 0.82). Repetition of introduction and methods sections is more likely to be committed by the same authors (odds of a highly similar pair having at least one shared author, introduction: 2.31, methods: 1.83, results: 1.03). There is also significantly more similarity in pairs of review articles than in pairs containing one review and one nonreview paper (frequency of similar pairs: 0.0167 and 0.0023, respectively). While quantifying abstract similarity is an effective approach for finding duplicate citations, a comprehensive full text analysis is necessary to uncover all potential duplicate citations in the scientific literature and is helpful when establishing ethical guidelines for scientific publications.

    Medical theses and derivative articles: dissemination of contents and publication patterns

    Doctoral theses are an important source of publication in universities, although little research has been carried out on the publications resulting from theses, the so-called derivative articles. This study investigates how derivative articles can be identified through a text analysis based on the full text of a set of medical theses and the full text of articles with which they shared authorship. The text similarity analysis methodology consisted of analyzing the full-text articles according to the IMRaD organization of scientific discourse (Introduction, Methodology, Results and Discussion) using the TurnItIn plagiarism tool. The study found that the text similarity rate in the Discussion section can be used to discriminate derivative articles from non-derivative articles. Additional findings were: the thesis author held the first author position in 85% of derivative articles; supervisors participated as coauthors in 100% of derivative articles; the authorship credit retained by the thesis author was 42% in derivative articles; the average number of coauthors per article was 5 in derivative articles versus 6.4 in non-derivative articles; and the time differential relative to the year of thesis completion showed that 87.5% of derivative articles were published before or in the same year as thesis completion.

    The quest to slow ageing through drug discovery

    Although death is inevitable, individuals have long sought to alter the course of the ageing process. Indeed, ageing has proved to be modifiable; by intervening in biological systems, such as nutrient sensing, cellular senescence, the systemic environment and the gut microbiome, phenotypes of ageing can be slowed sufficiently to mitigate age-related functional decline. These interventions can also delay the onset of many disabling, chronic diseases, including cancer, cardiovascular disease and neurodegeneration, in animal models. Here, we examine the most promising interventions to slow ageing and group them into two tiers based on the robustness of the preclinical, and some clinical, results, in which the top tier includes rapamycin, senolytics, metformin, acarbose, spermidine, NAD+ enhancers and lithium. We then focus on the potential of the interventions and the feasibility of conducting clinical trials with these agents, with the overall aim of maintaining health for longer before the end of life.