
    SARS-CoV-2 sero-surveillance in Greece: evolution over time and epidemiological attributes during the pre-vaccination pandemic era

    BACKGROUND: Nation-wide SARS-CoV-2 seroprevalence surveys provide valuable insights into the course of the pandemic, including information often not captured by routine surveillance of reported cases. METHODS: A serosurvey of IgG antibodies against SARS-CoV-2 was conducted in Greece between March and December 2020. It was designed as a cross-sectional survey repeated at monthly intervals. The leftover sampling methodology was used and a geographically stratified sampling plan was applied. RESULTS: Of 55,947 serum samples collected, 705 (1.26%) were found positive for anti-SARS-CoV-2 antibodies, with the highest seroprevalence (9.09%) observed in December 2020. The highest seropositivity levels were observed in the "0-29" and "30-49" year age groups, and seroprevalence increased with age within the "0-29" age group. Highly populated metropolitan areas were characterized by elevated seroprevalence levels (11.92% in Attica, 12.76% in Thessaloniki) compared to the rest of the country (5.90%). The infection fatality rate (IFR) was estimated at 0.451% (95% CI: 0.382-0.549%) using aggregate data until December 2020, and the ratio of actual to reported cases was 9.59 (95% CI: 7.88-11.33). CONCLUSIONS: The evolution of seroprevalence estimates aligned with the course of the pandemic and varied widely by region and age group. Young and middle-aged adults appeared to be drivers of the pandemic during a severe epidemic wave under strict policy measures.

    Nearest Template Prediction: A Single-Sample-Based Flexible Class Prediction with Confidence Assessment

    Gene-expression signature-based disease classification and clinical outcome prediction has not been introduced into clinical medicine as widely as initially expected, mainly due to the lack of the extensive validation needed for clinical deployment. Obstacles include measurement variability in microarray assays, inconsistent assay platforms, and the analytical requirement for comparable pairs of training and test datasets. Furthermore, as a medical device supporting clinical decision making, a prediction needs to be made for each single patient together with a measure of its reliability. To address these issues, a flexible prediction method is needed that is less sensitive to differences in experimental and analytical conditions, is applicable to individual patients, and provides a measure of prediction confidence. The nearest template prediction (NTP) method provides a convenient way to make a class prediction, with an assessment of prediction confidence, computed from each single patient's gene-expression data using only a list of signature genes and a test dataset. We demonstrate that the method can be flexibly applied to cross-platform, cross-species, and multiclass predictions without any optimization of analysis parameters.
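    The core idea described above can be sketched in a few lines: score one patient's signature-gene vector against each class template by cosine distance, assign the nearest class, and estimate confidence from a gene-permutation null. This is a minimal illustrative sketch of the general approach, not the published NTP implementation; the function name and the permutation scheme are assumptions.

    ```python
    import numpy as np

    def nearest_template_prediction(sample, templates, n_perm=1000, rng=None):
        """Assign `sample` (1-D expression vector over signature genes) to the
        nearest class template by cosine distance, with a permutation-based
        confidence estimate. `templates` maps class name -> template vector.
        Hypothetical sketch; not the authors' published code."""
        rng = np.random.default_rng(rng)

        def cos_dist(a, b):
            return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

        # Distance from this single patient to every class template.
        dists = {cls: cos_dist(sample, t) for cls, t in templates.items()}
        best = min(dists, key=dists.get)

        # Null distribution: distances of gene-permuted versions of the same
        # sample to the winning template; small p means the observed distance
        # is unusually tight, i.e. a confident call.
        null = np.array([cos_dist(rng.permutation(sample), templates[best])
                         for _ in range(n_perm)])
        p_value = (np.sum(null <= dists[best]) + 1) / (n_perm + 1)
        return best, dists[best], p_value
    ```

    Because the prediction uses only the one sample and the gene list, no paired training dataset or cross-sample normalization is required, which is what makes this style of method attractive for single-patient use.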

    Sharing Detailed Research Data Is Associated with Increased Citation Rate

    BACKGROUND: Sharing research data benefits the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available. PRINCIPAL FINDINGS: We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. In a linear regression, publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin. SIGNIFICANCE: This correlation between publicly available data and increased literature impact may further motivate investigators to share their detailed research data.

    A Simple but Highly Effective Approach to Evaluate the Prognostic Performance of Gene Expression Signatures

    BACKGROUND: Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the high dimensionality of microarrays, this can lead to false interpretation of these signatures. PRINCIPAL FINDINGS: A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis, calculating the area under the curve (AUC), in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has an average 10% chance of reaching significance when assessed in a single dataset, but this can range from 1% to ∼40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number. CONCLUSIONS: We have shown that the use of an arbitrary cut-off value for evaluating signature significance is not suitable for this type of research; the cut-off should instead be defined for each dataset separately. Our method can be used to evaluate the performance of any derived gene signature in a dataset by comparing it to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.
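    The batch-testing idea above can be sketched directly: draw many random gene sets of a fixed size, score each patient by the signature's mean expression, compute the AUC against outcome, and report the fraction of random signatures clearing a cut-off. This is an illustrative sketch of the general procedure under assumed conventions (mean-expression scoring, a fixed AUC cut-off), not the authors' code.

    ```python
    import numpy as np

    def auc_score(scores, labels):
        """AUC via the rank-sum (Mann-Whitney) identity; no tie handling,
        which is adequate for continuous expression scores."""
        order = np.argsort(scores)
        ranks = np.empty(len(scores), dtype=float)
        ranks[order] = np.arange(1, len(scores) + 1)
        pos = labels == 1
        n_pos, n_neg = pos.sum(), (~pos).sum()
        return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

    def random_signature_rate(expr, labels, sig_size=50, n_sig=2000,
                              auc_cut=0.6, rng=None):
        """Fraction of randomly drawn gene signatures whose mean-expression
        score reaches `auc_cut` in this dataset. `expr` is genes x patients.
        Sketch of the batch-testing idea; parameters are arbitrary defaults."""
        rng = np.random.default_rng(rng)
        n_genes = expr.shape[0]
        hits = 0
        for _ in range(n_sig):
            genes = rng.choice(n_genes, size=sig_size, replace=False)
            score = expr[genes].mean(axis=0)   # one score per patient
            if auc_score(score, labels) >= auc_cut:
                hits += 1
        return hits / n_sig
    ```

    A derived signature's AUC can then be compared against this empirical null rate for the same dataset, which is exactly why a single arbitrary cut-off across datasets is misleading: the null rate itself varies from dataset to dataset.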

    Mine, Yours, Ours? Sharing Data on Human Genetic Variation

    The achievement of a robust, effective and responsible form of data sharing is currently regarded as a priority for biological and bio-medical research. Empirical evaluations of data sharing may be regarded as an indispensable first step in the identification of critical aspects and the development of strategies aimed at increasing availability of research data for the scientific community as a whole. Research concerning human genetic variation represents a potential forerunner in the establishment of widespread sharing of primary datasets. However, no specific analysis has been conducted to date to ascertain whether the sharing of primary datasets is common practice in this research field. To this end, we analyzed a total of 543 mitochondrial and Y chromosomal datasets reported in 508 papers indexed in the Pubmed database from 2008 to 2011. A substantial portion of datasets (21.9%) was found to have been withheld, while neither strong editorial policies nor high impact factor proved effective in increasing the sharing rate beyond the current figure of 80.5%. Disaggregating datasets by research field, we observed substantially lower sharing in medical than in evolutionary and forensic genetics, most evident for whole mtDNA sequences (15.0% vs 99.6%). The low rate of positive responses to e-mail requests sent to corresponding authors of withheld datasets (28.6%) suggests that sharing should be regarded as a prerequisite for final paper acceptance, while making authors deposit their results in open online databases that provide data quality control seems to be the best-practice standard. Finally, we estimated that 29.8% to 32.9% of total resources are used to generate withheld datasets, implying that an important portion of research funding does not produce shared knowledge. By making the scientific community and the public aware of this important aspect, we may help popularize a more effective culture of data sharing.

    Expanding the Understanding of Biases in Development of Clinical-Grade Molecular Signatures: A Case Study in Acute Respiratory Viral Infections

    The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease, on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them. Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures. Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.

    A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    BACKGROUND: Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. RESULTS: The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM), in the presence and absence of filter-based feature selection, was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection; filter-based feature selection generally improved performance, most strikingly for LDA. CONCLUSION: We have developed a novel simulation model to evaluate machine learning methods for the analysis of data sets in which in vitro bioassay data is used to predict in vivo chemical toxicology. From our analysis, we can recommend that several ML methods, most notably SVM and ANN, are good candidates for use in real world applications in this area.
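    The study design above, comparing classifiers with and without filter-based feature selection on simulated data with many irrelevant features, can be sketched with scikit-learn. This is a simplified illustration of the comparison, not the authors' simulation model; the dataset parameters and the choice of k for the filter are arbitrary assumptions.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    # Simulated assay data: a few informative features buried among many
    # irrelevant ones, loosely mirroring the study design (numbers arbitrary).
    X, y = make_classification(n_samples=200, n_features=300, n_informative=10,
                               n_redundant=0, flip_y=0.05, random_state=0)

    models = {
        "SVM": SVC(kernel="linear"),
        "LDA": LinearDiscriminantAnalysis(),
        "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    }

    results = {}
    for name, clf in models.items():
        # Raw: the classifier sees all 300 features.
        raw = cross_val_score(clf, X, y, cv=5).mean()
        # Filtered: univariate F-test filter placed inside the pipeline so the
        # feature selection is refit per fold and never sees the test fold.
        filtered = cross_val_score(
            make_pipeline(SelectKBest(f_classif, k=20), clf), X, y, cv=5).mean()
        results[name] = (raw, filtered)
    ```

    Putting the filter inside the pipeline is the important design choice: selecting features on the full dataset before cross-validation leaks test-fold information into the filter and inflates the estimated accuracy, one of the biases such simulation studies are designed to expose.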

    Genome-wide gene expression profiling suggests distinct radiation susceptibilities in sporadic and post-Chernobyl papillary thyroid cancers

    The incidence of papillary thyroid cancers (PTCs) increased dramatically in the vicinity of Chernobyl. The cancer-initiating role of radiation elsewhere is debated. We therefore searched for a signature distinguishing radio-induced from sporadic cancers. Using microarrays, we compared the expression profiles of PTCs from the Chernobyl Tissue Bank (CTB, n=12) and from French patients with no history of exposure to ionising radiation (n=14). We also compared the transcriptional responses of human lymphocytes to the presumed aetiological agents initiating these tumours, γ-radiation and H2O2. On a global scale, the transcriptomes of CTB and French tumours are indistinguishable, and the transcriptional responses to γ-radiation and H2O2 are similar. On a finer scale, a 118-gene signature discriminated the γ-radiation and H2O2 responses. This signature could be used to classify the tumours as CTB or French with an error of 15–27%. Similar results were obtained with an independent signature of 13 genes involved in homologous recombination. Although sporadic and radio-induced PTCs represent the same disease, they are distinguishable with molecular signatures reflecting specific responses to γ-radiation and H2O2. These signatures in PTCs could reflect the susceptibility profiles of the patients, suggesting the feasibility of a radiation susceptibility test.