29 research outputs found

    Mutation-enrichment next-generation sequencing for quantitative detection of KRAS mutations in urine cell-free DNA from patients with advanced cancers

    Purpose: Tumor-derived cell-free DNA (cfDNA) from urine of patients with cancer offers noninvasive biological material for detection of cancer-related molecular abnormalities such as mutations in exon 2 of KRAS. Experimental Design: A quantitative, mutation-enrichment next-generation sequencing test for detecting KRAS G12/G13 mutations in urine cfDNA was developed, and results were compared with clinical testing of archival tumor tissue and plasma cfDNA from patients with advanced cancer. Results: With 90 to 110 mL of urine, the KRAS G12/G13 cfDNA test had an analytical sensitivity of 0.002% to 0.006% mutant copies in wild-type background. In 71 patients, the concordance between urine cfDNA and tumor was 73% (sensitivity, 63%; specificity, 96%) for all patients and 89% (sensitivity, 80%; specificity, 100%) for patients with urine samples of 90 to 110 mL. Patients had significantly fewer KRAS G12/G13 copies in urine cfDNA during systemic therapy than at baseline or disease progression (P = 0.002). Compared with no changes or increases in urine cfDNA KRAS G12/G13 copies during therapy, decreases in these measures were associated with longer median time to treatment failure (P = 0.03). Conclusions: A quantitative, mutation-enrichment next-generation sequencing test for detecting KRAS G12/G13 mutations in urine cfDNA had good concordance with testing of archival tumor tissue. Changes in mutated urine cfDNA were associated with time to treatment failure.
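    The concordance, sensitivity and specificity figures quoted in this abstract are standard confusion-matrix ratios, with tumor tissue testing taken as the reference. The sketch below shows how such figures are derived; the class name `Confusion` and the counts passed to it are hypothetical and chosen only for illustration, since the abstract does not report per-patient counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of urine cfDNA results against the tumor-tissue reference."""
    tp: int  # mutation found in both urine cfDNA and tumor
    fp: int  # mutation found in urine cfDNA only
    tn: int  # mutation found in neither
    fn: int  # mutation found in tumor only

    def sensitivity(self) -> float:
        # Fraction of tumor-positive patients also positive in urine cfDNA.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of tumor-negative patients also negative in urine cfDNA.
        return self.tn / (self.tn + self.fp)

    def concordance(self) -> float:
        # Fraction of all patients on whom the two tests agree.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total


# Hypothetical counts for illustration only, NOT data from the study.
example = Confusion(tp=8, fp=0, tn=10, fn=2)
print(f"sensitivity={example.sensitivity():.2f}")
print(f"specificity={example.specificity():.2f}")
print(f"concordance={example.concordance():.2f}")
```

    Because concordance pools both classes, it can sit well above sensitivity when most tumor-negative patients are classified correctly, which is the pattern the abstract reports.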

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed
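    One of the baselines named in this abstract, Okapi Best Matching 25 (BM25), ranks documents by a length-normalized, IDF-weighted term-frequency score. The sketch below is a minimal illustration of that scoring rule, not the consortium's implementation: the function names, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, the "+1" smoothed IDF variant and the three sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def bm25_scores(query: str, docs: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (the "+1" variant keeps the weight non-negative).
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical mini-corpus, used only to exercise the function.
docs = [
    "urine cell-free dna kras mutation detection",
    "whole-genome sequencing of critical covid-19",
    "sequence similarity with segment rearrangements",
]
scores = bm25_scores("KRAS mutation", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(best, scores)
```

    A hybrid method of the kind the abstract suggests could combine rankings like this one with those from other scorers, but how best to merge them is left open by the source.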

    Whole-genome sequencing reveals host factors underlying critical COVID-19

    Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalization [2,3,4] after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.

    Comparing sequences with segment rearrangements

    Abstract. Computational genomics involves comparing sequences based on "similarity" for detecting evolutionary and functional relationships. Until very recently, available portions of the human genome sequence (and that of other species) were fairly short and sparse. Most sequencing effort was focused on genes and other short units; similarity between such sequences was measured based on character-level differences. However, with the advent of whole genome sequencing technology, there is emerging consensus that the measure of similarity between long genome sequences must capture the rearrangements of large segments found in abundance in the human genome. In this paper, we abstract the general problem of computing sequence similarity in the presence of segment rearrangements. This problem is closely related to computing the smallest grammar for a string or the block edit distance between two strings. Our problem, like these other problems, is NP-hard. Our main result here is a simple O(1)-factor approximation algorithm for this problem. In contrast, the best known approximations for the related problems are a factor Ω(log n) off from the optimal. Our algorithm works in linear time, and in one pass. In proving our result, we relate sequence similarity measures based on different segment rearrangements to each other, tight up to constant factors. 1 Introduction Similarity comparison between biomolecular sequences plays an important role in computational genomics due to the premise that sequence similarity usuall