149 research outputs found

    Effective and Efficient Query-aware Snippet Extraction for Web Search

    Query-aware webpage snippet extraction is widely used in search engines to help users better understand the content of the returned webpages before clicking. Although important, it is very rarely studied. In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE, which aims to select a few sentences that best summarize the webpage content in the context of the input query. DeepQSE first learns query-aware sentence representations for each sentence to capture the fine-grained relevance between query and sentence, and then learns document-aware query-sentence relevance representations for snippet extraction. Since the query and each sentence are jointly modeled in DeepQSE, its online inference may be slow. Thus, we further propose an efficient version of DeepQSE, named Efficient-DeepQSE, which can significantly improve the inference speed of DeepQSE without affecting its performance. The core idea of Efficient-DeepQSE is to decompose the query-aware snippet extraction task into two stages, i.e., a coarse-grained candidate sentence selection stage where sentence representations can be cached, and a fine-grained relevance modeling stage. Experiments on two real-world datasets validate the effectiveness and efficiency of our methods. Comment: Accepted by EMNLP202
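
    The two-stage decomposition described in this abstract can be illustrated with a minimal sketch. This is not the DeepQSE or Efficient-DeepQSE model: the `encode`, `coarse_score`, `fine_score`, and `select_snippet` names are hypothetical, and simple token overlap stands in for the learned representations. It only shows the control flow of a cheap, cacheable candidate-selection pass followed by a costlier relevance pass over the surviving candidates.

```python
from functools import lru_cache
from typing import FrozenSet, List


@lru_cache(maxsize=None)
def encode(text: str) -> FrozenSet[str]:
    """Toy stand-in for a sentence encoder (assumption): a set of lowercase tokens.

    The cache mimics the idea that sentence representations can be computed
    once and reused across queries.
    """
    return frozenset(text.lower().split())


def coarse_score(query: str, sentence: str) -> int:
    """Cheap stage-1 score (assumption): number of shared tokens."""
    return len(encode(query) & encode(sentence))


def fine_score(query: str, sentence: str) -> float:
    """Toy stage-2 score (assumption): Jaccard similarity of token sets."""
    q, s = encode(query), encode(sentence)
    union = q | s
    return len(q & s) / len(union) if union else 0.0


def select_snippet(query: str, sentences: List[str],
                   n_candidates: int = 3, n_snippet: int = 1) -> List[str]:
    """Pick snippet sentences with a coarse pass followed by a fine pass."""
    # Stage 1: keep the top-n_candidates sentences by the cheap score.
    order = sorted(range(len(sentences)),
                   key=lambda i: coarse_score(query, sentences[i]),
                   reverse=True)
    candidates = order[:n_candidates]
    # Stage 2: re-rank only the candidates with the costlier score.
    best = sorted(candidates,
                  key=lambda i: fine_score(query, sentences[i]),
                  reverse=True)[:n_snippet]
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(best)]
```

    In a real system the stage-1 encoder and the stage-2 scorer would be learned models; only the split into a cached coarse pass and a per-query fine pass is taken from the abstract.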

    LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval

    In large-scale retrieval, the lexicon-weighting paradigm, which learns weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency. Although it deeply exploits the lexicon-representing capability of pre-trained language models, a crucial gap remains between language modeling and lexicon-weighting retrieval -- the former prefers certain or low-entropy words whereas the latter favors pivot or high-entropy words -- and this gap is the main barrier to lexicon-weighting performance for large-scale retrieval. To bridge this gap, we propose a brand-new pre-training framework, lexicon-bottlenecked masked autoencoder (LexMAE), to learn importance-aware lexicon representations. Essentially, we present a lexicon-bottlenecked module between a normal language modeling encoder and a weakened decoder, where a continuous bag-of-words bottleneck is constructed to learn a lexicon-importance distribution in an unsupervised fashion. The pre-trained LexMAE is readily transferred to lexicon-weighting retrieval via fine-tuning. On the ad-hoc retrieval benchmark MS-Marco, it achieves 42.6% MRR@10 with 45.8 QPS for the passage dataset and 44.4% MRR@100 with 134.8 QPS for the document dataset, on a CPU machine. LexMAE also shows state-of-the-art zero-shot transfer capability on the BEIR benchmark with 12 datasets. Comment: Appeared at ICLR 202
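
    For readers unfamiliar with the MRR@10 and MRR@100 figures quoted above, the metric can be sketched as follows. This is the standard definition of mean reciprocal rank with a cutoff, not code from the LexMAE paper; the function name `mrr_at_k` and the input layout are assumptions made for illustration.

```python
from typing import Sequence, Set


def mrr_at_k(rankings: Sequence[Sequence[str]],
             relevant: Sequence[Set[str]],
             k: int = 10) -> float:
    """Mean reciprocal rank with cutoff k.

    rankings[i] is the ranked list of document ids returned for query i;
    relevant[i] is the set of ids judged relevant for that query.
    A query with no relevant id in its top k contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in rel:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(rankings)
```

    For example, if the first relevant passage appears at rank 2 for one query and no relevant passage appears in the top 10 for a second query, the score over those two queries is (1/2 + 0) / 2 = 0.25.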

    Histone-like TAFs within the PCAF Histone Acetylase Complex

    PCAF histone acetylase plays a role in regulation of transcription, cell cycle progression, and differentiation. Here, we show that PCAF is found in a complex consisting of more than 20 distinct polypeptides. Strikingly, some polypeptides are identical to TBP-associated factors (TAFs), which are subunits of TFIID. Like TFIID, histone fold–containing factors are present within the PCAF complex. The histone H3– and H2B–like subunits within the PCAF complex are identical to those within TFIID, namely, hTAFII31 and hTAFII20/15, respectively. The PCAF complex has a novel histone H4–like subunit with similarity to hTAFII80 that interacts with the histone H3–like domain of hTAFII31. Moreover, the PCAF complex has a novel subunit with WD40 repeats having a similarity to hTAFII100.

    Prognosis and oncogenomic profiling of patients with tropomyosin receptor kinase fusion cancer in the 100,000 genomes project

    INTRODUCTION: Neurotrophic tyrosine receptor kinase (NTRK) gene fusions are oncogenic drivers in various tumor types. Limited data exist on the overall survival (OS) of patients with tumors with NTRK gene fusions and on the co-occurrence of NTRK fusions with other oncogenic drivers. MATERIALS AND METHODS: This retrospective study included patients enrolled in the Genomics England 100,000 Genomes Project who had linked clinical data from UK databases. Patients who had undergone tumor whole genome sequencing between March 2016 and July 2019 were included. Patients with and without NTRK fusions were matched. OS was analyzed along with oncogenic alterations in ALK, BRAF, EGFR, ERBB2, KRAS, and ROS1, and tumor mutation burden (TMB) and microsatellite instability (MSI). RESULTS: Of 15,223 patients analyzed, 38 (0.25%) had NTRK gene fusions in 11 tumor types, the most common of which were breast cancer, colorectal cancer (CRC), and sarcoma. Median OS was not reached in either the NTRK gene fusion-positive or the fusion-negative group (hazard ratio 1.47, 95% CI 0.39-5.57, P = 0.572). A KRAS mutation was identified in two (5%) patients with NTRK gene fusions, and both had hepatobiliary cancer. High TMB and MSI were both more common in patients with NTRK gene fusions, due to the CRC subset. While there was a higher risk of death in patients with NTRK gene fusions compared to those without, the difference was not statistically significant. CONCLUSION: This study supports the hypothesis that NTRK gene fusions are primary oncogenic drivers and the co-occurrence of NTRK gene fusions with other oncogenic alterations is rare.

    Pan-cancer analysis reveals that G6PD is a prognostic biomarker and therapeutic target for a variety of cancers

    Background: Despite accumulating evidence revealing that Glucose-6-phosphate dehydrogenase (G6PD) is highly expressed in many tumor tissues and plays a remarkable role in cancer tumorigenesis and progression, there is still a lack of G6PD pan-cancer analysis. This study was designed to analyze the expression status and prognostic significance of G6PD in pan-cancer. Methods: G6PD expression data were obtained from multiple data resources including the Genotype-Tissue Expression, the Cancer Genome Atlas, and the Tumor Immunity Estimation Resource. These data were used to assess the G6PD expression, prognostic value, and clinical characteristics. The ESTIMATE algorithms were used to analyze the association between G6PD expression and immune-infiltrating cells and the tumor microenvironment. The functional enrichment analysis was also performed across pan-cancer. In addition, the GDSC1 database containing 403 drugs was utilized to explore the relationship between drug sensitivity and G6PD expression levels. Furthermore, we also performed clinical validation and in vitro experiments to further validate the role of G6PD in hepatocellular carcinoma (HCC) cells and its correlation with prognosis. The R software was used for statistical analysis and data visualization. Results: G6PD expression was upregulated in most cancers compared to their normal counterparts. The study also revealed that G6PD expression was a prognostic indicator and high levels of G6PD expression were correlated with worse clinical prognosis including overall survival, disease-specific survival, and progression-free interval in multiple cancers. Furthermore, the G6PD level was also related to cancer immunity infiltration in most of the cancers, especially in KIRC, LGG, and LIHC. In addition to this, G6PD expression was positively related to pathological stages of KIRP, BRCA, KIRC, and LIHC. Functional analysis and protein-protein interactions network results revealed that G6PD was involved in metabolism-related activities, immune responses, proliferation, and apoptosis. Drug sensitivity analysis showed that IC50 values of most identified anti-cancer drugs were positively correlated with the G6PD expression. Notably, in vitro functional validation showed that G6PD knockdown attenuated the phenotypes of proliferation in HCC. Conclusion: G6PD may serve as a potential prognostic biomarker for cancers and may be a potential therapeutic target gene for tumor therapy.

    Decoding the spermatogonial stem cell niche under physiological and recovery conditions in adult mice and humans

    The intricate interaction between spermatogonial stem cell (SSC) and testicular niche is essential for maintaining SSC homeostasis; however, this interaction remains largely uncharacterized. In this study, to characterize the underlying signaling pathways and related paracrine factors, we delineated the intercellular interactions between SSC and niche cell in both adult mice and humans under physiological conditions and dissected the niche-derived regulation of SSC maintenance under recovery conditions, thus uncovering the essential role of C-C motif chemokine ligand 24 and insulin-like growth factor binding protein 7 in SSC maintenance. We also established the clinical relevance of specific paracrine factors in human fertility. Collectively, our work on decoding the adult SSC niche serves as a valuable reference for future studies on the aetiology, diagnosis, and treatment of male infertility.

    Programmable Base Editing of the Sheep Genome Revealed No Genome-Wide Off-Target Mutations

    Since their emergence, CRISPR/Cas9-mediated base editors (BEs) with cytosine deaminase activity have been used to precisely and efficiently introduce single-base mutations in genomes, including those of human cells, mice, and crop species. Most production traits in livestock are induced by point mutations, and genome editing using BEs without homology-directed repair of double-strand breaks can directly alter single nucleotides. The p.96R > C variant of Suppressor cytokine signaling 2 (SOCS2) has profound effects on body weight, body size, and milk production in sheep. In the present study, we successfully obtained lambs with defined point mutations resulting in a p.96R > C substitution in SOCS2 by the co-injection of BE3 mRNA and a single guide RNA (sgRNA) into sheep zygotes. The observed efficiency of the single nucleotide exchange in newborn animals was as high as 25%. Observations of body size and body weight in the edited group showed that gene modification contributes to enhanced growth traits in sheep. Moreover, targeted deep sequencing and unbiased family trio-based whole genome sequencing revealed no detectable off-target mutations in the edited animals. This study demonstrates the potential for the application of BE-mediated point mutations in large animals for the improvement of production traits in livestock species.

    What’s New in Clinical Oncology? Highlights from the 2009 ASCO Annual Meeting

    This review summarizes the major advances in clinical oncology presented at the 2009 ASCO annual meeting, with emphasis on studies that are potentially practice-changing. Emerging new drugs are also discussed.