3 research outputs found

    CPS-MEBR: Click Feedback-Aware Web Page Summarization for Multi-Embedding-Based Retrieval

    Embedding-based retrieval (EBR) is a technique that represents queries and documents as embeddings and converts the retrieval problem into a nearest neighbor search in the embedding space. Previous works have mainly focused on representing a web page with a single embedding, but in real web search scenarios it is difficult to capture all the information of a long, complexly structured web page in a single embedding. To address this issue, we design a click feedback-aware web page summarization for multi-embedding-based retrieval (CPS-MEBR) framework, which generates multiple embeddings for each web page to match different potential queries. Specifically, we use users' click data from search logs to train a summary model that extracts the sentences in web pages that are frequently clicked by users, since these are more likely to answer potential queries. Meanwhile, we introduce sentence-level semantic interaction to design a multi-embedding-based retrieval (MEBR) model, which generates multiple embeddings from the frequently clicked sentences in web pages to handle different potential queries. Offline experiments show that it retrieves higher-quality candidates than a single-embedding-based retrieval (SEBR) model.
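As a rough illustration of the multi-embedding idea in the abstract above, the sketch below scores a page by the best match between the query embedding and any of the page's embeddings, then ranks pages by that score. This is a minimal sketch under stated assumptions, not the CPS-MEBR model: the max-over-embeddings aggregation, the cosine similarity, the function names (`cosine`, `score_page`, `retrieve`), and the toy two-dimensional vectors are all hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_page(query_vec, page_vecs):
    """Assumed aggregation: a page scores as well as its best-matching embedding.

    Expects at least one embedding per page.
    """
    return max(cosine(query_vec, v) for v in page_vecs)


def retrieve(query_vec, pages, top_k=2):
    """Brute-force nearest neighbor search over all pages; returns (page_id, score) pairs."""
    scored = [(page_id, score_page(query_vec, vecs)) for page_id, vecs in pages.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): page_a has two embeddings, page_b has one.
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],
    "page_b": [[0.7, 0.7]],
}
query = [0.0, 1.0]
ranked = retrieve(query, pages)
```

With a single embedding per page, a page covering two unrelated topics would have to compromise between them; keeping one embedding per topic lets either one match the query directly. A production system would replace the brute-force loop with an approximate nearest neighbor index.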

    Rubella Virus Genotypes in the People's Republic of China between 1979 and 2007: a Shift in Endemic Viruses during the 2001 Rubella Epidemic

    The incidence of rubella cases in China from 1991 to 2007 was reviewed, and the nucleotide sequences from 123 rubella viruses collected during 1999 to 2007 and 4 viral sequences previously reported from 1979 to 1984 were phylogenetically analyzed. Rubella vaccination was not included in national immunization programs in China before 2007. Changes in endemic viruses were compared with the incidence of rubella epidemics. The results showed that rubella epidemics occur approximately every 6 to 8 years (1993/1994, 2001, and 2007), and a shift of disease burden to susceptible young adults was observed. The Chinese rubella virus sequences were categorized into 5 of the 13 rubella virus genotypes (1a, 1E, 1F, 2A, and 2B), and cocirculation of these genotypes was found in China. In Anhui province, a shift in the predominant genotype from 1F and 2B to 1E coincided with the 2001 rubella epidemic. This shift may have occurred throughout China during 2001 to 2007. This study investigated the genotype distribution of rubella viruses in China over a 28-year period to establish an important genetic baseline in China during its prevaccination era.