Exploiting Query’s Temporal Patterns for Query Autocompletion
Query autocompletion (QAC) is a common interactive feature of web search engines. It aims to help users formulate queries and avoid spelling mistakes by presenting them with a list of query completions as soon as they start typing in the search box. Existing QAC models mostly rank query completions by their past popularity as recorded in query logs. The popularity of some queries is relatively stable or periodic, while that of others may rise suddenly. Current time-sensitive QAC models focus on either periodicity or recency and cannot respond swiftly to such sudden rises, resulting in suboptimal QAC performance. In this paper, we propose a hybrid QAC model that considers two temporal patterns of a query’s popularity: periodicity and burst trend. Specifically, we first employ the Discrete Fourier Transform (DFT) to identify the periodicity of a query’s popularity, from which we forecast its future popularity. The burst trend of the query’s popularity is then detected and incorporated into the hybrid model alongside its cyclic behavior. Extensive experiments on a large, real-world query log dataset show that modeling the temporal patterns of query popularity in the form of periodicity and burst trend can significantly improve the effectiveness of ranking query completions.
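The DFT-based periodicity step described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names, the toy weekly counts, and the same-phase averaging used as a forecast are all assumptions made for illustration, and the burst-trend component is omitted.

```python
import cmath
import math

def dft_magnitudes(series):
    """Return |X_k| for k = 0..n-1 of the mean-centred series (naive O(n^2) DFT)."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centred)))
        for k in range(n)
    ]

def dominant_period(series):
    """Estimate the dominant period (in samples) from the strongest frequency."""
    n = len(series)
    mags = dft_magnitudes(series)
    # For a real-valued signal only frequencies 1..n//2 are informative.
    k_best = max(range(1, n // 2 + 1), key=lambda k: mags[k])
    return n / k_best

def forecast_next(series, period):
    """Assumed forecast rule: average the past values at the same phase."""
    p = int(round(period))
    n = len(series)
    same_phase = [series[t] for t in range(n - p, -1, -p)]
    return sum(same_phase) / len(same_phase)

# Hypothetical daily counts for one query over four weeks (weekly cycle).
counts = [10, 12, 11, 13, 15, 30, 28] * 4
period = dominant_period(counts)
prediction = forecast_next(counts, period)
print(period, prediction)
```

On this toy series the strongest frequency corresponds to a seven-day cycle, so the forecast for the next day is the average of the earlier values at the same position in the week.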
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain
Extensive studies have been devoted to privatizing general-domain Large
Language Models (LLMs) as Domain-Specific LLMs via feeding specific-domain
data. However, these privatization efforts often ignored a critical aspect:
Dual Logic Ability, which is a core reasoning ability for LLMs. The dual logic
ability of LLMs ensures that they can maintain a consistent stance when
confronted with both positive and negative statements about the same fact. Our
study focuses on how the dual logic ability of LLMs is affected during the
privatization process in the medical domain. We conduct several experiments to
analyze the dual logic ability of LLMs by examining the consistency of the
stance in responses to paired questions about the same fact. In our
experiments, interestingly, we observed a significant decrease in the dual
logic ability of existing LLMs after privatization. Besides, our results
indicate that incorporating general domain dual logic data into LLMs not only
enhances LLMs' dual logic ability but also further improves their accuracy.
These findings underscore the importance of prioritizing LLMs' dual logic
ability during the privatization process. Our study establishes a benchmark for
future research aimed at exploring LLMs' dual logic ability during the
privatization process and offers valuable guidance for privatization efforts in
real-world applications.
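The paired-question consistency check described in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' benchmark code: it supposes that each fact is probed with one positive and one negated yes/no question, and that a pair counts as consistent when both answers imply the same stance on the fact. The function names and the sample answers are hypothetical.

```python
def stance(answer: str, is_negated: bool) -> bool:
    """Map a yes/no answer to the stance it implies about the underlying fact.

    "Yes" to a positive question and "No" to its negated form both
    imply that the model believes the fact is true.
    """
    says_yes = answer.strip().lower().startswith("yes")
    return says_yes != is_negated

def consistency_rate(pairs):
    """Fraction of (positive_answer, negated_answer) pairs with the same stance."""
    consistent = sum(
        stance(pos, is_negated=False) == stance(neg, is_negated=True)
        for pos, neg in pairs
    )
    return consistent / len(pairs)

# Hypothetical model answers to four positive/negated question pairs.
answers = [("Yes", "No"), ("Yes", "Yes"), ("No", "Yes"), ("No", "No")]
rate = consistency_rate(answers)
print(rate)
```

Comparing this rate before and after domain fine-tuning is one simple way to quantify the kind of decrease the abstract reports.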
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
Pre-trained models like BERT (Devlin et al., 2018) have dominated NLP / IR
applications such as single sentence classification, text pair classification,
and question answering. However, deploying these models in real systems is
highly non-trivial due to their exorbitant computational costs. A common remedy
to this is knowledge distillation (Hinton et al., 2015), leading to faster
inference. However -- as we show here -- existing works are not optimized for
dealing with pairs (or tuples) of texts. Consequently, they are either not
scalable or demonstrate subpar performance. In this work, we propose DiPair --
a novel framework for distilling fast and accurate models on text pair tasks.
Coupled with an end-to-end training strategy, DiPair is both highly scalable
and offers improved quality-speed tradeoffs. Empirical studies conducted on
both academic and real-world e-commerce benchmarks demonstrate the efficacy of
the proposed approach with speedups of over 350x and minimal quality drop
relative to the cross-attention teacher BERT model. Comment: 13 pages. Accepted to Findings of EMNLP 202
Using protection motivation theory to explain the intention to initiate human papillomavirus vaccination among men who have sex with men in China
Human papillomavirus (HPV) infection and related diseases are common among men who have sex with men (MSM). The most effective prevention is HPV vaccination. In China, however, men are not included in the HPV vaccination plan. We investigated the intention to initiate HPV vaccination and associated factors among MSM in China. Methods: We surveyed 563 unvaccinated MSM aged 18 or older from six cities in China. Participants completed an electronic questionnaire about demographics, knowledge of and attitude towards HPV and the HPV vaccine, intention to initiate HPV vaccination, willingness to recommend the HPV vaccine to peers, and feelings about government policy on HPV vaccination. We used structural equation modeling (SEM) to analyze factors associated with HPV vaccine intention. Results: Knowledge of HPV and the HPV vaccine among participants was low. The mean knowledge score was only 1.59 (range 0–11). The intention to initiate HPV vaccination within 6 months among participants was moderate (43.3% in total, 18.1% for ‘very high’ and 25.2% for ‘above average’).
Genetic evaluation for production and body size traits using different animal models in purebred-Duroc pigs
Duroc pigs are popular crossbred terminal sires, and accurate assessment of genetic parameters in the population can help to rationalize breeding programmes. The principal aim of this study was to evaluate the genetic parameters of production (birth weight, BW; age at 115 kg, AGE; feed conversion ratio, FCR) and body size (body length, BL; body height, BH; front cannon circumference, FCC) traits of Duroc pigs. The second objective was to analyze the fit of different genetic assessment models. The variance components and correlations of the BW (28,348 records), AGE (28,335 records), FCR (11,135 records), BL (31,544 records), BH (21,862 records), and FCC (14,684 records) traits were calculated using DMU and AIREMLF90 from the BLUPF90 package. In the common environment model, the heritabilities of BW, AGE, FCR, BL, BH, and FCC were 0.17 ± 0.014, 0.30 ± 0.019, 0.28 ± 0.024, 0.16 ± 0.013, 0.14 ± 0.017, and 0.081 ± 0.016, with common litter effect values of 0.25, 0.20, 0.18, 0.23, 0.19, and 0.16, respectively. Under the Akaike information criterion (AIC), models with smaller AIC values have a better fit. We found that the common environment model, with litter effects as random effects, had a better fit for estimating genetic parameters. In this model, the estimated genetic correlations of AGE with BW, FCR, BL, BH, and FCC were −0.28 (0.040), 0.76 (0.038), −0.71 (0.036), −0.44 (0.060), and −0.60 (0.073), respectively, with phenotypic correlations of −0.17, 0.52, −0.22, −0.13, and −0.24, respectively. In our analysis of genetic trends for the six traits in the Duroc population from 2012 to 2021, we observed significant genetic trends for AGE, BL, and BH. Particularly noteworthy is the rapid decline in the genetic trend for AGE, indicating an enhancement in the pigs' growth rate through selective breeding. Therefore, we believe that some challenging-to-select traits can benefit from the genetic correlations between traits.
Selecting on easily measurable traits lets the harder-to-select traits gain from synergistic selection effects, leading to genetic progress. Conducting population genetic parameter analysis can assist in devising breeding strategies.
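For readers less familiar with the quantities reported above, the usual textbook definitions are sketched below. The symbols are assumptions, not taken from the abstract: $\sigma^2_a$, $\sigma^2_c$, and $\sigma^2_e$ denote the additive genetic, common litter, and residual variances, $k$ is the number of estimated parameters, and $\hat{L}$ is the maximized likelihood. The exact model equations used by the authors are not given in the abstract.

```latex
\begin{align}
\sigma^2_p &= \sigma^2_a + \sigma^2_c + \sigma^2_e, \\
h^2 &= \frac{\sigma^2_a}{\sigma^2_p}, \qquad
c^2 = \frac{\sigma^2_c}{\sigma^2_p}, \\
\mathrm{AIC} &= 2k - 2\ln\hat{L}.
\end{align}
```

Under these definitions, the reported BW values ($h^2 = 0.17$, $c^2 = 0.25$) would mean that about 17% of phenotypic variance is additive genetic and about 25% is attributable to the common litter environment.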
Asthma prevalence based on the Baidu index and China's Health Statistical Yearbook from 2011 to 2020 in China
Background: Due to environmental pollution, changes in lifestyle, and advancements in diagnostic technology, the prevalence of asthma has been increasing over the years. Although China has made early efforts in asthma epidemiology and prevention, there is still a lack of unified and comprehensive epidemiological research within the country. The objective of the study is to determine the nationwide prevalence distribution of asthma using the Baidu Index and China's Health Statistical Yearbook. Methods: Based on China's Health Statistical Yearbook, we analyzed the gender and age distribution of asthma in China from 2011 to 2020, as well as the length of hospitalization and associated costs. Using the Baidu Index with the coverage set to all 31 provinces and autonomous regions in China, we obtained the Baidu Index for the keyword 'asthma'. Heatmaps and growth ratios described the prevalence and growth of asthma in mainland China. Results: The average expenditure (standard deviation) for discharged asthma patients was ¥5,870 (808). The average length of stay (standard deviation) was 7.9 (0.38) days. From 2011 to 2020, hospitalization expenses for asthma increased while the length of hospital stay decreased. The proportions of discharged patients who were children under the age of 5 were 25.3% (2011), 19.4% (2012), 16% (2013), 17.9% (2014), 13.9% (2015), 11.3% (2016), 10.2% (2017), 9.4% (2018), 8.1% (2019), and 7.2% (2020). The prevalence of asthma was higher among boys than among girls before the age of 14. In contrast, the proportion of women with asthma was larger than that of men after the age of 14. From 2011 to 2020, the median [first quartile (Q1)–third quartile (Q3)] daily asthma Baidu Index values in Guangdong, Beijing, Jiangsu, Sichuan, and Zhejiang were 419 (279–476), 328 (258–376), 315 (227–365), 272 (166–313), and 312 (233–362), respectively.
Coastal regions showed higher levels of attention toward asthma, indicating a higher incidence rate. Since 2014, there has been a rapid increase in the level of attention toward asthma, with the provinces of Qinghai, Sichuan, and Guangdong experiencing the fastest growth. Conclusion: There are regional variations in the prevalence of asthma among different provinces in China, and the overall prevalence of asthma is increasing.
Energetics of a collapsible channel flow with a nonlinear fluid-beam model
Data for figures in 'Energetics of a collapsible channel flow with a nonlinear fluid-beam model'
Microstructural Evolution and Mechanical Properties of Ti2AlNb/GH99 Superalloy Brazed Joints Using TiZrCuNi Amorphous Filler Alloy
Dissimilar-material brazing of Ti2AlNb alloy to GH99 superalloy is of great practical importance in the aerospace field, especially for manufacturing lightweight space aircraft components. In this work, TiZrCuNi amorphous alloy was used as the brazing filler, and experiments were carried out at different brazing temperatures and times to investigate the changes in the interfacial structures and properties of the joints. The typical interfacial microstructure was Ti2AlNb alloy/B2/β/Ti2Ni (Al, Nb) + B2/β + (Ti, Zr)2(Ni, Cu) + (Ti, Zr)(Ni, Cu)/(Cr, Ni, Ti) solid solution + (Ni, Cr) solid solution/GH99 superalloy when brazed at 1000 °C for 8 min. The interfacial microstructure of the joints was influenced by diffusion and reaction between the filler alloy and the parent metals. Raising the brazing temperature or prolonging the brazing time accelerated the diffusion and reaction of the liquid brazing alloy into both parent metals, which eventually led to the aggregation of the (Ti, Zr)2(Ni, Cu) brittle phase and an increased thickness of the Ti2Ni (Al, Nb) layer. According to fracture analyses, cracks initiated in the Ti2Ni (Al, Nb) phase and propagated along it as well as along the (Ti, Zr)2(Ni, Cu) phase. The joints brazed at 1000 °C for 8 min had a maximum shear strength of ~216.2 MPa. Furthermore, increasing the brazing temperature or extending the holding time decreased the shear strength due to the coarse Ti2Ni (Al, Nb) phase and the continuous (Ti, Zr)2(Ni, Cu) phase.
Triglycerides as Biomarker for Predicting Systemic Lupus Erythematosus Related Kidney Injury of Negative Proteinuria
Few biomarkers can be used to predict systemic lupus erythematosus (SLE)-related kidney injury. This paper applies the Apriori algorithm for association rules to mine predictive biomarkers for SLE-related kidney injury with negative proteinuria. The Apriori algorithm was employed to identify biomarkers, and logistic regression analysis and Spearman correlation analysis were used to evaluate the correlation between triglycerides and SLE-related kidney injury with negative proteinuria. Triglycerides were identified by the Apriori algorithm. The level of triglycerides was significantly higher, and it was an independent risk factor for SLE-related kidney injury. In the high-triglycerides group, the number of patients with SLE-related kidney injury, SLEDAI-2K, urine P-CAST, and the levels of blood urea nitrogen, serum creatinine, and proteinuria were all increased. Triglyceride level was positively correlated with proteinuria and P-CAST and negatively correlated with albumin and IgG. The areas under the ROC curve for triglycerides alone and for triglycerides combined with proteinuria were 0.72 and 0.82, respectively. Notably, 50% of SLE-related kidney injuries with negative proteinuria could be identified by high triglyceride levels. A high triglyceride level was found at the onset of kidney injury, and it was inversely related to glomerular filtration rate. Triglycerides may be a potential marker for predicting SLE-related kidney injury, especially SLE-related kidney injury with negative proteinuria. Triglycerides combined with proteinuria could predict SLE-related kidney injury effectively.
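The association-rule mining step mentioned in this abstract can be illustrated with a minimal, generic Apriori sketch. It is not the authors' pipeline: the patient records, item labels, and support threshold below are invented toy values, and the function names are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: cnt / n for c, cnt in counts.items()
                 if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets that differ by one item.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, len(c) - 1))]
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    return frequent[antecedent | consequent] / frequent[antecedent]

# Toy, made-up patient records (each set lists the findings present).
records = [
    {"high_TG", "kidney_injury"},
    {"high_TG", "kidney_injury", "low_albumin"},
    {"high_TG", "low_albumin"},
    {"kidney_injury"},
    {"low_albumin"},
    {"high_TG", "kidney_injury"},
]
freq = frequent_itemsets(records, min_support=0.5)
rule_conf = confidence(freq, frozenset({"high_TG"}), frozenset({"kidney_injury"}))
print(sorted(map(sorted, freq)), rule_conf)
```

In this toy data the rule "high_TG implies kidney_injury" has support 0.5 and confidence of about 0.75. A real analysis would also need the statistical follow-up the abstract describes, such as logistic regression.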
Limited terrestrial carbon sinks and increasing carbon emissions from the Hu Line spatial pattern perspective in China
China’s commitment to achieving the goal of carbon peak and carbon neutrality (CPCN) has attracted worldwide attention, and the efforts made by China are critical to realizing the 1.5 °C temperature control objective of the 2015 Paris Agreement. However, it is unclear whether long-term spatiotemporal changes in China’s carbon emissions and carbon sinks exhibit specific spatial patterns, such as the Hu Line, which might affect China’s future policymaking. Based on the emission factor, inventory, and eddy covariance methods, this study calculated China’s carbon emissions (2003–2018) and terrestrial carbon sinks (2003–2020). Results showed that the increase in carbon sinks is limited in comparison with the increase in carbon emissions, and that the carbon sequestration ratio remains deficient, generally staying at around 10%. The spatial pattern of carbon emissions and carbon sinks was unbalanced across the Hu Line, mainly reflected in an accelerated increase in both the carbon emission rankings of provinces on the northwestern side of the Hu Line and their share of China’s total carbon emissions. Despite this, the gross domestic product (GDP) rankings of those provinces have not improved substantially, whereas provinces on the southeastern side of the Hu Line have contributed most to China’s GDP and terrestrial carbon sinks. The findings of this study improve understanding of the spatiotemporal relationship between carbon emissions and terrestrial carbon sinks in China, and offer alternative insights that could support adjustment of carbon-related policies and promote realization of CPCN in China.