Search CORE

37 research outputs found

Exploiting Query’s Temporal Patterns for Query Autocompletion

Author: Danyang Jiang
Fei Cai
Honghui Chen
Publication venue: Hindawi Limited
Publication date: 01/01/2017

Query autocompletion (QAC) is a common interactive feature of web search engines. It helps users formulate queries and avoid spelling mistakes by presenting a list of query completions as soon as they start typing in the search box. Existing QAC models mostly rank query completions by their past popularity as recorded in query logs. The popularity of some queries is relatively stable or periodic, while that of others may rise suddenly. Current time-sensitive QAC models focus on either periodicity or recency and cannot respond swiftly to such sudden rises, resulting in suboptimal QAC performance. In this paper, we propose a hybrid QAC model that considers two temporal patterns of a query’s popularity: periodicity and burst trend. In detail, we first employ the Discrete Fourier Transform (DFT) to identify the periodicity of a query’s popularity, from which we forecast its future popularity. Then the burst trend of the query’s popularity is detected and incorporated into the hybrid model together with its cyclic behavior. Extensive experiments on a large, real-world query log dataset show that modeling the temporal patterns of query popularity in the form of its periodicity and its burst trend can significantly improve the effectiveness of ranking query completions.

Crossref

Directory of Open Access Journals
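
Illustration (not from the paper): the abstract above says the DFT is used to identify the periodicity of a query's popularity. The following is a minimal sketch of that general idea only, assuming a plain list of daily query counts. The function name `dominant_period` and the synthetic data are hypothetical, and the authors' actual model, forecasting step, and burst detection are not reproduced here.

```python
import cmath
import math


def dominant_period(counts):
    """Return the period (in samples) of the strongest non-constant DFT
    component of `counts`, or None if the series has no variation."""
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]  # remove the constant (DC) component
    best_k, best_power = 0, 0.0
    # For a real-valued series only frequencies 1..n//2 are distinct.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    if best_k == 0:
        return None
    return n / best_k


# Hypothetical data: eight weeks of daily counts with a weekly cycle.
weekly = [100 + 50 * math.cos(2 * math.pi * t / 7) for t in range(56)]
period = dominant_period(weekly)
print(period)
```

The naive O(n^2) transform keeps the sketch dependency-free; a real query log would more likely be processed with an FFT routine.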

Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain

Author: Cai Muzhen
Cao Jiawei
Du Yanrui
Ma Ming
Qin Bing
Zhao Danyang
Zhao Sendong
Publication date: 23/02/2024

Extensive studies have been devoted to privatizing general-domain Large Language Models (LLMs) as Domain-Specific LLMs via feeding specific-domain data. However, these privatization efforts often ignored a critical aspect: Dual Logic Ability, which is a core reasoning ability for LLMs. The dual logic ability of LLMs ensures that they can maintain a consistent stance when confronted with both positive and negative statements about the same fact. Our study focuses on how the dual logic ability of LLMs is affected during the privatization process in the medical domain. We conduct several experiments to analyze the dual logic ability of LLMs by examining the consistency of the stance in responses to paired questions about the same fact. In our experiments, interestingly, we observed a significant decrease in the dual logic ability of existing LLMs after privatization. Besides, our results indicate that incorporating general domain dual logic data into LLMs not only enhances LLMs' dual logic ability but also further improves their accuracy. These findings underscore the importance of prioritizing LLMs' dual logic ability during the privatization process. Our study establishes a benchmark for future research aimed at exploring LLMs' dual logic ability during the privatization process and offers valuable guidance for privatization efforts in real-world applications

arXiv.org e-Print Archive

DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling

Author: Bendersky Michael
Cai Danyang
Chen Jiecao
Emadzadeh Ehsan
Najork Marc
Raman Karthik
Yang Liu
Yeh Jung-Jung
Zhou Yun
Publication date: 01/01/2020

Pre-trained models like BERT (Devlin et al., 2018) have dominated NLP / IR applications such as single sentence classification, text pair classification, and question answering. However, deploying these models in real systems is highly non-trivial due to their exorbitant computational costs. A common remedy to this is knowledge distillation (Hinton et al., 2015), leading to faster inference. However -- as we show here -- existing works are not optimized for dealing with pairs (or tuples) of texts. Consequently, they are either not scalable or demonstrate subpar performance. In this work, we propose DiPair -- a novel framework for distilling fast and accurate models on text pair tasks. Coupled with an end-to-end training strategy, DiPair is both highly scalable and offers improved quality-speed tradeoffs. Empirical studies conducted on both academic and real-world e-commerce benchmarks demonstrate the efficacy of the proposed approach with speedups of over 350x and minimal quality drop relative to the cross-attention teacher BERT model. Comment: 13 pages. Accepted to Findings of EMNLP 202

arXiv.org e-Print Archive

Crossref

Using protection motivation theory to explain the intention to initiate human papillomavirus vaccination among men who have sex with men in China

Author: Cai Yong
Huang Ruonan
Li Peiyang
Luo Danyang
Meng Xiaojun
Nadarzynski Tom
Qian Han-Zhu
Wang Guanghui
Wang Ying
Wang Zhenyu
Yuan Tanwei
Zhou Yepeng
Zou Huachun
Publication venue: Elsevier BV
Publication date: 01/01/2021

Human papillomavirus (HPV) infection and related diseases are common among men who have sex with men (MSM). The most effective prevention is HPV vaccination. In China, however, men are not included in the HPV vaccination plan. We investigated the intention to initiate HPV vaccination and associated factors among MSM in China. Methods: We surveyed 563 unvaccinated MSM aged 18 or older from six cities in China. Participants completed an electronic questionnaire about demographics, knowledge of and attitude towards HPV and the HPV vaccine, intention to initiate HPV vaccination, willingness to recommend the HPV vaccine to peers, and feelings about government policy on HPV vaccination. We used structural equation modeling (SEM) to analyze factors associated with HPV vaccine intention. Results: Knowledge of HPV and the HPV vaccine among participants was low. The mean score of knowledge about HPV and the HPV vaccine was only 1.59 (range 0–11). The intention to initiate HPV vaccination within 6 months among participants was moderate (43.3% in total, 18.1% for 'very high' and 25.2% for 'above average').

WestminsterResearch

Genetic evaluation for production and body size traits using different animal models in purebred-Duroc pigs

Author: Danyang Lin
Fuchen Zhou
Gengyuan Cai
Haiyu Zeng
Jian Ye
Linsong Dong
Yifeng Hong
Zhenfang Wu
Publication venue: Frontiers Media S.A.
Publication date: 01/12/2023

Duroc pigs are popular crossbred terminal sires, and accurate assessment of genetic parameters in the population can help to rationalize breeding programmes. The principal aim of this study was to evaluate the genetic parameters of production (birth weight, BW; age at 115 kg, AGE; feed conversion ratio, FCR) and body size (body length, BL; body height, BH; front cannon circumference, FCC) traits of Duroc pigs. The second objective was to analyze the fit of different genetic assessment models. The variance components and correlations of BW (28,348 records), AGE (28,335 records), FCR (11,135 records), BL (31,544 records), BH (21,862 records), and FCC (14,684 records) traits were calculated using DMU and AIREMLF90 from the BLUPF90 package. In the common environment model, the heritabilities of the BW, AGE, FCR, BL, BH, and FCC traits were 0.17 ± 0.014, 0.30 ± 0.019, 0.28 ± 0.024, 0.16 ± 0.013, 0.14 ± 0.017, and 0.081 ± 0.016, with common litter effect values of 0.25, 0.20, 0.18, 0.23, 0.19, and 0.16, respectively. Under the Akaike information criterion (AIC), models with smaller AIC values have a better fit. We found that the common environment model with litter effects as random effects for estimating genetic parameters had a better fit. In this model, the estimated genetic correlations between AGE and the BW, FCR, BL, BH, and FCC traits were −0.28 (0.040), 0.76 (0.038), −0.71 (0.036), −0.44 (0.060), and −0.60 (0.073), respectively, with phenotypic correlations of −0.17, 0.52, −0.22, −0.13 and −0.24, respectively. In our analysis of genetic trends for six traits in the Duroc population from 2012 to 2021, we observed significant genetic trends for AGE, BL, and BH. Particularly noteworthy is the rapid decline in the genetic trend for AGE, indicating an enhancement in the pig's growth rate through selective breeding. Therefore, we believe that some challenging-to-select traits can benefit from the genetic correlations between traits. By selecting easily measurable traits, they can gain from synergistic selection effects, leading to genetic progress. Conducting population genetic parameter analysis can assist us in devising breeding strategies.

Directory of Open Access Journals

Asthma prevalence based on the Baidu index and China's Health Statistical Yearbook from 2011 to 2020 in China

Author: Cai Chen
Danyang Lv
Fengxia Wu
Fulai Peng
Haitao Du
Ping Wang
Xingchen Wang
Xuekun Shao
Yahui Li
Yi Wang
Publication venue: Frontiers Media S.A.
Publication date: 01/10/2023

Background: Due to environmental pollution, changes in lifestyle, and advancements in diagnostic technology, the prevalence of asthma has been increasing over the years. Although China has made early efforts in asthma epidemiology and prevention, there is still a lack of unified and comprehensive epidemiological research within the country. The objective of the study is to determine the nationwide prevalence distribution of asthma using the Baidu Index and China's Health Statistical Yearbook. Methods: Based on China's Health Statistical Yearbook, we analyzed the gender and age distribution of asthma in China from 2011 to 2020, as well as the length of hospitalization and associated costs. Using the Baidu Index with coverage of all 31 provinces and autonomous regions in China, we obtained the Baidu Index for the keyword 'asthma'. Heatmaps and growth ratios described the prevalence and growth of asthma in mainland China. Results: The average expenditure (standard deviation) for discharged asthma patients was ¥5,870 (808). The average length of stay (standard deviation) was 7.9 (0.38) days. During the period of 2011 to 2020, hospitalization expenses for asthma increased while the length of hospital stay decreased. The proportion of discharged patients who were children under the age of 5 was 25.3% (2011), 19.4% (2012), 16% (2013), 17.9% (2014), 13.9% (2015), 11.3% (2016), 10.2% (2017), 9.4% (2018), 8.1% (2019), and 7.2% (2020), respectively. The prevalence of asthma among boys was higher than among girls before the age of 14. In contrast, the proportion of women with asthma was larger than that of men after the age of 14. During the period from 2011 to 2020, the median [first quartile (Q1)–third quartile (Q3)] daily asthma Baidu index values in Guangdong, Beijing, Jiangsu, Sichuan, and Zhejiang were 419 (279–476), 328 (258–376), 315 (227–365), 272 (166–313), and 312 (233–362), respectively. Coastal regions showed higher levels of attention toward asthma, indicating a higher incidence rate. Since 2014, there has been a rapid increase in the level of attention toward asthma, with the provinces of Qinghai, Sichuan, and Guangdong experiencing the fastest growth. Conclusion: There are regional variations in the prevalence of asthma among different provinces in China, and the overall prevalence of asthma is increasing.

Directory of Open Access Journals

Energetics of a collapsible channel flow with a nonlinear fluid-beam model

Author: Cai ZX
Luo Xiaoyu
Stewart Peter
Wang Danyang
Publication venue: University of Glasgow
Publication date: 27/01/2021

Data for figures in 'Energetics of a collapsible channel flow with a nonlinear fluid-beam model'

Enlighten: Research Data (University of Glasgow)

Microstructural Evolution and Mechanical Properties of Ti2AlNb/GH99 Superalloy Brazed Joints Using TiZrCuNi Amorphous Filler Alloy

Author: Danyang Lin
Hongbing Liu
Junjie Cai
Shengpeng Hu
Wei Fu
Xiaoguo Song
Publication venue: MDPI AG
Publication date: 01/01/2023

Dissimilar materials brazing of Ti2AlNb alloy to GH99 superalloy is of great pragmatic importance in the aerospace field, especially the lightweight space aircraft components manufacturing. In this work, TiZrCuNi amorphous filler alloy was used as brazing filler, and experiments were carried out at different brazing temperatures and times to investigate the changes in interfacial structures and properties of the joints. The typical interfacial microstructure was Ti2AlNb alloy/B2/β/Ti2Ni (Al, Nb) + B2/β + (Ti, Zr)2(Ni, Cu) + (Ti, Zr)(Ni, Cu)/(Cr, Ni, Ti) solid solution + (Ni, Cr) solid solution/GH99 superalloy when being brazed at 1000 °C for 8 min. The interfacial microstructure of the joints was influenced by diffusion and reaction between the filler alloy and the parent metal. The prolongation of brazing process parameters accelerated the diffusion and reaction of the liquid brazing alloy into both parent metals, which eventually led to the aggregation of (Ti, Zr)2(Ni, Cu) brittle phase and increased thickness of Ti2Ni (Al, Nb) layer. According to fracture analyses, cracks began in the Ti2Ni (Al, Nb) phase and spread with it as well as the (Ti, Zr)2(Ni, Cu) phase. The joints that were brazed at 1000 °C for 8 min had a maximum shear strength of ~216.2 MPa. Furthermore, increasing the brazing temperature or extending the holding time decreased the shear strength due to the coarse Ti2Ni (Al, Nb) phase and the continuous (Ti, Zr)2(Ni, Cu) phase

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Triglycerides as Biomarker for Predicting Systemic Lupus Erythematosus Related Kidney Injury of Negative Proteinuria

Author: Danyang Li
Haitao Yu
Jingyu Yang
Lili Jiang
Mingjun Si
Ting Liu
Yuanyan Cai
Publication venue: MDPI AG
Publication date: 01/07/2022

Few biomarkers can be used to predict systemic lupus erythematosus (SLE) related kidney injury. This paper uses the Apriori association rule algorithm to mine predictive biomarkers for SLE-related kidney injury of negative proteinuria. The Apriori algorithm was employed to identify biomarkers, and logistic regression analysis and Spearman correlation analysis were used to evaluate the correlation between triglycerides and SLE-related kidney injury of negative proteinuria. Triglycerides were mined out by the Apriori algorithm. The level of triglycerides was significantly higher, and it was an independent risk factor for SLE-related kidney injury. In the high-triglycerides group, the number of patients with SLE-related kidney injury, SLEDAI-2K, urine P-CAST, the level of blood urea nitrogen, serum creatinine, and proteinuria were increased. Triglycerides level was positively correlated with proteinuria and P-CAST and negatively correlated with albumin and IgG. The areas under the ROC curve of triglycerides and of triglycerides combined with proteinuria were 0.72 and 0.82, respectively. Significantly, 50% of SLE-related kidney injuries of negative proteinuria could be identified by high triglycerides levels. High triglycerides level was found at the time of onset of kidney injury, and it was inversely related to glomerular filtration rate. Triglycerides may be a potential marker for predicting SLE-related kidney injury, especially in SLE-related kidney injury of negative proteinuria. Triglycerides combined with proteinuria could predict SLE-related kidney injury effectively.

Directory of Open Access Journals

PubMed Central

Limited terrestrial carbon sinks and increasing carbon emissions from the Hu Line spatial pattern perspective in China

Author: Baichi Zhou
Danyang Feng
Hezhen Lou
Mingyong Cai
Shengtian Yang
Xiaoyu Ren
Xuewei Shi
Yifan Zhu
Zihao Pan
Publication venue: Elsevier
Publication date: 01/05/2024

China’s commitment to achieving the goal of carbon peak and carbon neutrality (CPCN) has attracted worldwide attention, and the efforts made by China are critical to realization of the 1.5 °C temperature control objective of the 2015 Paris Agreement. However, it is unclear whether long-term spatiotemporal changes in China’s carbon emissions and carbon sinks exhibit specific spatial patterns, such as the Hu Line, which might affect China’s future policymaking. Based on the emission factor, inventory, and eddy covariance methods, this study calculated China’s carbon emissions (2003–2018) and terrestrial carbon sinks (2003–2020). Results showed that the increase in carbon sinks is limited in comparison with the increase in carbon emissions, and that the carbon sequestration ratio remains deficient and generally maintained at around 10 %. The spatial pattern of carbon emissions and carbon sinks showed an unbalanced state across the Hu Line, mainly reflected in accelerated increase in both the carbon emission rankings and the proportion of emissions to China’s total carbon emissions of provinces on the northwestern side of the Hu Line. Despite this, the gross domestic product (GDP) rankings of those provinces have not improved substantially, whereas provinces on the southeastern side of the Hu Line have contributed most to China’s GDP and terrestrial carbon sinks. The findings of this study improve understanding of the spatiotemporal relationship between carbon emissions and terrestrial carbon sinks in China, and represent alternative insights that could support adjustment of carbon-related policies and promote realization of CPCN in China

Directory of Open Access Journals