The Short Text Matching Model Enhanced with Knowledge via Contrastive Learning

Author: Cui Mengmeng
Du Yanlong
Liu Xiangzheng
Mai Hanjie
Xu Shaohua
Zhang Qiang
Zhong Qiqiang
Publication date: 07/04/2023

In recent years, short text matching has been widely applied in advertising search and recommendation. The difficulty lies in the lack of semantic information and the word ambiguity caused by the short length of the text. Previous works have introduced complement sentences or knowledge bases to provide additional feature information. However, these methods do not fully model the interaction between the original sentence and the complement sentence, and they do not consider the noise that external knowledge bases may introduce. Therefore, this paper proposes a short text matching model that combines contrastive learning and external knowledge. The model uses a generative model to generate corresponding complement sentences and uses contrastive learning to guide the model toward a more semantically meaningful encoding of the original sentence. In addition, to avoid noise, we use keywords as the main semantics of the original sentence to retrieve corresponding knowledge words in the knowledge base and construct a knowledge graph. A graph encoding model is used to integrate the knowledge base information into the model. Our model achieves state-of-the-art performance on two publicly available Chinese text matching datasets, demonstrating its effectiveness. Comment: 11 pages, 2 figures

arXiv.org e-Print Archive
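
The abstract above does not state which contrastive objective the model uses. As a minimal sketch only, assuming an InfoNCE-style loss in which each original-sentence encoding is pulled toward the encoding of its own generated complement sentence and pushed away from the other complements in the batch, the idea could look like this. The function names, the toy vectors, and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(originals, complements, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed objective, see lead-in).

    complements[i] is the positive for originals[i]; every other
    complement in the batch acts as a negative.
    """
    losses = []
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, c) / temperature for c in complements]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy encodings: matched pairs should give a much lower loss than mismatched ones.
originals = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]
mismatched = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(originals, matched)
loss_mismatched = info_nce(originals, mismatched)
```

In this toy batch the matched pairing yields a loss near zero, while the mismatched pairing is penalized heavily, which is the behavior a contrastive objective of this kind relies on.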

Deep Spatio-Temporal Neural Networks for Click-Through Rate Prediction

Author: Du Yanlong
Li Li
Liu Zhaojie
Ouyang Wentao
Xing Xin
Zhang Xiuwu
Zou Heng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/07/2019

Click-through rate (CTR) prediction is a critical task in online advertising systems. A large body of research considers each ad independently, but ignores its relationship to other ads that may impact the CTR. In this paper, we investigate various types of auxiliary ads for improving the CTR prediction of the target ad. In particular, we explore auxiliary ads from two viewpoints: one is from the spatial domain, where we consider the contextual ads shown above the target ad on the same page; the other is from the temporal domain, where we consider historically clicked and unclicked ads of the user. The intuitions are that ads shown together may influence each other, clicked ads reflect a user's preferences, and unclicked ads may indicate what a user dislikes to a certain extent. In order to effectively utilize these auxiliary data, we propose the Deep Spatio-Temporal neural Networks (DSTNs) for CTR prediction. Our model is able to learn the interactions between each type of auxiliary data and the target ad, to emphasize more important hidden information, and to fuse heterogeneous data in a unified framework. Offline experiments on one public dataset and two industrial datasets show that DSTNs outperform several state-of-the-art methods for CTR prediction. We have deployed the best-performing DSTN in Shenma Search, which is the second largest search engine in China. The A/B test results show that the online CTR is also significantly improved compared to our last serving model. Comment: Accepted by KDD 201

arXiv.org e-Print Archive

Crossref

Hadoop Perfect File: A fast and memory-efficient metadata access archive file to face small files problem in HDFS

Author: Du Xiaojiang
Guizani Mohsen
Lin Kwei Jay
Tao Wenjun
Tchaye-Kondi Jude
Zhai Yanlong
Zhu Liehuang
Publication venue: 'Elsevier BV'
Publication date: 01/10/2021

HDFS faces several issues when it comes to handling a large number of small files. These issues are well addressed by archive systems, which combine small files into larger ones. They use index files to hold the information needed to retrieve a small file's content from the big archive file. However, existing archive-based solutions incur significant overhead when retrieving a file's content, since additional processing and I/Os are needed to acquire the retrieval information before accessing the actual file content, thereby degrading access efficiency. This paper presents a new archive file named Hadoop Perfect File (HPF). HPF minimizes access overheads by directly accessing metadata from the part of the index file containing the information. It consequently reduces the additional processing and I/Os needed and improves the access efficiency of archive files. Our index system uses two hash functions. Metadata records are distributed across index files using a dynamic hash function. We further build an order-preserving perfect hash function that memorizes the position of a small file's metadata record within the index file. The authors thank the anonymous reviewers for their insightful suggestions. This work is supported by the National Natural Science Foundation of China (Grant No. 61602037).

Qatar University Institutional Repository
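
The two-hash lookup described in the HPF abstract above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the HPF implementation: a fixed modulo hash stands in for the dynamic hash function, a per-bucket dictionary over sorted names stands in for the order-preserving perfect hash function, and `RECORD_SIZE`, `bucket_of`, and `ArchiveIndex` are hypothetical names introduced only for illustration.

```python
import hashlib

RECORD_SIZE = 24  # hypothetical fixed size of one metadata record, in bytes


def bucket_of(name, num_index_files):
    """First hash: choose the index file that holds this file's metadata.

    Assumption: a fixed modulo hash replaces the paper's dynamic hash function.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_index_files


class ArchiveIndex:
    def __init__(self, names, num_index_files=4):
        self.num_index_files = num_index_files
        buckets = [[] for _ in range(num_index_files)]
        for name in names:
            buckets[bucket_of(name, num_index_files)].append(name)
        # Second hash: map each name to its record slot within the index file.
        # Assumption: a dict over the sorted names stands in for an
        # order-preserving perfect hash, which would not store the keys.
        self.slots = [
            {name: slot for slot, name in enumerate(sorted(bucket))}
            for bucket in buckets
        ]

    def locate(self, name):
        """Return (index file number, byte offset of the metadata record)."""
        bucket = bucket_of(name, self.num_index_files)
        return bucket, self.slots[bucket][name] * RECORD_SIZE


names = ["file{}.txt".format(i) for i in range(50)]
index = ArchiveIndex(names)
locations = [index.locate(name) for name in names]
```

With both hashes computed from the file name alone, a reader can seek straight to one record in one index file instead of scanning the index, which is the access pattern the abstract credits for the reduced overhead.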

Deep Neural Network for Fast and Accurate Single Image Super-Resolution via Channel-Attention-based Fusion of Orientation-aware Features

Author: Cao Yanlong
Cao Yanpeng
Chen Du
He Zewei
Tang Siliang
Yang Jiangxin
Yang Michael Ying
Zhuang Yueting
Publication date: 09/12/2019

Recently, Convolutional Neural Networks (CNNs) have been successfully adopted to solve the ill-posed single image super-resolution (SISR) problem. A commonly used strategy to boost the performance of CNN-based SISR models is deploying very deep networks, which inevitably incurs many obvious drawbacks (e.g., a large number of network parameters, heavy computational loads, and difficult model training). In this paper, we aim to build more accurate and faster SISR models by developing better-performing feature extraction and fusion techniques. Firstly, we propose a novel Orientation-Aware feature extraction and fusion Module (OAM), which contains a mixture of 1D and 2D convolutional kernels (i.e., 5 x 1, 1 x 5, and 3 x 3) for extracting orientation-aware features. Secondly, we adopt the channel attention mechanism as an effective technique to adaptively fuse features extracted in different directions and in hierarchically stacked convolutional stages. Based on these two important improvements, we present a compact but powerful CNN-based model for high-quality SISR via Channel Attention-based fusion of Orientation-Aware features (SISR-CA-OA). Extensive experimental results verify the superiority of the proposed SISR-CA-OA model, which performs favorably against state-of-the-art SISR models in terms of both restoration accuracy and computational efficiency. The source codes will be made publicly available. Comment: 12 pages, 11 figures

arXiv.org e-Print Archive

University of Twente Research Information

Quantitative trait loci and candidate genes for yield-related traits of upland cotton revealed by genome-wide association analysis under drought conditions

Author: Fenglei Sun
Jun Ma
Penglong Wang
Xiongming Du
Yanlong Yang
Publication venue: BMC
Publication date: 01/09/2023

Background: Due to the influence of extreme weather, the environment in China’s main cotton-producing areas is prone to drought stress conditions, which affect the growth and development of cotton and lead to a decrease in cotton yield.
Results: In this study, 188 upland cotton germplasm resources were phenotyped for 8 traits (including 3 major yield traits) under drought conditions in three environments for two consecutive years. Correlation analysis revealed significant positive correlations between the three yield traits. Genetic analysis showed that the estimated heritability of the seed cotton index (SC) under drought conditions was the highest (80.81%), followed by that of boll weight (BW) (80.64%) and the lint cotton index (LC) (70.49%). With genome-wide association study (GWAS) analysis, a total of 75 quantitative trait loci (QTLs) were identified, including two highly credible new QTL hotspots. Three candidate genes (Gh_D09G064400, Gh_D10G261000 and Gh_D10G254000) located in the two new QTL hotspots, QTL51 and QTL55, were highly expressed in the early stage of fiber development and showed significant correlations with SC, LC and BW. The expression of the three candidate genes in two extreme materials after drought stress, and in the fibers of these two materials at 15, 20 and 25 DPA, was analyzed by qRT-PCR. The expression of these three candidate genes was significantly upregulated after drought stress and was significantly higher in drought-tolerant materials than in drought-sensitive materials. In addition, the expression levels of the three candidate genes were higher in the early stage of fiber development (15 DPA), and the expression levels in drought-tolerant germplasm were higher than those in drought-sensitive germplasm. These three candidate genes may play an important role in determining cotton yield under drought conditions.
Conclusions: This study is helpful for understanding the regulatory genes affecting cotton yield under drought conditions and provides germplasm and candidate gene resources for breeding high-yield cotton varieties under these conditions.

Directory of Open Access Journals

Toward Reinforcement-Learning-Based Service Deployment of 5G Mobile Edge Computing with Request-Aware Scheduling

Author: Bao Tianhong
Du Xiaojiang
Guizani Mohsen
Shen Meng
Zhai Yanlong
Zhu Liehuang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2020

5G wireless network technology will not only significantly increase bandwidth but also introduce new features such as mMTC and URLLC. However, high request latency will remain a challenging problem even with 5G, because the massive number of requests generated by an increasing number of devices must travel long distances to the services deployed in cloud centers. By pushing the services closer to the edge of the network, edge computing is recognized as a promising technology to reduce latency. However, properly deploying services among resource-constrained edge servers is an unsolved problem. In this article, we propose a deep reinforcement learning approach that deploys the services to the edge servers with consideration of the request patterns and resource constraints of users, which have not been adequately explored. First, the system model and optimization objectives are formulated and investigated. Then the problem is modeled as a Markov decision process and solved using the Dueling-Deep Q-network algorithm. The experimental results, based on the evaluation of real-life mobile wireless datasets, show that this reinforcement learning approach can be applied to patterns of requests and improves performance. This work is supported by the National Natural Science Foundation of China (Grant No. 61602037) and the Equipment Pre-Research Field Foundation (Grant No. 61400010104).

Qatar University Institutional Repository
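
The abstract above names the Dueling-Deep Q-network algorithm without describing the network. As a hedged sketch of the standard dueling aggregation step only (combining a state value with per-action advantages, with the mean advantage subtracted), and assuming for illustration that each action corresponds to a candidate edge server, the computation could look like this. The function names and the numbers are hypothetical, and no part of the authors' system model is reproduced.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values):
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical outputs of the value and advantage streams for one state,
# with three candidate edge servers as the actions.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])
best_server = greedy_action(q_values)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, so the ranking of actions comes only from the advantages while the state value shifts all Q-values together.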

Study on the Relationship between Electrical Tree Development and Partial Discharge of XLPE Cables

Author: Chaofei Gao
Jian Du
Liwei Zheng
Wei Wang
Yanlong Yu
Zan Wang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2019

Based on the slice materials of 35 kV and 110 kV XLPE cables, an experimental platform is built to study the relationship between electrical trees and partial discharges (PDs) in XLPE at different voltage levels. There are three significant statistical characteristics of the PDs during the growth of electrical trees. The analysis of the results shows that each growth stage has certain characteristics. Different features existed between the growth of the electrical trees and the PDs in the insulation of the 35 kV and 110 kV cables. At every stage, evident characteristics such as large spans of time and frequency appeared in the equivalent time-frequency diagram as the electrical trees grew violently. These results could provide criteria for identifying deterioration when PDs are used to monitor cables in service at rated voltages. The results are important for identifying defects in cable insulation in order to provide an early warning of insulation breakdown in the cables.

Directory of Open Access Journals