64 research outputs found
The Short Text Matching Model Enhanced with Knowledge via Contrastive Learning
In recent years, short Text Matching tasks have been widely applied in the
fields of advertising search and recommendation. The difficulty lies in the lack
of semantic information and word ambiguity caused by the short length of the
text. Previous works have introduced complement sentences or knowledge bases to
provide additional feature information. However, these methods do not fully
model the interaction between the original sentence and the complement sentence, and have
not considered the noise issue that may arise from the introduction of external
knowledge bases. Therefore, this paper proposes a short Text Matching model
that combines contrastive learning and external knowledge. The model uses a
generative model to generate corresponding complement sentences and uses the
contrastive learning method to guide the model to obtain more semantically
meaningful encoding of the original sentence. In addition, to avoid noise, we
use keywords as the main semantics of the original sentence to retrieve
corresponding knowledge words in the knowledge base, and construct a knowledge
graph. The graph encoding model is used to integrate the knowledge base
information into the model. Our designed model achieves state-of-the-art
performance on two publicly available Chinese Text Matching datasets,
demonstrating the effectiveness of our model. Comment: 11 pages, 2 figures
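The abstract above does not state which contrastive objective is used. As a rough illustration only, the sketch below assumes an InfoNCE-style loss in which the encoding of the generated complement sentence acts as the positive for the original sentence and encodings of other sentences act as negatives. The function names, the temperature value, and the toy vectors are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed objective, not confirmed by the abstract).

    anchor    : encoding of the original sentence
    positive  : encoding of its complement sentence
    negatives : encodings of unrelated sentences
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum - logits[0]


# Toy 2-D encodings: the loss is small when the positive lies close to the anchor.
aligned = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]])
```

Minimizing such a loss pulls the original sentence's encoding toward its complement and away from unrelated sentences, which is one plausible reading of "more semantically meaningful encoding".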
Deep Spatio-Temporal Neural Networks for Click-Through Rate Prediction
Click-through rate (CTR) prediction is a critical task in online advertising
systems. A large body of research considers each ad independently, but ignores
its relationship to other ads that may impact the CTR. In this paper, we
investigate various types of auxiliary ads for improving the CTR prediction of
the target ad. In particular, we explore auxiliary ads from two viewpoints: one
is from the spatial domain, where we consider the contextual ads shown above
the target ad on the same page; the other is from the temporal domain, where we
consider historically clicked and unclicked ads of the user. The intuitions are
that ads shown together may influence each other, clicked ads reflect a user's
preferences, and unclicked ads may indicate what a user dislikes to certain
extent. In order to effectively utilize these auxiliary data, we propose the
Deep Spatio-Temporal neural Networks (DSTNs) for CTR prediction. Our model is
able to learn the interactions between each type of auxiliary data and the
target ad, to emphasize more important hidden information, and to fuse
heterogeneous data in a unified framework. Offline experiments on one public
dataset and two industrial datasets show that DSTNs outperform several
state-of-the-art methods for CTR prediction. We have deployed the
best-performing DSTN in Shenma Search, which is the second largest search
engine in China. The A/B test results show that the online CTR is also
significantly improved compared to our last serving model. Comment: Accepted by KDD 201
Hadoop Perfect File: A fast and memory-efficient metadata access archive file to face small files problem in HDFS
HDFS faces several issues when handling a large number of small files. These issues are well addressed by archive systems, which combine small files into larger ones and use index files to hold the information needed to retrieve a small file's content from the large archive file. However, existing archive-based solutions incur significant overhead when retrieving a file's content, since additional processing and I/Os are needed to acquire the retrieval information before accessing the actual content, which degrades access efficiency. This paper presents a new archive file named Hadoop Perfect File (HPF). HPF minimizes access overhead by directly accessing metadata from the part of the index file that contains it. It consequently reduces the additional processing and I/Os needed and improves the efficiency of access to archive files. Our index system uses two hash functions. Metadata records are distributed across index files using a dynamic hash function. We further build an order-preserving perfect hash function that memorizes the position of a small file's metadata record within the index file. The authors thank the anonymous reviewers for their insightful suggestions. This work is supported by the National Natural Science Foundation of China (Grant No. 61602037).
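The two-level lookup described above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the HPF implementation: a fixed modulo stands in for the dynamic hash function, and a dictionary mapping each known file name to its rank in sorted order stands in for the order-preserving perfect hash function. The class name, method names, and record layout are hypothetical.

```python
import hashlib


def stable_hash(name):
    """Deterministic hash of a file name (the built-in hash() is salted per process)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToyIndex:
    """Toy two-level index: pick an index file, then jump straight to the record."""

    def __init__(self, files, num_index_files=4):
        # files: dict mapping file name -> (offset, length) inside the archive.
        self.num_index_files = num_index_files
        buckets = [[] for _ in range(num_index_files)]
        for name in files:
            # Level 1 (stand-in for the dynamic hash): choose an index file.
            buckets[stable_hash(name) % num_index_files].append(name)
        self.records = []   # one list of metadata records per index file
        self.position = []  # one name -> slot map per index file
        for names in buckets:
            names.sort()
            self.records.append([files[n] for n in names])
            # Level 2 (stand-in for the order-preserving perfect hash):
            # each known name maps to its rank, so slots follow key order.
            self.position.append({n: i for i, n in enumerate(names)})

    def lookup(self, name):
        """Return (offset, length) without scanning the index file."""
        bucket = stable_hash(name) % self.num_index_files
        slot = self.position[bucket][name]
        return self.records[bucket][slot]


files = {f"file{i}.txt": (i * 100, 100) for i in range(20)}
index = ToyIndex(files)
offset, length = index.lookup("file7.txt")
```

The point of the design is that a lookup touches exactly one index file and one record slot, rather than reading and searching a whole index before the file content can be fetched.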
Deep Neural Network for Fast and Accurate Single Image Super-Resolution via Channel-Attention-based Fusion of Orientation-aware Features
Recently, Convolutional Neural Networks (CNNs) have been successfully adopted
to solve the ill-posed single image super-resolution (SISR) problem. A commonly
used strategy to boost the performance of CNN-based SISR models is deploying
very deep networks, which inevitably incurs many obvious drawbacks (e.g., a
large number of network parameters, heavy computational loads, and difficult
model training). In this paper, we aim to build more accurate and faster SISR
models via developing better-performing feature extraction and fusion
techniques. Firstly, we propose a novel Orientation-Aware feature extraction
and fusion Module (OAM), which contains a mixture of 1D and 2D convolutional
kernels (i.e., 5 x 1, 1 x 5, and 3 x 3) for extracting orientation-aware
features. Secondly, we adopt the channel attention mechanism as an effective
technique to adaptively fuse features extracted in different directions and in
hierarchically stacked convolutional stages. Based on these two important
improvements, we present a compact but powerful CNN-based model for
high-quality SISR via Channel Attention-based fusion of Orientation-Aware
features (SISR-CA-OA). Extensive experimental results verify the superiority of
the proposed SISR-CA-OA model, performing favorably against the
state-of-the-art SISR models in terms of both restoration accuracy and
computational efficiency. The source code will be made publicly available. Comment: 12 pages, 11 figures
Quantitative trait loci and candidate genes for yield-related traits of upland cotton revealed by genome-wide association analysis under drought conditions
Background: Due to the influence of extreme weather, the environment in China’s main cotton-producing areas is prone to drought stress conditions, which affect the growth and development of cotton and lead to a decrease in cotton yield.
Results: In this study, 188 upland cotton germplasm resources were phenotyped for 8 traits (including 3 major yield traits) under drought conditions in three environments for two consecutive years. Correlation analysis revealed significant positive correlations between the three yield traits. Genetic analysis showed that the estimated heritability of the seed cotton index (SC) under drought conditions was the highest (80.81%), followed by that of boll weight (BW) (80.64%) and the lint cotton index (LC) (70.49%). With genome-wide association study (GWAS) analysis, a total of 75 quantitative trait loci (QTLs) were identified, including two highly credible new QTL hotspots. Three candidate genes (Gh_D09G064400, Gh_D10G261000 and Gh_D10G254000) located in the two new QTL hotspots, QTL51 and QTL55, were highly expressed in the early stage of fiber development and showed significant correlations with SC, LC and BW. The expression of the three candidate genes in two extreme materials after drought stress, and in the fibers of these two materials at 15, 20 and 25 DPA, was analyzed by qRT-PCR. The expression of these three candidate genes was significantly upregulated after drought stress and was significantly higher in drought-tolerant materials than in drought-sensitive materials. In addition, the expression levels of the three candidate genes were higher in the early stage of fiber development (15 DPA), and the expression levels in drought-tolerant germplasm were higher than those in drought-sensitive germplasm. These three candidate genes may play an important role in determining cotton yield under drought conditions.
Conclusions: This study is helpful for understanding the regulatory genes affecting cotton yield under drought conditions and provides germplasm and candidate gene resources for breeding high-yield cotton varieties under these conditions.
Toward Reinforcement-Learning-Based Service Deployment of 5G Mobile Edge Computing with Request-Aware Scheduling
5G wireless network technology will not only significantly increase bandwidth but also introduce new features such as mMTC and URLLC. However, high request latency will remain a challenging problem even with 5G, due to the massive number of requests generated by an increasing number of devices, which must travel long distances to reach services deployed in cloud centers. By pushing services closer to the edge of the network, edge computing is recognized as a promising technology for reducing latency. However, properly deploying services among resource-constrained edge servers is an unsolved problem. In this article, we propose a deep reinforcement learning approach to deploy services to the edge servers with consideration of the request patterns and resource constraints of users, which have not been adequately explored. First, the system model and optimization objectives are formulated and investigated. Then the problem is modeled as a Markov decision process and solved using the Dueling-Deep Q-network algorithm. The experimental results, based on the evaluation of real-life mobile wireless datasets, show that this reinforcement learning approach could be applied to patterns of requests and improve performance. This work is supported by the National Natural Science Foundation of China (Grant No. 61602037) and the Equipment Pre-Research Field Foundation (Grant No. 61400010104).
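The abstract names the Dueling Deep Q-network but gives no architectural details. As a minimal sketch of the general dueling idea only, the snippet below combines a scalar state value V(s) with per-action advantages A(s, a) into Q-values using the standard mean-subtracted aggregation, Q(s, a) = V(s) + A(s, a) - mean(A). Reading each action as "deploy the service on edge server i" is an assumption made for illustration, and the numbers are invented.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) with mean-subtracted advantages."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values):
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical outputs of the two network heads for one state:
# one state value and one advantage per candidate edge server.
q = dueling_q_values(2.0, [1.0, 3.0, 2.0])
best_server = greedy_action(q)
```

Subtracting the mean advantage makes the split between V and A identifiable; without it, adding a constant to V and subtracting it from every A would leave Q unchanged.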
Study on the Relationship between Electrical Tree Development and Partial Discharge of XLPE Cables
Based on the slice materials of 35 kV and 110 kV XLPE cables, an experimental platform is built to study the relationship between electrical trees and partial discharges (PDs) in XLPE at different voltage levels. There are three significant statistical characteristics of the PDs during the growth of electrical trees. The analysis of the results shows that each growth stage has certain characteristics. Different features existed between the growth of the electrical trees and the PD in the insulation of the 35 and 110 kV cables. Evident characteristics such as large spans of time and frequency were present as the electrical trees grew violently in the equivalent time-frequency diagram at every stage. These results could provide criteria for the identification of the deterioration using PD to monitor cables in service at rated voltages. The results are important for the identification of defects in cable insulation in order to provide an early warning of insulation breakdown in the cables.