FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks
Knowledge distillation (KD) has demonstrated its effectiveness in boosting the
performance of graph neural networks (GNNs), where the goal is to distill
knowledge from a deeper teacher GNN into a shallower student GNN. However, it
is difficult in practice to train a satisfactory teacher GNN due to the
well-known over-parameterization and over-smoothing issues, leading to invalid
knowledge transfer in practical applications. In this paper, we propose the
first Free-direction Knowledge Distillation framework via Reinforcement
learning for GNNs, called FreeKD, which no longer requires a deeper,
well-optimized teacher GNN. The core idea of our work is to collaboratively
build two shallower GNNs that exchange knowledge with each other via
reinforcement learning in a hierarchical way. Observing that a typical GNN
model often performs better at some nodes and worse at others during training,
we devise a dynamic, free-direction knowledge transfer strategy
that consists of two levels of actions: 1) node-level action determines the
directions of knowledge transfer between the corresponding nodes of two
networks; and then 2) structure-level action determines which of the local
structures generated by the node-level actions should be propagated. In essence,
our FreeKD is a general and principled framework that is naturally compatible
with GNNs of different architectures. Extensive experiments on five benchmark
datasets demonstrate that FreeKD outperforms the two base GNNs by a large
margin and is effective across various GNN architectures. More surprisingly,
our FreeKD achieves comparable or even better performance than traditional KD
algorithms that distill knowledge from a deeper and stronger teacher GNN.
Comment: Accepted to KDD 202
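As a rough illustration of the node-level direction choice described above, the sketch below replaces the paper's reinforcement-learning agent with a simple hypothetical heuristic: for each node, whichever network currently has the lower local loss acts as the teacher, and a KL-divergence term transfers its soft predictions to the other network. All names and the heuristic itself are assumptions for illustration, not the authors' implementation, and the structure-level action is omitted.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one node's class logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    # KL divergence from teacher distribution p to student distribution q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def node_level_directions(loss_a, loss_b):
    # Hypothetical stand-in for the RL node-level action: for each node, the
    # network with the lower local loss acts as the teacher for that node.
    return ['a->b' if la < lb else 'b->a' for la, lb in zip(loss_a, loss_b)]

def distill_loss(logits_a, logits_b, directions):
    # Accumulate per-node KL terms in the chosen transfer direction.
    total = 0.0
    for la, lb, d in zip(logits_a, logits_b, directions):
        pa, pb = softmax(la), softmax(lb)
        total += kl(pa, pb) if d == 'a->b' else kl(pb, pa)
    return total / len(directions)
```

In the actual framework the directions and the propagated local structures are selected by a learned agent; the loss-comparison rule here is only a readable proxy for that decision.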
An Updated Meta-Analysis: Risk Conferred by Glutathione S-Transferases (GSTM1
Purpose. To study the effects of glutathione S-transferase M1 (GSTM1) and T1 (GSTT1) polymorphisms on age-related cataract (ARC). Methods. After a systematic literature search, all relevant studies evaluating the association between GST polymorphisms and ARC were included. Results. Fifteen studies on GSTM1 and nine studies on GSTT1 were included in this meta-analysis. In the pooled analysis, a significant association between the null genotype of GSTT1 and ARC was found (OR = 1.229, 95% CI = 1.057–1.429, and P=0.007). In subgroup analysis, the association between cortical cataract (CC) and the GSTM1 null genotype was statistically significant (OR = 0.713, 95% CI = 0.598–0.850, and P<0.001). In addition, the GSTM1 null genotype was significantly associated with ARC risk in individuals working indoors but not in individuals working outdoors. The association between the GSTT1 null genotype and risk of ARC was statistically significant in Asians (OR = 1.442, 95% CI = 1.137–1.830, and P=0.003) but not in Caucasians. Conclusions. The GSTM1 positive genotype is associated with increased risk of CC and loses its protective role in persons who work outdoors. Considering the ethnic variation, the GSTT1 null genotype is associated with increased risk of ARC in Asians but not in Caucasians.
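The pooled odds ratios above come from standard 2×2-table meta-analysis arithmetic. As a sketch of that calculation, assuming hypothetical study counts (not the paper's data), a fixed-effect inverse-variance pooling on the log-OR scale looks like:

```python
import math

def odds_ratio_ci(a, b, c, d):
    # 2x2 table: a = exposed cases, b = exposed controls,
    #            c = unexposed cases, d = unexposed controls.
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log(OR) (Woolf's method)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, lo, hi

def pooled_or(studies):
    # Fixed-effect pooling: weight each study's log-OR by its inverse variance.
    num = den = 0.0
    for a, b, c, d in studies:
        log_or = math.log((a * d) / (b * c))
        w = 1.0 / (1/a + 1/b + 1/c + 1/d)
        num += w * log_or
        den += w
    return math.exp(num / den)
```

Real meta-analyses would also test for heterogeneity and possibly use a random-effects model; this sketch shows only the fixed-effect core.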
Integrating Relation Constraints with Neural Relation Extractors
Recent years have seen rapid progress in identifying predefined relationships
between entity pairs using neural networks (NNs). However, such models often
make predictions for each entity pair individually and thus often fail to
resolve inconsistencies among different predictions, which can be characterized
by discrete relation constraints. These constraints are often defined over
combinations of entity-relation-entity triples, since there is often a lack of
explicitly well-defined type and cardinality requirements for the relations. In
this paper, we propose a unified framework to integrate relation constraints
with NNs by introducing a new loss term, ConstraintLoss. Particularly, we
develop two efficient methods to capture how well the local predictions from
multiple instance pairs satisfy the relation constraints. Experiments on both
English and Chinese datasets show that our approach can help NNs learn from
discrete relation constraints to reduce inconsistency among local predictions,
and outperform popular neural relation extraction (NRE) models even when
enhanced with extra post-processing. Our source code and datasets will be
released at https://github.com/PKUYeYuan/Constraint-Loss-AAAI-2020.
Comment: Accepted to AAAI-202
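To make the idea of a constraint-derived loss term concrete, here is a minimal hypothetical sketch (not the paper's ConstraintLoss): given soft relation predictions for entity pairs, it penalizes probability mass assigned to relation combinations that a discrete constraint forbids on pairs sharing an entity.

```python
def constraint_loss(pred_probs, pairs, forbidden):
    # pred_probs[i][r]: predicted probability that entity pair i holds relation r.
    # pairs[i]: the (head, tail) entities of pair i.
    # forbidden: set of (r1, r2) relation index combinations that must not
    #            co-occur on entity pairs sharing an entity.
    loss = 0.0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if set(pairs[i]) & set(pairs[j]):  # the two pairs share an entity
                for r1, r2 in forbidden:
                    # Soft violation: product of the two forbidden probabilities.
                    loss += pred_probs[i][r1] * pred_probs[j][r2]
    return loss
```

Added to the usual supervised loss, a term like this pushes the model's local predictions toward joint consistency during training, rather than relying on post-processing.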
Learning to Generate Parameters of ConvNets for Unseen Image Data
Typical Convolutional Neural Networks (ConvNets) depend heavily on large
amounts of image data and resort to an iterative optimization algorithm (e.g.,
SGD or Adam) to learn network parameters, which makes training very time- and
resource-intensive. In this paper, we propose a new training paradigm and
formulate the parameter learning of ConvNets as a prediction task: given a
ConvNet architecture, we observe that there exist correlations between image
datasets and their corresponding optimal network parameters, and explore
whether we can learn a hyper-mapping between them to capture these relations,
such that we
can directly predict the parameters of the network for an image dataset never
seen during the training phase. To do this, we put forward a new
hypernetwork-based model, called PudNet, which aims to learn a mapping between
datasets
and their corresponding network parameters, and then predicts parameters for
unseen data with only a single forward propagation. Moreover, our model
benefits from a series of adaptive hyper recurrent units sharing weights to
capture the dependencies of parameters among different network layers.
Extensive experiments demonstrate that our proposed method performs well on
unseen image datasets in two settings: intra-dataset prediction and
inter-dataset prediction. Our PudNet also scales well to
large-scale datasets, e.g., ImageNet-1K. Training ResNet-18 from scratch on
ImageNet-1K using GC takes 8,967 GPU seconds and yields a top-5 accuracy of
44.65%. In contrast, our PudNet needs only 3.89 GPU seconds to predict the
network parameters of ResNet-18 while achieving comparable performance
(44.92%), more than 2,300 times faster than the traditional training paradigm.
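The core mechanism can be sketched as follows. This is a deliberately tiny, hypothetical stand-in for PudNet: the "dataset encoder" is just pixel statistics and the hyper-mapping is a single linear layer, whereas the paper uses adaptive hyper recurrent units; only the shape of the idea (dataset embedding in, target-network parameters out, one forward pass) is preserved.

```python
def dataset_embedding(images):
    # Tiny stand-in for a learned dataset encoder: summarize a dataset of
    # flattened images by the mean and standard deviation of its pixels.
    flat = [p for img in images for p in img]
    mean = sum(flat) / len(flat)
    var = sum((p - mean) ** 2 for p in flat) / len(flat)
    return [mean, var ** 0.5]

def hypernet_forward(embedding, W, b):
    # One forward pass of a linear hyper-mapping: predict the target network's
    # parameter vector directly from the dataset embedding (no SGD loop).
    return [sum(w * e for w, e in zip(row, embedding)) + bi
            for row, bi in zip(W, b)]
```

Training such a hypernetwork would fit `W` and `b` on many (dataset, optimal-parameters) pairs; at inference time, predicting parameters for an unseen dataset costs only this single forward pass, which is what makes the GPU-seconds comparison above possible.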
Cooperative Caching with Content Popularity Prediction for Mobile Edge Caching
Mobile Edge Caching (MEC) can be exploited for reducing redundant data transmissions and improving content delivery performance in mobile networks. However, under the MEC architecture, dynamic user preference challenges delivery efficiency due to the imperfect match between users' demands and cached content. In this paper, we propose a learning-based cooperative content caching policy to predict content popularity and cache the desired content proactively. We formulate the optimal cooperative content caching problem as a 0-1 integer program that minimizes the average downloading latency. After using an artificial neural network to learn content popularity, we use a greedy algorithm to obtain an approximate solution. Numerical results validate that the proposed policy can significantly increase the content cache hit rate and reduce content delivery latency compared with popular caching strategies.
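A greedy approximation to a 0-1 caching program of this kind typically ranks contents by predicted value per unit of cache space. The sketch below is a hedged illustration under that assumption (the paper's exact objective and algorithm may differ): popularity scores stand in for the neural network's predictions, and items are cached in order of popularity density until capacity runs out.

```python
def greedy_cache(popularity, sizes, capacity):
    # Greedy approximation to the 0-1 integer program: cache contents in
    # decreasing order of predicted popularity per unit size until the
    # edge cache is full. Returns the sorted indices of cached contents.
    order = sorted(range(len(popularity)),
                   key=lambda i: popularity[i] / sizes[i], reverse=True)
    cached, used = [], 0
    for i in order:
        if used + sizes[i] <= capacity:
            cached.append(i)
            used += sizes[i]
    return sorted(cached)
```

This density-greedy rule is the standard heuristic for knapsack-like cache placement; a cooperative MEC variant would additionally account for content already cached at neighboring edge nodes.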
Numerical Simulation Based Targeting of the Magushan Skarn Cu-Mo Deposit, Middle-Lower Yangtze Metallogenic Belt, China
The Magushan Cu–Mo deposit is a skarn deposit within the Nanling–Xuancheng mining district of the Middle-Lower Yangtze River Metallogenic Belt (MLYRMB), China. This study presents the results of a new numerical simulation that models the ore-forming processes that generated the Magushan deposit and enables the identification of unexplored areas with significant exploration potential beneath thick sedimentary sequences that cannot be easily explored using traditional methods. This study outlines the practical value of numerical simulation in determining the processes that operate during mineral deposit formation and how this knowledge can be used to enhance exploration targeting in areas of known mineralization. Our simulation also links multiple subdisciplines such as heat transfer, pressure, fluid flow, chemical reactions, and material migration. It allows the modeling of the formation and distribution of garnet, a gangue mineral commonly found within skarn deposits (including within the Magushan deposit). The modeled distribution of garnet matches the distribution of known mineralization and also delineates areas that may well contain high garnet abundances within and around a concealed intrusion, indicating this area should be considered a prospective target during future mineral exploration. Overall, our study indicates that this type of numerical simulation-based approach to prospectivity modeling is both effective and economical and should be considered an additional tool for future mineral exploration to reduce exploration risks when targeting mineralization in areas with thick and unprospective sedimentary cover sequences.