    FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks

    Knowledge distillation (KD) has demonstrated its effectiveness in boosting the performance of graph neural networks (GNNs), where the goal is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is difficult in practice to train a satisfactory teacher GNN due to the well-known over-parameterization and over-smoothing issues, leading to invalid knowledge transfer in practical applications. In this paper, we propose the first Free-direction Knowledge Distillation framework via Reinforcement learning for GNNs, called FreeKD, which no longer requires a deeper, well-optimized teacher GNN. The core idea of our work is to collaboratively build two shallower GNNs that exchange knowledge with each other via reinforcement learning in a hierarchical way. Since we observe that a typical GNN model often performs better at some nodes and worse at others during training, we devise a dynamic, free-direction knowledge transfer strategy that consists of two levels of actions: 1) a node-level action determines the direction of knowledge transfer between the corresponding nodes of the two networks; and 2) a structure-level action determines which of the local structures generated by the node-level actions should be propagated. In essence, FreeKD is a general and principled framework that is naturally compatible with GNNs of different architectures. Extensive experiments on five benchmark datasets demonstrate that FreeKD outperforms the two base GNNs by a large margin and is effective across various GNNs. More surprisingly, FreeKD achieves comparable or even better performance than traditional KD algorithms that distill knowledge from a deeper and stronger teacher GNN. Comment: Accepted to KDD 202
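
    The abstract does not give implementation details, so the following is only a minimal sketch of the free-direction idea: two peer GNNs compare their per-node losses and distill, node by node, from whichever model currently does better on that node. The greedy direction choice and KL-based transfer below are illustrative assumptions; the actual FreeKD uses reinforcement learning with node-level and structure-level actions.

```python
# Hypothetical sketch of per-node, free-direction distillation between two peer
# models, loosely inspired by the FreeKD abstract. The transfer direction is
# chosen greedily from per-node cross-entropy as a stand-in for the learned
# node-level action; structure-level actions are omitted.
import torch
import torch.nn.functional as F

def free_direction_kd_loss(logits_a, logits_b, labels, temperature=2.0):
    """Distill per node from whichever model currently fits that node better.

    logits_a, logits_b: [num_nodes, num_classes] outputs of the two peer models.
    labels:             [num_nodes] ground-truth classes.
    """
    # Per-node supervised losses decide the transfer direction.
    ce_a = F.cross_entropy(logits_a, labels, reduction="none")
    ce_b = F.cross_entropy(logits_b, labels, reduction="none")
    a_teaches_b = (ce_a < ce_b).float()        # 1 where model A is the better "teacher"

    # Soft predictions at a distillation temperature.
    log_p_a = F.log_softmax(logits_a / temperature, dim=-1)
    log_p_b = F.log_softmax(logits_b / temperature, dim=-1)
    q_a = F.softmax(logits_a / temperature, dim=-1)
    q_b = F.softmax(logits_b / temperature, dim=-1)

    # KL(teacher || student) per node in both directions, masked so that only
    # the chosen direction contributes; teacher targets are detached.
    kl_b_from_a = F.kl_div(log_p_b, q_a.detach(), reduction="none").sum(-1)
    kl_a_from_b = F.kl_div(log_p_a, q_b.detach(), reduction="none").sum(-1)
    kd = a_teaches_b * kl_b_from_a + (1.0 - a_teaches_b) * kl_a_from_b
    return (temperature ** 2) * kd.mean()

# Toy usage with random logits for 5 nodes and 3 classes.
la = torch.randn(5, 3, requires_grad=True)
lb = torch.randn(5, 3, requires_grad=True)
y = torch.randint(0, 3, (5,))
loss = free_direction_kd_loss(la, lb, y)
loss.backward()
```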

    An Updated Meta-Analysis: Risk Conferred by Glutathione S-Transferases (GSTM1

    Purpose. To study the effects of glutathione S-transferase M1 (GSTM1) and T1 (GSTT1) polymorphisms on age-related cataract (ARC). Methods. After a systematic literature search, all relevant studies evaluating the association between GST polymorphisms and ARC were included. Results. Fifteen studies on GSTM1 and nine studies on GSTT1 were included in this meta-analysis. In the pooled analysis, a significant association between the GSTT1 null genotype and ARC was found (OR = 1.229, 95% CI = 1.057–1.429, and P=0.007). In subgroup analysis, the association between cortical cataract (CC) and the GSTM1 null genotype was statistically significant (OR = 0.713, 95% CI = 0.598–0.850, and P<0.001). In addition, the GSTM1 null genotype was significantly associated with ARC risk in individuals working indoors but not in individuals working outdoors. The association between the GSTT1 null genotype and the risk of ARC was statistically significant in Asians (OR = 1.442, 95% CI = 1.137–1.830, and P=0.003) but not in Caucasians. Conclusions. The GSTM1-positive genotype is associated with an increased risk of CC and loses the protective role in persons who work outdoors. Considering ethnic variation, the GSTT1 null genotype is found to be associated with an increased risk of ARC in Asians but not in Caucasians.
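
    For readers unfamiliar with how pooled odds ratios such as those above are obtained, the sketch below shows a fixed-effect (inverse-variance) pooling of per-study 2x2 tables. The counts are made-up placeholders, not data from the included studies, and published meta-analyses often use Mantel-Haenszel or random-effects models instead.

```python
# Illustrative fixed-effect (inverse-variance) pooling of study-level odds
# ratios. The 2x2 counts are hypothetical placeholders.
import math

def log_or_and_se(a, b, c, d):
    """Log odds ratio and its standard error from a 2x2 table
    (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls)."""
    lor = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return lor, se

def pooled_or(tables):
    """Inverse-variance weighted pooled OR with a 95% confidence interval."""
    lors, weights = [], []
    for tab in tables:
        lor, se = log_or_and_se(*tab)
        lors.append(lor)
        weights.append(1 / se ** 2)
    pooled = sum(w * l for w, l in zip(weights, lors)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    lo, hi = pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled
    return math.exp(pooled), (math.exp(lo), math.exp(hi))

# Hypothetical studies: (exposed cases, exposed controls, unexposed cases, unexposed controls).
studies = [(120, 90, 200, 210), (60, 45, 95, 100), (80, 70, 150, 160)]
or_hat, ci = pooled_or(studies)
print(f"pooled OR = {or_hat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```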

    Integrating Relation Constraints with Neural Relation Extractors

    Recent years have seen rapid progress in identifying predefined relationships between entity pairs using neural networks (NNs). However, such models often make predictions for each entity pair individually and thus often fail to resolve inconsistencies among different predictions, which can be characterized by discrete relation constraints. These constraints are often defined over combinations of entity-relation-entity triples, since relations often lack explicitly well-defined type and cardinality requirements. In this paper, we propose a unified framework to integrate relation constraints with NNs by introducing a new loss term, ConstraintLoss. In particular, we develop two efficient methods to capture how well the local predictions from multiple instance pairs satisfy the relation constraints. Experiments on both English and Chinese datasets show that our approach can help NNs learn from discrete relation constraints to reduce inconsistency among local predictions, and outperforms popular neural relation extraction (NRE) models even when they are enhanced with extra post-processing. Our source code and datasets will be released at https://github.com/PKUYeYuan/Constraint-Loss-AAAI-2020. Comment: Accepted to AAAI-2020
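
    As a rough illustration of how discrete constraints can enter a loss function, the sketch below penalizes predictions that assign high probability to two mutually exclusive relations for the same entity pair. The product-of-probabilities penalty and the exclusivity constraint are assumptions for illustration, not necessarily the ConstraintLoss formulation from the paper.

```python
# Hedged sketch of a differentiable constraint penalty added to the usual
# relation-extraction loss. The specific penalty is an illustrative assumption.
import torch

def exclusion_constraint_loss(rel_probs, exclusive_pairs):
    """Penalize assigning high probability to mutually exclusive relations.

    rel_probs:       [batch, num_relations] softmax outputs of an NRE model.
    exclusive_pairs: list of (r1, r2) relation indices that cannot co-occur
                     for the same entity pair.
    """
    penalty = rel_probs.new_zeros(())
    for r1, r2 in exclusive_pairs:
        # Soft violation degree: both relations should not be likely at once.
        penalty = penalty + (rel_probs[:, r1] * rel_probs[:, r2]).mean()
    return penalty

# Toy usage: 4 entity pairs, 6 candidate relations, relations 2 and 5 exclusive.
probs = torch.softmax(torch.randn(4, 6), dim=-1)
base_loss = torch.tensor(0.7)   # stand-in for the usual supervised NRE loss
total_loss = base_loss + 0.1 * exclusion_constraint_loss(probs, [(2, 5)])
```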

    Learning to Generate Parameters of ConvNets for Unseen Image Data

    Typical Convolutional Neural Networks (ConvNets) depend heavily on large amounts of image data and resort to an iterative optimization algorithm (e.g., SGD or Adam) to learn network parameters, which makes training very time- and resource-intensive. In this paper, we propose a new training paradigm and formulate the parameter learning of ConvNets as a prediction task: given a ConvNet architecture, we observe that correlations exist between image datasets and their corresponding optimal network parameters, and we explore whether we can learn a hyper-mapping between them to capture these relations, such that we can directly predict the parameters of the network for an image dataset never seen during the training phase. To this end, we put forward a new hypernetwork-based model, called PudNet, which learns a mapping between datasets and their corresponding network parameters and then predicts parameters for unseen data with only a single forward propagation. Moreover, our model benefits from a series of adaptive hyper recurrent units that share weights to capture the dependencies of parameters among different network layers. Extensive experiments demonstrate that our proposed method achieves good efficacy for unseen image datasets in two settings: intra-dataset prediction and inter-dataset prediction. Our PudNet also scales well to large datasets, e.g., ImageNet-1K. It takes 8,967 GPU seconds to train ResNet-18 on ImageNet-1K using GC from scratch and obtain a top-5 accuracy of 44.65%. In contrast, our PudNet needs only 3.89 GPU seconds to predict the network parameters of ResNet-18 while achieving comparable performance (44.92%), more than 2,300 times faster than the traditional training paradigm.
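
    The sketch below illustrates the general hypernetwork idea behind this abstract: a dataset summary is mapped to the weights of a target convolution in a single forward pass. The layer sizes, the mean-pooled dataset encoding, and the single predicted layer are arbitrary assumptions, not PudNet's actual architecture.

```python
# Minimal, hypothetical hypernetwork: encode a batch of images into a dataset
# context vector, then predict the parameters of one 3x3 conv layer from it.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyHyperNet(nn.Module):
    def __init__(self, ctx_dim=64, out_channels=8):
        super().__init__()
        self.out_channels = out_channels
        # Encode flattened 3x32x32 images into a fixed-size context.
        self.encoder = nn.Sequential(nn.Linear(3 * 32 * 32, ctx_dim), nn.ReLU())
        # Predict weight (out_channels x 3 x 3 x 3) and bias (out_channels).
        n_params = out_channels * 27 + out_channels
        self.head = nn.Linear(ctx_dim, n_params)

    def forward(self, images):
        ctx = self.encoder(images.flatten(1)).mean(0)   # dataset-level summary
        params = self.head(ctx)
        w = params[: self.out_channels * 27].view(self.out_channels, 3, 3, 3)
        b = params[self.out_channels * 27:]
        return w, b

# Predict conv parameters for an "unseen" toy dataset and run the target conv.
hyper = TinyHyperNet()
batch = torch.randn(16, 3, 32, 32)
weight, bias = hyper(batch)
features = F.conv2d(batch, weight, bias, padding=1)     # [16, 8, 32, 32]
```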

    Cooperative Caching with Content Popularity Prediction for Mobile Edge Caching

    Mobile Edge Caching (MEC) can be exploited to reduce redundant data transmissions and improve content delivery performance in mobile networks. However, under the MEC architecture, dynamic user preferences challenge delivery efficiency because of the imperfect match between users' demands and the cached content. In this paper, we propose a learning-based cooperative content caching policy that predicts content popularity and caches the desired content proactively. We formulate the optimal cooperative content caching problem as a 0-1 integer program that minimizes the average downloading latency. After learning content popularity with an artificial neural network, we use a greedy algorithm to obtain an approximate solution. Numerical results validate that the proposed policy can significantly increase the content cache hit rate and reduce content delivery latency compared with popular caching strategies.
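
    To make the greedy step concrete, the sketch below places contents at edge caches one at a time, always choosing the placement with the largest marginal reduction in expected downloading latency, given popularity scores that a neural predictor would supply. The latency values, capacities, and popularity numbers are illustrative assumptions, not the paper's system model.

```python
# Hypothetical greedy placement for cooperative edge caching. Popularity scores
# stand in for the output of a learned popularity predictor.
LOCAL, NEIGHBOR, ORIGIN = 1.0, 3.0, 10.0   # assumed per-request latencies (ms)

def greedy_cache(popularity, num_servers, capacity):
    """popularity[c] = predicted request rate of content c (shared by all servers)."""
    caches = [set() for _ in range(num_servers)]

    def latency(server, content):
        if content in caches[server]:
            return LOCAL
        if any(content in other for other in caches):
            return NEIGHBOR                 # fetched from a cooperating neighbor
        return ORIGIN                       # fetched from the origin server

    def total_latency():
        return sum(popularity[c] * latency(s, c)
                   for s in range(num_servers) for c in popularity)

    while True:
        best_gain, best_move = 0.0, None
        before = total_latency()
        for s in range(num_servers):
            if len(caches[s]) >= capacity:
                continue
            for c in popularity:
                if c in caches[s]:
                    continue
                caches[s].add(c)            # tentative placement
                gain = before - total_latency()
                caches[s].remove(c)
                if gain > best_gain:
                    best_gain, best_move = gain, (s, c)
        if best_move is None:               # no placement improves latency
            return caches
        caches[best_move[0]].add(best_move[1])

# Toy run: 3 edge servers, capacity 2, five contents with predicted popularity.
pop = {"a": 0.4, "b": 0.25, "c": 0.15, "d": 0.12, "e": 0.08}
print(greedy_cache(pop, num_servers=3, capacity=2))
```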

    Numerical Simulation Based Targeting of the Magushan Skarn Cu-Mo Deposit, Middle-Lower Yangtze Metallogenic Belt, China

    The Magushan Cu–Mo deposit is a skarn deposit within the Nanling–Xuancheng mining district of the Middle-Lower Yangtze River Metallogenic Belt (MLYRMB), China. This study presents the results of a new numerical simulation that models the ore-forming processes that generated the Magushan deposit and enables the identification of unexplored zones with significant exploration potential beneath thick sedimentary sequences that cannot easily be explored using traditional methods. The study outlines the practical value of numerical simulation in determining the processes that operate during mineral deposit formation and shows how this knowledge can be used to enhance exploration targeting in areas of known mineralization. Our simulation couples multiple processes, including heat transfer, pressure, fluid flow, chemical reactions, and material migration, and it models the formation and distribution of garnet, a gangue mineral commonly found within skarn deposits (including the Magushan deposit). The modeled distribution of garnet matches the distribution of known mineralization and delineates areas that may contain high garnet abundances within and around a concealed intrusion, indicating that this area should be considered a prospective target during future mineral exploration. Overall, our study indicates that this type of numerical simulation-based approach to prospectivity modeling is both effective and economical and should be considered an additional tool for future mineral exploration, reducing exploration risks when targeting mineralization in areas with thick and unprospective sedimentary cover sequences.
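
    As a toy illustration of one ingredient of such multiphysics simulations, the sketch below advances a 2D temperature field around a hot intrusion with an explicit finite-difference heat-diffusion step. The grid size, diffusivity, and temperatures are arbitrary teaching values, not parameters from the Magushan model.

```python
# Explicit finite-difference heat diffusion around a hot intrusion on a 2D grid.
# All values are illustrative assumptions.
import numpy as np

def diffuse_heat(temp, kappa=1e-6, dx=50.0, dt=5e8, steps=500):
    """Advance a 2D temperature field with the explicit heat equation
    dT/dt = kappa * (d2T/dx2 + d2T/dy2), fixed-temperature boundaries."""
    t = temp.copy()
    r = kappa * dt / dx ** 2     # 0.2 here; must stay below 0.25 for stability
    for _ in range(steps):
        t[1:-1, 1:-1] += r * (t[2:, 1:-1] + t[:-2, 1:-1] +
                              t[1:-1, 2:] + t[1:-1, :-2] - 4 * t[1:-1, 1:-1])
    return t

# Country rock at 200 C with a hot intrusion (700 C) in the centre of the grid.
field = np.full((80, 80), 200.0)
field[35:45, 35:45] = 700.0
print(diffuse_heat(field)[40, 40])   # centre temperature after cooling by diffusion
```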