
    On the Learning Property of Logistic and Softmax Losses for Deep Neural Networks

    Deep convolutional neural networks (CNNs) trained with logistic and softmax losses have made significant advances in visual recognition tasks in computer vision. When training data exhibit class imbalance, class-wise reweighted versions of the logistic and softmax losses are often used to boost performance over the unweighted versions. In this paper, motivated by the goal of explaining the reweighting mechanism, we explicate the learning properties of these two loss functions by analyzing the necessary condition (i.e., the gradient equals zero) after a CNN converges to a local minimum. The analysis immediately provides explanations for (1) the quantitative effects of the class-wise reweighting mechanism: deterministic effectiveness for binary classification with the logistic loss, yet indeterministic for multi-class classification with the softmax loss; and (2) the disadvantage of the logistic loss for single-label multi-class classification via the one-vs.-all approach, which is due to the averaging effect on predicted probabilities for the negative class (i.e., non-target classes) during learning. With the advantages and disadvantages of the logistic loss disentangled, we then propose a novel reweighted logistic loss for multi-class classification. Our simple yet effective formulation improves the ordinary logistic loss by focusing learning on hard non-target classes (target vs. non-target in one-vs.-all) and turns out to be competitive with the softmax loss. We evaluate our method on several benchmark datasets to demonstrate its effectiveness.
    Comment: AAAI 2020. Previously this appeared as arXiv:1906.04026v2, which was submitted as a replacement by accident.
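    The gradient-equals-zero analysis for the binary case can be illustrated with a small sketch: for a single shared logit fit to an imbalanced dataset, setting the gradient of the class-wise reweighted logistic loss to zero yields a closed-form optimal probability, so the effect of the weights is deterministic. The function names and the reduction to a single logit are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def weighted_logistic_loss(z, y, w_pos, w_neg):
    # Class-wise reweighted logistic loss for labels y in {0, 1}
    # sharing a single logit z (a deliberately simplified model).
    p = 1.0 / (1.0 + np.exp(-z))
    w = np.where(y == 1, w_pos, w_neg)
    return -np.mean(w * (y * np.log(p) + (1 - y) * np.log(1 - p)))

def stationary_probability(n_pos, n_neg, w_pos, w_neg):
    # Setting the gradient of the weighted loss w.r.t. the shared
    # logit to zero gives this closed-form optimum:
    #   w_pos * n_pos * (p - 1) + w_neg * n_neg * p = 0
    return (w_pos * n_pos) / (w_pos * n_pos + w_neg * n_neg)
```

    With 10 positives, 90 negatives, and a positive-class weight of 9, the stationary probability is exactly 0.5: reweighting by the inverse class ratio rebalances the converged prediction.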

    Learning Compact Features via In-Training Representation Alignment

    Deep neural networks (DNNs) for supervised learning can be viewed as a pipeline of a feature extractor (i.e., the last hidden layer) and a linear classifier (i.e., the output layer) that are trained jointly with stochastic gradient descent (SGD) on a loss function (e.g., cross-entropy). At each training step, the true gradient of the loss function is estimated using a mini-batch sampled from the training set, and the model parameters are then updated with the mini-batch gradient. Although the latter provides an unbiased estimate of the former, it is subject to substantial variance arising from the size and number of sampled mini-batches, leading to noisy and jumpy updates. To stabilize this undesirable variance in estimating the true gradients, we propose In-Training Representation Alignment (ITRA), which explicitly aligns the feature distributions of two different mini-batches with a matching loss during SGD training. We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning: (1) extracting compact feature representations; (2) reducing over-adaptation to mini-batches via an adaptive weighting mechanism; and (3) accommodating multi-modality. Finally, we conduct large-scale experiments on both image and text classification to demonstrate its superior performance over strong baselines.
    Comment: 11 pages, 4 figures, 6 tables. Accepted for publication by AAAI-23. arXiv admin note: text overlap with arXiv:2002.0991
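    The abstract does not pin down the matching loss; a squared maximum mean discrepancy (MMD) with an RBF kernel is a standard choice for aligning two mini-batch feature distributions, and this sketch assumes it (the function name and kernel bandwidth are illustrative):

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Squared maximum mean discrepancy between two feature batches
    # (rows = samples) under an RBF kernel: a distribution-matching
    # loss assumed here as a stand-in for the paper's matching loss.
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

# In ITRA-style training the matching loss would be added to the task
# loss, e.g. total = cross_entropy + lam * rbf_mmd2(feats_a, feats_b),
# where feats_a and feats_b come from two independently sampled
# mini-batches passed through the same feature extractor.
```

    Identical batches give a matching loss of zero, while a distribution shift between the two mini-batches is penalized.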

    GOODAT: Towards Test-time Graph Out-of-Distribution Detection

    Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains. While GNNs excel in scenarios where the test data share the distribution of the training data (in-distribution, ID), they often produce incorrect predictions when confronted with samples from an unfamiliar distribution (out-of-distribution, OOD). To identify and reject OOD samples with GNNs, recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN. Despite their effectiveness, these methods incur heavy training costs, as they need to optimize GNN-based models on the training data. Moreover, their reliance on modifying the original GNN and accessing the training data further restricts their universality. To this end, this paper introduces a method for detecting Graph Out-of-Distribution samples At Test-time (GOODAT), a data-centric, unsupervised, and plug-and-play solution that operates independently of the training data and of modifications to the GNN architecture. With a lightweight graph masker, GOODAT learns informative subgraphs from test samples, enabling the capture of distinct graph patterns between OOD and ID samples. To optimize the graph masker, we design three unsupervised objective functions based on the graph information bottleneck principle, motivating the masker to capture compact yet informative subgraphs for OOD detection. Comprehensive evaluations confirm that GOODAT outperforms state-of-the-art benchmarks across a variety of real-world datasets. The code is available on GitHub: https://github.com/Ee1s/GOODAT
    Comment: 9 pages, 5 figures
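    As a rough, self-contained sketch of the idea (not the paper's three objectives), a soft edge mask can be optimized so that the masked subgraph reproduces the frozen GNN's embedding while staying sparse, in the spirit of the graph information bottleneck. The one-layer GNN, the mask parameterization, and the objective weights below are all illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gnn_layer(A, X, W):
    # One sum-aggregation message-passing step, standing in for the
    # frozen, well-trained GNN that the method plugs into.
    return np.tanh(A @ X @ W)

def goodat_objective(mask_logits, A, X, W, beta=0.1):
    # Information-bottleneck-style objective (illustrative): keep the
    # masked subgraph's graph embedding close to the full graph's
    # embedding while penalizing the size of the kept edge set.
    M = sigmoid(mask_logits) * A               # soft mask on existing edges
    h_full = gnn_layer(A, X, W).mean(0)        # full-graph embedding
    h_sub = gnn_layer(M, X, W).mean(0)         # masked-subgraph embedding
    consistency = ((h_full - h_sub) ** 2).sum()
    sparsity = M.sum() / (A.sum() + 1e-9)      # fraction of edge mass kept
    return consistency + beta * sparsity
```

    At test time the mask logits would be optimized per sample; ID graphs tend to admit compact masks that preserve the embedding, while OOD graphs do not.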

    Phonon-assisted radiofrequency absorption by gold nanoparticles resulting in hyperthermia

    It is suggested that in gold nanoparticles (GNPs) of about 5 nm in size, as used in radiofrequency (RF) hyperthermia, absorption of an RF photon by a Fermi electron occurs with the involvement of a longitudinal acoustic vibrational mode (LAVM), the dominant mode in the vibrational density of states (VDOS). This physical mechanism helps explain two observed phenomena: the size dependence of the heating rate (HR) in GNPs and the reduced heat production in aggregated GNPs. The argument proceeds within the one-electron approximation, taking into account the discreteness of the energies and momenta of both electrons and LAVMs. The heating of GNPs is thought to consist of two consecutive processes: first, a Fermi electron simultaneously absorbs the RF photon and a LAVM available in the GNP; thereafter, the excited electron relaxes within the GNP's boundary, exciting a LAVM with energy higher than that of the previously absorbed LAVM. GNPs containing Ta and/or Fe impurities are proposed as promising heaters with enhanced HRs for RF hyperthermia, and GNPs with rare-earth impurity atoms are also brought into consideration. It is shown why the maximum HR values should be expected in GNPs of about 5-7 nm size.
    Comment: Proceedings of the NATO Advanced Research Workshop FANEM-2015 (Minsk, May 25-27, 2015). To be published in final form in "Fundamental and Applied NanoElectroMagnetics" (Springer Science + Business Media B.V.)

    Autonomous Overlapping Community Detection in Temporal Networks: A Dynamic Bayesian Nonnegative Matrix Factorization Approach.

    A wide variety of natural and artificial systems can be modeled as time-varying, or temporal, networks. To understand the structural and functional properties of these systems, it is desirable to detect and analyze their evolving community structure. In temporal networks, the identified communities should reflect the current snapshot network while remaining similar to the communities identified in the previous snapshot networks. Most existing approaches assume that the number of communities is known or can be obtained by heuristic methods; this is unsuitable and complicated for most real-world networks, especially temporal ones. In this paper, we propose a Bayesian probabilistic model, named Dynamic Bayesian Nonnegative Matrix Factorization (DBNMF), for automatic detection of overlapping communities in temporal networks. Our model not only gives the overlapping community structure based on the probabilistic memberships of nodes in each snapshot network but also automatically determines the number of communities in each snapshot via automatic relevance determination. A gradient descent algorithm is then proposed to optimize the objective function of the DBNMF model. Experimental results on both synthetic datasets and real-world temporal networks demonstrate that DBNMF has superior performance compared with two widely used methods, especially when the number of communities is unknown and the network is highly sparse.
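    A minimal sketch of the static building block, assuming a symmetric NMF A ≈ HHᵀ with damped multiplicative updates and an ARD-style penalty that shrinks irrelevant columns of H. DBNMF additionally couples consecutive snapshots through a Bayesian prior, which is omitted here; all names and constants are illustrative:

```python
import numpy as np

def symmetric_nmf_ard(A, k_max=6, iters=500, prune_tol=0.1, seed=0):
    # Factor a symmetric adjacency matrix A as A ~ H @ H.T with
    # nonnegative H, starting from k_max components. An ARD-style
    # per-component penalty (lam) drives irrelevant columns toward
    # zero; columns far weaker than the strongest are pruned, so the
    # number of communities is determined automatically.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k_max)) + 0.1
    eps = 1e-12
    for _ in range(iters):
        lam = 1.0 / ((H ** 2).sum(0) + eps)    # relevance per component
        num = A @ H
        den = H @ (H.T @ H) + H * lam + eps
        H *= 0.5 + 0.5 * num / den             # damped multiplicative update
    strengths = np.linalg.norm(H, axis=0)
    return H[:, strengths > prune_tol * strengths.max()]
```

    Rows of the returned H give soft (overlapping) community memberships; the surviving column count is the detected number of communities.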

    Adjuvant Chemotherapy Versus Adjuvant Concurrent Chemoradiotherapy After Radical Surgery for Early-Stage Cervical Cancer: A Randomized, Non-Inferiority, Multicenter Trial

    We conducted a prospective study to assess the non-inferiority of adjuvant chemotherapy alone versus adjuvant concurrent chemoradiotherapy (CCRT) as an alternative strategy for patients with early-stage (FIGO 2009 stage IB-IIA) cervical cancer who have risk factors after surgery. The comparison was assessed in terms of prognosis, adverse effects, and quality of life. This randomized trial involved nine centers across China. Eligible patients were randomized to receive adjuvant chemotherapy or CCRT after surgery. The primary end-point was progression-free survival (PFS). From December 2012 to December 2014, 337 patients underwent randomization. The final analysis included 329 patients: 165 in the adjuvant chemotherapy group and 164 in the adjuvant CCRT group. The median follow-up was 72.1 months. The three-year PFS rates were both 91.9%, and the five-year overall survival (OS) was 90.6% versus 90.0% in the adjuvant chemotherapy and CCRT groups, respectively. No significant differences were observed in PFS or OS between the groups. The adjusted hazard ratio (HR) for PFS was 0.854 (95% confidence interval 0.415-1.757; P = 0.667), favoring adjuvant chemotherapy and excluding the predefined non-inferiority boundary of 1.9. The chemotherapy group showed a trend toward better quality of life. In comparison with post-operative adjuvant CCRT, adjuvant chemotherapy showed non-inferior efficacy in patients with early-stage cervical cancer having pathological risk factors, and adjuvant chemotherapy alone is a favorable alternative post-operative treatment.
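    The non-inferiority conclusion follows from a simple criterion: the upper bound of the hazard-ratio confidence interval must fall below the predefined margin. A one-line check (the function name is illustrative):

```python
def non_inferiority_met(hr_ci_upper, margin):
    # Non-inferiority is concluded when the upper confidence-interval
    # bound for the hazard ratio excludes (lies below) the margin.
    return hr_ci_upper < margin

# Trial values: adjusted HR 0.854, 95% CI 0.415-1.757, margin 1.9.
print(non_inferiority_met(1.757, 1.9))  # True: 1.757 < 1.9
```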

    Genome-Wide Bovine H3K27me3 Modifications and the Regulatory Effects on Genes Expressions in Peripheral Blood Lymphocytes

    Gene expression in lymphocytes is influenced by histone methylation in mammals, and trimethylation of lysine 27 on histone H3 (H3K27me3) normally represses gene expression. Peripheral blood lymphocytes are the main source of somatic cells in the milk of dairy cows, and their numbers vary frequently in response to infection or injury of the mammary gland and to the number of parities. The genome-wide status of H3K27me3 modifications in the blood lymphocytes of lactating Holsteins was profiled via a ChIP-seq approach. Combined with the digital gene expression (DGE) technique, the regulatory effects of H3K27me3 on gene expression were analyzed. The ChIP-seq results showed that H3K27me3 peaks in cow lymphocytes were mainly enriched in the up20K (~50%), down20K (~30%), and intron (~28%) regions of genes; only ~3% of peaks were enriched in exon regions. Moreover, the highest H3K27me3 modification levels were mainly found within 2 kb upstream of transcriptional start sites (TSS). Through a conjoint analysis with the DGE data, we found that H3K27me3 marks tended to repress target gene expression throughout whole gene regions, acting especially on the promoter region. A total of 53 differentially expressed genes were detected in third-parity cows compared with first-parity cows; the 25 down-regulated genes (PSEN2 etc.) were negatively correlated with H3K27me3 levels in the up2Kb-to-up1Kb region of the genes, while the up-regulated genes did not show this relationship. This work generated the first blueprint of bovine H3K27me3 marks that mediate gene silencing. H3K27me3 plays its repressive role mainly in the regulatory regions of bovine lymphocytes. The up2Kb-to-up1Kb region of the down-regulated genes in third-parity cows could be a potential target of H3K27me3 regulation. Further studies are warranted to understand the regulatory mechanisms of H3K27me3 in somatic cell count increases and milk losses in later parities of cows.