47 research outputs found

On the Learning Property of Logistic and Softmax Losses for Deep Neural Networks

Author: Li Xiangrui
Li Xin
Pan Deng
Zhu Dongxiao
Publication date: 04/03/2020

Deep convolutional neural networks (CNNs) trained with logistic and softmax losses have made significant advances in visual recognition tasks in computer vision. When training data exhibit class imbalance, class-wise reweighted versions of the logistic and softmax losses are often used to improve on the performance of the unweighted versions. In this paper, motivated to explain the reweighting mechanism, we explicate the learning property of those two loss functions by analyzing the necessary condition (e.g., the gradient equals zero) after training CNNs to converge to a local minimum. The analysis immediately provides explanations for (1) the quantitative effects of the class-wise reweighting mechanism: deterministic effectiveness for binary classification using logistic loss, yet indeterministic for multi-class classification using softmax loss; and (2) the disadvantage of logistic loss for single-label multi-class classification via the one-vs.-all approach, which is due to the averaging effect on predicted probabilities for the negative class (e.g., non-target classes) in the learning process. With the disadvantage and advantage of logistic loss disentangled, we then propose a novel reweighted logistic loss for multi-class classification. Our simple yet effective formulation improves ordinary logistic loss by focusing on learning hard non-target classes (target vs. non-target class in one-vs.-all) and turns out to be competitive with softmax loss. We evaluate our method on several benchmark datasets to demonstrate its effectiveness.

Comment: AAAI2020. Previously this appeared as arXiv:1906.04026v2, which was submitted as a replacement by acciden

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
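
The stationarity argument in the abstract above can be illustrated with a toy case. The sketch below is not the paper's analysis: it assumes a hypothetical bias-only binary classifier (every sample gets the same logit `b`), with illustrative names `w_pos` and `w_neg` for the class weights. Under that assumption, setting the gradient of the class-weighted logistic loss to zero gives a predicted positive probability of `w_pos * n_pos / (w_pos * n_pos + w_neg * n_neg)`, which is one simple way to see how reweighting shifts predictions in the binary case.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_grad(b, n_pos, n_neg, w_pos, w_neg):
    """Mean gradient, w.r.t. the shared logit b, of the class-weighted logistic loss.

    Toy assumption: all samples share the same logit b (bias-only model).
    """
    p = sigmoid(b)
    g = -w_pos * n_pos * (1.0 - p) + w_neg * n_neg * p
    return g / (n_pos + n_neg)


def fit_bias_only(n_pos, n_neg, w_pos=1.0, w_neg=1.0, lr=0.5, steps=5000):
    """Run plain gradient descent on b and return the predicted positive probability."""
    b = 0.0
    for _ in range(steps):
        b -= lr * mean_grad(b, n_pos, n_neg, w_pos, w_neg)
    return sigmoid(b)


def closed_form(n_pos, n_neg, w_pos=1.0, w_neg=1.0):
    """Probability at which the gradient above is exactly zero."""
    return w_pos * n_pos / (w_pos * n_pos + w_neg * n_neg)


if __name__ == "__main__":
    # 10 positives vs. 90 negatives, first unweighted, then with positives upweighted.
    print(fit_bias_only(10, 90), closed_form(10, 90))
    print(fit_bias_only(10, 90, w_pos=3.0), closed_form(10, 90, w_pos=3.0))
```

In this toy setting, upweighting the minority class moves the stationary prediction by a predictable amount. The multi-class softmax case discussed in the abstract is not covered by this sketch.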

Learning Compact Features via In-Training Representation Alignment

Author: Li Xiangrui
Li Xin
Pan Deng
Qiang Yao
Zhu Dongxiao
Publication date: 23/11/2022

Deep neural networks (DNNs) for supervised learning can be viewed as a pipeline of a feature extractor (i.e., last hidden layer) and a linear classifier (i.e., output layer) that are trained jointly with stochastic gradient descent (SGD) on the loss function (e.g., cross-entropy). In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set, and model parameters are then updated with the mini-batch gradients. Although the mini-batch gradients provide an unbiased estimate of the true gradient, they are subject to substantial variance arising from the size and number of sampled mini-batches, leading to noisy and jumpy updates. To reduce this undesirable variance in estimating the true gradients, we propose In-Training Representation Alignment (ITRA), which explicitly aligns the feature distributions of two different mini-batches with a matching loss in the SGD training process. We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning: (1) extracting compact feature representations; (2) reducing over-adaptation on mini-batches via an adaptive weighting mechanism; and (3) accommodating multi-modalities. Finally, we conduct large-scale experiments on both image and text classification to demonstrate its superior performance over strong baselines.

Comment: 11 pages, 4 figures, 6 tables. Accepted for publication by AAAI-23. arXiv admin note: text overlap with arXiv:2002.0991

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
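
The idea of adding a matching loss between the features of two mini-batches, as described in the abstract above, can be sketched as follows. The abstract does not say which matching loss is used, so this sketch assumes a squared maximum mean discrepancy (MMD) with a Gaussian kernel. The function names, the bandwidth `sigma`, and the trade-off weight `lam` are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def rbf(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two batches of feature vectors."""
    kxx = sum(rbf(a, b, sigma) for a in xs for b in xs) / (len(xs) ** 2)
    kyy = sum(rbf(a, b, sigma) for a in ys for b in ys) / (len(ys) ** 2)
    kxy = sum(rbf(a, b, sigma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


def total_loss(ce_loss, feats_a, feats_b, lam=0.1):
    """Classification loss plus a weighted matching term (lam is an assumed weight)."""
    return ce_loss + lam * mmd2(feats_a, feats_b)


if __name__ == "__main__":
    batch_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
    batch_near = [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]
    batch_far = [[3.0, 3.1], [3.2, 3.0], [3.1, 3.1]]
    print(mmd2(batch_a, batch_near), mmd2(batch_a, batch_far))
    print(total_loss(0.7, batch_a, batch_near))
```

In an actual training loop the features would come from the network's last hidden layer and the matching term would be differentiated along with the cross-entropy loss. This sketch only computes the scalar value.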

GOODAT: Towards Test-time Graph Out-of-Distribution Detection

Author: Chua Tat-Seng
He Dongxiao
Jin Di
Liu Yixin
Pan Shirui
Wang Luzhi
Wang Wenjie
Zhang He
Publication date: 10/01/2024

Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains. While GNNs excel in scenarios where the testing data shares the distribution of the training data (in-distribution, ID), they often make incorrect predictions when confronted with samples from an unfamiliar distribution (out-of-distribution, OOD). To identify and reject OOD samples with GNNs, recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN. Despite their effectiveness, these methods incur heavy training costs, as they need to optimize the GNN-based models on training data. Moreover, their reliance on modifying the original GNNs and accessing training data further restricts their universality. To this end, this paper introduces a method to detect Graph Out-of-Distribution At Test-time (namely GOODAT), a data-centric, unsupervised, and plug-and-play solution that operates independently of training data and modifications of the GNN architecture. With a lightweight graph masker, GOODAT can learn informative subgraphs from test samples, enabling the capture of distinct graph patterns between OOD and ID samples. To optimize the graph masker, we design three unsupervised objective functions based on the graph information bottleneck principle, motivating the masker to capture compact yet informative subgraphs for OOD detection. Comprehensive evaluations confirm that our GOODAT method outperforms state-of-the-art benchmarks across a variety of real-world datasets. The code is available on GitHub: https://github.com/Ee1s/GOODAT

Comment: 9 pages, 5 figure

arXiv.org e-Print Archive

Phonon-assisted radiofrequency absorption by gold nanoparticles resulting in hyperthermia

Author: A Leifert
AB Chinen
AF Radovic-Moreno
Andrea Dal Corso
AV Postnikov
BH San
CA Wert
CH Moran
DE Kruse
Dmitry A. Nedosekin
Dongxiao Li
Dongxiao Li
DY Sun
E Araya
Ekaterina I. Galanzha
FM Kouri
G. W. Hanson
George W. Hanson
GR Stewart
HK Kim
J Conde
J Sun
JA Muñoz
JI Cutler
K Huang
M Hembury
M Raoof
M Raoof
MA Sirotkina
NW Ashcroft
P-C Tsai
Pratik S. Randeria
PT Coleridge
R Kubo
R Kubo
RN Singh
RR Letfullin
S. A. Jensen
Stuart J. Corr
Stuart J. Corr
X Liu
Y Pan
Yu Pan
Publication venue: Springer Science and Business Media LLC
Publication date: 04/08/2015

It is suggested that in gold nanoparticles (GNPs) of about 5 nm in size used in radiofrequency (RF) hyperthermia, absorption of the RF photon by the Fermi electron occurs with the involvement of the longitudinal acoustic vibrational mode (LAVM), the dominant mode in the distribution of the vibrational density of states (VDOS). This physical mechanism helps to explain two observed phenomena: the size dependence of the heating rate (HR) in GNPs and the reduced heat production in aggregated GNPs. The argument proceeds within the one-electron approximation, taking into account the discreteness of the energies and momenta of both electrons and LAVMs. The heating of GNPs is thought to consist of two consecutive processes: first, the Fermi electron simultaneously absorbs the RF photon and the LAVM available in the GNP; thereafter, the excited electron relaxes within the GNP's boundary, exciting an LAVM with an energy higher than that of the previously absorbed LAVM. GNPs containing Ta and/or Fe impurities are proposed for RF hyperthermia as promising heaters with enhanced HRs, and GNPs with rare-earth impurity atoms are also considered. It is shown why the maximum HR values should be expected in GNPs of about 5-7 nm in size.

Comment: proceedings at the NATO Advanced Research workshop FANEM-2015 (Minsk, May 25-27, 2015). To be published in the final form in: "Fundamental and Applied NanoElectroMagnetics" (Springer Science + Business Media B.V.

arXiv.org e-Print Archive

Crossref

Autonomous Overlapping Community Detection in Temporal Networks: A Dynamic Bayesian Nonnegative Matrix Factorization Approach.

Author: Ahmed
Ball
Bogdan Gabrys
Chakrabarti
Chi
Danon
Derényi
Di Jin
Dongxiao He
Du
Folino
Gehrke
Girvan
Holme
Jin
Jin
Karrer
Kim
Kumar
Lee
Leskovec
Lin
Lin
Lin Pan
Lusseau
Nepusz
Newman
Newman
Nguyen
P. Wipf
Palla
Palla
Pengfei Jiao
Psorakis
Reichardt
Reichardt
Rezvanian
Schuetz
Spirin
Sun
Tan
Tang
Wenjun Wang
Xu
Xu
Yang
Yuan
Publication venue: Elsevier BV
Publication date: 01/10/2016

A wide variety of natural or artificial systems can be modeled as time-varying or temporal networks. To understand the structural and functional properties of these time-varying networked systems, it is desirable to detect and analyze the evolving community structure. In temporal networks, the identified communities should reflect the current snapshot network and, at the same time, be similar to the communities identified in the previous snapshot networks. Most existing approaches assume that the number of communities is known or can be obtained by some heuristic method. This assumption is unsuitable and complicated for most real-world networks, especially temporal networks. In this paper, we propose a Bayesian probabilistic model, named Dynamic Bayesian Nonnegative Matrix Factorization (DBNMF), for automatic detection of overlapping communities in temporal networks. Our model not only gives the overlapping community structure based on the probabilistic memberships of nodes in each snapshot network but also automatically determines the number of communities in each snapshot network based on automatic relevance determination. A gradient descent algorithm is then proposed to optimize the objective function of our DBNMF model. The experimental results using both synthetic datasets and real-world temporal networks demonstrate that the DBNMF model has superior performance compared with two widely used methods, especially when the number of communities is unknown and when the network is highly sparse.

Crossref

Bournemouth University Research Online

Adjuvant Chemotherapy Versus Adjuvant Concurrent Chemoradiotherapy After Radical Surgery for Early-Stage Cervical Cancer: A Randomized, Non-Inferiority, Multicenter Trial

Author: An Ruifang
Chen Gang
Chen Qingqin
Chen Yaheng
Chen Yaxia
Chen Yile
Cheng Xiaodong
Cui Baoxia
Fan Liangsheng
Gao Qinglei
Han Xiaobing
Hu Dongxiao
Jiang Jie
Kong Beihua
Li Kezhen
Li Lin
Lu Weiguo
Ma Ding
Mao Yuyan
Pan Zimin
Peng Guangcai
Song Kun
Tang Zhenzi
Wan Xiaoyun
Wang Changyu
Wang Hui
Wang Wei
Wang Xinyu
Weng Danhui
Xie Xing
Xing Hui
Xiong Huihua
Xiong Tingchuan
Yang Xingsheng
Yi Cunjian
Zhang Xi
Zhang Youzhong
Zhu Changkun
Publication venue: DigitalCommons@TMC
Publication date: 23/11/2022

We conducted a prospective study to assess the non-inferiority of adjuvant chemotherapy alone versus adjuvant concurrent chemoradiotherapy (CCRT) as an alternative strategy for patients with early-stage (FIGO 2009 stage IB-IIA) cervical cancer having risk factors after surgery. The condition was assessed in terms of prognosis, adverse effects, and quality of life. This randomized trial involved nine centers across China. Eligible patients were randomized to receive adjuvant chemotherapy or CCRT after surgery. The primary end-point was progression-free survival (PFS). From December 2012 to December 2014, 337 patients underwent randomization. The final analysis included 329 patients: 165 in the adjuvant chemotherapy group and 164 in the adjuvant CCRT group. The median follow-up was 72.1 months. The three-year PFS rates were both 91.9%, and the five-year OS was 90.6% versus 90.0% in the adjuvant chemotherapy and CCRT groups, respectively. No significant differences were observed in PFS or OS between groups. The adjusted HR for PFS was 0.854 (95% confidence interval 0.415-1.757; P = 0.667) favoring adjuvant chemotherapy, excluding the predefined non-inferiority boundary of 1.9. The chemotherapy group showed a tendency toward good quality of life. In comparison with post-operative adjuvant CCRT, adjuvant chemotherapy treatment showed non-inferior efficacy in patients with early-stage cervical cancer having pathological risk factors. Adjuvant chemotherapy alone is a favorable alternative post-operative treatment.

PubMed Central

DigitalCommons@The Texas Medical Center
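
The non-inferiority conclusion in the abstract above rests on simple arithmetic: the upper limit of the 95% confidence interval for the hazard ratio (1.757) lies below the predefined boundary (1.9). The sketch below reproduces that check with the reported numbers. The helper names are hypothetical, and the second check assumes the interval was built symmetrically on the log scale, which is the usual convention for hazard ratios but is not stated in the abstract.

```python
import math


def is_non_inferior(ci_upper, margin):
    """Non-inferiority holds when the whole interval lies below the margin."""
    return ci_upper < margin


def log_scale_midpoint(ci_lower, ci_upper):
    """Geometric mean of the CI limits.

    Under the assumed log-symmetric interval this should reproduce the
    reported point estimate.
    """
    return math.sqrt(ci_lower * ci_upper)


# Values reported in the abstract.
hr, ci_lower, ci_upper, margin = 0.854, 0.415, 1.757, 1.9

if __name__ == "__main__":
    print(is_non_inferior(ci_upper, margin))
    print(round(log_scale_midpoint(ci_lower, ci_upper), 3), hr)
```

The midpoint check is only a consistency test on the reported figures and does not replace the trial's own statistical analysis.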

Genome-Wide Bovine H3K27me3 Modifications and the Regulatory Effects on Genes Expressions in Peripheral Blood Lymphocytes

Author: A Barski
A Breton
A Koc
AL Rivas
AM Szalkowski
Apratim Mitra
AS Morrissy
CG Lee
D Kioussis
DA Nix
Dongxiao Sun
DS Johnson
DZ Caraviello
E Hare
E Vivier
EL Greer
Frank Emmert-Streib
G Wei
GJ Pan
H Li
H Ohtsuka
HD Woo
HQ Yu
I Alcobia
J Hultgren
J Hultgren
J ten Napel
Jiuzhou Song
JW Dürr
K Singh
KH Hansen
KR Cui
KT Crispi
LA Boyer
M Kircher
M Xu
MD Young
MW Pfaffl
P Lefrancois
PAC t Hoen
PJ Ross
R Rupp
RQ Li
S Derks
S Kuchen
S Walsh
SD Fouse
Shengli Zhang
SL Berger
T Hamatani
T Kouzarides
TY Roh
V Beglopoulos
XF Wang
Y Zhang
Yachun Wang
Yanghua He
Yi Zhang
Ying Yu
YK Wei
Yuan Zhang
Publication venue: Public Library of Science
Publication date: 28/06/2012

Gene expression in lymphocytes has been found to be influenced by histone methylation in mammals, and trimethylation of lysine 27 on histone H3 (H3K27me3) normally represses gene expression. Peripheral blood lymphocytes are the main source of somatic cells in the milk of dairy cows, and their numbers vary frequently in response to infection or injury of the mammary gland and to the number of parities. The genome-wide status of H3K27me3 modifications in blood lymphocytes of lactating Holsteins was profiled using a ChIP-Seq approach. Combined with the digital gene expression (DGE) technique, the regulatory effects of H3K27me3 on gene expression were analyzed. The ChIP-seq results showed that the peaks of H3K27me3 in cow lymphocytes were mainly enriched in the up20K (~50%), down20K (~30%) and intron (~28%) regions of the genes. Only ~3% of peaks were enriched in exon regions. Moreover, the highest H3K27me3 modification levels were mainly around 2 Kb upstream of the transcriptional start sites (TSS) of the genes. Using conjoint analysis with DGE data, we found that H3K27me3 marks tended to repress target gene expression throughout whole gene regions, acting especially on the promoter region. A total of 53 differentially expressed genes were detected in third-parity cows compared to first-parity cows, and the 25 down-regulated genes (PSEN2 etc.) were negatively correlated with H3K27me3 levels in the up2Kb to up1Kb region of the genes, while the up-regulated genes did not show this relationship. The first blueprint of bovine H3K27me3 marks that mediate gene silencing was generated. H3K27me3 plays its repressive role mainly in the regulatory region in bovine lymphocytes. The up2Kb to up1Kb region of the down-regulated genes in third-parity cows could be a potential target of H3K27me3 regulation. Further studies are warranted to understand the regulatory mechanisms of H3K27me3 in somatic cell count increases and milk losses in later parities of cows.

Crossref

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute