34 research outputs found

    Kernel Learning in Ridge Regression "Automatically" Yields Exact Low Rank Solution

    We consider kernels of the form $(x,x') \mapsto \phi(\|x-x'\|^2_\Sigma)$ parametrized by $\Sigma$. For such kernels, we study a variant of the kernel ridge regression problem which simultaneously optimizes the prediction function and the parameter $\Sigma$ of the reproducing kernel Hilbert space. The eigenspace of the $\Sigma$ learned from this kernel ridge regression problem can inform us which directions in covariate space are important for prediction. Assuming that the covariates have nonzero explanatory power for the response only through a low dimensional subspace (central mean subspace), we find that the global minimizer of the finite sample kernel learning objective is also low rank with high probability. More precisely, the rank of the minimizing $\Sigma$ is with high probability bounded by the dimension of the central mean subspace. This phenomenon is interesting because the low rankness property is achieved without using any explicit regularization of $\Sigma$, e.g., nuclear norm penalization. Our theory makes correspondence between the observed phenomenon and the notion of low rank set identifiability from the optimization literature. The low rankness property of the finite sample solutions exists because the population kernel learning objective grows "sharply" when moving away from its minimizers in any direction perpendicular to the central mean subspace. Comment: Add code links and correct a figure
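    As an illustration of the kernel family in this abstract, the following is a minimal sketch, not the authors' code. It assumes $\|x-x'\|^2_\Sigma = (x-x')^\top \Sigma (x-x')$ and picks $\phi(z) = e^{-z}$ for concreteness; the names `sigma_kernel` and `krr_fit_predict`, the ridge scaling `lam * n`, and the toy data are all hypothetical. It shows only why a low rank $\Sigma$ makes the predictor ignore directions outside its range. It does not optimize $\Sigma$.

```python
import numpy as np

def sigma_kernel(X, Z, Sigma):
    """Gram matrix K[i, j] = exp(-(x_i - z_j)^T Sigma (x_i - z_j)).

    phi(z) = exp(-z) is an assumed choice; the abstract leaves phi general.
    """
    diff = X[:, None, :] - Z[None, :, :]                 # shape (n, m, d)
    sq = np.einsum("nmd,de,nme->nm", diff, Sigma, diff)  # squared Sigma-distance
    return np.exp(-sq)

def krr_fit_predict(X, y, X_new, Sigma, lam=1e-2):
    """Kernel ridge regression for a FIXED Sigma (Sigma is not learned here)."""
    n = X.shape[0]
    K = sigma_kernel(X, X, Sigma)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return sigma_kernel(X_new, X, Sigma) @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sin(X[:, 0])                      # response depends on coordinate 0 only

u = np.array([[1.0, 0.0, 0.0]])
Sigma_low_rank = u.T @ u                 # rank-1 Sigma spanning that direction

# Shifting the irrelevant coordinates leaves the kernel, and hence the
# prediction, unchanged when Sigma has no mass in those directions.
X_shift = X.copy()
X_shift[:, 1:] += 5.0
pred = krr_fit_predict(X, y, X, Sigma_low_rank)
pred_shift = krr_fit_predict(X, y, X_shift, Sigma_low_rank)
```

    Here the rank of `Sigma_low_rank` plays the role that the dimension of the central mean subspace plays in the abstract; the paper's result concerns the rank of the learned minimizer, which this sketch does not compute.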

    Federated Pseudo Modality Generation for Incomplete Multi-Modal MRI Reconstruction

    While multi-modal learning has been widely used for MRI reconstruction, it relies on paired multi-modal data, which is difficult to acquire in real clinical scenarios. Especially in the federated setting, the common situation is that several medical institutions have only single-modal data, termed the modality missing issue. Therefore, it is infeasible to deploy a standard federated learning framework in such conditions. In this paper, we propose a novel communication-efficient federated learning framework, namely Fed-PMG, to address the missing modality challenge in federated multi-modal MRI reconstruction. Specifically, we utilize a pseudo modality generation mechanism to recover the missing modality for each single-modal client by sharing the distribution information of the amplitude spectrum in frequency space. However, the step of sharing the original amplitude spectrum leads to heavy communication costs. To reduce the communication cost, we introduce a clustering scheme to project the set of amplitude spectra onto a finite set of cluster centroids, and share them among the clients. With such an elaborate design, our approach can effectively complete the missing modality within an acceptable communication cost. Extensive experiments demonstrate that our proposed method can attain performance similar to the ideal scenario, i.e., all clients have the full set of modalities. The source code will be released. Comment: 10 pages, 5 figures

    CellMix: A General Instance Relationship based Method for Data Augmentation Towards Pathology Image Classification

    In pathology image analysis, obtaining and maintaining high-quality annotated samples is an extremely labor-intensive task. To overcome this challenge, mixing-based methods have emerged as effective alternatives to traditional preprocessing data augmentation techniques. Nonetheless, these methods fail to fully consider the unique features of pathology images, such as local specificity, global distribution, and inner/outer-sample instance relationships. To better comprehend these characteristics and create valuable pseudo samples, we propose the CellMix framework, which employs a novel distribution-oriented in-place shuffle approach. By dividing images into patches based on the granularity of pathology instances and shuffling them within the same batch, the absolute relationships between instances can be effectively preserved when generating new samples. Moreover, we develop a curriculum learning-inspired, loss-driven strategy to handle perturbations and distribution-related noise during training, enabling the model to adaptively fit the augmented data. Our experiments in pathology image classification tasks demonstrate state-of-the-art (SOTA) performance on 7 distinct datasets. This innovative instance relationship-centered method has the potential to inform general data augmentation approaches for pathology image classification. The associated code is available at https://github.com/sagizty/CellMix

    Rethinking Client Drift in Federated Learning: A Logit Perspective

    Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection. However, real-world non-IID data leads to client drift, which degrades the performance of FL. Interestingly, we find that the difference in logits between the local and global models increases as the model is continuously updated, thus seriously deteriorating FL performance. This is mainly due to catastrophic forgetting caused by data heterogeneity between clients. To alleviate this problem, we propose a new algorithm, named FedCSD, a Class prototype Similarity Distillation in a federated framework to align the local and global models. FedCSD does not simply transfer global knowledge to local clients, as an undertrained global model cannot provide reliable knowledge, i.e., class similarity information, and its wrong soft labels will mislead the optimization of local models. Concretely, FedCSD introduces a class prototype similarity distillation to align the local logits with the refined global logits that are weighted by the similarity between local logits and the global prototype. To enhance the quality of global logits, FedCSD adopts an adaptive mask to filter out the poor soft labels of the global models, thereby preventing them from misleading local optimization. Extensive experiments demonstrate the superiority of our method over the state-of-the-art federated learning approaches in various heterogeneous settings. The source code will be released. Comment: 11 pages, 7 figures

    Pancreatic Cancer ROSE Image Classification Based on Multiple Instance Learning with Shuffle Instances

    The rapid on-site evaluation (ROSE) technique can significantly accelerate the diagnostic workflow of pancreatic cancer by immediately analyzing the fast-stained cytopathological images with on-site pathologists. Computer-aided diagnosis (CAD) using the deep learning method has the potential to solve the problem of insufficient pathology staffing. However, the cancerous patterns of ROSE images vary greatly between different samples, making the CAD task extremely challenging. Besides, due to different staining qualities and various types of acquisition devices, the ROSE images also have complicated perturbations in terms of color distribution, brightness, and contrast. To address these challenges, we propose a novel multiple instance learning (MIL) approach using shuffle patches containing the instances, which adopts the patch-based learning strategy of Vision Transformers. With the re-grouped bags of shuffle instances and their bag-level soft labels, the approach utilizes a MIL head to make the model focus on the features from the pancreatic cancer cells, rather than on those from various perturbations in ROSE images. Simultaneously, combined with a classification head, the model can effectively identify the general distributive patterns across different instances. The results demonstrate significant improvements in classification accuracy with more accurate attention regions, indicating that the diverse patterns of ROSE images are effectively extracted, and the complicated perturbations of ROSE images are significantly eliminated. It also suggests that MIL with shuffle instances has great potential in the analysis of cytopathological images.

    miR-720 Regulates Insulin Secretion by Targeting Rab35

    miRNAs show good prospects in the diagnosis and treatment of type 2 diabetes (T2D). This study aims to investigate whether miR-720 targets Rab35 to regulate insulin secretion in MIN6 cells, the underlying molecular mechanism, and the clinical value of miR-720 as a specific biomarker of T2D. Fifty-five samples from newly diagnosed T2D patients and normal controls were collected. Levels of miR-720, fasting blood glucose, insulin, and other indicators of glucose and lipid metabolism were determined. We increased and decreased miR-720 expression using a miR-720 mimic and inhibitor, respectively, to identify the effect of miR-720 on insulin secretion in MIN6 cells. Then, we used the miR-720 mimic, the miR-720 inhibitor, and dual luciferase reporter gene assays to show that miR-720 regulates insulin secretion by targeting Rab35 in MIN6 cells. In addition, we overexpressed and silenced the Rab35 gene to detect the expression of PI3K, Akt, and mTOR in MIN6 cells by RT-PCR and western blot. In this study, circulating miR-720 was significantly higher in the T2D group than in the control group, and miR-720 was positively correlated with FBG and negatively correlated with FINS. The overexpression of miR-720 inhibited insulin secretion, and miR-720 downregulation promoted insulin secretion. miR-720 regulated insulin secretion by targeting Rab35 in MIN6 cells. Compared with the control group, the expression of PI3K, Akt, and mTOR was significantly decreased by the overexpression of the Rab35 gene, while silencing the Rab35 gene could induce the expression of PI3K, Akt, and mTOR. Furthermore, the miR-720 mimic could activate the PI3K pathway. We conclude that miR-720 may be a potential biomarker for the diagnosis of T2D. An increase in miR-720 reduced Rab35 expression and then activated the PI3K/Akt/mTOR signaling pathway, thus inhibiting insulin secretion.

    Comprehensive Evaluation on Space Information Network Demonstration Platform Based on Tracking and Data Relay Satellite System

    Due to the global coverage and real-time access advantages of the Tracking and Data Relay Satellite System (TDRSS), the demonstration platform based on TDRSS can satisfy the new technology verification and demonstration needs of the space information network (an evolution from sensorweb). However, comprehensive evaluation research on this demonstration platform faces many problems: complicated and diverse technical indicators in various areas, coupling redundancy between indicators, difficulty in establishing the number of indicator system layers, and evaluation errors caused by subjective scoring. To address these difficulties, this paper gives a method to construct this special index system, and improves the consistency of evaluation results with the Analytic Hierarchy Process in Group Decision-Making (AHP-GDM). A comprehensive evaluation index system including five criteria, 11 elements, and more than 30 indicators is constructed according to the three-step strategy of initial set classification, hierarchical optimization, and de-redundancy. For the inconsistent scoring of AHP-GDM, a high-speed convergence consistency improvement strategy is proposed in this paper. Moreover, a method for generating the aggregation coefficients of a comprehensive judgment matrix (the aggregation of each judgment matrix) is provided. Numerical experiments show that this strategy effectively improves the consistency of the comprehensive judgment matrix. Finally, taking the evaluation of TDRSS development as an example, the versatility and feasibility of the new evaluation strategy are demonstrated.
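    To make "consistency of the comprehensive judgment matrix" concrete, the sketch below computes the standard AHP consistency ratio (principal eigenvalue, consistency index, Saaty's tabulated random index) and aggregates several judgment matrices by a weighted geometric mean. It is not the paper's improvement strategy, which the abstract does not describe. The function names, the example matrices, and the equal aggregation coefficients are hypothetical.

```python
import numpy as np

# Saaty's commonly tabulated random consistency index (RI) by matrix order.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(A):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1), for 3 <= n <= 9."""
    n = A.shape[0]
    lam_max = np.max(np.linalg.eigvals(A).real)
    ci = (lam_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

def aggregate(matrices, coeffs):
    """Element-wise weighted geometric mean of pairwise judgment matrices.

    coeffs are hypothetical aggregation coefficients, normalized to sum to 1.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    coeffs = coeffs / coeffs.sum()
    logs = np.stack([np.log(M) for M in matrices])
    return np.exp(np.tensordot(coeffs, logs, axes=1))

# Two perfectly consistent experts: A[i, j] = w[i] / w[j].
w1 = np.array([0.6, 0.3, 0.1])
w2 = np.array([0.5, 0.3, 0.2])
A1 = w1[:, None] / w1[None, :]
A2 = w2[:, None] / w2[None, :]

# A reciprocal but strongly inconsistent matrix (a12 * a23 differs from a13).
A_bad = np.array([[1.0, 2.0, 0.25],
                  [0.5, 1.0, 3.0],
                  [4.0, 1.0 / 3.0, 1.0]])

A_group = aggregate([A1, A2], [1.0, 1.0])
cr_bad = consistency_ratio(A_bad)
cr_group = consistency_ratio(A_group)
```

    A ratio below 0.1 is the usual rule of thumb for acceptable consistency; `A_bad` exceeds it, while the geometric-mean aggregate of two consistent matrices remains consistent.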

    Comparison of risk prediction models for the progression of pelvic inflammatory disease patients to sepsis: Cox regression model and machine learning model

    Introduction: The present study presents the development and validation of a clinical prediction model using random survival forest (RSF) and stepwise Cox regression, aiming to predict the probability of pelvic inflammatory disease (PID) progressing to sepsis. Methods: A retrospective cohort study was conducted, gathering clinical data of patients diagnosed with PID between 2008 and 2019 from the Medical Information Mart for Intensive Care (MIMIC)-IV database. Patients who met the Sepsis 3.0 diagnostic criteria were selected, with sepsis as the outcome. Univariate Cox regression and stepwise Cox regression were used to screen variables for constructing a nomogram. Moreover, an RSF model was created using machine learning algorithms. To verify the model's performance, a calibration curve, decision curve analysis (DCA), and receiver operating characteristic (ROC) curve were utilized. Furthermore, the capabilities of the two models for estimating the incidence of sepsis in PID patients within 3 and 7 days were compared. Results: A total of 1064 PID patients were included, of whom 54 had progressed to sepsis. The established nomogram highlighted dialysis, reduced platelet (PLT) counts, history of pneumonia, medication of glucocorticoids, and increased leukocyte counts as significant predictive factors. The areas under the curve (AUCs) of the nomogram for prediction of PID progression to sepsis at 3-day and 7-day (3-/7-day) in the training set and the validation set were 0.886/0.863 and 0.824/0.726, respectively, and the C-index of the model was 0.8905. The RSF displayed excellent performance, with AUCs of 0.939/0.919 and 0.712/0.571 for 3-/7-day risk prediction in the training set and validation set, respectively. Conclusion: The nomogram accurately predicted the incidence of sepsis in PID patients, and relevant risk factors were identified. While the RSF model outperformed the Cox regression models in predicting sepsis incidence, its performance exhibited some instability. On the other hand, the Cox regression-based nomogram displayed stable performance and improved interpretability, thereby supporting clinical decision-making in PID treatment.