91 research outputs found

    Sanitized Clustering against Confounding Bias

    Real-world datasets inevitably contain biases that arise from different sources or conditions during data collection. Such inconsistency acts as a confounding factor that disturbs cluster analysis. Existing methods eliminate the biases by projecting the data onto the orthogonal complement of the subspace spanned by the confounding factor before clustering. In these methods, the clustering factor of interest and the confounding factor are treated coarsely in the raw feature space, and the correlation between the data and the confounding factor is assumed to be linear for convenience. These approaches are thus limited in scope, since data in real applications are usually complex and non-linearly correlated with the confounding factor. This paper presents a new clustering framework named Sanitized Clustering Against confounding Bias (SCAB), which removes the confounding factor in the semantic latent space of complex data through a non-linear dependence measure. Specifically, we eliminate the bias information in the latent space by minimizing the mutual information between the confounding factor and the latent representation produced by a Variational Auto-Encoder (VAE). Meanwhile, a clustering module is introduced to cluster over the purified latent representations. Extensive experiments on complex datasets demonstrate that SCAB achieves a significant gain in clustering performance by removing the confounding bias. The code is available at https://github.com/EvaFlower/SCAB. Comment: Machine Learning, in press.
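The linear baseline mentioned in this abstract (projecting the data onto the orthogonal complement of the subspace spanned by the confounding factor) can be illustrated with a minimal sketch. This is not SCAB itself, which works in a VAE latent space with a mutual-information penalty. The helper name `project_out_confounder`, the restriction to a single confounding variable, and the toy data are all assumptions made for illustration.

```python
def project_out_confounder(X, c):
    """Remove from each feature column of X its component along the vector c.

    X: list of rows (n_samples x n_features).
    c: list of n_samples values of one confounding variable.
    Hypothetical helper; handles a single confounder and does no centering.
    """
    cc = sum(ci * ci for ci in c)
    if cc == 0:
        return [row[:] for row in X]  # zero confounder: nothing to remove
    n_features = len(X[0])
    # Coefficient of c in each feature column: (x_j . c) / (c . c)
    coefs = [
        sum(row[j] * ci for row, ci in zip(X, c)) / cc
        for j in range(n_features)
    ]
    # Subtract the projection, leaving the part orthogonal to c
    return [
        [row[j] - coefs[j] * ci for j in range(n_features)]
        for row, ci in zip(X, c)
    ]


# Toy data (made up): the second feature is exactly 2 * confounder.
c = [1.0, -1.0, 2.0, 0.0]
X = [[1.0, 2.0], [0.0, -2.0], [3.0, 4.0], [1.0, 0.0]]
X_clean = project_out_confounder(X, c)
```

After the projection, every feature column of `X_clean` has zero inner product with `c`, and the second column vanishes because it was a pure multiple of the confounder. Any clustering step would then run on `X_clean`. This linear assumption is what the abstract describes as limiting for complex data.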

    Robust Deep Learning Models Against Semantic-Preserving Adversarial Attack

    Deep learning models can be fooled by small $l_p$-norm adversarial perturbations and natural perturbations in terms of attributes. Although the robustness against each perturbation has been explored, it remains a challenge to address the robustness against joint perturbations effectively. In this paper, we study the robustness of deep learning models against joint perturbations by proposing a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training. Specifically, we introduce an attribute manipulator to generate natural and human-comprehensible perturbations and a noise generator to generate diverse adversarial noises. Based on such combined noises, we optimize both the attribute value and the diversity variable to generate jointly-perturbed samples. For robust training, we adversarially train the deep learning model against the generated joint perturbations. Empirical results on four benchmarks show that the SPA attack causes a larger performance decline with small $l_{\infty}$ norm-ball constraints compared to existing approaches. Furthermore, our SPA-enhanced training outperforms existing defense methods against such joint perturbations. Comment: Paper accepted by the 2023 International Joint Conference on Neural Networks (IJCNN 2023).

    BINN: A deep learning approach for computational mechanics problems based on boundary integral equations

    We propose boundary-integral type neural networks (BINN) for boundary value problems in computational mechanics. Boundary integral equations are employed to transfer all the unknowns to the boundary; the unknowns are then approximated using neural networks and solved through a training process. The loss function is chosen as the residuals of the boundary integral equations. Regularization techniques are adopted to efficiently evaluate the weakly singular and Cauchy principal value integrals in the boundary integral equations. This article focuses on potential problems and elastostatic problems as demonstrations. The proposed method has several notable advantages. First, the dimension of the original problem is reduced by one, so the number of degrees of freedom is greatly reduced. Second, the method does not require any extra treatment to introduce the boundary conditions, since they are naturally accounted for through the boundary integral equations. Therefore, the method is suitable for complex geometries. Third, BINN is suitable for problems on infinite or semi-infinite domains. Moreover, BINN can easily handle heterogeneous problems with a single neural network without domain decomposition.

    Earning Extra Performance from Restrictive Feedbacks

    Many machine learning applications encounter a situation where model providers are required to further refine a previously trained model to satisfy the specific needs of local users. This problem reduces to the standard model tuning paradigm if the target data may be fed to the model. However, tuning is difficult in the wide range of practical cases where the target data is not shared with model providers but some evaluations of the model are accessible. In this paper, we formally set up a challenge named Earning eXtra PerformancE from restriCTive feEDbacks (EXPECTED) to describe this form of model tuning problem. Concretely, EXPECTED allows a model provider to access the operational performance of the candidate model multiple times via feedback from a local user (or a group of users). The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by utilizing the feedback. Unlike existing model tuning methods, where the target data is always available for calculating model gradients, the model providers in EXPECTED only see feedback that could be as simple as a scalar, such as inference accuracy or usage rate. To enable tuning in this restrictive setting, we propose to characterize the geometry of the model performance with respect to the model parameters by exploring the parameters' distribution. In particular, for deep models whose parameters are distributed across multiple layers, we further design a more query-efficient algorithm that conducts layerwise tuning and pays more attention to the layers that pay off better. Our theoretical analyses justify the proposed algorithms in terms of both efficacy and efficiency. Extensive experiments on different applications demonstrate that our work provides a sound solution to the EXPECTED problem. Comment: Accepted by IEEE TPAMI in April 202
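To make the setting in this abstract concrete, the following minimal sketch tunes parameters using only scalar feedback and no gradients. It is not the paper's algorithm. It swaps in a plain elite-averaging random search (in the spirit of the cross-entropy method) as a stand-in for "exploring the parameters' distribution". The function names, hyperparameters, and the toy feedback function are all assumptions.

```python
import random


def tune_from_feedback(params, feedback, rounds=30, pop=10, elite=3,
                       sigma=0.5, seed=0):
    """Gradient-free tuning: sample candidates around a mean, query a scalar
    feedback for each, and move the mean toward the best-scoring candidates.

    Hypothetical stand-in technique, not the method proposed in the paper.
    """
    rng = random.Random(seed)
    mean = list(params)
    best, best_score = list(params), feedback(params)
    for _ in range(rounds):
        scored = []
        for _ in range(pop):
            cand = [m + rng.gauss(0.0, sigma) for m in mean]
            scored.append((feedback(cand), cand))  # one query per candidate
        scored.sort(key=lambda t: t[0], reverse=True)
        top = [cand for _, cand in scored[:elite]]
        # New mean is the coordinate-wise average of the elite candidates
        mean = [sum(vals) / elite for vals in zip(*top)]
        if scored[0][0] > best_score:
            best_score, best = scored[0]
    return best, best_score


# Toy feedback (made up): the "user" holds a hidden optimum and only
# returns a scalar score; higher is better.
hidden_target = [1.0, -2.0]


def toy_feedback(p):
    return -sum((a - b) ** 2 for a, b in zip(p, hidden_target))


start = [0.0, 0.0]
tuned, tuned_score = tune_from_feedback(start, toy_feedback)
```

Each call to `feedback` corresponds to one query to the local user, so `rounds * pop + 1` is the query budget in this sketch. The layerwise, query-efficient variant described in the abstract is not attempted here.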

    Socioeconomic disparities and regional environment are associated with cervical lymph node metastases in children and adolescents with differentiated thyroid cancer: developing a web-based predictive model

    Purpose: To establish an online model for predicting cervical lymph node metastasis (CLNM) in children and adolescents with differentiated thyroid cancer (caDTC), and to analyze the association of socioeconomic disparities and regional environment with CLNM. Methods: We retrospectively analyzed clinicopathological and sociodemographic data of caDTC from the Surveillance, Epidemiology, and End Results (SEER) database from 2000 to 2019. Risk factors for CLNM in caDTC were analyzed using univariate and multivariate logistic regression (LR). The extreme gradient boosting (XGBoost) algorithm and other commonly used machine learning (ML) algorithms were used to build CLNM prediction models. Model performance was assessed and visualized using the area under the receiver operating characteristic curve (AUROC) and SHapley Additive exPlanations (SHAP). Results: In addition to common risk factors, our study found that median household income and region of residence were strongly associated with CLNM. Among the ML models constructed from these variables, the XGBoost model had the best predictive performance in both the training set and the validation set. After 10-fold cross-validation, the model reached its best AUROC of 0.766 (95% CI: 0.745-0.786) in the training set, 0.736 (95% CI: 0.670-0.802) in the validation set, and 0.733 (95% CI: 0.683-0.783) in the test set. Based on this XGBoost model combined with the SHAP method, we constructed a web-based predictive system. Conclusion: The online prediction model based on the XGBoost algorithm can dynamically estimate the risk probability of CLNM in caDTC, so as to provide patients with personalized treatment advice.

    The role of fluconazole in the regulation of fatty acid and unsaponifiable matter biosynthesis in Schizochytrium sp. MYA 1381.

    BACKGROUND: Schizochytrium has been widely used in industry for synthesizing polyunsaturated fatty acids (PUFAs), especially docosahexaenoic acid (DHA). However, the unclear biosynthesis pathway of PUFAs hinders further improvement of production in Schizochytrium. Unsaponifiable matter (UM) from the mevalonate pathway is crucial to cell growth and intracellular metabolism in all higher eukaryotes and microalgae. Therefore, regulation of UM biosynthesis in Schizochytrium may have important effects on fatty acid synthesis. Moreover, it is well known that UMs, such as squalene and β-carotene, are of great commercial value. Thus, regulating UM biosynthesis may also increase the value of Schizochytrium. RESULTS: To investigate the correlation of UM biosynthesis with fatty acid accumulation in Schizochytrium, fluconazole was used to block the sterol pathway. The addition of 60 mg/L fluconazole at 48 h increased the total lipids (TLs) at 96 h by 16% without affecting cell growth, which was accompanied by remarkable changes in UMs and NADPH. Cholesterol content was reduced by 8%, and squalene content increased by 45% at 72 h, which demonstrated fluconazole's role in inhibiting squalene flow to cholesterol. Production of β-carotene, another typical UM with antioxidant capacity, increased by 53% at 96 h. The increase in squalene and β-carotene could boost intracellular oxidation resistance and protect fatty acids from oxidation. NADPH was found to be 33% higher than in the control at 96 h, which meant that the cells had more reducing power for fatty acid synthesis. Metabolic analysis further confirmed that regulation of sterols was closely related to glucose absorption, pigment biosynthesis and fatty acid production in Schizochytrium. CONCLUSION: This work is the first to report the effect of UM biosynthesis on fatty acid accumulation in Schizochytrium. The UM was found to affect fatty acid biosynthesis by changing cell membrane function, intracellular antioxidation and reducing power. We believe that this work provides valuable insights into improving the production of PUFAs and other valuable compounds in microalgae.

    First record of a gonadal intersex anomaly in Trachurus mediterraneus (Steindachner, 1868) from the Alboran Sea.

    The main objective of this work is to report the first record of a gonadal intersex anomaly in Trachurus mediterraneus from the Alboran Sea (western Mediterranean). This specimen is the first record of intersexuality in a horse mackerel worldwide.

    Global population structure and evolution of Bordetella pertussis and their relationship with vaccination.

    Bordetella pertussis causes pertussis, a respiratory disease that is most severe for infants. Vaccination was introduced in the 1950s, and in recent years, a resurgence of disease was observed worldwide, with significant mortality in infants. Possible causes for this include the switch from whole-cell vaccines (WCVs) to less effective acellular vaccines (ACVs), waning immunity, and pathogen adaptation. Pathogen adaptation is suggested by antigenic divergence between vaccine strains and circulating strains and by the emergence of strains with increased pertussis toxin production. We applied comparative genomics to a worldwide collection of 343 B. pertussis strains isolated between 1920 and 2010. The global phylogeny showed two deep branches; the larger of these contained 98% of all strains, and its expansion correlated temporally with the first descriptions of pertussis outbreaks in Europe in the 16th century. We found little evidence of recent geographical clustering of the strains within this lineage, suggesting rapid strain flow between countries. We observed that changes in genes encoding proteins implicated in protective immunity that are included in ACVs occurred after the introduction of WCVs but before the switch to ACVs. Furthermore, our analyses consistently suggested that virulence-associated genes and genes coding for surface-exposed proteins were involved in adaptation. However, many of the putative adaptive loci identified have a physiological role, and further studies of these loci may reveal less obvious ways in which B. pertussis and the host interact. This work provides insight into ways in which pathogens may adapt to vaccination and suggests ways to improve pertussis vaccines. IMPORTANCE: Whooping cough is mainly caused by Bordetella pertussis, and current vaccines are targeted against this organism. Recently, there have been increasing outbreaks of whooping cough, even where vaccine coverage is high. Analysis of the genomes of 343 B. pertussis isolates from around the world over the last 100 years suggests that the organism has emerged within the last 500 years, consistent with historical records. We show that global transmission of new strains is very rapid and that the worldwide population of B. pertussis is evolving in response to vaccine introduction, potentially enabling vaccine escape.