369 research outputs found

    A NOVEL SPLIT SELECTION OF A LOGISTIC REGRESSION TREE FOR THE CLASSIFICATION OF DATA WITH HETEROGENEOUS SUBGROUPS

    A logistic regression tree (LRT) is a hybrid machine learning method that combines a decision tree model and logistic regression models. An LRT recursively partitions the input data space through splitting and learns multiple logistic regression models, each optimized for one subpopulation. The split selection is a critical procedure for improving the predictive performance of the LRT. In this paper, we present a novel separability-based split selection method for the construction of an LRT. The separability measure, defined on the feature space of logistic regression models, evaluates the performance of potential child models without fitting them, and the optimal split is selected based on the results. Heterogeneous subgroups that have different class-separating patterns can be identified in the split process when they exist in the data. In addition, we compare the performance of our proposed method with the benchmark algorithms through experiments on both synthetic and real-world datasets. The experimental results indicate the effectiveness and generality of our proposed method.

    Bilingual Autoencoder-based Efficient Harmonization of Multi-source Private Data for Accurate Predictive Modeling

    Sharing electronic health record data is essential for advanced analysis, but may put sensitive information at risk. Several studies have attempted to address this risk using contextual embedding, but with many hospitals involved, they are often inefficient and inflexible. Thus, we propose a bilingual autoencoder-based model to harmonize local embeddings in different spaces. Cross-hospital reconstruction of embeddings makes encoders map embeddings from hospitals to a shared space and align them spontaneously. We also suggest two-phase training to prevent distortion of embeddings during harmonization with hospitals that have biased information. In experiments, we used medical event sequences from the Medical Information Mart for Intensive Care-III dataset and simulated the situation of multiple hospitals. For evaluation, we measured the alignment of events from different hospitals and the prediction accuracy of a patient's diagnosis in the next admission in three scenarios in which local embeddings do not work. The proposed method efficiently harmonizes embeddings in different spaces, increases prediction accuracy, and gives flexibility to include new hospitals, and so is superior to previous methods in most cases. It will be useful in predictive tasks that utilize distributed data while preserving private information.

    An efficient multivariate feature ranking method for gene selection in high-dimensional microarray data

    Classification of microarray data plays a significant role in the diagnosis and prediction of cancer. However, its high dimensionality (>tens of thousands) compared to the number of observations (<tens of hundreds) may lead to poor classification accuracy. In addition, only a fraction of genes is really important for the classification of a certain cancer, and thus feature selection is essential in this field. Due to the time and memory burden of processing high-dimensional data, univariate feature ranking methods are widely used in gene selection. However, most of them are not very accurate because they consider only the relevance of features to the target and ignore the redundancy among features. In this study, we propose a novel multivariate feature ranking method to improve the quality of gene selection and ultimately to improve the accuracy of microarray data classification. The method can be efficiently applied to high-dimensional microarray data. We embedded the formal definition of relevance into a Markov blanket (MB) to create a new feature ranking method. Using a few microarray datasets, we demonstrated the practicability of MB-based feature ranking, which has high accuracy and good efficiency. The method outperformed commonly used univariate ranking methods and also yielded better results than the other multivariate feature ranking method, owing to its advantage in data efficiency.

    Exosomes from Human Adipose Tissue-Derived Mesenchymal Stem Cells Promote Epidermal Barrier Repair by Inducing de Novo Synthesis of Ceramides in Atopic Dermatitis.

    Atopic dermatitis (AD) is a multifactorial, heterogeneous disease associated with epidermal barrier disruption and intense systemic inflammation. Previously, we showed that exosomes derived from human adipose tissue-derived mesenchymal stem cells (ASC-exosomes) attenuate AD-like symptoms by reducing multiple inflammatory cytokine levels. Here, we investigated ASC-exosomes' effects on skin barrier restoration by analyzing protein and lipid contents. We found that subcutaneous injection of ASC-exosomes in an oxazolone-induced dermatitis model remarkably reduced trans-epidermal water loss, while enhancing stratum corneum (SC) hydration and markedly decreasing the levels of inflammatory cytokines such as IL-4, IL-5, IL-13, TNF-α, IFN-γ, IL-17, and TSLP, all in a dose-dependent manner. Interestingly, ASC-exosomes induced the production of ceramides and dihydroceramides. Electron microscopic analysis revealed enhanced epidermal lamellar bodies and formation of a lamellar layer at the interface of the SC and stratum granulosum with ASC-exosome treatment. Deep RNA sequencing analysis of skin lesions demonstrated that ASC-exosomes restore the expression of genes involved in skin barrier, lipid metabolism, cell cycle, and inflammatory response in the diseased area. Collectively, our results suggest that ASC-exosomes effectively restore epidermal barrier functions in AD by facilitating the de novo synthesis of ceramides, resulting in a promising cell-free therapeutic option for treating AD.

    Boundary Setting for Ecosystem Services by Factor Analysis: A Case Study in Seocheon, South Korea

    Ecosystem service assessment maps are an important form of data, showing the flow and characteristics of ecosystem services. However, there has been a lack of research on the spatial boundaries of synergetic and trade-off relationships among different types of ecosystem services based on the microscopic characteristics of ecosystem maps. Therefore, the boundaries of ecosystems were identified in this study using factor analysis of indicators in ecosystem service maps. Ecosystems were mapped for each indicator in each cell, and then factor analysis was used to combine all indicators into one map. Analysis of Seocheon in central South Korea shows the boundaries of two ecosystem types: a mountainous region with abundant underground water and carbon stocks that lacks rice paddies, and flatlands with high crop production and a lack of scenic views. The spatial types of ecosystems in which synergy and trade-offs occur were identified by indicator, and these can be used as evidentiary material for spatial planning in order to maximize the function of each ecosystem service.