112 research outputs found

    Random Shapley Forests: Cooperative Game Based Random Forests with Consistency

    The original random forests algorithm has been widely used and has achieved excellent performance on classification and regression tasks. However, research on the theory of random forests lags far behind its applications. In this paper, to narrow the gap between the applications and theory of random forests, we propose a new random forests algorithm, called random Shapley forests (RSFs), based on the Shapley value. The Shapley value is one of the well-known solution concepts in cooperative game theory, and it fairly assesses the power of each player in a game. During construction, RSFs use the Shapley value to evaluate the importance of each feature at each tree node by computing the dependency among the possible feature coalitions. In particular, inspired by the existing consistency theory, we have proved the consistency of the proposed random forests algorithm. Moreover, to verify the effectiveness of the proposed algorithm, experiments on eight UCI benchmark datasets and four real-world datasets have been conducted. The results show that RSFs perform better than, or at least comparably to, the existing consistent random forests, the original random forests, and a classic classifier, support vector machines.
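
    As a rough illustration of the Shapley value mentioned in this abstract, the sketch below computes exact Shapley values for a tiny cooperative game whose players are features. The function names, the feature names `f1` to `f3`, and the toy payoff function are assumptions made for this example. The abstract does not say how RSFs score feature coalitions, so this is not the authors' implementation.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player names (here, hypothetical feature names).
    value:   function mapping a frozenset of players to a real-valued payoff.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Sum the weighted marginal contribution of p over every coalition
        # that does not already contain p.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Made-up payoff: f1 and f2 contribute alone, f1 and f3 only pay off together."""
    score = 0.0
    if "f1" in coalition:
        score += 1.0
    if "f2" in coalition:
        score += 2.0
    if {"f1", "f3"} <= coalition:
        score += 3.0
    return score


features = ["f1", "f2", "f3"]
phi = shapley_values(features, toy_value)
for name in features:
    print(name, round(phi[name], 3))
```

    In this toy game the synergy between `f1` and `f3` is split evenly between them, and the values sum to the payoff of the full coalition. Exact enumeration grows exponentially with the number of players, so it is only practical for small feature sets.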

    A Novel Strategy for MALDI-TOF MS Analysis of Small Molecules

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) does not work efficiently on small molecules (usually with molecular weight below 500 Da) because matrix-related peaks interfere in the low m/z region. Previous methods developed for this problem focused on reducing the peaks caused by traditional matrices. Here, we report a novel strategy to analyze small molecules in a high, interference-free mass range by using metal-phthalocyanines (MPcs) as matrices, which should be capable of forming matrix-analyte adducts. The mass of the target analyte was calculated by subtracting the mass of MPc from the mass of the MPc–analyte adduct. MPcs were also detectable and could serve as internal standards. Various MPcs with aromatic or aliphatic groups and different metal centers were then synthesized and explored. Aluminum-phthalocyanines (AlPcs), gallium-phthalocyanines (GaPcs), and indium-phthalocyanines (InPcs) were efficient matrices for forming MPc–analyte adducts in either the positive or negative ion mode. The detection limits varied from 17 to 75 fmol, depending on analyte type. A mechanism of adduct formation was also proposed. Collectively, our strategy provides a novel and efficient way to analyze small molecules by MALDI-TOF MS.
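
    The subtraction step described in this abstract can be sketched as follows. The function name and the numeric peak values are hypothetical and chosen only to show the arithmetic. The sketch assumes singly charged species and ignores any ligand loss or gain during adduct formation, which the abstract does not detail.

```python
def analyte_mass(adduct_mz, matrix_mz):
    """Estimate the analyte mass as (MPc-analyte adduct peak) minus (MPc peak).

    Both arguments are m/z values in Da, assumed here to be singly charged.
    """
    if adduct_mz <= matrix_mz:
        raise ValueError("adduct peak must lie above the matrix peak")
    return adduct_mz - matrix_mz


# Made-up peak positions, for illustration only.
matrix_peak = 600.0
adduct_peak = 780.5
print(analyte_mass(adduct_peak, matrix_peak))
```

    Because the MPc peak itself is detectable, the same spectrum supplies both numbers, which is consistent with the abstract's note that MPcs can serve as internal standards.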

    One for All, All for One: Learning and Transferring User Embeddings for Cross-Domain Recommendation

    Cross-domain recommendation is an important method to improve recommender system performance, especially when observations in target domains are sparse. However, most existing techniques focus on single-target or dual-target cross-domain recommendation (CDR) and are hard to generalize to CDR with multiple target domains. In addition, the negative transfer problem is prevalent in CDR, where the recommendation performance in a target domain may not always be enhanced by knowledge learned from a source domain, especially when the source domain has sparse data. In this study, we propose CAT-ART, a multi-target CDR method that learns to improve recommendations in all participating domains through representation learning and embedding transfer. Our method consists of two parts: a self-supervised Contrastive AuToencoder (CAT) framework to generate global user embeddings based on information from all participating domains, and an Attention-based Representation Transfer (ART) framework which transfers domain-specific user embeddings from other domains to assist with target domain recommendation. CAT-ART boosts the recommendation performance in any target domain through the combined use of the learned global user representation and knowledge transferred from other domains, in addition to the original user embedding in the target domain. We conducted extensive experiments on a collected real-world CDR dataset spanning 5 domains and involving a million users. Experimental results demonstrate the superiority of the proposed method over a range of prior art. We further conducted ablation studies to verify the effectiveness of the proposed components. Our collected dataset will be open-sourced to facilitate future research in the field of multi-domain recommender systems and user modeling. Comment: 9 pages, accepted by WSDM 202

    Amino acid Formula induces Microbiota Dysbiosis and Depressive-Like Behavior in Mice

    Amino acid formula (AAF) is increasingly consumed in infants with cow's milk protein allergy; however, the long-term influences on health are less described. In this study, we established a mouse model by subjecting neonatal mice to an amino acid diet (AAD) to mimic the feeding regimen of infants on AAF. Surprisingly, AAD-fed mice exhibited dysbiotic microbiota and increased neuronal activity in both the intestine and brain, as well as gastrointestinal peristalsis disorders and depressive-like behavior. Furthermore, fecal microbiota transplantation from AAD-fed mice or AAF-fed infants to recipient mice led to elevated neuronal activations and exacerbated depressive-like behaviors compared to that from normal chow-fed mice or cow's-milk-formula-fed infants, respectively. Our findings highlight the necessity to avoid the excessive use of AAF, which may influence the neuronal development and mental health of children.

    An improved joint non-negative matrix factorization for identifying surgical treatment timing of neonatal necrotizing enterocolitis

    Neonatal necrotizing enterocolitis is a severe neonatal intestinal disease. Timely identification of surgical indications is essential for newborns in order to find the best time for treatment and improve prognosis. This paper attempts to establish an algorithm model based on multimodal clinical data to determine the features of surgical indications and construct an auxiliary diagnosis model. The proposed algorithm adds hypergraph constraints on the two data modalities to Joint Nonnegative Matrix Factorization (JNMF), aiming to mine the higher-order correlations between the features of the two modalities. In addition, the adjacency matrix of the two kinds of data is used as a network regularization constraint to prevent overfitting. Orthogonality and L1-norm regularization terms were introduced to avoid feature redundancy and perform feature selection, respectively, yielding 14 clinical features. Finally, we used three classifiers (random forest, support vector machine, and logistic regression) to perform binary classification of patients requiring surgery. The results show that when the features selected by the proposed algorithm model are classified by random forest, the area under the ROC curve is 0.8, indicating high prediction accuracy.

    6G Network AI Architecture for Everyone-Centric Customized Services

    Mobile communication standards were developed for enhancing transmission and network performance by using more radio resources and improving spectrum and energy efficiency. How to effectively address diverse user requirements and guarantee everyone's Quality of Experience (QoE) remains an open problem. The Sixth Generation (6G) mobile systems will solve this problem by utilizing heterogeneous network resources and pervasive intelligence to support everyone-centric customized services anywhere and anytime. In this article, we first coin the concept of Service Requirement Zone (SRZ) on the user side to characterize and visualize the integrated service requirements and preferences of specific tasks of individual users. On the system side, we further introduce the concept of User Satisfaction Ratio (USR) to evaluate the system's overall ability to satisfy a variety of tasks with different SRZs. Then, we propose a network Artificial Intelligence (AI) architecture with integrated network resources and pervasive AI capabilities for supporting customized services with guaranteed QoEs. Finally, extensive simulations show that the proposed network AI architecture can consistently offer a higher USR than the cloud AI and edge AI architectures across different task scheduling algorithms, random service requirements, and dynamic network conditions.

    Testing Security Properties of Protocol Implementations -- a Machine Learning Based Approach

    Security and reliability of network protocol implementations are essential for communication services. Most of the approaches for verifying security and reliability, such as formal validation and black-box testing, are limited to checking the specification or the conformance of an implementation. However, in practice, a protocol implementation may contain engineering details which are not included in the system specification but may result in security flaws. We propose a new learning-based approach to systematically and automatically test protocol implementation security properties. Protocols are specified using the Symbolic Parameterized Extended Finite State Machine (SP-EFSM) model, and an important security property – message confidentiality under the general Dolev-Yao attacker model – is investigated. The new testing approach applies black-box checking theory and a supervised learning algorithm to explore the structure of an implementation under test while simulating the teacher with a conformance test generation scheme. We present the testing procedure, analyze its complexity, and report experimental results.

    Network Protocol System Fingerprinting – A Formal Approach

    Network protocol system fingerprinting has been recognized as an important issue and a major threat to network security. Prevalent works rely largely on human experience and insight into the protocol system specifications and implementations. Such ad hoc approaches are inadequate for dealing with large, complex protocol systems. In this paper we propose a formal approach for automated protocol system fingerprinting analysis and experiment. A Parameterized Extended Finite State Machine is used to model protocol systems, and four categories of fingerprinting problems are formally defined. We propose and analyze algorithms for both active and passive fingerprinting and present our experimental results on Internet protocols. Furthermore, we investigate protection techniques against malicious fingerprinting and discuss the feasibility of two defense schemes, based on the protocol and application scenarios.