6 research outputs found

    Parallel Decision Tree with Application to Water Quality Data Analysis

    Abstract. Decision trees are a popular classification technique in many applications, such as retail target marketing, fraud detection, and the design of telecommunication service plans. With the information explosion, existing classification algorithms are not adequate for large data sets. To address this problem, many researchers have tried to design efficient parallel classification algorithms. Based on the powerful parallel programming framework MapReduce, we propose a parallel ID3 classification algorithm (PID3 for short). As experimental data, we use water quality monitoring data from the Changjiang River, which contains 17 branches. Because the data are time series, we convert them to attribute data before applying the decision tree. The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
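The abstract does not describe how PID3 divides work between map and reduce steps. The following is a minimal single-process sketch of one common way to parallelize ID3's attribute selection, not the authors' implementation: a map step emits a count for each (attribute, value, label) triple, a reduce step sums them, and information gain is computed from the totals. The function names, the record layout, and the sample water-quality records are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def map_record(record, label_key):
    """Map step: emit ((attribute, value, label), 1) for each attribute of one record."""
    label = record[label_key]
    for attr, value in record.items():
        if attr != label_key:
            yield (attr, value, label), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def entropy(label_counts):
    """Shannon entropy (in bits) of a label-count mapping."""
    total = sum(label_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in label_counts.values() if c)


def best_attribute(counts):
    """Return the attribute with the highest information gain, plus all gains."""
    by_attr = defaultdict(lambda: defaultdict(Counter))
    for (attr, value, label), c in counts.items():
        by_attr[attr][value][label] += c

    gains = {}
    for attr, values in by_attr.items():
        # Class distribution over the whole data set, rebuilt from the counts.
        class_counts = Counter()
        for label_counts in values.values():
            class_counts.update(label_counts)
        total = sum(class_counts.values())
        # Weighted entropy remaining after splitting on this attribute.
        remainder = sum(sum(lc.values()) / total * entropy(lc)
                        for lc in values.values())
        gains[attr] = entropy(class_counts) - remainder
    return max(gains, key=gains.get), gains


# Hypothetical, already-discretized records (not data from the paper).
records = [
    {"ph": "normal", "oxygen": "high", "grade": "good"},
    {"ph": "high", "oxygen": "high", "grade": "good"},
    {"ph": "normal", "oxygen": "low", "grade": "poor"},
    {"ph": "high", "oxygen": "low", "grade": "poor"},
]

emitted = (pair for r in records for pair in map_record(r, "grade"))
counts = reduce_counts(emitted)
best, gains = best_attribute(counts)
```

In a real MapReduce job the map and reduce functions would run on separate workers over partitions of the data. Only the small table of counts has to reach the node that chooses the split, which is why this step parallelizes well.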

    Multi_CycGT: A Deep Learning-Based Multimodal Model for Predicting the Membrane Permeability of Cyclic Peptides

    Cyclic peptides are gaining attention for their strong binding affinity, low toxicity, and ability to target “undruggable” proteins; however, their therapeutic potential against intracellular targets is constrained by their limited membrane permeability, and testing this property in the laboratory is time-consuming and costly. Herein, we propose an innovative multimodal model called Multi_CycGT, which combines a graph convolutional network (GCN) and a transformer to extract one- and two-dimensional features for predicting cyclic peptide permeability. Extensive benchmarking experiments show that our Multi_CycGT model attains state-of-the-art performance, with an average accuracy of 0.8206 and an area under the curve of 0.8650, and demonstrates satisfactory generalization ability on several external data sets. To the best of our knowledge, this is the first deep learning-based attempt to predict the membrane permeability of cyclic peptides, which should help accelerate the design of cyclic peptide drugs in medicinal chemistry and chemical biology applications.