39 research outputs found
HGSMDA: miRNA–disease Association Prediction Based on HyperGCN and Sørensen-Dice Loss
Biological research has demonstrated the significance of identifying miRNA–disease associations in the context of disease prevention, diagnosis, and treatment. However, the utilization of experimental approaches involving biological subjects to infer these associations is both costly and inefficient. Consequently, there is a pressing need to devise novel approaches that offer enhanced accuracy and effectiveness. Presently, the predominant methods employed for predicting disease associations rely on Graph Convolutional Network (GCN) techniques. However, the GCN algorithm aggregates locally: at each layer it incorporates information only from the immediate neighbors of a given node. Consequently, GCN cannot simultaneously aggregate information from multiple nodes. This constraint significantly impacts the predictive efficacy of the model. To tackle this problem, we propose a novel approach, based on HyperGCN and Sørensen-Dice loss (HGSMDA), for predicting associations between miRNAs and diseases. In the initial phase, we developed multiple networks to represent the similarity between miRNAs and diseases and employed GCNs to extract information from diverse perspectives. Subsequently, we introduced HyperGCN to construct a miRNA–disease heteromorphic hypergraph using hypernodes and trained GCN on the graph to aggregate information. Finally, we utilized the Sørensen-Dice loss function to evaluate the degree of similarity between the predicted outcomes and the ground truth values, thereby enabling the prediction of associations between miRNAs and diseases. To assess the soundness of our methodology, an extensive series of experiments was conducted employing the Human MicroRNA Disease Database (HMDD v3.2) as the dataset. The experimental outcomes indicate that HGSMDA exhibits remarkable efficacy when compared to alternative methodologies. Furthermore, the predictive capacity of HGSMDA was corroborated through a case study focused on colon cancer. These findings strongly imply that HGSMDA represents a dependable and valid framework, thereby offering a novel avenue for investigating the intricate association between miRNAs and diseases
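The abstract above names the Sørensen-Dice loss but does not give its exact form. As a minimal sketch, assuming the common "soft" formulation over predicted association scores in [0, 1] and binary ground-truth labels, it might look as follows. The function name `dice_loss` and the smoothing constant `eps` are illustrative assumptions, not taken from the HGSMDA paper.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Sørensen-Dice loss: 1 - 2*sum(p*t) / (sum(p) + sum(t)).

    pred   -- predicted association scores in [0, 1] (hypothetical input)
    target -- ground-truth labels, 1 for a known association, else 0
    eps    -- small smoothing term (an assumption) that avoids 0/0
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# Perfect overlap gives a loss near 0; disjoint predictions give a loss near 1.
perfect = dice_loss([1.0, 0.0, 1.0], [1, 0, 1])
disjoint = dice_loss([1.0, 1.0, 0.0, 0.0], [0, 0, 1, 1])
```

Because the loss depends on the overlap between predicted and true positives rather than on per-element accuracy, it is often chosen when positives are rare, as is typical of association matrices.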
Effects of Astronomical Cycles on Laminated Shales of the Paleogene Shahejie Formation in the Dongying Sag, Bohai Bay Basin, China
Laminated shales are widely developed in the Dongying Sag and have attracted much attention as an oil reservoir. Macroscopically, these shales generally have multi-scale cyclicity, which is closely related to the development of laminae. Therefore, analyzing the origin of their cyclicity is helpful to understanding the formation mechanism of laminated shales and the vertical heterogeneity of shale reservoirs, which are of great significance for continental shale oil exploration and development. In this study, a gamma ray (GR) logging series, high-resolution elemental geochemical data, high-resolution core scanning photos and grayscale data, and mineralogical data were used to characterize the cyclicity of shale at different scales, and their relationship with different astronomical cycles was discussed. The results show that the Es3L and Es4U shale in the Dongying Sag has cyclicity from the meter-scale to the ten-meter scale and then to the hundred-meter scale, which is mainly manifested by periodic changes in organic matter abundance, mineral composition, element abundance, and grayscale. These cycles of different scales coincide with different astronomical periods. Specifically, the hundred-meter scale cyclicity is mainly controlled by the very long orbital period; the ten-meter scale cyclicity is mainly related to the eccentricity cycle; while the precession period is the main driver of the meter-scale cyclicity. Finally, we propose a simplified model for illustrating the formation of rhythmic organic-rich shale. This study is helpful to understanding the origin of continental organic-rich shale and predicting shale reservoir properties
Effects of Multi-Omics Characteristics on Identification of Driver Genes Using Machine Learning Algorithms
Cancer is a complex disease caused by genomic and epigenetic alterations; hence, identifying meaningful cancer drivers is an important and challenging task. Most studies have detected cancer drivers using mutational traits, while few consider multiple omics characteristics as important factors. In this study, we present a framework to analyze the effects of multi-omics characteristics on the identification of driver genes. We utilize four machine learning algorithms within this framework to detect cancer driver genes in pan-cancer data, comprising 75 characteristics for 19,636 genes. The 75 features are divided into four types and analyzed using Kullback–Leibler divergence based on CGC genes and non-CGC genes. We detect cancer driver genes in two different ways: one uses a single feature type, while the other uses the top N features. The first analysis indicates that the mutational features are the best characteristics. The second analysis reveals that the top 45 features form the most effective feature combination and are superior to the mutational features alone. The top 45 features contain not only mutational features but also the three other types of features. Therefore, our study extends the detection of cancer driver genes and provides a more comprehensive understanding of cancer mechanisms
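The abstract above ranks features by Kullback–Leibler divergence between CGC and non-CGC genes without stating how the distributions were estimated. A minimal sketch, assuming each feature has already been binned into two discrete probability distributions of equal length, could be written as below. The function name, the `eps` smoothing, and the example histograms are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(P || Q) in nats.

    p, q -- probability vectors over the same bins (each sums to 1)
    eps  -- smoothing term (an assumption) guarding against log(0)
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0  # bins with zero mass in P contribute nothing
    )


# Hypothetical binned distributions of one feature in the two gene groups.
cgc_hist = [0.1, 0.2, 0.7]
non_cgc_hist = [0.6, 0.3, 0.1]
score = kl_divergence(cgc_hist, non_cgc_hist)
```

A larger value suggests that the feature separates the two gene groups more strongly; identical distributions give a divergence of zero.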
Epileptic Seizure Detection Based on Variational Mode Decomposition and Deep Forest Using EEG Signals
Electroencephalography (EEG) records the electrical activity of the brain and is an important tool for the automatic detection of epileptic seizures. Recognizing epileptic EEG purely by manual inspection is a heavy burden, so computer-assisted methods are of great importance. This paper presents a seizure detection algorithm based on variational modal decomposition (VMD) and a deep forest (DF) model. Variational modal decomposition is performed on EEG recordings, and the first three variational modal functions (VMFs) are selected to construct the time–frequency distribution of the EEG signals. Then, the log-Euclidean covariance matrix (LECM) is computed to represent the EEG properties and form EEG features. The deep forest model, a non-neural-network deep model with a cascade structure that performs feature learning through the forest, is applied to classify the EEG signals. In addition, to improve the classification accuracy, postprocessing by moving average filtering and adaptive collar expansion is applied to generate the final discriminant results. The algorithm was evaluated on the Bonn EEG dataset and the Freiburg long-term EEG dataset; on the former it achieved a sensitivity and specificity of 99.32% and 99.31%, respectively. The mean sensitivity and specificity of this method for the 21 patients in the Freiburg dataset were 95.2% and 98.56%, respectively, with a false detection rate of 0.36/h. These results demonstrate the strong performance of our algorithm and indicate its research potential in epilepsy detection
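The postprocessing and evaluation steps mentioned in the abstract above can be illustrated with a small sketch. It assumes per-epoch binary classifier outputs, a centered moving average with a hypothetical window of 3 epochs, and a hypothetical decision threshold of 0.5. Adaptive collar expansion is not shown, and none of these parameters come from the paper.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the sequence edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def sensitivity_specificity(pred, truth):
    """Return (sensitivity, specificity) for binary label sequences."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


# Hypothetical per-epoch outputs: one isolated false alarm at index 2.
raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 0]
truth = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
smoothed = [1 if v > 0.5 else 0 for v in moving_average(raw, window=3)]
before = sensitivity_specificity(raw, truth)
after = sensitivity_specificity(smoothed, truth)
```

In this toy sequence the smoothing removes the isolated false alarm while keeping the sustained seizure segment, which is the kind of effect such filtering is typically meant to achieve.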
epiACO - a method for identifying epistasis based on ant Colony optimization algorithm
Abstract Background: Identifying epistasis or epistatic interactions, which refer to nonlinear interaction effects of single nucleotide polymorphisms (SNPs), is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Although much work has been done on identifying epistatic interactions, algorithmic development is still ongoing because of the methodological and computational challenges involved. Results: In this study, a method, epiACO, based on the ant colony optimization algorithm, is proposed to identify epistatic interactions. Highlights of epiACO are the introduced fitness function Svalue, path selection strategies, and a memory-based strategy. The Svalue leverages the advantages of both mutual information and Bayesian network to effectively and efficiently measure associations between SNP combinations and the phenotype. Two path selection strategies, i.e., a probabilistic path selection strategy and a stochastic path selection strategy, are provided to adaptively guide ant behaviors of exploration and exploitation. The memory-based strategy is designed to retain candidate solutions found in previous iterations and compare them to solutions of the current iteration to generate new candidate solutions, yielding a more accurate way of identifying epistasis. Conclusions: Experiments with epiACO and comparisons with other recent methods, epiMODE, TEAM, BOOST, SNPRuler, AntEpiSeeker, AntMiner, MACOED, and IACO, are performed on both simulation data sets and a real data set of age-related macular degeneration. Results show that epiACO is promising in identifying epistasis and might be an alternative to existing methods
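The abstract above says the Svalue combines mutual information with a Bayesian network score but does not define it. The sketch below shows only the mutual-information component, under the assumption that genotypes and phenotypes are discrete labels; it is not the Svalue itself, and the toy data are hypothetical. It illustrates why SNP combinations must be scored jointly: in a pure XOR-style interaction, each SNP alone carries no information about the phenotype.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Toy XOR interaction: the phenotype is 1 only when the two genotypes differ.
snp1 = [0, 0, 1, 1]
snp2 = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

single = mutual_information(snp1, phenotype)                 # marginal effect
joint = mutual_information(list(zip(snp1, snp2)), phenotype)  # combined effect
```

Here the single-SNP score is zero while the two-SNP combination recovers the full one bit of phenotype information, which is the signature of an epistatic interaction without marginal effects.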
Integrated analysis of C3AR1 and CD163 associated with immune infiltration in intracranial aneurysms pathogenesis
Background: To identify potential immune-related biomarkers, molecular mechanisms, and therapeutic agents of intracranial aneurysms (IAs). Methods: We identified the differentially expressed genes (DEGs) between IAs and control samples from the GSE75436, GSE26969, GSE6551, and GSE13353 datasets. We used weighted gene co-expression network analysis (WGCNA) and protein–protein interaction (PPI) analysis to identify immune-related hub genes. We evaluated the expression of hub genes using qRT-PCR analysis. Using the miRNet, NetworkAnalyst, and DGIdb databases, we analyzed the regulatory networks and potential therapeutic agents targeting hub genes. Least absolute shrinkage and selection operator (LASSO) logistic regression was performed to identify optimal biomarkers among hub genes. The diagnostic value was validated with the external GSE15629 dataset. Results: We identified 227 DEGs and 22 differentially infiltrating immune cells between IAs and control samples from the GSE75436, GSE26969, GSE6551, and GSE13353 datasets. We further identified 41 differentially expressed immune-related genes (DEIRGs), which were primarily enriched in the chemokine-mediated signaling pathway, myeloid leukocyte migration, endocytic vesicle membrane, chemokine receptor binding, chemokine activity, and viral protein interactions with cytokines and their receptors. Among the 41 DEIRGs, 10 hub genes, including C3AR1, CD163, CCL4, CXCL8, CCL3, TLR2, TYROBP, C1QB, FCGR3A, and FCGR1A, were identified with good diagnostic values (AUC > 0.7). Hsa-mir-27a-3p and transcription factors, including YY1 and GATA2, were identified as the primary regulators of hub genes. In total, 92 potential therapeutic agents targeting hub genes were predicted. C3AR1 and CD163 were finally identified as the best diagnostic biomarkers using LASSO logistic regression (AUC = 0.994). The diagnostic value of C3AR1 and CD163 was validated with the external GSE15629 dataset (AUC = 0.914). Conclusions: This study revealed the importance of C3AR1 and CD163 in immune infiltration in IA pathogenesis. Our findings provide a valuable reference for subsequent research on the molecular mechanisms of IAs and potential targets for intervention
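The diagnostic values in the abstract above are reported as AUC. As a brief illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as half. The function name and the example scores are hypothetical and are not data from the study.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical biomarker scores for aneurysm cases and control samples.
separated = auc([0.9, 0.8], [0.1, 0.2])
overlapping = auc([0.9, 0.4], [0.5, 0.1])
```

An AUC of 1.0 means every case outranks every control, while 0.5 corresponds to chance-level ranking.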