21 research outputs found

    Ensemble of ensembles for fine particulate matter pollution prediction using big data analytics and IoT emission sensors

    © 2023, Emerald Publishing Limited. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1108/JEDT-07-2022-0379
    Purpose: The study aims to develop a multilayer, highly effective ensemble-of-ensembles predictive model (a stacking ensemble) from several hyperparameter-optimized ensemble machine learning (ML) methods (bagging and boosting ensembles), trained on high-volume data retrieved from Internet of Things (IoT) emission sensors together with time-matched meteorology and traffic data.
    Design/methodology/approach: First, the study tested the big data hypothesis by developing sample ensemble predictive models on different data sample sizes and comparing their results. Second, it developed a standalone model and several bagging and boosting ensemble models and compared their results. Finally, it used the best-performing bagging and boosting models as input estimators to develop a novel multilayer stacking ensemble predictive model.
    Findings: The results showed data size to be one of the main determinants of ensemble ML predictive power. Second, they showed that the cumulative result from ensemble ML algorithms is usually better, in terms of predictive accuracy, than the result from a single algorithm. Finally, they showed the stacking ensemble to be a better model for predicting PM2.5 concentration levels than the bagging and boosting ensemble models.
    Research limitations/implications: A limitation of this study is the trade-off between the performance of this novel model and the computational time required to train it. Whether this gap can be closed remains an open research question that future research should address. Future studies can also integrate this novel model into a personal air quality messaging system to inform the public of pollution levels and improve public access to air quality forecasts.
    Practical implications: When integrated into an air pollution monitoring system, the outcome of this study will help the public proactively identify highly polluted areas, potentially reducing pollution-associated or pollution-triggered deaths, complications and transmission of COVID-19 (and other lung diseases) by encouraging avoidance behavior and supporting informed lockdown decisions by government bodies.
    Originality/value: This study fills a gap in the literature by providing a justification for selecting appropriate ensemble ML algorithms for PM2.5 concentration level predictive modeling. Second, it contributes to the big data hypothesis, which suggests that data size is one of the most important factors in ML predictive capability. Third, it supports the premise that the cumulative output of ensemble ML algorithms is usually better, in terms of predictive accuracy, than that of a single algorithm. Finally, it develops a novel multilayer, high-performing, hyperparameter-optimized ensemble-of-ensembles predictive model that can accurately predict PM2.5 concentration levels with improved model interpretability and enhanced generalizability, and it provides a novel databank of historic pollution data from IoT emission sensors that can be purchased for research, consultancy and policymaking. Peer reviewed
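
The stacking step described in the abstract above (bagging and boosting models used as input estimators to a final model) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, synthetic regression data in place of the sensor, meteorology and traffic data, a random forest as the bagging-style model, gradient boosting as the boosting model, and a ridge regressor as the final estimator. None of these choices or hyperparameters come from the paper.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real feature table (assumption, not the paper's data).
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Level 0: one bagging-style and one boosting ensemble (illustrative choices).
base_estimators = [
    ("bagging_rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("boosting_gb", GradientBoostingRegressor(n_estimators=50, random_state=0)),
]

# Level 1: a simple meta-learner fitted on cross-validated base predictions.
stack = StackingRegressor(estimators=base_estimators, final_estimator=Ridge(), cv=3)
stack.fit(X_train, y_train)

r2 = stack.score(X_test, y_test)  # coefficient of determination on held-out data
print(f"held-out R^2: {r2:.3f}")
```

The meta-learner is fitted on out-of-fold predictions (`cv=3`), so the final estimator does not see base-model outputs for rows those models were trained on.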

    A hybrid energy-based and AI-based screening approach for the discovery of novel inhibitors of JAK3

    The JAK protein family is composed of four isoforms, and JAK3 has been regarded as a druggable target for the development of drugs to treat various diseases, including hematologic tumors, cancer, and neuronal death. Therefore, the discovery of JAK3 inhibitors with novel scaffolds has the potential to provide additional options for drug development. This article presents a structure-based hybrid high-throughput virtual screening (HTVS) protocol as well as the DeepDock algorithm, which is based on geometric deep learning. These techniques were used to identify inhibitors of JAK3 with a novel sketch from a specific "In-house" database. Using molecular docking with varying precision, MM/GBSA, geometric deep learning scoring, and manual selection, 10 compounds were obtained for subsequent biological evaluation. One of these 10 compounds, compound 8, was found to have inhibitory potency against JAK3 and the MOLM-16 cell line, providing a valuable lead compound for further development of JAK3 inhibitors. To better understand the interaction between compound 8 and JAK3, molecular dynamics (MD) simulations were conducted to provide more detail on the binding conformation of compound 8 with JAK3 and to guide subsequent structure optimization. In this article, we obtained compound 8, which has a novel sketch and inhibitory bioactivity against JAK3, and which provides an acceptable "hit" for further structure optimization and modification to develop JAK3 inhibitors.

    GWAS Analysis and QTL Identification of Fiber Quality Traits and Yield Components in Upland Cotton Using Enriched High-Density SNP Markers

    It is of great importance to identify quantitative trait loci (QTL) controlling fiber quality traits and yield components for future marker-assisted selection (MAS) and candidate gene function identifications. In this study, two kinds of traits in 231 F6:8 recombinant inbred lines (RILs), derived from an intraspecific cross between Xinluzao24, a cultivar with elite fiber quality, and Lumianyan28, a cultivar with wide adaptability and high yield potential, were measured in nine environments. This RIL population was genotyped by 122 SSR and 4729 SNP markers, which were also used to construct the genetic map. The map covered 2477.99 cM of hirsutum genome, with an average marker interval of 0.51 cM between adjacent markers. As a result, a total of 134 QTLs for fiber quality traits and 122 QTLs for yield components were detected, with 2.18–24.45 and 1.68–28.27% proportions of the phenotypic variance explained by each QTL, respectively. Among these QTLs, 57 were detected in at least two environments, named stable QTLs. A total of 209 and 139 quantitative trait nucleotides (QTNs) were associated with fiber quality traits and yield components by four multilocus genome-wide association studies methods, respectively. Among these QTNs, 74 were detected by at least two algorithms or in two environments. The candidate genes harbored by 57 stable QTLs were compared with the ones associated with QTN, and 35 common candidate genes were found. Among these common candidate genes, four were possibly "pleiotropic." This study provided important information for MAS and candidate gene functional studies.
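
As a quick consistency check on the map statistics quoted in the abstract above, the reported average interval can be reproduced from the marker counts and map length. This sketch assumes that all 122 SSR and 4729 SNP markers sit on the map and that the average is simply map length divided by marker count; the paper may instead average over intervals within each linkage group, which gives nearly the same value here.

```python
# Figures taken from the abstract; the simple division is an assumption.
ssr_markers = 122
snp_markers = 4729
map_length_cm = 2477.99

total_markers = ssr_markers + snp_markers
avg_interval_cm = map_length_cm / total_markers

print(f"{total_markers} markers, {avg_interval_cm:.2f} cM average interval")
```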

    Molecular Modeling Studies of 11β-Hydroxysteroid Dehydrogenase Type 1 Inhibitors through Receptor-Based 3D-QSAR and Molecular Dynamics Simulations

    11β-Hydroxysteroid dehydrogenase type 1 (11β-HSD1) is a potential target for the treatment of numerous human disorders, such as diabetes, obesity, and metabolic syndrome. In this work, molecular modeling studies combining molecular docking, 3D-QSAR, MESP, MD simulations and free energy calculations were performed on pyridine amides and 1,2,4-triazolopyridines as 11β-HSD1 inhibitors to explore structure-activity relationships and the structural requirements for inhibitory activity. 3D-QSAR models, including CoMFA and CoMSIA, were developed from the conformations obtained by the docking strategy. The derived pharmacophoric features were further supported by MESP and Mulliken charge analyses using density functional theory. In addition, MD simulations and free energy calculations were employed to determine the detailed binding process and to compare the binding modes of inhibitors with different bioactivities. The binding free energies calculated by MM/PBSA showed a good correlation with the experimental biological activities. Free energy analyses and per-residue energy decomposition indicated that the van der Waals interaction is the major driving force for the interactions between an inhibitor and 11β-HSD1. Taken together, these results suggest that hydrogen bond interactions with Ser170 and Tyr183 are favorable for enhancing activity. Thr124, Ser170, Tyr177, Tyr183, Val227, and Val231 are the key amino acid residues in the binding pocket. The obtained results are expected to be valuable for the rational design of novel potent 11β-HSD1 inhibitors.

    The specific cleavage of lactone linkage to open-loop in cyclic lipopeptide during negative ESI tandem mass spectrometry: the hydrogen bond interaction effect of 4-ethyl guaiacol.

    Mass spectrometry is a valuable tool for the analysis and identification of chemical compounds, particularly proteins and peptides. Lichenysins G, the major cyclic lipopeptide of lichenysin, and the non-covalent complex of lichenysins G and 4-ethylguaiacol were investigated with negative ion ESI tandem mass spectrometry. The different fragmentation mechanisms for these compounds were investigated. Our study shows that 4-ethylguaiacol forms a hydrogen bond with the carbonyl oxygen of the ester group in the loop of lichenysins G. With the help of this hydrogen bond interaction, the ring structure preferentially opens at the lactone linkage rather than at the O-C bond of the ester group to produce alcohol and ketene. Isothermal titration 1H-NMR analysis verified the hydrogen bond and determined the ratio of subject to ligand in the non-covalent complex to be 1:1. Theoretical calculations also suggest that the addition of the ligand can affect the energy of the transition structures (TS) during loop opening.

    The MS<sup>3</sup> fragmentation of 1016 Th ions of non-covalent complex (a) and lichenysins G (b).

    The MS<sup>3</sup> fragmentation of 1016 Th ions of non-covalent complex (a) and lichenysins G (b).