sj-pdf-1-smm-10.1177_09622802231211010 - Supplemental material for Clustering minimal inhibitory concentration data through Bayesian mixture models: An application to Mycobacterium tuberculosis
Supplemental material, sj-pdf-1-smm-10.1177_09622802231211010 for Clustering minimal inhibitory concentration data through Bayesian mixture models: An application to Mycobacterium tuberculosis
Tractable skew-normal approximations via matching
Many approximate Bayesian inference methods assume a particular parametric form for approximating the posterior distribution. A Gaussian distribution provides a convenient density for such approaches; examples include the Laplace, penalized quasi-likelihood, Gaussian variational, and expectation propagation methods. Unfortunately, these all ignore potential posterior skewness. The recent work of Durante et al. [Skewed Bernstein-von Mises theorem and skew-modal approximations; 2023. ArXiv preprint arXiv:2301.03038.] addresses this using skew-modal (SM) approximations, and is theoretically justified by a skewed Bernstein-von Mises theorem. However, the SM approximation can be impractical to work with in terms of tractability and storage costs, and uses only local posterior information. We introduce a variety of matching-based approximation schemes using the standard skew-normal distribution to resolve these issues. Experiments were conducted to compare the performance of this skew-normal matching method (both as a standalone approximation and as a post-hoc skewness adjustment) with the SM and existing Gaussian approximations. We show that for small and moderate dimensions, skew-normal matching can be much more accurate than these other approaches. For post-hoc skewness adjustments, this comes at very little cost in additional computational time.
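As a rough illustration of what a matching-based scheme can look like, the sketch below fits a univariate skew-normal distribution by matching its mean, variance, and skewness to target values. This is ordinary method-of-moments matching in one dimension, chosen here only as an example; the abstract does not say which matching schemes the authors use, and the function names are hypothetical.

```python
import math

# Skewness of a skew-normal is bounded by roughly 0.9953 in absolute value,
# so targets outside that range are clipped (an assumption of this sketch).
MAX_SKEW = 0.995


def skew_normal_moments(xi, omega, alpha):
    """Mean, variance and skewness of SN(location xi, scale omega, shape alpha)."""
    delta = alpha / math.sqrt(1.0 + alpha ** 2)
    m = delta * math.sqrt(2.0 / math.pi)
    mean = xi + omega * m
    var = omega ** 2 * (1.0 - m ** 2)
    skew = (4.0 - math.pi) / 2.0 * m ** 3 / (1.0 - m ** 2) ** 1.5
    return mean, var, skew


def skew_normal_from_moments(mean, var, skew):
    """Return (xi, omega, alpha) whose first three moments match the targets."""
    g = max(-MAX_SKEW, min(MAX_SKEW, skew))
    g23 = abs(g) ** (2.0 / 3.0)
    c23 = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    # Invert the skewness formula for delta, keeping the sign of the target skewness.
    delta = math.copysign(math.sqrt((math.pi / 2.0) * g23 / (g23 + c23)), g)
    omega = math.sqrt(var / (1.0 - 2.0 * delta ** 2 / math.pi))
    xi = mean - omega * delta * math.sqrt(2.0 / math.pi)
    alpha = delta / math.sqrt(1.0 - delta ** 2)
    return xi, omega, alpha
```

Applied as a post-hoc adjustment, the target mean and variance could come from an existing Gaussian approximation and the skewness from a separate estimate; with a target skewness of zero the fit reduces to the Gaussian itself (alpha = 0).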
Opinion Mining by Convolutional Neural Networks for Maximizing Discoverability of Nanomaterials
The scientific literature contains valuable information that can be used for future applications, but manual analysis presents challenges due to its size and disciplinary boundaries. The prevailing solution involves natural language processing (NLP) techniques such as information retrieval. Nonetheless, existing automated systems primarily provide either statistically based shallow information or deep information without traceability, thereby falling short of delivering high-quality and reliable insights. To address this, we propose an innovative approach of leveraging sentiment information embedded within the literature to track the opinions toward materials. In this study, we integrated material knowledge into text representation and constructed opinion data sets to hierarchically train deep learning models, named as Scientific Sentiment Network (SSNet). SSNet can effectively extract knowledge from the energy material literature and accurately categorize expert opinions into challenges and opportunities (94% and 92% accuracy, respectively). By incorporating sentiment features determined by SSNet, we can predict the ranking of emerging thermoelectric materials with a 70% correlation to experimental outcomes. Furthermore, our model achieves a commendable 68% accuracy in predicting suitable nanomaterials for atomic layer deposition (ALD) over time. These promising results offer a practical framework to extract and synthesize knowledge from the scientific literature, thereby accelerating research in the field of nanomaterials.
Manhattan plots of regions containing oligopeptide variants associated with MIC across 13 drugs.
Significant oligopeptides are coloured by the direction (orange = increase, blue = decrease) and magnitude of their effect size on MIC, estimated by LMM [32]. Bonferroni-corrected significance thresholds are shown by the black dashed lines. The top 20 genes ranked by their most significant oligopeptides are annotated alphabetically. Gene names separated by colons indicate intergenic regions. Gene names for those annotated with letters can be found in Table 1. Oligopeptides were aligned to the H37Rv reference; unaligned oligopeptides are plotted to the right in light grey. LMM, linear mixed model; MIC, minimum inhibitory concentration.
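For readers unfamiliar with how such a dashed threshold line is placed, a Bonferroni correction divides the family-wise significance level by the number of tests, and a Manhattan plot draws the result on the −log10 scale. The sketch below assumes a significance level of 0.05 and a made-up number of tests; neither value comes from the caption, and the function names are hypothetical.

```python
import math


def bonferroni_neglog10_threshold(n_tests, alpha=0.05):
    """Height of the significance line on a -log10(p) axis.

    alpha is the family-wise error rate (0.05 is an assumed default).
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return -math.log10(alpha / n_tests)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected cutoff."""
    cutoff = alpha / len(p_values)
    return [p <= cutoff for p in p_values]
```

With one million tests, for example, the cutoff is 5e-8, so the line sits at about 7.3 on the −log10 axis.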
Effect size (beta) estimates and −log10 p-values for all significant oligopeptide variants for each drug, AMI, BDQ, CFZ, DLM, EMB, ETH, INH, KAN, LEV, LZD, MXF, RFB, and RIF.
For many of the drugs, the most significant oligopeptides were associated with lower MIC. AMI, amikacin; BDQ, bedaquiline; CFZ, clofazimine; DLM, delamanid; EMB, ethambutol; ETH, ethionamide; INH, isoniazid; KAN, kanamycin; LEV, levofloxacin; LZD, linezolid; MIC, minimum inhibitory concentration; MXF, moxifloxacin; RFB, rifabutin; RIF, rifampicin. (PDF)
Variants in spoU associated with EMB and RIF MIC.
Manhattan plots showing the oligopeptide association results for the spoU coding region for (A) ethambutol and (B) rifampicin, and oligonucleotide alignment plots showing close-ups of the significant region just downstream of spoU for (C) ethambutol and (D) rifampicin. The black dashed lines indicate the Bonferroni-corrected significance thresholds. In the Manhattan plots, oligopeptides are coloured by the reading frame that they align to, with black for the correct reading frame for spoU. Oligopeptides that were assigned to the region but did not align using BLAST are shown in grey on the right-hand side of the plots. In the oligonucleotide alignment plots, the H37Rv reference codons are shown at the bottom of the figure, grey for an invariant site and coloured at variant site positions. The oligonucleotides that aligned to the region are plotted from least significant at the bottom to most significant at the top. The background colour of the oligonucleotides represents the direction of the b estimate, light grey when b < 0 (associated with lower MIC), dark grey when b > 0 (associated with higher MIC). Oligonucleotides are coloured by their amino acid residue at all variant positions. Oligonucleotides below the MAF threshold and not included in the analysis, but visualised here for signal interpretation, are marked with asterisks. The spoU stop codon is highlighted in red in the alignment plots. EMB, ethambutol; MAF, minor allele frequency; MIC, minimum inhibitory concentration; RIF, rifampicin. (PDF)
The interpretation of oligopeptides and oligonucleotides required manual curation to determine the underlying variants they tagged; the most significant oligopeptide or oligonucleotide for each allele captured by the significant signals is described here.
(XLSX)
Significant oligopeptide (rpoB, katG, gyrA, embB) and oligonucleotide (rrs) effect size (beta) estimates for known resistance genes plus the flanking 33 amino acids (oligopeptides) or 100 bases (oligonucleotides).
On the left, the beta estimates are shown for all significant oligopeptides for the drugs for which the gene is causal; on the right, the beta estimates are shown for the same gene, but for the drugs with which it is artefactually associated. For many drugs, the beta estimate is lower when the gene is significant due to artefactual cross-resistance. Drug name abbreviations are as follows: AMI, amikacin; BDQ, bedaquiline; CFZ, clofazimine; DLM, delamanid; EMB, ethambutol; ETH, ethionamide; INH, isoniazid; KAN, kanamycin; LEV, levofloxacin; LZD, linezolid; MXF, moxifloxacin; RFB, rifabutin; RIF, rifampicin. (PDF)
Fig 1 -
(A) Phylogeny of 10,228 isolates sampled globally by CRyPTIC used in the GWAS analyses. Lineages are coloured blue (lineage 1), green (2), orange (3), and yellow (4). Branch lengths have been square root transformed to visualise the detail at the tips. (B) Distributions of the log2 MIC measurements for all 13 drugs in the GWAS analyses, AMI, BDQ, CFZ, DLM, EMB, ETH, INH, KAN, LEV, LZD, MXF, RFB, and RIF. The red line indicates the ECOFF breakpoint for binary resistance versus sensitivity calls [31]. AMI, amikacin; BDQ, bedaquiline; CFZ, clofazimine; DLM, delamanid; ECOFF, epidemiological cutoff; EMB, ethambutol; ETH, ethionamide; INH, isoniazid; KAN, kanamycin; LEV, levofloxacin; LZD, linezolid; MIC, minimum inhibitory concentration; MXF, moxifloxacin; RFB, rifabutin; RIF, rifampicin.
QQ plots for the oligopeptide analyses, part B.
Comparing the empirical distribution of p-values to the expected distribution under the null hypothesis for KAN, LEV, LZD, MXF, RFB, and RIF. Oligopeptides in the orange (MAF (PDF)
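The comparison a QQ plot makes can be sketched in a few lines: under the null hypothesis p-values are uniform on (0, 1), so the i-th smallest of n p-values is expected to be near i/(n+1). The plotting position i/(n+1) is one common convention and an assumption here, as is the hypothetical function name; the caption does not say how the expected values were computed.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 p-value pairs for a QQ plot.

    Expected values use the plotting position i / (n + 1) for the i-th
    smallest p-value, which is what a uniform null distribution predicts.
    """
    n = len(p_values)
    observed = sorted(p_values)
    return [
        (-math.log10((i + 1) / (n + 1)), -math.log10(p))
        for i, p in enumerate(observed)
    ]
```

Points that lie along the diagonal are consistent with the null, while points rising above it at the right-hand end indicate more small p-values than chance would produce.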
