941 research outputs found

    CrabNet for explainable deep learning in materials science: Bridging the gap between academia and industry

    Despite recent breakthroughs in deep learning for materials informatics, there is a disparity between the popularity of these methods in academic research and their limited adoption in industry. A significant contributor to this “interpretability-adoption gap” is the prevalence of black-box models and the lack of built-in methods for model interpretation. While established methods for evaluating model performance exist, an intuitive understanding of the modeling and decision-making processes in models is nonetheless desired in many cases. In this work, we demonstrate several ways of incorporating model interpretability into the structure-agnostic Compositionally Restricted Attention-Based network, CrabNet. We show that CrabNet learns meaningful, material property-specific element representations based solely on the data with no additional supervision. These element representations can then be used to explore element identity, similarity, behavior, and interactions within different chemical environments. Chemical compounds can also be uniquely represented and examined to reveal clear structures and trends within the chemical space. Additionally, visualizations of the attention mechanism can be used in conjunction to further understand the modeling process, identify potential modeling or dataset errors, and hint at further chemical insights leading to a better understanding of the phenomena governing material properties. We feel confident that the interpretability methods introduced in this work for CrabNet will be of keen interest to materials informatics researchers and industrial practitioners alike. TU Berlin, Open-Access-Mittel - 202

    Compositionally restricted attention-based network for materials property predictions

    In this paper, we demonstrate an application of the Transformer self-attention mechanism in the context of materials science. Our network, the Compositionally Restricted Attention-Based network (CrabNet), explores the area of structure-agnostic materials property predictions when only a chemical formula is provided. Our results show that CrabNet’s performance matches or exceeds current best-practice methods on nearly all of 28 total benchmark datasets. We also demonstrate how CrabNet’s architecture lends itself to model interpretability by showing different visualization approaches that are made possible by its design. We feel confident that CrabNet and its attention-based framework will be of keen interest to future materials informatics researchers.

    BRCA2 polymorphic stop codon K3326X and the risk of breast, prostate, and ovarian cancers

    Background: The K3326X variant in BRCA2 (BRCA2*c.9976A>T; p.Lys3326*; rs11571833) has been found to be associated with small increased risks of breast cancer. However, it is not clear to what extent linkage disequilibrium with fully pathogenic mutations might account for this association. There is scant information about the effect of K3326X in other hormone-related cancers. Methods: Using weighted logistic regression, we analyzed data from the large iCOGS study including 76 637 cancer case patients and 83 796 control patients to estimate odds ratios (ORw) and 95% confidence intervals (CIs) for K3326X variant carriers in relation to breast, ovarian, and prostate cancer risks, with weights defined as probability of not having a pathogenic BRCA2 variant. Using Cox proportional hazards modeling, we also examined the associations of K3326X with breast and ovarian cancer risks among 7183 BRCA1 variant carriers. All statistical tests were two-sided. Results: The K3326X variant was associated with breast (ORw = 1.28, 95% CI = 1.17 to 1.40, P = 5.9 × 10⁻⁶) and invasive ovarian cancer (ORw = 1.26, 95% CI = 1.10 to 1.43, P = 3.8 × 10⁻³). These associations were stronger for serous ovarian cancer and for estrogen receptor–negative breast cancer (ORw = 1.46, 95% CI = 1.2 to 1.70, P = 3.4 × 10⁻⁵ and ORw = 1.50, 95% CI = 1.28 to 1.76, P = 4.1 × 10⁻⁵, respectively). For BRCA1 mutation carriers, there was a statistically significant inverse association of the K3326X variant with risk of ovarian cancer (HR = 0.43, 95% CI = 0.22 to 0.84, P = .013) but no association with breast cancer. No association with prostate cancer was observed. Conclusions: Our study provides evidence that the K3326X variant is associated with risk of developing breast and ovarian cancers independent of other pathogenic variants in BRCA2. Further studies are needed to determine the biological mechanism of action responsible for these associations.

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses the instrument's effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements. In our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.

    Evidence that breast cancer risk at the 2q35 locus is mediated through IGFBP5 regulation.

    GWAS have identified a breast cancer susceptibility locus on 2q35. Here we report the fine mapping of this locus using data from 101,943 subjects from 50 case-control studies. We genotype 276 SNPs using the 'iCOGS' genotyping array and impute genotypes for a further 1,284 using 1000 Genomes Project data. All but two strongly correlated SNPs (rs4442975 G/T and rs6721996 G/A) are excluded as candidate causal variants at odds against >100:1. The best functional candidate, rs4442975, is associated with oestrogen receptor positive (ER+) disease with an odds ratio (OR) in Europeans of 0.85 (95% confidence interval=0.84-0.87; P=1.7 × 10⁻⁴³) per t-allele. This SNP flanks a transcriptional enhancer that physically interacts with the promoter of IGFBP5 (encoding insulin-like growth factor-binding protein 5) and displays allele-specific gene expression, FOXA1 binding and chromatin looping. Evidence suggests that the g-allele confers increased breast cancer susceptibility through relative downregulation of IGFBP5, a gene with known roles in breast cell biology.

    Breast cancer risk variants at 6q25 display different phenotype associations and regulate ESR1, RMND1 and CCDC170.

    We analyzed 3,872 common genetic variants across the ESR1 locus (encoding estrogen receptor α) in 118,816 subjects from three international consortia. We found evidence for at least five independent causal variants, each associated with different phenotype sets, including estrogen receptor (ER+ or ER-) and human ERBB2 (HER2+ or HER2-) tumor subtypes, mammographic density and tumor grade. The best candidate causal variants for ER- tumors lie in four separate enhancer elements, and their risk alleles reduce expression of ESR1, RMND1 and CCDC170, whereas the risk alleles of the strongest candidates for the remaining independent causal variant disrupt a silencer element and putatively increase ESR1 and RMND1 expression. This is the author accepted manuscript. The final version is available from Nature Publishing Group via http://dx.doi.org/10.1038/ng.352

    Functional mechanisms underlying pleiotropic risk alleles at the 19p13.1 breast-ovarian cancer susceptibility locus

    A locus at 19p13 is associated with breast cancer (BC) and ovarian cancer (OC) risk. Here we analyse 438 SNPs in this region in 46,451 BC and 15,438 OC cases, 15,252 BRCA1 mutation carriers and 73,444 controls and identify 13 candidate causal SNPs associated with serous OC (P=9.2 × 10⁻²⁰), ER-negative BC (P=1.1 × 10⁻¹³), BRCA1-associated BC (P=7.7 × 10⁻¹⁶) and triple negative BC (P-diff=2 × 10⁻⁵). Genotype-gene expression associations are identified for candidate target genes ANKLE1 (P=2 × 10⁻³) and ABHD8 (P<2 × 10⁻³). Chromosome conformation capture identifies interactions between four candidate SNPs and ABHD8, and luciferase assays indicate six risk alleles increased transactivation of the ABHD8 promoter. Targeted deletion of a region containing risk SNP rs56069439 in a putative enhancer induces ANKLE1 downregulation; and mRNA stability assays indicate functional effects for an ANKLE1 3′-UTR SNP. Altogether, these data suggest that multiple SNPs at 19p13 regulate ABHD8 and perhaps ANKLE1 expression, and indicate common mechanisms underlying breast and ovarian cancer risk.

    The Physics of the B Factories

    This work is on the Physics of the B Factories. Part A of this book contains a brief description of the SLAC and KEK B Factories as well as their detectors, BaBar and Belle, and data taking related issues. Part B discusses tools and methods used by the experiments in order to obtain results. The results themselves can be found in Part C.

    Measurement of the top quark forward-backward production asymmetry and the anomalous chromoelectric and chromomagnetic moments in pp collisions at √s = 13 TeV

    The parton-level top quark ($t$) forward-backward asymmetry and the anomalous chromoelectric ($\hat{d}_t$) and chromomagnetic ($\hat{\mu}_t$) moments have been measured using LHC pp collisions at a center-of-mass energy of 13 TeV, collected in the CMS detector in a data sample corresponding to an integrated luminosity of 35.9 $\mathrm{fb}^{-1}$. The linearized variable $A_{\mathrm{FB}}^{(1)}$ is used to approximate the asymmetry. Candidate $t\bar{t}$ events decaying to a muon or electron and jets in final states with low and high Lorentz boosts are selected and reconstructed using a fit of the kinematic distributions of the decay products to those expected for $t\bar{t}$ final states. The values found for the parameters are $A_{\mathrm{FB}}^{(1)} = 0.048^{+0.095}_{-0.087}\,\mathrm{(stat)}\,^{+0.020}_{-0.029}\,\mathrm{(syst)}$ and $\hat{\mu}_t = -0.024^{+0.013}_{-0.009}\,\mathrm{(stat)}\,^{+0.016}_{-0.011}\,\mathrm{(syst)}$, and a limit is placed on the magnitude of $|\hat{d}_t| < 0.03$ at 95% confidence level.

    Measurement of b jet shapes in proton-proton collisions at √s = 5.02 TeV

    We present the first study of charged-hadron production associated with jets originating from b quarks in proton-proton collisions at a center-of-mass energy of 5.02 TeV. The data sample used in this study was collected with the CMS detector at the CERN LHC and corresponds to an integrated luminosity of 27.4 pb⁻¹. To characterize the jet substructure, the differential jet shapes, defined as the normalized transverse momentum distribution of charged hadrons as a function of angular distance from the jet axis, are measured for b jets. In addition to the jet shapes, the per-jet yields of charged particles associated with b jets are also quantified, again as a function of the angular distance with respect to the jet axis. Extracted jet shape and particle yield distributions for b jets are compared with results for inclusive jets, as well as with the predictions from the PYTHIA and HERWIG++ event generators.