43 research outputs found

    Non-Negative Matrix Factorization with Auxiliary Information on Overlapping Groups

    Matrix factorization is useful for extracting the essential low-rank structure from a given matrix and has received increasing attention. A typical example is non-negative matrix factorization (NMF), a type of unsupervised learning that has been successfully applied to a variety of data, including documents, images and gene expression, whose values are usually non-negative. We propose a new NMF model that is trained using auxiliary information on overlapping groups. This setting is reasonable in many applications, a typical example being gene function estimation, where functional gene groups overlap heavily with each other. To estimate true groups from given overlapping groups efficiently, our model incorporates latent matrices with a regularization term using a mixed norm. This regularization term allows group-wise sparsity on the optimized low-rank structure. The latent matrices and other parameters are efficiently estimated by a block coordinate gradient descent method. We empirically evaluated the performance of our proposed model and algorithm from a variety of viewpoints, comparing it with four methods, including MMF for auxiliary graph information, using both synthetic and real-world document and gene expression data sets.
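    As a point of reference for the abstract above, the following is a minimal sketch of plain NMF using the classical multiplicative update rules for the squared Frobenius error. It is not the model proposed in the paper: it uses no auxiliary group information, no mixed-norm regularization, and no block coordinate gradient descent. The function names (`nmf`, `matmul`, `frobenius_error`) and the toy matrix `X` are illustrative assumptions.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix X (n x m) as W (n x rank) times H (rank x m).

    Uses multiplicative updates, which keep every entry of W and H non-negative.
    Returns W, H and the reconstruction error recorded before and after each iteration.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive random initialization (assumption: any positive start works).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(X, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frobenius_error(X, W, H))
    return W, H, errors


# Toy non-negative matrix of rank 2 (hypothetical data, for illustration only).
X = [[1.0, 0.0, 2.0],
     [2.0, 0.0, 4.0],
     [0.0, 3.0, 1.0],
     [0.0, 6.0, 2.0]]
W, H, errors = nmf(X, rank=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

    A model along the lines described in the abstract would add a group-wise sparsity penalty to this objective and optimize additional latent matrices; those details are not given in the abstract, so they are left out here.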

    Noise robust automatic charge state recognition in quantum dots by machine learning and pre-processing, and visual explanations of the model with Grad-CAM

    Charge state recognition in quantum dot devices is important in the preparation of quantum bits for quantum information processing. Towards auto-tuning of larger-scale quantum devices, automatic charge state recognition by machine learning has been demonstrated. In this work, we propose a simpler method using machine learning and pre-processing. We demonstrate the operation of the charge state recognition and measure an accuracy as high as 96%. We also analyze the explainability of the trained machine learning model by gradient-weighted class activation mapping (Grad-CAM), which identifies class-discriminative regions for the predictions. The analysis shows that the model predicts the state based on the charge transition lines, indicating that human-like recognition is realized. Comment: 15 pages, 6 figures

    Efficiently finding genome-wide three-way gene interactions from transcript- and genotype-data

    Motivation: We address the issue of finding a three-way gene interaction, i.e. two genes interacting in expression under the genotypes of another gene, given a dataset in which expressions and genotypes are measured at once for each individual. Such an interaction can act as a general switching mechanism in the expression of two genes, controlled by the categories of another gene, and finding this type of interaction can be a key to elucidating complex biological systems. The most suitable method for this issue is a likelihood ratio test using logistic regressions, which we call the interaction test, but a serious problem with this test is its computational intractability at a genome-wide level.

    Structure and properties of densified silica glass: characterizing the order within disorder

    Successful synthesis and structural analysis of the world's most structurally ordered glass: extracting the order hidden within the seemingly disordered structure of glass. Kyoto University press release. 2021-12-25. The broken symmetry in the atomic-scale ordering of glassy versus crystalline solids leads to a daunting challenge to provide suitable metrics for describing the order within disorder, especially on length scales beyond the nearest neighbor that are characterized by rich structural complexity. Here, we address this challenge for silica, a canonical network-forming glass, by using hot versus cold compression to (i) systematically increase the structural ordering after densification and (ii) prepare two glasses with the same high density but contrasting structures. The structure was measured by high-energy X-ray and neutron diffraction, and atomistic models were generated that reproduce the experimental results. The vibrational and thermodynamic properties of the glasses were probed by using inelastic neutron scattering and calorimetry, respectively. Traditional measures of amorphous structures show relatively subtle changes upon compacting the glass. The method of persistent homology identifies, however, distinct features in the network topology that change as the initially open structure of the glass is collapsed. The results for the same high-density glasses show that the nature of structural disorder does impact the heat capacity and boson peak in the low-frequency dynamical spectra. Densification is discussed in terms of the loss of locally favored tetrahedral structures comprising oxygen-decorated SiSi4 tetrahedra.

    Genome-Wide Integration on Transcription Factors, Histone Acetylation and Gene Expression Reveals Genes Co-Regulated by Histone Modification Patterns

    N-terminal tails of the H2A, H2B, H3 and H4 histone families are subject to posttranslational modifications that take part in transcriptional regulation mechanisms, such as transcription factor binding and gene expression. Regulation mechanisms under the control of histone modification are important but remain largely unclear, despite emerging datasets for comprehensive analysis of histone modification. In this paper, we focus on what we call genetic harmonious units (GHUs), which are co-occurring patterns among transcription factor binding, gene expression and histone modification. We present the first genome-wide approach that captures GHUs by combining ChIP-chip with microarray datasets from Saccharomyces cerevisiae. Our approach employs noise-robust soft clustering to select patterns that share the same preferences in transcription factor binding, histone modification and gene expression, which are all currently implied to be closely correlated. The detected patterns are a well-studied acetylation of lysine 16 of H4 in glucose depletion as well as co-acetylation of five lysine residues of H3 with H4 Lys12 and H2A Lys7 responsible for ribosome biogenesis. Furthermore, applied to a microarray dataset on ESA1 and its bypass suppressor mutants, our method suggested that the recognition of acetylated H4 Lys16 is crucial to the histone acetyltransferase ESA1, whose essential role is still controversial. These results demonstrate that our approach allows us to provide clearer principles behind gene regulation mechanisms under histone modifications and to detect further GHUs by applying it to other microarray and ChIP-chip datasets. The source code of our method, which was implemented in MATLAB (http://www.mathworks.com/), is available from the supporting page for this paper: http://www.bic.kyoto-u.ac.jp/pathway/natsume/hm_detector.htm

    A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection

    The response to respiratory viruses varies substantially between individuals, and there are currently no known molecular predictors from the early stages of infection. Here we conduct a community-based analysis to determine whether pre- or early post-exposure molecular factors could predict physiologic responses to viral exposure. Using peripheral blood gene expression profiles collected from healthy subjects prior to exposure to one of four respiratory viruses (H1N1, H3N2, Rhinovirus, and RSV), as well as up to 24 h following exposure, we find that it is possible to construct models predictive of symptomatic response using profiles even prior to viral exposure. Analysis of predictive gene features reveals little overlap among models; however, in aggregate, these genes are enriched for common pathways. Heme metabolism, the most significantly enriched pathway, is associated with a higher risk of developing symptoms following viral exposure. This study demonstrates that pre-exposure molecular predictors can be identified and improves our understanding of the mechanisms of response to respiratory viruses.

    Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data

    Background: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease. Methods: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone. Findings: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0.791; Bayes factor >5) and surpassed the reference model (iAUC 0.743; Bayes factor >20). Both the ePCR model and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3.32, 95% CI 2.39-4.62, p Interpretation: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. Peer reviewed