98 research outputs found

    High dimensional discriminant rules with shrinkage estimators of covariance matrix and mean vector

    Linear discriminant analysis is a typical method used in the case of large dimension and small samples. There are various types of linear discriminant analysis methods, which are based on the estimations of the covariance matrix and mean vectors. Although there are many methods for estimating the inverse covariance matrix and the mean vectors, we consider shrinkage methods based on a nonparametric approach. In the case of the precision matrix, methods based on either the sparsity structure or data splitting are considered. Regarding the estimation of mean vectors, the nonparametric empirical Bayes (NPEB) estimator and the nonparametric maximum likelihood estimation (NPMLE) method are adopted, which are also called f-modeling and g-modeling, respectively. We analyze the performance of linear discriminant rules based on combined estimation strategies for the covariance matrix and mean vectors. In particular, we present a theoretical result on the performance of the NPEB method and compare it with results for other methods from previous studies. We provide simulation studies for various structures of covariance matrices and mean vectors to evaluate the methods considered in this paper. In addition, real data examples such as gene expression and EEG data are presented. Comment: 39 pages, 3 figures
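For orientation, the two-class linear discriminant rule that plug-in estimates of this kind feed into can be written as below. This is the textbook form under equal class priors, not a formula quoted from the paper. The symbols are notation assumed here: $\hat{\mu}_1$ and $\hat{\mu}_2$ stand for the estimated class mean vectors, and $\hat{\Omega}$ for the estimated precision (inverse covariance) matrix.

```latex
\delta(x) = \left( x - \frac{\hat{\mu}_1 + \hat{\mu}_2}{2} \right)^{\top}
            \hat{\Omega} \, \left( \hat{\mu}_1 - \hat{\mu}_2 \right),
\qquad
\text{assign } x \text{ to class 1 if } \delta(x) > 0, \text{ otherwise to class 2.}
```

Under this reading, each combination of a mean-vector estimator and a precision-matrix estimator yields a different rule $\delta$, which is what the abstract describes comparing.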

    Shuffle & Divide: Contrastive Learning for Long Text

    We propose a self-supervised learning method for long text documents based on contrastive learning. A key to our method is Shuffle and Divide (SaD), a simple text augmentation algorithm that sets up a pretext task required for contrastive updates to BERT-based document embedding. SaD splits a document into two sub-documents containing randomly shuffled words from the entire document. The sub-documents are considered positive examples, leaving all other documents in the corpus as negatives. After SaD, we repeat the contrastive update and clustering phases until convergence. Labeling text documents is naturally a time-consuming, cumbersome task, and our method can help reduce human effort, which is the most expensive resource in AI. We have empirically evaluated our method by performing unsupervised text classification on the 20 Newsgroups, Reuters-21578, BBC, and BBCSport datasets. In particular, our method improves on the current state of the art, SS-SB-MT, on 20 Newsgroups by 20.94% in accuracy. We also achieve state-of-the-art performance on Reuters-21578 and exceptionally high accuracy (over 95%) for unsupervised classification on the BBC and BBCSport datasets. Comment: Accepted at ICPR 202
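The splitting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `shuffle_and_divide`, the whitespace tokenization, and the equal-halves split are all assumptions, since the abstract only states that a document is divided into two sub-documents of randomly shuffled words.

```python
import random


def shuffle_and_divide(document, seed=None):
    """Split a document into two sub-documents of randomly shuffled words.

    Hypothetical sketch: whitespace tokenization and a half/half split
    are assumptions, not details taken from the paper.
    """
    rng = random.Random(seed)      # local RNG so results are reproducible
    words = document.split()       # naive whitespace tokenization
    rng.shuffle(words)             # shuffle words across the whole document
    mid = len(words) // 2
    first = " ".join(words[:mid])  # first sub-document
    second = " ".join(words[mid:]) # second sub-document
    return first, second


doc = "the quick brown fox jumps over the lazy dog"
view_a, view_b = shuffle_and_divide(doc, seed=0)
```

In a contrastive setup of the kind the abstract describes, `view_a` and `view_b` would be treated as a positive pair, with sub-documents from other documents serving as negatives.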

    ContraCluster: Learning to Classify without Labels by Contrastive Self-Supervision and Prototype-Based Semi-Supervision

    Recent advances in representation learning inspire us to take on the challenging problem of unsupervised image classification in a principled way. We propose ContraCluster, an unsupervised image classification method that combines clustering with the power of contrastive self-supervised learning. ContraCluster consists of three stages: (1) contrastive self-supervised pre-training (CPT), (2) contrastive prototype sampling (CPS), and (3) prototype-based semi-supervised fine-tuning (PB-SFT). CPS can select highly accurate, categorically prototypical images in an embedding space learned by contrastive learning. We use the sampled prototypes as noisy labeled data to perform semi-supervised fine-tuning (PB-SFT), leveraging a small set of prototypes and a large amount of unlabeled data to further enhance accuracy. We demonstrate empirically that ContraCluster achieves new state-of-the-art results on standard benchmark datasets including CIFAR-10, STL-10, and ImageNet-10. For example, ContraCluster achieves about 90.8% accuracy on CIFAR-10, which outperforms DAC (52.2%), IIC (61.7%), and SCAN (87.6%) by a large margin. Without any labels, ContraCluster achieves 90.8% accuracy, comparable to the 95.8% of the best supervised counterpart. Comment: Accepted at ICPR 202

    Influence of oxygen vacancy on the electronic structure of HfO$_2$ film

    We investigated the unoccupied part of the electronic structure of the oxygen-deficient hafnium oxide (HfO$_{\sim1.8}$) using soft x-ray absorption spectroscopy at O $K$ and Hf $N_3$ edges. Band-tail states beneath the unoccupied Hf 5$d$ band are observed in the O $K$-edge spectra; combined with ultraviolet photoemission spectrum, this indicates the non-negligible occupation of Hf 5$d$ state. However, Hf $N_3$-edge magnetic circular dichroism spectrum reveals the absence of a long-range ferromagnetic spin order in the oxide. Thus the small amount of $d$ electron gained by the vacancy formation does not show inter-site correlation, contrary to a recent report [M. Venkatesan et al., Nature 430, 630 (2004)]. Comment: 5 pages, 4 figures, submitted to Phys. Rev.

    Protein-targeted corona phase molecular recognition

    Corona phase molecular recognition (CoPhMoRe) uses a heteropolymer adsorbed onto and templated by a nanoparticle surface to recognize a specific target analyte. This method has not yet been extended to macromolecular analytes, including proteins. Herein we develop a variant of a CoPhMoRe screening procedure of single-walled carbon nanotubes (SWCNT) and use it against a panel of human blood proteins, revealing a specific corona phase that recognizes fibrinogen with high selectivity. In response to fibrinogen binding, SWCNT fluorescence decreases by >80% at saturation. Sequential binding of the three fibrinogen nodules is suggested by selective fluorescence quenching by isolated sub-domains and validated by the quenching kinetics. The fibrinogen recognition also occurs in serum environment, at the clinically relevant fibrinogen concentrations in the human blood. These results open new avenues for synthetic, non-biological antibody analogues that recognize biological macromolecules, and hold great promise for medical and clinical applications.

    Protein-targeted corona phase molecular recognition

    Corona phase molecular recognition (CoPhMoRe) uses a heteropolymer adsorbed onto and templated by a nanoparticle surface to recognize a specific target analyte. This method has not yet been extended to macromolecular analytes, including proteins. Herein we develop a variant of a CoPhMoRe screening procedure of single-walled carbon nanotubes (SWCNT) and use it against a panel of human blood proteins, revealing a specific corona phase that recognizes fibrinogen with high selectivity. In response to fibrinogen binding, SWCNT fluorescence decreases by >80% at saturation. Sequential binding of the three fibrinogen nodules is suggested by selective fluorescence quenching by isolated sub-domains and validated by the quenching kinetics. The fibrinogen recognition also occurs in serum environment, at the clinically relevant fibrinogen concentrations in the human blood. These results open new avenues for synthetic, non-biological antibody analogues that recognize biological macromolecules, and hold great promise for medical and clinical applications. Juvenile Diabetes Research Foundation International; MIT-Technion Fellowshi

    Nematicity dynamics in the charge-density-wave phase of a cuprate superconductor

    Understanding the interplay between charge, nematic, and structural ordering tendencies in cuprate superconductors is critical to unraveling their complex phase diagram. Using pump-probe time-resolved resonant x-ray scattering on the (0 0 1) Bragg peak at the Cu L3 and oxygen K resonances, we investigate non-equilibrium dynamics of Qa = Qb = 0 nematic order and its association with both charge density wave (CDW) order and lattice dynamics in La1.65Eu0.2Sr0.15CuO4. In contrast to the slow lattice dynamics probed at the apical oxygen K resonance, fast nematicity dynamics are observed at the Cu L3 and planar oxygen K resonances. The temperature dependence of the nematicity dynamics is correlated with the onset of CDW order. These findings unambiguously indicate that the CDW phase, typically evidenced by translational symmetry breaking, includes a significant electronic nematic component. Comment: 16 pages, 4 figures