790 research outputs found

    CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels

    Utilizing language models (LMs) without internal access is becoming an attractive paradigm in NLP, as many cutting-edge LMs are released through APIs and are massive in scale. The de facto method in this black-box scenario is prompting, which has shown progressive performance gains where data labels are scarce or unavailable. Despite its efficacy, prompting still falls short of fully supervised counterparts and is generally brittle to slight modifications. In this paper, we propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves text classification accuracy with a very weak supervision signal (i.e., the names of the labels). Our framework draws a precise decision boundary without accessing the weights or gradients of the LM or the data labels. The core ideas of CELDA are twofold: (1) extracting a refined pseudo-labeled dataset from an unlabeled dataset, and (2) training a lightweight and robust model on top of the LM, which learns an accurate decision boundary from the extracted noisy dataset. Through in-depth investigations on various datasets, we demonstrate that CELDA reaches a new state of the art in weakly supervised text classification and narrows the gap with a fully supervised model. Additionally, our proposed methodology can be applied universally to any LM and has the potential to scale to larger models, making it a more viable option for utilizing large LMs. Comment: ACL 202
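
    The two steps named in the abstract (refine a pseudo-labeled set, then fit a lightweight classifier on LM features) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's procedure: the synthetic 2-D features stand in for LM embeddings, the randomly flipped labels stand in for prompting-based pseudo-labels, and the helper names (`kmeans`, `filter_by_cluster_purity`, `fit_lda`, `predict_lda`) and the purity threshold are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; returns a cluster index for every row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def filter_by_cluster_purity(assign, pseudo, n_classes, min_purity=0.6):
    """Keep clusters dominated by one pseudo-label and relabel them by majority.

    Assumption: purity thresholding is a stand-in for whatever uncertainty
    criterion the paper uses; the abstract does not specify it.
    """
    keep = np.zeros(len(pseudo), dtype=bool)
    cleaned = pseudo.copy()
    for j in np.unique(assign):
        members = assign == j
        counts = np.bincount(pseudo[members], minlength=n_classes)
        if counts.max() / counts.sum() >= min_purity:
            keep |= members
            cleaned[members] = counts.argmax()
    return keep, cleaned

def fit_lda(X, y, n_classes, ridge=1e-6):
    """Linear discriminant analysis with a shared (pooled) covariance."""
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    centered = X - means[y]
    cov = centered.T @ centered / len(X) + ridge * np.eye(X.shape[1])
    priors = np.bincount(y, minlength=n_classes) / len(y)
    W = np.linalg.solve(cov, means.T)  # column c holds cov^{-1} mu_c
    b = -0.5 * np.sum(means.T * W, axis=0) + np.log(priors)
    return W, b

def predict_lda(X, W, b):
    return (X @ W + b).argmax(axis=1)

# Synthetic stand-in for LM features: two class blobs plus an ambiguous blob.
rng = np.random.default_rng(0)
n = 400
X = np.vstack([rng.normal(-3.0, 1.0, size=(n, 2)),
               rng.normal(3.0, 1.0, size=(n, 2)),
               rng.normal(0.0, 1.0, size=(n // 2, 2))])
y_true = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
flip = rng.random(2 * n) < 0.25  # 25% of the pseudo-labels are wrong
noisy = np.where(flip, 1 - y_true, y_true)
pseudo = np.concatenate([noisy, rng.integers(0, 2, size=n // 2)])

assign = kmeans(X, 3, rng)
keep, cleaned = filter_by_cluster_purity(assign, pseudo, n_classes=2)
W, b = fit_lda(X[keep], cleaned[keep], n_classes=2)

# Evaluate on fresh, cleanly labeled points from the two class blobs.
X_test = np.vstack([rng.normal(-3.0, 1.0, size=(100, 2)),
                    rng.normal(3.0, 1.0, size=(100, 2))])
y_test = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
accuracy = (predict_lda(X_test, W, b) == y_test).mean()
print(f"kept {keep.sum()} of {len(X)} points, test accuracy {accuracy:.3f}")
```

    The classifier only ever sees features and noisy labels, which mirrors the black-box setting in spirit: no weights or gradients of the feature extractor are needed.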

    Analyzing the Latent Space of GAN through Local Dimension Estimation

    The impressive success of style-based GANs (StyleGANs) in high-fidelity image synthesis has motivated research into the semantic properties of their latent spaces. In this paper, we approach this problem through a geometric analysis that treats the latent space as a manifold. In particular, we propose a local dimension estimation algorithm for arbitrary intermediate layers in a pre-trained GAN model. The estimated local dimension is interpreted as the number of possible semantic variations from a given latent variable. Moreover, this intrinsic dimension estimation enables unsupervised evaluation of the disentanglement of a latent space. Our proposed metric, called Distortion, measures the inconsistency of the intrinsic tangent space on the learned latent space. Distortion is purely geometric and does not require any additional attribute information. Nevertheless, Distortion shows a high correlation with global-basis-compatibility and the supervised disentanglement score. Our work is a first step towards selecting the most disentangled latent space among the various latent spaces in a GAN without attribute labels.
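
    One simple way to picture a local dimension estimate is as the numerical rank of a layer's Jacobian at a latent point. The sketch below is an illustration under assumptions, not the paper's algorithm: the toy layer, the finite-difference Jacobian, and the relative singular-value threshold `rel_tol` are all hypothetical choices.

```python
import numpy as np

def numerical_jacobian(f, z, eps=1e-5):
    """Central finite-difference Jacobian of f at z, shape (out_dim, in_dim)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((f(z + step) - f(z - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def local_dimension(f, z, rel_tol=1e-4):
    """Count Jacobian singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(numerical_jacobian(f, z), compute_uv=False)
    if s[0] == 0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))

# Toy "layer": a 4-D latent mapped to 6-D, but varying along only 2 directions.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))

def toy_layer(z):
    return np.tanh(0.1 * (A @ z))

dim = local_dimension(toy_layer, rng.normal(size=4))
print("estimated local dimension:", dim)
```

    In this toy case the estimate counts the directions in latent space along which the output actually changes, which is the interpretation the abstract gives to the local dimension.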

    Finding the global semantic representation in GAN through Frechet Mean

    The ideally disentangled latent space in a GAN admits a global representation of the latent space with semantic attribute coordinates. In other words, considering that this disentangled latent space is a vector space, there exists a global semantic basis in which each basis component describes one attribute of the generated images. In this paper, we propose an unsupervised method for finding this global semantic basis in the intermediate latent space of GANs. This semantic basis represents sample-independent meaningful perturbations that change the same semantic attribute of an image over the entire latent space. The proposed global basis, called the Fréchet basis, is derived by applying the Fréchet mean to the local semantic perturbations in a latent space. The Fréchet basis is discovered in two stages. First, the global semantic subspace is discovered by the Fréchet mean in the Grassmannian manifold of the local semantic subspaces. Second, the Fréchet basis is found by optimizing a basis of the semantic subspace via the Fréchet mean in the Special Orthogonal Group. Experimental results demonstrate that the Fréchet basis provides better semantic factorization and robustness compared to previous methods. Moreover, we suggest a basis refinement scheme for the previous methods. The quantitative experiments show that the refined basis achieves better semantic factorization while constrained to the same semantic subspace given by the previous method. Comment: 25 pages, 21 figures
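
    The first stage (averaging local subspaces on the Grassmannian) can be sketched under a simplifying assumption: with the chordal (projection) distance, the minimizer of the summed squared distances is spanned by the top eigenvectors of the averaged projection matrix. The abstract does not say which metric the paper uses, so the chordal metric, the random toy subspaces, and the function names below are all assumptions.

```python
import numpy as np

def orthonormal_basis(M):
    """Orthonormal basis for the column space of M (reduced QR)."""
    q, _ = np.linalg.qr(M)
    return q

def subspace_distance(U, V):
    """Chordal distance: Frobenius norm of the difference of projectors."""
    return np.linalg.norm(U @ U.T - V @ V.T)

def chordal_mean_subspace(bases):
    """Mean of k-dimensional subspaces under the chordal distance.

    bases: list of (n, k) matrices with orthonormal columns.
    Returns an (n, k) orthonormal basis of the mean subspace.
    """
    k = bases[0].shape[1]
    mean_projector = sum(U @ U.T for U in bases) / len(bases)
    _, eigvecs = np.linalg.eigh(mean_projector)  # eigenvalues ascending
    return eigvecs[:, -k:]

# Toy data: 20 noisy copies of one 3-D subspace of an 8-D latent space.
rng = np.random.default_rng(0)
U0 = orthonormal_basis(rng.normal(size=(8, 3)))
local_bases = [orthonormal_basis(U0 + 0.1 * rng.normal(size=(8, 3)))
               for _ in range(20)]
global_basis = chordal_mean_subspace(local_bases)
print("distance from the underlying subspace:",
      subspace_distance(global_basis, U0))
```

    The result depends only on the subspaces, not on the particular bases chosen for them; picking a preferred basis inside the mean subspace is the job of the second stage described in the abstract and is not attempted here.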

    Charge density wave and superconductivity in the kagome metal CsV$_3$Sb$_5$ around a pressure-induced quantum critical point

    Using first-principles density functional theory calculations, we investigate the pressure-induced quantum phase transition (QPT) from the charge density wave (CDW) to the pristine phase in the layered kagome metal CsV$_3$Sb$_5$, which consists of three-atom-thick Sb-V$_3$Sb-Sb and one-atom-thick Cs layers. The CDW structure, in which V atoms form trimers and hexamers with buckled Sb honeycomb layers, features an increase in the lattice parameter along the $c$ axis compared to its pristine counterpart, which has ideal V$_3$Sb kagome and planar Sb honeycomb layers. Consequently, as pressure increases, the relatively smaller volume of the pristine phase contributes to reducing the enthalpy difference between the CDW and pristine phases, yielding a pressure-induced QPT at a critical pressure $P_c$ of $\sim$2 GPa. Furthermore, we find that (i) the superconducting transition temperature $T_c$ increases around $P_c$ due to a phonon softening associated with the periodic lattice distortion of V trimers and hexamers and that (ii) above $P_c$, optical phonon modes are hardened with increasing pressure, leading to monotonic decreases in the electron-phonon coupling constant and $T_c$. Our findings not only demonstrate that the uniaxial strain along the $c$ axis plays an important role in the QPT observed in CsV$_3$Sb$_5$, but also provide an explanation for the observed superconductivity around $P_c$ in terms of a phonon-mediated superconducting mechanism.
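
    The enthalpy argument can be written out explicitly. The notation below ($E$, $V$, $H$, $\Delta E$, $\Delta V$) is introduced here for illustration and does not appear in the abstract; it assumes the standard zero-temperature definition of enthalpy.

```latex
\begin{align}
H(P) &= E + PV, \\
\Delta H(P) &= H_{\mathrm{CDW}}(P) - H_{\mathrm{pristine}}(P)
             = \Delta E(P) + P\,\Delta V(P), \\
\Delta H(P_c) &= 0 .
\end{align}
```

    Here $\Delta E = E_{\mathrm{CDW}} - E_{\mathrm{pristine}} < 0$ because the CDW phase is favored at ambient pressure, while $\Delta V = V_{\mathrm{CDW}} - V_{\mathrm{pristine}} > 0$ because the CDW phase has the larger volume. The $P\,\Delta V$ term therefore grows with pressure until it cancels $\Delta E$. If $\Delta E$ and $\Delta V$ were roughly pressure independent (an assumption made only for this estimate), the critical pressure would be $P_c \approx -\Delta E / \Delta V$.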

    Surface-induced ferromagnetism and anomalous Hall transport at Zr2S(001)

    Two-dimensional layered electrides possessing anionic excess electrons in the interstitial spaces between cationic layers have attracted much attention due to their promising opportunities in both fundamental research and technological applications. Using first-principles calculations, we predict that the layered bulk electride Zr2S is nonmagnetic with massive Dirac nodal-line states arising from Zr-4d cationic and interlayer anionic electrons. However, at the Zr2S(001) surface, the surface potential increases the density of states at the Fermi level, thereby inducing a ferromagnetic order at the outermost Zr layer via the Stoner instability. Consequently, the time-reversal symmetry breaking at the surface not only generates highly spin-polarized topological surface states with intricate helical spin textures, but also hosts an intrinsic anomalous Hall effect originating from the Berry curvature generated by spin-orbit coupling. Our findings offer a playground to investigate the emergence of ferromagnetism and anomalous Hall transport at the surface of nonmagnetic topological electrides.
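
    For reference, the Stoner instability mentioned above is usually stated as the textbook criterion below. The symbols are not from the abstract: $I$ denotes the Stoner exchange parameter and $N(E_F)$ the nonmagnetic density of states at the Fermi level (per spin, in the common convention).

```latex
\[
I \, N(E_F) > 1
\]
```

    Under this criterion, a surface-induced increase in $N(E_F)$ can push a system that is nonmagnetic in the bulk past the threshold, which is consistent with the mechanism the abstract describes.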