
    How Many Topics? Stability Analysis for Topic Models

    Topic modeling refers to the task of discovering the underlying thematic structure in a text corpus, where the output is commonly presented as a report of the top terms appearing in each topic. Despite the diversity of topic modeling algorithms that have been proposed, a common challenge in successfully applying these techniques is the selection of an appropriate number of topics for a given corpus. Choosing too few topics will produce results that are overly broad, while choosing too many will result in the "over-clustering" of a corpus into many small, highly similar topics. In this paper, we propose a term-centric stability analysis strategy to address this issue, the idea being that a model with an appropriate number of topics will be more robust to perturbations in the data. Using a topic modeling approach based on matrix factorization, evaluations performed on a range of corpora show that this strategy can successfully guide the model selection process. Comment: Improve readability of plots. Add minor clarification
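
The stability idea in this abstract can be illustrated with a small sketch. The abstract does not spell out the agreement measure, so everything here is an assumption made for illustration: topics are represented by their top-term lists, two topics are compared with plain Jaccard similarity, topics are paired by brute force over permutations, and the names `jaccard`, `agreement`, and `stability` are hypothetical.

```python
from itertools import permutations


def jaccard(terms_a, terms_b):
    """Jaccard similarity of two top-term lists (term order is ignored)."""
    set_a, set_b = set(terms_a), set(terms_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def agreement(topics_a, topics_b):
    """Best mean Jaccard similarity over all one-to-one pairings of topics.

    Assumption: brute force over permutations, so only usable for small k.
    """
    if len(topics_a) != len(topics_b):
        raise ValueError("both models must have the same number of topics")
    k = len(topics_a)
    if k == 0:
        raise ValueError("models must contain at least one topic")
    best = 0.0
    for perm in permutations(range(k)):
        score = sum(jaccard(topics_a[i], topics_b[j]) for i, j in enumerate(perm)) / k
        best = max(best, score)
    return best


def stability(reference, perturbed_runs):
    """Mean agreement between a reference model and models fit on perturbed data."""
    if not perturbed_runs:
        raise ValueError("need at least one perturbed run")
    return sum(agreement(reference, run) for run in perturbed_runs) / len(perturbed_runs)


# Toy top-term lists for k = 2 topics (made-up terms, not real results).
reference = [["game", "team"], ["bank", "market"]]
run_same = [["bank", "market"], ["game", "team"]]   # same topics, different order
run_drift = [["game", "score"], ["bank", "market"]]  # one term replaced
```

Under these assumptions, one would compute `stability` for each candidate number of topics and prefer values of k where the score stays high across perturbed samples.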

    Landau levels in the case of two degenerate coupled bands: kagome lattice tight-binding spectrum

    The spectrum of charged particles hopping on a kagome lattice in a uniform transverse magnetic field shows an unusual set of Landau levels at low field. They are unusual in two respects: the lowest Landau levels are paramagnetic so their energies decrease linearly with increasing field magnitude, and the spacings between the levels are not equal. These features are shown to follow from the degeneracy of the energy bands in zero magnetic field. We give a general discussion of Landau levels in the case of two degenerate bands, and show how the kagome lattice tight-binding model includes one special case of this more general problem. We also discuss the consequences of this for the behavior of the critical temperature of a kagome grid superconducting wire network, which is the experimental system that originally motivated this work. Comment: 18 pages, 8 figures

    Inhomogeneous non-Gaussianity

    We propose a method to probe higher-order correlators of the primordial density field through the inhomogeneity of local non-Gaussian parameters, such as f_NL, measured within smaller patches of the sky. Correlators between n-point functions measured in one patch of the sky and k-point functions measured in another patch depend upon the (n+k)-point functions over the entire sky. The inhomogeneity of non-Gaussian parameters may be a feasible way to detect or constrain higher-order correlators in local models of non-Gaussianity, as well as to distinguish between single- and multiple-source scenarios for generating the primordial density perturbation, and more generally to probe the details of inflationary physics. Comment: 16 pages, 2 figures; v2: Minor changes and references added. Matches the published version

    Drawing Trees with Perfect Angular Resolution and Polynomial Area

    We study methods for drawing trees with perfect angular resolution, i.e., with angles at each node v equal to 2π/d(v). We show: 1. Any unordered tree has a crossing-free straight-line drawing with perfect angular resolution and polynomial area. 2. There are ordered trees that require exponential area for any crossing-free straight-line drawing having perfect angular resolution. 3. Any ordered tree has a crossing-free Lombardi-style drawing (where each edge is represented by a circular arc) with perfect angular resolution and polynomial area. Thus, our results explore what is achievable with straight-line drawings and what more is achievable with Lombardi-style drawings, with respect to drawings of trees with perfect angular resolution. Comment: 30 pages, 17 figures
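
The angle condition in this abstract, 2π/d(v) between consecutive edges at a node v of degree d(v), can be checked with a short sketch. It only illustrates the definition and is not the drawing algorithm from the paper; the function names, the `offset` parameter, and the tolerance are assumptions introduced here.

```python
import math

TWO_PI = 2.0 * math.pi


def edge_directions(degree, offset=0.0):
    """Directions (radians) of `degree` edges spaced evenly around a node."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    step = TWO_PI / degree
    return [(offset + i * step) % TWO_PI for i in range(degree)]


def has_perfect_angular_resolution(angles, tol=1e-9):
    """True if consecutive edge directions at one node differ by 2*pi/d."""
    d = len(angles)
    if d < 2:
        return True  # a single edge trivially spans the whole circle
    ordered = sorted(a % TWO_PI for a in angles)
    target = TWO_PI / d
    # Gap from each direction to the next one, wrapping around the circle.
    gaps = [(ordered[(i + 1) % d] - ordered[i]) % TWO_PI for i in range(d)]
    return all(abs(gap - target) <= tol for gap in gaps)


# A degree-3 node rotated by an arbitrary offset still has 120-degree gaps.
print(has_perfect_angular_resolution(edge_directions(3, offset=0.4)))
```

The check is local to one node; a whole drawing has perfect angular resolution only if every node passes it.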

    Testing real-time systems using TINA

    The paper presents a technique for model-based black-box conformance testing of real-time systems using the Time Petri Net Analyzer TINA. Test suites are derived from a prioritized time Petri net composed of two concurrent sub-nets specifying respectively the expected behaviour of the system under test and its environment. We describe how the toolbox TINA has been extended to support automatic generation of time-optimal test suites. The result is optimal in the sense that the set of test cases in the test suite has the shortest possible accumulated execution time. Input/output conformance serves as the notion of implementation correctness, essentially timed trace inclusion taking environment assumptions into account. Test case selection is based either on manually formulated test purposes or, automatically, on various coverage criteria specifying structural criteria of the model to be fulfilled by the test suite. We discuss how test purposes and coverage criteria are specified in the linear temporal logic SE-LTL, derive test sequences, and assign verdicts.

    Emergent geometry from q-deformations of N=4 super Yang-Mills

    We study BPS states in a marginal deformation of super Yang-Mills on R x S^3 using a quantum mechanical system of q-commuting matrices. We focus mainly on the case where the parameter q is a root of unity, so that the AdS dual of the field theory can be associated to an orbifold of AdS_5 x S^5. We show that in the large N limit, BPS states are described by density distributions of eigenvalues and we assign to these distributions a geometrical spacetime interpretation. We go beyond BPS configurations by turning on perturbative non-q-commuting excitations. Considering states in an appropriate BMN limit, we use a saddle point approximation to compute the BMN energy to all perturbative orders in the 't Hooft coupling. We also examine some BMN-like states that correspond to twisted sector string states in the orbifold and we show that our geometrical interpretation of the system is consistent with the quantum numbers of the corresponding states under the quantum symmetry of the orbifold. Comment: 22 pages, 1 figure. v2: added references. v3: final published version

    Optimization of inhomogeneous electron correlation factors in periodic solids

    A method is presented for the optimization of one-body and inhomogeneous two-body terms in correlated electronic wave functions of Jastrow-Slater type. The most general form of inhomogeneous correlation term which is compatible with crystal symmetry is used and the energy is minimized with respect to all parameters using a rapidly convergent iterative approach, based on Monte Carlo sampling of the energy and fitting energy fluctuations. The energy minimization is performed exactly within statistical sampling error for the energy derivatives and the resulting one- and two-body terms of the wave function are found to be well-determined. The largest calculations performed require the optimization of over 3000 parameters. The inhomogeneous two-electron correlation terms are calculated for diamond and rhombohedral graphite. The optimal terms in diamond are found to be approximately homogeneous and isotropic over all ranges of electron separation, but exhibit some inhomogeneity at short- and intermediate-range, whereas those in graphite are found to be homogeneous at short-range, but inhomogeneous and anisotropic at intermediate- and long-range electron separation. Comment: 23 pages, 15 figures, 1 table, REVTeX4, submitted to PR

    SMART: Unique splitting-while-merging framework for gene clustering

    Copyright © 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset into a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested on demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to the existing self-splitting algorithms and traditional algorithms it was compared with.
    Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. National Institute for Health Research