
    Localized Linear Discriminant Analysis

    Despite its age, Linear Discriminant Analysis (LDA) performs well even in situations where its underlying premises, such as normally distributed data with constant covariance matrices over all classes, are not met. It is, however, a global technique that does not regard the nature of an individual observation to be classified. By weighting each training observation according to its distance to the observation of interest, a global classifier can be transformed into an observation-specific approach. So far, this has been done for logistic discrimination. By using LDA instead, the computation of the local classifier is much simpler. Moreover, it is ready for applications in multi-class situations. Keywords: classification, local models, LDA
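    The weighting idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the `gamma` bandwidth, the small `ridge` term, and the function name `localized_lda_predict` are all assumptions chosen for illustration.

```python
import numpy as np

def localized_lda_predict(X, y, x0, gamma=1.0, ridge=1e-6):
    """Classify x0 with an LDA fitted on distance-weighted training data.

    Hypothetical sketch: the kernel, gamma, and ridge are illustrative
    assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    x0 = np.asarray(x0, dtype=float)

    # Weight each training point by its closeness to x0 (Gaussian kernel).
    d2 = np.sum((X - x0) ** 2, axis=1)
    w = np.exp(-gamma * d2)

    classes = np.unique(y)
    p = X.shape[1]
    means, priors = [], []
    pooled = np.zeros((p, p))
    for c in classes:
        wc = w[y == c]
        Xc = X[y == c]
        mu = np.average(Xc, axis=0, weights=wc)   # weighted class mean
        R = Xc - mu
        pooled += (R * wc[:, None]).T @ R          # weighted scatter
        means.append(mu)
        priors.append(wc.sum() / w.sum())          # weighted class prior

    # Weighted pooled covariance, lightly regularized for invertibility.
    pooled = pooled / w.sum() + ridge * np.eye(p)

    # Standard linear discriminant scores, evaluated at x0.
    scores = []
    for mu, pi in zip(means, priors):
        a = np.linalg.solve(pooled, mu)
        scores.append(x0 @ a - 0.5 * mu @ a + np.log(pi))
    return classes[int(np.argmax(scores))]
```

    Because the weights depend on `x0`, the classifier is refitted for every observation to be classified. A very large `gamma` can drive all weights of a distant class to zero, which this sketch does not guard against.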

    Localized Regression

    The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular, it is shown that localization yields powerful classifiers even in higher dimensions if localization is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross validation is working well. Finally, the method is applied to real data sets and its real world performance is compared to alternative procedures.

    Codimension-3 Singularities and Yukawa Couplings in F-theory

    F-theory is one of the frameworks where all the Yukawa couplings of grand unified theories are generated and their computation is possible. The Yukawa couplings of charged matter multiplets are supposed to be generated around codimension-3 singularity points of a base complex 3-fold, and that has been confirmed for the simplest type of codimension-3 singularities in recent studies. However, the geometry of F-theory compactifications is much more complicated. For a generic F-theory compactification, such issues as flux configuration around the codimension-3 singularities, field-theory formulation of the local geometry and behavior of zero-mode wavefunctions have virtually never been addressed before. We address all these issues in this article, and further discuss the nature of Yukawa couplings generated at such singularities. In order to calculate the Yukawa couplings of the low-energy effective theory, however, the local descriptions of wavefunctions on complex surfaces and a global characterization of zero-modes over a complex curve have to be combined. We found the relation between them by re-examining how chiral charged matter is characterized in F-theory compactification. An intrinsic definition of spectral surfaces in F-theory turns out to be the key concept. As a byproduct, we found a new way to understand the Heterotic--F theory duality, which improves the precision of the existing duality map associated with codimension-3 singularities. Comment: 91 pages; minor clarification, typos corrected and a reference added (v3

    ELM regime classification by conformal prediction on an information manifold

    Characterization and control of plasma instabilities known as edge-localized modes (ELMs) is crucial for the operation of fusion reactors. Recently, machine learning methods have demonstrated good potential in making useful inferences from stochastic fusion data sets. However, traditional classification methods do not offer an inherent estimate of the goodness of their prediction. In this paper, a distance-based conformal predictor classifier integrated with a geometric-probabilistic framework is presented. The first benefit of the approach lies in its comprehensive treatment of highly stochastic fusion data sets, by modeling the measurements with probability distributions in a metric space. This enables calculation of a natural distance measure between probability distributions: the Rao geodesic distance. Second, the predictions are accompanied by estimates of their accuracy and reliability. The method is applied to the classification of regimes characterized by different types of ELMs based on the measurements of global parameters and their error bars. This yields promising success rates and outperforms state-of-the-art automatic techniques for recognizing ELM signatures. The estimates of goodness of the predictions increase the confidence of classification by ELM experts, while allowing more reliable decisions regarding plasma control and at the same time increasing the robustness of the control system

    Gene ranking and biomarker discovery under correlation

    Biomarker discovery and gene ranking is a standard task in genomic high throughput analysis. Typically, the ordering of markers is based on a stabilized variant of the t-score, such as the moderated t or the SAM statistic. However, these procedures ignore gene-gene correlations, which may have a profound impact on the gene orderings and on the power of the subsequent tests. We propose a simple procedure that adjusts gene-wise t-statistics to take account of correlations among genes. The resulting correlation-adjusted t-scores ("cat" scores) are derived from a predictive perspective, i.e. as a score for variable selection to discriminate group membership in two-class linear discriminant analysis. In the absence of correlation the cat score reduces to the standard t-score. Moreover, using the cat score it is straightforward to evaluate groups of features (i.e. gene sets). For computation of the cat score from small sample data we propose a shrinkage procedure. In a comparative study comprising six different synthetic and empirical correlation structures we show that the cat score improves estimation of gene orderings and leads to higher power for fixed true discovery rate, and vice versa. Finally, we also illustrate the cat score by analyzing metabolomic data. The shrinkage cat score is implemented in the R package "st" available from URL http://cran.r-project.org/web/packages/st/ Comment: 18 pages, 5 figures, 1 tabl
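    As a rough illustration of the correlation adjustment described in this abstract, the sketch below decorrelates two-sample t-scores with the inverse square root of a correlation matrix. The abstract does not spell out the formula, so the form used here (cat = P^(-1/2) t), the fixed shrinkage intensity `lam` toward the identity, and the function name `cat_scores` are assumptions for illustration. This is not the estimator implemented in the R package "st".

```python
import numpy as np

def cat_scores(X, y, lam=0.1):
    """Return (t, cat): pooled-variance t-scores and a decorrelated version.

    Hypothetical sketch: assumes cat = P^(-1/2) t, with P a within-group
    correlation matrix shrunk toward the identity by a fixed lam in (0, 1].
    Labels in y are assumed to be 0 and 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    g0, g1 = X[y == 0], X[y == 1]
    n0, n1 = len(g0), len(g1)

    # Residuals after removing each group's mean.
    R = np.vstack([g0 - g0.mean(axis=0), g1 - g1.mean(axis=0)])

    # Ordinary two-sample t-scores with pooled variance.
    var = (R ** 2).sum(axis=0) / (n0 + n1 - 2)
    t = (g1.mean(axis=0) - g0.mean(axis=0)) / np.sqrt(var * (1 / n0 + 1 / n1))

    # Within-group correlation, shrunk toward the identity so that it is
    # invertible even when there are more features than samples.
    corr = np.corrcoef(R, rowvar=False)
    corr = (1 - lam) * corr + lam * np.eye(X.shape[1])

    # Inverse matrix square root via the eigendecomposition of corr.
    evals, evecs = np.linalg.eigh(corr)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T
    return t, inv_sqrt @ t
```

    With `lam=1.0` the correlation matrix is replaced by the identity and the adjusted scores equal the plain t-scores, mirroring the abstract's statement that the cat score reduces to the t-score in the absence of correlation.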

    Exotic matter on singular divisors in F-theory

    We analyze exotic matter representations that arise on singular seven-brane configurations in F-theory. We develop a general framework for analyzing such representations, and work out explicit descriptions for models with matter in the 2-index and 3-index symmetric representations of SU(N) and SU(2) respectively, associated with double and triple point singularities in the seven-brane locus. These matter representations are associated with Weierstrass models whose discriminants vanish to high order thanks to nontrivial cancellations possible only in the presence of a non-UFD algebraic structure. This structure can be described using the normalization of the ring of intrinsic local functions on a singular divisor. We consider the connection between geometric constraints on singular curves and corresponding constraints on the low-energy spectrum of 6D theories, identifying some new examples of apparent "swampland" theories that cannot be realized in F-theory but have no apparent low-energy inconsistency. Comment: 71 page

    On Two Simple and Effective Procedures for High Dimensional Classification of General Populations

    In this paper, we generalize two criteria, the determinant-based and trace-based criteria proposed by Saranadasa (1993), to general populations for high dimensional classification. These two criteria compare some distances between a new observation and several different known groups. The determinant-based criterion performs well for correlated variables by integrating the covariance structure and is competitive with many other existing rules. The criterion, however, requires the measurement dimension to be smaller than the sample size. The trace-based criterion, in contrast, is an independence rule and effective in the "large dimension-small sample size" scenario. An appealing property of these two criteria is that their implementation is straightforward and there is no need for preliminary variable selection or use of tuning parameters. Their asymptotic misclassification probabilities are derived using the theory of large dimensional random matrices. Their competitive performances are illustrated by intensive Monte Carlo experiments and a real data analysis. Comment: 5 figures; 22 pages. To appear in "Statistical Papers

    E(lementary) Strings in Six-Dimensional Heterotic F-Theory

    Using E-strings, we can not only analyze six-dimensional superconformal field theories but also probe vacua of the non-perturbative heterotic string. We study strings made of D3-branes wrapped on various two-cycles in the global F-theory setup. We claim that E-strings are elementary in the sense that various combinations of E-strings can form M-strings as well as heterotic strings and a new kind of string, called G-strings. Using them, we show that emissions and combinations of heterotic small instantons generate most of the known six-dimensional superconformal theories, their affinizations and little string theories. Taking account of the global structure of the compact internal geometry, we also show that special combinations of E-strings play an important role in constructing six-dimensional theories of D- and E-types. We check global consistency conditions from anomaly cancellation conditions, both from five-branes and strings, and show that they are given in terms of elementary E-string combinations. Comment: 58 pages, 16 figures; v2. version to appear in JHE