912 research outputs found

    Network Lasso: Clustering and Optimization in Large Graphs

    Convex optimization is an essential tool for modern data analysis, as it provides a framework to formulate and solve many problems in machine learning and data mining. However, general convex optimization solvers do not scale well, and scalable solvers are often specialized to only work on a narrow class of problems. Therefore, there is a need for simple, scalable algorithms that can solve many common optimization problems. In this paper, we introduce the network lasso, a generalization of the group lasso to a network setting that allows for simultaneous clustering and optimization on graphs. We develop an algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in a distributed and scalable manner, which allows for guaranteed global convergence even on large graphs. We also examine a non-convex extension of this approach. We then demonstrate that many types of problems can be expressed in our framework. We focus on three in particular (binary classification, predicting housing prices, and event detection in time series data), comparing the network lasso to baseline approaches and showing that it is both a fast and accurate method of solving large optimization problems.

    Learning a kernel matrix for nonlinear dimensionality reduction

    We investigate how to learn a kernel matrix for high dimensional data that lies on or near a low dimensional manifold. Noting that the kernel matrix implicitly maps the data into a nonlinear feature space, we show how to discover a mapping that unfolds the underlying manifold from which the data was sampled. The kernel matrix is constructed by maximizing the variance in feature space subject to local constraints that preserve the angles and distances between nearest neighbors. The main optimization involves an instance of semidefinite programming, a fundamentally different computation than previous algorithms for manifold learning, such as Isomap and locally linear embedding. The optimized kernels perform better than polynomial and Gaussian kernels for problems in manifold learning, but worse for problems in large margin classification. We explain these results in terms of the geometric properties of different kernels and comment on various interpretations of other manifold learning algorithms as kernel methods.

    Hierarchical Distributed Representations for Statistical Language Modeling

    Statistical language models estimate the probability of a word occurring in a given context. The most common language models rely on a discrete enumeration of predictive contexts (e.g., n-grams) and consequently fail to capture and exploit statistical regularities across these contexts. In this paper, we show how to learn hierarchical, distributed representations of word contexts that maximize the predictive value of a statistical language model. The representations are initialized by unsupervised algorithms for linear and nonlinear dimensionality reduction [14], then fed as input into a hierarchical mixture of experts, where each expert is a multinomial distribution over predicted words [12]. While the distributed representations in our model are inspired by the neural probabilistic language model of Bengio et al. [2, 3], our particular architecture enables us to work with significantly larger vocabularies and training corpora. For example, on a large-scale bigram modeling task involving a sixty thousand word vocabulary and a training corpus of three million sentences, we demonstrate consistent improvement over class-based bigram models [10, 13]. We also discuss extensions of our approach to longer multiword contexts.

    Association of prenatal perchlorate, thiocyanate, and nitrate exposure with neonatal size and gestational age

    BACKGROUND: Perchlorate and similar anions compete with iodine for uptake into the thyroid by the sodium iodide symporter (NIS). This may restrict fetal growth via impaired thyroid hormone production. METHODS: We collected urine samples from 107 pregnant women and used linear regression to estimate differences in newborn size and gestational age associated with increases in perchlorate, thiocyanate, nitrate, and perchlorate equivalence concentrations (PEC; measure of total NIS inhibitor exposure). RESULTS: NIS inhibitor concentrations were not associated with newborn weight, length, or gestational age. Each 2.62 ng/µg creatinine increase in perchlorate was associated with smaller head circumference (0.32 cm; 95% CI: -0.66, 0.01), but each 3.38 ng/µg increase in PEC was associated with larger head circumference (0.48 cm; -0.01, 0.97). CONCLUSIONS: These anions may have effects on fetal development (e.g., neurocognitive) that are not reflected in gross measures. Future research should focus on other abnormalities in neonates exposed to NIS inhibitors.

    A Novel Hybrid Scheme Using Genetic Algorithms and Deep Learning for the Reconstruction of Portuguese Tile Panels

    This paper presents a novel scheme, based on a unique combination of genetic algorithms (GAs) and deep learning (DL), for the automatic reconstruction of Portuguese tile panels, a challenging real-world variant of the jigsaw puzzle problem (JPP) with important national heritage implications. Specifically, we introduce an enhanced GA-based puzzle solver, whose integration with a novel DL-based compatibility measure (DLCM) yields state-of-the-art performance for this application. Current compatibility measures typically consider (the chromatic information of) edge pixels (between adjacent tiles), and help achieve high accuracy for the synthetic JPP variant. However, such measures exhibit rather poor performance when applied to the Portuguese tile panels, which are susceptible to various real-world effects, e.g., monochromatic panels, non-squared tiles, and edge degradation. To overcome such difficulties, we have developed a novel DLCM to extract high-level texture/color statistics from the entire tile information. Integrating this measure with our enhanced GA-based puzzle solver, we have demonstrated, for the first time, how to deal most effectively with large-scale real-world problems, such as the Portuguese tile problem. Specifically, we have achieved 82% accuracy for the reconstruction of Portuguese tile panels with unknown piece rotation and puzzle dimension (compared to merely 3.5% average accuracy achieved by the best method known for solving this problem variant). The proposed method outperforms even human experts in several cases, correcting their mistakes in the manual tile assembly.

    Nonadiabatic approach to dimerization gap and optical absorption coefficient of the Su-Schrieffer-Heeger model

    An analytical nonadiabatic approach has been developed to study the dimerization gap and the optical absorption coefficient of the Su-Schrieffer-Heeger model where the electrons interact with dispersive quantum phonons. By investigating quantitatively the effects of quantum phonon fluctuations on the gap order and the optical responses in this system, we show that the dimerization gap is reduced much more by the quantum lattice fluctuations than the optical absorption coefficient is. The calculated optical absorption coefficient and the density of states do not have the inverse-square-root singularity, but have a peak above the gap edge, and there exists a significant tail below the peak. The peak of the optical absorption spectrum does not correspond directly to the dimerized gap. Our results for the optical absorption coefficient agree well with those of the experiments in both the shape and the peak position of the optical absorption spectrum. Comment: 14 pages, 7 figures. To be published in PR

    Topological and geometrical restrictions, free-boundary problems and self-gravitating fluids

    Let (P1) be a certain elliptic free-boundary problem on a Riemannian manifold (M,g). In this paper we study the restrictions on the topology and geometry of the fibres (the level sets) of the solutions f to (P1). We give a technique based on a certain remarkable property of the fibres (the analytic representation property) for going from the initial PDE to a global analytical characterization of the fibres (the equilibrium partition condition). We study this analytical characterization and obtain several topological and geometrical properties that the fibres of the solutions must possess, depending on the topology of M and the metric tensor g. We apply these results to the classical problem in physics of classifying the equilibrium shapes of both Newtonian and relativistic static self-gravitating fluids. We also suggest a relationship with the isometries of a Riemannian manifold. Comment: 36 pages. In this new version the analytic representation hypothesis is proved. Please address all correspondence to D. Peralta-Sala

    Radiative Decays of the Upsilon(1S) to a Pair of Charged Hadrons

    Using data obtained with the CLEO III detector, running at the Cornell Electron Storage Ring (CESR), we report on a new study of exclusive radiative Upsilon(1S) decays into the final states gamma pi^+ pi^-, gamma K^+ K^-, and gamma p pbar. We present branching ratio measurements for the decay modes Upsilon(1S) to gamma f_2(1270), Upsilon(1S) to gamma f_2'(1525), and Upsilon(1S) to gamma K^+ K^-; helicity production ratios for f_2(1270) and f_2'(1525); upper limits for the decay Upsilon(1S) to gamma f_J(2220), with f_J(2220) to pi^+ pi^-, K^+ K^-, p pbar; and an upper limit for the decay Upsilon(1S) to gamma X(1860), with X(1860) to gamma p pbar. Comment: 17 pages postscript, also available through http://www.lns.cornell.edu/public/CLNS/2005/, Submitted to PR

    Update of the measurement of the cross section for e^+e^- -> psi(3770) -> hadrons

    We have updated our measurement of the cross section for e^+e^- -> psi(3770) -> hadrons, reported in our publication "Measurement of sigma(e^+e^- -> psi(3770) -> hadrons) at E_{c.m.} = 3773 MeV", arXiv:hep-ex/0512038, Phys.Rev.Lett.96, 092002 (2006). Simultaneous with this arXiv update, we have published an erratum in Phys.Rev.Lett.104, 159901 (2010). There, and in this update, we have corrected a mistake in the computation of the error on the difference of the cross sections for e^+e^- -> psi(3770) -> hadrons and e^+e^- -> psi(3770) -> DDbar. We have also used a more recent CLEO measurement of the cross section for e^+e^- -> psi(3770) -> DDbar. From this, we obtain an upper limit on the branching fraction for psi(3770) -> non-DDbar of 9% at 90% confidence level. Comment: 3 pages, 0 figures. This is an erratum to Phys.Rev.Lett.96:092002,2006. Added a reference

    Measurement of Interfering K^*+K^- and K^*-K^+ Amplitudes in the Decay D^0 --> K^+K^-pi^0

    We have studied the Cabibbo-suppressed decay mode D^0 into K^+ K^- pi^0 using a Dalitz plot technique and find the strong phase difference delta_D [defined as delta_(K*^- K^+) - delta_(K*^+ K^-)] = 332 degrees +- 8 degrees +- 11 degrees and relative amplitude r_D [defined as a_(K*^- K^+) / a_(K*^+ K^-)] = 0.52 +- 0.05 +- 0.04. This measurement indicates significant destructive interference between D^0 into K^+ (K^- pi^0)_K*^- and D^0 into K^- (K^+ pi^0)_K*^+ in the Dalitz plot region where these two modes overlap. This analysis uses 9.0 fb^(-1) of data collected at s^(1/2) of approximately 10.58 GeV with the CLEO III detector. Comment: 10 pages postscript, also available through http://www.lns.cornell.edu/public/CLNS/2006/, Submitted to Phys. Rev. D (Rapid Communications)