146,850 research outputs found

    Parametric t-Distributed Stochastic Exemplar-centered Embedding

    Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation. However, the performance of pt-SNE is highly sensitive to the hyper-parameter batch size due to conflicting optimization goals, and often produces dramatically different embeddings with different choices of user-defined perplexities. To effectively solve these issues, we present parametric t-distributed stochastic exemplar-centered embedding methods. Our strategy learns embedding parameters by comparing given data only with precomputed exemplars, resulting in a cost function with linear computational and memory complexity, which is further reduced by noise contrastive samples. Moreover, we propose a shallow embedding network with high-order feature interactions for data visualization, which is much easier to tune but produces comparable performance in contrast to a deep neural network employed by pt-SNE. We empirically demonstrate, using several benchmark datasets, that our proposed methods significantly outperform pt-SNE in terms of robustness, visual effects, and quantitative evaluations. Comment: fixed typo
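
To illustrate why comparing data only with exemplars gives a cost that grows linearly in the number of points, here is a minimal sketch. It is not the paper's objective: the per-point normalization, the function names `student_t_similarity` and `exemplar_probabilities`, and the choice of the first 10 points as stand-in exemplars are all assumptions made for illustration. The only borrowed idea is the Student-t kernel with one degree of freedom that t-SNE uses in the embedding space.

```python
import random


def student_t_similarity(point, exemplar):
    """Unnormalized Student-t kernel (one degree of freedom) between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(point, exemplar))
    return 1.0 / (1.0 + sq_dist)


def exemplar_probabilities(points, exemplars):
    """For each point, normalize its similarities over the exemplars only.

    Assumption: per-point normalization is chosen here for simplicity.
    The work is n * m kernel evaluations (n points, m exemplars), so for a
    fixed m it grows linearly in n instead of quadratically.
    """
    rows = []
    for p in points:
        sims = [student_t_similarity(p, e) for e in exemplars]
        total = sum(sims)
        rows.append([s / total for s in sims])
    return rows


random.seed(0)
points = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(1000)]
exemplars = points[:10]  # hypothetical stand-in for precomputed exemplars
probs = exemplar_probabilities(points, exemplars)
kernel_evaluations = len(points) * len(exemplars)
```

With 1000 points and 10 exemplars the sketch performs 10,000 kernel evaluations, whereas an all-pairs comparison over the same points would need on the order of a million.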

    Found in Alberta: Environmental Themes for the Anthropocene edited by Robert Boschman and Mario Trono

    Review of Robert Boschman and Mario Trono’s edited collection Found in Alberta: Environmental Themes for the Anthropocene.

    The Horseshoe Estimator: Posterior Concentration around Nearly Black Vectors

    We consider the horseshoe estimator due to Carvalho, Polson and Scott (2010) for the multivariate normal mean model in the situation that the mean vector is sparse in the nearly black sense. We assume the frequentist framework where the data is generated according to a fixed mean vector. We show that if the number of nonzero parameters of the mean vector is known, the horseshoe estimator attains the minimax $\ell_2$ risk, possibly up to a multiplicative constant. We provide conditions under which the horseshoe estimator combined with an empirical Bayes estimate of the number of nonzero means still yields the minimax risk. We furthermore prove an upper bound on the rate of contraction of the posterior distribution around the horseshoe estimator, and a lower bound on the posterior variance. These bounds indicate that the posterior distribution of the horseshoe prior may be more informative than that of other one-component priors, including the Lasso. Comment: This version differs from the final published version in pagination and typographical detail; Available at http://projecteuclid.org/euclid.ejs/141813426

    Relations between some invariants of algebraic varieties in positive characteristic

    We discuss relations between certain invariants of varieties in positive characteristic, such as the a-number and the height of the Artin-Mazur formal group. We calculate the a-number for Fermat surfaces. Comment: 13 pages

    On Exploring Temporal Graphs of Small Pathwidth

    We show that the Temporal Graph Exploration Problem is NP-complete, even when the underlying graph has pathwidth 2 and the current graph is connected at each time step.

    Calculating the global contribution of coralline algae to carbon burial

    The ongoing increase in anthropogenic carbon dioxide (CO2) emissions is changing the global marine environment and is causing warming and acidification of the oceans. Reduction of CO2 to a sustainable level is required to avoid further marine change. Many studies investigate the potential of marine carbon sinks (e.g. seagrass) to mitigate anthropogenic emissions; however, information on storage by coralline algae and the beds they create is scant. Calcifying photosynthetic organisms, including coralline algae, can act as a CO2 sink via photosynthesis and CaCO3 dissolution and act as a CO2 source during respiration and CaCO3 production on short-term time scales. Long-term carbon storage potential might come from the accumulation of coralline algae deposits over geological time scales. Here, the carbon storage potential of coralline algae is assessed using meta-analysis of their global organic and inorganic carbon production and the processes involved in this metabolism. Organic and inorganic production were estimated at 330 g C m⁻² yr⁻¹ and 880 g CaCO3 m⁻² yr⁻¹ respectively, giving global organic/inorganic C production of 0.7/1.8 × 10⁹ t C yr⁻¹. Calcium carbonate production by free-living/crustose coralline algae (CCA) corresponded to a sediment accretion of 70/450 mm kyr⁻¹. Using this potential carbon storage by coralline algae, the global production of free-living algae/CCA was 0.4/1.2 × 10⁹ t C yr⁻¹, suggesting a total potential carbon sink of 1.6 × 10⁹ t C yr⁻¹. Coralline algae therefore have production rates similar to mangroves, saltmarshes and seagrasses, representing an as yet unquantified but significant carbon store; however, further empirical investigations are needed to determine the dynamics and stability of that store.
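
The total sink quoted in the abstract can be checked as the sum of the two reported components. The sketch below only restates the abstract's figures (0.4 and 1.2, in units of 10^9 t C per year); the variable names are hypothetical, and the CCA share is a derived illustration rather than a number reported in the abstract.

```python
# Figures as quoted in the abstract, in units of 10^9 t C per year.
free_living_production = 0.4
cca_production = 1.2

# Total potential carbon sink: sum of the two components.
# Rounded to one decimal to match the precision of the inputs.
total_sink = round(free_living_production + cca_production, 1)

# Derived for illustration only: fraction of the total attributed to CCA.
cca_share = cca_production / total_sink
```

The sum reproduces the stated 1.6 × 10^9 t C per year, with crustose coralline algae accounting for roughly three quarters of it under these figures.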

    Classifying document types to enhance search and recommendations in digital libraries

    In this paper, we address the problem of classifying documents available from the global network of (open access) repositories according to their type. We show that the metadata provided by repositories enabling us to distinguish research papers, theses and slides are missing in over 60% of cases. While these metadata describing document types are useful in a variety of scenarios ranging from research analytics to improving search and recommender (SR) systems, this problem has not yet been sufficiently addressed in the context of the repositories infrastructure. We have developed a new approach for classifying document types using supervised machine learning based exclusively on text-specific features. We achieve 0.96 F1-score using the random forest and AdaBoost classifiers, which are the best performing models on our data. By analysing the SR system logs of the CORE [1] digital library aggregator, we show that users are an order of magnitude more likely to click on research papers and theses than on slides. This suggests that using document types as a feature for ranking/filtering SR results in digital libraries has the potential to improve user experience. Comment: 12 pages, 21st International Conference on Theory and Practice of Digital Libraries (TPDL), 2017, Thessaloniki, Greece