Parametric t-Distributed Stochastic Exemplar-centered Embedding
Parametric embedding methods such as parametric t-SNE (pt-SNE) have been
widely adopted for data visualization and out-of-sample data embedding without
further computationally expensive optimization or approximation. However, the
performance of pt-SNE is highly sensitive to the batch-size hyper-parameter due
to conflicting optimization goals, and pt-SNE often produces dramatically different
embeddings with different choices of user-defined perplexities. To effectively
solve these issues, we present parametric t-distributed stochastic
exemplar-centered embedding methods. Our strategy learns embedding parameters
by comparing given data only with precomputed exemplars, resulting in a cost
function with linear computational and memory complexity, which is further
reduced by noise contrastive samples. Moreover, we propose a shallow embedding
network with high-order feature interactions for data visualization, which is
much easier to tune but produces comparable performance in contrast to a deep
neural network employed by pt-SNE. We empirically demonstrate, using several
benchmark datasets, that our proposed methods significantly outperform pt-SNE
in terms of robustness, visual effects, and quantitative evaluations.
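The exemplar-centered idea in the abstract above can be illustrated with a minimal sketch. It assumes a Student-t kernel with one degree of freedom (as in t-SNE) and normalizes each point's similarities over the exemplars only, so the cost grows linearly with the number of points. The function names are hypothetical, and this is not the authors' cost function, only an illustration of comparing data with a fixed exemplar set.

```python
from typing import List, Sequence


def student_t_exemplar_probs(point: Sequence[float],
                             exemplars: Sequence[Sequence[float]]) -> List[float]:
    """Similarity of one embedded point to each exemplar.

    Uses the heavy-tailed kernel 1 / (1 + squared distance) and normalizes
    over the exemplars only (an assumption made for this sketch).
    """
    weights = [
        1.0 / (1.0 + sum((p - e) ** 2 for p, e in zip(point, exemplar)))
        for exemplar in exemplars
    ]
    total = sum(weights)
    return [w / total for w in weights]


def exemplar_similarities(points: Sequence[Sequence[float]],
                          exemplars: Sequence[Sequence[float]]) -> List[List[float]]:
    """One row per point; n points and m exemplars cost O(n * m)."""
    return [student_t_exemplar_probs(p, exemplars) for p in points]
```

With a fixed number of exemplars, doubling the number of points doubles the work, in contrast to the pairwise point-to-point comparisons of standard t-SNE.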
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Providing systems the ability to relate linguistic and visual content is one
of the hallmarks of computer vision. Tasks such as text-based image retrieval
and image captioning were designed to test this ability but come with
evaluation measures that have a high variance or are difficult to interpret. We
study an alternative task for systems that match text and images: given a text
query, the system is asked to select the image that best matches the query from
a pair of semantically similar images. The system's accuracy on this Binary
Image SelectiON (BISON) task is interpretable, eliminates the reliability
problems of retrieval evaluations, and focuses on the system's ability to
understand fine-grained visual structure. We gather a BISON dataset that
complements the COCO dataset and use it to evaluate modern text-based image
retrieval and image captioning systems. Our results provide novel insights into
the performance of these systems. The COCO-BISON dataset and corresponding
evaluation code are publicly available from http://hexianghu.com/bison/.
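The binary selection accuracy described above is simple to compute once a system has scored both images for each query. The sketch below assumes each example is given as a pair (score of the correct image, score of the distractor image) and counts ties as errors; the function name and the tie rule are assumptions, not taken from the released evaluation code.

```python
from typing import Iterable, Tuple


def binary_selection_accuracy(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of queries for which the correct image outscores the distractor.

    Each pair is (score_correct, score_distractor). Ties count as incorrect,
    which is an assumption made for this sketch.
    """
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("at least one scored pair is required")
    correct = sum(1 for good, bad in pairs if good > bad)
    return correct / len(pairs)
```

Because every query has exactly two candidates, a system that guesses at random is expected to score 0.5, which is what makes the number easy to interpret.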
Classifying document types to enhance search and recommendations in digital libraries
In this paper, we address the problem of classifying documents available from
the global network of (open access) repositories according to their type. We
show that the metadata that repositories provide for distinguishing
research papers, theses and slides are missing in over 60% of cases. While
these metadata describing document types are useful in a variety of scenarios
ranging from research analytics to improving search and recommender (SR)
systems, this problem has not yet been sufficiently addressed in the context of
the repositories infrastructure. We have developed a new approach for
classifying document types using supervised machine learning based exclusively
on text-specific features. We achieve an F1-score of 0.96 using the random forest and
AdaBoost classifiers, which are the best performing models on our data. By
analysing the SR system logs of the CORE [1] digital library aggregator, we
show that users are an order of magnitude more likely to click on research
papers and theses than on slides. This suggests that using document types as a
feature for ranking/filtering SR results in digital libraries has the potential
to improve user experience. Comment: 12 pages, 21st International Conference on Theory and Practice of
Digital Libraries (TPDL), 2017, Thessaloniki, Greece
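For readers unfamiliar with the F1-score reported above, here is a minimal sketch of how it is computed per document type from true and predicted labels. The abstract does not say which averaging the authors used, so the macro average below is an assumption, and the label strings are illustrative only.

```python
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

For example, with true labels `["paper", "paper", "thesis", "slides"]` and predictions `["paper", "thesis", "thesis", "slides"]`, one paper is mistaken for a thesis, which lowers the F1 of both of those classes while slides remain perfect.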
Intestinal Obstruction in a Dog
On August 3, 1954, a 6-year-old female Collie was admitted to the Stange Memorial Clinic with a history of having an upset stomach for the past several days. Penicillin had been administered, but no improvement was noticed. The animal was examined and found to be extremely depressed and in a toxic condition. The conjunctiva appeared injected and the temperature was 103°F. A hard mass could be detected upon palpation of the lower abdomen on the left side.
High Phenotypic Plasticity, but Low Signals of Local Adaptation to Climate in a Large-Scale Transplant Experiment of Picea abies (L.) Karst. in Europe
Species distribution models are the most common tool for predicting future changes in species range. However, these models often underestimate potential future habitat because they do not account for phenotypic plasticity and local adaptation, even though these are the most important processes in the response of tree populations to rapid climate change. Here, we quantify the difference in the predictions of future range for Norway spruce by (i) deriving a classic, occurrence-based species distribution model (OccurrenceSDM), and (ii) analysing the variation in juvenile tree height and translating this to species occurrence (TraitSDM). Using 32 sites of the most comprehensive European trial series, which includes 1,100 provenances of Norway spruce originating from its natural distribution and from its greatly extended artificial distribution, we fit a universal response function to quantify growth as a function of site and provenance climate. Both the OccurrenceSDM and TraitSDM show a substantial retreat towards the northern latitudes and higher elevations (−55 and −43%, respectively, by the 2080s). However, thanks to the species’ particularly high phenotypic plasticity in juvenile height growth, the decline is delayed. The TraitSDM identifies increasing summer heat paired with decreasing water availability as the main climatic factor restricting growth, while a prolonged frost-free period enables a longer period of active growth and therefore increasing growth potential within the restricted, remaining area. Clear signals of local adaptation to climatic clines spanning the entire range are barely detectable, as they are disguised by a latitudinal cline. This cline strongly reflects population differentiation for the Baltic domain, but fails to capture the high phenotypic variation associated with the geographic heterogeneity in the Central European mountain ranges paired with the species’ history of postglacial migration.
Still, the model is used to provide recommendations on optimal provenance choice for future climate conditions. In essence, assisted migration may not decrease the predicted range decline of Norway spruce, but may help to capitalize on potential opportunities for increased growth associated with warmer climates.
Enhancing Domain Word Embedding via Latent Semantic Imputation
We present a novel method named Latent Semantic Imputation (LSI) to transfer
external knowledge into semantic space for enhancing word embedding. The method
integrates graph theory to extract the latent manifold structure of the
entities in the affinity space and leverages non-negative least squares with
standard simplex constraints and power iteration method to derive spectral
embeddings. It provides an effective and efficient approach to combining entity
representations defined in different Euclidean spaces. Specifically, our
approach generates and imputes reliable embedding vectors for low-frequency
words in the semantic space and benefits downstream language tasks that depend
on word embedding. We conduct comprehensive experiments on a carefully designed
classification problem and language modeling and demonstrate the superiority of
the enhanced embedding via LSI over several well-known benchmark embeddings. We
also confirm the consistency of the results under different parameter settings
of our method. Comment: ACM SIGKDD 201
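The imputation step mentioned in the abstract above can be pictured with a small sketch. It assumes the non-negative, sum-to-one neighbour weights have already been obtained (for instance from the simplex-constrained least-squares fit the abstract mentions) and shows only a fixed-point iteration in which each missing vector is repeatedly replaced by the weighted average of its neighbours while known vectors stay fixed. The function name, data layout, and stopping rule are hypothetical and not taken from the paper.

```python
from typing import Dict, List

Vector = List[float]


def impute_embeddings(known: Dict[str, Vector],
                      weights: Dict[str, Dict[str, float]],
                      dim: int,
                      iterations: int = 100,
                      tol: float = 1e-10) -> Dict[str, Vector]:
    """Iteratively impute vectors for the words listed in `weights`.

    `weights[word]` maps each neighbour to a non-negative weight, and the
    weights of one word are assumed to sum to 1. Neighbours may be known
    words (fixed vectors) or other words that are being imputed.
    """
    unknown = {word: [0.0] * dim for word in weights}  # start from zero vectors
    for _ in range(iterations):
        updated: Dict[str, Vector] = {}
        max_change = 0.0
        for word, neighbours in weights.items():
            new_vec = [0.0] * dim
            for neighbour, weight in neighbours.items():
                source = known[neighbour] if neighbour in known else unknown[neighbour]
                for d in range(dim):
                    new_vec[d] += weight * source[d]
            change = max(abs(a - b) for a, b in zip(new_vec, unknown[word]))
            max_change = max(max_change, change)
            updated[word] = new_vec
        unknown = updated
        if max_change < tol:  # stop once the vectors no longer move
            break
    return unknown
```

In this sketch a word whose neighbours are all known is resolved in a single pass, while a word that depends on other imputed words settles over later passes.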
Modeling dominant height growth using permanent plot data for Pinus brutia stands in the Eastern Mediterranean region
Aim of the study: At present, forest management in the Eastern Mediterranean region is largely based on experience rather than on management plans. To support the development of such plans, this study develops and compares site index equations for pure even-aged Pinus brutia stands in Syria using base-age invariant techniques that realistically describe dominant height growth.
Materials and methods: Data on top height and stand age were obtained in 2008 and 2016 from 80 permanent plots capturing the whole range of variation in site conditions, stand age and stand density. Both the Algebraic Difference Approach (ADA) and the Generalized Algebraic Difference Approach (GADA) were used to fit eight generalized algebraic difference equations in order to identify the one that describes the data best. For this, 61 permanent plots were used for model calibration and 19 plots for validation.
Main results: According to both biological plausibility and model accuracy, the so-called Sloboda equation based on the GADA approach showed the best performance.
Research highlights: The study provides a solid classification and comparison of Pinus brutia stands growing in the Eastern Mediterranean region and can thus be used to support sustainable forest management planning.
Keywords: site index; Generalized Algebraic Difference Approach (GADA); Sloboda equation