Factorized Topic Models
In this paper we present a modification to a latent topic model that exploits
supervision to produce a factorized representation of the observed data. The
structured parameterization separates variance that is shared between classes
from variance that is private to each class through the introduction of a new
prior over the topic space. The approach allows for more efficient inference
and provides an intuitive interpretation of the data in terms of an
informative signal together with structured noise. The factorized
representation is shown to enhance inference performance for image, text, and
video classification.

Comment: ICLR 201
Evaluating the Differences of Gridding Techniques for Digital Elevation Models Generation and Their Influence on the Modeling of Stony Debris Flows Routing: A Case Study From Rovina di Cancia Basin (North-Eastern Italian Alps)
Debris flows are among the most hazardous phenomena in mountain areas. To cope
with debris flow hazard, it is common to delineate the risk-prone areas through
routing models. The most important input to debris flow routing models is
topographic data, usually in the form of Digital Elevation Models (DEMs). The quality
of DEMs depends on the accuracy, density, and spatial distribution of the sampled
points; on the characteristics of the surface; and on the applied gridding methodology.
Therefore, the choice of the interpolation method affects the realistic representation
of the channel and fan morphology, and thus potentially the debris flow routing
modeling outcomes. In this paper, we initially investigate the performance of common
interpolation methods (i.e., linear triangulation, natural neighbor, nearest neighbor,
Inverse Distance to a Power, ANUDEM, Radial Basis Functions, and ordinary kriging)
in building DEMs of the complex topography of a debris flow channel located
in the Venetian Dolomites (North-eastern Italian Alps), using small-footprint full-
waveform Light Detection And Ranging (LiDAR) data. The investigation is carried
out through a combination of statistical analysis of vertical accuracy, algorithm
robustness, spatial clustering of vertical errors, and multi-criteria shape reliability
assessment. After that, we examine the influence of the tested interpolation algorithms
on the performance of a Geographic Information System (GIS)-based cell model for
simulating stony debris flow routing. In detail, we investigate both the correlation
between the DEM height uncertainty resulting from the gridding procedure and the
uncertainty in the corresponding simulated erosion/deposition depths, and the effect of
the interpolation algorithms on simulated areas, erosion and deposition volumes,
solid-liquid discharges, and channel morphology after the event. The comparison among
the tested interpolation methods highlights that the ANUDEM and ordinary kriging
algorithms are not suitable for building DEMs of complex topography. Conversely,
linear triangulation, the natural neighbor algorithm, and the thin-plate spline with
tension and completely regularized spline functions ensure the best trade-off between
accuracy and shape reliability. Nevertheless, the evaluation of the effects of gridding
techniques on debris flow routing modeling reveals that the choice of the interpolation
algorithm does not significantly affect the model outcomes.
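The accuracy side of such a comparison can be sketched in a few lines: hold out a
subset of the sampled points, grid the rest with different interpolators, and measure
the vertical error at the held-out locations. This is a minimal illustration, not the
paper's workflow; the synthetic surface, point counts, and the use of SciPy's
`griddata` (whose "linear" and "nearest" modes stand in for linear triangulation and
nearest-neighbor gridding) are all assumptions for the sketch.

```python
# Minimal sketch: rank gridding methods by vertical RMSE at held-out points.
# The terrain surface below is synthetic, standing in for a LiDAR point cloud.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(2000, 2))          # sampled (x, y) positions
z = 10 * np.sin(pts[:, 0] / 15) + 0.2 * pts[:, 1]  # hypothetical channel-like surface

train, test = pts[:1600], pts[1600:]
z_train, z_test = z[:1600], z[1600:]

rmse = {}
for method in ("linear", "nearest", "cubic"):      # stand-ins for TIN / NN gridding
    z_hat = griddata(train, z_train, test, method=method)
    ok = ~np.isnan(z_hat)                          # points outside the hull get no estimate
    rmse[method] = float(np.sqrt(np.mean((z_hat[ok] - z_test[ok]) ** 2)))
    print(f"{method:8s} vertical RMSE: {rmse[method]:.3f} m")
```

On a smooth surface like this one, triangulation-based linear interpolation should
beat nearest-neighbor gridding by a wide margin; on real, rough debris-flow
topography the ranking is exactly what the statistical analysis has to establish.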
Utility in WTP space: a tool to address confounding random scale effects in destination choice to the Alps
Destination choice models with individual-specific taste variation have become the presumptive analytical approach in applied nonmarket valuation. Under the usual specification, tastes are represented by coefficients of site attributes that enter utility, and the distribution of these coefficients is estimated. The distribution of willingness to pay (WTP) for site attributes is then derived from the estimated distribution of coefficients. Though conceptually appealing, this procedure often results in untenable distributions of willingness to pay. An alternative procedure is to estimate the distribution of willingness to pay directly, through a re-parameterization of the model. We compare hierarchical Bayes and maximum simulated likelihood estimates under both approaches, using data on site choice in the Alps. We find that models parameterized in terms of WTP provide more reasonable estimates for the distribution of WTP, and also fit the data better than models parameterized in terms of attribute coefficients. This approach to parameterizing utility is hence deemed promising for applied nonmarket valuation.
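Why the derived WTP distribution can become untenable is easy to see with a Monte
Carlo sketch: in preference space WTP is the ratio of two random coefficients, and
whenever the cost coefficient has mass near zero the ratio develops extreme tails,
whereas a WTP-space model specifies the WTP distribution directly. This is an
illustration of the mechanism only, not the paper's estimator; all distributions and
parameter values below are hypothetical.

```python
# Illustrative sketch: derived WTP (ratio of preference-space coefficients)
# versus directly specified WTP. All parameter values are made up.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Preference space: random attribute and cost coefficients.
beta_attr = rng.normal(2.0, 0.5, n)
beta_cost = -np.abs(rng.normal(0.5, 0.3, n))   # cost coefficient, kept negative
wtp_derived = -beta_attr / beta_cost           # implied WTP = coefficient ratio

# WTP space: specify the WTP distribution directly instead.
wtp_direct = rng.normal(4.0, 1.0, n)

print("derived WTP: median %.2f, 99th pct %.1f"
      % (np.median(wtp_derived), np.percentile(wtp_derived, 99)))
print("direct  WTP: median %.2f, 99th pct %.1f"
      % (np.median(wtp_direct), np.percentile(wtp_direct, 99)))
```

The medians of the two distributions are comparable, but the derived WTP's upper
tail explodes whenever the simulated cost coefficient is close to zero, which is the
kind of implausible spread the re-parameterization is meant to avoid.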
A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data
Topic modeling based on latent Dirichlet allocation (LDA) has been a
framework of choice to deal with multimodal data, such as in image annotation
tasks. Another popular approach to model the multimodal data is through deep
neural networks, such as the deep Boltzmann machine (DBM). Recently, a new type
of topic model called the Document Neural Autoregressive Distribution Estimator
(DocNADE) was proposed and demonstrated state-of-the-art performance for text
document modeling. In this work, we show how to successfully apply and extend
this model to multimodal data, such as simultaneous image classification and
annotation. First, we propose SupDocNADE, a supervised extension of DocNADE,
that increases the discriminative power of the learned hidden topic features
and show how to employ it to learn a joint representation from image visual
words, annotation words and class label information. We test our model on the
LabelMe and UIUC-Sports data sets and show that it compares favorably to other
topic models. Second, we propose a deep extension of our model and provide an
efficient way of training the deep model. Experimental results show that our
deep model outperforms its shallow version and reaches state-of-the-art
performance on the Multimedia Information Retrieval (MIR) Flickr data set.

Comment: 24 pages, 10 figures. A version has been accepted by TPAMI on Aug
4th, 2015. Adds a footnote about how to train the model in practice in Section
5.1. arXiv admin note: substantial text overlap with arXiv:1305.530
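The core idea such neural autoregressive topic models build on is the factorization
p(v) = prod_i p(v_i | v_<i), where each conditional over the vocabulary is computed
from a hidden state that accumulates embeddings of the preceding words. The sketch
below shows only these mechanics with random parameters; the sizes, the single
weight-sharing scheme, and the absence of any training loop are simplifications, not
the published model.

```python
# Minimal sketch of a DocNADE-style autoregressive factorization:
# p(v) = prod_i p(v_i | v_<i). Parameters are random, untrained, and
# hypothetical; this only demonstrates how the conditionals are chained.
import numpy as np

rng = np.random.default_rng(2)
V, H = 50, 16                      # vocabulary size, hidden units (made up)
W = rng.normal(0, 0.1, (H, V))     # word-to-hidden embeddings
U = rng.normal(0, 0.1, (V, H))     # hidden-to-output weights
b, c = np.zeros(V), np.zeros(H)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def doc_log_likelihood(words):
    """Sum of log p(v_i | v_<i) over the document's word sequence."""
    acc = np.zeros(H)              # running sum of embeddings of v_<i
    logp = 0.0
    for w in words:
        h = 1.0 / (1.0 + np.exp(-(c + acc)))   # sigmoid hidden state
        p = softmax(b + U @ h)                 # conditional over the vocabulary
        logp += np.log(p[w])
        acc += W[:, w]
    return logp

doc = rng.integers(0, V, size=20)
print("log p(doc) =", doc_log_likelihood(doc))
```

A supervised extension along the lines described in the abstract would add a class
label term to the training objective so the hidden states become discriminative; the
multimodal case feeds image visual words and annotation words through the same chain.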
Quantitative Perspectives on Fifty Years of the Journal of the History of Biology
Journal of the History of Biology provides a fifty-year long record for
examining the evolution of the history of biology as a scholarly discipline. In
this paper, we present a new dataset and preliminary quantitative analysis of
the thematic content of JHB from the perspectives of geography, organisms, and
thematic fields. The geographic diversity of authors whose work appears in JHB
has increased steadily since 1968, but the geographic coverage of the content
of JHB articles remains strongly lopsided toward the United States, United
Kingdom, and western Europe and has diversified much less dramatically over
time. The taxonomic diversity of organisms discussed in JHB increased steadily
between 1968 and the late 1990s but declined in later years, mirroring broader
patterns of diversification previously reported in the biomedical research
literature. Finally, we used a combination of topic modeling and nonlinear
dimensionality reduction techniques to develop a model of multi-article fields
within JHB. We found evidence for directional changes in the representation of
fields on multiple scales. The diversity of JHB with regard to the
representation of thematic fields has increased overall, with most of that
diversification occurring in recent years. Drawing on the dataset generated in
the course of this analysis, as well as web services in the emerging digital
history and philosophy of science ecosystem, we have developed an interactive
web platform for exploring the content of JHB, and we provide a brief overview
of the platform in this article. As a whole, the data and analyses presented
here provide a starting-place for further critical reflection on the evolution
of the history of biology over the past half-century.

Comment: 45 pages, 14 figures, 4 tables