20,384 research outputs found
Intraspecific variability of Meloidogyne spp. isolates from rice and SCAR markers for the identification of M. graminicola, M. oryzae and M. salasi.
Gisbrecht A, Schleif F-M. Metric and non-metric proximity transformations at linear costs. Neurocomputing. 2015;167:643-657.
Domain-specific (dis-)similarity or proximity measures, used e.g. in alignment algorithms for sequence data, are popular for analyzing complicated data objects and for capturing domain-specific data properties. Without an underlying vector space, these data are given as pairwise (dis-)similarities only. The few available methods for such data focus largely on similarities and do not scale to large datasets. Kernel methods are very effective for metric similarity matrices, also at large scale, but costly transformations are necessary when starting from non-metric (dis-)similarities. We propose an integrative combination of Nyström approximation, potential double centering, and eigenvalue correction to obtain valid kernel matrices at costs linear in the number of samples. The proposed approach makes effective kernel methods accessible. Experiments with several larger (dis-)similarity datasets show that the proposed method achieves much better runtime performance than the standard strategy while keeping competitive model accuracy. The main contribution is an efficient and accurate technique for converting (potentially non-metric) large-scale dissimilarity matrices into approximated positive semi-definite kernel matrices at linear cost.
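The two correction steps named in the abstract can be sketched on a small dense matrix. This is a minimal illustration only: double centering turns (squared) dissimilarities into similarities, and eigenvalue clipping is one common eigenvalue-correction choice that makes the result positive semi-definite. The paper's linear-cost version additionally relies on a Nyström approximation, which is omitted here; the toy matrix `D` is invented for the example.

```python
import numpy as np

def double_center(D):
    """Double centering: turn a (squared) dissimilarity matrix D into
    a similarity matrix S = -0.5 * J D J, with J = I - (1/n) 11^T."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ D @ J

def clip_eigenvalues(S):
    """Eigenvalue correction (clipping variant): set negative
    eigenvalues to zero so the corrected matrix is PSD."""
    vals, vecs = np.linalg.eigh((S + S.T) / 2)  # symmetrize first
    return (vecs * np.clip(vals, 0, None)) @ vecs.T

# toy non-metric dissimilarities (the triangle inequality is violated)
D = np.array([[0., 1., 9.],
              [1., 0., 1.],
              [9., 1., 0.]])
K = clip_eigenvalues(double_center(D))
print(np.linalg.eigvalsh(K).min() >= -1e-10)  # PSD after correction
```

On the full-size problem both steps cost O(n^3) when done densely, which is exactly why the paper combines them with a Nyström approximation to reach linear cost.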
Competition Among Spatially Differentiated Firms: An Empirical Model with an Application to Cement
The theoretical literature of industrial organization shows that the distances between consumers and firms have first-order implications for competitive outcomes whenever transportation costs are large. To assess these effects empirically, we develop a structural model of competition among spatially differentiated firms and introduce a GMM estimator that recovers the structural parameters with only regional-level data. We apply the model and estimator to the portland cement industry. The model fits, both in-sample and out-of-sample, demonstrate that the framework explains well the salient features of competition. We estimate transportation costs to be $0.30 per tonne-mile, given diesel prices at the 2000 level, and show that these costs constrain shipping distances and provide firms with localized market power. To demonstrate policy relevance, we conduct counterfactual simulations that quantify competitive harm from a hypothetical merger. We are able to map the distribution of harm over geographic space and identify the divestiture that best mitigates harm.
Instance-based prediction of real-valued attributes
Instance-based representations have been applied to numerous classification tasks with a fair amount of success. These tasks predict a symbolic class based on observed attributes. This paper presents a method for predicting a numeric value based on observed attributes. We prove that if the numeric values are generated by continuous functions with bounded slope, then the predicted values are accurate approximations of the actual values. We demonstrate the utility of this approach by comparing it with standard approaches for value prediction. The approach requires no background knowledge.
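The simplest form of instance-based prediction of a real-valued attribute is k-nearest-neighbor regression: store the training instances and predict the average target of the k closest ones. This is a minimal sketch of that idea, not the paper's specific method; the data and the helper name `knn_predict` are invented for illustration.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Instance-based prediction of a real-valued target:
    average the targets of the k nearest stored instances."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()

# instances sampled from a smooth (bounded-slope) function y = 2x,
# matching the paper's assumption under which predictions converge
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel()
print(knn_predict(X, y, np.array([4.2]), k=3))  # ~ 8.0 (true value 8.4)
```

The bounded-slope condition in the abstract is what makes this local averaging sound: nearby instances are guaranteed to have nearby target values.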
Complex-valued embeddings of generic proximity data
Proximities are at the heart of almost all machine learning methods. If the input data are given as numerical vectors of equal length, the Euclidean distance or a Hilbertian inner product is frequently used in modeling algorithms. In a more generic view, objects are compared by a (symmetric) similarity or dissimilarity measure, which may not obey particular mathematical properties. This renders many machine learning methods invalid, leading to convergence problems and the loss of guarantees such as generalization bounds. In many cases, the preferred dissimilarity measure is not metric, like the earth mover's distance, or the similarity measure may not be a simple inner product in a Hilbert space but in its generalization, a Krein space. If the input data are non-vectorial, like text sequences, proximity-based learning is used or n-gram embedding techniques can be applied. Standard embeddings lead to the desired fixed-length vector encoding but are costly and have substantial limitations in preserving the original data's full information. As an information-preserving alternative, we propose a complex-valued vector embedding of proximity data. This allows suitable machine learning algorithms to use these fixed-length, complex-valued vectors for further processing. The complex-valued data can serve as input to complex-valued machine learning algorithms. In particular, we address supervised learning and use extensions of prototype-based learning. The proposed approach is evaluated on a variety of standard benchmarks and shows strong performance compared to traditional techniques in processing non-metric or non-PSD proximity data.
Comment: proximity learning, embedding, complex values, complex-valued embedding, learning vector quantization
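One way such a complex-valued embedding can work is via an eigendecomposition of the (possibly indefinite) similarity matrix: eigen-directions with negative eigenvalues receive imaginary coordinates, so the bilinear product of the embedded vectors reproduces the original similarities exactly. This is a minimal sketch under that assumption, not necessarily the paper's exact construction; the matrix `S` is invented for the example.

```python
import numpy as np

def complex_embedding(S):
    """Embed an indefinite similarity matrix into C^n.
    With S = U diag(lam) U^T, the rows of X = U * sqrt(lam) are
    fixed-length complex vectors; sqrt of a negative eigenvalue is
    imaginary, so the bilinear product X X^T recovers S exactly."""
    lam, U = np.linalg.eigh((S + S.T) / 2)
    return U * np.sqrt(lam.astype(complex))

S = np.array([[ 2.,  1., -3.],
              [ 1.,  2., -1.],
              [-3., -1.,  1.]])  # symmetric but indefinite (non-PSD)
X = complex_embedding(S)
# note: the plain (bilinear) product, not the conjugate one
print(np.allclose(X @ X.T, S))  # True
```

Unlike the eigenvalue-clipping correction, this embedding loses no information: the negative part of the spectrum survives in the imaginary coordinates, which is exactly the "information-preserving" property the abstract emphasizes.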