Asymptotic Glosten Milgrom equilibrium
This paper studies the Glosten Milgrom model whose risky asset value admits
an arbitrary discrete distribution. In contrast to existing results on insider
models, the insider's optimal strategy in this model, if it exists, is not of
feedback type. Therefore a weak formulation of equilibrium is proposed. In this
weak formulation, the inconspicuous trade theorem still holds, but the
optimality of the insider's strategy is not enforced. However, the insider can
employ a feedback strategy whose associated expected profit is close to the
optimal value when the order size is small. Moreover, this discrepancy
converges to zero as the order size diminishes. The existence of such a weak
equilibrium is established, in which the insider's strategy converges to the
Kyle optimal strategy as the order size goes to zero.
Pitfalls and Remedies for Cross Validation with Multi-trait Genomic Prediction Methods.
Incorporating measurements on correlated traits into genomic prediction models can increase prediction accuracy and selection gain. However, multi-trait genomic prediction models are complex and prone to overfitting, which may result in a loss of prediction accuracy relative to single-trait genomic prediction. Cross-validation is considered the gold standard method for selecting and tuning models for genomic prediction in both plant and animal breeding. When used appropriately, cross-validation gives an accurate estimate of the prediction accuracy of a genomic prediction model, and can effectively choose among disparate models based on their expected performance in real data. However, we show that a naive cross-validation strategy applied to the multi-trait prediction problem can be severely biased and lead to sub-optimal choices between single and multi-trait models when secondary traits are used to aid in the prediction of focal traits and these secondary traits are measured on the individuals to be tested. We use simulations to demonstrate the extent of the problem and propose three partial solutions: 1) a parametric solution from selection index theory, 2) a semi-parametric method for correcting the cross-validation estimates of prediction accuracy, and 3) a fully non-parametric method which we call CV2*: validating model predictions against focal trait measurements from genetically related individuals. The current excitement over high-throughput phenotyping suggests that more comprehensive phenotype measurements will be useful for accelerating breeding programs. Using an appropriate cross-validation strategy should more reliably determine if and when combining information across multiple traits is useful.
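The kind of bias described above can be illustrated with a small simulation. This is only a sketch under assumptions that the abstract does not spell out: it supposes the inflation comes from non-genetic (environmental) covariance between the secondary and focal traits, and it uses a deliberately extreme toy case in which the secondary trait carries no genetic signal at all. The function names `simulate` and `pearson`, the parameter `env_corr`, and all numeric settings are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=5000, env_corr=0.8, seed=1):
    """Toy data (assumed model): the secondary trait is purely environmental,
    but its environmental noise is correlated with the focal trait's noise."""
    rng = random.Random(seed)
    genetic, y_focal, y_secondary = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)        # true genetic value of the focal trait
        shared = rng.gauss(0, 1)   # environmental effect seen by both traits
        e_focal = env_corr * shared + math.sqrt(1 - env_corr ** 2) * rng.gauss(0, 1)
        genetic.append(g)
        y_focal.append(g + e_focal)    # observed focal phenotype
        y_secondary.append(shared)     # observed secondary phenotype
    return genetic, y_focal, y_secondary


genetic, y_focal, y_secondary = simulate()

# "Prediction" for each test individual: its own secondary-trait measurement.
prediction = y_secondary

# Naive validation: compare predictions with the test individuals' own phenotypes.
naive_accuracy = pearson(prediction, y_focal)
# What breeders care about: agreement with the true genetic value.
true_accuracy = pearson(prediction, genetic)

print(f"naive CV accuracy: {naive_accuracy:.3f}")
print(f"accuracy against genetic value: {true_accuracy:.3f}")
```

In this toy setting the naive estimate looks respectable even though the prediction carries no information about genetic merit, which is the sort of misleading comparison the abstract warns about. The paper's proposed corrections (the selection-index, semi-parametric, and CV2* approaches) are not implemented here.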
Majorana Edge States in Interacting Two-chain Ladders of Fermions
In this work we study interacting spinless fermions on a two-chain ladder
with inter-chain pair tunneling while single-particle tunneling is suppressed
at low energy. The model embodies a symmetry associated with the
fermion parity on each chain. We find that when the system is driven to the
strong-coupling phase by the pair tunneling, Majorana excitations appear on the
boundary. Such Majorana edge states correspond to two-fold degeneracy of ground
states distinguished by different fermion parity on each chain, thus
representing a generalization of one-dimensional topological superconductors.
We also characterize the stability of the ground state degeneracy against local
perturbations. Lattice fermion models realizing such effective field theory are
discussed. Comment: 6 pages, 1 figure
Fast k-means based on KNN Graph
In the era of big data, k-means clustering has been widely adopted as a basic
processing tool in various contexts. However, its computational cost can be
prohibitively high when the data size and the number of clusters are large. It is
well known that the processing bottleneck of k-means lies in the operation of
seeking the closest centroid in each iteration. In this paper, a novel solution
to the scalability issue of k-means is presented. In the proposal, k-means
is supported by an approximate k-nearest neighbors graph. In each k-means
iteration, each data sample is compared only to the clusters in which its nearest
neighbors reside. Since the number of nearest neighbors considered is much
smaller than k, the processing cost of this step becomes minor and independent of
k. The processing bottleneck is therefore overcome. Most interestingly, the
k-nearest neighbor graph is constructed by iteratively calling the fast
k-means itself. Compared with existing fast k-means variants, the proposed
algorithm achieves a speed-up of hundreds to thousands of times while maintaining high
clustering quality. When tested on 10 million 512-dimensional data samples, it
takes only 5.2 hours to produce 1 million clusters. In contrast, to fulfill the
same scale of clustering, it would take 3 years for traditional k-means
- …
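The restricted assignment step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the graph here is built by brute force rather than approximately (and not by calling the fast k-means itself, as the paper does), the initial labels are supplied by the caller, and the names `knn_graph`, `graph_kmeans`, and `sq_dist` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(points, n_neighbors):
    """Brute-force KNN graph; a stand-in for the approximate graph in the paper."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sq_dist(p, points[j]))
        graph.append(others[:n_neighbors])
    return graph


def graph_kmeans(points, k, graph, init_labels, n_iter=10):
    """k-means in which each point is compared only against the clusters
    currently holding the point itself or one of its graph neighbors."""
    labels = list(init_labels)
    dim = len(points[0])
    for _ in range(n_iter):
        # Update step: recompute centroids from the current labels.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, c in zip(points, labels):
            counts[c] += 1
            for d in range(dim):
                sums[c][d] += p[d]
        centroids = [
            [s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)
        ]
        # Assignment step: candidate clusters come from the neighbors only,
        # so the cost per point depends on the neighbor count, not on k.
        new_labels = []
        for i, p in enumerate(points):
            candidates = {labels[i]} | {labels[j] for j in graph[i]}
            new_labels.append(min(candidates, key=lambda c: sq_dist(p, centroids[c])))
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


points = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
graph = knn_graph(points, n_neighbors=2)
print(graph_kmeans(points, k=2, graph=graph, init_labels=[0, 1, 0, 1, 0, 1]))
# prints [0, 0, 0, 1, 1, 1]
```

Every candidate cluster contains at least one point (the point itself or a neighbor), so its centroid is always defined. The quality of the result depends on the initial labels and on the graph, which is why the choice of initialization here should be read as a placeholder.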