18,340 research outputs found
Algorithmic patterns for -matrices on many-core processors
In this work, we consider the reformulation of hierarchical ()
matrix algorithms for many-core processors with a model implementation on
graphics processing units (GPUs). matrices approximate specific
dense matrices, e.g., from discretized integral equations or kernel ridge
regression, leading to log-linear time complexity in dense matrix-vector
products. The parallelization of matrix operations on many-core
processors is difficult due to the complex nature of the underlying algorithms.
While previous algorithmic advances for many-core hardware focused on
accelerating existing matrix CPU implementations by many-core
processors, we here aim at totally relying on that processor type. As main
contribution, we introduce the necessary parallel algorithmic patterns allowing
to map the full matrix construction and the fast matrix-vector
product to many-core hardware. Here, crucial ingredients are space filling
curves, parallel tree traversal and batching of linear algebra operations. The
resulting model GPU implementation hmglib is the, to the best of the authors
knowledge, first entirely GPU-based Open Source matrix library of
this kind. We conclude this work by an in-depth performance analysis and a
comparative performance study against a standard matrix library,
highlighting profound speedups of our many-core parallel approach
Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora
Much of scientific progress stems from previously published findings, but
searching through the vast sea of scientific publications is difficult. We
often rely on metrics of scholarly authority to find the prominent authors but
these authority indices do not differentiate authority based on research
topics. We present Latent Topical-Authority Indexing (LTAI) for jointly
modeling the topics, citations, and topical authority in a corpus of academic
papers. Compared to previous models, LTAI differs in two main aspects. First,
it explicitly models the generative process of the citations, rather than
treating the citations as given. Second, it models each author's influence on
citations of a paper based on the topics of the cited papers, as well as the
citing papers. We fit LTAI to four academic corpora: CORA, Arxiv Physics, PNAS,
and Citeseer. We compare the performance of LTAI against various baselines,
starting with the latent Dirichlet allocation, to the more advanced models
including author-link topic model and dynamic author citation topic model. The
results show that LTAI achieves improved accuracy over other similar models
when predicting words, citations and authors of publications.Comment: Accepted by Transactions of the Association for Computational
Linguistics (TACL); to appea
High frequency trading and end-of-day price dislocation : [Version 28 Oktober 2013]
We show that the presence of high frequency trading (HFT) has significantly mitigated the frequency and severity of end-of-day price dislocation, counter to recent concerns expressed in the media. The effect of HFT is more pronounced on days when end of day price dislocation is more likely to be the result of market manipulation on days of option expiry dates and end of month. Moreover, the effect of HFT is more pronounced than the role of trading rules, surveillance, enforcement and legal conditions in curtailing the frequency and severity of end-of-day price dislocation. We show our findings are robust to different proxies of the start of HFT by trade size, cancellation of orders, and co-location
Suboptimal greedy power allocation schemes for discrete bit loading
In this paper we consider low cost discrete bit loading based on greedy power allocation (GPA) under the constraints of total transmit power budget, target BER, and maximum permissible QAM modulation order. Compared to the standard GPA, which is optimal in terms of maximising the data throughput, three suboptimal schemes are proposed, which perform GPA on subsets of subchannels only. These subsets are created by considering the minimum SNR boundaries of QAM levels for a given target BER. We demonstrate how these schemes can significantly reduce the computational complexity required for power allocation, particularly in the case of a large number of subchannels. Two of the proposed algorithms can achieve near optimal performance including a transfer of residual power between subsets at the expense of a very small extra cost. By simulations, we show that the two near optimal schemes, while greatly reducing complexity, perform best in two separate and distinct SNR regions
Suboptimal greedy power allocation schemes for discrete bit loading
In this paper we consider low cost discrete bit loading based on greedy power allocation (GPA) under the constraints of total transmit power budget, target BER, and maximum permissible QAM modulation order. Compared to the standard GPA, which is optimal in terms of maximising the data throughput, three suboptimal schemes are proposed, which perform GPA on subsets of subchannels only. These subsets are created by considering the minimum SNR boundaries of QAM levels for a given target BER. We demonstrate how these schemes can significantly reduce the computational complexity required for power allocation, particularly in the case of a large number of subchannels. Two of the proposed algorithms can achieve near optimal performance including a transfer of residual power between subsets at the expense of a very small extra cost. By simulations, we show that the two near optimal schemes, while greatly reducing complexity, perform best in two separate and distinct SNR regions
- …