
    Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

    In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As our main contribution, we introduce the parallel algorithmic patterns needed to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to many-core hardware. Here, the crucial ingredients are space-filling curves, parallel tree traversal, and batching of linear algebra operations. The resulting model GPU implementation hmglib is, to the best of the authors' knowledge, the first entirely GPU-based open-source $\mathcal{H}$ matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.
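The abstract names space-filling curves as one ingredient but does not say which curve is used. As a minimal sketch, assuming a Z-order (Morton) curve over 2D points in the unit square, the following shows how a space-filling curve turns a point set into a one-dimensional ordering that keeps nearby points close together. The function names (`part1by1`, `morton2d`, `morton_order`) are hypothetical and are not taken from hmglib.

```python
def part1by1(x: int) -> int:
    """Spread the lower 16 bits of x so that a zero bit follows each original bit."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton2d(ix: int, iy: int) -> int:
    """Interleave the bits of two 16-bit grid coordinates (x in the even bits)."""
    return part1by1(ix) | (part1by1(iy) << 1)


def morton_order(points, bits: int = 16):
    """Return the indices of `points` sorted along the Z-order curve.

    Assumption: every point is an (x, y) pair with coordinates in [0, 1].
    """
    scale = (1 << bits) - 1
    keys = [morton2d(int(px * scale), int(py * scale)) for px, py in points]
    return sorted(range(len(points)), key=keys.__getitem__)


if __name__ == "__main__":
    pts = [(0.9, 0.9), (0.1, 0.1), (0.9, 0.1), (0.1, 0.9)]
    print(morton_order(pts))
```

Each key is computed independently of all others, so on a many-core processor the key computation is one thread per point followed by a parallel sort. Contiguous index ranges in the sorted order can then serve as clusters for a tree.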

    Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

    Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find prominent authors, but these authority indices do not differentiate authority by research topic. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citations, and topical authority in a corpus of academic papers. LTAI differs from previous models in two main aspects. First, it explicitly models the generative process of the citations, rather than treating the citations as given. Second, it models each author's influence on the citations of a paper based on the topics of both the cited and the citing papers. We fit LTAI to four academic corpora: CORA, Arxiv Physics, PNAS, and Citeseer. We compare the performance of LTAI against various baselines, ranging from latent Dirichlet allocation to more advanced models, including the author-link topic model and the dynamic author citation topic model. The results show that LTAI achieves improved accuracy over other similar models when predicting words, citations, and authors of publications. Comment: Accepted by Transactions of the Association for Computational Linguistics (TACL); to appea

    High frequency trading and end-of-day price dislocation: [Version 28 October 2013]

    We show that the presence of high frequency trading (HFT) has significantly mitigated the frequency and severity of end-of-day price dislocation, counter to recent concerns expressed in the media. The effect of HFT is more pronounced on days when end-of-day price dislocation is more likely to be the result of market manipulation, namely option expiry dates and month ends. Moreover, the effect of HFT is more pronounced than that of trading rules, surveillance, enforcement, and legal conditions in curtailing the frequency and severity of end-of-day price dislocation. We show that our findings are robust to different proxies for the start of HFT: trade size, cancellation of orders, and co-location.

    Suboptimal greedy power allocation schemes for discrete bit loading

    In this paper, we consider low-cost discrete bit loading based on greedy power allocation (GPA) under the constraints of a total transmit power budget, a target BER, and a maximum permissible QAM modulation order. Compared to the standard GPA, which is optimal in terms of maximising the data throughput, three suboptimal schemes are proposed that perform GPA on subsets of subchannels only. These subsets are created by considering the minimum SNR boundaries of QAM levels for a given target BER. We demonstrate how these schemes can significantly reduce the computational complexity required for power allocation, particularly when the number of subchannels is large. Two of the proposed algorithms achieve near-optimal performance by transferring residual power between subsets, at the expense of a very small extra cost. By simulations, we show that the two near-optimal schemes, while greatly reducing complexity, each perform best in a separate and distinct SNR region.
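For orientation, the standard full-search GPA that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the common gap approximation, in which carrying `b` bits on a subchannel with gain `g` needs power `gap * (2**b - 1) / g`. The function name, the parameters, and the one-bit step are hypothetical. The three subset-based schemes proposed in the paper are not shown.

```python
import heapq


def greedy_bit_loading(gains, power_budget, gap=1.0, max_bits=6):
    """Full-search greedy power allocation (illustrative sketch).

    gains        : positive channel gains (SNR per unit transmit power)
    power_budget : total transmit power available
    gap          : SNR gap implied by the target BER (assumed model)
    max_bits     : cap standing in for the maximum permissible QAM order
    Returns (bits per subchannel, total power used).
    """
    def power(b, g):
        # Power needed to carry b bits on a subchannel with gain g.
        return gap * (2 ** b - 1) / g

    bits = [0] * len(gains)
    used = 0.0
    # Each heap entry is (cost of the next bit on subchannel i, i).
    heap = [(power(1, g) - power(0, g), i) for i, g in enumerate(gains) if g > 0]
    heapq.heapify(heap)

    while heap:
        cost, i = heapq.heappop(heap)
        if used + cost > power_budget:
            break  # the cheapest upgrade is unaffordable, so all others are too
        used += cost
        bits[i] += 1
        if bits[i] < max_bits:
            next_cost = power(bits[i] + 1, gains[i]) - power(bits[i], gains[i])
            heapq.heappush(heap, (next_cost, i))
    return bits, used


if __name__ == "__main__":
    print(greedy_bit_loading([1.0, 0.5], power_budget=5.0))
```

Every iteration considers the cheapest upgrade over all subchannels, which is where the cost grows with the number of subchannels. Restricting the search to subsets, as the abstract describes, targets exactly this step.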