416 research outputs found

    PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs.

    BACKGROUND: Genetic association studies are currently the primary vehicle for identifying and characterizing disease-predisposing variants, and they usually involve multiple single-nucleotide polymorphisms (SNPs). However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of accounting for correlations between neighbouring SNPs, yet their assumption of Hardy-Weinberg equilibrium (HWE) and their potentially large number of degrees of freedom can harm statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard, but their performance varies with the method of extracting principal components (PCs). RESULTS: A PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses the PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs: from cases only (CAES), from controls only (COES), and from cases and controls combined (CES). Extraction of PCs with COES is preferred to that with CAES and CES. Performance of the test was examined via simulations as well as analyses of data on rheumatoid arthritis and heroin addiction; the test maintained the nominal level under the null hypothesis and showed performance comparable to that of the permutation test. CONCLUSIONS: PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    Security Analysis of Pairing-based Cryptography

    Recent progress in the number field sieve (NFS) has shaken the security of pairing-based cryptography. For the discrete logarithm problem (DLP) in finite fields, we present the first systematic review of the NFS algorithms from three perspectives: the degree $\alpha$, the constant $c$, and the hidden constant $o(1)$ in the asymptotic complexity $L_Q\left(\alpha,c\right)$, and indicate that further research is required to optimize the hidden constant. Using the special extended tower NFS algorithm, we conduct a thorough security evaluation for all the existing standardized PF curves as well as several commonly utilized curves, which reveals that the BN256 curves recommended by the SM9 and the previous ISO/IEC standard exhibit only 99.92 bits of security, significantly lower than the intended 128-bit level. In addition, we comprehensively analyze the security and efficiency of BN, BLS, and KSS curves for different security levels. Our analysis suggests that the BN curve exhibits superior efficiency for security strengths below approximately 105 bits. For a 128-bit security level, BLS12 and BLS24 curves are the optimal choices, while the BLS24 curve offers the best efficiency for security levels of 160, 192, and 256 bits. Comment: 8 figures, 8 tables, 5121 words
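
    For readers unfamiliar with the notation, the complexity function named in this abstract is conventionally defined as below. This is the standard textbook definition, not a formula quoted from the paper, and the symbol $Q$ is assumed here to denote the size of the finite field:

    \[
      L_Q(\alpha, c) = \exp\!\left( \bigl(c + o(1)\bigr)\,(\ln Q)^{\alpha}\,(\ln \ln Q)^{1-\alpha} \right),
      \qquad 0 \le \alpha \le 1,\; c > 0 .
    \]

    Under this definition, $\alpha = 0$ gives polynomial time in $\ln Q$ and $\alpha = 1$ gives fully exponential time; NFS variants have $\alpha = 1/3$, so concrete security estimates depend on the constant $c$ and on how the $o(1)$ term is handled.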

    SMiLE: Schema-augmented Multi-level Contrastive Learning for Knowledge Graph Link Prediction

    Link prediction is the task of inferring missing links between entities in knowledge graphs. Embedding-based methods have shown effectiveness in addressing this problem by modeling relational patterns in triples. However, the link prediction task often requires contextual information in entity neighborhoods, which most existing embedding-based methods fail to capture. Additionally, little attention is paid to the diversity of entity representations in different contexts, which often leads to false prediction results. We consider that the schema of a knowledge graph contains this specific contextual information and that it is beneficial for preserving the consistency of entities across contexts. In this paper, we propose a novel Schema-augmented Multi-level contrastive LEarning framework (SMiLE) to conduct knowledge graph link prediction. Specifically, we first exploit the network schema as a prior constraint to sample negatives and pre-train our model with a multi-level contrastive learning method to capture both prior schema and contextual information. Then we fine-tune our model under the supervision of individual triples to learn subtler representations for link prediction. Extensive experimental results on four knowledge graph datasets, with thorough analysis of each component, demonstrate the effectiveness of our proposed framework against state-of-the-art baselines. The implementation of SMiLE is available at https://github.com/GKNL/SMiLE. Comment: Findings of EMNLP 202

    The Wall Street Neophyte: A Zero-Shot Analysis of ChatGPT Over MultiModal Stock Movement Prediction Challenges

    Recently, large language models (LLMs) like ChatGPT have demonstrated remarkable performance across a variety of natural language processing tasks. However, their effectiveness in the financial domain, specifically in predicting stock market movements, remains to be explored. In this paper, we conduct an extensive zero-shot analysis of ChatGPT's capabilities in multimodal stock movement prediction on three datasets of tweets and historical stock prices. Our findings indicate that ChatGPT is a "Wall Street Neophyte" with limited success in predicting stock movements, as it underperforms not only state-of-the-art methods but also traditional methods like linear regression using price features. Despite the potential of Chain-of-Thought prompting strategies and the inclusion of tweets, ChatGPT's performance remains subpar. Furthermore, we observe limitations in its explainability and stability, suggesting the need for more specialized training or fine-tuning. This research provides insights into ChatGPT's capabilities and serves as a foundation for future work aimed at improving financial market analysis and prediction by leveraging social media sentiment and historical stock data. Comment: 13 pages

    Select and Trade: Towards Unified Pair Trading with Hierarchical Reinforcement Learning

    Pair trading is one of the most effective statistical arbitrage strategies; it seeks a market-neutral profit by hedging a pair of selected assets. Existing methods generally decompose the task into two separate steps: pair selection and trading. However, decoupling two closely related subtasks can block information propagation and limit overall performance. For pair selection, ignoring the trading performance results in the selection of the wrong assets, with irrelevant price movements, while an agent trained for trading can overfit to the selected assets without any historical information about other assets. To address this, we propose a paradigm for automatic pair trading as a unified task rather than a two-step pipeline. We design a hierarchical reinforcement learning framework to jointly learn and optimize the two subtasks. A high-level policy selects two assets from all possible combinations, and a low-level policy then performs a series of trading actions. Experimental results on real-world stock data demonstrate the effectiveness of our method on pair trading compared with both existing pair selection and trading methods. Comment: 10 pages, 6 figures

    I/Q Imbalance and Imperfect SIC on Two-way Relay NOMA Systems

    Abstract: Non-orthogonal multiple access (NOMA) systems can meet the demands of ultra-high data rate, ultra-low latency, ultra-high reliability and massive connectivity of user devices (UE). However, the performance of a NOMA system may be degraded by hardware impairments. In this paper, the joint effects of in-phase and quadrature-phase imbalance (IQI) and imperfect successive interference cancellation (ipSIC) on the performance of two-way relay cooperative NOMA (TWR C-NOMA) networks over Rician fading channels are studied, where two users exchange information via a decode-and-forward (DF) relay. To evaluate the performance of the considered network, analytical expressions for the outage probability of the two users, as well as the overall system throughput, are derived. To obtain more insights, the asymptotic outage performance in the high signal-to-noise ratio (SNR) region and the diversity order are analysed and discussed. Throughout the paper, Monte Carlo simulations are provided to verify the accuracy of our analysis. The results show that IQI and ipSIC have significant deleterious effects on the outage performance. It is also demonstrated that the outage behaviours of the conventional OMA approach are worse than those of NOMA. In addition, it is found that residual interference signals (IS) can result in error floors for the outage probability and zero diversity orders. Finally, the system throughput can be limited by IQI and ipSIC, and it converges to a fixed constant in the high SNR region.

    Automatic Speaker Identification System for Urdu Speech

    Speaker recognition is the process of recognizing a speaker from a verbal phrase. Such systems generally operate in two ways: to identify a speaker or to verify a speaker's claimed identity. The available body of research shows the effort devoted to Automatic Speaker Identification (ASI) in East Asian, English and European languages. Unfortunately, the languages of South Asia, especially "Urdu", have received very little attention. This paper describes a new feature set for ASI in Urdu speech that achieves better performance than baseline systems. Classifiers such as Neural Net, Naïve Bayes and K nearest neighbor (K-NN) have been used for modeling. Results are provided on a dataset of 40 speakers with 82% correct identification. Lastly, an improvement in system performance is also reported when the number of recordings per speaker is changed.

    PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance

    Although large language models (LLMs) have shown great performance on natural language processing (NLP) in the financial domain, there are no publicly available financially tailored LLMs, instruction tuning datasets, or evaluation benchmarks, which are critical for continually pushing forward the open-source development of financial artificial intelligence (AI). This paper introduces PIXIU, a comprehensive framework including the first financial LLM based on fine-tuning LLaMA with instruction data, the first instruction data with 136K data samples to support the fine-tuning, and an evaluation benchmark with 5 tasks and 9 datasets. We first construct the large-scale multi-task instruction data considering a variety of financial tasks, financial document types, and financial data modalities. We then propose a financial LLM called FinMA, obtained by fine-tuning LLaMA with the constructed dataset so that it can follow instructions for various financial tasks. To support the evaluation of financial LLMs, we propose a standardized benchmark that covers a set of critical financial tasks, including five financial NLP tasks and one financial prediction task. With this benchmark, we conduct a detailed analysis of FinMA and several existing LLMs, uncovering their strengths and weaknesses in handling critical financial tasks. The model, datasets, benchmark, and experimental results are open-sourced to facilitate future research in financial AI. Comment: 12 pages, 1 figure

    Performance Analysis of a Recycled Concrete Interfacial Transition Zone in a Rapid Carbonization Environment

    Based on the characteristics of recycled concrete interface structures, a multi-interface reconstruction model was established. To study the microstructure evolution of the interfacial transition zone (ITZ) during the carbonization of recycled concrete, the microstructure characteristics of the ITZ of C30, C40, and C50 grade recycled concrete and of the mortar matrix were examined before and after carbonization using a microhardness tester and SEM. The results show that the microhardness values of the ITZ and the mortar matrix clearly increase and that the width of the ITZ decreases, while the ITZ performance of the C50 grade recycled concrete does not change significantly. After carbonization, the ITZ exhibits a large amount of granular CaCO3, the pores are refined, and microcracks are generated. Overall, there are significant differences in the microstructures between the ITZ and the mortar matrix before and after carbonization.