PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs.
BACKGROUND: Genetic association studies are currently the primary vehicle for identifying and characterizing disease-predisposing variants, and they usually involve multiple single-nucleotide polymorphisms (SNPs). However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of accounting for correlations between neighbouring SNPs, yet the assumption of Hardy-Weinberg equilibrium (HWE) and a potentially large number of degrees of freedom can harm their statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard, but their performance varies with the method of extracting principal components (PCs). RESULTS: The PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs: cases only (CAES), controls only (COES), and cases and controls combined (CES). Extraction of PCs with COES is preferred to extraction with CAES or CES. Performance of the test was examined via simulations as well as analyses of rheumatoid arthritis and heroin addiction data; the test maintains the nominal level under the null hypothesis and shows performance comparable to the permutation test. CONCLUSIONS: PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs.
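As a rough illustration of the idea behind such a test (a minimal sketch with simulated data and hypothetical variable names, not the authors' implementation): extract PC1 from controls only (the COES scheme the abstract favours), score both groups on it, and bootstrap a confidence interval for the difference in mean scores.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_bcit(cases, controls, n_boot=2000, alpha=0.05):
    """Toy PCA-based bootstrap CI test: PC1 is extracted from
    controls only, and the statistic is the difference in mean
    PC1 score between cases and controls."""
    mu = controls.mean(axis=0)
    _, _, vt = np.linalg.svd(controls - mu, full_matrices=False)
    pc1 = vt[0]                      # first principal axis (COES scheme)
    s_cases = (cases - mu) @ pc1
    s_controls = (controls - mu) @ pc1
    stat = s_cases.mean() - s_controls.mean()
    boots = np.empty(n_boot)
    for b in range(n_boot):          # resample each group with replacement
        bc = rng.choice(s_cases, size=s_cases.size, replace=True)
        bo = rng.choice(s_controls, size=s_controls.size, replace=True)
        boots[b] = bc.mean() - bo.mean()
    lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    # Association is declared when the CI excludes zero
    return stat, (lo, hi), not (lo <= 0.0 <= hi)

# Simulated data: one high-variance axis, shifted in cases
controls = rng.normal(size=(200, 5))
controls[:, 0] *= 3.0
cases = rng.normal(size=(200, 5))
cases[:, 0] = 3.0 * cases[:, 0] + 1.5
stat, ci, significant = pca_bcit(cases, controls)
```

A permutation test, the baseline the abstract compares against, would instead reshuffle the case/control labels and recompute the statistic.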
Security Analysis of Pairing-based Cryptography
Recent progress in the number field sieve (NFS) has shaken the security of pairing-based cryptography. For the discrete logarithm problem (DLP) in finite fields, we present the first systematic review of the NFS algorithms from three perspectives: the polynomial degree, the leading constant, and the hidden constant in the asymptotic complexity, and we indicate that further research is required to optimize the hidden constant. Using the special extended tower NFS algorithm, we conduct a thorough security evaluation of all the existing standardized pairing-friendly (PF) curves as well as several commonly utilized curves, which reveals that the BN256 curves recommended by SM9 and the previous ISO/IEC standard exhibit only 99.92 bits of security, significantly lower than the intended 128-bit level. In addition, we comprehensively analyze the security and efficiency of BN, BLS, and KSS curves for different security levels. Our analysis suggests that the BN curve exhibits superior efficiency for security strengths below approximately 105 bits. For a 128-bit security level, BLS12 and BLS24 curves are the optimal choices, while the BLS24 curve offers the best efficiency for security levels of 160, 192, and 256 bits.
Comment: 8 figures, 8 tables, 5121 words
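As background for the "degree, constant, and hidden constant" decomposition (the original symbols were lost in extraction), NFS run times for a DLP in a field of size Q are conventionally stated in L-notation:

```latex
L_Q\!\left(\tfrac{1}{3},\, c\right) \;=\; \exp\!\Big(\big(c + o(1)\big)\,(\ln Q)^{1/3}\,(\ln\ln Q)^{2/3}\Big)
```

The degree of the polynomials chosen in the algorithm governs the leading constant c, while the "hidden constant" is the contribution absorbed into the o(1) term, which the abstract argues still needs optimization.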
SMiLE: Schema-augmented Multi-level Contrastive Learning for Knowledge Graph Link Prediction
Link prediction is the task of inferring missing links between entities in
knowledge graphs. Embedding-based methods have shown effectiveness in
addressing this problem by modeling relational patterns in triples. However,
the link prediction task often requires contextual information in entity
neighborhoods, while most existing embedding-based methods fail to capture it.
Additionally, little attention is paid to the diversity of entity
representations in different contexts, which often leads to false prediction
results. We observe that the schema of a knowledge graph contains
specific contextual information, which is beneficial for
preserving the consistency of entities across contexts. In this paper, we
propose a novel Schema-augmented Multi-level contrastive LEarning framework
(SMiLE) to conduct knowledge graph link prediction. Specifically, we first
exploit network schema as the prior constraint to sample negatives and
pre-train our model by employing a multi-level contrastive learning method to
yield both prior schema and contextual information. Then we fine-tune our model
under the supervision of individual triples to learn subtler representations
for link prediction. Extensive experimental results on four knowledge graph
datasets with thorough analysis of each component demonstrate the effectiveness
of our proposed framework against state-of-the-art baselines. The
implementation of SMiLE is available at https://github.com/GKNL/SMiLE.
Comment: Findings of EMNLP 202
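Multi-level contrastive pre-training of the kind described here typically rests on an InfoNCE-style objective. A generic numpy sketch of that loss (illustrative only, not SMiLE's actual multi-level formulation):

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE for one anchor: cross-entropy of the positive's
    similarity against all candidate similarities."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    pos = cos(anchor, positive) / tau
    logits = np.array([pos] + [cos(anchor, n) / tau for n in negatives])
    m = logits.max()                      # log-sum-exp stabilisation
    return -(pos - (m + np.log(np.exp(logits - m).sum())))

rng = np.random.default_rng(1)
a = rng.normal(size=8)
# Easy case: the positive is the anchor itself; negatives are random
loss_easy = info_nce(a, a, [rng.normal(size=8) for _ in range(5)])
# Hard case: the "positive" is random while every negative matches the anchor
loss_hard = info_nce(a, rng.normal(size=8), [a.copy() for _ in range(5)])
```

Schema-guided negative sampling, as described above, decides which vectors end up in the `negatives` list.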
The Wall Street Neophyte: A Zero-Shot Analysis of ChatGPT Over MultiModal Stock Movement Prediction Challenges
Recently, large language models (LLMs) like ChatGPT have demonstrated
remarkable performance across a variety of natural language processing tasks.
However, their effectiveness in the financial domain, specifically in
predicting stock market movements, remains to be explored. In this paper, we
conduct an extensive zero-shot analysis of ChatGPT's capabilities in multimodal
stock movement prediction on three datasets of tweets and historical stock prices.
Our findings indicate that ChatGPT is a "Wall Street Neophyte" with limited
success in predicting stock movements, as it underperforms not only
state-of-the-art methods but also traditional methods like linear regression
using price features. Despite the potential of Chain-of-Thought prompting
strategies and the inclusion of tweets, ChatGPT's performance remains subpar.
Furthermore, we observe limitations in its explainability and stability,
suggesting the need for more specialized training or fine-tuning. This research
provides insights into ChatGPT's capabilities and serves as a foundation for
future work aimed at improving financial market analysis and prediction by
leveraging social media sentiment and historical stock data.
Comment: 13 pages
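The "linear regression using price features" baseline that ChatGPT underperforms can be sketched as follows (a toy version on synthetic data; the lag count and the momentum signal are assumptions, not the paper's setup):

```python
import numpy as np

def lag_matrix(prices, n_lags=3):
    """Build (X, y): predict the next daily return from the
    previous n_lags daily returns."""
    rets = np.diff(prices) / prices[:-1]
    X = np.column_stack([rets[i:len(rets) - n_lags + i] for i in range(n_lags)])
    y = rets[n_lags:]
    return X, y

rng = np.random.default_rng(2)
# Synthetic prices with mild AR(1) momentum so the baseline has some signal
r = np.zeros(501)
for t in range(1, 501):
    r[t] = 0.3 * r[t - 1] + 0.01 * rng.normal()
prices = 100 * np.cumprod(1 + r)

X, y = lag_matrix(prices)
A = np.column_stack([np.ones(len(X)), X])        # add an intercept column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)     # ordinary least squares
direction_acc = np.mean(np.sign(A @ beta) == np.sign(y))
```

Directional accuracy of such a fit is the natural point of comparison for a model that only has to call the movement up or down.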
Select and Trade: Towards Unified Pair Trading with Hierarchical Reinforcement Learning
Pair trading is one of the most effective statistical arbitrage strategies;
it seeks a market-neutral profit by hedging a pair of selected assets. Existing
methods generally decompose the task into two separate steps: pair selection
and trading. However, the decoupling of two closely related subtasks can block
information propagation and lead to limited overall performance. For pair
selection, ignoring the trading performance results in the wrong assets being
selected with irrelevant price movements, while the agent trained for trading
can overfit to the selected assets without any historical information of other
assets. To address this, in this paper we propose a paradigm for automatic pair
trading as a unified task rather than a two-step pipeline. We design a
hierarchical reinforcement learning framework to jointly learn and optimize two
subtasks. A high-level policy would select two assets from all possible
combinations and a low-level policy would then perform a series of trading
actions. Experimental results on real-world stock data demonstrate the
effectiveness of our method on pair trading compared with both existing pair
selection and trading methods.
Comment: 10 pages, 6 figures
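For context, the classical two-step scheme the paper argues against ends with a spread z-score rule on an already-selected pair. A minimal sketch (the thresholds and the toy cointegrated pair are assumptions):

```python
import numpy as np

def pair_signal(pa, pb, window=20, entry=2.0):
    """Spread z-score signal for a selected pair: short the spread
    when z > entry, long it when z < -entry, else stay flat."""
    spread = np.log(pa) - np.log(pb)
    sig = np.zeros(len(spread), dtype=int)
    for t in range(window, len(spread)):
        hist = spread[t - window:t]
        z = (spread[t] - hist.mean()) / (hist.std() + 1e-12)
        if z > entry:
            sig[t] = -1   # spread rich: short A, long B
        elif z < -entry:
            sig[t] = 1    # spread cheap: long A, short B
    return sig

rng = np.random.default_rng(3)
# Toy cointegrated pair: B tracks A plus mean-reverting noise
a = 100 * np.cumprod(1 + 0.01 * rng.normal(size=300))
noise = np.zeros(300)
for t in range(1, 300):
    noise[t] = 0.9 * noise[t - 1] + 0.02 * rng.normal()
b = a * np.exp(noise)
sig = pair_signal(a, b)
```

The paper's point is that learning which pair to feed into such a rule, jointly with the trading rule itself, is what the hierarchical policies provide.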
A Simple Graphene NH₃ Gas Sensor via Laser Direct Writing.
Ammonia gas sensors are essential in many industries and in everyday life. However, their complicated fabrication processes, demanding fabrication environments, and incomplete desorption of residual ammonia molecules result in high cost and hinder market acceptance. Here, laser direct writing is used to fabricate three parallel porous 3D graphene lines on a polyimide (PI) tape, simply constructing an ammonia gas sensor. The middle line works as the ammonia-sensing element, and the two lines on either side work as heaters to improve the desorption of ammonia gas molecules from the sensing element. The graphene lines were characterized by scanning electron microscopy and Raman spectroscopy. Without heating, the response and recovery times of the sensor are 214 s and 222 s, respectively, with a sensitivity of 0.087% ppm⁻¹ for sensing 75 ppm ammonia gas. The experimental results prove that, at the optimized heating temperature of about 70 °C, the heaters successfully implement complete desorption of residual NH₃, showing good sensitivity and cyclic stability.
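A quick consistency check of the reported figures (assuming the usual definition of sensitivity as relative response per ppm of analyte):

```python
# Sensitivity and test concentration as reported in the abstract
sensitivity_pct_per_ppm = 0.087   # % per ppm
concentration_ppm = 75            # ppm NH3
# Expected relative response at the test concentration
response_pct = sensitivity_pct_per_ppm * concentration_ppm
```

That is, roughly a 6.5% relative response at 75 ppm.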
I/Q Imbalance and Imperfect SIC on Two-way Relay NOMA Systems
Non-orthogonal multiple access (NOMA) systems can meet the demands of ultra-high data rates, ultra-low latency, ultra-high reliability, and massive connectivity of user equipment (UE). However, the performance of a NOMA system may be degraded by hardware impairments. In this paper, the joint effects of in-phase and quadrature-phase imbalance (IQI) and imperfect successive interference cancellation (ipSIC) on the performance of two-way relay cooperative NOMA (TWR C-NOMA) networks over Rician fading channels are studied, where two users exchange information via a decode-and-forward (DF) relay. To evaluate the performance of the considered network, analytical expressions for the outage probability of the two users, as well as the overall system throughput, are derived. To obtain more insights, the asymptotic outage performance in the high signal-to-noise ratio (SNR) region and the diversity order are analysed and discussed. Throughout the paper, Monte Carlo simulations are provided to verify the accuracy of the analysis. The results show that IQI and ipSIC have significant deleterious effects on the outage performance. It is also demonstrated that the outage behaviour of the conventional OMA approach is worse than that of NOMA. In addition, it is found that residual interference signals (IS) can cause error floors in the outage probability and zero diversity orders. Finally, the system throughput can be limited by IQI and ipSIC, and it converges to a fixed constant in the high SNR region.
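The Monte Carlo verification mentioned above can be illustrated for a single Rician-faded link (a simplified sketch with no IQI, ipSIC, or relaying; the rate, Rician K-factor, and sample count are assumptions):

```python
import numpy as np

def outage_prob(snr_db, rate=1.0, K=3.0, n=200_000, seed=4):
    """Monte Carlo outage probability of one Rician-faded link:
    P[ log2(1 + |h|^2 * snr) < rate ]."""
    rng = np.random.default_rng(seed)
    snr = 10 ** (snr_db / 10)
    # Rician fading: deterministic LOS part plus scattered part,
    # normalised so E[|h|^2] = 1, with K the LOS/scatter power ratio
    s = np.sqrt(K / (K + 1))
    sigma = np.sqrt(1 / (2 * (K + 1)))
    h = s + sigma * (rng.normal(size=n) + 1j * rng.normal(size=n))
    capacity = np.log2(1 + np.abs(h) ** 2 * snr)
    return np.mean(capacity < rate)

p_low = outage_prob(0.0)    # 0 dB transmit SNR
p_high = outage_prob(15.0)  # 15 dB transmit SNR
```

Adding a residual-interference power term to the SINR denominator (the ipSIC effect) would cap the achievable SINR, producing the error floor the abstract reports instead of the monotone decay seen here.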
Automatic Speaker Identification System for Urdu Speech
Speaker recognition is the process of recognizing a speaker from a verbal phrase. Such systems generally operate in two modes: identifying a speaker, or verifying a speaker's claimed identity. The available research reflects the effort devoted to Automatic Speaker Identification (ASI) in East Asian, English, and European languages, but the languages of South Asia, especially Urdu, have received far less attention. This paper describes a new feature set for ASI in Urdu speech that achieves improved performance over baseline systems. Classifiers such as neural networks, Naive Bayes, and K-nearest neighbour (K-NN) have been used for modelling. Results are reported on a dataset of 40 speakers, with 82% correct identification. Lastly, an improvement in system performance is also reported when the number of recordings per speaker is changed.
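A minimal K-NN identification step on fixed-length feature vectors (synthetic stand-ins for the paper's Urdu speech features; not the authors' pipeline):

```python
import numpy as np

def knn_speaker(train_feats, train_labels, query, k=3):
    """K-NN speaker identification: majority vote among the k
    training vectors nearest to the query in Euclidean distance."""
    d = np.linalg.norm(train_feats - query, axis=1)
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique(train_labels[nearest], return_counts=True)
    return vals[np.argmax(counts)]

rng = np.random.default_rng(5)
# Two synthetic "speakers" as well-separated Gaussian clusters
centroids = np.stack([rng.normal(size=13), rng.normal(size=13) + 3])
feats = np.vstack([c + 0.3 * rng.normal(size=(20, 13)) for c in centroids])
labels = np.repeat([0, 1], 20)
# Query drawn near speaker 1's centroid
pred = knn_speaker(feats, labels, centroids[1] + 0.3 * rng.normal(size=13))
```

In a real system each vector would be an utterance-level summary of acoustic features rather than synthetic Gaussian data.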
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance
Although large language models (LLMs) have shown great performance on natural
language processing (NLP) in the financial domain, there are no publicly
available financially tailored LLMs, instruction-tuning datasets, or evaluation
benchmarks, all of which are critical for continually pushing forward the
open-source development of financial artificial intelligence (AI). This paper introduces
PIXIU, a comprehensive framework including the first financial LLM based on
fine-tuning LLaMA with instruction data, the first instruction data with 136K
data samples to support the fine-tuning, and an evaluation benchmark with 5
tasks and 9 datasets. We first construct the large-scale multi-task instruction
data considering a variety of financial tasks, financial document types, and
financial data modalities. We then propose a financial LLM called FinMA by
fine-tuning LLaMA with the constructed dataset to be able to follow
instructions for various financial tasks. To support the evaluation of
financial LLMs, we propose a standardized benchmark that covers a set of
critical financial tasks, including five financial NLP tasks and one financial
prediction task. With this benchmark, we conduct a detailed analysis of FinMA
and several existing LLMs, uncovering their strengths and weaknesses in
handling critical financial tasks. The model, datasets, benchmark, and
experimental results are open-sourced to facilitate future research in
financial AI.
Comment: 12 pages, 1 figure
Performance Analysis of a Recycled Concrete Interfacial Transition Zone in a Rapid Carbonization Environment
Based on the characteristics of recycled concrete interface structures, a multi-interface reconstruction model was established. To study the microstructural evolution of the interfacial transition zone (ITZ) during carbonization of recycled concrete, the microstructures of the ITZ and the mortar matrix of C30, C40, and C50 grade recycled concrete were examined before and after carbonization using a microhardness tester and SEM. The results show that the microhardness values of the ITZ and the mortar matrix increase markedly and that the width of the ITZ decreases, while the ITZ performance of the C50 grade recycled concrete does not change significantly. After carbonization the ITZ exhibits a large amount of granular CaCO3, the pores are refined, and microcracks are generated. Overall, there are significant differences between the microstructures of the ITZ and the mortar matrix before and after carbonization.