112 research outputs found

    Statistical arbitrage powered by Explainable Artificial Intelligence

    Machine learning techniques have recently become the norm for detecting patterns in financial markets. However, relying solely on machine learning algorithms for decision-making can have negative consequences, especially in a critical domain such as finance. On the other hand, it is well known that transforming data into actionable insights can pose a challenge even for seasoned practitioners, particularly in the financial world. For these reasons, this work proposes a machine learning approach powered by eXplainable Artificial Intelligence techniques integrated into a statistical arbitrage trading pipeline. Specifically, we propose three methods to discard features that are irrelevant to the prediction task. We evaluate the approaches on historical data of component stocks of the S&P 500 index and aim to improve prediction performance not only at the stock level but also at the stock set level. Our analysis shows that trading strategies that include such feature selection methods improve portfolio performance by providing predictive signals whose information content is sufficient and less noisy than that embedded in the whole feature set. By performing an in-depth risk-return analysis, we show that the proposed trading strategies powered by explainable AI outperform highly competitive trading strategies considered as baselines.

    Ensembling and Dynamic Asset Selection for Risk-Controlled Statistical Arbitrage

    In recent years, machine learning algorithms have been successfully employed to identify hidden patterns of financial market behavior and, consequently, have become a land of opportunities for financial applications such as algorithmic trading. In this paper, we propose a statistical arbitrage trading strategy with two key elements: an ensemble of regression algorithms for asset return prediction, followed by a dynamic asset selection. More specifically, we construct an extremely heterogeneous ensemble ensuring model diversity by using state-of-the-art machine learning algorithms, data diversity by using a feature selection process, and method diversity by using individual models for each asset, as well as models that learn cross-sectionally across multiple assets. Then, their predictive results are fed into a quality assurance mechanism that prunes assets with poor forecasting performance in the previous periods. We evaluate the approach on historical data of component stocks of the S&P 500 index. By performing an in-depth risk-return analysis, we show that this setup outperforms highly competitive trading strategies considered as baselines. Experimentally, we show that the dynamic asset selection enhances overall trading performance in terms of both return and risk. Moreover, the proposed approach yielded superior results during both financial turmoil and massive market growth periods, and it proved generally applicable to any risk-balanced trading strategy aiming to exploit different asset classes.
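
The pruning step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_assets`, the mean-absolute-error score, the `window` and `keep_fraction` parameters, and the ticker-like labels are all hypothetical, chosen only to show how assets with poor recent forecasts might be dropped before trading.

```python
from statistics import mean


def select_assets(past_errors, window=3, keep_fraction=0.5):
    """Keep the assets with the lowest recent mean absolute forecast error.

    past_errors maps an asset label to its list of past forecast errors
    (oldest first). The scoring rule and defaults are assumptions made
    for illustration, not taken from the paper.
    """
    scores = {
        asset: mean(abs(e) for e in errors[-window:])
        for asset, errors in past_errors.items()
        if errors  # skip assets with no forecasting history
    }
    n_keep = max(1, int(len(scores) * keep_fraction))
    ranked = sorted(scores, key=scores.get)  # best (lowest error) first
    return ranked[:n_keep]


# Hypothetical forecast errors for four made-up assets.
past_errors = {
    "AAA": [0.01, -0.02, 0.01],
    "BBB": [0.05, 0.04, -0.06],
    "CCC": [0.00, 0.01, -0.01],
    "DDD": [0.03, -0.03, 0.02],
}
selected = select_assets(past_errors)
print(selected)
```

With these made-up numbers, the two assets with the smallest recent errors are kept and the other two are pruned for the next trading period.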

    A P2P Platform for real-time multicast video streaming leveraging on scalable multiple descriptions to cope with bandwidth fluctuations

    In the immediate future, video distribution applications will become more widespread thanks to ever-increasing user capabilities and improvements in Internet access speed and performance. The aim of this paper is to propose a content delivery system for real-time streaming services based on a peer-to-peer approach that exploits a multicast overlay organization of the peers to address the challenges posed by bandwidth heterogeneity. To improve reliability and flexibility, video is coded using a scalable multiple description approach that allows delivery of sub-streams over multiple trees and allows rate adaptation along the trees as the available bandwidth changes. Moreover, we have deployed a new algorithm for tree-based topology management of the overlay network. In fact, tree-based overlay networks perform better than mesh-based ones in terms of end-to-end delay and ordered delivery of video flow packets. We also show with a case study that the proposed system works better than similar systems using only either multicast or multiple trees.

    Citation prediction by leveraging transformers and natural language processing heuristics

    In scientific papers, it is common practice to cite other articles to substantiate claims, provide evidence for factual assertions, reference limitations and research gaps, and fulfill various other purposes. When authors include a citation in a given sentence, there are two considerations they need to take into account: (i) where in the sentence to place the citation and (ii) which citation to choose to support the underlying claim. In this paper, we focus on the first task, as it allows multiple potential approaches that rely on the researcher's individual style and the specific norms and conventions of the relevant scientific community. We propose two automatic methodologies that leverage the transformer architecture to solve either a Mask-Filling problem or a Named Entity Recognition problem. On top of the results of the proposed methodologies, we apply ad-hoc Natural Language Processing heuristics to further improve their outcome. We also introduce s2orc-9K, an open dataset for fine-tuning models on this task. A formal evaluation demonstrates that the generative approach significantly outperforms five alternative methods when fine-tuned on the novel dataset. Furthermore, this model's results show no statistically significant deviation from the outputs of three senior researchers.

    Explainable Machine Learning Exploiting News and Domain-Specific Lexicon for Stock Market Forecasting

    In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles with the goal of identifying the words with the greatest impact on the market in a specific time interval and within a certain business sector. A feature engineering process is then performed on the generated lexicons, and the obtained features are fed to a Decision Tree classifier. The predicted label (high or low) indicates whether the underlying company's stock price variation on the next day is higher or lower than a certain threshold. The performance evaluation, which we carried out through a walk-forward strategy and against a set of solid baselines, shows that our approach clearly outperforms the competitors. Moreover, the devised Artificial Intelligence (AI) approach is explainable, in the sense that we analyze the white-box model behind the classifier and provide a set of explanations of the obtained results.
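
The walk-forward evaluation mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say whether the training window rolls or expands, so a fixed-size rolling window is assumed here, and the function name `walk_forward_splits` and its parameters are hypothetical.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs that move forward in time.

    Assumes samples are ordered chronologically and uses a fixed-size
    rolling training window (an assumption, not the paper's stated setup).
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # slide the window forward by one test block


# Ten chronologically ordered samples, train on 4, test on the next 2.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", train, "test", test)
```

Each test block lies strictly after its training block, so a model is never evaluated on data that precedes what it was fitted on.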

    Evaluation of Variability in the Sweet Orange Germplasm through Next Generation Clonal Fingerprinting

    The great phenotypic variability characterizing the sweet orange [Citrus sinensis (L.) Osbeck] germplasm arises from spontaneous bud mutations, causing a diversification into major groups (common, Navel and blood oranges). A huge divergence also occurred within each varietal group. The genetic basis of such variability, also including nutritional and qualitative traits (ripening time, colour, fruit shape, acidity, sugars), is currently uncharacterized, and therefore not exploitable. With the aim of describing the somatic mutation events in the sweet orange group, deep sequencing of 20 Italian and foreign accessions was performed on the Illumina platform, allowing the identification of single nucleotide polymorphisms (SNPs), structural variants (SVs) and large deletions, either specific to each varietal group or clone-specific. A subset of SNPs, used to design two 384-SNP GoldenGate assays, allowed the genotyping of 225 CREA sweet orange accessions. The developed markers represent the first reliable molecular tools able to unambiguously fingerprint each somatic mutant. Moreover, they might be used to associate mutations with phenotypic traits, and are a powerful tool for traceability. By using the GoldenGate assay, we have been able to fingerprint several blood orange clones starting from DNA isolated from leaves or juice. These tools will potentially provide the consumer with a guarantee of the quality and origin of juices, helping to prevent fraud.

    Arachidonic acid-evoked Ca^{2+} signals promote nitric oxide release and proliferation in human endothelial colony forming cells

    Arachidonic acid (AA) stimulates endothelial cell (EC) proliferation through an increase in intracellular Ca^{2+} concentration ([Ca^{2+}]_{i}) that, in turn, promotes nitric oxide (NO) release. AA-evoked Ca^{2+} signals are mainly mediated by Transient Receptor Potential Vanilloid 4 (TRPV4) channels. Circulating endothelial colony forming cells (ECFCs) represent the only established precursors of ECs. In the present study, we therefore sought to elucidate whether AA promotes human ECFC (hECFC) proliferation through an increase in [Ca^{2+}]_{i} and the subsequent activation of the endothelial NO synthase (eNOS). AA induced a dose-dependent rise in [Ca^{2+}]_{i} that was mimicked by its non-metabolizable analogue eicosatetraynoic acid. AA-evoked Ca^{2+} signals required both intracellular Ca^{2+} release and external Ca^{2+} inflow. AA-induced Ca^{2+} release was mediated by inositol-1,4,5-trisphosphate receptors from the endoplasmic reticulum and by two pore channel 1 from the acidic stores of the endolysosomal system. AA-evoked Ca^{2+} entry was, in turn, mediated by TRPV4, while it did not involve store-operated Ca^{2+} entry. Moreover, AA caused an increase in NO levels, which was blocked by preventing the concomitant increase in [Ca^{2+}]_{i} and by inhibiting eNOS activity with NG-nitro-l-arginine methyl ester (l-NAME). Finally, AA per se did not stimulate hECFC growth, but potentiated growth factor-induced hECFC proliferation in a Ca^{2+}- and NO-dependent manner. Therefore, AA-evoked Ca^{2+} signals emerge as an additional target to prevent cancer vascularisation, which may be sustained by ECFC recruitment.

    Blood orange juice inhibits fat accumulation in mice

    Objective: To analyze the effect of the juice obtained from two varieties of sweet orange (Citrus sinensis L. Osbeck), Moro (a blood orange) and Navelina (a blond orange), on fat accumulation in mice fed a standard or a high-fat diet (HFD). Methods: Obesity was induced in male C57/Bl6 mice by feeding a HFD. Moro and Navelina juices were provided instead of water. The effect of an anthocyanin-enriched extract from Moro oranges or purified cyanidin-3-glucoside (C3G) was also analyzed. Body weight and food intake were measured regularly over a 12-week period. The adipose pads were weighed and analyzed histologically; total RNA was also isolated for microarray analysis. Results: Dietary supplementation with Moro juice, but not Navelina juice, significantly reduced body weight gain and fat accumulation despite the increased energy intake due to its sugar content. Furthermore, mice drinking Moro juice were resistant to HFD-induced obesity with no alterations in food intake. Only the anthocyanin extract, but not the purified C3G, slightly affected fat accumulation. High-throughput gene expression analysis of fat tissues confirmed that Moro juice could entirely rescue the high fat-induced transcriptional reprogramming. Conclusion: The anti-obesity effect of Moro juice on fat accumulation cannot be explained by its anthocyanin content alone. Our findings suggest that multiple components present in Moro orange juice might act synergistically to inhibit fat accumulation.

    Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes (a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes) and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.