65 research outputs found

    Sector identification in a set of stock return time series traded at the London Stock Exchange

    We compare several methods recently used in the literature to detect common behavior among stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a portfolio of stocks traded at the London Stock Exchange. The investigated time series are recorded at both a daily time horizon and a 5-minute time horizon. The correlation coefficient matrix differs markedly between the two time horizons, confirming that more structured correlation coefficient matrices are observed for long time horizons. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methods show different degrees of sensitivity to different sectors. Our comparative analysis suggests that a single method may not be able to extract all the economic information present in the correlation coefficient matrix of a stock portfolio. Comment: 28 pages, 13 figures, 3 tables. Proceedings of the conference on "Applications of Random Matrices to Economy and other Complex Systems", Krakow (Poland), May 25-28 2005. Submitted for publication to Acta Phys. Po

    Economic sector identification in a set of stocks traded at the New York Stock Exchange: a comparative analysis

    We review several methods recently used in the literature to detect common behavior among stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a set of stocks traded at the New York Stock Exchange. The investigated time series are recorded at a daily time horizon. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methodologies provide different information about the considered set. Our comparative analysis suggests that a single method may not be able to extract all the economic information present in the correlation coefficient matrix of a set of stocks. Comment: 13 pages, 8 figures, 2 Table

    Gene-based and semantic structure of the Gene Ontology as a complex network

    The last decade has seen the advent and consolidation of ontology-based tools, such as the Gene Ontology, for the identification and biological interpretation of classes of genes. The Gene Ontology (GO) is constantly evolving. The information accumulated over time and included in the GO is encoded in the definition of terms and in the semantic relations set up among terms. Here we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium. Moreover, the GO is a natural example of a bipartite network of terms and genes. We are interested in studying the properties of the projected network of terms, i.e. a gene-based weighted network of GO terms, in which a link between any two terms is set if at least one gene is annotated in both terms. One aim of the present paper is to compare the structural properties of the semantic and the gene-based networks. The relative importance of terms is very similar in the two networks, but the community structure changes. We show that in some cases GO terms that appear to be distinct from a semantic point of view are instead connected, and appear in the same community, when their gene content is considered. The identification of such gene-based communities of terms might therefore be the basis of a simple protocol aimed at improving the semantic structure of GO. Information about terms that share a large gene content might also be important from a biomedical point of view, as it might reveal how genes over-expressed in a certain term also affect other biological processes, molecular functions and cellular components not directly linked according to GO semantics.
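    The projection described in this abstract, where two terms are linked when at least one gene is annotated in both, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `project_terms`, the input layout (a dict from term to a set of genes), and the choice of the shared-gene count as the link weight are all assumptions made here.

```python
from itertools import combinations


def project_terms(annotations):
    """Project a term-gene bipartite network onto its terms.

    annotations: dict mapping a term identifier to the set of genes
    annotated in that term (hypothetical input layout).
    Returns a dict mapping each linked pair (term_a, term_b) to the
    number of genes the two terms share. Using the shared-gene count
    as the weight is an assumption of this sketch.
    """
    weights = {}
    # Visit every unordered pair of terms once, in a stable order.
    for term_a, term_b in combinations(sorted(annotations), 2):
        shared = len(annotations[term_a] & annotations[term_b])
        # A link exists only if at least one gene is annotated in both.
        if shared >= 1:
            weights[(term_a, term_b)] = shared
    return weights
```

    The pairwise loop is quadratic in the number of terms, which is acceptable for a toy example; for a full ontology release one would more plausibly invert the mapping (gene to terms) and count co-annotations per gene.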

    Spanning Trees and bootstrap reliability estimation in correlation based networks

    We introduce a new technique to associate a spanning tree with the average linkage cluster analysis. We call this tree the Average Linkage Minimum Spanning Tree. We also introduce a technique to associate a reliability value with the links of correlation-based graphs by using bootstrap replicas of the data. Both techniques are applied to the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the period 2001-2003. We show that the Average Linkage Minimum Spanning Tree recognizes economic sectors and sub-sectors as communities in the network slightly better than the Minimum Spanning Tree does. We also show that the average reliability of links in the Minimum Spanning Tree is slightly greater than the average reliability of links in the Average Linkage Minimum Spanning Tree. Comment: 17 pages, 3 figure
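    For orientation, the baseline this abstract compares against, the plain Minimum Spanning Tree, can be built from a symmetric distance matrix with Kruskal's algorithm. The sketch below is only that baseline, not the Average Linkage variant or the bootstrap procedure the abstract introduces. The function name and the input layout (a list of lists of pairwise distances) are assumptions; how distances are derived from correlations is left to the caller.

```python
def minimum_spanning_tree(dist):
    """Return the MST edges of a complete graph as a list of (i, j) pairs.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Kruskal's algorithm with a simple union-find structure.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All unordered pairs, sorted by increasing distance.
    edges = sorted(
        (dist[i][j], i, j) for i in range(n) for j in range(i + 1, n)
    )
    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        # Keep the edge only if it joins two separate components.
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((i, j))
    return tree
```

    A bootstrap reliability estimate in the spirit of the abstract would, under the same assumptions, rebuild the tree on resampled data many times and record the fraction of replicas in which each link reappears.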

    Networks in biological systems: An investigation of the Gene Ontology as an evolving network

    Many biological systems can be described as networks in which different elements interact in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between any two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and characterize its evolution over different releases of the Gene Ontology.

    Emergence of time-horizon invariant correlation structure in financial returns by subtraction of the market mode

    We investigate the emergence of structure in the correlation matrix of asset returns as the time horizon over which returns are computed increases from minutes to the daily scale. We analyze data from different stock markets (New York, Paris, London, Milano) with different methods. The results depend crucially on whether or not the data are restricted to the "internal" dynamics of the market, in which the "center of mass" motion (the market mode) is removed. If the market mode is not removed, we find that the structure emerges, as the time horizon increases, from the splitting of a single large cluster. For the NYSE we find that, when the market mode is removed, the structure of correlation at the daily scale is already well defined at the 5-minute time horizon, and this structure accounts for 80% of the classification of stocks in economic sectors. Similar results, though less sharp, are found for the other markets. We also find that the structure of correlations in the overnight returns is markedly different from that of intraday activity. Comment: 12 pages, 17 figure
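    One simple reading of removing the "center of mass" motion is to subtract, at each time step, the equally weighted average return across all stocks. The sketch below implements only that reading. It is an assumption, since the abstract does not say how the market mode is estimated, and the function name and list-of-lists input are likewise hypothetical.

```python
def remove_market_mode(returns):
    """Subtract the cross-sectional mean return from every stock.

    returns: list of per-stock return series, all of equal length.
    The equally weighted mean is used as the market mode, which is an
    assumption of this sketch rather than the paper's stated method.
    """
    n_stocks = len(returns)
    n_steps = len(returns[0])
    # Market mode: average return over all stocks at each time step.
    market = [
        sum(series[t] for series in returns) / n_stocks
        for t in range(n_steps)
    ]
    # Residual ("internal") returns of each stock.
    return [
        [series[t] - market[t] for t in range(n_steps)]
        for series in returns
    ]
```

    By construction the residual returns sum to zero across stocks at every time step, so a correlation matrix computed from them reflects only relative movements within the market.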

    Correlation, Network and Multifractal Analysis of Global Financial Indices

    We apply RMT, network, and MF-DFA methods to investigate the correlation, network, and multifractal properties of 20 global financial indices. We compare results before and during the financial crisis of 2008. We find that the network method gives more useful information about the formation of clusters than the eigenvectors corresponding to the second largest eigenvalue, and that these clusters are formed on the basis of the geographical location of the indices. At a threshold of 0.6, the indices of the Americas, Europe, and Asia/Pacific disconnect and form separate clusters before the crisis, whereas during the crisis the indices of the Americas and Europe combine into one cluster while the Asia/Pacific indices form another. When the threshold is increased further to 0.9, the European countries France, Germany, and the UK constitute the most tightly linked markets. We study the multifractal properties of global financial indices and find that the indices of the Americas and Europe lie in almost the same range of degree of multifractality, in contrast to other indices. India, South Korea, and Hong Kong are found to be close to the degree of multifractality of the indices of the Americas and Europe. A large variation in the degree of multifractality in Egypt, Indonesia, Malaysia, Taiwan, and Singapore may be one reason why, as the threshold in the financial network is increased, these countries are the first to disconnect, at low thresholds, from the correlation network of financial indices. We fit the Binomial Multifractal Model (BMFM) to these financial markets. Comment: 32 pages, 25 figures, 1 tabl

    Detecting significant features in modeling microRNA-target interactions

    MicroRNAs (miRNAs) are small non-coding RNA molecules that mediate the translational repression and degradation of target mRNAs in the cell. Mature miRNAs are used as a template by the RNA-induced silencing complex (RISC) to recognize the complementary mRNAs to be regulated. Up to 60% of human genes are putative targets of one or more miRNAs. Several prediction tools are available to suggest putative miRNA targets; however, only a small fraction of the interaction pairs has been validated experimentally. Analysis of the expression profile of the RNA fraction immunoprecipitated (IP) with the RISC proteins is an established method to detect which genes are actually regulated by the RISC machinery. Genes that are over-expressed in the IP sample relative to the whole-cell lysate RNA are considered to be associated with the RISC complex, and hence to be miRNA targets. Here, we aim to find the features useful for predicting which genes are over-expressed in IP, i.e. are miRNA targets, without actually performing the IP experiments. To this end, we compiled and analyzed a novel high-throughput data set suitable for unraveling the features involved in miRNA regulatory activity. We analyzed IP samples obtained by the immunoprecipitation of two RISC proteins, AGO2 and GW182. The two proteins show different behaviors in terms of enriched genes and of the features characterizing the immunoprecipitated RNA fraction. Further analysis is needed to unravel the reason for this difference.

    Hierarchically nested factor model from multivariate data

    We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically, we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap-based procedure to reduce the number of factors in the model, with the aim of retaining only those factors that are significantly robust with respect to the statistical uncertainty due to the finite length of data records. Comment: 7 pages, 5 figures; accepted for publication in Europhys. Lett.; the Appendix corresponds to the additional material of the accepted letter

    Usefulness of regional right ventricular and right atrial strain for prediction of early and late right ventricular failure following a left ventricular assist device implant: A machine learning approach

    Background: Identifying candidates for left ventricular assist device surgery at risk of right ventricular failure remains difficult. The aim was to identify the most accurate predictors of right ventricular failure among clinical, biological, and imaging markers, assessed by agreement of different supervised machine learning algorithms. Methods: Seventy-four patients, referred for a HeartWare left ventricular assist device since 2010 in two Italian centers, were recruited. Biomarkers, standard and strain echocardiography of the right ventricle, and cath-lab measures were compared among patients who did not develop right ventricular failure (N = 56), those with acute–right ventricular failure (N = 8, 11%), and those with chronic–right ventricular failure (N = 10, 14%). Logistic regression, penalized logistic regression, linear support vector machines, and naïve Bayes algorithms with leave-one-out validation were used to evaluate the efficiency of every combination of three collected variables in an "all-subsets" approach. Results: Michigan risk score combined with invasively assessed central venous pressure and apical longitudinal systolic strain of the right ventricular–free wall were the most significant predictors of acute–right ventricular failure (maximum receiver operating characteristic–area under the curve = 0.95, 95% confidence interval = 0.91–1.00, by naïve Bayes), while the right ventricular–free wall systolic strain of the middle segment, right atrial strain (QRS-synced), and tricuspid annular plane systolic excursion were the most significant predictors of chronic–right ventricular failure (receiver operating characteristic–area under the curve = 0.97, 95% confidence interval = 0.91–1.00, according to naïve Bayes). Conclusion: Apical right ventricular strain and right atrial strain provide complementary information, critical for predicting acute–right ventricular failure and chronic–right ventricular failure, respectively.