    Toward a generic representation of random variables for machine learning

    This paper presents a pre-processing step and a distance that improve the performance of machine learning algorithms working on independent and identically distributed stochastic processes. We introduce a novel non-parametric approach to representing random variables, which separates dependency from distribution without losing any information. We also propose an associated metric that leverages this representation, together with its statistical estimate. Besides experiments on synthetic datasets, the benefits of our contribution are illustrated through the example of clustering financial time series, for instance prices from the credit default swap market. Results are available on the website www.datagrapple.com and an IPython Notebook tutorial is available at www.datagrapple.com/Tech for reproducible research. Comment: submitted to Pattern Recognition Letters
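The idea of separating dependency from distribution without losing information can be illustrated with a minimal sketch. It assumes the split is done by keeping the ranks of the observations (the ordering, which carries the dependence information) apart from the sorted values (which carry the marginal distribution). The abstract does not confirm that this is the paper's exact construction, and the function names `split_sample` and `rebuild_sample` are hypothetical.

```python
def split_sample(values):
    """Split a sample into (ranks, sorted_values).

    ranks[i] is the position of values[i] in the sorted sample (0-based).
    Assumption: ranks stand in for the dependence part and the sorted
    values for the distribution part; this is an illustrative choice.
    """
    # Indices of the observations, ordered by their value (stable for ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    sorted_values = [values[i] for i in order]
    return ranks, sorted_values


def rebuild_sample(ranks, sorted_values):
    """Recombine the two parts into the original sample."""
    return [sorted_values[r] for r in ranks]


if __name__ == "__main__":
    prices = [101.2, 99.8, 100.5, 102.1, 98.7]  # made-up illustrative values
    ranks, marginal = split_sample(prices)
    # The two parts together recover the sample exactly.
    assert rebuild_sample(ranks, marginal) == prices
    print(ranks, marginal)
```

Because the original sample is recovered exactly from the two parts, nothing is lost by the split; a distance could then weight the two parts separately.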

    A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series

    In this paper we present an empirical framework motivated by the practitioner's point of view on stability. The goal is both to assess clustering validity and to yield market insights: the data perturbations we propose provide a multi-view of the assets' clustering behaviour. The perturbation framework is illustrated on an extensive credit default swap time series database available online at www.datagrapple.com. Comment: Accepted at ICMLA 201

    Some Reflections on the “New Law of the Sea”

    The fortieth anniversary of the opening for signature of the UN Convention on the Law of the Sea provides an opportunity to offer some reflections on the “new law of the sea”; that is, the comprehensive set of rules that resulted from an unprecedented multilateral negotiation process: the Third United Nations Conference on the Law of the Sea (1973–1982). The focus of this article is on the adjudication of sea-related disputes in the “new law of the sea.” In an international legal order where access to a judge or an arbitrator requires the consent of both parties to the dispute, the importance of the Convention's compulsory dispute settlement mechanism, which is absent in most multilateral environmental treaties, should be underlined. Indeed, under Section 2 of Part XV (“Compulsory procedures entailing binding decisions”), a State party to the Convention which alleges that another State party has breached a provision is entitled to bring such a dispute to a third-party mechanism. This could be the International Court of Justice, the International Tribunal for the Law of the Sea, or arbitration. This article addresses four topics: the increase in maritime disputes brought before international courts and tribunals over the last forty years; the relative harmony in the jurisprudence produced by those international courts and tribunals; the importance of jurisdictional basis in law of the sea disputes; and future disputes in the law of the sea.

    Clustering Financial Time Series: How Long is Enough?

    Researchers have used from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. We then give a first empirical answer to the much-debated question: How long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out. Comment: Accepted at IJCAI 201
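The remark that short series can produce spurious clusters can be illustrated with a small simulation. This is only a sketch, not the paper's statistical framework. It assumes independent standard Gaussian "returns" (true correlation zero) and measures how widely the sample correlation scatters around zero for different series lengths. The function names, trial count, and seed are illustrative choices.

```python
import math
import random


def sample_correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spurious_correlation_spread(length, trials=200, seed=0):
    """Standard deviation of the sample correlation between two
    independent Gaussian series of the given length.

    The true correlation is zero, so any spread is estimation noise.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(length)]
        y = [rng.gauss(0.0, 1.0) for _ in range(length)]
        estimates.append(sample_correlation(x, y))
    mean = sum(estimates) / trials
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / trials)


if __name__ == "__main__":
    # 30 days, roughly one year, and roughly four years of daily returns.
    for length in (30, 250, 1000):
        print(length, round(spurious_correlation_spread(length), 3))
```

The spread shrinks as the series gets longer, which is why clusters built on correlations from very short windows may reflect noise rather than structure. The sketch says nothing about the opposite risk of smoothing out dynamics with long windows.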