
    A Scalable Algorithm For Sparse Portfolio Selection

    The sparse portfolio selection problem is one of the most famous and frequently studied problems in the optimization and financial economics literatures. In a universe of risky assets, the goal is to construct a portfolio with maximal expected return and minimum variance, subject to an upper bound on the number of positions, linear inequality constraints, and minimum investment constraints. Existing certifiably optimal approaches to this problem do not converge within a practical amount of time at real-world problem sizes with more than 400 securities. In this paper, we propose a more scalable approach. By imposing a ridge regularization term, we reformulate the problem as a convex binary optimization problem, which is solvable via an efficient outer-approximation procedure. We propose various techniques for improving the performance of the procedure, including a heuristic which supplies high-quality warm starts, a preprocessing technique for decreasing the gap at the root node, and an analytic technique for strengthening our cuts. We also study the problem's Boolean relaxation, establish that it is second-order-cone representable, and supply a sufficient condition for its tightness. In numerical experiments, we establish that the outer-approximation procedure gives rise to dramatic speedups for sparse portfolio selection problems. Comment: Submitted to INFORMS Journal on Computing.
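    The ridge-regularized formulation above admits a simple closed-form inner solve once a support is fixed, which is also the backbone of warm-start heuristics. The sketch below is a minimal illustration of that idea, assuming a risk-aversion parameter lam, a ridge strength gamma, and a naive return/risk ranking for the warm start; it is not the paper's outer-approximation implementation.

```python
import numpy as np

def solve_on_support(mu, Sigma, support, lam=1.0, gamma=10.0):
    """Closed-form solve of the ridge-regularized mean-variance problem
    restricted to a fixed support S:
        max_w  mu'w - (lam/2) w'Sigma w - (1/(2*gamma)) ||w||^2
        s.t.   sum(w) = 1,  w_i = 0 for i not in S.
    """
    S = np.asarray(support)
    A = lam * Sigma[np.ix_(S, S)] + (1.0 / gamma) * np.eye(len(S))
    A_inv_mu = np.linalg.solve(A, mu[S])
    A_inv_e = np.linalg.solve(A, np.ones(len(S)))
    nu = (A_inv_mu.sum() - 1.0) / A_inv_e.sum()   # multiplier of the budget constraint sum(w) = 1
    w = np.zeros(len(mu))
    w[S] = A_inv_mu - nu * A_inv_e
    return w

def greedy_warm_start(mu, Sigma, k, lam=1.0, gamma=10.0):
    """Pick the k assets with the best standalone return/risk score,
    then solve the continuous problem on that support."""
    scores = mu / np.sqrt(np.diag(Sigma))
    support = np.argsort(scores)[-k:]
    return solve_on_support(mu, Sigma, support, lam, gamma)

# Toy example
rng = np.random.default_rng(0)
n, k = 20, 5
F = rng.normal(size=(n, n))
Sigma = F @ F.T / n + 0.1 * np.eye(n)
mu = rng.normal(0.05, 0.02, size=n)
w = greedy_warm_start(mu, Sigma, k)
print("support:", np.nonzero(w)[0], "sum(w) =", round(w.sum(), 6))
```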

    Sparse PCA With Multiple Components

    Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves solving a convex maximization problem under sparsity and orthogonality constraints, which is extremely computationally challenging. Most existing works address sparse PCA via methods, such as iteratively computing one sparse PC and deflating the covariance matrix, that do not guarantee the orthogonality, let alone the optimality, of the resulting solution when we seek multiple mutually orthogonal PCs. We challenge this status quo by reformulating the orthogonality conditions as rank constraints and optimizing over the sparsity and rank constraints simultaneously. We design tight semidefinite relaxations to supply high-quality upper bounds, which we strengthen via additional second-order cone inequalities when each PC's individual sparsity is specified. Further, we derive a combinatorial upper bound on the maximum amount of variance explained as a function of the support. We exploit these relaxations and bounds to propose exact methods and rounding mechanisms that, together, obtain solutions with a bound gap on the order of 0%-15% for real-world datasets with p = 100s or 1000s of features and r ∈ {2, 3} components. Numerically, our algorithms match (and sometimes surpass) the best performing methods in terms of fraction of variance explained and systematically return PCs that are sparse and orthogonal. In contrast, we find that existing methods like deflation return solutions that violate the orthogonality constraints, even when the data is generated according to sparse orthogonal PCs. Altogether, our approach solves sparse PCA problems with multiple components to certifiable (near) optimality in a practically tractable fashion. Comment: Updated version with improved algorithmics and a new section containing a generalization of the Gershgorin circle theorem; comments or suggestions welcome.
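    A structural fact behind the combinatorial bound mentioned above is that, once a joint support is fixed, the best r mutually orthogonal loadings are the top-r eigenvectors of the restricted covariance block, so the variance explained depends on the support alone. The sketch below illustrates this, assuming a naive highest-variance support choice; it is not the paper's relaxation or rounding scheme.

```python
import numpy as np

def variance_explained_on_support(Sigma, support, r):
    """Best r mutually orthogonal loading vectors supported on `support`:
    the top-r eigenvectors of the restricted covariance block. Returns
    the loadings (zero outside the support) and the variance they explain."""
    S = np.asarray(support)
    vals, vecs = np.linalg.eigh(Sigma[np.ix_(S, S)])   # ascending eigenvalues
    U = np.zeros((Sigma.shape[0], r))
    U[S, :] = vecs[:, -r:]                             # top-r eigenvectors
    return U, vals[-r:].sum()

# Toy example on a random covariance matrix
rng = np.random.default_rng(1)
p, r, k = 30, 2, 6
Sigma = np.cov(rng.normal(size=(200, p)), rowvar=False)
support = np.argsort(np.diag(Sigma))[-k:]              # naive support: highest-variance coordinates
U, var = variance_explained_on_support(Sigma, support, r)
print("orthogonality error:", np.linalg.norm(U.T @ U - np.eye(r)))
print("variance explained:", round(var, 3))
```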

    Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach

    We study the Sparse Plus Low-Rank decomposition problem (SLR), which is the problem of decomposing a corrupted data matrix into a sparse matrix of perturbations plus a low-rank matrix containing the ground truth. SLR is a fundamental problem in Operations Research and Machine Learning which arises in various applications, including data compression, latent semantic indexing, collaborative filtering, and medical imaging. We introduce a novel formulation for SLR that directly models its underlying discreteness. For this formulation, we develop an alternating minimization heuristic that computes high-quality solutions and a novel semidefinite relaxation that provides meaningful bounds for the solutions returned by our heuristic. We also develop a custom branch-and-bound algorithm that leverages our heuristic and convex relaxations to solve small instances of SLR to certifiable (near) optimality. Given an input n-by-n matrix, our heuristic scales to solve instances where n = 10000 in minutes, our relaxation scales to instances where n = 200 in hours, and our branch-and-bound algorithm scales to instances where n = 25 in minutes. Our numerical results demonstrate that our approach outperforms existing state-of-the-art approaches in terms of rank, sparsity, and mean-square error while maintaining a comparable runtime.
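    A minimal sketch of an alternating-minimization heuristic in the spirit described above: alternate a truncated-SVD step for the low-rank component with a hard-thresholding step that keeps the k largest-magnitude residual entries as the sparse component. All names and parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def slr_alternating(D, rank, k_sparse, iters=50):
    """Alternate between (i) a rank-constrained fit via truncated SVD and
    (ii) a cardinality-constrained fit keeping the k largest residual entries."""
    S = np.zeros_like(D)
    for _ in range(iters):
        # Low-rank step: best rank-`rank` approximation of D - S
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        # Sparse step: keep the k_sparse largest-magnitude entries of D - L
        R = D - L
        S = np.zeros_like(D)
        idx = np.unravel_index(np.argsort(np.abs(R), axis=None)[-k_sparse:], D.shape)
        S[idx] = R[idx]
    return L, S

# Toy example: low-rank ground truth plus a few large corruptions
rng = np.random.default_rng(2)
n, r, k = 50, 3, 40
L_true = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))
S_true = np.zeros((n, n))
S_true.flat[rng.choice(n * n, size=k, replace=False)] = rng.normal(scale=10.0, size=k)
L_hat, S_hat = slr_alternating(L_true + S_true, rank=r, k_sparse=k)
print("relative low-rank error:", np.linalg.norm(L_hat - L_true) / np.linalg.norm(L_true))
```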

    A Stochastic Benders Decomposition Scheme for Large-Scale Data-Driven Network Design

    Network design problems involve constructing edges in a transportation or supply chain network to minimize construction and daily operational costs. We study a data-driven version of network design where operational costs are uncertain and estimated using historical data. This problem is notoriously computationally challenging, and instances with as few as fifty nodes cannot be solved to optimality by current decomposition techniques. Accordingly, we propose a stochastic variant of Benders decomposition that mitigates the high computational cost of generating each cut by sampling a subset of the data at each iteration, and nonetheless generates deterministically valid cuts (as opposed to the probabilistically valid cuts frequently proposed in the stochastic optimization literature) via a dual averaging technique. We implement both single-cut and multi-cut variants of this Benders decomposition algorithm, as well as a k-cut variant that uses clustering of the historical scenarios. On instances with 100-200 nodes, our algorithm achieves 4-5% optimality gaps, compared with 13-16% for deterministic Benders schemes, and scales to instances with 700 nodes and 50 commodities within hours. Beyond network design, our strategy could be adapted to generic two-stage stochastic mixed-integer optimization problems where second-stage costs are estimated via a sample average.
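    The sampling idea can be sketched for a generic two-stage LP whose second stage is min_y q'y subject to W y >= h_s - T x and y >= 0: solve the second-stage LPs for a sampled batch of scenarios, read off the duals, and assemble a cut. The sketch below (using scipy's HiGHS interface) produces only a sampled, probabilistically valid cut; the paper's dual-averaging step that restores deterministic validity is not reproduced, and all names and data are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def sampled_benders_cut(x0, scenarios, q, W, T, rng, batch_size=8):
    """Approximate Benders optimality cut at x0 from a sampled batch of scenarios.
    Returns (alpha, beta) so that the cut reads  theta >= alpha + beta' x."""
    batch = rng.choice(len(scenarios), size=min(batch_size, len(scenarios)), replace=False)
    vals, grads = [], []
    for s in batch:
        h = scenarios[s]
        # linprog solves min q'y s.t. A_ub y <= b_ub; rewrite W y >= h - T x0 accordingly.
        res = linprog(q, A_ub=-W, b_ub=-(h - T @ x0), method="highs")
        lam = res.ineqlin.marginals            # duals of the inequality block
        vals.append(res.fun)
        grads.append(T.T @ lam)                # d Q_s / d x via chain rule on b_ub = T x - h
    val, grad = np.mean(vals), np.mean(grads, axis=0)
    return val - grad @ x0, grad               # theta >= alpha + beta' x

# Toy example: 2 first-stage variables, 3 second-stage variables, 20 scenarios
rng = np.random.default_rng(3)
q = np.ones(3)
W = np.abs(rng.normal(size=(4, 3))) + 0.5
T = np.abs(rng.normal(size=(4, 2)))
scenarios = [np.abs(rng.normal(size=4)) for _ in range(20)]
alpha, beta = sampled_benders_cut(np.array([0.5, 0.5]), scenarios, q, W, T, rng)
print("cut: theta >=", round(alpha, 3), "+", np.round(beta, 3), "' x")
```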

    AI Hilbert: A New Paradigm for Scientific Discovery by Unifying Data and Background Knowledge

    The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find the one that best fits the data is not well solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities, and argue that our approach is widely applicable. We further model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivstellensatz certificates. Remarkably, the optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory, or non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power equation, can be derived in a principled manner from background axioms and experimental data. Comment: Slightly revised from version 1, in particular polished the figure.
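    As a toy companion to the Kepler example, the sketch below recovers Kepler's Third Law for a circular orbit from polynomial background axioms using Groebner-basis elimination in SymPy, rather than the paper's mixed-integer/semidefinite optimization and Positivstellensatz certificates; the axiom set and variable names are our own simplification.

```python
import sympy as sp

# Background axioms for a body of mass m in a circular orbit of radius r
# around a mass M (pi kept as a symbol so everything stays polynomial):
F, w, T, r, G, m, M, pi = sp.symbols("F w T r G m M pi")
axioms = [
    F * r**2 - G * m * M,      # Newtonian gravitation: F = G m M / r^2
    F - m * w**2 * r,          # centripetal force: F = m w^2 r
    w * T - 2 * pi,            # period-frequency relation: w T = 2 pi
]

# Eliminate the intermediate quantities F and w: any Groebner-basis element
# free of F and w is a polynomial consequence of the axioms alone.
basis = sp.groebner(axioms, F, w, T, r, G, m, M, pi, order="lex")
laws = [g for g in basis.exprs if not g.has(F) and not g.has(w)]
for law in laws:
    print(sp.factor(law), "= 0")   # expect a multiple of G*M*T**2 - 4*pi**2*r**3
```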

    Non-Chemical Stressors and Cumulative Risk Assessment: An Overview of Current Initiatives and Potential Air Pollutant Interactions

    Regulatory agencies are under increased pressure to consider broader public health concerns that extend to multiple pollutant exposures, multiple exposure pathways, and vulnerable populations. Specifically, cumulative risk assessment initiatives have stressed the importance of considering both chemical and non-chemical stressors, such as socioeconomic status (SES) and related psychosocial stress, in evaluating health risks. The integration of non-chemical stressors into a cumulative risk assessment framework has been largely driven by evidence of health disparities across different segments of society that may also bear a disproportionate risk from chemical exposures. This review discusses current efforts to advance the field of cumulative risk assessment, highlighting some of the major challenges within the construct of the traditional risk assessment paradigm. Additionally, we present a summary of studies of potential interactions between social stressors and air pollutants on health as an example of current research that supports the incorporation of non-chemical stressors into risk assessment. The results from these studies, while suggestive of possible interactions, are mixed and hindered by inconsistent application of social stress indicators. Overall, while there have been significant advances, further developments across all of the risk assessment stages (i.e., hazard identification, exposure assessment, dose-response, and risk characterization) are necessary to provide a scientific basis for regulatory actions and effective community interventions, particularly when considering non-chemical stressors. A better understanding of the biological underpinnings of social stress on disease, and the implications for chemical-based dose-response relationships, is needed. Furthermore, when considering non-chemical stressors, an appropriate metric, or series of metrics, for risk characterization is also needed. Cumulative risk assessment research will benefit from coordination of information from several different scientific disciplines, including, for example, toxicology, epidemiology, nutrition, neurotoxicology, and the social sciences.

    Rare coding variants in PLCG2, ABI3, and TREM2 implicate microglial-mediated innate immunity in Alzheimer's disease

    We identified rare coding variants associated with Alzheimer's disease (AD) in a 3-stage case-control study of 85,133 subjects. In stage 1, 34,174 samples were genotyped using a whole-exome microarray. In stage 2, we tested associated variants (P < 1×10⁻⁴) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, an additional 14,997 samples were used to test the most significant stage 2 associations (P < 5×10⁻⁸) using imputed genotypes. We observed 3 novel genome-wide significant (GWS) AD-associated non-synonymous variants: a protective variant in PLCG2 (rs72824905/p.P522R, P = 5.38×10⁻¹⁰, OR = 0.68, MAF_cases = 0.0059, MAF_controls = 0.0093), a risk variant in ABI3 (rs616338/p.S209F, P = 4.56×10⁻¹⁰, OR = 1.43, MAF_cases = 0.011, MAF_controls = 0.008), and a novel GWS variant in TREM2 (rs143332484/p.R62H, P = 1.55×10⁻¹⁴, OR = 1.67, MAF_cases = 0.0143, MAF_controls = 0.0089), a known AD susceptibility gene. These protein-coding changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified AD risk genes. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to AD development.

    Financial Performance Assessment of Cooperatives in Pelalawan Regency

    Full text link
    This paper describes the development and financial performance of cooperatives in Pelalawan Regency during 2007-2008. The study covers primary and secondary cooperatives in 12 sub-districts. The method assesses cooperative performance in terms of productivity, efficiency, growth, liquidity, and solvency. Cooperatives in Pelalawan showed high productivity but low efficiency; profit and income were high, liquidity was very high, and solvency was good.

    Juxtaposing BTE and ATE – on the role of the European insurance industry in funding civil litigation

    One of the ways in which legal services are financed, and indeed shaped, is through private insurance arrangements. Two contrasting types of legal expenses insurance (LEI) contracts seem to dominate in Europe: before the event (BTE) and after the event (ATE) legal expenses insurance. Notwithstanding institutional differences between legal systems, BTE and ATE insurance arrangements may be instrumental if government policy is geared towards strengthening a market-oriented system of financing access to justice for individuals and businesses. At the same time, emphasizing the role of a private industry as a keeper of the gates to justice raises issues of accountability and transparency, not readily reconcilable with demands of competition. Moreover, multiple actors (clients, lawyers, courts, insurers) are involved, causing behavioural dynamics which are not easily predicted or influenced. Against this background, this paper looks into BTE and ATE arrangements by analysing the particularities of those currently available in some European jurisdictions and by painting a picture of their respective markets and legal contexts. This allows for some reflection on the performance of BTE and ATE providers as both financiers and keepers. Two issues emerge from the analysis that are worthy of further reflection. Firstly, there is the problematic long-term sustainability of some ATE products. Secondly, there are the challenges faced by policymakers who would like to nudge consumers into voluntarily taking out BTE LEI.