A Scalable Algorithm For Sparse Portfolio Selection
The sparse portfolio selection problem is one of the most famous and
frequently studied problems in the optimization and financial economics
literatures. In a universe of risky assets, the goal is to construct a
portfolio with maximal expected return and minimum variance, subject to an
upper bound on the number of positions, linear inequalities and minimum
investment constraints. Existing certifiably optimal approaches to this problem
do not converge within a practical amount of time at real-world problem sizes
with more than 400 securities. In this paper, we propose a more scalable
approach. By imposing a ridge regularization term, we reformulate the problem
as a convex binary optimization problem, which is solvable via an efficient
outer-approximation procedure. We propose various techniques for improving the
performance of the procedure, including a heuristic which supplies high-quality
warm-starts, a preprocessing technique for decreasing the gap at the root node,
and an analytic technique for strengthening our cuts. We also study the
problem's Boolean relaxation, establish that it is second-order-cone
representable, and supply a sufficient condition for its tightness. In
numerical experiments, we establish that the outer-approximation procedure
gives rise to dramatic speedups for sparse portfolio selection problems. Comment: Submitted to INFORMS Journal on Computing.
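To make the outer-approximation step concrete, the Python sketch below evaluates the subproblem f(z) for a fixed binary support and derives a cut from the dual of the ridge term. It is a minimal sketch under simplifying assumptions (a penalized mean-variance objective with no budget or linear constraints, and illustrative names such as inner_value and oa_cut), not the paper's exact formulation.

```python
import numpy as np

def inner_value(Sigma, mu, z, gamma):
    """Evaluate f(z): the support-restricted, ridge-regularized subproblem
        f(z) = min_x  0.5 * x' Sigma x + ||x||^2 / (2 * gamma) - mu' x
               s.t.   x_i = 0 whenever z_i = 0.
    Returns the optimal value and the zero-padded optimal portfolio."""
    idx = np.flatnonzero(z)
    H = Sigma[np.ix_(idx, idx)] + np.eye(idx.size) / gamma
    x = np.zeros_like(mu)
    x[idx] = np.linalg.solve(H, mu[idx])
    val = 0.5 * x @ Sigma @ x + x @ x / (2 * gamma) - mu @ x
    return val, x

def oa_cut(Sigma, mu, z, gamma):
    """One outer-approximation cut: f(z') >= f(z) + g @ (z' - z) for binary z'.
    The coefficients come from the dual of the ridge (perspective) term:
    alpha = mu - Sigma @ x*,  g_i = -(gamma / 2) * alpha_i**2."""
    val, x = inner_value(Sigma, mu, z, gamma)
    alpha = mu - Sigma @ x
    g = -0.5 * gamma * alpha**2
    return val, g

# Toy usage: one cut at a candidate support of size 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
Sigma, mu = A.T @ A / 200, 0.05 * rng.standard_normal(10)
z = np.zeros(10)
z[:3] = 1.0
f_z, cut_grad = oa_cut(Sigma, mu, z, gamma=1.0)
```

A master mixed-integer program would then minimize an epigraph variable subject to the accumulated cuts and the cardinality bound, adding one cut per iteration until the gap closes.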
Sparse PCA With Multiple Components
Sparse Principal Component Analysis (sPCA) is a cardinal technique for
obtaining combinations of features, or principal components (PCs), that explain
the variance of high-dimensional datasets in an interpretable manner. This
involves solving a sparsity and orthogonality constrained convex maximization
problem, which is extremely computationally challenging. Most existing works
address sparse PCA via methods, such as iteratively computing one sparse PC and deflating the covariance matrix, that do not guarantee the orthogonality, let alone the optimality, of the resulting solution when we seek multiple mutually orthogonal PCs. We challenge this status quo by reformulating the orthogonality
conditions as rank constraints and optimizing over the sparsity and rank
constraints simultaneously. We design tight semidefinite relaxations to supply
high-quality upper bounds, which we strengthen via additional second-order cone
inequalities when each PC's individual sparsity is specified. Further, we
derive a combinatorial upper bound on the maximum amount of variance explained
as a function of the support. We exploit these relaxations and bounds to
propose exact methods and rounding mechanisms that, together, obtain solutions
with a bound gap on the order of 0%-15% for real-world datasets with p = 100s or 1000s of features and r ∈ {2, 3} components. Numerically, our algorithms
match (and sometimes surpass) the best performing methods in terms of fraction
of variance explained and systematically return PCs that are sparse and
orthogonal. In contrast, we find that existing methods like deflation return
solutions that violate the orthogonality constraints, even when the data is
generated according to sparse orthogonal PCs. Altogether, our approach solves
sparse PCA problems with multiple components to certifiable (near) optimality
in a practically tractable fashion. Comment: Updated version with improved algorithmics and a new section containing a generalization of the Gershgorin circle theorem; comments or suggestions welcome.
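For reference, the snippet below solves the classical single-component SDP relaxation of sparse PCA (in the spirit of d'Aspremont et al.) with cvxpy. It provides an upper bound on the variance explained by one k-sparse PC and is meant only as a familiar baseline; it is not the multi-component, rank-constrained relaxation proposed above, and the function name and toy data are illustrative.

```python
import cvxpy as cp
import numpy as np

def spca_sdp_bound(Sigma, k):
    """Classical single-component SDP relaxation of sparse PCA:
        max  <Sigma, X>   s.t.  X is PSD, trace(X) = 1, sum_ij |X_ij| <= k.
    Returns an upper bound on the variance explained by one k-sparse PC."""
    p = Sigma.shape[0]
    X = cp.Variable((p, p), symmetric=True)
    constraints = [X >> 0, cp.trace(X) == 1, cp.sum(cp.abs(X)) <= k]
    problem = cp.Problem(cp.Maximize(cp.trace(Sigma @ X)), constraints)
    problem.solve(solver=cp.SCS)
    return problem.value, X.value

# Toy usage on a random 10-feature covariance matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
Sigma = A.T @ A / 50
upper_bound, X_opt = spca_sdp_bound(Sigma, k=3)
```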
Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach
We study the Sparse Plus Low-Rank decomposition problem (SLR), which is the
problem of decomposing a corrupted data matrix into a sparse matrix of
perturbations plus a low-rank matrix containing the ground truth. SLR is a
fundamental problem in Operations Research and Machine Learning which arises in
various applications, including data compression, latent semantic indexing,
collaborative filtering, and medical imaging. We introduce a novel formulation
for SLR that directly models its underlying discreteness. For this formulation,
we develop an alternating minimization heuristic that computes high-quality
solutions and a novel semidefinite relaxation that provides meaningful bounds
for the solutions returned by our heuristic. We also develop a custom
branch-and-bound algorithm that leverages our heuristic and convex relaxations
to solve small instances of SLR to certifiable (near) optimality. Given an
n-by-n input matrix, our heuristic runs in minutes on large instances, our relaxation runs in hours on moderately sized instances, and our branch-and-bound algorithm solves small instances in minutes. Our numerical results demonstrate that our approach outperforms existing state-of-the-art approaches in terms of rank, sparsity, and mean-square error while maintaining a comparable runtime.
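To convey the flavor of alternating minimization for SLR, the sketch below alternates a truncated SVD (low-rank step) with hard thresholding (sparse step), in the style of GoDec. This is an illustrative simplification, not the paper's heuristic; the function name and toy instance are assumptions.

```python
import numpy as np

def sparse_plus_low_rank(M, rank_k, sparsity_s, n_iters=50):
    """Alternating minimization for M ~ L + S with rank(L) <= rank_k and
    roughly sparsity_s nonzero entries in S."""
    S = np.zeros_like(M)
    for _ in range(n_iters):
        # Low-rank step: best rank-k approximation of the residual M - S.
        U, sig, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank_k] * sig[:rank_k]) @ Vt[:rank_k]
        # Sparse step: keep the sparsity_s largest-magnitude residual entries.
        R = M - L
        cutoff = np.partition(np.abs(R), -sparsity_s, axis=None)[-sparsity_s]
        S = np.where(np.abs(R) >= cutoff, R, 0.0)
    return L, S

# Toy usage: recover a planted rank-2 matrix corrupted in 20 entries.
rng = np.random.default_rng(0)
ground_truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
corruption = np.zeros((30, 30))
rows, cols = rng.integers(0, 30, 20), rng.integers(0, 30, 20)
corruption[rows, cols] = 5.0
L_hat, S_hat = sparse_plus_low_rank(ground_truth + corruption, rank_k=2, sparsity_s=20)
```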
A Stochastic Benders Decomposition Scheme for Large-Scale Data-Driven Network Design
Network design problems involve constructing edges in a transportation or
supply chain network to minimize construction and daily operational costs. We
study a data-driven version of network design where operational costs are
uncertain and estimated using historical data. This problem is notoriously
computationally challenging, and instances with as few as fifty nodes cannot be
solved to optimality by current decomposition techniques. Accordingly, we
propose a stochastic variant of Benders decomposition that mitigates the high
computational cost of generating each cut by sampling a subset of the data at
each iteration and nonetheless generates deterministically valid cuts (as
opposed to the probabilistically valid cuts frequently proposed in the
stochastic optimization literature) via a dual averaging technique. We
implement both single-cut and multi-cut variants of this Benders decomposition
algorithm, as well as a k-cut variant that uses clustering of the historical
scenarios. On instances with 100-200 nodes, our algorithm achieves 4-5%
optimality gaps, compared with 13-16% for deterministic Benders schemes, and
scales to instances with 700 nodes and 50 commodities within hours. Beyond
network design, our strategy could be adapted to generic two-stage stochastic
mixed-integer optimization problems where second-stage costs are estimated via
a sample average.
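The sketch below illustrates the sampling idea on a toy newsvendor-style recourse (standing in for the network-design second stage): each iteration prices only a random minibatch of historical scenarios and folds the resulting cut coefficients into running averages, a simplified stand-in for the paper's dual-averaging scheme. All names and the toy second stage are assumptions for illustration.

```python
import numpy as np

def recourse(x, demand, shortfall_cost=10.0):
    """Toy second-stage cost: pay shortfall_cost per unit of unmet demand.
    Returns the cost and a subgradient with respect to the first-stage x."""
    value = shortfall_cost * max(demand - x, 0.0)
    subgrad = -shortfall_cost if demand > x else 0.0
    return value, subgrad

def sampled_benders_cut(x, demands, rng, batch_size, running):
    """Build a Benders-style cut  theta >= a + b * x'  by pricing only a random
    subset of the historical scenarios, then fold its coefficients into running
    averages across iterations (a simple stand-in for dual averaging)."""
    batch = rng.choice(demands, size=batch_size, replace=False)
    values, grads = zip(*(recourse(x, d) for d in batch))
    slope = float(np.mean(grads))
    intercept = float(np.mean(values)) - slope * x
    running["count"] += 1
    t = running["count"]
    running["intercept"] += (intercept - running["intercept"]) / t
    running["slope"] += (slope - running["slope"]) / t
    return running["intercept"], running["slope"]

# Toy usage: aggregate a cut at x = 5.0 from 1,000 historical demand scenarios.
rng = np.random.default_rng(0)
demands = rng.gamma(shape=2.0, scale=4.0, size=1000)
running = {"count": 0, "intercept": 0.0, "slope": 0.0}
a, b = sampled_benders_cut(5.0, demands, rng, batch_size=50, running=running)
# A master problem would then impose  theta >= a + b * x  alongside earlier cuts.
```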
AI Hilbert: A New Paradigm for Scientific Discovery by Unifying Data and Background Knowledge
The discovery of scientific formulae that parsimoniously explain natural
phenomena and align with existing background theory is a key goal in science.
Historically, scientists have derived natural laws by manipulating equations
based on existing knowledge, forming new equations, and verifying them
experimentally. In recent years, data-driven scientific discovery has emerged
as a viable competitor in settings with large amounts of experimental data.
Unfortunately, data-driven methods often fail to discover valid laws when data
is noisy or scarce. Accordingly, recent works combine regression and reasoning
to eliminate formulae inconsistent with background theory. However, the problem
of searching over the space of formulae consistent with background theory to
find one that fits the data best is not well-solved. We propose a solution to
this problem when all axioms and scientific laws are expressible via polynomial
equalities and inequalities and argue that our approach is widely applicable.
We further model notions of minimal complexity using binary variables and
logical constraints, solve polynomial optimization problems via mixed-integer
linear or semidefinite optimization, and prove the validity of our scientific
discoveries in a principled manner using Positivstellensatz certificates.
Remarkably, the optimization techniques leveraged in this paper allow our
approach to run in polynomial time with fully correct background theory, or
non-deterministic polynomial (NP) time with partially correct background
theory. We demonstrate that some famous scientific laws, including Kepler's
Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated
Gravitational Wave Power equation, can be derived in a principled manner from
background axioms and experimental data. Comment: Slightly revised from version 1, in particular polished the figure.
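As a toy illustration of deriving a law from polynomial background theory, the SymPy snippet below checks ideal membership by Gröbner-basis reduction: Kepler's third law for circular orbits reduces to zero modulo axioms encoding gravitation, centripetal force, and the orbital period. This equality-only check is a simplification of the Positivstellensatz certificates discussed above, which also handle inequalities.

```python
import sympy as sp

F, G, M, m, r, w, T, pi = sp.symbols('F G M m r w T pi')

# Background theory as polynomial equalities (denominators cleared):
axioms = [
    F * r**2 - G * M * m,   # Newtonian gravitation: F = G*M*m / r^2
    F - m * w**2 * r,       # centripetal force for a circular orbit
    w * T - 2 * pi,         # angular velocity versus orbital period
]

# Candidate law: Kepler's third law for circular orbits, cleared of fractions.
kepler3 = G * M * m * T**2 - 4 * pi**2 * m * r**3

basis = sp.groebner(axioms, F, w, G, M, m, r, T, pi, order='lex')
quotients, remainder = basis.reduce(kepler3)
print(remainder)  # prints 0: the candidate law lies in the ideal of the axioms
```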
Non-Chemical Stressors and Cumulative Risk Assessment: An Overview of Current Initiatives and Potential Air Pollutant Interactions
Regulatory agencies are under increased pressure to consider broader public health concerns that extend to multiple pollutant exposures, multiple exposure pathways, and vulnerable populations. Specifically, cumulative risk assessment initiatives have stressed the importance of considering both chemical and non-chemical stressors, such as socioeconomic status (SES) and related psychosocial stress, in evaluating health risks. The integration of non-chemical stressors into a cumulative risk assessment framework has been largely driven by evidence of health disparities across different segments of society that may also bear a disproportionate risk from chemical exposures. This review will discuss current efforts to advance the field of cumulative risk assessment, highlighting some of the major challenges within the construct of the traditional risk assessment paradigm. Additionally, we present a summary of studies of potential interactions between social stressors and air pollutants on health as an example of current research that supports the incorporation of non-chemical stressors into risk assessment. The results from these studies, while suggestive of possible interactions, are mixed and hindered by inconsistent application of social stress indicators. Overall, while there have been significant advances, further developments across all of the risk assessment stages (i.e., hazard identification, exposure assessment, dose-response, and risk characterization) are necessary to provide a scientific basis for regulatory actions and effective community interventions, particularly when considering non-chemical stressors. A better understanding of the biological underpinnings of social stress on disease and implications for chemical-based dose-response relationships is needed. Furthermore, when considering non-chemical stressors, an appropriate metric, or series of metrics, for risk characterization is also needed. Cumulative risk assessment research will benefit from coordination of information from several different scientific disciplines, including, for example, toxicology, epidemiology, nutrition, neurotoxicology, and the social sciences.
Rare coding variants in PLCG2, ABI3, and TREM2 implicate microglial-mediated innate immunity in Alzheimer's disease
We identified rare coding variants associated with Alzheimer’s disease (AD) in a 3-stage case-control study of 85,133 subjects. In stage 1, 34,174 samples were genotyped using a whole-exome microarray. In stage 2, we tested associated variants (P < 1×10^-4) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, an additional 14,997 samples were used to test the most significant stage 2 associations (P < 5×10^-8) using imputed genotypes. We observed 3 novel genome-wide significant (GWS) AD-associated non-synonymous variants: a protective variant in PLCG2 (rs72824905/p.P522R, P = 5.38×10^-10, OR = 0.68, MAF_cases = 0.0059, MAF_controls = 0.0093), a risk variant in ABI3 (rs616338/p.S209F, P = 4.56×10^-10, OR = 1.43, MAF_cases = 0.011, MAF_controls = 0.008), and a novel GWS variant in TREM2 (rs143332484/p.R62H, P = 1.55×10^-14, OR = 1.67, MAF_cases = 0.0143, MAF_controls = 0.0089), a known AD susceptibility gene. These protein-coding changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified AD risk genes. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to AD development.
Financial Performance Assessment of Cooperatives in Pelalawan Regency (Penilaian Kinerja Keuangan Koperasi di Kabupaten Pelalawan)
This paper describes the development and financial performance of cooperatives in Pelalawan Regency during 2007-2008. The study covers primary and secondary cooperatives in 12 sub-districts. The method measures cooperative performance in terms of productivity, efficiency, growth, liquidity, and solvency. Cooperative productivity in Pelalawan was high but efficiency remained low. Profit and income were high, liquidity was very high, and solvency was good.
Juxtaposing BTE and ATE – on the role of the European insurance industry in funding civil litigation
One of the ways in which legal services are financed, and indeed shaped, is through private insurance arrangements. Two contrasting types of legal expenses insurance contracts (LEI) seem to dominate in Europe: before the event (BTE) and after the event (ATE) legal expenses insurance. Notwithstanding institutional differences between legal systems, BTE and ATE insurance arrangements may be instrumental if government policy is geared towards strengthening a market-oriented system of financing access to justice for individuals and business. At the same time, emphasizing the role of a private industry as a keeper of the gates to justice raises issues of accountability and transparency, not readily reconcilable with demands of competition. Moreover, multiple actors (clients, lawyers, courts, insurers) are involved, causing behavioural dynamics which are not easily predicted or influenced.
Against this background, this paper looks into BTE and ATE arrangements by analysing the particularities of BTE and ATE arrangements currently available in some European jurisdictions and by painting a picture of their respective markets and legal contexts. This allows for some reflection on the performance of BTE and ATE providers as both financiers and keepers. Two issues emerge from the analysis that are worthy of some further reflection. Firstly, there is the problematic long-term sustainability of some ATE products. Secondly, there are the challenges faced by policymakers who would like to nudge consumers into voluntarily taking out BTE LEI.