Have Econometric Analyses of Happiness Data Been Futile? A Simple Truth About Happiness Scales
Econometric analyses in the happiness literature typically use subjective
well-being (SWB) data to compare the mean of observed or latent happiness
across samples. Recent critiques show that comparing the mean of ordinal data
is only valid under strong assumptions that are usually rejected by SWB data.
This raises an open question: have many of the empirical studies in the
economics of happiness literature been futile? To salvage some of the prior
results and avoid future issues, we suggest that regression analysis of
SWB (and other ordinal data) should focus on the median rather than the mean.
Median comparisons using parametric models such as the ordered probit and logit
can be readily carried out in familiar statistical software such as Stata. We
also show that estimating a semiparametric median ordered-response model, a
task previously assumed impractical, is possible using a novel constrained
mixed integer optimization technique. We use GSS data to show the famous
Easterlin Paradox from the happiness literature holds for the US independent of
any parametric assumption.
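As a minimal illustration of the scale-dependence problem this abstract describes (not code from the paper; the data are made up), the numpy sketch below shows how a monotone relabeling of an ordinal happiness scale can reverse a mean comparison while leaving the median comparison intact:

```python
import numpy as np

# Hypothetical responses on a 3-point happiness scale for two samples.
a = np.array([1, 1, 3, 3, 3])
b = np.array([2, 2, 2, 3, 3])

# A monotone relabeling of the categories (1, 2, 3) -> (1, 2, 10): the
# ordinal information is identical, only the arbitrary numeric scale changes.
relabel = {1: 1, 2: 2, 3: 10}
a2 = np.vectorize(relabel.get)(a)
b2 = np.vectorize(relabel.get)(b)

# Mean comparisons depend on the labeling and can flip ...
print(a.mean() < b.mean())    # True under the original labels
print(a2.mean() < b2.mean())  # False after relabeling: the ranking flipped

# ... while the median category is invariant under monotone relabeling.
print(np.median(a), np.median(b))    # 3.0 2.0
print(np.median(a2), np.median(b2))  # 10.0 2.0: same ordering of samples
```

This is exactly why a median-based comparison survives any choice of numeric labels for the categories, while a mean-based one need not.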
Separable Convex Optimization with Nested Lower and Upper Constraints
We study a convex resource allocation problem in which lower and upper bounds
are imposed on partial sums of allocations. This model is linked to a large
range of applications, including production planning, speed optimization,
stratified sampling, support vector machines, portfolio management, and
telecommunications. We propose an efficient gradient-free divide-and-conquer
algorithm, which uses monotonicity arguments to generate valid bounds from the
recursive calls, and eliminate linking constraints based on the information
from sub-problems. This algorithm does not need strict convexity or
differentiability. It produces an ε-approximate solution for the continuous
problem in O(n log m log(B/ε)) time and an integer solution in
O(n log m log B) time, where n is the number of decision variables, m is the
number of constraints, and B is the resource bound. A complexity of
O(n log m) is also achieved
for the linear and quadratic cases. These are the best complexities known to
date for this important problem class. Our experimental analyses confirm the
good performance of the method, which produces optimal solutions for problems
with up to 1,000,000 variables in a few seconds. Promising applications to the
support vector ordinal regression problem are also investigated.
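To make the problem class concrete, here is a tiny instance solved with a generic solver (scipy's trust-constr method), not the paper's gradient-free divide-and-conquer algorithm; the targets and bounds are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Toy instance: minimize sum_i (x_i - t_i)^2 subject to a total resource
# budget and nested bounds lb_k <= x_1 + ... + x_k <= ub_k on partial sums.
t = np.array([4.0, 1.0, 3.0, 2.0])   # separable quadratic targets (made up)
B = 8.0                              # resource bound
lb = np.array([0.0, 1.0, 3.0, B])    # lower bounds on the nested partial sums
ub = np.array([3.0, 5.0, 7.0, B])    # upper bounds (last pair forces sum == B)

# Prefix-sum matrix: row k computes x_1 + ... + x_{k+1}.
P = np.tril(np.ones((4, 4)))
res = minimize(lambda x: np.sum((x - t) ** 2),
               x0=np.full(4, B / 4),          # feasible starting point
               constraints=[LinearConstraint(P, lb, ub)],
               method="trust-constr")
print(np.round(res.x, 3))
```

The generic solver only serves to define the feasible set; the paper's contribution is solving such separable instances far faster by divide and conquer on the nested constraint structure.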
Training linear ranking SVMs in linearithmic time using red-black trees
We introduce an efficient method for training the linear ranking support
vector machine. The method combines cutting plane optimization with a
red-black-tree-based approach to subgradient calculations and has
O(ms + m log m) time complexity, where m is the number of training examples
and s is the average number of non-zero features per example. The best
previously known training
algorithms achieve the same efficiency only for restricted special cases,
whereas the proposed approach allows arbitrary real-valued utility scores in the
training data. Experiments demonstrate the superior scalability of the proposed
approach when compared to the fastest existing RankSVM implementations.
Comment: 20 pages, 4 figures
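The order-statistic idea behind the tree-based subgradient computation can be illustrated, for the special case of binary relevance labels, with plain sorting: counting margin-violating pairs in O(m log m) instead of O(m^2). This is a sketch with synthetic scores, not the paper's implementation (which uses red-black trees to handle arbitrary real-valued utilities):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 200
scores = rng.normal(size=m)        # hypothetical model scores w·x_i
y = rng.integers(0, 2, size=m)     # binary relevance labels (special case)

pos, neg = scores[y == 1], scores[y == 0]

# Naive O(m^2): count pairs violating the margin s_pos - s_neg >= 1.
naive = sum(1 for sp in pos for sn in neg if sp - sn < 1.0)

# Sort-based O(m log m): for each positive score, binary-search how many
# negative scores exceed s_pos - 1. Balanced search trees (the paper's
# red-black trees) play the same order-statistic role for general utilities.
neg_sorted = np.sort(neg)
fast = int(np.sum(len(neg_sorted)
                  - np.searchsorted(neg_sorted, pos - 1.0, side="right")))
print(naive == fast)  # both count the same set of violating pairs
```

The same counting trick also yields the hinge-loss subgradient terms without ever materializing the quadratic number of pairs.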
Sparse Regression with Multi-type Regularized Feature Modeling
Within the statistical and machine learning literature, regularization
techniques are often used to construct sparse (predictive) models. Most
regularization strategies only work for data where all predictors are treated
identically, such as Lasso regression for (continuous) predictors treated as
linear effects. However, many predictive problems involve different types of
predictors and require a tailored regularization term. We propose a multi-type
Lasso penalty that acts on the objective function as a sum of subpenalties, one
for each type of predictor. As such, we allow for predictor selection and level
fusion within a predictor in a data-driven way, simultaneous with the parameter
estimation process. We develop a new estimation strategy for convex predictive
models with this multi-type penalty. Using the theory of proximal operators,
our estimation procedure is computationally efficient, partitioning the overall
optimization problem into easier-to-solve subproblems, specific to each
predictor type and its associated penalty. Earlier research applies
approximations to non-differentiable penalties to solve the optimization
problem. The proposed SMuRF algorithm removes the need for approximations and
achieves a higher accuracy and computational efficiency. This is demonstrated
with an extensive simulation study and the analysis of a case-study on
insurance pricing analytics.
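The per-type proximal splitting can be sketched with a generic proximal-gradient loop (this is not the SMuRF algorithm itself; the data, penalty choices, and step size are illustrative). Because the penalty is a sum of subpenalties, its proximal operator factorizes over predictor types, here a Lasso block and a ridge block:

```python
import numpy as np

rng = np.random.default_rng(1)
n, pA, pB = 100, 5, 5
X = rng.normal(size=(n, pA + pB))
beta = np.concatenate([[2.0, 0.0, 0.0, -1.5, 0.0],   # type A: sparse signal
                       0.5 * rng.normal(size=pB)])    # type B: dense signal
y = X @ beta + 0.1 * rng.normal(size=n)

lamA, lamB = 0.5, 0.5                    # one subpenalty weight per type
eta = n / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the loss

def prox(v):
    # The multi-type prox splits into independent per-type proxes.
    out = np.empty_like(v)
    out[:pA] = np.sign(v[:pA]) * np.maximum(np.abs(v[:pA]) - eta * lamA, 0.0)  # Lasso
    out[pA:] = v[pA:] / (1.0 + eta * lamB)                                     # ridge
    return out

b = np.zeros(pA + pB)
for _ in range(500):
    grad = -X.T @ (y - X @ b) / n        # gradient of (1/2n)||y - Xb||^2
    b = prox(b - eta * grad)

print(np.round(b[:pA], 3))  # type-A block: true zeros are selected out exactly
```

The soft-thresholding prox yields exact zeros in the Lasso block, so predictor selection happens simultaneously with parameter estimation, as the abstract describes.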
Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data
This paper addresses the learning task of estimating driver drowsiness from
the signals of car acceleration sensors. Since even drivers themselves cannot
perceive their own drowsiness in a timely manner unless they use burdensome
invasive sensors, obtaining labeled training data for each timestamp is not a
realistic goal. To deal with this difficulty, we formulate the task as a weakly
supervised learning problem. We need labels only for each complete trip, not for
every timestamp independently. By assuming that some aspects of driver
drowsiness increase over time due to tiredness, we formulate an algorithm that
can learn from such weakly labeled data. We derive a scalable stochastic
optimization method as a way of implementing the algorithm. Numerical
experiments on real driving datasets demonstrate the advantages of our
algorithm against baseline methods.
Comment: Accepted by ICASSP 2020
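As a toy sketch of the weak-supervision idea (not the authors' algorithm; the data, features, and weighting scheme are all invented), one can propagate each trip-level label to every timestamp and encode the monotonicity assumption, drowsiness increasing over a trip, as time-increasing sample weights in a weighted logistic regression:

```python
import numpy as np

rng = np.random.default_rng(2)
n_trips, T, d = 20, 30, 3
trip_label = rng.integers(0, 2, size=n_trips)   # weak, per-trip labels only

feats, labels, weights = [], [], []
for i in range(n_trips):
    x = rng.normal(size=(T, d))
    if trip_label[i]:
        x[:, 0] += np.linspace(0.0, 2.0, T)     # synthetic drowsiness drift
    feats.append(x)
    labels.append(np.full(T, trip_label[i]))    # timestamps inherit the trip label
    # Monotonicity assumption: later timestamps of a drowsy trip are more
    # reliably drowsy, so they receive larger weights.
    weights.append(np.linspace(0.2, 1.0, T) if trip_label[i] else np.ones(T))

X = np.vstack(feats)
y = np.concatenate(labels)
w = np.concatenate(weights)

# Weighted logistic regression by full-batch gradient descent (the paper
# instead derives a scalable stochastic optimization method).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta, bias = np.zeros(d), 0.0
for _ in range(2000):
    g = w * (sigmoid(X @ theta + bias) - y)     # weighted residuals
    theta -= 0.5 * X.T @ g / len(y)
    bias -= 0.5 * g.mean()

acc = np.mean((sigmoid(X @ theta + bias) > 0.5) == y)
print(round(acc, 3))
```

Even with only trip-level labels, the time-weighted fit recovers a positive coefficient on the drifting feature, which is the essence of learning per-timestamp drowsiness from weak labels.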