Faster Algorithms for Structured Linear and Kernel Support Vector Machines
Quadratic programming is a ubiquitous prototype in convex programming. Many
combinatorial optimization problems on graphs and machine learning problems can be
formulated as quadratic programs; for example, Support Vector Machines
(SVMs). Linear and kernel SVMs have been among the most popular models in
machine learning over the past three decades, prior to the deep learning era.
Generally, a quadratic program has an input size of $\Theta(n^2)$, where $n$
is the number of variables. Assuming the Strong Exponential Time Hypothesis
(SETH), it is known that no $O(n^{2-o(1)})$-time algorithm exists
(Backurs, Indyk, and Schmidt, NIPS'17). However, problems such as SVMs usually
feature much smaller input sizes: one is given $n$ data points, each of
dimension $d$, with $d \ll n$. Furthermore, SVMs are variants with only $O(1)$
linear constraints. This suggests that faster algorithms are feasible, provided
the program exhibits certain underlying structures.
In this work, we design the first nearly-linear time algorithm for solving
quadratic programs whenever the quadratic objective has small treewidth or
admits a low-rank factorization, and the number of linear constraints is small.
Consequently, we obtain a variety of results for SVMs:
* For linear SVM, where the quadratic constraint matrix has treewidth $\tau$,
we can solve the corresponding program in time $\widetilde{O}(n\tau^{(\omega+1)/2}\log(1/\epsilon))$,
where $\omega \approx 2.37$ is the exponent of matrix multiplication;
* For linear SVM, where the quadratic constraint matrix admits a low-rank
factorization of rank $k$, we can solve the corresponding program in time
$\widetilde{O}(nk^{(\omega+1)/2}\log(1/\epsilon))$;
* For Gaussian kernel SVM, where the data dimension $d = O(\log n)$ and
the squared dataset radius is small, we can solve it in time
$n^{1+o(1)}\log(1/\epsilon)$. We also prove that when the squared dataset
radius is large, $\Omega(n^{2-o(1)})$ time is required.
Comment: New results: almost-linear time algorithm for Gaussian kernel SVM and complementary lower bounds. Abstract shortened to meet arXiv requirements
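To make the structural assumption concrete, here is a minimal numpy sketch (ours, not taken from the paper) of why the linear-SVM dual fits the low-rank regime: its $n \times n$ quadratic matrix factors through the $n \times d$ data matrix, so its rank is at most $d$.

```python
# Minimal sketch (not from the paper): the dual of a linear SVM is a quadratic
# program whose n x n matrix Q = diag(y) X X^T diag(y) factors through the
# n x d data matrix, so rank(Q) <= d -- the low-rank structure exploited above.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10                       # n data points of dimension d, with d << n
X = rng.standard_normal((n, d))       # data matrix
y = rng.choice([-1.0, 1.0], size=n)   # binary labels

Z = y[:, None] * X                    # row i scaled by its label y_i
Q = Z @ Z.T                           # n x n dual QP matrix, n^2 entries
print(np.linalg.matrix_rank(Q))       # at most d = 10
```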
Efficient Algorithm for Solving Hyperbolic Programs
Hyperbolic polynomials are a class of real-rooted polynomials with a wide
range of applications in theoretical computer science. Each hyperbolic
polynomial also induces a hyperbolic cone that is of particular interest in
optimization due to its generality: by choosing the polynomial properly, one
can easily recover classic optimization problems such as linear programming
and semidefinite programming. In this work, we develop efficient algorithms for
hyperbolic programming, the problem in which one wants to minimize a linear
objective under a system of linear constraints, with the solution required to
lie in the hyperbolic cone induced by the hyperbolic polynomial. Our algorithm is an
instance of the interior point method (IPM) that, instead of following the central
path, follows the central swath, which is a generalization of the central path.
To implement the IPM efficiently, we utilize a relaxation of the hyperbolic
program to a quadratic program, coupled with the first four moments of the
hyperbolic eigenvalues, which are crucial for updating the optimization direction.
We further show that, given an evaluation oracle for the polynomial, our
algorithm requires only a number of oracle calls polynomial in $n$ and $d$, where $n$ is the number
of variables and $d$ is the degree of the polynomial, with extra arithmetic operations polynomial in $n$ and $m$, where $m$ is the number of constraints
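For intuition about hyperbolic eigenvalues, a minimal sketch (ours, not the paper's algorithm) using the classic example $p(X) = \det(X)$ with direction $e = I$, whose hyperbolic cone is the PSD cone:

```python
# Minimal sketch (ours, for intuition): p(X) = det(X) on symmetric matrices is
# hyperbolic in the direction e = I, because t -> det(t I - X) is the
# characteristic polynomial and all of its roots (the hyperbolic eigenvalues
# of X) are real. Its hyperbolic cone is the PSD cone, so semidefinite
# programming is a special case of hyperbolic programming.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
X = (B + B.T) / 2                      # a symmetric matrix

coeffs = np.poly(X)                    # coefficients of det(t I - X)
roots = np.roots(coeffs)               # hyperbolic eigenvalues of X w.r.t. e = I
print(np.max(np.abs(roots.imag)))      # ~0: every root is (numerically) real
print(np.sort(roots.real))             # matches np.linalg.eigvalsh(X)
```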
Jerk as a Method of Identifying Physical Fatigue and Skill Level in Construction Work
Researchers have shown that physically demanding work, characterized by forceful exertions, repetition, and prolonged duration, can result in fatigue. Physical fatigue has been identified as a risk factor for both acute and cumulative injuries. Thus, monitoring worker fatigue levels is highly important in health and safety programs, as it supports proactive measures to prevent or reduce instances of injury to workers. Recent advancements in sensing technologies, including inertial measurement units (IMUs), present an opportunity for the real-time assessment of individuals' physical exposures. These sensors also surpass mature motion capture technologies in their ability to accurately provide fundamental parameters such as acceleration and its derivative, jerk.
Although jerk has been used for a variety of clinical applications to assess motor control, it has seldom been studied for applications in physically demanding occupations that are directly related to physical fatigue detection. This research uses IMU-based motion tracking suits to evaluate the use of jerk to detect changes in motor control. Since fatigue degrades motor control, and thus motion smoothness, it is expected that jerk values will increase with fatigue. Jerk can also be felt as the change in force on the body, which can lead to biomechanical injuries over time. Although it is known that fatigue contributes to a decline in motor control, there are no explicit studies that show the relationship between jerk and fatigue. In addition, jerk as it relates to the skill level of highly repetitive and demanding work has also remained unexplored. To examine these relationships, our first study evaluates: 1) the use of jerk to detect changes in motor control arising from physical exertion, and 2) differences in jerk values between motions performed by workers with varying skill levels. Additionally, we conducted a second study to assess the suitability of machine learning techniques for automated physical fatigue monitoring.
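For concreteness, a minimal sketch (ours, not from the thesis; the sampling rate and the synthetic signal are assumptions) of how jerk is obtained from sampled IMU acceleration by numerical differentiation:

```python
# Minimal sketch: jerk as the time derivative of acceleration, estimated from
# discretely sampled IMU data. Sampling rate and signal are illustrative only.
import numpy as np

fs = 100.0                                    # assumed IMU sampling rate (Hz)
t = np.arange(0, 5, 1 / fs)                   # 5 seconds of samples
accel = np.c_[np.sin(t), np.cos(t), 0.1 * t]  # synthetic 3-axis acceleration (m/s^2)

jerk = np.gradient(accel, 1 / fs, axis=0)     # m/s^3, one column per axis
jerk_magnitude = np.linalg.norm(jerk, axis=1)
print(jerk_magnitude.mean())                  # summary statistic per task window
```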
Bricklaying experiments were conducted with participants recruited from the Ontario Brick and Stone Mason apprenticeship program. Participants were classified into four groups based on their level of masonry experience: novices, first-year apprentices, third-year apprentices, and journeymen with more than five years of experience. In our first study, jerk analysis was carried out on eleven body segments, namely the pelvis and the dominant and non-dominant upper and lower limb segments. Our findings show that jerk values were consistently lowest for journeymen and highest for third-year apprentices across all eleven body segments. These findings suggest that the experience journeymen gain over the course of their career improves their ability to perform repetitive heavy lifts with smoother motions and greater control. Third-year apprentices performed lifts with the greatest jerk values, indicating poor motor performance. This finding was attributed to the pressure third-year apprentices felt to match their production levels to those of journeymen, leading them to use jerkier, less controlled motions. Novices and first-year apprentices showed more caution towards risks of injury, moving with greater motor control than the more experienced third-year apprentices. However, the production levels of novices and first-year apprentices fall far behind those of the other experience groups. Detectable increases in jerk values between the beginning (rested) and end (exerted) of the task were found only for the journeymen, which is attributed to their greater interpersonal similarities in learned technique and work pace.
In our second study, we investigated the use of support vector machines (SVMs) to automate the monitoring of physical exertion levels using jerk. The jerk values of the pelvis, upper arms, and thighs were used to classify inter- and intra-subject rested and exerted states. As expected, classification results demonstrated significantly higher intra-subject rested/exerted classification accuracy than inter-subject classification. On average, intra-subject classification achieved an accuracy of 94% for the wall building experiment and 80% for the first-course-of-masonry-units experiment.
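A minimal sketch (ours; the feature layout and synthetic data are assumptions, not the thesis protocol) of rested/exerted classification from jerk features with an SVM:

```python
# Minimal sketch: classify rested (0) vs. exerted (1) windows from jerk
# features with an SVM. Features are synthetic stand-ins for per-segment
# jerk statistics (e.g., mean jerk of pelvis, upper arms, thighs).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_windows, n_features = 200, 5
rested = rng.normal(1.0, 0.3, (n_windows, n_features))
exerted = rng.normal(1.4, 0.3, (n_windows, n_features))   # fatigued = jerkier
X = np.vstack([rested, exerted])
y = np.r_[np.zeros(n_windows), np.ones(n_windows)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))               # held-out rested/exerted accuracy
```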
The thesis findings lead us to conclude that: 1) jerk changes resulting from physical exertion and skill level can be assessed using IMUs, and 2) SVMs can automatically classify rested and exerted movements. The investigated jerk analysis holds promise for in-situ and real-time monitoring of physical exertion and fatigue, which can help reduce work-related injuries and illnesses.
Streaming Semidefinite Programs: $O(\sqrt{n})$ Passes, Small Space and Fast Runtime
We study the problem of solving semidefinite programs (SDP) in the streaming
model. Specifically, $m$ constraint matrices $A_1,\ldots,A_m$ and a target matrix $C$, all of
size $n \times n$, together with a vector $b \in \mathbb{R}^m$, are streamed to us
one-by-one. The goal is to find a matrix $X \in \mathbb{R}^{n \times n}$ such
that $\langle C, X\rangle$ is maximized, subject to $\langle A_i, X\rangle = b_i$
for all $i \in [m]$ and $X \succeq 0$. Previous algorithmic studies of SDP
primarily focus on \emph{time-efficiency}, and all of them require a
prohibitively large $\Omega(mn^2)$ space in order to store \emph{all the
constraints}. Such space consumption is necessary for fast algorithms, as it is
the size of the input. In this work, we design an interior point method (IPM)
that uses $\widetilde{O}(m^2+n^2)$ space, which is strictly sublinear in the
input size $mn^2$ whenever $m = \omega(1)$. Our algorithm takes $O(\sqrt{n}\log(1/\epsilon))$ passes, which
is standard for IPM. Moreover, when $m$ is much smaller than $n$, our algorithm
also matches the time complexity of the state-of-the-art SDP solvers. To
achieve such a sublinear space bound, we design a novel sketching method that
enables one to compute a spectral approximation to the Hessian matrix in
$\widetilde{O}(m^2)$ space. To the best of our knowledge, this is the first method that
successfully applies sketching techniques to improve SDP algorithms in terms of
space (and also time)
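For reference, a minimal sketch (ours) of the offline SDP the stream encodes, using an off-the-shelf dense solver; it assumes cvxpy with its default SDP solver is available:

```python
# Minimal sketch of the SDP being streamed: maximize <C, X> subject to
# <A_i, X> = b_i for all i and X PSD. This dense solve holds every A_i in
# memory at once -- exactly the Omega(m n^2) space the streaming IPM avoids.
import cvxpy as cp
import numpy as np

def sym(M):
    return (M + M.T) / 2

rng = np.random.default_rng(3)
n, m = 10, 3
C = sym(rng.standard_normal((n, n)))
A = [np.eye(n)] + [sym(rng.standard_normal((n, n))) for _ in range(m - 1)]
b = np.array([np.trace(Ai) for Ai in A])   # feasible: X = I satisfies all constraints

X = cp.Variable((n, n), symmetric=True)
constraints = [cp.trace(A[i] @ X) == b[i] for i in range(m)] + [X >> 0]
prob = cp.Problem(cp.Maximize(cp.trace(C @ X)), constraints)
prob.solve()
print(prob.status, prob.value)
```

Including the identity among the $A_i$ fixes the trace of $X$, which keeps this toy instance bounded.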
Dynamic Tensor Product Regression
In this work, we initiate the study of \emph{Dynamic Tensor Product
Regression}. One has matrices $A_1 \in \mathbb{R}^{n_1 \times d_1}, \ldots, A_q \in \mathbb{R}^{n_q \times d_q}$ and a label vector $b \in \mathbb{R}^{n_1 \cdots n_q}$, and the goal is to solve the regression problem with the design matrix
being the tensor product of the matrices, i.e.
$\min_{x \in \mathbb{R}^{d_1 \cdots d_q}} \|(A_1 \otimes \cdots \otimes A_q)x - b\|_2$. At each time step, one matrix $A_i$ receives a sparse change, and
the goal is to maintain a sketch of the tensor product $A_1 \otimes \cdots \otimes A_q$ so that the regression solution can be updated quickly.
Recomputing the solution from scratch each round is very slow, so it is
important to develop algorithms that can quickly update the solution under the
new design matrix. Our main result is a dynamic tree data structure in which any
update to a single matrix can be propagated quickly throughout the tree. We
show that our data structure can be used to solve dynamic versions of not only
Tensor Product Regression, but also Tensor Product Spline Regression (which is
a generalization of ridge regression), and to maintain Low Rank
Approximations of the tensor product.
Comment: NeurIPS 2022
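As a static baseline (ours, not the paper's dynamic data structure), the $q = 2$ case solved by explicitly forming the Kronecker product, which is exactly the recomputation the dynamic sketch avoids:

```python
# Minimal static baseline for q = 2: solve min_x ||(A1 kron A2) x - b||_2 by
# explicitly forming the Kronecker product. The dynamic data structure above
# avoids exactly this full recomputation when one A_i changes sparsely.
import numpy as np

rng = np.random.default_rng(4)
n1, d1, n2, d2 = 20, 3, 15, 4
A1 = rng.standard_normal((n1, d1))
A2 = rng.standard_normal((n2, d2))
b = rng.standard_normal(n1 * n2)

design = np.kron(A1, A2)                    # (n1*n2) x (d1*d2), costly to rebuild
x, *_ = np.linalg.lstsq(design, b, rcond=None)

# A sparse update to A1 forces a naive solver to redo all of the above:
A1[0, 0] += 0.5
x_new, *_ = np.linalg.lstsq(np.kron(A1, A2), b, rcond=None)
```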
Solving Attention Kernel Regression Problem via Pre-conditioner
The attention mechanism is the key to large language models, and the
attention matrix serves as an algorithmic and computational bottleneck for such
a scheme. In this paper, we define two problems, motivated by designing fast
algorithms for proxies of the attention matrix and solving regressions against them.
Given an input matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a
response vector $b$, we first consider the matrix exponential of the matrix
$A^\top A$ as a proxy, and we in turn design algorithms for two types of
regression problems: $\min_{x \in \mathbb{R}^d} \|(A^\top A)^j x - b\|_2$ and
$\min_{x \in \mathbb{R}^d} \|A(A^\top A)^j x - b\|_2$ for any positive integer $j$.
Studying algorithms for these regressions is essential, as the matrix exponential
can be approximated term-by-term via these smaller problems. The second proxy
is applying the exponential entrywise to the Gram matrix, denoted by $\exp(AA^\top)$,
and solving the regression $\min_{x \in \mathbb{R}^n} \|\exp(AA^\top)x - b\|_2$. We call this problem the attention
kernel regression problem, as the matrix $\exp(AA^\top)$ could be viewed as a
kernel function with respect to $A$. We design fast algorithms for these
regression problems, based on sketching and preconditioning. We hope these
efforts will provide an alternative perspective on studying efficient
approximation of attention matrices.
Comment: AISTATS 2024
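As a point of reference for the first family of regressions, a minimal sketch (ours, a plain conjugate-gradient baseline rather than the paper's sketching-and-preconditioning algorithm) that solves against $(A^\top A)^j$ without forming it:

```python
# Minimal sketch (ours): solve min_x ||(A^T A)^j x - b||_2 without ever forming
# (A^T A)^j, by giving CG a mat-vec that applies A^T A repeatedly, j times.
# Since (A^T A)^j is positive definite here (A has full column rank), the
# minimizer solves (A^T A)^j x = b exactly.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(5)
n, d, j = 2000, 30, 3                     # n >> d, as in the setting above
A = rng.standard_normal((n, d))
b = rng.standard_normal(d)

def apply_power(x):
    for _ in range(j):                    # j mat-vecs of cost O(nd) each
        x = A.T @ (A @ x)
    return x

op = LinearOperator((d, d), matvec=apply_power)
x, info = cg(op, b)
print(info, np.linalg.norm(apply_power(x) - b))   # info == 0 on convergence
```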
Accelerating Frank-Wolfe Algorithm using Low-Dimensional and Adaptive Data Structures
In this paper, we study the problem of speeding up a type of optimization
algorithm called Frank-Wolfe, a conditional gradient method. We develop and
employ two novel inner product search data structures, improving upon the prior
fastest algorithm of [Shrivastava, Song and Xu, NeurIPS 2021].
* The first data structure uses a low-dimensional random projection to reduce
the problem to a lower dimension, then uses an efficient inner product data
structure. After a one-time preprocessing step, its per-iteration cost is
sublinear in the number of candidate directions.
* The second data structure leverages recent developments in adaptive
inner product search data structures that can output estimates of all inner
products, trading preprocessing time for per-iteration cost.
The first algorithm improves upon the state-of-the-art in both preprocessing
time and per-iteration cost in all cases, while the second one provides an even
faster preprocessing time and is suitable when the number of iterations is small
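To show which step the data structures accelerate, a minimal sketch (ours) of vanilla Frank-Wolfe on the probability simplex, where the linear minimization oracle is exactly an inner product search over the vertices:

```python
# Minimal sketch (ours) of vanilla Frank-Wolfe on the probability simplex for
# f(x) = ||Mx - y||^2 / 2. The linear minimization step is an inner product
# search over the vertices e_1..e_n -- the step the data structures above
# replace with sublinear (approximate) inner product search.
import numpy as np

rng = np.random.default_rng(6)
n, m = 500, 50
M = rng.standard_normal((m, n))
y = rng.standard_normal(m)

x = np.ones(n) / n                        # start at the simplex barycenter
for t in range(200):
    grad = M.T @ (M @ x - y)              # gradient of the quadratic objective
    i = int(np.argmin(grad))              # LMO over simplex = min inner product
    s = np.zeros(n); s[i] = 1.0           # best vertex e_i
    eta = 2.0 / (t + 2)                   # standard step size
    x = (1 - eta) * x + eta * s           # convex combination stays feasible
print(0.5 * np.linalg.norm(M @ x - y) ** 2)
```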
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time
Given a matrix $M \in \mathbb{R}^{m \times n}$, the low rank matrix completion
problem asks us to find a rank-$k$ approximation of $M$ as $UV^\top$, for $U \in \mathbb{R}^{m \times k}$ and $V \in \mathbb{R}^{n \times k}$, by only observing a
few entries specified by a set $\Omega \subseteq [m] \times [n]$. In
particular, we examine an approach that is widely used in practice -- the
alternating minimization framework. Jain, Netrapalli and Sanghavi~\cite{jns13}
showed that if $M$ has incoherent rows and columns, then alternating
minimization provably recovers the matrix by observing a number of entries nearly linear
in $n$. While the sample complexity has been subsequently
improved~\cite{glz17}, alternating minimization steps are required to be
computed exactly. This hinders the development of more efficient algorithms and
fails to depict the practical implementation of alternating minimization, where
the updates are usually performed approximately in favor of efficiency.
In this paper, we take a major step towards a more efficient and error-robust
alternating minimization framework. To this end, we develop an analytical
framework for alternating minimization that can tolerate a moderate amount of
error caused by approximate updates. Moreover, our algorithm runs in time
$\widetilde{O}(|\Omega| k)$, which is nearly linear in the time to verify the
solution while preserving the sample complexity. This improves upon all prior
known alternating minimization approaches, which require $\widetilde{O}(|\Omega| k^2)$ time.
Comment: Improves the runtime from $O(|\Omega| k^2)$ to $O(|\Omega| k)$
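A minimal sketch (ours, with exact rather than approximate inner updates) of the alternating minimization framework being analyzed:

```python
# Minimal sketch (ours) of plain alternating minimization for matrix completion:
# alternately refit U and V by least squares on the observed entries. The
# paper's contribution is an analysis allowing these inner solves to be
# approximate while keeping nearly-linear total time; this baseline solves
# them exactly for clarity.
import numpy as np

rng = np.random.default_rng(7)
m, n, k = 60, 50, 3
M = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # rank-k ground truth
mask = rng.random((m, n)) < 0.4                                # observed set Omega

U = rng.standard_normal((m, k))
V = rng.standard_normal((n, k))
for _ in range(30):
    for i in range(m):        # row i of U from the observed entries in row i of M
        obs = mask[i]
        U[i] = np.linalg.lstsq(V[obs], M[i, obs], rcond=None)[0]
    for j in range(n):        # row j of V, symmetrically
        obs = mask[:, j]
        V[j] = np.linalg.lstsq(U[obs], M[obs, j], rcond=None)[0]

print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))  # small relative error
```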