Online Multistage Subset Maximization Problems
Numerous combinatorial optimization problems (knapsack, maximum-weight matching, etc.) can be expressed as subset maximization problems: One is given a ground set N={1,...,n}, a collection F subseteq 2^N of subsets thereof such that the empty set is in F, and an objective (profit) function p: F -> R_+. The task is to choose a set S in F that maximizes p(S). We consider the multistage version (Eisenstat et al., Gupta et al., both ICALP 2014) of such problems: The profit function p_t (and possibly the set of feasible solutions F_t) may change over time. Since in many applications changing the solution is costly, the task becomes to find a sequence of solutions that optimizes the trade-off between good per-time solutions and stable solutions taking into account an additional similarity bonus. As similarity measure for two consecutive solutions, we consider either the size of the intersection of the two solutions or the difference of n and the Hamming distance between the two characteristic vectors.
We study multistage subset maximization problems in the online setting, that is, p_t (along with possibly F_t) only arrive one by one and, upon such an arrival, the online algorithm has to output the corresponding solution without knowledge of the future.
We develop general techniques for online multistage subset maximization and thereby characterize those models (given by the type of data evolution and the type of similarity measure) that admit a constant-competitive online algorithm. When no constant competitive ratio is possible, we employ lookahead to circumvent this issue. When a constant competitive ratio is possible, we provide almost matching lower and upper bounds on the best achievable one.
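The two similarity measures named in the abstract above can be illustrated with a minimal Python sketch. It assumes that the value of a sequence of solutions is the sum of the per-stage profits plus the similarity bonus between consecutive solutions; the abstract does not spell out the exact objective, so this combination, the additive profit functions, and all names (`additive_profit`, `multistage_value`, and so on) are illustrative assumptions rather than the authors' definitions.

```python
from typing import Callable, FrozenSet, Mapping, Sequence

Solution = FrozenSet[int]


def additive_profit(weights: Mapping[int, float]) -> Callable[[Solution], float]:
    """Hypothetical profit function p_t: the sum of per-element weights."""
    return lambda s: sum(weights[i] for i in s)


def intersection_bonus(prev: Solution, cur: Solution, n: int) -> int:
    """Similarity as the size of the intersection of two consecutive solutions."""
    return len(prev & cur)


def hamming_bonus(prev: Solution, cur: Solution, n: int) -> int:
    """Similarity as n minus the Hamming distance of the characteristic vectors."""
    return n - len(prev ^ cur)


def multistage_value(
    solutions: Sequence[Solution],
    profits: Sequence[Callable[[Solution], float]],
    n: int,
    bonus: Callable[[Solution, Solution, int], int],
) -> float:
    """Assumed objective: per-stage profits plus bonuses between neighbours."""
    stage_profit = sum(p(s) for p, s in zip(profits, solutions))
    stability = sum(bonus(a, b, n) for a, b in zip(solutions, solutions[1:]))
    return stage_profit + stability


# Two stages over the ground set {1, 2, 3, 4}; the weights are made up.
n = 4
profits = [
    additive_profit({1: 3, 2: 1, 3: 0, 4: 0}),
    additive_profit({1: 0, 2: 1, 3: 3, 4: 0}),
]
solutions = [frozenset({1, 2}), frozenset({2, 3})]

value_intersection = multistage_value(solutions, profits, n, intersection_bonus)
value_hamming = multistage_value(solutions, profits, n, hamming_bonus)
```

In this toy instance both stages earn profit 4. The solutions share one element, so the intersection bonus is 1, and they differ in two positions, so the Hamming-based bonus is 4 - 2 = 2.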
An Improved Differential Evolution Algorithm for Data Stream Clustering
Several algorithms have been implemented for clustering data streams. Most of them require the user to fix the number of clusters (K) in advance based on the input data, and K then stays fixed throughout the clustering process, which makes choosing K a recurring difficulty in stream clustering. In this paper, we propose an efficient approach to data stream clustering that adopts an Improved Differential Evolution (IDE) algorithm. IDE is a fast, robust, and efficient global optimization approach for automatic clustering. Our approach additionally applies an entropy-based method to detect concept drift in the data stream and thereby update the clustering procedure online. We compare the proposed method with a Genetic Algorithm and identify it as an efficient optimization algorithm. In our evaluation, the proposed technique achieves an accuracy of 92.29%, a precision of 86.96%, a recall of 90.30%, and an F-measure of 88.60%.
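For readers unfamiliar with differential evolution, the following is a minimal Python sketch of the classic DE/rand/1/bin scheme on a toy objective. It is not the IDE variant from the abstract above, whose modifications, clustering encoding, and entropy-based drift detection are not described there; the function name, parameter defaults, and the sphere objective are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def differential_evolution(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 20,
    f: float = 0.8,      # differential weight (assumed default)
    cr: float = 0.9,     # crossover probability (assumed default)
    generations: int = 100,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Classic DE/rand/1/bin minimizer; needs pop_size >= 4."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    lo, hi = bounds[d]
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(min(max(v, lo), hi))  # clip to the bounds
                else:
                    trial.append(pop[i][d])
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            # Replacement happens in place, within the current generation.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x: Sequence[float]) -> float:
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a clustering setting each individual would encode candidate cluster centres and the objective would be a cluster-validity score; that encoding is left out here because the abstract does not specify it.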
Process Optimization: A Six Sigma DMAIC Approach
Presented online for the 2016 Research and Creative Achievement Week. Blog website: http://blog.ecu.edu/sites/davidkurgatt/
Nap fabrics used in paint roller covers are required to meet nap height specifications,
measured as the overall fabric thickness from its backing, to meet substrate paint application
standards. Consistency in the heat setting process is key to achieving customer specifications for
nap fabrics. Excessive shrinkage or variation in shrinkage during heat setting will lead to nonconforming
nap fabric heights and costly adjustments, tweaking for quality, or downgrading
in downstream finishing processes.
An exploratory analysis in the measure phase revealed a significant difference in yarn
shrinkage levels between suppliers. The effect of supplier and heat setting temperature levels on
yarn shrinkage was statistically significant, F(2,42) = 19.78, P = .000. These exploratory results
reveal evidence of a significant vendor contribution to process variability. This paper
will discuss the Six Sigma DMAIC tools applied in this project and highlight results and
opportunities for process optimization, improvement and controls applied to meet expected
annualized savings.
Data-driven satisficing measure and ranking
We propose a computational framework for real-time risk assessment and
prioritization of random outcomes without prior information on probability
distributions. The basic model is built on the satisficing measure (SM), which
yields a single index for risk comparison. Since SM is a dual representation
for a family of risk measures, we consider problems constrained by general
convex risk measures and specifically by Conditional value-at-risk. Starting
from offline optimization, we apply the sample average approximation technique and
argue the convergence rate and validation of optimal solutions. In the online
stochastic optimization case, we develop primal-dual stochastic approximation
algorithms for general risk constrained problems and derive their
regret bounds. For both offline and online cases, we illustrate the
relationship between risk ranking accuracy and sample size (or iterations).
Comment: 26 pages, 6 figures
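As a small illustration of the Conditional value-at-risk constraint and the sample average approximation mentioned in the abstract above, the Python sketch below estimates CVaR from samples through the standard Rockafellar-Uryasev formula, CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }, with the expectation replaced by a sample mean. Treating the samples as losses (larger is worse) and the function name `empirical_cvar` are assumptions; this is not the paper's primal-dual algorithm.

```python
from typing import Sequence


def empirical_cvar(losses: Sequence[float], alpha: float) -> float:
    """Sample-average estimate of CVaR at level alpha for loss samples.

    The sample objective t + mean((x - t)^+) / (1 - alpha) is convex and
    piecewise linear in t with breakpoints at the samples, so its minimum
    is attained at one of the sample values.
    """
    if not losses:
        raise ValueError("need at least one sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def sample_objective(t: float) -> float:
        excess = sum(max(x - t, 0.0) for x in losses)
        return t + excess / ((1.0 - alpha) * n)

    return min(sample_objective(t) for t in losses)


samples = [1.0, 2.0, 3.0, 4.0]
cvar_half = empirical_cvar(samples, 0.5)   # mean of the worst half: 3.5
cvar_zero = empirical_cvar(samples, 0.0)   # plain sample mean: 2.5
```

Evaluating the objective at every sample costs quadratic time in the sample size; a sorted pass would be faster, but the direct form keeps the link to the formula visible.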
Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements
Background subtraction has been a fundamental and widely studied task in
video analysis, with a wide range of applications in video surveillance,
teleconferencing and 3D modeling. Recently, motivated by compressive imaging,
background subtraction from compressive measurements (BSCM) is becoming an
active research task in video surveillance. In this paper, we propose a novel
tensor-based robust PCA (TenRPCA) approach for BSCM by decomposing video frames
into backgrounds with spatio-temporal correlations and foregrounds with
spatio-temporal continuity in a tensor framework. In this approach, we use 3D
total variation (TV) to enhance the spatio-temporal continuity of foregrounds,
and Tucker decomposition to model the spatio-temporal correlations of video
background. Based on this idea, we design a basic tensor RPCA model over the
video frames, dubbed the holistic TenRPCA model (H-TenRPCA). To characterize
the correlations among the groups of similar 3D patches of video background, we
further design a patch-group-based tensor RPCA model (PG-TenRPCA) by joint
tensor Tucker decompositions of 3D patch groups for modeling the video
background. Efficient algorithms using alternating direction method of
multipliers (ADMM) are developed to solve the proposed models. Extensive
experiments on simulated and real-world videos demonstrate the superiority of
the proposed approaches over the existing state-of-the-art approaches.
Comment: To appear in IEEE TI
Online Optimization Methods for the Quantification Problem
The estimation of class prevalence, i.e., the fraction of a population that
belongs to a certain class, is a very useful tool in data analytics and
learning, and finds applications in many domains such as sentiment analysis,
epidemiology, etc. For example, in sentiment analysis, the objective is often
not to estimate whether a specific text conveys a positive or a negative
sentiment, but rather estimate the overall distribution of positive and
negative sentiments during an event window. A popular way of performing the
above task, often dubbed quantification, is to use supervised learning to train
a prevalence estimator from labeled data.
Contemporary literature cites several performance measures used to assess
the success of such prevalence estimators. In this paper we propose the first
online stochastic algorithms for directly optimizing these
quantification-specific performance measures. We also provide algorithms that
optimize hybrid performance measures that seek to balance quantification and
classification performance. Our algorithms present a significant advancement in
the theory of multivariate optimization and we show, by a rigorous theoretical
analysis, that they exhibit optimal convergence. We also report extensive
experiments on benchmark and real data sets which demonstrate that our methods
significantly outperform existing optimization techniques used for these
performance measures.
Comment: 26 pages, 6 figures. A short version of this manuscript will appear
in the proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery
and Data Mining, KDD 201
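To make the quantification task in the abstract above concrete, here is a minimal Python sketch of the simplest baseline, often called classify and count: estimate the prevalence of the positive class as the fraction of positive predictions, then score the estimate against the true prevalence. The abstract does not name its performance measures, so the use of Kullback-Leibler divergence here is an assumption, as are the function names and the made-up labels; this is not the paper's online stochastic algorithm.

```python
import math
from typing import Sequence


def prevalence(labels: Sequence[int]) -> float:
    """Fraction of items labelled 1 (the positive class)."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(1 for y in labels if y == 1) / len(labels)


def binary_kl(p_true: float, p_est: float, eps: float = 1e-12) -> float:
    """KL divergence between two binary class distributions.

    Both prevalences are clipped to [eps, 1 - eps] so that the logarithms
    stay finite when a class is predicted to be absent.
    """
    p = min(max(p_true, eps), 1.0 - eps)
    q = min(max(p_est, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


# Made-up labels: the classifier over-predicts the positive class.
true_labels = [1, 0, 1, 0]
predicted_labels = [1, 0, 1, 1]

true_prev = prevalence(true_labels)        # 0.5
est_prev = prevalence(predicted_labels)    # 0.75
error = binary_kl(true_prev, est_prev)
```

The example also shows why quantification differs from classification: only one of four items is misclassified, yet the prevalence estimate is off by 0.25.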