
    Online Multistage Subset Maximization Problems

    Numerous combinatorial optimization problems (knapsack, maximum-weight matching, etc.) can be expressed as subset maximization problems: One is given a ground set N={1,...,n}, a collection F subseteq 2^N of subsets thereof such that the empty set is in F, and an objective (profit) function p: F -> R_+. The task is to choose a set S in F that maximizes p(S). We consider the multistage version (Eisenstat et al., Gupta et al., both ICALP 2014) of such problems: The profit function p_t (and possibly the set of feasible solutions F_t) may change over time. Since in many applications changing the solution is costly, the task becomes to find a sequence of solutions that optimizes the trade-off between good per-time solutions and stable solutions, taking into account an additional similarity bonus. As a similarity measure for two consecutive solutions, we consider either the size of the intersection of the two solutions or the difference of n and the Hamming distance between the two characteristic vectors. We study multistage subset maximization problems in the online setting, that is, p_t (along with possibly F_t) only arrive one by one and, upon such an arrival, the online algorithm has to output the corresponding solution without knowledge of the future. We develop general techniques for online multistage subset maximization and thereby characterize those models (given by the type of data evolution and the type of similarity measure) that admit a constant-competitive online algorithm. When no constant competitive ratio is possible, we employ lookahead to circumvent this issue. When a constant competitive ratio is possible, we provide almost matching lower and upper bounds on the best achievable one.
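
The two similarity measures named in this abstract can be made concrete with a small sketch. It assumes, purely for illustration, that the overall objective is the sum of the per-stage profits plus an unweighted similarity bonus between consecutive solutions; the abstract does not state the exact weighting, and the function names below are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence

Solution = FrozenSet[int]


def intersection_bonus(a: Solution, b: Solution, n: int) -> int:
    """Similarity as the size of the intersection of two consecutive solutions."""
    return len(a & b)


def hamming_bonus(a: Solution, b: Solution, n: int) -> int:
    """Similarity as n minus the Hamming distance of the characteristic vectors.

    The Hamming distance equals the size of the symmetric difference.
    """
    return n - len(a ^ b)


def multistage_value(
    solutions: Sequence[Solution],
    profits: Sequence[Callable[[Solution], float]],
    n: int,
    bonus: Callable[[Solution, Solution, int], int],
) -> float:
    """Per-stage profit plus similarity bonus (assumed unweighted sum)."""
    stage_profit = sum(p(s) for p, s in zip(profits, solutions))
    stability = sum(bonus(a, b, n) for a, b in zip(solutions, solutions[1:]))
    return stage_profit + stability


if __name__ == "__main__":
    n = 4
    sols = [frozenset({1, 2}), frozenset({2, 3})]
    cardinality = [len, len]  # toy profit: number of chosen elements
    print(multistage_value(sols, cardinality, n, intersection_bonus))
    print(multistage_value(sols, cardinality, n, hamming_bonus))
```

Note that the Hamming-based measure also rewards elements that are left out of both solutions, whereas the intersection-based measure only rewards elements kept in both.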

    An Improved Differential Evolution Algorithm for Data Stream Clustering

    A few algorithms have been implemented by researchers for clustering data streams. Most of these algorithms require the number of clusters (K) to be fixed by the user based on the input data and kept constant throughout the clustering process. Choosing K has therefore been a difficulty in stream clustering. In this paper, we propose an efficient approach for data stream clustering that adopts an Improved Differential Evolution (IDE) algorithm. The IDE algorithm is a fast, powerful and efficient global optimization approach for automatic clustering. In our proposed approach, we additionally apply an entropy-based method for detecting concept drift in the data stream and thereby updating the clustering procedure online. We compared our proposed method with a Genetic Algorithm and found it to be an efficient optimization algorithm. The proposed technique achieves an accuracy of 92.29%, a precision of 86.96%, a recall of 90.30% and an F-measure of 88.60%.
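
The reported figures can be sanity-checked with a short sketch. It assumes that "F-measure" here means the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not say which variant was used.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract, as fractions.
    f1 = f_measure(0.8696, 0.9030)
    print(f"F-measure: {100 * f1:.2f}%")
```

Under that assumption, the stated precision and recall are consistent with the stated F-measure after rounding to two decimals.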

    Process Optimization: A Six Sigma DMAIC Approach

    Presented online for the 2016 Research and Creative Achievement Week. Blog website: http://blog.ecu.edu/sites/davidkurgatt/ Nap fabrics used in paint roller covers are required to meet nap height specifications, measured as the overall fabric thickness from the backing, in order to meet substrate paint application standards. Consistency in the heat setting process is key to achieving customer specifications for nap fabrics. Excessive shrinkage or variation in shrinkage during heat setting leads to nonconforming nap fabric heights and to costly adjustments, tweaking for quality, or downgrading in downstream finishing processes. An exploratory analysis in the measure phase revealed a significant difference in yarn shrinkage levels between suppliers. The effect of supplier and heat setting temperature levels on yarn shrinkage was statistically significant, F(2, 42) = 19.78, P = .000. These exploratory results reveal evidence of a significant vendor contribution to process variability. This paper discusses the Six Sigma DMAIC tools applied in this project and highlights results and opportunities for process optimization, improvement and controls applied to meet expected annualized savings.

    Data-driven satisficing measure and ranking

    We propose a computational framework for real-time risk assessment and prioritizing of random outcomes without prior information on probability distributions. The basic model is built on the satisficing measure (SM), which yields a single index for risk comparison. Since SM is a dual representation for a family of risk measures, we consider problems constrained by general convex risk measures and specifically by conditional value-at-risk. Starting from offline optimization, we apply the sample average approximation technique and establish the convergence rate and validation of optimal solutions. In the online stochastic optimization case, we develop primal-dual stochastic approximation algorithms for general risk-constrained problems and derive their regret bounds. For both offline and online cases, we illustrate the relationship between risk ranking accuracy and sample size (or iterations). Comment: 26 Pages, 6 Figure
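
As a rough illustration of the sample average approximation idea mentioned in this abstract, the sketch below estimates conditional value-at-risk from samples using the standard Rockafellar-Uryasev minimization form and ranks outcomes by it. This is not the paper's satisficing measure or its algorithms; the function name `empirical_cvar` and the toy data are assumptions made for the example.

```python
from typing import Dict, List, Sequence


def empirical_cvar(losses: Sequence[float], alpha: float) -> float:
    """Sample-average CVaR at level alpha.

    Uses min over t of  t + mean(max(loss - t, 0)) / (1 - alpha).
    The objective is piecewise linear and convex in t, so a minimizer
    lies at one of the sample points and we can simply try them all.
    """
    if not losses:
        raise ValueError("need at least one sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def objective(t: float) -> float:
        excess = sum(max(x - t, 0.0) for x in losses)
        return t + excess / ((1.0 - alpha) * n)

    return min(objective(t) for t in losses)


def rank_by_cvar(samples: Dict[str, List[float]], alpha: float) -> List[str]:
    """Order outcome names from least to most risky by empirical CVaR."""
    return sorted(samples, key=lambda name: empirical_cvar(samples[name], alpha))


if __name__ == "__main__":
    data = {"A": [1.0, 2.0, 3.0, 4.0], "B": [2.0, 2.0, 2.0, 3.0]}
    print(rank_by_cvar(data, alpha=0.5))
```

With more samples per outcome the empirical estimate stabilizes, which is the kind of accuracy-versus-sample-size relationship the abstract refers to.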

    Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements

    Background subtraction has been a fundamental and widely studied task in video analysis, with a wide range of applications in video surveillance, teleconferencing and 3D modeling. Recently, motivated by compressive imaging, background subtraction from compressive measurements (BSCM) is becoming an active research task in video surveillance. In this paper, we propose a novel tensor-based robust PCA (TenRPCA) approach for BSCM by decomposing video frames into backgrounds with spatio-temporal correlations and foregrounds with spatio-temporal continuity in a tensor framework. In this approach, we use 3D total variation (TV) to enhance the spatio-temporal continuity of foregrounds, and Tucker decomposition to model the spatio-temporal correlations of the video background. Based on this idea, we design a basic tensor RPCA model over the video frames, dubbed the holistic TenRPCA model (H-TenRPCA). To characterize the correlations among the groups of similar 3D patches of video background, we further design a patch-group-based tensor RPCA model (PG-TenRPCA) by joint tensor Tucker decompositions of 3D patch groups for modeling the video background. Efficient algorithms using the alternating direction method of multipliers (ADMM) are developed to solve the proposed models. Extensive experiments on simulated and real-world videos demonstrate the superiority of the proposed approaches over the existing state-of-the-art approaches. Comment: To appear in IEEE TI

    Online Optimization Methods for the Quantification Problem

    The estimation of class prevalence, i.e., the fraction of a population that belongs to a certain class, is a very useful tool in data analytics and learning, and finds applications in many domains such as sentiment analysis, epidemiology, etc. For example, in sentiment analysis, the objective is often not to estimate whether a specific text conveys a positive or a negative sentiment, but rather to estimate the overall distribution of positive and negative sentiments during an event window. A popular way of performing the above task, often dubbed quantification, is to use supervised learning to train a prevalence estimator from labeled data. Contemporary literature cites several performance measures used to measure the success of such prevalence estimators. In this paper we propose the first online stochastic algorithms for directly optimizing these quantification-specific performance measures. We also provide algorithms that optimize hybrid performance measures that seek to balance quantification and classification performance. Our algorithms present a significant advancement in the theory of multivariate optimization and we show, by a rigorous theoretical analysis, that they exhibit optimal convergence. We also report extensive experiments on benchmark and real data sets which demonstrate that our methods significantly outperform existing optimization techniques used for these performance measures. Comment: 26 pages, 6 figures. A short version of this manuscript will appear in the proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 201
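
To make the quantification task concrete, here is a minimal sketch of the simple "classify and count" baseline together with an absolute-error measure on prevalence. It is not one of the online algorithms proposed in the paper; the threshold, function names and toy scores are assumptions for illustration only.

```python
from typing import Sequence


def prevalence(labels: Sequence[int]) -> float:
    """Fraction of items labeled positive (1)."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(1 for y in labels if y == 1) / len(labels)


def classify_and_count(scores: Sequence[float], threshold: float = 0.5) -> float:
    """Estimate prevalence by thresholding classifier scores and counting."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return prevalence(predictions)


def absolute_error(estimated: float, true: float) -> float:
    """Quantification error: distance between estimated and true prevalence."""
    return abs(estimated - true)


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.7, 0.4]   # hypothetical classifier outputs
    truth = [1, 0, 0, 0]            # hypothetical ground-truth labels
    estimate = classify_and_count(scores)
    print(estimate, absolute_error(estimate, prevalence(truth)))
```

The example shows why quantification differs from classification: only the aggregate count matters, so a classifier's individual errors hurt the estimate only to the extent that false positives and false negatives fail to cancel out.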