
    A new block exact fast LMS/newton adaptive filtering algorithm

    Presented at the 47th Midwest Symposium on Circuits and Systems, Salt Lake City, Utah, USA, 25-28 July 2004. This paper proposes a new block exact fast LMS/Newton algorithm for adaptive filtering. It is obtained by exploiting the shifting property of the whitened input of the fast LMS/Newton algorithm so that a block exact update can be carried out in the LMS part of the algorithm. The proposed algorithm has significantly lower arithmetic complexity than, but exact arithmetic equivalence to, the fast LMS/Newton algorithm. Since a short block length is allowed, the processing delay introduced is not excessively large, as it is in conventional block filtering generalizations. Implementation issues and experimental results are given to illustrate the principle and efficiency of the proposed algorithm.
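
    To fix ideas, the sketch below shows the sample-by-sample LMS recursion that a block exact scheme reorganizes into block form. It is a minimal baseline only, without the Newton whitening step or the block exact rearrangement described in the paper, and all parameter names are illustrative.

```python
import numpy as np

def lms_filter(x, d, num_taps=8, mu=0.01):
    """Plain LMS baseline: adapt tap weights w so that w . u[n] tracks d[n].

    Minimal sketch; the paper's algorithm whitens the input (Newton step)
    and updates in exact block form, neither of which is shown here.
    """
    w = np.zeros(num_taps)
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]  # most recent inputs, newest first
        y[n] = w @ u                         # filter output
        e[n] = d[n] - y[n]                   # estimation error
        w += mu * e[n] * u                   # stochastic-gradient weight update
    return y, e, w
```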

    Characterization and computation of restless bandit marginal productivity indices

    The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e., binary-action (active/passive) (semi-) Markov decision processes. A growing body of evidence shows that such a rule is nearly optimal in a wide variety of applications, which raises the need to efficiently compute the Whittle index and more general marginal productivity index (MPI) extensions in large-scale models. For such a purpose, this paper extends to restless bandits the parametric linear programming (LP) approach deployed in [J. Niño-Mora. A $(2/3)n^3$ fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J. Comp., in press], which yielded a fast Gittins-index algorithm. Yet the extension is not straightforward, as the MPI is only defined for the limited range of so-called indexable bandits, which motivates the quest for methods to establish indexability. This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in large-scale applications, presenting the following contributions: (i) a complete algorithmic characterization of indexability, for which two block implementations are given; and (ii) more importantly, new analytical conditions for indexability, termed LP-indexability, that leverage knowledge of the structure of optimal policies in particular models, under which the MPI is computed faster by the adaptive-greedy algorithm previously introduced by the author under the more stringent PCL-indexability conditions, for which a new fast-pivoting block implementation is given. The paper further reports on a computational study, measuring the runtime performance of the algorithms, and assessing by a simulation study the high prevalence of indexability and PCL-indexability.
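
    The overall shape of the adaptive-greedy scheme mentioned above can be sketched as follows. Here `marginal_rate` is a hypothetical placeholder for the paper's fast-pivoting computation of marginal productivity rates, so this is a structural outline under stated assumptions rather than the paper's algorithm.

```python
def adaptive_greedy_indices(states, marginal_rate):
    """Schematic adaptive-greedy MPI computation (hypothetical interface).

    Starting from the empty active set, repeatedly add the state with the
    largest marginal productivity rate given the current active set; the
    rate at which a state enters is recorded as its index.  marginal_rate(S, s)
    stands in for the paper's fast-pivoting steps.
    """
    S, index = set(), {}
    remaining = set(states)
    while remaining:
        s_star = max(remaining, key=lambda s: marginal_rate(S, s))
        index[s_star] = marginal_rate(S, s_star)  # index = rate at entry
        S.add(s_star)
        remaining.remove(s_star)
    return index
```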

    Classification-Based Adaptive Search Algorithm for Video Motion Estimation

    A video sequence consists of a series of frames. In order to compress the video for efficient storage and transmission, the temporal redundancy among adjacent frames must be exploited. A frame is selected as the reference frame, and subsequent frames are predicted from it using a technique known as motion estimation. Real videos contain a mixture of slow and fast motion content. Among block matching motion estimation algorithms, the full search algorithm is known for its superior performance over other matching techniques. However, this method is computationally very intensive. Several fast block matching algorithms (FBMAs) have been proposed in the literature with the aim of reducing computational cost while maintaining the desired quality, but all of these methods are considered sub-optimal. No fixed fast block matching algorithm can efficiently remove the temporal redundancy of video sequences with wide-ranging motion content. An adaptive fast block matching algorithm, called classification-based adaptive search (CBAS), is therefore proposed. A Bayes classifier is applied to classify motion into slow and fast categories, and an appropriate search strategy is applied for each class; the algorithm switches between different search patterns according to the motion content of the video frames. The proposed technique outperforms conventional stand-alone fast block matching methods in terms of both peak signal-to-noise ratio (PSNR) and computational complexity. In addition, a new hierarchical method for detecting and classifying shot boundaries in video sequences is proposed, based on information theoretic classification (ITC). ITC relies on the likelihood of class-label transmission from a data point to the data points in its vicinity. ITC focuses on maximizing the global transmission of true class labels and classifies frames into cuts and non-cuts. Applying the same rule, the non-cut frames are further classified into two categories: arbitrary shot frames and gradual transition frames. CBAS is applied on top of the proposed shot detection method to handle camera or object motion. Experimental evidence demonstrates that the method can detect shot breaks with high accuracy.
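
    A minimal sketch of the slow/fast decision is given below, assuming a two-class Gaussian Bayes rule on a scalar motion feature (for instance the SAD of the zero-motion candidate). The feature choice, class statistics, and pattern names are illustrative assumptions, not the paper's trained classifier.

```python
import numpy as np

def classify_motion(feature, slow_stats, fast_stats, priors=(0.5, 0.5)):
    """Toy two-class Gaussian Bayes rule for slow/fast motion.

    slow_stats and fast_stats are (mean, variance) pairs that would be
    estimated offline from training blocks; this is a hypothetical
    stand-in for the paper's Bayes classifier.
    """
    def loglik(x, mean, var):
        return -0.5 * ((x - mean) ** 2 / var + np.log(2 * np.pi * var))
    score_slow = loglik(feature, *slow_stats) + np.log(priors[0])
    score_fast = loglik(feature, *fast_stats) + np.log(priors[1])
    return "slow" if score_slow >= score_fast else "fast"

# A "slow" block would then be searched with a compact pattern (e.g. a
# small diamond) and a "fast" block with a wider pattern (e.g. a large
# diamond), matching the switch-between-patterns idea in the abstract.
```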

    Backward adaptive pixel-based fast predictive motion estimation


    Center of Mass-Based Adaptive Fast Block Motion Estimation

    This work presents an efficient adaptive algorithm based on the center of mass (CEM) for fast block motion estimation. Binary transform, subsampling, and horizontal/vertical projection techniques are also proposed. As the conventional CEM calculation is computationally intensive, binary transform and subsampling approaches are proposed to simplify it; the binary transform center of mass (BITCEM) is then derived. BITCEM motion types are classified by the percentage of (0, 0) BITCEM motion vectors. Adaptive search patterns are allocated according to the BITCEM moving direction and the BITCEM motion type. Moreover, the BITCEM motion vector is utilized as the initial search point for near-still or slow BITCEM motion types. To support variable block sizes, the horizontal/vertical projections of a binary-transformed macroblock are utilized to determine whether the block requires segmentation. Experimental results indicate that the proposed algorithm outperforms five conventional algorithms, namely three-step search (TSS), new three-step search (N3SS), four-step search (4SS), block-based gradient descent search (BBGDS), and diamond search (DS), in terms of speed or picture quality for eight benchmark sequences.
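
    The core BITCEM computation can be sketched as follows. Binarizing against the block mean is one plausible binary transform (the paper's exact transform may differ), and the centroid of the resulting 0/1 map is cheap to compute because it needs no multiplications by pixel intensities.

```python
import numpy as np

def bitcem(block):
    """Center of mass of a binary-transformed block (illustrative).

    The block is binarized against its mean and the centroid of the 0/1
    map is returned.  Comparing the BITCEMs of the current and reference
    blocks then gives a cheap motion hint, as used to steer the adaptive
    search patterns described in the abstract.
    """
    b = (block >= block.mean()).astype(np.float64)   # binary transform
    total = b.sum()
    if total == 0:                                   # uniform block: centre
        return (block.shape[0] - 1) / 2, (block.shape[1] - 1) / 2
    rows, cols = np.indices(block.shape)
    return (rows * b).sum() / total, (cols * b).sum() / total
```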

    Semi-hierarchical based motion estimation algorithm for the dirac video encoder

    Having fast and efficient motion estimation is crucial in today's advanced video compression techniques, since it determines both the compression efficiency and the complexity of a video encoder. In this paper, a method which we call semi-hierarchical motion estimation is proposed for the Dirac video encoder. By applying fully hierarchical motion estimation only to a certain type of inter frame, the complexity of the motion estimation can be greatly reduced while maintaining the desired accuracy. The experimental results show that the proposed algorithm gives a two- to three-fold reduction in the number of SAD calculations compared with the existing motion estimation algorithm of Dirac, for the same motion estimation accuracy, compression efficiency, and PSNR performance. Moreover, depending upon the complexity of the test sequence, the proposed algorithm can increase or decrease the search range in order to maintain the accuracy of the motion estimation at a given level.
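
    A coarse-to-fine search skeleton of the kind being pruned here is sketched below. `local_search` is a hypothetical block-matching routine, the subsampled pyramid is a crude stand-in for a proper downscaling filter, and the semi-hierarchical idea corresponds to calling this with the full `levels` only for selected inter-frame types and with `levels=1` otherwise; none of this is Dirac's actual code.

```python
import numpy as np

def hierarchical_mv(cur, ref, block_xy, levels, local_search):
    """Coarse-to-fine motion search skeleton (illustrative).

    local_search(cur, ref, block_xy, centre, radius) is assumed to return
    a motion vector (np.array of two ints) at the given resolution.
    """
    mv = np.array([0, 0])
    for lvl in reversed(range(levels)):
        s = 2 ** lvl
        # Search a subsampled frame pair around the prediction propagated
        # down from the coarser level, then re-express at full resolution.
        mv = s * local_search(cur[::s, ::s], ref[::s, ::s],
                              (block_xy[0] // s, block_xy[1] // s),
                              centre=mv // s, radius=2)
    return mv
```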

    Solution Path Clustering with Adaptive Concave Penalty

    Fast accumulation of large amounts of complex data has created a need for more sophisticated statistical methodologies to discover interesting patterns and better extract information from these data. The large scale of the data often results in challenging high-dimensional estimation problems where only a minority of the data shows specific grouping patterns. To address these emerging challenges, we develop a new clustering methodology that introduces the idea of a regularization path into unsupervised learning. A regularization path for a clustering problem is created by varying the degree of the sparsity constraint imposed on the differences between objects via the minimax concave penalty with adaptive tuning parameters. Instead of providing a single solution represented by a cluster assignment for each object, the method produces a short sequence of solutions that determines not only the cluster assignment but also a corresponding number of clusters for each solution. The optimization of the penalized loss function is carried out through an MM algorithm with block coordinate descent. The advantages of this clustering algorithm compared to other existing methods are as follows: it does not require the number of clusters as input; it is capable of simultaneously separating out irrelevant or noisy observations that show no grouping pattern, which can greatly improve data interpretation; and it is a general methodology that can be applied to many clustering problems. We test this method on various simulated datasets and on gene expression data, where it shows better or competitive performance compared with several existing clustering methods.
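
    For reference, the minimax concave penalty (MCP) at the heart of the method has a standard closed form, shown below. The function itself is standard (Zhang, 2010); how the paper adaptively tunes (lam, gamma) along the path is not reproduced here.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty (MCP).

    p(t) = lam*|t| - t^2/(2*gamma)   for |t| <= gamma*lam
         = gamma*lam^2 / 2           otherwise.
    Applied to pairwise differences between objects, it shrinks nearby
    pairs together while charging only a constant penalty to far-apart
    pairs, so those are not forced into the same cluster.
    """
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2 * gamma),
                    gamma * lam ** 2 / 2)
```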

    Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

    In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations with many-core processors, we here aim at relying entirely on that processor type. As the main contribution, we introduce the parallel algorithmic patterns necessary to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to many-core hardware. Crucial ingredients are space-filling curves, parallel tree traversal, and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the authors' knowledge, the first entirely GPU-based open-source $\mathcal{H}$ matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.
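
    As an illustration of the space-filling-curve ingredient, the sketch below computes a 2-D Z-order (Morton) key by bit interleaving. Z-order is one common choice of space-filling curve (the paper may use a different curve or dimensionality); sorting geometric entities by such a key groups spatially nearby points, which makes the parallel cluster-tree construction amenable to GPUs.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of 2-D integer grid coordinates into a Z-order key.

    Points sorted by this key follow the Z-order space-filling curve, so
    contiguous key ranges correspond to spatially coherent point sets.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bit -> even bit position
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bit -> odd bit position
    return key
```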