3,763 research outputs found

    Nonlocal Myriad Filters for Cauchy Noise Removal

    Full text link
    The contribution of this paper is two-fold. First, we introduce a generalized myriad filter, a method to compute the joint maximum likelihood estimator of the location and scale parameters of the Cauchy distribution; estimating only the location parameter is known as the myriad filter. We propose an efficient algorithm to compute the generalized myriad filter and prove its convergence. Special cases of this algorithm yield the classical myriad filter and, respectively, an algorithm for estimating only the scale parameter. Based on an asymptotic analysis, we develop a second, even faster generalized myriad filtering technique. Second, we use our new approaches within a nonlocal, fully unsupervised method to denoise images corrupted by Cauchy noise. Special attention is paid to the determination of similar patches in noisy images. Numerical examples demonstrate the excellent performance of our algorithms, which moreover have the advantage of being robust with respect to the parameter choice.
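    The joint estimation problem at the heart of the generalized myriad filter can be pictured as follows. This is a minimal sketch, assuming the standard Cauchy negative log-likelihood and using a generic numerical optimizer rather than the paper's dedicated fixed-point algorithm; all function names and values are hypothetical.

```python
# Minimal sketch: joint ML estimation of Cauchy location/scale, the problem the
# generalized myriad filter addresses. Uses a generic optimizer, not the paper's
# dedicated iteration.
import numpy as np
from scipy.optimize import minimize

def cauchy_neg_log_likelihood(params, x):
    """Negative log-likelihood of samples x under Cauchy(delta, gamma), up to constants."""
    delta, log_gamma = params              # optimize log(gamma) to keep gamma > 0
    gamma = np.exp(log_gamma)
    return -x.size * np.log(gamma) + np.sum(np.log(gamma**2 + (x - delta)**2))

def generalized_myriad(x):
    """Jointly estimate location and scale of a Cauchy-distributed sample (illustrative)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    init = np.array([med, np.log(np.median(np.abs(x - med)) + 1e-12)])
    res = minimize(cauchy_neg_log_likelihood, init, args=(x,), method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])      # (location estimate, scale estimate)

# Example: estimate parameters from a patch of Cauchy-noise-corrupted pixel values.
rng = np.random.default_rng(0)
patch = 0.5 + 0.1 * rng.standard_cauchy(200)
print(generalized_myriad(patch))
```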

    Efficient Exact Maximum a Posteriori Computation for Bayesian SNP Genotyping in Polyploids

    Get PDF
    The problem of genotyping polyploids is extremely important for the creation of genetic maps and the assembly of complex plant genomes. Despite its significance, polyploid genotyping still remains largely unsolved and suffers from a lack of statistical formality. In this paper a graphical Bayesian model for SNP genotyping data is introduced. This model can infer genotypes even when the ploidy of the population is unknown. We also introduce an algorithm for finding the exact maximum a posteriori genotype configuration with this model. This algorithm is implemented in the freely available web-based software package SuperMASSA. We demonstrate the utility, efficiency, and flexibility of the model and algorithm by applying them to two different platforms, each of which is applied to a polyploid data set: Illumina GoldenGate data from potato and Sequenom MassARRAY data from sugarcane. Our method achieves state-of-the-art performance on both data sets and can be trivially adapted to use models that utilize prior information about any platform or species.
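    To make the MAP genotyping idea concrete, here is an illustrative sketch only, not SuperMASSA's actual graphical model: it assumes observed allele-intensity ratios cluster around g/m for allele dosage g under ploidy m with Gaussian noise and a uniform genotype prior, and all names and parameter values are invented for the example.

```python
# Toy MAP genotype calling for a polyploid SNP under a Gaussian ratio model
# (illustrative only; not SuperMASSA's model or algorithm).
import numpy as np

def map_genotypes(ratios, ploidy, sigma=0.05):
    """Return the MAP allele dosage for each sample's observed allele ratio."""
    ratios = np.asarray(ratios, dtype=float)
    dosages = np.arange(ploidy + 1)
    expected = dosages / ploidy                       # cluster centres g/m
    loglik = -0.5 * ((ratios[:, None] - expected[None, :]) / sigma) ** 2
    return dosages[np.argmax(loglik, axis=1)]         # uniform prior -> MAP = ML

def best_ploidy(ratios, candidate_ploidies=(2, 4, 6), sigma=0.05):
    """Pick the candidate ploidy whose MAP configuration best explains the data."""
    ratios = np.asarray(ratios, dtype=float)
    scores = {}
    for m in candidate_ploidies:
        g = map_genotypes(ratios, m, sigma)
        scores[m] = -0.5 * np.sum(((ratios - g / m) / sigma) ** 2)
    return max(scores, key=scores.get)

ratios = [0.02, 0.26, 0.24, 0.51, 0.73, 0.98, 0.49]
print(best_ploidy(ratios), map_genotypes(ratios, 4))
```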

    Online Modeling and Tuning of Parallel Stream Processing Systems

    Get PDF
    Writing performant computer programs is hard. Code for high performance applications is profiled, tweaked, and re-factored for months, specifically for the hardware on which it is to run. Consumer application code doesn't get the benefit of the endless massaging that high performance code receives, even though heterogeneous processor environments are beginning to resemble those in more performance-oriented arenas. This thesis offers a path to performant, parallel code (through stream processing) which is tuned online and automatically adapts to the environment it is given. This approach has the potential to reduce the tuning costs associated with high performance code and brings the benefit of performance tuning to consumer applications where otherwise it would be cost prohibitive. This thesis introduces a stream processing library and multiple techniques to enable its online modeling and tuning. Stream processing (also termed data-flow programming) is a compute paradigm that views an application as a set of logical kernels connected via communications links or streams. Stream processing is increasingly used by computational-x and x-informatics fields (e.g., biology, astrophysics) where the focus is on safe and fast parallelization of specific big-data applications. A major advantage of stream processing is that it enables parallelization without necessitating manual end-user management of the non-deterministic behavior often characteristic of more traditional parallel processing methods. Many big-data and high performance applications involve high throughput processing, necessitating the use of many parallel compute kernels on several compute cores. Optimizing the orchestration of kernels has been the focus of much theoretical and empirical modeling work. Purely theoretical parallel programming models can fail when the assumptions implicit within the model are mismatched with reality (i.e., the model is incorrectly applied). Often it is unclear if the assumptions are actually being met, even when verified under controlled conditions. Full empirical optimization solves this problem by extensively searching the range of likely configurations under native operating conditions. This, however, is expensive in both time and energy. For large, massively parallel systems, even deciding which modeling paradigm to use is often prohibitively expensive and unfortunately transient (with workload and hardware). In an ideal world, a parallel run-time would re-optimize an application continuously to match its environment, with little additional overhead. This work presents methods aimed at doing just that through low-overhead instrumentation, modeling, and optimization. Online optimization provides a good trade-off between static optimization and online heuristics. To enable online optimization, modeling decisions must be fast and relatively accurate. Online modeling and optimization of a stream processing system first requires the existence of a stream processing framework that is amenable to the intended type of dynamic manipulation. To fill this void, we developed the RaftLib C++ template library, which enables usage of the stream processing paradigm for C++ applications (it is the run-time that forms the basis of almost all the work within this dissertation). An application topology is specified by the user; however, almost everything else is optimizable by the run-time. RaftLib takes advantage of the knowledge gained during the design of several prior streaming languages (notably Auto-Pipe). 
The resultant framework enables online migration of tasks, auto-parallelization, online buffer reallocation, and other useful dynamic behaviors that were not available in many previous stream processing systems. Several benchmark applications have been designed to assess the performance gains of our approaches and to compare performance with other leading stream processing frameworks. Information is essential to any modeling task; to that end, a low-overhead instrumentation framework has been developed that is both dynamic and adaptive. Discovering a fast and relatively optimal configuration for a stream processing application often necessitates solving for buffer sizes within a finite-capacity queueing network. We show that a generalized gain/loss network flow model can bootstrap the process under certain conditions. Any modeling effort requires that a model be selected, often a highly manual task involving many expensive operations. This dissertation demonstrates that machine learning methods (such as a support vector machine) can successfully select models at run-time for a streaming application. The full set of approaches is incorporated into the open source RaftLib framework.
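As a rough illustration of the kernels-connected-by-streams model described above, the following Python sketch wires a few toy kernels together with queues. It is not RaftLib's C++ API, only a conceptual picture of the dataflow paradigm, and every name in it is made up for the example.

```python
# Conceptual sketch of stream processing: each kernel reads from an input queue
# and writes to an output queue, so a runtime is free to place kernels on
# different threads and tune buffer sizes. Not the RaftLib API.
import threading
import queue

SENTINEL = object()  # marks end of stream

def kernel(fn, inq, outq):
    """Wrap a pure function as a streaming kernel between two queues."""
    while True:
        item = inq.get()
        if item is SENTINEL:
            outq.put(SENTINEL)
            return
        outq.put(fn(item))

# Topology: source -> square -> add_one -> sink (user-specified; placement and
# buffering would be the runtime's job to optimize online).
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=kernel, args=(lambda x: x * x, q1, q2)),
    threading.Thread(target=kernel, args=(lambda x: x + 1, q2, q3)),
]
for t in threads:
    t.start()
for x in range(5):
    q1.put(x)
q1.put(SENTINEL)

results = []
while (item := q3.get()) is not SENTINEL:
    results.append(item)
for t in threads:
    t.join()
print(results)  # [1, 2, 5, 10, 17]
```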

    Sequential Detection of Linear Features in Two-Dimensional Random Fields

    Get PDF
    The detection of edges, lines, and other linear features in two-dimensional discrete images is a low-level processing step of fundamental importance in the automatic processing of such data. Many subsequent tasks in computer vision, pattern recognition, and image processing depend on the successful execution of this step. In this thesis, we will address one class of techniques for performing this task: sequential detection. Our aims are fourfold. First, we would like to discuss the use of sequential techniques as an attractive alternative to the somewhat better known methods of approaching this problem. Although several researchers have obtained significant results with sequential-type algorithms, the inherent benefits of a sequential approach appear to have gone largely unappreciated. Secondly, the sequential techniques reported to date appear somewhat lacking with respect to a theoretical foundation. Furthermore, the theory that is advanced incorporates rather severe restrictions on the types of images to which it applies, thus imposing a significant limitation on the generality of the method(s). We seek to advance a more general theory with minimal assumptions regarding the input image. A third goal is to utilize this newly developed theory to obtain quantitative assessments of the performance of the method. This important step, which depends on a computational theory, can answer such vital questions as: Are assumptions about the qualitative behavior of the method justified? How does signal-to-noise ratio impact its behavior? How fast is it? How accurate? The state of theoretical development of present techniques does not allow for this type of analysis. Finally, a fourth aim is to extend the earlier results to include correlated image data. Present sequential methods, as well as many non-sequential methods, assume that the image data is uncorrelated and therefore cannot make use of the mutual information between pixels in real-world images. We would like to extend the theory to incorporate correlated images and demonstrate the advantages gained by exploiting this mutual information. The topics to be discussed are organized in the following manner. We will first provide a rather general discussion of the problem of detecting intensity edges in images. The edge detection problem will serve as the prototypical problem of linear feature extraction for much of this thesis. It will later be shown that the detection of lines, ramp edges, texture edges, etc. can be handled in similar fashion to intensity edges, the only difference being the nature of the preprocessing operator used. The class of sequential techniques will then be introduced, with a view to emphasizing the particular advantages and disadvantages exhibited by the class. This chapter will conclude with a more detailed treatment of the various sequential algorithms proposed in the literature. Chapter 2 then develops the algorithm proposed by the author, Sequential Edge Linking or SEL. It begins with some definitions, follows with a derivation of the critical path branch metric and some of its properties, and concludes with a discussion of algorithms. The third chapter is devoted exclusively to an analysis of the dynamical behavior and performance of the method. Chapter 4 then deals with the case of correlated random fields. In that chapter, a model is proposed for which paths searched by the SEL algorithm are shown to possess a well-known autocorrelation function. 
This allows the use of a simple linear filter to decorrelate the raw image data. Finally, Chapter 5 presents a number of experimental results and corroboration of the theoretical conclusions of earlier chapters. Some concluding remarks are also included in Chapter 5.
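For a sense of how a sequential, path-based detector operates, here is a greedy toy sketch in the spirit of sequential edge linking. It is not the thesis's SEL algorithm or its exact branch metric; the Gaussian edge/background parameters and the beam width of one are assumptions chosen only for the example.

```python
# Illustrative sequential edge linking: extend a path from a seed pixel by
# choosing the 8-connected neighbour whose gradient value maximizes a simple
# log-likelihood-ratio branch metric (edge vs. background).
import numpy as np

def branch_metric(g, mu_edge=1.0, mu_bg=0.0, sigma=0.3):
    """Log-likelihood ratio of gradient value g under edge vs. background models."""
    return ((g - mu_bg) ** 2 - (g - mu_edge) ** 2) / (2 * sigma ** 2)

def link_edge(grad, seed, steps=50):
    """Greedy sequential path search (beam width 1), avoiding revisited pixels."""
    h, w = grad.shape
    path, visited = [seed], {seed}
    r, c = seed
    for _ in range(steps):
        candidates = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if (dr, dc) != (0, 0)
                      and 0 <= r + dr < h and 0 <= c + dc < w
                      and (r + dr, c + dc) not in visited]
        if not candidates:
            break
        r, c = max(candidates, key=lambda p: branch_metric(grad[p]))
        path.append((r, c))
        visited.add((r, c))
    return path

grad = np.zeros((16, 16))
grad[8, :] = 1.0                      # toy gradient map containing one horizontal edge
print(link_edge(grad, seed=(8, 0), steps=15))
```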

    Implementation Lessons and Pitfalls for Real-time Optimal Control with Stochastic Systems

    Get PDF
    Modern computational power and efficient direct collocation techniques are decreasing the solution time required for the optimal control problem, making real-time optimal control (RTOC) feasible for modern systems. Current trends in the literature indicate that many authors apply RTOC with a recursive open-loop structure, relying on a high recursion rate to provide implicit state feedback that counters disturbances and other unmodeled effects without explicit closed-loop control. The limitations of using rapid, instantaneous optimal solutions are demonstrated analytically and through application to a surface-to-air missile avoidance control system. Two methods are proposed for control structure implementation when using RTOC that take advantage of error integration through either classical feedback or disturbance estimation.
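    To illustrate the disturbance-estimation flavor of the second approach, here is a conceptual toy sketch, not the paper's missile-avoidance controller: a recursive open-loop control loop on a first-order plant, where `solve_open_loop` is a hypothetical stand-in for a direct-collocation solve and the estimator gain and all numbers are arbitrary.

```python
# Toy recursive open-loop optimal control augmented with a simple disturbance
# estimator (illustrative only). The plant is a first-order integrator.
import numpy as np

def solve_open_loop(x0, x_ref, horizon, d_hat):
    """Hypothetical stand-in for an optimal-control solve: a proportional plan
    toward x_ref, corrected by the current disturbance estimate."""
    return np.full(horizon, 0.5 * (x_ref - x0) - d_hat)

def run_rtoc(x_ref=1.0, steps=100, dt=0.1, true_disturbance=0.3):
    x, d_hat = 0.0, 0.0
    for _ in range(steps):
        u = solve_open_loop(x, x_ref, horizon=10, d_hat=d_hat)[0]  # apply first input only
        x_pred = x + dt * (u + d_hat)            # model prediction with current estimate
        x = x + dt * (u + true_disturbance)      # true plant includes the disturbance
        d_hat += 0.5 * (x - x_pred) / dt         # innovation drives the estimate
    return round(x, 3), round(d_hat, 3)

print(run_rtoc())  # x approaches x_ref and d_hat approaches the true disturbance
```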

    High-level power optimisation for Digital Signal Processing in Reconfigurable Logic

    No full text
    This thesis is concerned with the optimisation of Digital Signal Processing (DSP) algorithm implementations on reconfigurable hardware via the selection of appropriate word-lengths for the signals in these algorithms, in order to minimise system power consumption. Whilst existing word-length optimisation work has concentrated on the minimisation of the area of algorithm implementations, this work introduces the first set of power consumption models that can be evaluated quickly enough to be used within the search of the enormous design space of multiple word-length optimisation problems. These models achieve their speed by estimating both the power consumed within the arithmetic components of an algorithm and the power in the routing wires that connect these components, using only a high-level description of the algorithm itself. Trading off a small reduction in power model accuracy for a large increase in speed is one of the major contributions of this thesis. In addition to the work on power consumption modelling, this thesis also develops a new technique for selecting the appropriate word-lengths for an algorithm implementation in order to minimise its cost in terms of power (or some other metric for which models are available). The method developed is able to provide tight lower and upper bounds on the optimal cost that can be obtained for a particular word-length optimisation problem and can, as a result, find provably near-optimal solutions to word-length optimisation problems without resorting to an NP-hard search of the design space. Finally, the costs of systems optimised via the proposed technique are compared to those obtainable by word-length optimisation for minimisation of other metrics (such as logic area), providing greater insight into the nature of word-length optimisation problems and the extent of the improvements obtainable by them.
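    To show the shape of a multiple word-length optimisation problem, the following is a toy sketch only: the linear power model, the rounding-noise model, and the noise budget are all assumptions made for the example rather than the thesis's models, and the exhaustive search merely stands in for the bounding technique described above.

```python
# Toy multiple word-length optimisation: choose a word-length per signal so that
# total modelled power is minimised while keeping quantisation noise under a budget.
from itertools import product

SIGNALS = ["x", "y", "acc"]          # hypothetical signals in a small datapath
CHOICES = range(6, 17)               # candidate word-lengths in bits
NOISE_BUDGET = 1e-7

def power(wordlengths):
    return sum(wordlengths)          # stand-in for an arithmetic + routing power model

def noise(wordlengths):
    # Uniform rounding-noise variance of an n-bit signal, modelled as 2^(-2n)/12.
    return sum(2.0 ** (-2 * n) / 12.0 for n in wordlengths)

best = None
for wl in product(CHOICES, repeat=len(SIGNALS)):     # exhaustive search (tiny example)
    if noise(wl) <= NOISE_BUDGET and (best is None or power(wl) < power(best)):
        best = wl

print(dict(zip(SIGNALS, best)), power(best), noise(best))
```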

    Robust and Regularized Algorithms for Vehicle Tractive Force Prediction and Mass Estimation

    Get PDF
    This work provides novel robust and regularized algorithms for parameter estimation, with applications in vehicle tractive force prediction and mass estimation. Given a large record of real-world data from test runs on public roads, recursive algorithms adjusted the unknown vehicle parameters under a broad variation of statistical assumptions for two linear gray-box models.
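    As a concrete example of the kind of recursive estimator involved, here is a minimal recursive least squares sketch on a toy longitudinal force balance. The model, forgetting factor, and numerical values are assumptions for illustration and do not reproduce the paper's gray-box models or its robust/regularized variants.

```python
# Recursive least squares with forgetting: estimate vehicle mass m and a lumped
# resistance force F_r from noisy samples of F_tractive = m * a + F_r.
import numpy as np

def rls_update(theta, P, phi, y, lam=0.99):
    """One recursive least-squares step with forgetting factor lam."""
    k = P @ phi / (lam + phi @ P @ phi)          # gain vector
    theta = theta + k * (y - phi @ theta)        # parameter update
    P = (P - np.outer(k, phi @ P)) / lam         # covariance update
    return theta, P

rng = np.random.default_rng(1)
true_mass, true_resistance = 1500.0, 400.0       # kg, N (toy values)
theta = np.zeros(2)                              # [mass, resistance]
P = np.eye(2) * 1e6

for _ in range(500):
    a = rng.uniform(-2.0, 2.0)                                    # acceleration, m/s^2
    force = true_mass * a + true_resistance + rng.normal(0, 50)   # noisy tractive force
    theta, P = rls_update(theta, P, np.array([a, 1.0]), force)

print(theta)  # converges near [1500, 400]
```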

    Service Abstractions for Scalable Deep Learning Inference at the Edge

    Get PDF
    Deep learning driven intelligent edge has already become a reality, where millions of mobile, wearable, and IoT devices analyze real-time data and transform it into actionable insights on-device. Typical approaches for optimizing deep learning inference mostly focus on accelerating the execution of individual inference tasks, without considering the contextual correlation unique to edge environments and the statistical nature of learning-based computation. Specifically, they treat inference workloads as individual black boxes and apply canonical system optimization techniques, developed over the last few decades, to handle them as yet another type of computation-intensive application. As a result, deep learning inference on edge devices still faces the ever-increasing challenges of customization to edge device heterogeneity, fuzzy computation redundancy between inference tasks, and end-to-end deployment at scale. In this thesis, we propose the first framework that automates and scales the end-to-end process of deploying efficient deep learning inference from the cloud to heterogeneous edge devices. The framework consists of a series of service abstractions that handle DNN model tailoring, model indexing and query, and computation reuse for runtime inference, respectively. Together, these services bridge the gap between deep learning training and inference, eliminate computation redundancy during inference execution, and further lower the barrier for deep learning algorithm and system co-optimization. To build efficient and scalable services, we take a unique algorithmic approach of harnessing the semantic correlation among learning-based computations. Rather than viewing individual tasks as isolated black boxes, we optimize them collectively in a white-box approach, proposing primitives to formulate the semantics of deep learning workloads, algorithms to assess their hidden correlation (in terms of the input data, the neural network models, and the deployment trials), and merging of common processing steps to minimize redundancy.
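    The computation-reuse idea can be pictured with a small caching sketch. This is purely conceptual and not the thesis's service implementation; the backbone and heads are made-up stand-ins for a shared DNN prefix and task-specific layers.

```python
# Conceptual computation reuse across inference requests: cache the output of a
# shared feature-extractor prefix keyed by the input, so multiple task-specific
# heads that share the prefix avoid re-running it.
import hashlib
import numpy as np

_feature_cache = {}

def shared_backbone(x):
    """Stand-in for an expensive DNN prefix (e.g. a convolutional feature extractor)."""
    return np.tanh(x @ np.ones((x.shape[-1], 8)))

def cached_features(x):
    key = hashlib.sha1(x.tobytes()).hexdigest()
    if key not in _feature_cache:
        _feature_cache[key] = shared_backbone(x)      # computed once per distinct input
    return _feature_cache[key]

def classify(x, head_weights):
    """A task-specific head; many heads can share the cached backbone output."""
    return cached_features(x) @ head_weights

x = np.random.default_rng(2).normal(size=(1, 16))
head_a, head_b = np.ones((8, 3)), np.ones((8, 5))
classify(x, head_a)            # backbone runs here
classify(x, head_b)            # backbone output reused from the cache
print(len(_feature_cache))     # 1
```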