
    High-performance Kernel Machines with Implicit Distributed Optimization and Randomization

    In order to fully utilize "big data", it is often required to use "big models". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions upfront on the nature of the underlying statistical dependencies. Kernel methods fit this need well, as they constitute a versatile and principled statistical methodology for solving a wide range of non-parametric modelling problems. However, their high computational costs (in storage and time) pose a significant barrier to their widespread adoption in big data applications. We propose an algorithmic framework and high-performance implementation for massive-scale training of kernel-based statistical models, based on combining two key technical ingredients: (i) distributed general purpose convex optimization, and (ii) the use of randomization to improve the scalability of kernel methods. Our approach is based on a block-splitting variant of the Alternating Directions Method of Multipliers, carefully reconfigured to handle very large random feature matrices, while exploiting hybrid parallelism typically found in modern clusters of multicore machines. Our implementation supports a variety of statistical learning tasks by enabling several loss functions, regularization schemes, kernels, and layers of randomized approximations for both dense and sparse datasets, in a highly extensible framework. We evaluate the ability of our framework to learn models on data from applications, and provide a comparison against existing sequential and parallel libraries. Comment: Work presented at MMDS 2014 (June 2014) and JSM 201

    An Efficient Algorithm for Video Super-Resolution Based On a Sequential Model

    In this work, we propose a novel procedure for video super-resolution, that is, the recovery of a sequence of high-resolution images from its low-resolution counterpart. Our approach is based on a "sequential" model (i.e., each high-resolution frame is supposed to be a displaced version of the preceding one) and considers the use of sparsity-enforcing priors. Both the recovery of the high-resolution images and the motion fields relating them are tackled. This leads to a large-dimensional, non-convex and non-smooth problem. We propose an algorithmic framework to address the latter. Our approach relies on fast gradient evaluation methods and modern optimization techniques for non-differentiable/non-convex problems. Unlike some previous works, we show that there exists a provably-convergent method with a complexity linear in the problem dimensions. We assess the proposed optimization method on several video benchmarks and emphasize its good performance with respect to the state of the art. Comment: 37 pages, SIAM Journal on Imaging Sciences, 201

    Histopathological image analysis: a review

    Over the past decade, dramatic increases in computational power and improvement in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has now become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging to complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist. In this paper, we review the recent state-of-the-art CAD technology for digitized histopathology. This paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology-related problems being pursued in the United States and Europe.

    Prerequisites for Affective Signal Processing (ASP)

    Although emotions are embraced by science, their recognition has not reached a satisfying level. Through a concise overview of affect, its signals, features, and classification methods, we provide an understanding of the problems encountered. Next, we identify the prerequisites for successful Affective Signal Processing: validation (e.g., mapping of constructs on signals), triangulation, a physiology-driven approach, and contributions of the signal processing community. Using these directives, a critical analysis of a real-world case is provided. This illustrates that the prerequisites can become a valuable guide for Affective Signal Processing (ASP).

    Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper

    This is the post-print version of the final paper published in Advanced Engineering Informatics. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V. The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker to an environment science industrial use-case, i.e., sub-surface drilling. Producing time-critical accurate knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially if the knowledge required is critical to the system operation where the safety of operators or integrity of costly equipment is at stake. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system is the core challenge of this research. The main objective is then to identify which set of input data signals has a significant impact on the set of system state information (i.e., output). Through a cause-effect analysis technique, the proposed technique supports the filtering of unsolicited data that can otherwise clog up the communication and computational capabilities of a standard supervisory control and data acquisition system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods in a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have a limited suitability for use by time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker Sensitivity Analysis method for use in high-volume and time-critical input variable selection problems is demonstrated.

    Is Captain Kirk a natural blonde? Do X-ray crystallographers dream of electron clouds? Comparing model-based inferences in science with fiction

    Scientific models share one central characteristic with fiction: their relation to the physical world is ambiguous. It is often unclear whether an element in a model represents something in the world or presents an artifact of model building. Fiction, too, can resemble our world to varying degrees. However, we assign a different epistemic function to scientific representations. As artifacts of human activity, how are scientific representations allowing us to make inferences about real phenomena? In reply to this concern, philosophers of science have started analyzing scientific representations in terms of fictionalization strategies. Many arguments center on a dyadic relation between the model and its target system, focusing on structural resemblances and “as if” scenarios. This chapter provides a different approach. It looks more closely at model building to analyze the interpretative strategies dealing with the representational limits of models. How do we interpret ambiguous elements in models? Moreover, how do we determine the validity of model-based inferences to information that is not an explicit part of a representational structure? I argue that the problem of ambiguous inference emerges from two features of representations, namely their hybridity and incompleteness. To distinguish between fictional and non-fictional elements in scientific models, my suggestion is to look at the integrative strategies that link a particular model to other methods in an ongoing research context. To exemplify this idea, I examine protein modeling through X-ray crystallography as a pivotal method in biochemistry.