
    Discriminative Parameter Estimation for Random Walks Segmentation

    The Random Walks (RW) algorithm is one of the most efficient and easy-to-use probabilistic segmentation methods. By combining contrast terms with prior terms, it provides accurate segmentations of medical images in a fully automated manner. However, one of the main drawbacks of using the RW algorithm is that its parameters have to be hand-tuned. We propose a novel discriminative learning framework that estimates the parameters using a training dataset. The main challenge we face is that the training samples are not fully supervised. Specifically, they provide a hard segmentation of the images, instead of a probabilistic segmentation. We overcome this challenge by treating the optimal probabilistic segmentation that is compatible with the given hard segmentation as a latent variable. This allows us to employ the latent support vector machine formulation for parameter estimation. We show that our approach significantly outperforms the baseline methods on a challenging dataset consisting of real clinical 3D MRI volumes of skeletal muscles. Comment: Medical Image Computing and Computer Assisted Intervention (2013)
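At the core of RW segmentation is a seeded linear system over a graph Laplacian built from contrast weights. The sketch below illustrates this on a toy 1D "image" with invented intensities and a hand-set contrast parameter beta (the kind of parameter the paper proposes to learn instead of hand-tuning); it is a minimal illustration, not the paper's method.

```python
import numpy as np

# Toy 1D "image": 6 pixels, a bright region then a dark region.
intensities = np.array([0.9, 0.85, 0.8, 0.2, 0.15, 0.1])
beta = 50.0  # contrast parameter -- hand-set here; the paper learns such parameters

n = len(intensities)
# Chain graph: edge weights fall off with intensity difference (contrast term).
W = np.zeros((n, n))
for i in range(n - 1):
    w = np.exp(-beta * (intensities[i] - intensities[i + 1]) ** 2)
    W[i, i + 1] = W[i + 1, i] = w
L = np.diag(W.sum(axis=1)) - W  # combinatorial graph Laplacian

# Seeds: pixel 0 is foreground (probability 1), pixel 5 is background (0).
seeded = [0, 5]
unseeded = [1, 2, 3, 4]
x_seeded = np.array([1.0, 0.0])

# Solve the Dirichlet problem L_U x_U = -B x_S for the unseeded pixels.
L_U = L[np.ix_(unseeded, unseeded)]
B = L[np.ix_(unseeded, seeded)]
x_unseeded = np.linalg.solve(L_U, -B @ x_seeded)

probs = np.zeros(n)
probs[seeded] = x_seeded
probs[unseeded] = x_unseeded
hard_seg = (probs > 0.5).astype(int)  # threshold to get a hard segmentation
```

The low-contrast edge between pixels 2 and 3 gets a near-zero weight, so the foreground probability stays high on one side of it and low on the other.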

    CoCoA: A General Framework for Communication-Efficient Distributed Optimization

    The scale of modern datasets necessitates the development of efficient distributed optimization methods for machine learning. We present a general-purpose framework for distributed computing environments, CoCoA, that has an efficient communication scheme and is applicable to a wide variety of problems in machine learning and signal processing. We extend the framework to cover general non-strongly-convex regularizers, including L1-regularized problems like lasso, sparse logistic regression, and elastic net regularization, and show how earlier work can be derived as a special case. We provide convergence guarantees for the class of convex regularized loss minimization objectives, leveraging a novel approach in handling non-strongly-convex regularizers and non-smooth loss functions. The resulting framework has markedly improved performance over state-of-the-art methods, as we illustrate with an extensive set of experiments on real distributed datasets.
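CoCoA's defining feature is that each round exchanges only a single aggregate vector per worker instead of raw data. As a loose stand-in for that communicate-one-vector-per-round pattern (not the CoCoA local dual subproblem itself), here is a sketch of distributed proximal gradient for the lasso on synthetic data, with workers simulated as data shards:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic lasso problem: min_w  (1/2n) ||Xw - y||^2 + lam * ||w||_1
n, d, k_workers = 200, 10, 4
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.01 * rng.standard_normal(n)
lam = 0.1

# Each simulated "worker" holds a shard of the rows.
shards = np.array_split(np.arange(n), k_workers)

def soft_threshold(v, t):
    # Proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

w = np.zeros(d)
step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)  # 1/L for the smooth part
for _ in range(300):
    # Communication round: each worker sends one d-dimensional vector.
    grads = [X[idx].T @ (X[idx] @ w - y[idx]) / n for idx in shards]
    g = np.sum(grads, axis=0)                      # driver aggregates
    w = soft_threshold(w - step * g, step * lam)   # proximal step for L1
```

Per round, the communication cost is O(d) per worker regardless of the shard size, which is the property frameworks like CoCoA are designed around.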

    Laplacian Regularization in the Dual Space for SVMs

    Máster Universitario en Investigación e Innovación en Inteligencia Computacional y Sistemas Interactivos (Master's Programme in Research and Innovation in Computational Intelligence and Interactive Systems). Nowadays, Machine Learning (ML) is a field with a great impact because of its usefulness in solving many types of problems. However, today large amounts of data are handled, and therefore traditional learning methods can be severely limited in performance. To address this problem, Regularized Learning (RL) is used, where the objective is to make the model as flexible as possible while preserving its generalization properties, so that overfitting is avoided. There are many models that use regularization in their formulations, such as Lasso, or models that use intrinsic regularization, such as the Support Vector Machine (SVM). In this model, the margin of a separating hyperplane is maximized, resulting in a solution that depends only on a subset of the samples called support vectors. This Master's Thesis aims to develop an SVM model with Laplacian regularization in the dual space, under the intuitive idea that close patterns should have similar coefficients. To construct the Laplacian term we use as a basis the Fused Lasso model, which penalizes the differences of consecutive coefficients; in our case, however, we seek to penalize the differences between every pair of samples, using the elements of the kernel matrix as weights. This thesis presents the different phases carried out in the implementation of the new proposal, starting from the standard SVM, followed by comparative experiments between the new model and the original method. As a result, we see that Laplacian regularization is very useful, since the new proposal outperforms the standard SVM on most of the datasets used, both in classification and regression. Furthermore, we observe that if we only consider the Laplacian term and we set the parameter C (the upper bound for the coefficients) as if it were infinite, we also obtain better performance than the standard SVM method.
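Penalizing kernel-weighted differences between every pair of coefficients is exactly a quadratic form in a graph Laplacian built from the kernel matrix, L = D - K. The sketch below (random data and stand-in dual coefficients, purely illustrative) verifies that identity numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2))

# RBF kernel matrix: entries K[i, j] play the role of pairwise weights.
gamma = 0.5
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)

# Graph Laplacian built from the kernel: L = D - K.
L = np.diag(K.sum(axis=1)) - K

beta = rng.standard_normal(8)  # stand-in dual coefficients

# Pairwise penalty sum_{i<j} K_ij (beta_i - beta_j)^2 ...
pairwise = sum(K[i, j] * (beta[i] - beta[j]) ** 2
               for i in range(8) for j in range(i + 1, 8))
# ... equals the Laplacian quadratic form beta^T L beta.
quad = beta @ L @ beta
```

Because the penalty is a positive semidefinite quadratic form, adding it to the SVM dual keeps the problem a quadratic program.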

    SCOPE: Scalable Composite Optimization for Learning on Spark

    Many machine learning models, such as logistic regression (LR) and support vector machine (SVM), can be formulated as composite optimization problems. Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve large-scale composite optimization problems, and have shown better performance than traditional batch methods. However, most of these DSO methods are not scalable enough. In this paper, we propose a novel DSO method, called Scalable COmposite OPtimization for lEarning (SCOPE), and implement it on the fault-tolerant distributed platform Spark. SCOPE is both computation-efficient and communication-efficient. Theoretical analysis shows that SCOPE is convergent with a linear convergence rate when the objective function is convex. Furthermore, empirical results on real datasets show that SCOPE can outperform other state-of-the-art distributed learning methods on Spark, including both batch learning methods and DSO methods.
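Methods in this family typically rely on variance-reduced stochastic gradients corrected by a periodically recomputed full gradient. A single-machine SVRG-style sketch for L2-regularized logistic regression on synthetic data (illustrative only; not the distributed Spark implementation) conveys the inner/outer loop structure:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 500, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (X @ w_true + 0.1 * rng.standard_normal(n) > 0) * 2.0 - 1.0  # labels in {-1, +1}
lam = 0.1  # L2 regularization strength

def grad_i(w, i):
    # Gradient of one logistic-loss term plus the L2 term.
    z = y[i] * (X[i] @ w)
    return -y[i] * X[i] / (1.0 + np.exp(z)) + lam * w

def full_grad(w):
    z = y * (X @ w)
    return -(y / (1.0 + np.exp(z))) @ X / n + lam * w

w = np.zeros(d)
eta = 0.2
for _ in range(30):                  # outer epochs
    w_snap = w.copy()
    mu = full_grad(w_snap)           # full gradient at the snapshot
    for _ in range(n):               # inner stochastic steps
        i = rng.integers(n)
        # Variance-reduced gradient: unbiased, and its variance
        # shrinks as w approaches the snapshot/optimum.
        g = grad_i(w, i) - grad_i(w_snap, i) + mu
        w -= eta * g
```

The correction term `- grad_i(w_snap, i) + mu` is what allows a constant step size and linear convergence on strongly convex objectives, matching the kind of guarantee the abstract states for SCOPE in the convex case.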

    2D Quantitative Structure-Property Relationship Study of Mycotoxins by Multiple Linear Regression and Support Vector Machine

    In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins), based on molecular descriptors calculated from the optimized 3D structures. By applying missing-value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build the QSPR models. MLR and SVM methods were then employed to build the models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r2 and q2, are 0.931 and 0.932, respectively, for SVM, and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William's plot. The effects of different descriptors on the retention times are described.
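The MLR half of such a workflow reduces to ordinary least squares on the selected descriptors followed by an r2 check. A sketch with hypothetical descriptor values standing in for the GA-selected ones (the study itself uses 67 mycotoxins and descriptors computed from 3D structures):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical descriptor matrix: 67 compounds x 3 selected descriptors
# (stand-ins for the GA-selected molecular descriptors in the study).
n_compounds, n_desc = 67, 3
D = rng.standard_normal((n_compounds, n_desc))
coef_true = np.array([1.2, -0.8, 0.5])
tR = 10.0 + D @ coef_true + 0.2 * rng.standard_normal(n_compounds)  # retention times

# Multiple linear regression via ordinary least squares.
A = np.column_stack([np.ones(n_compounds), D])   # intercept column
coef, *_ = np.linalg.lstsq(A, tR, rcond=None)
tR_hat = A @ coef

# Coefficient of determination r^2 between observed and predicted tR.
ss_res = np.sum((tR - tR_hat) ** 2)
ss_tot = np.sum((tR - tR.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

The q2 statistic reported in the abstract is the analogous quantity computed on left-out compounds (e.g. leave-one-out), which guards against the overfitting that a training-set r2 alone cannot detect.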

    Methodology to Predict Daily Groundwater Levels by the Implementation of Machine Learning and Crop Models

    The continuous decline of groundwater levels caused by variations in climatic conditions and crop water demands is an increasing concern for the agricultural community. It is necessary to understand the factors that control these changes in groundwater levels so that we can better address declines and develop improved conservation practices that will lead to a more sustainable use of water. In this study, two machine learning techniques, namely support vector regression (SVR) and the nonlinear autoregressive with exogenous inputs (NARX) neural network, were implemented to predict daily groundwater levels in a well located in the Mississippi Delta Region (MDR). Results of the NARX model indicate that a Bayesian regularization algorithm with two hidden nodes and 100 time delays was the best architecture to forecast groundwater levels. In another study, the SVR and the NARX model were compared for the prediction of groundwater withdrawal and recharge periods separately. Results from this study showed that input data classified by seasons led to incremental improvements in model accuracy, and that the SVR was the most efficient machine learning model, with a Mean Squared Error (MSE) of 0.00123 m for the withdrawal season. Analysis of input variables such as previous daily groundwater levels (Gw), precipitation (Pr), and evapotranspiration (ET) showed that the combination of Gw+Pr provides the optimal set for groundwater prediction and that ET degraded the modeling performance, especially during recharge seasons. Finally, the CROPGRO-Soybean crop model was used to simulate the impacts of different volumes of irrigation on crop height and yield, and to generate the daily irrigation requirements for soybean crops in the MDR. Four irrigation threshold scenarios (20%, 40%, 50% and 60%) were obtained from the CROPGRO-Soybean model and used as inputs in the SVR to evaluate the predicted response of daily groundwater levels to different irrigation demands. This study demonstrated that conservative irrigation management, by selecting a low irrigation threshold, can provide yields comparable to those produced by high-volume irrigation management. Thus, lower irrigation volumes can have a big impact on decreasing groundwater withdrawals while still maintaining comparable yields.
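The core data construction here is predicting the next day's level from lagged Gw and Pr inputs. The sketch below builds those lagged features on a synthetic series and fits kernel ridge regression as a simple kernel-method stand-in for the SVR used in the study (the series dynamics and all parameter values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic daily series: groundwater level (Gw) driven by precipitation (Pr).
T = 400
Pr = np.maximum(rng.standard_normal(T), 0.0)           # rainfall, nonnegative
Gw = np.zeros(T)
for t in range(1, T):
    Gw[t] = 0.95 * Gw[t - 1] + 0.3 * Pr[t - 1] - 0.05  # slow recharge / steady decline

# Lagged inputs: predict Gw[t] from (Gw[t-1], Pr[t-1]) -- the Gw+Pr combination.
X = np.column_stack([Gw[:-1], Pr[:-1]])
y = Gw[1:]
n_train = 300
Xtr, ytr, Xte, yte = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Kernel ridge regression as a simple kernel-method stand-in for SVR.
def rbf(A, B, gamma=0.5):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

alpha = np.linalg.solve(rbf(Xtr, Xtr) + 1e-3 * np.eye(n_train), ytr)
pred = rbf(Xte, Xtr) @ alpha
mse = np.mean((pred - yte) ** 2)
```

Splitting the rows by season before fitting, as the study does, amounts to training one such model per withdrawal/recharge period rather than one model for the whole series.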
