33 research outputs found

    Optimized parameter search for large datasets of the regularization parameter and feature selection for ridge regression

    In this paper we propose mathematical optimizations to select the optimal regularization parameter for ridge regression using cross-validation. The resulting algorithm is suited for large datasets and the computational cost does not depend on the size of the training set. We extend this algorithm to forward or backward feature selection in which the optimal regularization parameter is selected for each possible feature set. These feature selection algorithms yield solutions with a sparse weight matrix using a quadratic cost on the norm of the weights. A naive approach to optimizing the ridge regression parameter has a computational complexity of the order with the number of applied regularization parameters, the number of folds in the validation set, the number of input features and the number of data samples in the training set. Our implementation has a computational complexity of the order . This computational cost is smaller than that of regression without regularization for large datasets and is independent of the number of applied regularization parameters and the size of the training set. Combined with a feature selection algorithm the algorithm is of complexity and for forward and backward feature selection respectively, with the number of selected features and the number of removed features. This is an order faster than and for the naive implementation, with for large datasets. To show the performance and reduction in computational cost, we apply this technique to train recurrent neural networks using the reservoir computing approach, windowed ridge regression, least-squares support vector machines (LS-SVMs) in primal space using the fixed-size LS-SVM approximation and extreme learning machines.
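
The claim that the cross-validation cost need not grow with the training-set size can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes NumPy, a 1-D target vector and contiguous folds, and the function name `ridge_cv_from_fold_stats` is hypothetical. Each fold is summarized once by its Gram statistics, and every regularization value is then scored from those small matrices alone. Unlike the implementation described in the abstract, the inner loop here still scales with the number of regularization values.

```python
import numpy as np


def ridge_cv_from_fold_stats(X, y, lambdas, n_folds=5):
    """Select a ridge parameter by k-fold CV using per-fold statistics.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    """
    n, _ = X.shape
    folds = np.array_split(np.arange(n), n_folds)

    # Single pass over the data: per-fold X^T X, X^T y and y^T y.
    A = [X[idx].T @ X[idx] for idx in folds]
    b = [X[idx].T @ y[idx] for idx in folds]
    c = [float(y[idx] @ y[idx]) for idx in folds]
    A_tot, b_tot = sum(A), sum(b)

    errors = np.zeros(len(lambdas))
    for k in range(n_folds):
        # Training statistics are the totals minus the held-out fold.
        A_tr, b_tr = A_tot - A[k], b_tot - b[k]
        # One eigendecomposition per fold, reused for every lambda.
        evals, V = np.linalg.eigh(A_tr)
        proj = V.T @ b_tr
        for j, lam in enumerate(lambdas):
            w = V @ (proj / (evals + lam))
            # Held-out squared error ||y - Xw||^2 from statistics only.
            errors[j] += c[k] - 2.0 * (w @ b[k]) + w @ A[k] @ w

    errors /= n  # mean squared validation error per lambda
    best = int(np.argmin(errors))
    return lambdas[best], errors
```

After the statistics are accumulated, all remaining work involves only matrices whose size is set by the number of features, so adding more training samples changes only the cost of the first pass.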

    Generalized methods and solvers for noise removal from piecewise constant signals. I. Background theory

    Removing noise from piecewise constant (PWC) signals is a challenging signal processing problem arising in many practical contexts. For example, in exploration geosciences, noisy drill hole records need to be separated into stratigraphic zones, and in biophysics, jumps between molecular dwell states have to be extracted from noisy fluorescence microscopy signals. Many PWC denoising methods exist, including total variation regularization, mean shift clustering, stepwise jump placement, running medians, convex clustering shrinkage and bilateral filtering; conventional linear signal processing methods are fundamentally unsuited. This paper (part I, the first of two) shows that most of these methods are associated with a special case of a generalized functional, minimized to achieve PWC denoising. The minimizer can be obtained by diverse solver algorithms, including stepwise jump placement, convex programming, finite differences, iterated running medians, least angle regression, regularization path following and coordinate descent. In the second paper, part II, we introduce novel PWC denoising methods and compare them on synthetic and real signals, showing that the new understanding of the problem gained in part I leads to new methods that have a useful role to play.
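
Of the solvers listed in the abstract, iterated running medians is the simplest to sketch. The version below is an illustration under assumptions not taken from the paper: a fixed window half-width, windows truncated at the signal edges, and iteration until the output stops changing. Both function names are hypothetical.

```python
from statistics import median


def running_median(signal, half_width=2):
    """Replace each sample by the median of its (edge-truncated) window."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(median(signal[lo:hi]))
    return out


def iterated_running_median(signal, half_width=2, max_iter=100):
    """Apply the running median repeatedly until a fixed point is reached."""
    current = list(signal)
    for _ in range(max_iter):
        nxt = running_median(current, half_width)
        if nxt == current:  # no sample changed: converged
            break
        current = nxt
    return current
```

A median window removes isolated outliers while leaving a clean step edge in place, which is why a linear smoothing filter, which would blur the jump, is a poor fit for this kind of signal.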

    Mangroves as nature-based mitigation for ENSO-driven compound flood risks in a large river delta

    Densely populated coastal river deltas are very vulnerable to compound flood risks coming from both oceanic and riverine sources. Climate change may increase these compound flood risks due to sea level rise and intensifying precipitation events. Here, we investigate to what extent nature-based flood defence strategies, through the conservation of mangroves in a tropical river delta, can contribute to mitigate the oceanic and riverine components of compound flood risks. While current knowledge of estuarine compound flood risks is mostly focussed on short-term events such as storm surges (taking 1 or a few days), longer-term events, such as El Niño events (continuing for several weeks to months) along the Pacific coast of Latin America, are less studied. Here, we present a hydrodynamic modelling study of a large river delta in Ecuador aiming to elucidate the compound effects of El Niño-driven oceanic and riverine forcing on extreme high water level propagation through the delta and, in particular, the role of mangroves in reducing the compound high water levels. Our results show that the deltaic high water level anomalies are predominantly driven by the oceanic forcing but that the riverine forcing causes the anomalies to amplify upstream. Furthermore, mangroves in the delta attenuate part of the oceanic contribution to the high water level anomalies, with the attenuating effect increasing in the landward direction, while mangroves have a negligible effect on the riverine component. These findings show that mangrove conservation and restoration programmes can contribute to nature-based mitigation, especially of the oceanic component of compound flood risks in a tropical river delta.

    Rapid evolution of coordinated and collective movement in response to artificial selection.

    Collective motion occurs when individuals use social interaction rules to respond to the movements and positions of their neighbors. How readily these social decisions are shaped by selection remains unknown. Through artificial selection on fish (guppies, Poecilia reticulata) for increased group polarization, we demonstrate rapid evolution in how individuals use social interaction rules. Within only three generations, groups of polarization-selected females showed a 15% increase in polarization, coupled with increased cohesiveness, compared to fish from control lines. Although lines did not differ in their physical swimming ability or exploratory behavior, polarization-selected fish adopted faster speeds, particularly in social contexts, and showed stronger alignment and attraction responses to multiple neighbors. Our results reveal the social interaction rules that change when collective behavior evolves.

    A Mathematical Model for Interpretable Clinical Decision Support with Applications in Gynecology

    Over time, methods for the development of clinical decision support (CDS) systems have evolved from interpretable and easy-to-use scoring systems to very complex and non-interpretable mathematical models. In order to accomplish effective decision support, CDS systems should provide information on how the model arrives at a certain decision. To address the issue of incompatibility between performance, interpretability and applicability of CDS systems, this paper proposes an innovative model structure, automatically leading to interpretable and easily applicable models. The resulting models can be used to guide clinicians when deciding upon the appropriate treatment and estimating patient-specific risks, and to improve communication with patients. We propose the interval coded scoring (ICS) system, which imposes that the effect of each variable on the estimated risk is constant within consecutive intervals. The number and position of the intervals are automatically obtained by solving an optimization problem, which additionally performs variable selection. The resulting model can be visualised by means of appealing scoring tables and color bars. ICS models can be used within software packages, in smartphone applications, or on paper, which is particularly useful for bedside medicine and home-monitoring. The ICS approach is illustrated on two gynecological problems: diagnosis of malignancy of ovarian tumors using a dataset containing 3,511 patients, and prediction of first trimester viability of pregnancies using a dataset of 1,435 women. Comparison of the performance of the ICS approach with a range of prediction models proposed in the literature illustrates the ability of ICS to combine optimal performance with the interpretability of simple scoring systems. The ICS approach can improve patient-clinician communication and will provide additional insights into the importance and influence of available variables. Future challenges include extensions of the proposed methodology towards automated detection of interaction effects, multi-class decision support systems, prognosis and high-dimensional data.
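
The core idea of interval coding, that each variable contributes a constant number of points within each interval, can be sketched as a lookup table. Everything concrete below is invented for illustration and has no clinical meaning: the variable names `x1` and `x2`, the thresholds, the points, and the logistic mapping from score to risk with its intercept and scale are all assumptions, not values or choices from the paper. The optimization step that selects the intervals is not shown.

```python
from bisect import bisect_right
from math import exp

# Hypothetical scoring table: thresholds split each variable's range into
# consecutive intervals, and each interval carries a fixed number of points.
MODEL = {
    "x1": {"thresholds": [10.0, 20.0], "points": [0, 1, 3]},
    "x2": {"thresholds": [0.5], "points": [0, 2]},
}

# Assumed link between total score and risk (not taken from the paper).
INTERCEPT = -3.0
SCALE = 1.0


def ics_score(patient, model=MODEL):
    """Sum the points of the interval that each variable value falls into."""
    total = 0
    for name, spec in model.items():
        interval = bisect_right(spec["thresholds"], patient[name])
        total += spec["points"][interval]
    return total


def ics_risk(score):
    """Map a total score to a probability with an assumed logistic link."""
    return 1.0 / (1.0 + exp(-(INTERCEPT + SCALE * score)))
```

Because the contribution of a variable changes only when its value crosses a threshold, the whole model can be printed as a small table of intervals and points and applied by hand.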

    Subspace identification of Hammerstein systems using least squares support vector machines
