535 research outputs found

    Memetic simulated annealing for data approximation with local-support curves

    Get PDF
    This paper introduces a new memetic optimization algorithm called MeSA (Memetic Simulated Annealing) to address the data fitting problem with local-support free-form curves. The proposed method hybridizes simulated annealing with the COBYLA local search optimization method. This approach is further combined with the centripetal parameterization and the Bayesian information criterion to compute all free variables of the curve reconstruction problem with B-splines. The performance of our approach is evaluated by applying it to four different shapes with local deformations and different degrees of noise and data point density. The MeSA method has also been compared to the non-memetic version of SA. Our results show that MeSA is able to reconstruct the underlying shape of the data even in the presence of noise and low-density point clouds. It also outperforms SA for all the examples in this paper. This work has been supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grants TEC2013-47141-C4-R (RACHEL) and #TIN2012-30768 (Computer Science National Program) and Toho University (Funabashi, Japan).
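
    The memetic loop described above (simulated annealing proposals with an occasional COBYLA refinement of promising candidates, accepted via the Metropolis criterion) can be sketched as follows. This is a minimal illustration on a toy least-squares objective, not the authors' implementation; the objective, cooling schedule and parameter names are assumptions.

        # Minimal sketch of a memetic simulated-annealing loop: SA proposals with an
        # occasional COBYLA local refinement (hypothetical toy objective and settings).
        import numpy as np
        from scipy.optimize import minimize

        def objective(x):
            # Toy least-squares error; stands in for the B-spline fitting error.
            return np.sum((x - np.array([1.0, -2.0, 0.5])) ** 2)

        def memetic_sa(x0, n_iter=2000, t0=1.0, cooling=0.995, step=0.5, refine_every=100, seed=None):
            rng = np.random.default_rng(seed)
            x = np.asarray(x0, float)
            fx, temp = objective(x), t0
            best_x, best_f = x.copy(), fx
            for k in range(n_iter):
                cand = x + rng.normal(scale=step, size=x.shape)       # SA proposal
                if k % refine_every == 0:                             # memetic step: local search
                    cand = minimize(objective, cand, method="COBYLA").x
                fc = objective(cand)
                if fc < fx or rng.random() < np.exp(-(fc - fx) / temp):  # Metropolis acceptance
                    x, fx = cand, fc
                    if fx < best_f:
                        best_x, best_f = x.copy(), fx
                temp *= cooling                                       # geometric cooling
            return best_x, best_f

        if __name__ == "__main__":
            print(memetic_sa(np.zeros(3), seed=0))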

    Memetic electromagnetism algorithm for surface reconstruction with rational bivariate Bernstein basis functions

    Get PDF
    Surface reconstruction is a very important issue with outstanding applications in fields such as medical imaging (computer tomography, magnetic resonance), biomedical engineering (customized prostheses and medical implants), computer-aided design and manufacturing (reverse engineering for the automotive, aerospace and shipbuilding industries), rapid prototyping (scale models of physical parts from CAD data), computer animation and the film industry (motion capture, character modeling), archaeology (digital representation and storage of archaeological sites and assets), virtual/augmented reality, and many others. In this paper we address the surface reconstruction problem by using rational Bézier surfaces. This problem is far more complex than the case of curves, which we solved in a previous paper. In addition, we deal with data points subject to measurement noise and irregular sampling, replicating the usual conditions of real-world applications. Our method is based on a memetic approach combining a powerful metaheuristic method for global optimization (the electromagnetism algorithm) with a local search method. This method is applied to a benchmark of five illustrative examples exhibiting challenging features. Our experimental results show that the method performs very well, and it can recover the underlying shape of surfaces with very good accuracy. This research is kindly supported by the Computer Science National Program of the Spanish Ministry of Economy and Competitiveness, Project #TIN2012-30768, Toho University, and the University of Cantabria. The authors are particularly grateful to the Department of Information Science of Toho University for all the facilities given to carry out this work. We also thank the Editor and the two anonymous reviewers who helped us to improve our paper with several constructive comments and suggestions.
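
    For reference, a rational Bézier (bivariate Bernstein) surface of degrees (n, m) with control points P_ij and weights w_ij is evaluated as S(u,v) = Σ_i Σ_j w_ij P_ij B_i^n(u) B_j^m(v) / Σ_i Σ_j w_ij B_i^n(u) B_j^m(v). The sketch below evaluates such a surface and the least-squares fitting error that a reconstruction method would minimize; the control net, weights and degrees are illustrative assumptions, and the memetic optimizer itself is omitted.

        # Evaluation of a rational Bezier surface (bivariate Bernstein basis) and the
        # least-squares fitting error a reconstruction method would minimize.
        # Control net, weights and degrees here are illustrative assumptions.
        import numpy as np
        from scipy.special import comb

        def bernstein(n, i, t):
            return comb(n, i) * t**i * (1.0 - t)**(n - i)

        def rational_bezier_surface(u, v, ctrl, w):
            """ctrl: (n+1, m+1, 3) control points, w: (n+1, m+1) weights."""
            n, m = ctrl.shape[0] - 1, ctrl.shape[1] - 1
            bu = np.array([bernstein(n, i, u) for i in range(n + 1)])
            bv = np.array([bernstein(m, j, v) for j in range(m + 1)])
            num = np.einsum("i,j,ij,ijk->k", bu, bv, w, ctrl)   # weighted control points
            den = np.einsum("i,j,ij->", bu, bv, w)              # rational denominator
            return num / den

        def fitting_error(data, params, ctrl, w):
            """Sum of squared distances between data points and surface points S(u_k, v_k)."""
            return sum(np.sum((rational_bezier_surface(u, v, ctrl, w) - p) ** 2)
                       for (u, v), p in zip(params, data))

        if __name__ == "__main__":
            ctrl = np.random.default_rng(0).random((3, 3, 3))   # bi-quadratic control net
            w = np.ones((3, 3))                                 # unit weights -> polynomial Bezier
            print(rational_bezier_surface(0.3, 0.7, ctrl, w))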

    Was R < 1 before the English lockdowns? On modelling mechanistic detail, causality and inference about Covid-19

    Get PDF
    Detail is a double-edged sword in epidemiological modelling. The inclusion of mechanistic detail in models of highly complex systems has the potential to increase realism, but it also increases the number of modelling assumptions, which become harder to check as their possible interactions multiply. In a major study of the Covid-19 epidemic in England, Knock et al. (2020) fit an age-structured SEIR model with added health service compartments to data on deaths, hospitalisation and test results from Covid-19 in seven English regions for the period March to December 2020. The simplest version of the model has 684 states per region. One main conclusion is that only full lockdowns brought the pathogen reproduction number, R, below one, with R ≫ 1 in all regions on the eve of the March 2020 lockdown. We critically evaluate the Knock et al. epidemiological model, and the semi-causal conclusions made using it, based on an independent reimplementation of the model designed to allow relaxation of some of its strong assumptions. In particular, Knock et al. model the effect on transmission of both non-pharmaceutical interventions and other effects, such as weather, using a piecewise linear function, b(t), with 12 breakpoints at selected government announcement or intervention dates. We replace this representation by a smoothing spline with time-varying smoothness, thereby allowing the form of b(t) to be substantially more data driven, and we check that the corresponding smoothness assumption is not driving our results. We also reset the mean incubation time and the time from first symptoms to hospitalisation used in the model to values implied by the papers cited by Knock et al. as the source of these quantities. We conclude that there is no sound basis for using the Knock et al. model and their analysis to make counterfactual statements about the number of deaths that would have occurred with different lockdown timings. However, if fits of this epidemiological model structure are viewed as a reasonable basis for inference about the time course of incidence and R, then, without very strong modelling assumptions, the pathogen reproduction number was probably below one, and incidence in substantial decline, some days before either of the first two English national lockdowns. This result coincides with that obtained by more direct attempts to reconstruct incidence. Of course, it does not imply that lockdowns had no effect, but it does suggest that other non-pharmaceutical interventions (NPIs) may have been much more effective than Knock et al. imply, and that full lockdowns were probably not the cause of R dropping below one.
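
    The modelling choice at issue, a piecewise-linear transmission modifier b(t) with fixed breakpoints versus a smoothing spline whose shape is driven by the data, can be illustrated with the toy sketch below. It is only a schematic, using synthetic data, arbitrary breakpoint positions and a fixed smoothing level rather than the time-varying smoothness used in the reanalysis.

        # Toy contrast between a piecewise-linear b(t) with fixed breakpoints and a
        # smoothing-spline b(t) fitted to noisy observations (synthetic data; the
        # breakpoint dates and smoothing level are illustrative assumptions).
        import numpy as np
        from scipy.interpolate import UnivariateSpline

        t = np.linspace(0, 300, 301)                       # days
        true_b = 0.3 + 0.2 * np.cos(t / 40.0)              # hypothetical "true" transmission modifier
        obs = true_b + np.random.default_rng(0).normal(scale=0.05, size=t.size)

        # Piecewise-linear representation: values interpolated between chosen breakpoints.
        breaks = np.array([0, 50, 100, 150, 200, 250, 300], dtype=float)
        b_pw = np.interp(t, breaks, np.interp(breaks, t, obs))

        # Smoothing-spline representation: the data choose the shape, s controls smoothness.
        b_spline = UnivariateSpline(t, obs, s=0.5)(t)

        print("piecewise-linear RMSE:", np.sqrt(np.mean((b_pw - true_b) ** 2)))
        print("smoothing-spline RMSE:", np.sqrt(np.mean((b_spline - true_b) ** 2)))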

    Gradient boosting in automatic machine learning: feature selection and hyperparameter optimization

    Get PDF
    The goal of automatic machine learning (AutoML) is to automate all aspects of model selection in (supervised) predictive modeling. This thesis deals with gradient boosting techniques in the context of AutoML, with a focus on gradient tree boosting and component-wise gradient boosting. Both techniques share a common methodology, but their goals are quite different. While gradient tree boosting is widely used in machine learning as a powerful prediction algorithm, the strength of component-wise gradient boosting lies in feature selection and the modeling of high-dimensional data. Extensions of component-wise gradient boosting to multidimensional prediction functions are considered as well. The challenge of hyperparameter optimization for these algorithms is discussed, with a focus on Bayesian optimization and efficient early-stopping strategies. The difficulty of optimizing these algorithms is demonstrated by a large-scale random search over the hyperparameters of several machine learning algorithms, which shows the critical influence of hyperparameter configurations on model quality and can serve as the foundation for new AutoML and meta-learning approaches. Furthermore, advanced feature selection strategies are summarized and a new method based on shadow features is introduced. Finally, an AutoML approach based on these results and on best practices for feature selection and hyperparameter optimization is proposed, with the goal of simplifying and stabilizing AutoML while maintaining high prediction accuracy. This approach is compared to AutoML methods that use much more complex search spaces and ensembling techniques. Four software packages for the statistical programming language R have been newly developed or extended as part of this thesis: mlrMBO, a general framework for Bayesian optimization; autoxgboost, an automatic machine learning framework that heavily utilizes gradient tree boosting; compboost, a modular framework for component-wise boosting written in C++; and gamboostLSS, a framework for component-wise boosting of generalized additive models for location, scale and shape.
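
    As a concrete illustration of why component-wise boosting performs intrinsic feature selection, the sketch below implements least-squares (L2) component-wise boosting with one-variable linear base learners: in each iteration every feature is fitted to the current residuals and only the single best one is updated. It is a generic, textbook-style sketch under assumed settings, not code from the packages listed above.

        # Component-wise L2 boosting with one-variable linear base learners: in every
        # iteration, each feature is fitted to the current residuals and only the best
        # one is updated, which is why unused features keep a zero coefficient.
        import numpy as np

        def componentwise_l2_boost(X, y, n_iter=200, learning_rate=0.1):
            n, p = X.shape
            coef, intercept = np.zeros(p), y.mean()
            resid = y - intercept
            for _ in range(n_iter):
                best_j, best_beta, best_sse = 0, 0.0, np.inf
                for j in range(p):
                    xj = X[:, j]
                    beta = xj @ resid / (xj @ xj)           # least-squares fit of one feature
                    sse = np.sum((resid - beta * xj) ** 2)
                    if sse < best_sse:
                        best_j, best_beta, best_sse = j, beta, sse
                coef[best_j] += learning_rate * best_beta    # update only the winning feature
                resid -= learning_rate * best_beta * X[:, best_j]
            return intercept, coef

        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            X = rng.normal(size=(200, 10))
            y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)
            print(componentwise_l2_boost(X, y)[1].round(2))  # mostly zeros except features 0 and 3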

    Analysis of binding heterogeneity

    Get PDF
    Binding heterogeneity, due to different functional groups on a reactive surface, plays an important role in the binding of small molecules or ions to many adsorbents, both in industrial processes and in natural environments. The binding heterogeneity is described by a distribution of affinity constants, since the different functional groups have different affinities for the adsorbing species. Three approaches are discussed to obtain distribution functions on the basis of adsorption isotherms: the Local Isotherm Approximation (LIA), the Affinity Spectrum (AS) and the Differential Equilibrium Function (DEF). The methods are compared both on the basis of their derivation and on their ability to reproduce (known) distribution functions. All methods discussed need derivatives of the binding function, which are hard to obtain from experimental data. In order to apply the methods to experimental data, a smoothing spline routine was adapted for the present problem. The methodology is applied to proton and copper binding to fulvic acids. Analogous to the heterogeneity analysis for binding under equilibrium conditions, a procedure was derived to determine first-order rate constant distributions. The newly developed method is called the LOcal Decay function Approximation (LODA). Here, too, an adapted smoothing spline routine is used to apply the method to experimental data. The method is illustrated with copper dissociation data from estuarine humic material. Finally, it is shown how, on the basis of the obtained distribution function, a suitable model can be chosen for the description and prediction of binding or dissociation data.
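
    Because the heterogeneity methods above all require derivatives of the experimental binding function, which are unstable when computed by finite differences on noisy data, a smoothing spline is a natural tool: it fits the isotherm and supplies analytic derivatives. The sketch below shows this step on synthetic Langmuir-type data; the isotherm, noise level and smoothing factor are illustrative assumptions, not the adapted routine described in this work.

        # Estimating the derivative of a noisy binding isotherm with a smoothing spline,
        # the step required by LIA/AS/DEF-type heterogeneity analyses (synthetic data;
        # isotherm, noise level and smoothing factor are illustrative assumptions).
        import numpy as np
        from scipy.interpolate import UnivariateSpline

        rng = np.random.default_rng(0)
        log_a = np.linspace(-4, 2, 120)                     # log10 of free activity
        theta_true = 1.0 / (1.0 + 10.0 ** (-(log_a + 1)))   # Langmuir-type isotherm, log K = 1
        theta_obs = theta_true + rng.normal(scale=0.01, size=log_a.size)

        spline = UnivariateSpline(log_a, theta_obs, k=4, s=len(log_a) * 0.01 ** 2)
        dtheta = spline.derivative(1)(log_a)                # analytic first derivative of the fit

        # In a condensation-type local isotherm approximation, this derivative gives a
        # first estimate of the affinity distribution as a function of log K = -log a.
        print(log_a[np.argmax(dtheta)])                     # peak near -1, i.e. log K near 1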

    Stochastic Dynamic Models for Functional Data.

    Full text link
    Functional data arise frequently in many fields of biomedical research as sequential observations over time. The observations are generated by an unknown dynamic mechanism. This dynamic process has an unspecified mean function, and the observations can be considered as arising from this mean function plus noise. In this dissertation, we treat this unknown function as a realization or sample path of a stochastic process, using a stochastic dynamic model (SDM). This enables us to study the dynamics of the underlying process, including how the stochastic process and its derivatives evolve over time, both within the observation period (through estimation and inference) and afterwards (through forecasting). We first introduce a new modeling strategy to estimate a smooth function for time series functional data. The proposed models and methods are illustrated on prostate-specific antigen (PSA) data, where we use a Gaussian process to model the rate function of PSA and achieve more precise forecasting. We then extend the models to multi-subject functional data and consider the effect of covariates on the rate functions. We finally propose a time-varying stochastic position model, which can approximate breakpoints in the function. The discretized model is applied to array comparative genomic hybridization (CGH) data analysis. The estimation and inference are conducted using MCMC algorithms with Euler approximation and data augmentation. Simulations and real data analysis demonstrate that our methods outperform several alternative approaches.
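
    Estimation in models of this kind rests on an Euler discretization of the underlying continuous-time process. As a rough illustration of that building block only (not the dissertation's model), the sketch below simulates a simple system by the Euler-Maruyama scheme, in which a latent Ornstein-Uhlenbeck rate process drives a smooth position process observed with noise; all parameter values are assumptions.

        # Euler-Maruyama simulation of a simple stochastic dynamic model in which a
        # latent rate process (Ornstein-Uhlenbeck) drives a smooth position process.
        # All parameter values are illustrative assumptions.
        import numpy as np

        def simulate_sdm(T=10.0, dt=0.01, theta=1.0, sigma=0.5, obs_sd=0.05, seed=0):
            rng = np.random.default_rng(seed)
            n = int(T / dt)
            x = np.zeros(n)          # position ("mean function" analogue)
            v = np.zeros(n)          # rate (derivative) process
            for k in range(1, n):
                dw = rng.normal(scale=np.sqrt(dt))
                v[k] = v[k - 1] - theta * v[k - 1] * dt + sigma * dw   # OU rate dynamics
                x[k] = x[k - 1] + v[k - 1] * dt                        # position integrates the rate
            t = np.arange(n) * dt
            y = x + rng.normal(scale=obs_sd, size=n)                   # noisy observations of the path
            return t, x, v, y

        if __name__ == "__main__":
            t, x, v, y = simulate_sdm()
            print(x[-1], v[-1])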

    IEA ECES Annex 31 Final Report - Energy Storage with Energy Efficient Buildings and Districts: Optimization and Automation

    Get PDF
    At present, the energy requirements of buildings are mainly met from non-renewable sources, and the contribution of renewable sources is still at an early stage. Meeting peak energy demand from non-renewable energy sources is highly expensive for utility companies and critically affects the environment through GHG emissions. In addition, renewable energy sources are inherently intermittent. Therefore, to make both renewable and non-renewable energy sources more efficient in building/district applications, they should be integrated with energy storage systems. Nevertheless, determining the optimal operation and integration of energy storage with buildings/districts is not straightforward. The real strength of integrating energy storage technologies with buildings/districts is stalled by the high computational demand of, or even lack of, suitable tools and optimization techniques. Annex 31 aims to close this gap by critically addressing the challenges of integrating energy storage systems in buildings/districts from the perspective of design and the development of simplified modeling tools and optimization techniques.

    Image and Shape Analysis for Spatiotemporal Data

    Get PDF
    In analyzing brain development or identifying disease, it is important to understand anatomical age-related changes and shape differences. Data for these studies are frequently spatiotemporal and collected from normal and/or abnormal subjects. However, images and shapes over time often have complex structures and are best treated as elements of non-Euclidean spaces. This dissertation tackles the problems of uncovering time-varying changes and statistical group differences in image or shape time-series. There are three major contributions: 1) a framework of parametric regression models on manifolds to capture time-varying changes, including a metamorphic geodesic regression approach for image time-series as well as standard geodesic regression, time-warped geodesic regression, and cubic spline regression on the Grassmann manifold; 2) a spatiotemporal statistical atlas approach, which augments a commonly used atlas such as the median with measures of data variance via a weighted functional boxplot; 3) hypothesis testing for shape analysis to detect group differences between populations. The proposed method for cross-sectional data uses shape ordering and hence does not require dense shape correspondences or strong distributional assumptions on the data. For longitudinal data, hypothesis testing is performed on shape trajectories estimated from individual subjects. Applications of these methods include 1) capturing brain development and degeneration; 2) revealing growth patterns in pediatric upper airways and the scoring of airway abnormalities; 3) detecting group differences in longitudinal corpus callosum shapes of subjects with dementia versus normal controls.
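
    A basic building block of the manifold regression methods listed above is replacing Euclidean averages and straight lines with their geodesic counterparts. As a small, hedged illustration (not the dissertation's code), the sketch below computes a Fréchet (geodesic) mean of points on the unit sphere by gradient descent using the sphere's exponential and logarithm maps; the step size, tolerance and data are assumptions.

        # Frechet (geodesic) mean on the unit sphere via gradient descent with the
        # sphere's exponential and logarithm maps -- the manifold analogue of the
        # Euclidean mean. Step size, tolerance and the sample data are assumptions.
        import numpy as np

        def exp_map(p, v):
            nv = np.linalg.norm(v)
            if nv < 1e-12:
                return p
            return np.cos(nv) * p + np.sin(nv) * v / nv

        def log_map(p, q):
            d = np.clip(p @ q, -1.0, 1.0)
            theta = np.arccos(d)
            if theta < 1e-12:
                return np.zeros_like(p)
            return theta * (q - d * p) / np.linalg.norm(q - d * p)

        def frechet_mean(points, step=0.5, tol=1e-9, max_iter=1000):
            mu = points[0] / np.linalg.norm(points[0])
            for _ in range(max_iter):
                grad = np.mean([log_map(mu, q) for q in points], axis=0)  # Riemannian gradient direction
                mu = exp_map(mu, step * grad)
                if np.linalg.norm(grad) < tol:
                    break
            return mu

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            pts = rng.normal(size=(20, 3))
            pts /= np.linalg.norm(pts, axis=1, keepdims=True)
            pts = np.array([p if p[2] > 0 else -p for p in pts])  # keep points in one hemisphere
            print(frechet_mean(pts))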

    The application of three-dimensional mass-spring structures in the real-time simulation of sheet materials for computer generated imagery

    Get PDF
    Despite the resources devoted to computer graphics technology over the last 40 years, there is still a need to increase the realism with which flexible materials are simulated. However, methods reported to date are restricted in their application by their use of two-dimensional structures and implicit integration methods, which lend themselves to modelling cloth-like sheets but not stiffer, thicker materials in which bending moments play a significant role. This thesis presents a real-time, computationally efficient environment for simulations of sheet materials. The approach described differs from other techniques principally through its novel use of multilayer sheet structures. In addition to more accurately modelling bending moment effects, it also allows the effects of increased temperature within the environment to be simulated. Limitations of this approach include the increased difficulty of calibrating a realistic and stable simulation compared to implicit-based methods. A series of experiments is conducted to establish the effectiveness of the technique, evaluating the suitability of different integration methods, sheet structures, and simulation parameters, before conducting a Human Computer Interaction (HCI) based evaluation to establish the effectiveness with which the technique can produce credible simulations. These results are also compared against a system that utilises an established method for sheet simulation and a hybrid solution that combines the use of 3D (i.e. multilayer) lattice structures with the recognised sheet simulation approach. The results suggest that the use of a three-dimensional structure does provide a level of enhanced realism when simulating stiff laminar materials, although the best overall results were achieved through the use of the hybrid model.
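
    The core of any such simulation is a mass-spring lattice advanced by an explicit integrator. The sketch below shows a minimal semi-implicit (symplectic) Euler step for a chain of point masses connected by linear springs; the constants and the one-dimensional chain layout are assumptions for illustration, not the multilayer sheet lattice developed in the thesis.

        # Minimal mass-spring chain advanced with semi-implicit (symplectic) Euler.
        # Spring constant, damping and the 1-D chain layout are illustrative
        # assumptions, not the multilayer sheet lattice described in the thesis.
        import numpy as np

        def step(pos, vel, rest_len, k=200.0, damping=0.1, mass=0.05, dt=1e-3, gravity=-9.81):
            force = np.zeros_like(pos)
            force[:, 1] += mass * gravity                        # gravity on every particle
            for i in range(len(pos) - 1):                        # linear springs between neighbours
                d = pos[i + 1] - pos[i]
                length = np.linalg.norm(d)
                f = k * (length - rest_len) * d / length         # Hooke's law along the spring
                force[i] += f
                force[i + 1] -= f
            force -= damping * vel                               # simple viscous damping
            vel = vel + dt * force / mass
            vel[0] = 0.0                                         # pin the first particle
            pos = pos + dt * vel                                 # semi-implicit: use updated velocity
            return pos, vel

        if __name__ == "__main__":
            n = 10
            pos = np.column_stack([np.linspace(0.0, 1.0, n), np.zeros(n)])
            vel = np.zeros_like(pos)
            for _ in range(2000):
                pos, vel = step(pos, vel, rest_len=1.0 / (n - 1))
            print(pos[-1])   # after a couple of seconds the chain has largely settled below the pin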

    Fabrication of Novel In-Situ Remediation Tools for Unconventional Oil Contamination

    Get PDF
    The aftermath of unconventional oil (UO) accidents highlights the lack of preparedness of governments to deal with UO emergencies. Because bioremediation is considered a slow process, physicochemical treatment processes are necessary to remove contaminants and constrain the spread of oil. In the preliminary phase of the study, adsorption bed systems packed with modified dolomite were applied as a pre-treatment for bioremediation systems. The high affinity of oil molecules for the active sites, due to the hydrophobic nature of the dolomite surface, together with the low solubility of oil in water, resulted in rapid adsorption of oil on the external surface of the modified dolomite. UO-contaminated sites contain high concentrations of polycyclic aromatic hydrocarbons (PAHs). Thus, the final phase of the study focused on finding an enzyme mixture for the biodegradation of PAH-contaminated water and soil. To this end, indigenous bacteria were screened, the enzymes involved were identified, and biodegradation tests were carried out. Several combinations of the pre-selected strains were used to create the most promising consortium for enzyme production. To mimic in situ application of the enzyme mixture, bioremediation of pyrene-contaminated soil was carried out in soil column tests. The average pyrene removal after 6 weeks indicated that the enzyme cocktail concentration is appropriate for enzymatic bioremediation in the soil column system. A bioinspired device was fabricated as a sustainable remediation method. Our results showed that after 200 seconds of circulating the enzyme solution, 100% of the anthracene in 1.5 L of a 4.6 mg/L solution was removed from the beaker side. In addition to the circulation of PAH-degrading enzymes in hollow fiber lumens, aliphatic-degrading enzymes confined in multilayer nanofibrous membrane systems play an important role in the removal of oily compounds. Based on our studies, modified polyimide aerogels were suitable supports for enzyme immobilization. The degradation tests clearly showed that the immobilized enzymes were able to biodegrade the model substrate in contaminated water. Our results also confirmed that immobilization of the enzyme cocktail enhanced its storage stability, retaining more than 45% of its residual activity at 15 ± 1 °C for 16 days. This study could set a guideline for the enzymatic bioremediation of aromatic pollutants, especially polycyclic aromatic hydrocarbons, in highly contaminated soil and water bodies.