1,511 research outputs found

    Adaptive spline fitting with particle swarm optimization

    In fitting data with a spline, finding the optimal placement of knots can significantly improve the quality of the fit. However, the challenging high-dimensional and non-convex optimization problem associated with completely free knot placement has been a major roadblock in using this approach. We present a method that uses particle swarm optimization (PSO) combined with model selection to address this challenge. The problem of overfitting due to knot clustering that accompanies free knot placement is mitigated in this method by explicit regularization, resulting in a significantly improved performance on highly noisy data. The principal design choices available in the method are delineated and a statistically rigorous study of their effect on performance is carried out using simulated data and a wide variety of benchmark functions. Our results demonstrate that PSO-based free knot placement leads to a viable and flexible adaptive spline fitting approach that allows the fitting of both smooth and non-smooth functions.
    Comment: Accepted version; Typo corrected in equation 3; Minor changes to tex
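The idea of PSO-driven free knot placement can be illustrated with a small sketch. It is not the authors' implementation: it assumes a piecewise-linear spline in a truncated power basis, a plain global-best PSO, and a small ridge term standing in for the explicit regularization the abstract mentions. All names (`solve`, `fit_cost`, `pso`) and parameter values are hypothetical, and model selection over the number of knots is left out.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def basis(x, knots):
    """Truncated power basis of a linear spline: 1, x, (x - t)_+ for each knot t."""
    return [1.0, x] + [max(0.0, x - t) for t in knots]

def fit_cost(knots, xs, ys, ridge=1e-6):
    """Ridge-penalized residual sum of squares for the given knot positions."""
    rows = [basis(x, knots) for x in xs]
    m = len(rows[0])
    # Normal equations; the ridge term keeps them solvable when knots coincide.
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    coef = solve(A, b)
    rss = sum((sum(c * v for c, v in zip(coef, r)) - y) ** 2
              for r, y in zip(rows, ys))
    return rss + ridge * sum(c * c for c in coef)

def pso(cost, dim, lo, hi, n_particles=20, n_iter=60,
        w=0.72, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO minimizing `cost` over the box [lo, hi]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Clamp each knot to the data interval.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

# Toy data: a non-smooth function with a single kink at x = 0.3.
xs = [i / 50 for i in range(51)]
ys = [abs(x - 0.3) for x in xs]
best_knots, best_cost = pso(lambda k: fit_cost(k, xs, ys), dim=1, lo=0.0, hi=1.0)
```

With one free knot, the swarm should settle close to the kink; using more knots only requires a larger `dim`, at the price of the harder search problem the abstract describes.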

    COMPARATIVE ANALYSIS OF PARTICLE SWARM OPTIMIZATION ALGORITHMS FOR TEXT FEATURE SELECTION

    With the rapid growth of the Internet, more and more natural language text documents are available in electronic format, making automated text categorization a must in most fields. Due to the high dimensionality of text categorization tasks, feature selection is needed before document classification. There are two main kinds of feature selection approaches: the filter approach and the wrapper approach. The wrapper approach requires a search algorithm that proposes feature subsets and an evaluation algorithm that assesses the fitness of each selected subset. In this work, I compare two wrapper approaches that both use Particle Swarm Optimization (PSO) as the search algorithm. The first is a PSO-based K-Nearest Neighbors (KNN) algorithm, and the second is a PSO-based Rocchio algorithm. Three datasets are used in this study. The results show that BPSO-KNN classifies slightly better than BPSO-Rocchio, while BPSO-Rocchio has a far shorter computation time than BPSO-KNN.
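The wrapper scheme described above (a search algorithm proposing feature subsets, an evaluation algorithm scoring them) can be sketched as follows. This is an illustration under assumptions, not the thesis's code: it uses the standard binary PSO with a sigmoid transfer function, scores subsets by leave-one-out 1-nearest-neighbor accuracy, and runs on synthetic data in which only feature 0 is informative. The function names and all parameter values are hypothetical.

```python
import math
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of 1-NN using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def binary_pso(fitness, n_bits, n_particles=10, n_iter=20,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: velocities are real-valued, positions are bit vectors."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Sigmoid transfer: the velocity sets the probability of a 1 bit.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Synthetic data: feature 0 separates the classes, features 1-3 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[10.0 * label] + [data_rng.random() for _ in range(3)] for label in y]

best_mask, best_fit = binary_pso(lambda m: loo_1nn_accuracy(X, y, m), n_bits=4)
```

Swapping `loo_1nn_accuracy` for a different classifier's score changes the evaluation algorithm while leaving the search untouched, which is the kind of comparison the abstract reports.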

    Methods for Shape-Constrained Kernel Density Estimation

    Nonparametric density estimators are used to estimate an unknown probability density while making minimal assumptions about its functional form. Although the low reliance of nonparametric estimators on modelling assumptions is a benefit, their performance will be improved if auxiliary information about the density's shape is incorporated into the estimate. Auxiliary information can take the form of shape constraints, such as unimodality or symmetry, that the estimate must satisfy. Finding the constrained estimate is usually a difficult optimization problem, however, and a consistent framework for finding estimates across a variety of problems is lacking. It is proposed to find shape-constrained density estimates by starting with a pilot estimate obtained by standard methods, and subsequently adjusting its shape until the constraints are satisfied. This strategy is part of a general approach, in which a constrained estimation problem is defined by an estimator, a method of shape adjustment, a constraint, and an objective function. Optimization methods are developed to suit this approach, with a focus on kernel density estimation under a variety of constraints. Two methods of shape adjustment are examined in detail. The first is data sharpening, for which two optimization algorithms are proposed: a greedy algorithm that runs quickly but can handle a limited set of constraints, and a particle swarm algorithm that is suitable for a wider range of problems. The second is the method of adjustment curves, for which it is often possible to use quadratic programming to find optimal estimates. The methods presented here can be used for univariate or higher-dimensional kernel density estimation with shape constraints. They can also be extended to other estimators, in both the density estimation and regression settings. As such they constitute a step toward a truly general optimizer, that can be used on arbitrary combinations of estimator and constraint
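The two ingredients that the approach starts from, a pilot estimate and a shape constraint to test it against, can be sketched briefly. The example assumes a univariate Gaussian kernel density estimate and checks unimodality by counting local maxima on a grid. It does not implement data sharpening or adjustment curves, and the names `gaussian_kde` and `count_modes` are hypothetical.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return the Gaussian kernel density estimate of `data` as a function."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
                          for xi in data)

    return density

def count_modes(density, lo, hi, n_grid=221):
    """Count local maxima of `density` on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (n_grid - 1)
    vals = [density(lo + step * i) for i in range(n_grid)]
    # The strict/non-strict pair counts a flat two-point peak only once.
    return sum(1 for i in range(1, n_grid - 1)
               if vals[i] > vals[i - 1] and vals[i] >= vals[i + 1])

# Two well-separated clusters: the pilot estimate violates unimodality
# at a small bandwidth and satisfies it at a large one.
sample = [-0.1, 0.0, 0.1, 4.9, 5.0, 5.1]
modes_narrow = count_modes(gaussian_kde(sample, 0.5), -3.0, 8.0)
modes_wide = count_modes(gaussian_kde(sample, 5.0), -3.0, 8.0)
```

A shape-adjustment method would take over where this sketch stops, modifying the estimate until a check such as `count_modes(...) == 1` passes while keeping the objective function small.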

    A comparative analysis of breast cancer detection and diagnosis using data visualization and machine learning applications

    In the developing world, cancer death is one of the major problems for humankind. Even though there are many ways to prevent it before it happens, some cancer types still do not have any treatment. One of the most common cancer types is breast cancer, and early, accurate diagnosis is among the most important factors in its treatment. In the literature, there are many studies about predicting the type of breast tumors. In this research paper, data about breast cancer tumors from Dr. William H. Wolberg of the University of Wisconsin Hospital were used for making predictions on breast tumor types. Data visualization and machine learning techniques including logistic regression, k-nearest neighbors, support vector machine, naïve Bayes, decision tree, random forest, and rotation forest were applied to this dataset. R, Minitab, and Python were used to apply these machine learning and visualization techniques. The paper aimed to make a comparative analysis using data visualization and machine learning applications for breast cancer detection and diagnosis. Diagnostic performances of applications were comparable for detecting breast cancers. Data visualization and machine learning techniques can provide significant benefits and impact cancer detection in the decision-making process. In this paper, different machine learning and data mining techniques for the detection of breast cancer were proposed. Results obtained with the logistic regression model with all features included showed the highest classification accuracy (98.1%), and the proposed approach revealed the enhancement in accuracy performances. These results indicated the potential to open new opportunities in the detection of breast cancer. No sponso

    Towards Visualization of Discrete Optimization Problems and Search Algorithms

    Discrete optimization deals with the identification of combinations or permutations of elements that are optimal with regard to a specific quantitative criterion. Applications arise from problems in economics, manufacturing, engineering, mathematics, and computer science. Among them are machine learning, the scheduling of production processes, and the layout of integrated circuits. Typically, discrete optimization problems are NP-hard. The investigation of efficient heuristic search algorithms is therefore highly relevant to finding good solutions for medium- and large-sized problem instances at all. The development of such algorithms is complicated because the properties of problem instances are often hard to identify due to the size and complexity of the instances. Likewise, the analysis and evaluation of given algorithms is challenging because the search behavior of an algorithm is hard to characterize, especially in the case of emergent behavior as investigated in swarm intelligence research. Visualization aims to take advantage of human vision for data processing. The brain possesses tremendous capabilities to analyze optical stimuli from the visual nerves, perceive shapes and patterns, assign meaning to them, and thus facilitate an intuitive understanding of what is seen. In particular, this can be used to generate hypotheses about complex data by representing them in a well-designed depiction and making them accessible to the visual system of the viewer. So far, visualization has seen little use in support of discrete optimization research.
This thesis is meant as a starting point for an increased application of visualization throughout the process of developing discrete search heuristics. To this end, we discuss the central questions that arise in the development of heuristics as well as the resulting requirements on visualization systems. Possible directions of visualization research that yield a specific benefit for optimization research are described. Based on this, three visualization systems and one analysis method are presented. These address three important tasks of algorithm designers. First, a system for the fine-grained comparison of algorithms is introduced. Based on the intermediate results of algorithm runs on a given problem instance, the search process is visualized. The focus is on the progress of the solution quality over time, while the algorithm expert can augment the depiction with additional domain knowledge and classifications of individual solutions. Second, a system for the analysis of search landscapes is presented. Based on paths and distances in the landscape, a map of the problem instance is drawn that makes structural properties intuitively graspable. The second part of this thesis focuses on the topological analysis of search landscapes, based on barriers. A visualization system is presented that shows a topologically equivalent height profile of the search landscape. The system also makes it possible to observe the search process of an algorithm directly within the search landscape, which is of particular interest when researching swarm intelligence algorithms. The computation of the topological structure requires a complete enumeration of all solutions, which is not possible in the general case due to the size of the search landscapes. To enable an application to larger problem instances, we introduce a method to approximate the topological structure.
The method allows for an incremental refinement of the topological approximation that can be controlled using a heuristic. Thus, the domain expert can bring her knowledge and hypotheses about the problem instance into the analysis, so that an approximation of good quality is achieved with reasonable computational effort.