567 research outputs found

    Query-driven learning for predictive analytics of data subspace cardinality

    Get PDF
    Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of data items) of multi-dimensional data subspaces defined by query selections over datasets. This is crucial for data analysts dealing with, e.g., interactive data subspace exploration, data subspace visualization, and query processing optimization. However, in many modern data systems, predictive analytics may be (i) too costly money-wise, e.g., in clouds, (ii) unreliable, e.g., in modern Big Data query engines, where accurate statistics are difficult to obtain and maintain, or (iii) infeasible, e.g., due to privacy issues. We contribute a novel, query-driven function estimation model of analyst-defined data subspace cardinality. The proposed model is highly accurate and accommodates the well-known selection query types: multi-dimensional range queries and distance-based nearest-neighbor (radius) queries. Our function estimation model (i) quantizes the vectorial query space by learning the analysts' access patterns over a data space, (ii) associates query vectors with the cardinalities of the corresponding analyst-defined data subspaces, (iii) abstracts and employs query vectorial similarity to predict the cardinality of an unseen/unexplored data subspace, and (iv) identifies and adapts to possible changes of the query subspaces based on the theory of optimal stopping. The proposed model is decentralized, facilitating the scaling-out of such predictive analytics queries. The research significance of the model lies in that (i) it is an attractive solution when data-driven statistical techniques are undesirable or infeasible, (ii) it offers a scale-out, decentralized training solution, (iii) it is applicable to different selection query types, and (iv) it offers performance superior to that of data-driven approaches.
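    As a rough illustration of steps (i)-(iii), the sketch below (hypothetical names, not the paper's code) quantizes observed queries into a few prototypes, associates each prototype with a running mean cardinality, and predicts the cardinality of an unseen query by similarity-weighted averaging:

```python
import math

class QueryDrivenCardinalityEstimator:
    """Illustrative sketch of query-driven cardinality estimation:
    quantize the query space into prototypes and predict for unseen
    queries by similarity-weighted averaging. Not the paper's model."""

    def __init__(self, n_prototypes=4, lr=0.1):
        self.n_prototypes = n_prototypes
        self.lr = lr
        self.prototypes = []  # list of [query_vector, mean_cardinality]

    def _dist(self, a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def train(self, query, cardinality):
        # (i)/(ii): quantize the query space, associate cardinalities
        if len(self.prototypes) < self.n_prototypes:
            self.prototypes.append([list(query), float(cardinality)])
            return
        # move the closest prototype towards the observed query/cardinality
        j = min(range(len(self.prototypes)),
                key=lambda i: self._dist(self.prototypes[i][0], query))
        proto, card = self.prototypes[j]
        self.prototypes[j][0] = [p + self.lr * (q - p)
                                 for p, q in zip(proto, query)]
        self.prototypes[j][1] = card + self.lr * (cardinality - card)

    def predict(self, query):
        # (iii): similarity-weighted prediction for an unseen query
        weights = [1.0 / (self._dist(p, query) + 1e-9)
                   for p, _ in self.prototypes]
        total = sum(weights)
        return sum(w * c for w, (_, c) in zip(weights, self.prototypes)) / total
```

    Queries that land near a region the analysts have explored before inherit that region's observed cardinality, which is the essence of the query-driven (rather than data-driven) approach.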

    Similarity-based methods: a general framework for classification, approximation and association

    Get PDF
    Similarity-based methods (SBM) are a generalization of the minimal-distance (MD) methods which form the basis of several machine learning and pattern recognition methods. Investigation of similarity leads to a fruitful framework in which many classification, approximation and association methods are accommodated. The probability p(C|X;M) of assigning class C to a vector X, given a classification model M, depends on the adaptive parameters and procedures used in constructing the model. A systematic overview of the choices available for model building is given and numerous improvements are suggested. Similarity-based methods have natural neural-network realizations: models such as Radial Basis Functions (RBF) and Multilayer Perceptrons (MLPs) are included in this framework as special cases. SBM may also include several different submodels and a procedure to combine their results. Many new versions of similarity-based methods are derived from this framework. A search in the space of all methods belonging to the SBM framework finds the particular combination of parameterizations and procedures most appropriate for the given data; no single classification method can beat this approach. A preliminary implementation of SBM elements tested on real-world datasets gave very good results.
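    A minimal instance of such a model, assuming a Gaussian similarity kernel (the function name and signature are illustrative, not from the paper), computes p(C|X;M) by summing similarities to stored reference vectors per class and normalizing:

```python
import math

def sbm_posterior(x, references, sigma=1.0):
    """p(C|X; M) for a minimal similarity-based model: sum a Gaussian
    similarity kernel over each class's stored reference vectors, then
    normalize over classes. With references placed at class centers
    this reduces to an RBF-style classifier."""
    scores = {}
    for xi, ci in references:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        scores[ci] = scores.get(ci, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

    Varying the kernel, the reference selection, and the normalization yields the different parameterizations the framework searches over.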

    Robust Classification with Convolutional Prototype Learning

    Full text link
    Convolutional neural networks (CNNs) have been widely used for image classification. Despite their high accuracy, CNNs have been shown to be easily fooled by adversarial examples, indicating that they are not robust enough for pattern classification. In this paper, we argue that this lack of robustness is caused by the softmax layer, which is a purely discriminative model based on a closed-world assumption (i.e., a fixed number of categories). To improve robustness, we propose a novel learning framework called convolutional prototype learning (CPL). The advantage of using prototypes is that they can handle the open-world recognition problem well and therefore improve robustness. Under the CPL framework, we design multiple classification criteria to train the network. Moreover, a prototype loss (PL) is proposed as a regularizer to improve the intra-class compactness of the feature representation; it can be viewed as a generative model based on a Gaussian assumption for the different classes. Experiments on several datasets demonstrate that CPL achieves comparable or even better results than a traditional CNN and, from the robustness perspective, shows great advantages in both rejection and incremental category learning tasks.

    Structural Analysis Algorithms for Nanomaterials

    Get PDF

    Robust recognition and exploratory analysis of crystal structures using machine learning

    Get PDF
    In materials science, artificial-intelligence tools are driving a paradigm shift towards big-data-centric research. Large computational databases with millions of entries, as well as high-resolution experiments such as electron microscopy, contain a large and growing amount of information. To leverage this under-utilized yet very valuable data, automatic analytical methods need to be developed. Classifying the crystal structure of a material is essential for its characterization. The available data is structurally diverse but often defective and incomplete; a suitable method should therefore be robust to sources of inaccuracy while being able to treat many systems, and available methods do not fulfill both criteria at once. In this work, we introduce ARISE, a Bayesian-deep-learning-based framework that robustly classifies more than 100 structural classes without any predefined thresholds. The selection of structural classes, which can easily be extended on demand, encompasses a wide range of materials, including not only bulk but also two- and one-dimensional systems. For the local study of large, polycrystalline samples, we extend ARISE by introducing so-called strided pattern matching. Although trained on ideal structures only, ARISE correctly characterizes strongly perturbed single- and polycrystalline systems of both synthetic and experimental origin. The probabilistic nature of the Bayesian-deep-learning model yields principled uncertainty estimates, which are found to correlate with the crystalline order of metallic nanoparticles in electron-tomography experiments. Applying unsupervised learning to the internal neural-network representations reveals grain boundaries and otherwise unapparent structural regions that share easily interpretable geometrical properties.
    This work enables the previously infeasible analysis of noisy atomic structure data.
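    The principled uncertainties come from the model's Bayesian nature. A generic way to obtain such estimates from any stochastic classifier (e.g. one with Monte-Carlo dropout) is to sample repeated predictions and compute the predictive entropy; the sketch below illustrates the idea and is not ARISE's actual implementation:

```python
import math
import random

def mc_uncertainty(stochastic_predict, x, n_samples=200, seed=0):
    """Monte-Carlo estimate of the predictive distribution and its
    entropy for a stochastic classifier. High entropy flags inputs the
    model is unsure about, e.g. defective or disordered structures."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        c = stochastic_predict(x, rng)  # one stochastic forward pass
        counts[c] = counts.get(c, 0) + 1
    probs = {c: n / n_samples for c, n in counts.items()}
    entropy = -sum(p * math.log(p) for p in probs.values())
    return probs, entropy
```

    Regions where the sampled predictions disagree (high entropy) correspond to low crystalline order, which is the correlation reported for the nanoparticle tomography experiments.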

    A Survey on Compiler Autotuning using Machine Learning

    Full text link
    Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase ordering (choosing the order in which to apply them). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for compiler optimization, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, a fine-grained classification of the different approaches, and the influential papers of the field. Accepted at ACM Computing Surveys (CSUR), 2018.
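    The phase-ordering problem can be made concrete with a greedy baseline of the kind autotuners improve upon; here `cost` stands in for compile-and-measure (or a learned surrogate model), and all names are illustrative:

```python
def greedy_phase_order(passes, cost, max_len=None):
    """Greedy sketch of phase ordering: repeatedly append the pass that
    most reduces a cost metric (e.g. runtime or code size), stopping
    when no pass improves it. `cost(sequence)` abstracts compiling and
    measuring; ML-based autotuners replace this expensive oracle with
    a learned model or search the sequence space more globally."""
    order, best = [], cost([])
    limit = max_len if max_len is not None else len(passes)
    while len(order) < limit:
        cand = min(passes, key=lambda p: cost(order + [p]))
        c = cost(order + [cand])
        if c >= best:
            break  # no candidate pass improves the metric
        order, best = order + [cand], c
    return order, best
```

    Greedy search already captures why ordering matters (a pass may only pay off after another has run), while its local optima motivate the learned and search-based techniques the survey classifies.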

    Predicting Porosity and Microstructure of 3D Printed Part Using Machine Learning

    Full text link
    Additive Manufacturing (AM) is characterized as building a 3-D object one layer at a time. Due to its flexibility in design and functionality, AM is an attractive technology for the manufacturing industry. Still, the lack of consistency in quality is one of the main limitations preventing the use of this process to produce end-use products. Current additive manufacturing techniques face a significant challenge concerning the various processing parameters, including scan speed, laser power, layer thickness, etc., which leads to inconsistency in the quality of the printed products. This research therefore focuses on process monitoring and regulation, using state-of-the-art machine learning algorithms on simulation data to predict the level of porosity in a 3D-printed part and to classify the grain growth structure as equiaxed or columnar. The input parameters considered in this study that affect porosity and grain growth structure are energy density, gas atmosphere, powder particle size and shape, and overlap rate. The data for training the machine learning models were collected using ANSYS Additive Manufacturing simulations: 482 data points for porosity prediction and 12,333 data points for grain growth structure. A feed-forward ANN is trained with an error back-propagation algorithm to predict the porosity level, and classification models such as Support Vector Machines and a meta-classifier classify the microstructure as columnar or equiaxed, enabling the necessary compensations in process monitoring and control and thereby improving the quality of the final product.
    The back-propagation neural network model for porosity prediction gave an accuracy of 100%, outperforming the other models. The best results for microstructure prediction were achieved by the meta-classifier, K-Nearest Neighbor, and Random Forest classifiers, also with 100% accuracy. The findings in this study provide evidence and insight that artificial intelligence and machine learning techniques can be used in the field of Additive Manufacturing for real-time process control and monitoring, with scope for implementation on a larger scale. Master of Science in Engineering thesis, Industrial and Systems Engineering, College of Engineering & Computer Science, University of Michigan-Dearborn. http://deepblue.lib.umich.edu/bitstream/2027.42/156397/1/Priya Dhage Final Thesis.pdf
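    The feed-forward ANN with error back-propagation described above can be sketched as a minimal one-hidden-layer regression network; the architecture, hyperparameters, and variable names are illustrative, not the thesis's exact setup:

```python
import math
import random

def train_mlp(data, hidden=4, lr=0.05, epochs=3000, seed=0):
    """Minimal feed-forward ANN (one tanh hidden layer, linear output)
    trained with error back-propagation on a squared-error loss. The
    inputs stand in for process parameters (e.g. energy density,
    overlap rate) and the output for a porosity level."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xj for w, xj in zip(w1[i], x)) + b1[i])
             for i in range(hidden)]
        return h, sum(w2[i] * h[i] for i in range(hidden)) + b2

    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            err = y - t                              # dL/dy for L = 0.5*(y-t)^2
            for i in range(hidden):
                dz = err * w2[i] * (1.0 - h[i] ** 2)  # back-prop through tanh
                for j in range(n_in):
                    w1[i][j] -= lr * dz * x[j]
                b1[i] -= lr * dz
                w2[i] -= lr * err * h[i]
            b2 -= lr * err
    return lambda x: forward(x)[1]
```

    In practice a framework (and the thesis's actual feature set) would replace this hand-rolled gradient loop, but the sketch shows the error signal flowing backwards from output to hidden weights, which is the defining step of back-propagation.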