173,272 research outputs found

    Automated Knowledge Discovery using Neural Networks

    Get PDF
    The natural world is known to consistently abide by scientific laws that can be expressed concisely in mathematical terms, including differential equations. To understand the patterns that define these scientific laws, it is necessary to discover and solve these mathematical problems after making observations and collecting data on natural phenomena. While artificial neural networks are powerful black-box tools for automating tasks related to intelligence, the solutions we seek are related to the concise and interpretable form of symbolic mathematics. In this work, we focus on the idea of a symbolic function learner, or SFL. A symbolic function learner can be any algorithm that is able to produce a symbolic mathematical expression that aims to optimize a given objective function. By choosing different objective functions, the SFL can be tuned to handle different learning tasks. We present a model for an SFL that is based on neural networks and can be trained using deep learning. We then use this SFL to approach the computational task of automating discovery of scientific knowledge in three ways. We first apply our symbolic function learner as a tool for symbolic regression, a curve-fitting problem that has traditionally been approached using genetic evolution algorithms. We show that our SFL performs competitively in comparison to genetic algorithms and neural network regressors on a sample collection of regression instances. We also reframe the problem of learning differential equations as a task in symbolic regression, and use our SFL to rediscover some equations from classical physics from data. We next present a machine-learning based method for solving differential equations symbolically. When neural networks are used to solve differential equations, they usually produce solutions in the form of black-box functions that are not directly mathematically interpretable. We introduce a method for generating symbolic expressions to solve differential equations while leveraging deep learning training methods. Unlike existing methods, our system does not require learning a language model over symbolic mathematics, making it scalable, compact, and easily adaptable for a variety of tasks and configurations. The system is designed to always return a valid symbolic formula, generating a useful approximation when an exact analytic solution to a differential equation is not or cannot be found. We demonstrate through examples the way our method can be applied on a number of differential equations that are rooted in the natural sciences, often obtaining symbolic approximations that are useful or insightful. Furthermore, we show how the system can be effortlessly generalized to find symbolic solutions to other mathematical tasks, including integration and functional equations. We then introduce a novel method for discovering implicit relationships between variables in structured datasets in an unsupervised way. Rather than explicitly designating a causal relationship between input and output variables, our method finds mathematical relationships between variables without treating any variable as distinguished from any other. As a result, properties about the data itself can be discovered, rather than rules for predicting one variable from the others. We showcase examples of our method in the domain of geometry, demonstrating how we can re-discover famous geometric identities automatically from artificially generated data. In total, this thesis aims to strengthen the connection between neural networks and problems in symbolic mathematics. Our proposed SFL is the main tool that we show can be applied to a variety of tasks, including but not limited to symbolic regression. We show how using this approach to symbolic function learning paves the way for future developments in automated scientific knowledge discovery

    Self-Organizing Information Fusion and Hierarchical Knowledge Discovery: A New Framework Using Artmap Neural Networks

    Full text link
    Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors woring at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when eveidence variously suggests that and object's class is car, truck, or airplane. The methods described her address a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among classes are assumed to be unknown to the autonomated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierachical knowlege structures. The fusion system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples, but is not limited to image domain.Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016, NMA 501-03-1-2030); National Science Foundation (SBE-0354378, DGE-0221680); Office of Naval Research (N00014-01-1-0624); Department of Homeland Securit

    The Impact of Data Imputation Methodologies on Knowledge Discovery

    Get PDF
    The purpose of this research is to investigate the impact of Data Imputation Methodologies that are employed when a specific Data Mining algorithm is utilized within a KDD (Knowledge Discovery in Databases) process. This study will employ certain Knowledge Discovery processes that are widely accepted in both the academic and commercial worlds. Several Knowledge Discovery models will be developed utilizing secondary data containing known correct values. Tests will be conducted on the secondary data both before and after storing data instances with known results and then identifying imprecise data values. One of the integral stages in the accomplishment of successful Knowledge Discovery is the Data Mining phase. The actual Data Mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Neural Networks are the most commonly selected tools for Data Mining classification and prediction. Neural Networks employ various types of Transfer Functions when outputting data. The most commonly employed Transfer Function is the s-Sigmoid Function. Various Knowledge Discovery Models from various research and business disciplines were tested using this framework. However, missing and inconsistent data has been pervasive problems in the history of data analysis since the origin of data collection. Due to advancements in the capacities of data storage and the proliferation of computer software, more historical data is being collected and analyzed today than ever before. The issue of missing data must be addressed, since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of Missing Data and Data Imputation on the Data Mining phase of Knowledge Discovery when Neural Networks are utilized when employing an s-Sigmoid Transfer function, and are confronted with Missing Data and Data Imputation methodologie

    The Impact of Data Imputation Methodologies on Knowledge Discovery

    Get PDF
    The purpose of this research is to investigate the impact of Data Imputation Methodologies that are employed when a specific Data Mining algorithm is utilized within a KDD (Knowledge Discovery in Databases) process. This study will employ certain Knowledge Discovery processes that are widely accepted in both the academic and commercial worlds. Several Knowledge Discovery models will be developed utilizing secondary data containing known correct values. Tests will be conducted on the secondary data both before and after storing data instances with known results and then identifying imprecise data values. One of the integral stages in the accomplishment of successful Knowledge Discovery is the Data Mining phase. The actual Data Mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Neural Networks are the most commonly selected tools for Data Mining classification and prediction. Neural Networks employ various types of Transfer Functions when outputting data. The most commonly employed Transfer Function is the s-Sigmoid Function. Various Knowledge Discovery Models from various research and business disciplines were tested using this framework. However, missing and inconsistent data has been pervasive problems in the history of data analysis since the origin of data collection. Due to advancements in the capacities of data storage and the proliferation of computer software, more historical data is being collected and analyzed today than ever before. The issue of missing data must be addressed, since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of Missing Data and Data Imputation on the Data Mining phase of Knowledge Discovery when Neural Networks are utilized when employing an s-Sigmoid Transfer function, and are confronted with Missing Data and Data Imputation methodologie

    Neural Networks in Data Mining

    Get PDF
    Data Mining means extraction of hidden predictive information from huge amount of databases. It is beneficial in every field like business, engineering, web data etc. In data mining classification of data is very difficult task that can be solving by using different algorithms. The more common model functions in data mining include classification, clustering, rule generation and knowledge discovery. There are many technologies available to data mining, including Artificial Neural Networks, Regression, and Decision Trees. In this paper the data mining based on neural networks is studied in detail, and the key technology and ways to achieve the data mining based on neural networks are also studied

    Extraction of similarity based fuzzy rules from artificial neural networks

    Get PDF
    A method to extract a fuzzy rule based system from a trained artificial neural network for classification is presented. The fuzzy system obtained is equivalent to the corresponding neural network. In the antecedents of the fuzzy rules, it uses the similarity between the input datum and the weight vectors. This implies rules highly understandable. Thus, both the fuzzy system and a simple analysis of the weight vectors are enough to discern the hidden knowledge learnt by the neural network. Several classification problems are presented to illustrate this method of knowledge discovery by using artificial neural networks
    • …
    corecore