
    Predictive modeling of die filling of the pharmaceutical granules using the flexible neural tree

    In this work, a computational intelligence (CI) technique named the flexible neural tree (FNT) was developed to predict the die filling performance of pharmaceutical granules and to identify significant die filling process variables. The FNT resembles a feedforward neural network, creating a tree-like structure by means of genetic programming. To improve accuracy, the FNT parameters were optimized using the differential evolution algorithm. The performance of the FNT-based CI model was evaluated and compared with other CI techniques: multilayer perceptron, Gaussian process regression, and reduced error pruning tree. The accuracy of the CI model was evaluated experimentally using die filling as a case study. The die filling experiments were performed using a model shoe system and three different grades of microcrystalline cellulose (MCC) powders (MCC PH 101, MCC PH 102, and MCC DG). The feed powders were roll-compacted and milled into granules. The granules were then sieved into samples of various size classes. The mass of granules deposited into the die at different shoe speeds was measured. From these experiments, a dataset was generated consisting of true density, mean diameter (d50), granule size, and shoe speed as the inputs and the deposited mass as the output. Cross-validation (CV) methods such as 10FCV and 5x2FCV were applied to develop and validate the predictive models. It was found that the FNT-based CI model (for both CV methods) performed much better than the other CI models. Additionally, it was observed that process variables such as the granule size and the shoe speed had a greater impact on predictability than powder properties such as d50. Furthermore, validation of the model predictions against experimental data showed that the die filling behavior of coarse granules could be predicted better than that of fine granules.
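
    A minimal sketch of the general pattern this abstract describes: tuning a regressor's parameters with differential evolution inside a k-fold cross-validation loop. The quadratic surrogate model, the synthetic data, and the feature layout are illustrative stand-ins, not the authors' FNT:

        # Differential-evolution parameter tuning scored by 10-fold CV.
        # The surrogate below stands in for the flexible neural tree; the four
        # inputs mimic the paper's features (true density, d50, size, shoe speed).
        import numpy as np
        from scipy.optimize import differential_evolution
        from sklearn.model_selection import KFold

        rng = np.random.default_rng(0)
        X = rng.uniform(0.0, 1.0, size=(120, 4))          # synthetic inputs
        y = 2.0 * X[:, 2] - 1.5 * X[:, 3] + 0.3 * X[:, 0] + rng.normal(0, 0.05, 120)

        def predict(w, X):
            # w packs a bias, four linear weights, and one interaction term.
            return w[0] + X @ w[1:5] + w[5] * X[:, 2] * X[:, 3]

        def fold_mse(tr, te):
            # DE fits the parameters on the training fold only...
            obj = lambda w: float(np.mean((y[tr] - predict(w, X[tr])) ** 2))
            res = differential_evolution(obj, bounds=[(-3, 3)] * 6, seed=0, maxiter=40)
            # ...and the held-out fold scores the fitted model.
            return float(np.mean((y[te] - predict(res.x, X[te])) ** 2))

        scores = [fold_mse(tr, te)
                  for tr, te in KFold(n_splits=10, shuffle=True, random_state=0).split(X)]
        print("10-fold CV MSE:", np.mean(scores))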

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data and promising great scientific advances. However, if misused, it can be little more than the black-box application of complex computing algorithms that gives little physical insight and provides questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines; applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science; and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be a very powerful tool rather than a questionable black box.
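
    One of the algorithms the review covers, sketched on invented data: a support vector machine separating two object classes. The "point source vs. extended source" setup and the feature names are illustrative assumptions, not taken from the review:

        # SVM classification on synthetic two-class data, standing in for a
        # typical astronomical star/galaxy separation task.
        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)
        stars = rng.normal([2.5, 0.3], 0.2, size=(200, 2))     # e.g. concentration, colour
        galaxies = rng.normal([3.2, 0.8], 0.3, size=(200, 2))
        X = np.vstack([stars, galaxies])
        y = np.array([0] * 200 + [1] * 200)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
        clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
        print("held-out accuracy:", clf.score(X_te, y_te))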

    A Multi-Agent Architecture for the Design of Hierarchical Interval Type-2 Beta Fuzzy System

    This paper presents a new methodology for building and evolving hierarchical fuzzy systems. For the system design, a tree-based encoding method is adopted to hierarchically link low-dimensional fuzzy systems. Such a tree-structured representation is by nature a flexible design, offering more adjustable and modifiable structures. The proposed hierarchical structure employs a type-2 beta fuzzy system to cope with the faced uncertainties, and the resulting system is called the Hierarchical Interval Type-2 Beta Fuzzy System (HT2BFS). For the system optimization, two main tasks of structure learning and parameter tuning are applied. The structure learning phase aims to evolve and learn the structures of a population of HT2BFSs in a multiobjective context, taking into account the optimization of both accuracy and interpretability metrics. The parameter tuning phase is applied to refine and adjust the parameters of the system. To accomplish these two tasks efficiently, we further employ a multi-agent architecture to provide both distributed and cooperative management of the optimization tasks. Agents are divided into two types based on their functions: a structure agent and a parameter agent. The main function of the structure agent is to perform a multi-objective evolutionary structure learning step by means of the Multi-Objective Immune Programming (MOIP) algorithm. The parameter agents manage different hierarchical structures simultaneously, refining their parameters by means of the Hybrid Harmony Search (HHS) algorithm. In this architecture, agents use cooperation and communication to create high-performance HT2BFSs. The performance of the proposed system is evaluated through several comparisons with various state-of-the-art approaches on noise-free and noisy time series prediction data sets and regression problems. The results demonstrate a clear improvement in accuracy, convergence speed, and the number of rules used, compared with existing approaches.
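
    The multiobjective structure learning step trades accuracy against interpretability, which rests on a Pareto-dominance test. A minimal sketch of that test and of non-dominated filtering follows; the candidate scores are invented, and the real MOIP algorithm adds immune-inspired cloning and mutation on top of this:

        # Pareto dominance over (error, rule_count), both minimized.
        def dominates(a, b):
            # True if a is no worse than b everywhere and strictly better somewhere.
            return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

        def pareto_front(population):
            # Keep only candidates no other candidate dominates.
            return [p for p in population
                    if not any(dominates(q, p) for q in population if q is not p)]

        # (error, number of fuzzy rules) for five hypothetical HT2BFS structures.
        candidates = [(0.12, 30), (0.15, 12), (0.11, 45), (0.15, 18), (0.20, 10)]
        print(pareto_front(candidates))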

    Data Mining Using Relational Database Management Systems

    Software packages providing a whole set of data mining and machine learning algorithms are attractive because they allow experimentation with many kinds of algorithms in an easy setup. However, these packages are often based on main-memory data structures, limiting the amount of data they can handle. In this paper we use a relational database as secondary storage in order to eliminate this limitation. Unlike existing approaches, which often focus on optimizing a single algorithm to work with a database back end, we propose a general approach that provides a database interface for several algorithms at once. We have taken a popular machine learning software package, Weka, and added a relational storage manager as a back end to the system. The extension is transparent to the algorithms implemented in Weka, since it is hidden behind Weka’s standard main-memory data structure interface. Furthermore, some general mining tasks are transferred into the database system to speed up execution. We tested the extended system, referred to as WekaDB, and our results show that it achieves much higher scalability than Weka, while providing the same output and maintaining good computation time.
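
    A minimal Python/sqlite3 analogue of the idea (WekaDB itself is a Java storage manager behind Weka's standard Instances interface): the learner iterates over instances through an ordinary interface while the rows stay in secondary storage, and simple aggregates are pushed into SQL:

        # Instances live in the database; the learner sees a plain iterable.
        import sqlite3

        conn = sqlite3.connect(":memory:")  # a file path would be true secondary storage
        conn.execute("CREATE TABLE instances (f1 REAL, f2 REAL, label INTEGER)")
        conn.executemany("INSERT INTO instances VALUES (?, ?, ?)",
                         [(0.1, 0.2, 0), (0.9, 0.8, 1), (0.4, 0.6, 0)])

        class DBInstances:
            # Fetch training instances in batches instead of loading them all.
            def __init__(self, conn, batch=1000):
                self.conn, self.batch = conn, batch
            def __iter__(self):
                cur = self.conn.execute("SELECT f1, f2, label FROM instances")
                while rows := cur.fetchmany(self.batch):
                    yield from rows

        count = sum(1 for _ in DBInstances(conn))
        # Pushing work into the database, as WekaDB does for some mining tasks:
        (mean_f1,) = conn.execute("SELECT AVG(f1) FROM instances").fetchone()
        print(count, mean_f1)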

    Reconstructing (super)trees from data sets with missing distances: Not all is lost

    The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make best use of this wealth of data, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data. Here, we introduce Lasso, a novel heuristic approach for reconstructing rooted phylogenetic trees from distance matrices with missing values, for datasets where a molecular clock may be assumed. Unlike other phylogenetic methods for partial datasets, Lasso possesses desirable properties such as its reconstructed trees being both unique and edge-weighted. These properties are achieved by restricting the leaf set to a large subset of all possible taxa, which in many practical situations is the entire taxa set. Furthermore, the Lasso approach is distance-based, rendering it very fast to run and suitable for datasets of all sizes, including large datasets such as those generated by modern Next Generation Sequencing technologies. To better understand the performance of Lasso, we assessed it by means of artificial and real biological datasets, showing its effectiveness in the presence of missing data. Furthermore, by formulating the supermatrix problem as a particular case of the missing data problem, we assessed Lasso's ability to reconstruct supertrees. We demonstrate that, although not specifically designed for such a purpose, Lasso performs better than or comparably with five leading supertree algorithms on a challenging biological dataset. Finally, we make freely available a software implementation of Lasso so that researchers may, for the first time, perform both rooted tree and supertree reconstruction with branch lengths on their own partial datasets.
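
    Not the Lasso algorithm itself, but a sketch of the underlying idea: restrict a partial distance matrix to a taxa subset on which all pairwise distances are known, then build a rooted, edge-weighted tree under the clock assumption by average-linkage clustering (the classic UPGMA choice). The distances and the greedy restriction rule below are invented for illustration:

        # Drop the taxon with the most missing entries until the matrix is
        # complete, then cluster; merge heights give the rooted tree's edges.
        import numpy as np
        from scipy.cluster.hierarchy import linkage
        from scipy.spatial.distance import squareform

        taxa = ["A", "B", "C", "D"]
        D = np.array([[0.0,    2.0, np.nan, 6.0],
                      [2.0,    0.0, 4.0,    6.0],
                      [np.nan, 4.0, 0.0,    6.0],
                      [6.0,    6.0, 6.0,    0.0]])  # NaN marks a missing distance

        keep = list(range(len(taxa)))
        while np.isnan(D[np.ix_(keep, keep)]).any():
            sub = D[np.ix_(keep, keep)]
            keep.pop(int(np.isnan(sub).sum(axis=1).argmax()))

        Z = linkage(squareform(D[np.ix_(keep, keep)]), method="average")
        print("kept taxa:", [taxa[i] for i in keep])
        print(Z)  # rows: merged clusters and the height at which they join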

    How to shift bias: Lessons from the Baldwin effect

    An inductive learning algorithm takes a set of data as input and generates a hypothesis as output. A set of data is typically consistent with an infinite number of hypotheses; therefore, there must be factors other than the data that determine the output of the learning algorithm. In machine learning, these other factors are called the bias of the learner. Classical learning algorithms have a fixed bias, implicit in their design. Recently developed learning algorithms dynamically adjust their bias as they search for a hypothesis. Algorithms that shift bias in this manner are not as well understood as classical algorithms. In this paper, we show that the Baldwin effect has implications for the design and analysis of bias-shifting algorithms. The Baldwin effect was proposed in 1896 to explain how phenomena that might appear to require Lamarckian evolution (inheritance of acquired characteristics) can arise from purely Darwinian evolution. Hinton and Nowlan presented a computational model of the Baldwin effect in 1987. We explore a variation on their model, which we constructed explicitly to illustrate the lessons that the Baldwin effect has for research in bias-shifting algorithms. The main lesson is that a good strategy for shifting bias in a learning algorithm appears to be to begin with a weak bias and gradually shift to a strong bias.
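
    A compact re-implementation in the flavor of Hinton and Nowlan's 1987 model (the paper studies its own variation, and the population size, trial count, and crossover scheme here are arbitrary choices): genomes carry fixed alleles 0/1 plus plastic '?' loci that learning trials fill in, and matching the target sooner yields higher fitness, so learning smooths the fitness landscape:

        # Hinton & Nowlan-style Baldwin effect: allele 2 encodes the plastic '?'.
        import random

        L, POP, TRIALS, GENS = 20, 200, 1000, 30
        random.seed(0)

        def fitness(genome):
            if 0 in genome:              # a wrong fixed allele can never be learned away
                return 1.0
            plastic = genome.count(2)
            for t in range(TRIALS):      # each trial guesses every plastic locus at once
                if all(random.random() < 0.5 for _ in range(plastic)):
                    return 1.0 + 19.0 * (TRIALS - t) / TRIALS  # faster learners score higher
            return 1.0

        def new_genome():
            # Hinton & Nowlan's mix: 25% zeros, 25% ones, 50% plastic alleles.
            return random.choices([0, 1, 2], weights=[1, 1, 2], k=L)

        pop = [new_genome() for _ in range(POP)]
        for g in range(GENS):
            fits = [fitness(ind) for ind in pop]
            pick = lambda: random.choices(pop, weights=fits)[0]
            # Fitness-proportional selection with uniform crossover.
            pop = [[random.choice(pair) for pair in zip(pick(), pick())] for _ in range(POP)]
            if g % 10 == 0:
                ones = sum(ind.count(1) for ind in pop) / (POP * L)
                print(f"gen {g}: fraction of correct fixed alleles = {ones:.2f}")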

    Deep learning for video game playing

    In this article, we review recent deep learning advances in the context of how they have been applied to play different types of video games, such as first-person shooters, arcade games, and real-time strategy games. We analyze the unique requirements that different game genres pose to a deep learning system and highlight important open challenges in applying these machine learning methods to video games, such as general game playing, dealing with extremely large decision spaces, and coping with sparse rewards.
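
    Far simpler than the deep networks the review covers, but it makes the sparse-reward challenge concrete: tabular Q-learning on a toy corridor whose only nonzero reward sits at the far end (the task and all constants are invented). Deep reinforcement learning replaces the table with a neural network over raw game frames:

        # Tabular Q-learning with a sparse terminal reward.
        import numpy as np

        N, ACTIONS = 10, 2          # corridor length; actions: 0 = left, 1 = right
        Q = np.zeros((N, ACTIONS))
        alpha, gamma, eps = 0.1, 0.95, 0.1
        rng = np.random.default_rng(0)

        for episode in range(500):
            s = 0
            while s != N - 1:
                a = int(rng.integers(ACTIONS)) if rng.random() < eps else int(Q[s].argmax())
                s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
                r = 1.0 if s2 == N - 1 else 0.0     # reward only at the goal
                Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
                s = s2

        print(Q.round(2))  # values propagate back from the goal over episodes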

    A symbolic data-driven technique based on evolutionary polynomial regression

    This paper describes a new hybrid regression method that combines the best features of conventional numerical regression techniques with the genetic programming symbolic regression technique. The key idea is to employ an evolutionary computing methodology to search for a model of the system/process being modelled and to employ parameter estimation to obtain constants using least squares. The new technique, termed Evolutionary Polynomial Regression (EPR), overcomes shortcomings in the GP process, such as poor computational performance, the number of evolutionary parameters to tune, and the complexity of the symbolic models. Similarly, it alleviates issues arising from numerical regression, including difficulties in using physical insight and over-fitting problems. This paper demonstrates that EPR performs well both in interpolating data and in scientific knowledge discovery. As an illustration, EPR is used to identify polynomial formulæ with progressively increasing levels of noise, to interpolate the Colebrook-White formula for a pipe resistance coefficient, and to discover a formula for a resistance coefficient from experimental data.
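
    A minimal sketch of the split this abstract describes: an evolutionary search proposes polynomial structures (here, integer exponent vectors per term) while least squares supplies the constants. Random search stands in for the genetic algorithm, and the hidden target function is invented:

        # Structure search + least-squares coefficients, the core EPR idea.
        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(1, 2, size=(200, 2))
        y = 3.0 * X[:, 0] ** 2 * X[:, 1] + 0.5 * X[:, 1] ** 2  # hidden "physics"

        def design_matrix(X, exps):
            # One column per term: prod_j x_j ** e_j, plus a bias column.
            cols = [np.prod(X ** e, axis=1) for e in exps]
            return np.column_stack([np.ones(len(X))] + cols)

        best = (np.inf, None, None)
        for _ in range(300):                      # random search stands in for the GA
            exps = rng.integers(0, 3, size=(2, 2))        # two terms, exponents in 0..2
            A = design_matrix(X, exps)
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # constants by least squares
            sse = float(np.sum((A @ coef - y) ** 2))
            if sse < best[0]:
                best = (sse, exps, coef)

        sse, exps, coef = best
        print("SSE:", sse, "exponents:", exps.tolist(), "coefficients:", coef.round(3))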

    Evolutionary polymorphic neural networks in chemical engineering modeling

    Evolutionary Polymorphic Neural Network (EPNN) is a novel approach to modeling chemical, biochemical, and physical processes. This approach has its basis in modern artificial intelligence, especially neural networks and evolutionary computing. EPNN can perform networked symbolic regressions on input-output data while providing information about both the structure and the complexity of a process during its own evolution. In this work three different processes are modeled: (1) a dynamic neutralization process, (2) an aqueous two-phase system, and (3) the reduction of a biodegradation model. In all three cases, EPNN performs as well as or better than traditional thermodynamic/transport or neural network models on published data. Furthermore, in cases where traditional modeling parameters are difficult to determine, EPNN can be used as an auxiliary tool to produce equivalent empirical formulae for the target process. Feedback links in the EPNN network can be formed through training (evolution) to perform multi-step-ahead predictions for dynamic nonlinear systems. Unlike existing applications combining neural networks and genetic algorithms, symbolic formulae can be extracted from EPNN modeling results for further theoretical analysis and process optimization. The EPNN system can also be used for data prediction tuning, in which case only a minimal number of initial system conditions needs to be adjusted. The network structure of EPNN is therefore more flexible and adaptable than that of traditional neural networks. Owing to the polymorphic and evolutionary nature of the system, the initially randomized constants in EPNN networks converge to the same or similar functional forms across separate runs by the end of training, so the system is not sensitive to differences in the initial values of the EPNN population. However, if significantly larger noise is present in one or more data sets within the overall data composition, the EPNN system will probably fail to converge to a satisfactory level of prediction on those data sets. EPNN networks with a relatively small number of neurons can achieve similar or better performance than both traditional thermodynamic and neural network models. The developed EPNN approach thus provides an alternative method for efficiently modeling complex, dynamic, or steady-state chemical processes. EPNN is capable of producing symbolic empirical formulae for chemical processes regardless of whether traditional thermodynamic models are available or applicable, and it overcomes some of the limitations of both traditional thermodynamic/transport models and traditional neural network models.
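
    A sketch of the feedback-link idea described above: a one-step predictor is rolled forward by feeding each prediction back as the next input, yielding multi-step-ahead forecasts. A fitted linear autoregression stands in for the EPNN, and the time series is synthetic:

        # One-step model + feedback loop = multi-step-ahead prediction.
        import numpy as np

        rng = np.random.default_rng(0)
        t = np.arange(300)
        series = np.sin(0.1 * t) + 0.02 * rng.standard_normal(300)

        # Fit y[t] = a*y[t-1] + b*y[t-2] + c by least squares.
        A = np.column_stack([series[1:-1], series[:-2], np.ones(298)])
        coef, *_ = np.linalg.lstsq(A, series[2:], rcond=None)

        def roll_forward(history, steps):
            h = list(history[-2:])
            preds = []
            for _ in range(steps):      # feedback: each output becomes the next input
                nxt = coef[0] * h[-1] + coef[1] * h[-2] + coef[2]
                preds.append(nxt)
                h.append(nxt)
            return preds

        print(np.round(roll_forward(series[:250], 10), 3))
        print(np.round(series[250:260], 3))  # held-out truth for comparison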