43,187 research outputs found

    Data Driven Discovery in Astrophysics

    Get PDF
    We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, that provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel, machine learning tools. We describe some of the recent examples of such applications.Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figure

    On the role of pre and post-processing in environmental data mining

    Get PDF
    The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed

    A case study of predicting banking customers behaviour by using data mining

    Get PDF
    Data Mining (DM) is a technique that examines information stored in large database or data warehouse and find the patterns or trends in the data that are not yet known or suspected. DM techniques have been applied to a variety of different domains including Customer Relationship Management CRM). In this research, a new Customer Knowledge Management (CKM) framework based on data mining is proposed. The proposed data mining framework in this study manages relationships between banking organizations and their customers. Two typical data mining techniques - Neural Network and Association Rules - are applied to predict the behavior of customers and to increase the decision-making processes for recalling valued customers in banking industries. The experiments on the real world dataset are conducted and the different metrics are used to evaluate the performances of the two data mining models. The results indicate that the Neural Network model achieves better accuracy but takes longer time to train the model

    Review and Comparison of Intelligent Optimization Modelling Techniques for Energy Forecasting and Condition-Based Maintenance in PV Plants

    Get PDF
    Within the field of soft computing, intelligent optimization modelling techniques include various major techniques in artificial intelligence. These techniques pretend to generate new business knowledge transforming sets of "raw data" into business value. One of the principal applications of these techniques is related to the design of predictive analytics for the improvement of advanced CBM (condition-based maintenance) strategies and energy production forecasting. These advanced techniques can be used to transform control system data, operational data and maintenance event data to failure diagnostic and prognostic knowledge and, ultimately, to derive expected energy generation. One of the systems where these techniques can be applied with massive potential impact are the legacy monitoring systems existing in solar PV energy generation plants. These systems produce a great amount of data over time, while at the same time they demand an important e ort in order to increase their performance through the use of more accurate predictive analytics to reduce production losses having a direct impact on ROI. How to choose the most suitable techniques to apply is one of the problems to address. This paper presents a review and a comparative analysis of six intelligent optimization modelling techniques, which have been applied on a PV plant case study, using the energy production forecast as the decision variable. The methodology proposed not only pretends to elicit the most accurate solution but also validates the results, in comparison with the di erent outputs for the di erent techniques

    Virtual Astronomy, Information Technology, and the New Scientific Methodology

    Get PDF
    All sciences, including astronomy, are now entering the era of information abundance. The exponentially increasing volume and complexity of modern data sets promises to transform the scientific practice, but also poses a number of common technological challenges. The Virtual Observatory concept is the astronomical community's response to these challenges: it aims to harness the progress in information technology in the service of astronomy, and at the same time provide a valuable testbed for information technology and applied computer science. Challenges broadly fall into two categories: data handling (or "data farming"), including issues such as archives, intelligent storage, databases, interoperability, fast networks, etc., and data mining, data understanding, and knowledge discovery, which include issues such as automated clustering and classification, multivariate correlation searches, pattern recognition, visualization in highly hyperdimensional parameter spaces, etc., as well as various applications of machine learning in these contexts. Such techniques are forming a methodological foundation for science with massive and complex data sets in general, and are likely to have a much broather impact on the modern society, commerce, information economy, security, etc. There is a powerful emerging synergy between the computationally enabled science and the science-driven computing, which will drive the progress in science, scholarship, and many other venues in the 21st century

    Class Association Rules Mining based Rough Set Method

    Full text link
    This paper investigates the mining of class association rules with rough set approach. In data mining, an association occurs between two set of elements when one element set happen together with another. A class association rule set (CARs) is a subset of association rules with classes specified as their consequences. We present an efficient algorithm for mining the finest class rule set inspired form Apriori algorithm, where the support and confidence are computed based on the elementary set of lower approximation included in the property of rough set theory. Our proposed approach has been shown very effective, where the rough set approach for class association discovery is much simpler than the classic association method.Comment: 10 pages, 2 figure
    • 

    corecore