Search CORE

13,043 research outputs found

Data mining as a tool for environmental scientists

Author: Athanasiadis Ioannis
Comas Joaquim
Frank Eibe
Gibert Karina
Letcher Rebecca
Spate Jessica
Sànchez-Marrè Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2006
Field of study

Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and more recently Bayesian Decision Networks have found application in environmental modelling while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogenous

Research Commons@Waikato

A Review of Diagnostic Techniques for ISHM Applications

Author: Aaseng Gordon
Biswas Gautam
Narasimhan Sriam
Patterson-Hine Ann
Pattipati Krishna
Publication venue
Publication date
Field of study

System diagnosis is an integral part of any Integrated System Health Management application. Diagnostic applications make use of system information from the design phase, such as safety and mission assurance analysis, failure modes and effects analysis, hazards analysis, functional models, fault propagation models, and testability analysis. In modern process control and equipment monitoring systems, topological and analytic , models of the nominal system, derived from design documents, are also employed for fault isolation and identification. Depending on the complexity of the monitored signals from the physical system, diagnostic applications may involve straightforward trending and feature extraction techniques to retrieve the parameters of importance from the sensor streams. They also may involve very complex analysis routines, such as signal processing, learning or classification methods to derive the parameters of importance to diagnosis. The process that is used to diagnose anomalous conditions from monitored system signals varies widely across the different approaches to system diagnosis. Rule-based expert systems, case-based reasoning systems, model-based reasoning systems, learning systems, and probabilistic reasoning systems are examples of the many diverse approaches ta diagnostic reasoning. Many engineering disciplines have specific approaches to modeling, monitoring and diagnosing anomalous conditions. Therefore, there is no "one-size-fits-all" approach to building diagnostic and health monitoring capabilities for a system. For instance, the conventional approaches to diagnosing failures in rotorcraft applications are very different from those used in communications systems. Further, online and offline automated diagnostic applications are integrated into an operations framework with flight crews, flight controllers and maintenance teams. While the emphasis of this paper is automation of health management functions, striking the correct balance between automated and human-performed tasks is a vital concern

NASA Technical Reports Server

An overview of decision table literature 1982-1995.

Author: Vanthienen Jan
Verhelle M
Publication venue
Publication date
Field of study

This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, for each reference, an author supplied abstract, a number of keywords and a classification are provided. In some cases own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country or origin (not necessarily country of publication) and the language of the document. After a description of the scope of the interview, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.

Research Papers in Economics

Psychometrics in Practice at RCEC

Author: Eggen T.J.H.M.
Veldkamp B.P.
Publication venue: Ipskamp Drukkers
Publication date: 01/01/2012
Field of study

A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to the ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically.\ud All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The selection of the topics and the editing intends that the book should be of special interest to educational researchers, psychometricians and practitioners in educational assessment

University of Twente Research Information

ISBIS 2016: Meeting on Statistics in Business and Industry

Author: Banks David
Mahmoudvand Rahim
Oliveira Amilcar
Oliveira Teresa
Ravishankar Nalini
Publication venue: Universidade Aberta
Publication date: 10/11/2016
Field of study

This Book includes the abstracts of the talks presented at the 2016 International Symposium on Business and Industrial Statistics, held at Barcelona, June 8-10, 2016, hosted at the Universitat Politècnica de Catalunya - Barcelona TECH, by the Department of Statistics and Operations Research. The location of the meeting was at ETSEIB Building (Escola Tecnica Superior d'Enginyeria Industrial) at Avda Diagonal 647. The meeting organizers celebrated the continued success of ISBIS and ENBIS society, and the meeting draw together the international community of statisticians, both academics and industry professionals, who share the goal of making statistics the foundation for decision making in business and related applications. The Scientific Program Committee was constituted by: David Banks, Duke University Amílcar Oliveira, DCeT - Universidade Aberta and CEAUL Teresa A. Oliveira, DCeT - Universidade Aberta and CEAUL Nalini Ravishankar, University of Connecticut Xavier Tort Martorell, Universitat Politécnica de Catalunya, Barcelona TECH Martina Vandebroek, KU Leuven Vincenzo Esposito Vinzi, ESSEC Business Schoo

Repositório Aberto da Universidade Aberta

Data-driven Soft Sensors in the Process Industry

Author: Abdi
Alhoniemi
Angelov
Angelov
Angelov
Arazo-Bravo
Atkeson
Bastin
Bauer
Bishop
Bogdan Gabrys
Bonne
Breiman
Bro
Casali
Chen
Chen
Chen
Chen
Chen
Choi
Chruy
Davies
Dayal
De Wolf
Desai
Devogelaere
Ding
Dong
Dong
Dote
Doyle
Dunia
Dunia
Dunia
Eriksson
Fellner
Fortuna
Fortuna
Frank
Freund
Funahashi
Gabrielsson
Gabrys
Gabrys
Gabrys
Gabrys
Gama
Geladi
Gomez
Gonzalez
Gonzalez
Goodwin
Gosset
Guyon
Han
Hastie
He
Hodge
Hotelling
Jackson
James
Jang
Jiang
Jolliffe
Jordaan
Jos de Assis
Kadlec
Kadlec
Kalos
Kampjarvi
Kittler
Kohavi
Kohonen
Kordon
Kourti
Kourti
Krogh
Kuncheva
Lee
Lee
Lee
Lee
Li
Li
Lin
Lin
Luo
Macias
Mandic
Marjanovic
Meleiro
Menold
Nauck
Neogi
Nomikos
Nomikos
Nomikos
Opitz
Park
Pearson
Pearson
Petr Kadlec
Poggio
Prasad
Principe
Qin
Qin
Qin
Qin
Radhakrishnan
Rnnar
Rong
Rotem
Ruta
Ruta
Schafer
Scheffer
Serneels
Sibylle Strandt
Stanimirova
Su
Tzanakou
van Sprang
van Sprang
Vapnik
Venkatasubramanian
Venkatasubramanian
Venkatasubramanian
Vilalta
Walczak
Walczak
Walczak
Wang
Wang
Wang
Wang
Warne
Weiss
Widmer
Wold
Wold
Wold
Wolpert
Yan
Yang
Zadeh
Zamprogna
Zamprogna
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/04/2009
Field of study

In the last two decades Soft Sensors established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks which are related to process control. This paper discusses characteristics of the process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, like the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is put on the data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though yet not completely realised, potential. A comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques as well as a discussion of some open issues in the Soft Sensor development and maintenance and their possible solutions are the main contributions of this work

Crossref

Bournemouth University Research Online

Expert Elicitation for Reliable System Design

Author: Bedford Tim
Quigley John
Walls Lesley
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006
Field of study

This paper reviews the role of expert judgement to support reliability assessments within the systems engineering design process. Generic design processes are described to give the context and a discussion is given about the nature of the reliability assessments required in the different systems engineering phases. It is argued that, as far as meeting reliability requirements is concerned, the whole design process is more akin to a statistical control process than to a straightforward statistical problem of assessing an unknown distribution. This leads to features of the expert judgement problem in the design context which are substantially different from those seen, for example, in risk assessment. In particular, the role of experts in problem structuring and in developing failure mitigation options is much more prominent, and there is a need to take into account the reliability potential for future mitigation measures downstream in the system life cycle. An overview is given of the stakeholders typically involved in large scale systems engineering design projects, and this is used to argue the need for methods that expose potential judgemental biases in order to generate analyses that can be said to provide rational consensus about uncertainties. Finally, a number of key points are developed with the aim of moving toward a framework that provides a holistic method for tracking reliability assessment through the design process.Comment: This paper commented in: [arXiv:0708.0285], [arXiv:0708.0287], [arXiv:0708.0288]. Rejoinder in [arXiv:0708.0293]. Published at http://dx.doi.org/10.1214/088342306000000510 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

University of Strathclyde Institutional Repository

ResearchOnline@GCU

Recommended from our members

Effective techniques for handling incomplete data using decision trees

Author: Twala Bhekisipho E.T.H.
Publication venue
Publication date: 01/01/2005
Field of study

Decision Trees (DTs) have been recognized as one of the most successful formalisms for knowledge representation and reasoning and are currently applied to a variety of data mining or knowledge discovery applications, particularly for classification problems. There are several efficient methods to learn a DT from data. However, these methods are often limited to the assumption that data are complete. In this thesis, some contributions to the field of machine learning and statistics that solve the problem of extracting DTs for learning and classification tasks from incomplete databases are presented. The methodology underlying the thesis blends together well-established statistical theories with the most advanced techniques for machine learning and automated reasoning with uncertainty. The first contribution is the extensive simulations which study the impact of missing data on predictive accuracy of existing DTs which can cope with missing values, when missing values are in both the training and test sets or when they are in either of the two sets. All simulations are performed under missing completely at random, missing at random and informatively missing mechanisms and for different missing data patterns and proportions. The proposal of a simple, novel, yet effective proposed procedure for training and testing using decision trees in the presence of missing data is the next contribution. Original and simple splitting criteria for attribute selection in tree building are put forward. The proposed technique is evaluated and validated in empirical tests over many real world application domains. In this work, the proposed algorithm maintains (sometimes exceeds) the outstanding accuracy of multiple imputation, especially on datasets containing mixed attributes and purely nominal attributes. Also, the proposed algorithm greatly improves in accuracy for IM data. Another major advantage of this method over multiple imputation is the important saving in computational resources due to it simplicity. The next contribution is the proposal of three versions of simple probabilistic techniques that could be used for classifying incomplete vectors using decision trees based on complete data. The proposed procedure is superficially similar to that of fractional cases but more effective. The experimental results demonstrate that these approaches can achieve comparative quality to sophisticated algorithms like multiple imputation and therefore are applicable to all kinds of datasets. Finally, novel uses of two proposed ensemble procedures for handling incomplete training and test data are proposed and discussed. The algorithms combine the two best approaches either with resampling (REMIMIA) or without resampling (EMIMIA) of the training data before growing the decision trees. Experiments are used to evaluate and validate the success of the proposed ensemble methods with respect to individual missing data techniques in the form of empirical tests. EMIMIA attains the highest overall level of prediction accuracy

Open Research Online (The Open University)