Data mining as a tool for environmental scientists
Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and, more recently, Bayesian Decision Networks have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.
A Review of Diagnostic Techniques for ISHM Applications
System diagnosis is an integral part of any Integrated System Health Management application. Diagnostic applications make use of system information from the design phase, such as safety and mission assurance analysis, failure modes and effects analysis, hazards analysis, functional models, fault propagation models, and testability analysis. In modern process control and equipment monitoring systems, topological and analytic models of the nominal system, derived from design documents, are also employed for fault isolation and identification. Depending on the complexity of the monitored signals from the physical system, diagnostic applications may involve straightforward trending and feature extraction techniques to retrieve the parameters of importance from the sensor streams. They may also involve very complex analysis routines, such as signal processing, learning or classification methods, to derive the parameters of importance to diagnosis. The process that is used to diagnose anomalous conditions from monitored system signals varies widely across the different approaches to system diagnosis. Rule-based expert systems, case-based reasoning systems, model-based reasoning systems, learning systems, and probabilistic reasoning systems are examples of the many diverse approaches to diagnostic reasoning. Many engineering disciplines have specific approaches to modeling, monitoring and diagnosing anomalous conditions. Therefore, there is no "one-size-fits-all" approach to building diagnostic and health monitoring capabilities for a system. For instance, the conventional approaches to diagnosing failures in rotorcraft applications are very different from those used in communications systems. Further, online and offline automated diagnostic applications are integrated into an operations framework with flight crews, flight controllers and maintenance teams.
While the emphasis of this paper is automation of health management functions, striking the correct balance between automated and human-performed tasks is a vital concern.
An overview of decision table literature 1982-1995.
This report gives an overview of the literature on decision tables over the past 15 years. As far as possible, an author-supplied abstract, a number of keywords and a classification are provided for each reference. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the overview, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.
Psychometrics in Practice at RCEC
A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to the ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically.
All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The topics were selected and edited so that the book should be of special interest to educational researchers, psychometricians and practitioners in educational assessment.
ISBIS 2016: Meeting on Statistics in Business and Industry
This book includes the abstracts of the talks presented at the 2016 International Symposium on Business and Industrial Statistics, held in Barcelona, June 8-10, 2016, and hosted at the Universitat Politècnica de Catalunya - Barcelona TECH by the Department of Statistics and Operations Research. The meeting took place in the ETSEIB Building (Escola Tecnica Superior d'Enginyeria Industrial) at Avda Diagonal 647.
The meeting organizers celebrated the continued success of the ISBIS and ENBIS societies, and the meeting drew together the international community of statisticians, both academics and industry professionals, who share the goal of making statistics the foundation for decision making in business and related applications. The Scientific Program Committee was composed of:
David Banks, Duke University
Amílcar Oliveira, DCeT - Universidade Aberta and CEAUL
Teresa A. Oliveira, DCeT - Universidade Aberta and CEAUL
Nalini Ravishankar, University of Connecticut
Xavier Tort Martorell, Universitat Politècnica de Catalunya, Barcelona TECH
Martina Vandebroek, KU Leuven
Vincenzo Esposito Vinzi, ESSEC Business School
Data-driven Soft Sensors in the Process Industry
In the last two decades Soft Sensors have established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks related to process control. This paper discusses characteristics of process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, like the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though not yet fully realised, potential. The main contributions of this work are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with their possible solutions.
Expert Elicitation for Reliable System Design
This paper reviews the role of expert judgement to support reliability
assessments within the systems engineering design process. Generic design
processes are described to give the context and a discussion is given about the
nature of the reliability assessments required in the different systems
engineering phases. It is argued that, as far as meeting reliability
requirements is concerned, the whole design process is more akin to a
statistical control process than to a straightforward statistical problem of
assessing an unknown distribution. This leads to features of the expert
judgement problem in the design context which are substantially different from
those seen, for example, in risk assessment. In particular, the role of experts
in problem structuring and in developing failure mitigation options is much
more prominent, and there is a need to take into account the reliability
potential for future mitigation measures downstream in the system life cycle.
An overview is given of the stakeholders typically involved in large scale
systems engineering design projects, and this is used to argue the need for
methods that expose potential judgemental biases in order to generate analyses
that can be said to provide rational consensus about uncertainties. Finally, a
number of key points are developed with the aim of moving toward a framework
that provides a holistic method for tracking reliability assessment through the
design process.
Comment: This paper is commented on in [arXiv:0708.0285], [arXiv:0708.0287],
[arXiv:0708.0288]. Rejoinder in [arXiv:0708.0293]. Published at
http://dx.doi.org/10.1214/088342306000000510 in Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org)
Effective techniques for handling incomplete data using decision trees
Decision Trees (DTs) have been recognized as one of the most successful formalisms for knowledge representation and reasoning and are currently applied to a variety of data mining or knowledge discovery applications, particularly for classification problems. There are several efficient methods to learn a DT from data. However, these methods are often limited to the assumption that data are complete.
In this thesis, some contributions to the field of machine learning and statistics that solve the problem of extracting DTs for learning and classification tasks from incomplete databases are presented. The methodology underlying the thesis blends together well-established statistical theories with the most advanced techniques for machine learning and automated reasoning with uncertainty.
The first contribution is an extensive set of simulations that study the impact of missing data on the predictive accuracy of existing DT methods that can cope with missing values, when missing values occur in both the training and test sets or in only one of the two. All simulations are performed under missing completely at random, missing at random and informatively missing mechanisms, and for different missing data patterns and proportions.
The next contribution is a simple, novel, yet effective procedure for training and testing decision trees in the presence of missing data. Original and simple splitting criteria for attribute selection in tree building are put forward. The proposed technique is evaluated and validated in empirical tests over many real-world application domains. In this work, the proposed algorithm maintains (and sometimes exceeds) the outstanding accuracy of multiple imputation, especially on datasets containing mixed attributes and purely nominal attributes. The proposed algorithm also greatly improves accuracy for informatively missing (IM) data. Another major advantage of this method over multiple imputation is the substantial saving in computational resources due to its simplicity.
The next contribution is the proposal of three versions of simple probabilistic techniques that could be used for classifying incomplete vectors using decision trees based on complete data. The proposed procedure is superficially similar to that of fractional cases but more effective. The experimental results demonstrate that these approaches can achieve quality comparable to that of sophisticated algorithms like multiple imputation and are therefore applicable to all kinds of datasets.
Finally, two novel ensemble procedures for handling incomplete training and test data are proposed and discussed. The algorithms combine the two best approaches either with resampling (REMIMIA) or without resampling (EMIMIA) of the training data before growing the decision trees. Empirical tests are used to evaluate and validate the proposed ensemble methods against the individual missing data techniques. EMIMIA attains the highest overall level of prediction accuracy.
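The abstract above refers to three missingness mechanisms without showing them: missing completely at random (MCAR), missing at random (MAR) and informatively missing (IM). The following is only a minimal sketch of how such mechanisms might be simulated on a toy table. It is not the thesis's simulation code; the function names, the threshold rule and the use of `None` as the missing marker are all assumptions made for illustration.

```python
import random

# Hypothetical helpers; each returns a new list of row dicts with some
# values of `col` replaced by None.

def mask_mcar(rows, col, rate, rng):
    """MCAR: every value of `col` is dropped with the same probability."""
    return [{**r, col: None} if rng.random() < rate else dict(r) for r in rows]

def mask_mar(rows, col, driver, threshold, rate, rng):
    """MAR: `col` may be dropped only where the fully observed `driver`
    column exceeds `threshold` (an assumed, illustrative rule)."""
    return [
        {**r, col: None}
        if r[driver] > threshold and rng.random() < rate
        else dict(r)
        for r in rows
    ]

def mask_im(rows, col, threshold, rate, rng):
    """IM: dropping depends on the (soon unobserved) value of `col` itself."""
    return [
        {**r, col: None}
        if r[col] > threshold and rng.random() < rate
        else dict(r)
        for r in rows
    ]

if __name__ == "__main__":
    rng = random.Random(0)
    data = [{"x": i, "y": 10 * i} for i in range(10)]
    for name, masked in [
        ("MCAR", mask_mcar(data, "y", 0.3, rng)),
        ("MAR", mask_mar(data, "y", "x", 4, 0.6, rng)),
        ("IM", mask_im(data, "y", 40, 0.6, rng)),
    ]:
        missing = sum(r["y"] is None for r in masked)
        print(name, "missing values in y:", missing)
```

Varying `rate` changes the proportion of missing values, and applying a mask to the training rows, the test rows, or both gives the three settings the abstract lists.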