
Mining developer communication data streams

Authors: Connor Andy M.; Finlay Jacqui; Pears Russel
Publication date: 22/07/2014

This paper explores modelling a software development project as a process that produces a continuous stream of data. In the Jazz repository used in this research, one aspect of that stream is developer communication. Such data can be used to create an evolving social network characterized by a range of metrics. This paper applies data stream mining techniques to identify the most useful metrics for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the metrics considered have any significance for predicting the outcome of a build.
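As a rough illustration of the Hoeffding Tree idea named in the abstract above, the sketch below computes the standard Hoeffding bound and uses it to decide whether a leaf has seen enough examples to split. It is not the authors' implementation: the function names, the `tie_threshold` parameter and the example gain values are assumptions made for illustration, and ADWIN drift detection is not shown.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Hypothetical split test for one leaf: split when the best attribute
    beats the runner-up by more than the bound, or when the bound has
    shrunk below tie_threshold (the two candidates are effectively tied)."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


# Illustrative values only: the same gain difference becomes decisive
# once more examples from the stream have reached the leaf.
few = should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000)
many = should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=10000)
```

The bound shrinks as the number of examples grows, which is why a stream learner can commit to a split after seeing only part of the data.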

arXiv.org e-Print Archive

Crossref

Hybrid model using logit and nonparametric methods for predicting micro-entity failure

Authors: Blanco Oliver Antonio Jesús; Irimia Diéguez Ana Isabel; Oliver Alfonso María Dolores; Vázquez Cueto María José
Publication venue: LLC “Consulting Publishing Company “Business Perspectives”
Publication date: 01/01/2016

Following calls in the bankruptcy literature, this paper develops a parsimonious hybrid bankruptcy model by combining parametric and non-parametric approaches. To this end, the variables with the highest predictive power to detect bankruptcy are selected using logistic regression (LR). Subsequently, alternative non-parametric methods (Multilayer Perceptron, Rough Set, and Classification-Regression Trees) are applied, in turn, to firms classified as either “bankrupt” or “not bankrupt”. Our findings show that hybrid models, particularly those combining LR and Multilayer Perceptron, offer better accuracy and interpretability and converge faster than each method implemented in isolation. Moreover, the authors demonstrate that the introduction of non-financial and macroeconomic variables complements financial ratios for bankruptcy prediction.

idUS. Depósito de Investigación Universidad de Sevilla

Mining data streams using option trees (revised edition, 2004)

Authors: Holmes Geoffrey; Kirkby Richard Brendon; Pfahringer Bernhard
Publication venue: Department of Computer Science, The University of Waikato
Publication date: 01/01/2004

The data stream model for data mining places harsh restrictions on a learning algorithm. A model must be induced following the briefest interrogation of the data, must use only available memory and must update itself over time within these constraints. Additionally, the model must be able to be used for data mining at any point in time. This paper describes a data stream classification algorithm using an ensemble of option trees. The ensemble of trees is induced by boosting and iteratively combined into a single interpretable model. The algorithm is evaluated using benchmark datasets for accuracy against state-of-the-art algorithms that make use of the entire dataset.
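The stream constraints listed in the abstract above (one brief pass over the data, bounded memory, incremental updates, usable at any time) can be sketched with a test-then-train loop. The learner here is a deliberately trivial majority-class placeholder, not the option-tree ensemble the paper describes; the class name, function name and toy stream are all assumptions made for illustration.

```python
from collections import Counter


class MajorityClassLearner:
    """Placeholder incremental learner (an assumption, not an option tree).
    Memory is bounded by the number of distinct class labels."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Usable at any point in time, even before any example is seen.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn_one(self, x, y):
        # Each example is inspected once and then discarded.
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: predict each example before learning from it."""
    correct = total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.learn_one(x, y)
        total += 1
    return correct / total if total else 0.0


# Toy stream of (features, label) pairs, invented for illustration.
toy_stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = prequential_accuracy(toy_stream, MajorityClassLearner())
```

Any learner exposing the same `predict` and `learn_one` methods could be dropped into the loop, since the evaluation never needs the stream to be stored.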

Research Commons@Waikato

Competitive Positioning in International Logistics: Identifying a System of Attributes Through Neural Networks and Decision Trees

Authors: Durvasula Srinivas; Lysonski Steven; Mehta Subhash
Publication venue: e-Publications@Marquette
Publication date: 01/01/2007

Firms involved in international logistics must develop a system of service attributes that lets them be profitable and satisfy customers’ needs at the same time. How customers trade off these attributes in forming satisfaction with competing international logistics providers has not been explored well in the literature. This study explores the ocean freight shipping sector to identify the system of attributes that maximizes customers’ satisfaction. Data were collected from shipping managers in Singapore using personal interviews to identify the chief concerns in choosing and evaluating ocean freight services. The data were then examined using neural networks and decision trees, among other approaches, to identify the system of attributes that is connected with customer satisfaction. The results illustrate the power of these methods in understanding how industrial customers with global operations process attributes to derive satisfaction. Implications are discussed.

epublications@Marquette