8 research outputs found

    Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation

    Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique to see whether it is better to tolerate missing data or to impute missing values first and then apply the C4.5 algorithm. For the investigation, we simulated three missingness mechanisms, three missing data patterns, and five missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5. At the same time, both C4.5 and k-NN are little affected by the missingness mechanism, but the missing data pattern and the missing data percentage have a strong negative impact upon prediction (or imputation) accuracy, particularly if the missing data percentage exceeds 40%.
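
    A minimal sketch of the experiment's shape, assuming scikit-learn. Note the substitutions: scikit-learn ships CART (DecisionTreeClassifier) rather than C4.5, and mean imputation stands in for the "tolerate missing data" arm, so this illustrates the comparison rather than replicating the study.

```python
# Sketch: does k-NN imputation beat a simple baseline before tree induction?
# CART stands in for C4.5; mean imputation stands in for "tolerating" NaNs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

for pct in (0.1, 0.2, 0.4):                    # missing-data percentages
    X_miss = X.copy()
    mask = rng.random(X.shape) < pct           # MCAR missingness mechanism
    X_miss[mask] = np.nan
    for name, imputer in [("mean", SimpleImputer()),
                          ("k-NN", KNNImputer(n_neighbors=5))]:
        pipe = make_pipeline(imputer, DecisionTreeClassifier(random_state=0))
        acc = cross_val_score(pipe, X_miss, y, cv=5).mean()
        print(f"{pct:.0%} missing, {name} imputation: accuracy {acc:.3f}")
```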

    Reliability and validity in comparative studies of software prediction models

    Empirical studies on software prediction models do not converge with respect to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we have examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross validation. Typically, these empirical studies compare a machine learning model with a regression model; we replicate this comparison in a controlled simulation. The results suggest that it is the research procedure itself that is unreliable, and this lack of reliability may strongly contribute to the lack of convergence. Our findings thus cast doubt on the conclusions of any study of competing software prediction models that used this research procedure as the basis of model comparison. We therefore need to develop more reliable research procedures before we can have confidence in the conclusions of comparative studies of software prediction models.
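
    To make the reliability point concrete, here is a hedged sketch assuming scikit-learn and a toy data-generating process (not the paper's actual simulation design): the same single-sample, cross-validated comparison is rerun on fresh samples from one fixed population, and the apparent "winner" flips between runs.

```python
# Repeat the "single sample + accuracy indicator + cross validation" procedure
# on independent samples from one population and count which model "wins".
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)

def draw_sample(n=30):
    """One small 'project data set' drawn from a noisy linear population."""
    X = rng.uniform(1, 10, size=(n, 2))
    y = 3 * X[:, 0] + X[:, 1] + rng.normal(0, 3, size=n)
    return X, y

wins = {"regression": 0, "machine learning": 0}
for _ in range(50):                             # 50 independent "studies"
    X, y = draw_sample()
    reg = cross_val_score(LinearRegression(), X, y,
                          cv=5, scoring="neg_mean_absolute_error").mean()
    ml = cross_val_score(KNeighborsRegressor(n_neighbors=3), X, y,
                         cv=5, scoring="neg_mean_absolute_error").mean()
    wins["regression" if reg > ml else "machine learning"] += 1
print(wins)   # the verdict varies from study to study on an identical population
```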

    Development of Software Effort Estimation using a Non Fuzzy Model

    Nowadays, accurate estimation of software effort is a challenging issue for software developers, since binding a contract depends on the estimated cost of the software. Over- or under-estimation leads to losses on the project, affects the probability of its success or failure, and delays the delivery date. In this paper, we use a non-fuzzy conditional algorithm to build a suitable model structure for improved effort estimation of NASA software projects. We construct a set of linear conditional models over the domain of possible KLOC (Kilo Lines of Code) values. The performance of the developed model was analyzed using the NASA data set, and the results were compared with those of the COCOMO tuned-PSO, Halstead, Walston-Felix, Bailey-Basili and Doty models.
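
    As an illustration of a "set of linear conditional models over the KLOC domain", here is a minimal Python sketch; the breakpoints and coefficients are hypothetical placeholders, not values fitted to the NASA data set.

```python
# Piecewise-linear effort model conditioned on project size (KLOC).
# Breakpoints and coefficients below are illustrative placeholders only.
def estimate_effort(kloc: float) -> float:
    """Estimated effort in person-months, conditioned on KLOC range."""
    if kloc < 10:            # small projects
        return 2.5 + 3.0 * kloc
    elif kloc < 50:          # medium projects
        return 10.0 + 2.2 * kloc
    else:                    # large projects
        return 40.0 + 1.6 * kloc

for size in (5, 25, 100):
    print(f"{size} KLOC -> {estimate_effort(size):.1f} person-months")
```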

    Innovation management at the stages of the software life cycle

    Since the second half of the 20th century, scientific and technological progress has developed intensively, driven to a significant extent by the emergence and wide adoption of computers. Computer hardware has been improving, and continues to improve, at a high rate; in particular, computer performance doubles roughly every one and a half to two years. By contrast, the productivity of software development and the efficiency indicators of software (SW) grow substantially more slowly than hardware indicators. Despite significant progress in software engineering, software currently is, and will likely remain in the near future, a product of human intellectual labour, and therefore depends to a large extent on people's ability to create quality software within a limited time, an ability that develops comparatively slowly. According to various statistical estimates, no more than a third of software development projects can be considered fully successful; the rest end in complete failure or substantially exceed the established budget and schedule constraints. When citing this document, use the link http://essuir.sumdu.edu.ua/handle/123456789/1589

    Development of Mathematical Models for the Assessment of Fire Risk of Some Indian Coals using Soft Computing Techniques

    Coal is the dominant energy source in India and meets 56% of the country's primary commercial energy supply. In light of the supremacy of coal in meeting future energy demands, rapid mechanization of mines is taking place to augment Indian coal production from 643.75 million tons (MT) per annum in 2014-15 to an expected level of 1086 MT per annum by 2024-25. Most of the coal in India is obtained from low-rank coal seams. Fires have been raging in several coal mines in Indian coalfields, and spontaneous heating of coal is a major problem in the global mining industry; different researchers have reported that a majority (75%) of these fires owe their origin to spontaneous combustion of coal. Fires, whether on the surface or underground, pose serious safety and environmental problems, causing huge losses of coal due to burning, loss of lives, sterilization of coal reserves and environmental pollution on a massive scale. Over the years, the number of active mine fires in India has increased to an alarming 70 locations covering a cumulative area of 17 km². In Indian coalfields, fire has engulfed more than 50 million tons of prime coking coal, and about 200 million tons of coal are locked up due to fires. The seriousness of the problem has been realized by the Ministry of Coal, the Ministry of Labour, various statutory agencies and mining companies. The recommendations made at the 10th Conference on Safety in Mines held at New Delhi in 2007, as well as at the Indian Chamber of Commerce (ICC)-2006, New Delhi, stated that all coal mining companies should rank their coal mines on a uniform scale according to their fire risk on a scientific basis. This would help mine planners and engineers to adopt precautionary measures in advance against the occurrence and spread of coal mine fires.

    Most of the research work carried out in India has focused on the assessment of the spontaneous combustion liability of coals based on limited conventional experimental techniques. Investigators have proposed statistical models to establish correlations between various coal parameters, but limited work has been done on developing soft computing techniques to predict the propensity of coal to self-heating, an area yet to receive due attention. Moreover, the classifications made earlier are based on limited work that was empirical in nature, without an adequate and sound mathematical base. Keeping this in view, an attempt was made in this research work to study forty-nine coal samples of various ranks covering the majority of the Indian coalfields. The experimental and analytical methods used to assess the tendency of coals to spontaneous heating were: proximate analysis, ultimate analysis, petrographic analysis, crossing point temperature, Olpinski index, flammability temperature, wet oxidation potential analysis and differential thermal analysis (DTA). Statistical regression analysis was carried out between the parameters of intrinsic properties and the susceptibility indices, and the best-correlated parameters were used as inputs to the soft computing models. Further, different ANN models, namely the Multilayer Perceptron Network (MLP), the Functional Link Artificial Neural Network (FLANN) and the Radial Basis Function (RBF) network, were applied for the assessment of the fire risk potential of Indian coals.
    The proposed ANN fire risk prediction models were designed with the best-correlated parameters (from ultimate analysis) selected as inputs after rigorous statistical analysis. After the successful application of all the proposed ANN models, comparative studies were made based on the Mean Magnitude of Relative Error (MMRE) as the performance parameter, model performance curves and Pearson residual boxplots. Among the proposed ANN techniques, it was observed that the Szb index provided better fire risk prediction with the RBF model vis-à-vis MLP and FLANN. The results of the proposed RBF network model closely matched the field records of the investigated Indian coals and can help mine management to adopt appropriate strategies and effective action plans in advance to prevent the occurrence and spread of fire.
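
    For reference, MMRE, the performance parameter used to compare the models, is the mean of |actual - predicted| / |actual| over all samples; lower is better. A minimal sketch follows, with made-up numbers (the susceptibility values are illustrative, not from the study).

```python
# Mean Magnitude of Relative Error (MMRE) as a model comparison metric.
import numpy as np

def mmre(actual, predicted):
    """Mean of |actual - predicted| / |actual| over all samples."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted) / np.abs(actual))

actual    = [132.0, 148.5, 160.2]   # hypothetical observed index values
predicted = [128.7, 151.0, 154.9]   # hypothetical model outputs
print(f"MMRE = {mmre(actual, predicted):.4f}")
```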

    Quality deviation requirements in residential buildings: predictive modeling of the interaction between deviation and cause

    To address construction defects, sub-task requirements (STRs) were generated alongside a Bayesian belief network (BBN) quantification, towards the modelling of a unique causation pattern. The study found that the patterns of direct causes of deviation from quality norms are unique to each STR and that causation patterns cannot be generalised. The work provides building quality managers with a new visualization tool to clarify the STR-specific pathways from cause to quality deviation when creating the built environment.
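
    A toy sketch of BBN-style quantification, assuming a single hypothetical direct cause linked to one STR; the probabilities below are illustrative, not values elicited in the study.

```python
# Two-node belief network: Cause -> Deviation, quantified by conditional
# probabilities, with diagnostic inference via Bayes' rule.
P_cause = 0.30                        # P(cause present); illustrative value
P_dev_given_cause = 0.70              # P(deviation | cause present)
P_dev_given_no_cause = 0.10           # P(deviation | cause absent)

# Marginal probability of a quality deviation for this STR.
P_dev = (P_cause * P_dev_given_cause
         + (1 - P_cause) * P_dev_given_no_cause)

# Diagnostic direction: how likely is this cause given an observed deviation?
P_cause_given_dev = P_cause * P_dev_given_cause / P_dev
print(f"P(deviation) = {P_dev:.2f}, P(cause | deviation) = {P_cause_given_dev:.2f}")
```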

    Explaining the cost of European space and military projects

    No full text