A Survey of Prediction and Classification Techniques in Multicore Processor Systems
In multicore processor systems, being able to accurately predict the future provides new optimization opportunities that could not otherwise be exploited. For example, an oracle able to predict the behavior of a certain application running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques for prediction and classification in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in the optimization of multicore processor systems.
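As an illustration of the "simple forecasting" end of the spectrum described above, the following minimal sketch forecasts the next interval's CPU utilization with an exponentially weighted moving average and picks the lowest frequency level that covers the forecast. It is not taken from the survey: the function names, the smoothing factor, the headroom margin, and the frequency levels are all hypothetical.

```python
def ewma_forecast(history, alpha=0.5):
    """Forecast the next utilization sample (0.0 to 1.0) from past samples.

    alpha is an assumed smoothing factor; higher values weight recent
    samples more heavily.
    """
    forecast = history[0]
    for observed in history[1:]:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast


def pick_frequency(predicted_util, levels_mhz, headroom=0.1):
    """Return the lowest frequency level that covers the predicted demand.

    Demand is expressed relative to the highest level, plus an assumed
    safety margin (headroom) to protect the performance target.
    """
    demand_mhz = predicted_util * (1 + headroom) * max(levels_mhz)
    for level in sorted(levels_mhz):
        if level >= demand_mhz:
            return level
    return max(levels_mhz)


# Hypothetical utilization trace and DVFS levels, for illustration only.
utilization_trace = [0.2, 0.4, 0.4]
dvfs_levels_mhz = [600, 1000, 1500, 2000]

predicted = ewma_forecast(utilization_trace)
chosen_mhz = pick_frequency(predicted, dvfs_levels_mhz)
```

A proactive governor of this kind sets the frequency before the load arrives, whereas a reactive one would only respond after observing it.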
Used-habitat calibration plots: a new procedure for validating species distribution, resource selection, and step-selection models
“Species distribution modeling” was recently ranked as one of the top five “research fronts” in ecology and the environmental sciences by ISI's Essential Science Indicators (Renner and Warton 2013), reflecting the importance of predicting how species distributions will respond to anthropogenic change. Unfortunately, species distribution models (SDMs) often perform poorly when applied to novel environments. Compounding this problem is the shortage of methods for evaluating SDMs (hence, we may be getting our predictions wrong and not even know it). Traditional methods for validating SDMs quantify a model's ability to classify locations as used or unused. Instead, we propose to focus on how well SDMs can predict the characteristics of used locations. This subtle shift in viewpoint leads to a more natural and informative evaluation and validation of models across the entire spectrum of SDMs. Through a series of examples, we show how simple graphical methods can help with three fundamental challenges of habitat modeling: identifying missing covariates, non-linearity, and multicollinearity. Identifying habitat characteristics that are not well predicted by the model can provide insights into variables affecting the distribution of species, suggest appropriate model modifications, and ultimately improve the reliability and generality of conservation and management recommendations.
A Multi-Gene Genetic Programming Application for Predicting Students Failure at School
Despite several efforts, accurately predicting the student failure rate (SFR) at
school remains a core problem faced by many in the educational sector. The
procedures for forecasting SFR are rigid and often require data
scaling or conversion into binary form, as is the case with the logistic
model, which may lead to loss of information and effect size attenuation. Also,
the high number of factors, incomplete and unbalanced datasets, and black-box
issues, as in Artificial Neural Networks and fuzzy logic systems, expose the
need for more efficient tools. Currently, the application of Genetic Programming
(GP) holds great promise and has produced tremendous positive results in
different sectors. In this regard, this study developed GPSFARPS, a software
application that provides a robust solution to the prediction of SFR using an
evolutionary algorithm known as multi-gene genetic programming. The approach is
validated by feeding a testing data set to the evolved GP models. Results
obtained from GPSFARPS simulations show its unique ability to evolve a suitable
failure rate expression with fast convergence at 30 generations from a
maximum specified generation of 500. The multi-gene system was also able to
minimize the evolved model expression and accurately predict student failure
rate using a subset of the original expression.
Comment: 14 pages, 9 figures, Journal paper. arXiv admin note: text overlap
with arXiv:1403.0623 by other author
Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering
(ICSE2019)
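The abstract names sentence length and sentence position as examples of conversational features. A minimal sketch of how such features might be derived from a thread is shown here; the feature names, the normalization, and the sample thread are assumptions made for illustration and are not the paper's feature set.

```python
def conversational_features(comments):
    """Compute simple per-sentence features for one issue thread.

    comments: list of comments, each given as a list of sentence strings.
    Returns one feature dict per sentence, in thread order.
    The feature names are hypothetical.
    """
    total_sentences = sum(len(comment) for comment in comments)
    rows = []
    thread_index = 0
    for comment_index, comment in enumerate(comments):
        for sentence_index, sentence in enumerate(comment):
            rows.append({
                # Length measured in whitespace-separated tokens.
                "length_tokens": len(sentence.split()),
                # Relative positions scaled to the range 0.0 to 1.0.
                "pos_in_comment": sentence_index / max(len(comment) - 1, 1),
                "pos_in_thread": thread_index / max(total_sentences - 1, 1),
                "comment_index": comment_index,
            })
            thread_index += 1
    return rows


# Invented sample thread, for illustration only.
thread = [
    ["It crashes on start.", "Here is the log."],
    ["Fixed in master."],
]
features = conversational_features(thread)
```

Rows of this shape could then be passed to any supervised classifier; the abstract reports Random Forest working well with features of this kind when prior knowledge about the issue is available.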
An Exploratory Study of Patient Falls
Debate continues over the relative contributions of education level and clinical expertise in the nursing practice environment. Research suggests a link between Baccalaureate of Science in Nursing (BSN) nurses and positive patient outcomes such as lower mortality, decreased falls, and fewer medication errors. Purpose: To examine whether there is a negative correlation between patient falls and the level of nurse education at an urban hospital located in Midwest Illinois during the years 2010-2014. Methods: A retrospective cross-sectional cohort analysis was conducted using data from the National Database of Nursing Quality Indicators (NDNQI) from the years 2010-2014. Sample: Inpatients aged ≥ 18 years who experienced an unintentional sudden descent, with or without injury, that resulted in the patient striking the floor or an object and that occurred on inpatient nursing units. Results: The regression model was constructed with annual patient falls as the dependent variable and formal education and a log-transformed variable for percentage of certified nurses as the independent variables. The overall model is a good fit, F(2,22) = 9.014, p = .001, adj. R² = .40. Conclusion: Annual patient falls will decrease by increasing the number of nurses with baccalaureate degrees and/or certifications from a professional nursing board-governing body.
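The reported F statistic and adjusted R² can be cross-checked against each other using the standard identities for ordinary least squares with an intercept. The sketch below does this arithmetic; it assumes the model includes an intercept, so that the number of observations is the model degrees of freedom plus the residual degrees of freedom plus one. The function names are illustrative.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from the overall F statistic of an OLS model.

    Uses F = (R^2 / df_model) / ((1 - R^2) / df_resid), solved for R^2.
    """
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjust R^2 for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values reported in the abstract: F(2,22) = 9.014.
f_stat, df_model, df_resid = 9.014, 2, 22

# Assumption: the model has an intercept, so n = df_model + df_resid + 1.
n_obs = df_model + df_resid + 1

r2 = r_squared_from_f(f_stat, df_model, df_resid)
adj_r2 = adjusted_r_squared(r2, n_obs, df_model)
print(f"n = {n_obs}, R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.2f}")
```

Under that assumption the recovered adjusted R² rounds to .40, which agrees with the value stated in the abstract.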
The efficacy of using data mining techniques in predicting academic performance of architecture students.
In recent years, there has been a tremendous increase in the number of applicants seeking placement in the undergraduate architecture programme. It is important to identify new intakes who possess the capability to succeed during the selection phase of admission at universities. The admission variable (i.e. prior academic achievement) is one of the most important criteria considered during the selection process. The present study investigates the efficacy of using data mining techniques to predict the academic performance of architecture students based on information contained in prior academic achievement.
The input variables, i.e. prior academic achievement, were extracted from students' academic records. Logistic regression and support vector machine (SVM) are the data mining techniques adopted in this study. The collected data was divided into two parts. The first part was used for training the model, while the other part was used to evaluate the predictive accuracy of the developed models.
The results revealed that the SVM model outperformed the logistic regression model in terms of accuracy. Taken together, it is evident that prior academic achievement is a good predictor of the academic performance of architecture students.
Although the factors affecting academic performance of students are numerous, the present study focuses on the effect of prior academic achievement on academic performance of architecture students.
The developed SVM model can be used as a decision-making tool for selecting new intakes into the architecture program at Nigerian universities.
Would credit scoring work for Islamic finance? A neural network approach
Purpose – The main aim of this paper is to determine whether the decision-making process of the Islamic financial houses in the UK can be improved through the use of credit scoring modeling techniques as opposed to the currently used judgmental approaches. Subsidiary aims are to identify how scoring models can reclassify accepted applicants who are later considered as having bad credit and how many of the rejected applicants are later considered as having good credit; and to highlight significant variables that are crucial in terms of accepting and rejecting applicants, which can further aid the decision-making process.
Design/methodology/approach – A real data-set of 487 applicants is used, consisting of 336 accepted credit applications and 151 rejected credit applications made to an Islamic finance house in the UK. In order to build the proposed scoring models, the data-set is divided into training and hold-out sub-sets. The training sub-set is used to build the scoring models and the hold-out sub-set is used to test the predictive capabilities of the scoring models. 70 percent of the overall applicants are used for the training sub-set and 30 percent are used for the testing sub-set. Three statistical modeling techniques, namely Discriminant Analysis (DA), Logistic Regression (LR) and Multi-layer Perceptron (MP) neural network, are used to build the proposed scoring models.
Findings – Our findings reveal that the LR model has the highest Correct Classification (CC) rate in the training sub-set whereas MP outperforms other techniques and has the highest CC rate in the hold-out sub-set. MP also outperforms other techniques in terms of predicting the rejected credit applications and has the lowest Misclassification Cost (MC) above other techniques. In addition, results from MP models show that monthly expenses, age and marital status are identified as the key factors affecting the decision making process.
Research limitations/implications – Although our sample is small and restricted to an Islamic finance house in the UK, the results are robust. Future research could consider enlarging the sample in the UK and also internationally, allowing for cultural differences to be identified. The results indicate that the scoring models can be of great benefit to Islamic finance houses with regard to their decision-making processes of accepting and rejecting new credit applications and thus improve their efficiency and effectiveness.
Originality/value – Our contribution is the first to apply credit scoring modeling techniques in Islamic finance. Also, in building a scoring model, our application applies a different approach by using accepted and rejected credit applications instead of good and bad credit histories. This identifies opportunity costs of misclassifying credit applications as rejected.
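The evaluation protocol described in this abstract (a 70/30 training and hold-out split, a correct classification rate, and a misclassification cost) can be sketched as follows. This is an illustrative outline only: the random seed, the rounding of the split point, the label coding, and the two cost weights are assumptions, and the paper's own cost definition may differ.

```python
import random


def holdout_split(n_items, train_fraction=0.7, seed=0):
    """Shuffle item indices and split them into training and hold-out sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n_items)
    return indices[:cut], indices[cut:]


def correct_classification_rate(actual, predicted):
    """Share of cases where the predicted label equals the actual label."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def misclassification_cost(actual, predicted,
                           cost_false_accept=5.0, cost_false_reject=1.0):
    """Average cost per case, with 1 = accept and 0 = reject.

    The two cost weights are hypothetical placeholders.
    """
    pairs = list(zip(actual, predicted))
    false_accepts = sum(a == 0 and p == 1 for a, p in pairs)
    false_rejects = sum(a == 1 and p == 0 for a, p in pairs)
    total = cost_false_accept * false_accepts + cost_false_reject * false_rejects
    return total / len(pairs)


# 487 applicants, as reported in the abstract.
train_idx, holdout_idx = holdout_split(487)

# Invented labels, for illustration only.
actual_labels = [1, 0, 1, 1]
predicted_labels = [1, 1, 1, 0]
cc_rate = correct_classification_rate(actual_labels, predicted_labels)
mc = misclassification_cost(actual_labels, predicted_labels)
```

Weighting the two error types differently lets one model be preferred over another even when their correct classification rates are close.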
Exploration of PET and MRI radiomic features for decoding breast cancer phenotypes and prognosis.
Radiomics is an emerging technology for imaging biomarker discovery and disease-specific personalized treatment management. This paper aims to determine the benefit of using multi-modality radiomics data from PET and MR images in the characterization of breast cancer phenotype and prognosis. Eighty-four features were extracted from PET and MR images of 113 breast cancer patients. Unsupervised clustering based on PET and MRI radiomic features created three subgroups. These derived subgroups were statistically significantly associated with tumor grade (p = 2.0 × 10⁻⁶), tumor overall stage (p = 0.037), breast cancer subtypes (p = 0.0085), and disease recurrence status (p = 0.0053). The PET-derived first-order statistics and gray level co-occurrence matrix (GLCM) textural features were discriminative of breast cancer tumor grade, which was confirmed by the results of L2-regularized logistic regression (with repeated nested cross-validation) with an estimated area under the receiver operating characteristic curve (AUC) of 0.76 (95% confidence interval (CI) = [0.62, 0.83]). The results of ElasticNet logistic regression indicated that PET and MR radiomics distinguished recurrence-free survival, with a mean AUC of 0.75 (95% CI = [0.62, 0.88]) and 0.68 (95% CI = [0.58, 0.81]) for 1 and 2 years, respectively. The MRI-derived GLCM inverse difference moment normalized (IDMN) and the PET-derived GLCM cluster prominence were among the key features in the predictive models for recurrence-free survival. In conclusion, radiomic features from PET and MR images could be helpful in deciphering breast cancer phenotypes and may have potential as imaging biomarkers for prediction of breast cancer recurrence-free survival.
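The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition. It is a generic illustration with invented scores, not the paper's nested cross-validation pipeline.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive cases, 0 for negative cases.
    scores: model scores, where higher means more likely positive.
    Tied scores count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented labels and scores, for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc_from_scores(example_labels, example_scores)
```

In this toy case three of the four positive and negative pairs are ranked correctly, so the AUC is 0.75. The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size; rank-based formulas give the same value more efficiently.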