10 research outputs found

A comparative analysis of the effects of instructional design factors on student success in e-learning: multiple-regression versus neural networks

Author: Abdulkadir Geyik
Halil Ibrahim Cebeci
Harun Resit Yazgan
Publication venue: 'Co-Action Publishing'
Publication date: 01/01/2012

Comparison of artificial neural network and logistic regression models for prediction of mortality in head trauma based on initial clinical data

Author: A Das
B Kloppel
Behzad Eftekhar
C Bishop
CT Leondes
D Greenwood
D Hosmer
DJ Sargent
DW Patterson
Ebrahim Ketabchi
EW Lang
G Jando
G Vijaya
Hassan Eftekhar Ardebili
IL Des Plaines
J Gaudart
J Grigsby
JC Wyatt
JV Tu
Kazem Mohammad
L Ohno-Machado
M Moini
M Zargar
MG Penedo
Mohammad Ghodsi
N Terrin
SD Izenberg
SG Dorsey
SM DiRusso
T Nguyen
V Kemeny
WG Baxt
YC Li
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: In recent years, outcome prediction models using artificial neural network and multivariable logistic regression analysis have been developed in many areas of health care research. Both these methods have advantages and disadvantages. In this study we have compared the performance of artificial neural network and multivariable logistic regression models in prediction of outcomes in head trauma and studied the reproducibility of the findings. METHODS: 1000 logistic regression and ANN models based on initial clinical data related to the GCS, tracheal intubation status, age, systolic blood pressure, respiratory rate, pulse rate, injury severity score and the outcome of 1271 mainly head-injured patients were compared in this study. For each of the one thousand pairs of ANN and logistic models, the area under the receiver operating characteristic (ROC) curve, Hosmer-Lemeshow (HL) statistics and accuracy rate were calculated and compared using paired t-tests. RESULTS: ANN significantly outperformed logistic models in both discrimination and calibration but underperformed in accuracy. In 77.8% of cases the area under the ROC curve and in 56.4% of cases the HL statistics for the neural network model were superior to those for the logistic model. In 68% of cases the accuracy of the logistic model was superior to that of the neural network model. CONCLUSIONS: ANN significantly outperformed the logistic models in both discrimination and calibration but lagged behind in accuracy. This study clearly showed that any single comparison between these two models might not reliably represent the true end results. External validation of the designed models, using larger databases with different rates of outcomes, is necessary to get an accurate measure of performance outside the development population.
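
The paired comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `paired_t` and the toy numbers are assumptions made for illustration, and only the Python standard library is used. `auc` computes the area under the ROC curve through the rank-sum identity (the share of positive/negative pairs that the model orders correctly), and `paired_t` computes the paired t statistic over per-run metric values such as the AUCs of matched model pairs.

```python
import math

def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_t(a, b):
    """Paired t statistic for two equal-length lists of per-run metrics."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Toy data (hypothetical): outcome labels and one model's predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))

# Hypothetical per-run AUCs for two models over four matched runs.
ann_auc = [0.80, 0.82, 0.79, 0.85]
lr_auc = [0.78, 0.80, 0.80, 0.81]
print(paired_t(ann_auc, lr_auc))
```

A positive statistic here means the first list is higher on average; turning it into a p-value requires the t distribution with n - 1 degrees of freedom, which this sketch leaves out.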

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Macquarie University ResearchOnline

Sentiment without Sentiment Analysis: Using the Recommendation Outcome of Steam Game Reviews as Sentiment Predictor

Author: Zhang Anqi
Publication venue: Open PRAIRIE: Open Public Research Access Institutional Repository and Information Exchange
Publication date: 01/01/2022

This paper presents and explores a novel way to determine the sentiment of a Steam game review based on the predicted recommendation of the review, testing different regression models on a combination of Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Dirichlet Allocation (LDA) features. The dataset consists of Steam game reviews extracted from 21 games in the Programming games genre, along with other significant features such as the number of helpful likes on the recommendation, the number of hours played, and others. The features are grouped into three datasets: 1) keyword features only, 2) keyword features with the numerical features, and 3) numerical features only. Five different regression models were trained on the three datasets: Multilinear Regression, Lasso Regression, Ridge Regression, Support Vector Regression, and Multi-layer Perceptron Regression, which were then evaluated using RMSE, MAE, and MAPE. The review recommendation was predicted from each model, and the accuracy of the predictions was measured using the different error rates. The results of this research may prove helpful in the convergence of Machine Learning and Educational Games.
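
The three error measures named in this abstract have standard definitions, sketched below in plain Python. The function names and the sample values are assumptions for illustration and are not taken from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any true value is zero, so targets must be non-zero.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical true and predicted values.
y_true = [2.0, 4.0]
y_pred = [1.0, 6.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the true value, which is why it breaks down for targets equal to zero.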

Public Research Access Institutional Repository and Information Exchange

Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale

Author: Cira Calimanut-Ionut
Duan Ting
Han Qianqian
Manfreda Salvatore
Prikaziuk Egor
Su Bob
Szabó Brigitta
Wang Chao
Zeng Yijian
Zhang Lijie
Zhuang Ruodan
Publication date: 19/10/2023

Accurate information on surface soil moisture (SSM) content at a global scale under different climatic conditions is important for hydrological and climatological applications. Machine-learning-based systematic integration of in situ hydrological measurements, complex environmental and climate data, and satellite observations facilitates the generation of reliable data products to monitor and analyse the exchange of water, energy, and carbon in the Earth system at a proper space–time resolution. This study investigates the estimation of daily SSM using 8 optimised machine learning (ML) algorithms and 10 ensemble models (constructed via model bootstrap aggregating techniques and five-fold cross-validation). The algorithmic implementations were trained and tested using International Soil Moisture Network (ISMN) data collected from 1722 stations distributed across the world. The results showed that the K-neighbours Regressor (KNR) had the lowest root-mean-square error (RMSE; 0.0379 cm³ cm⁻³) on the “test_random” set (for testing the performance of randomly split data during training), the Random Forest Regressor (RFR) had the lowest RMSE (0.0599 cm³ cm⁻³) on the “test_temporal” set (for testing the performance on the period that was not used in training), and AdaBoost (AB) had the lowest RMSE (0.0786 cm³ cm⁻³) on the “test_independent-stations” set (for testing the performance on the stations that were not used in training). Independent evaluation on novel stations across different climate zones was conducted. For the optimised ML algorithms, the median RMSE values were below 0.1 cm³ cm⁻³. GradientBoosting (GB), Multi-layer Perceptron Regressor (MLPR), Stochastic Gradient Descent Regressor (SGDR), and RFR achieved a median r score of 0.6 in 12, 11, 9, and 9 climate zones, respectively, out of 15 climate zones. The performance of ensemble models improved significantly, with the median RMSE value below 0.075 cm³ cm⁻³ for all climate zones. 
All voting regressors achieved r scores of above 0.6 in 13 climate zones; BSh (hot semi-arid climate) and BWh (hot desert climate) were the exceptions because of the sparse distribution of training stations. The metric evaluation showed that ensemble models can improve the performance of single ML algorithms and achieve more stable results. Based on the results computed for three different test sets, the ensemble model with KNR, RFR and Extreme Gradient Boosting (XB) performed the best. Overall, our investigation shows that ensemble machine learning algorithms have a greater capability with respect to predicting SSM compared with the optimised or base ML algorithms; this indicates their huge potential applicability in estimating water cycle budgets, managing irrigation, and predicting crop yields.

Copernicus Publications

University of Twente Research Information

Predicting student satisfaction of emergency remote learning in higher education during COVID-19 using machine learning techniques

Author: Cheong Kai Yuen
Ho Indy
Weldon Anthony
Publication venue: VTC Institutional Repository
Publication date: 01/01/2021

Despite the wide adoption of emergency remote learning (ERL) in higher education during the COVID-19 pandemic, there is insufficient understanding of the factors that predict student satisfaction with this novel learning environment in a crisis. The present study investigated important predictors of the satisfaction of undergraduate students (N = 425) from multiple departments using ERL at a self-funded university in Hong Kong, where Moodle and Microsoft Teams are the key learning tools. Predictive accuracy was compared between multiple regression and machine learning models before and after the use of random forest recursive feature elimination. All multiple regression and machine learning models showed improved accuracy after feature elimination, and the most accurate model was the elastic net regression, with 65.2% explained variance. The overall satisfaction score for ERL was only neutral (4.11 on a 7-point Likert scale). Even though the majority of students are competent in technology and have no obvious issue in accessing learning devices or Wi-Fi, they prefer face-to-face learning to ERL, and this preference is found to be the most important predictor. In addition, the level of effort made by instructors, agreement on the appropriateness of the adjusted assessment methods, and the perception of online learning being well delivered are shown to be highly important in determining the satisfaction scores. The results suggest the need to review the quality and quantity of the modified assessments accommodated for ERL, and to provide structured class delivery with a suitable amount of interactive learning according to the learning culture and program nature.

Directory of Open Access Journals

VTC Institutional Repository (Vocational Training Council)

A Data-Driven Approach for Modeling Agents

Author: Kavak Hamdi
Publication venue: ODU Digital Commons
Publication date: 01/04/2019

Agents are commonly created on a set of simple rules driven by theories, hypotheses, and assumptions. Such a modeling premise makes limited use of real-world data and is challenged when modeling real-world systems due to the lack of empirical grounding. Simultaneously, the last decade has witnessed the production and availability of large-scale data from various sensors that carry behavioral signals. These data sources have the potential to change the way we create agent-based models, from simple rules to data-driven behaviors. Despite this opportunity, the literature has neglected to offer a modeling approach to generate granular agent behaviors from data. This dissertation proposes a novel data-driven approach for modeling agents to bridge this research gap. The approach is composed of four detailed steps: data preparation, attribute model creation, behavior model creation, and integration. The connection between and within each step is established using data flow diagrams. The practicality of the approach is demonstrated with a human mobility model that uses millions of location footprints collected from social media. In this model, the generation of movement behavior is tested with five machine learning/statistical modeling techniques covering a large number of model/data configurations. Results show that Random Forest-based learning is the most effective for the mobility use case. Furthermore, agent attribute values are obtained/generated with machine learning and translational assignment techniques. The proposed approach is evaluated in two ways. First, the use case model is compared to another model which is developed using a state-of-the-art data-driven approach. The model’s prediction performance is comparable to the state-of-the-art model. The plausibility of behaviors and model structure in the use case model is found to be closer to the real world than the state-of-the-art model. 
This outcome indicates that the proposed approach produces realistic results. Second, a standard mobility dataset is used for driving the mobility model in place of social media data. Despite its small size, the data and model resembled the results gathered from the primary use case indicating the possibility of using different datasets with the proposed approach

Old Dominion University

- Case of next-generation transportation market -

Author: 박영준
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2020

Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Technology Management, Economics, and Policy, August 2020. 이종수.

The present dissertation aims to provide insights into the application of different artificial neural network models in the analysis of consumer choice regarding next-generation transportation (NGT) services. It categorizes consumers' decisions regarding the adoption of new services according to Dewey's buyer decision process and then analyzes these decisions using a variety of different methods. In particular, various artificial neural network (ANN) models are applied to predict consumers' intentions. Also, the dissertation proposes an attention-based ANN model that identifies the key features that affect consumers' choices. Consumers' preferences for different types of NGT services are analyzed using a hierarchical Bayesian model. The analyzed consumer preferences are utilized to forecast demand for NGT services, evaluate government policies within the transportation market, and provide evidence regarding the social conflicts among traditional and new transportation services. The dissertation uses the Multiple Discrete-Continuous Extreme Value (MDCEV) model to analyze consumers' decisions regarding the use of different transportation modes. It also utilizes this MDCEV model analysis to estimate the effect of NGT services on consumers' travel mode selection behavior and the environmental effects of the transportation sector. Finally, the findings of the dissertation's analyses are combined to generate marketing and policy insights that will promote NGT services in Korea.

Korean abstract (translated): This study analyzes consumers' product adoption behavior, as defined by product and service adoption theory, by jointly using machine-learning-based artificial neural networks and conventional statistical marketing choice models. Existing product adoption theories define the influences on consumer choice stage by stage, but most of them focus on the relationship between consumer choice and consumers' intentions, opinions about the product, and perception levels rather than on how product attributes affect consumer choice. This study therefore analyzes consumer product adoption behavior more comprehensively, covering adoption intention, evaluation of alternatives, and product and usage choice. It classifies consumers' adoption-related choices into three stages: first, deciding whether they intend to use the product; second, evaluating product alternatives; and third, choosing how much of the product to use. Artificial neural networks and statistical marketing choice models are used to analyze each stage. Artificial neural networks, which perform strongly on prediction and classification tasks, are used to predict consumers' adoption intentions and to identify the key variables that influence those intentions. The neural network proposed here for identifying key variables outperformed existing variable selection techniques in terms of model fit. The model is likely to be useful for processing large volumes of consumer data, such as big data, and is also judged to be a convenient methodology for improving existing survey design techniques. To analyze preference-based evaluation of alternatives and usage, the study uses two statistical choice models: a hierarchical Bayesian model and a mixed MDCEV model. The hierarchical Bayesian model has the advantage of estimating individual consumer preferences, while the mixed MDCEV model can form diverse portfolios from the alternatives chosen according to consumer preferences and analyze the usage of each alternative. For the empirical study of the proposed models, the dissertation analyzes consumers' intention to use next-generation automotive transportation services, their preferences among service alternatives, and usage by transportation service. The empirical study reflects the stage-by-stage choice situations consumers face before adopting these services, and uses the results from each stage to forecast the growth potential of next-generation automotive transportation services and changes in consumers' travel behavior. The study shows that artificial neural networks can be useful in consumer research, and confirms that combining them with statistical marketing choice models allows consumer preferences to be analyzed comprehensively across the whole decision process, not only the product choice itself.

Contents
Chapter 1. Introduction: 1.1 Research Background; 1.2 Research Objective; 1.3 Research Outline
Chapter 2. Literature Review: 2.1 Product and Technology Diffusion Theory; 2.1.1 Extension of Adoption Models; 2.2 Artificial Neural Network; 2.2.1 General Component of the Artificial Neural Network; 2.2.2 Activation Functions of Artificial Neural Network; 2.3 Modeling Consumer Choice: Discrete Choice Model; 2.3.1 Multinomial Logit Model; 2.3.2 Mixed Logit Model; 2.3.3 Latent Class Model; 2.4 Modeling Consumer Heuristics in Discrete Choice Model; 2.4.1 Consumer Decision Rule in Discrete Choice Model: Compensatory and Non-Compensatory Models; 2.4.2 Choice Set Formation Behaviors: Semi-Compensatory Models; 2.4.3 Modeling Consumer Usage: MDCEV Model; 2.5 Difference between Artificial Neural Network and Choice Modeling; 2.6 Limitations of Previous Studies and Research Motivation
Chapter 3. Methodology: 3.1 Artificial Neural Network Models for Prediction; 3.1.1 Multiple Perceptron Model; 3.1.2 Convolutional Neural Network; 3.1.3 Bayesian Neural Network; 3.2 Feature Identification Model through Attention; 3.3 Hierarchical Bayesian Model; 3.4 Multiple Discrete-Continuous Extreme Value Model
Chapter 4. Empirical Analysis: Consumer Preference and Selection of Transportation Mode: 4.1 Empirical Analysis Framework; 4.2 Data; 4.2.1 Overview of the Survey; 4.3 Empirical Study I: Consumer Intention to New Type of Transportation; 4.3.1 Research Motivation and Goal; 4.3.2 Data and Model Setup; 4.3.3 Result and Discussion; 4.4 Empirical Study II: Consumer Choice and Preference for New Types of Transportation; 4.4.1 Research Motivation and Goal; 4.4.2 Data and Model Setup; 4.4.3 Result and Discussion; 4.5 Empirical Study III: Impact of New Transportation Mode on Consumers' Travel Behavior; 4.5.1 Research Motivation and Goal; 4.5.2 Data and Model Setup; 4.5.3 Result and Discussion
Chapter 5. Discussion
Bibliography
Appendix: Survey used in the analysis
Abstract (Korean)

SNU Open Repository and Archive

Comparison of the performance of multi-layer perceptron and linear regression for epidemiological data

Author: Gaudart Jean
Giusiano Bernard
Huiart Laetitia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

Neural networks are used increasingly as statistical models. The performance of multilayer perceptron (MLP) and that of linear regression (LR) were compared, with regard to the quality of prediction and estimation and the robustness to deviations from underlying assumptions of normality, homoscedasticity and independence of errors. Taking into account those deviations, five designs were constructed, and, for each of them, 3000 data were simulated. The comparison between connectionist and linear models was achieved by graphic means including prediction intervals, as well as by classical criteria including goodness-of-fit and relative errors. The empirical distribution of estimations and the stability of MLP and LR were studied by re-sampling methods. MLP and linear regression had comparable performance and robustness. Despite the flexibility of connectionist models, their predictions were stable. The empirical variances of weight estimations result from the distributed representation of the information among the processing elements. This emphasizes the major role of variances of weight estimations in the interpretation of neural networks. This needs, however, to be confirmed by further studies. Therefore MLP could be useful statistical models, as long as convergence conditions are respected.

HAL AMU

Comparison of the performance of multi-layer perceptron and linear regression for epidemiological data

Author: Gaudart Jean
Giusiano Bernard
Huiart Laetitia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

HAL AMU

HAL-Inserm

HAL-IRD

Modeling the risks of age-related eye diseases in a population in South India

Author: Sannapaneni Krishnaiah
Publication venue: UNSW, Sydney
Publication date: 01/01/2013

The objective of this research was to determine whether an artificial intelligence methodology such as an artificial neural network (ANN), a new type of predictive model, offers increased performance over a conventional logistic regression (LR) model in predicting the ranking of risk factors for irreversible age-related chronic eye diseases: age-related macular degeneration (AMD), diabetic retinopathy (DR), primary open-angle glaucoma (POAG) and primary angle-closure glaucoma (PACG) in a South Indian population. The LR and ANN models were derived and validated for their respective predictive accuracy based on a sample (n=3,723) aged >=40 years from a large-scale population-based epidemiologic study. Sub-population data were drawn from this sample by appropriate standard techniques used for modeling. The LR-based risk score (RS) models were derived and the model fit was assessed in a standard manner, including the bootstrap method for internal validity. The ANN model was built using a multi-layer feed-forward back-propagation network. The ANN model's predictive ability was compared with that of the traditional model with respect to the Area under the Receiver Operating Characteristic Curve (AUROC). The sensitivity and specificity of the fitted models with a threshold criterion ranged from 70% to nearly 99% overall for all models. The ANN model outperformed the traditional LR model in a sub-population analysis in predicting AMD and DR. The predictive accuracy of ANN and LR model in predicting AMD was statistically significant (AUROC=89% vs 79%; p=10 year (RS ranged from 29 to 42) was a highest priority predictor for DR. The modifiable risk factor intraocular pressure was the highest-priority predictor for POAG and PACG. Population attributable risk percentage and population attributable fractions revealed an urgent need to prioritize modifying the modifiable factors as a public health approach. 
This was supported by a sensitivity analysis of the ANN model which indicated the relative importance of prioritizing modifiable risk factors on which to base preventive interventions to reduce the impact of onset or progression of these diseases

UNSWorks