Gradient Descent (GD) is used to find a local minimum of a function. Its purpose here is to find the
parameters of the error function with which a model fits the data with minimum error. The purpose of this
research is therefore to determine how many iterations are needed and what level of accuracy is achieved in
predicting the data when using standard Gradient Descent (GD) and GD with Momentum and Adaptive Learning Rate (GDMALR)
functions. In this study, the data processed with the gradient descent functions are the School
Participation Rate (SPR) in Indonesia for ages 19-24 years, covering 2011 to 2017. This age range was
selected because it is one of the factors that determine the success of education in a country, especially
Indonesia. SPR is known as one of the indicators of successful development of education services in an area,
whether a Province, Regency or City in Indonesia. The higher the value of SPR, the more successful the area
is considered to be in providing access to education services. SPR data are taken from the Indonesian Central Bureau of
Statistics. This study uses three network architecture models: 5-5-1, 5-15-1 and 5-25-1. Of the three
models, the best is 5-5-1, with 6202 epochs (iterations), 94% accuracy and an MSE of 0.0008658637. This
model is then used to predict SPR in Indonesia for the next 3 years (2018-2020). These results are
expected to help the Indonesian government further improve scholarships and the quality of
education in the future.
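
As an illustration of the kind of update rule that GD with momentum and an adaptive learning rate refers to, the following is a minimal sketch and not the implementation used in this study. The function name gd_momentum_adaptive, all default parameter values (momentum 0.9, rate increase 1.05, rate decrease 0.7, allowed error increase 1.04) and the synthetic data are assumptions made for the example. A linear model with 5 inputs stands in for the network so that the sketch stays short.

```python
import numpy as np


def gd_momentum_adaptive(loss, grad, w0, lr=0.01, momentum=0.9,
                         lr_inc=1.05, lr_dec=0.7, max_inc=1.04,
                         goal=1e-6, max_epochs=10000):
    """Minimise `loss` by gradient descent with momentum and an adaptive
    learning rate. All default values are illustrative assumptions."""
    w = np.asarray(w0, dtype=float)
    step = np.zeros_like(w)           # previous update, reused by the momentum term
    current = loss(w)
    for epoch in range(max_epochs):
        if current <= goal:
            return w, current, epoch  # goal reached after `epoch` iterations
        new_step = momentum * step - lr * grad(w)
        candidate = w + new_step
        cand_loss = loss(candidate)
        if cand_loss > current * max_inc:
            # Error grew too much: reject the step, shrink the learning
            # rate and drop the accumulated momentum.
            lr *= lr_dec
            step = np.zeros_like(w)
        else:
            if cand_loss < current:
                lr *= lr_inc          # error fell: try a larger rate next time
            w, step, current = candidate, new_step, cand_loss
    return w, current, max_epochs


# Synthetic stand-in data (NOT the SPR data): 50 samples with 5 inputs each.
rng = np.random.default_rng(0)
X = rng.random((50, 5))
w_true = np.array([0.5, -1.0, 2.0, 0.3, 1.5])  # hypothetical target weights
y = X @ w_true


def mse(w):
    """Mean squared error of the linear model X @ w against y."""
    return float(np.mean((X @ w - y) ** 2))


def mse_grad(w):
    """Gradient of the mean squared error with respect to w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)


initial_loss = mse(np.zeros(5))
w_fit, final_loss, epochs = gd_momentum_adaptive(mse, mse_grad, np.zeros(5))
print(f"epochs={epochs}, MSE={final_loss:.3e}")
```

In this sketch, calling the same function with momentum=0.0, lr_inc=1.0, lr_dec=1.0 and max_inc=float("inf") reduces it to standard gradient descent with a fixed learning rate, which gives one way to compare the number of iterations the two variants need on the same data.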