
    Incorporating neighbourhood features in RNNs for popularity forecasting for emerging research fields

    Accurately modelling and forecasting the popularity of emerging fields can help researchers allocate resources and effort to promising research directions. While existing forecasting approaches enjoy various levels of success, most suffer from at least one of the following three challenges: a limited scope, due to mining topic terms from only a few documents; low generalizability, due to assigning arbitrary binary classifications that label topics as either emerging or non-emerging; and a reliance on an emerging topic or field of study’s own historical features as inputs to forecast its future popularity, which disregards the effect of a “cold start”. In this thesis, we propose a framework comprising two algorithms. The first is a customised ontology extraction algorithm that generates a field of study taxonomy from a scholarly database when none exists. Using the field of study taxonomy, the second, a forecasting algorithm, addresses the three challenges in three steps. Firstly, we leverage the field of study taxonomy present in most academic databases to obtain a neighbourhood of trending fields within the discipline of the field of study of interest. Then, dynamic time warping is used to measure the similarity between each neighbour’s trending pattern and the trending pattern of the field of study of interest. Lastly, we conduct multivariate forecasting using an RNN model such as long short-term memory (LSTM) or a dual-stage attention-based recurrent neural network (DA-RNN), with the historical popularity scores of similar trending neighbours as input. Experimental results on 10 emerging and non-emerging fields of study showcase the existence and various dynamics of “cold start”. In addition, the proposed algorithm is shown to greatly reduce the RMSE, MAE, and MAPE of forecasts relative to traditional methods for emerging fields while retaining similar performance for non-emerging fields. This validates the significance of these challenges for existing methods and provides insight into the dependency of emerging topics on their historical features.
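The second and third steps described in the abstract (ranking neighbours by dynamic time warping, then assembling a multivariate input for an RNN) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the absolute-difference cost, the top-k selection rule, and the toy popularity scores are all assumptions made for illustration, and the series are assumed to be equal-length and already aligned in time.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with an absolute-difference cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = cheapest alignment of a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def select_similar_neighbours(target, neighbours, k=2):
    """Return the names of the k neighbours closest to the target under DTW.

    Hypothetical helper: the top-k rule is an assumption, not the thesis's criterion.
    """
    ranked = sorted(neighbours, key=lambda name: dtw_distance(target, neighbours[name]))
    return ranked[:k]


def build_multivariate_input(target, neighbours, selected):
    """Stack the target and selected neighbour series into a (time, features) array."""
    return np.column_stack([target] + [neighbours[name] for name in selected])


# Toy yearly popularity scores (made-up numbers, for illustration only).
target = [1, 2, 3, 5, 8]
neighbours = {
    "field_a": [1, 2, 3, 5, 8],
    "field_b": [1, 1, 2, 3, 5],
    "field_c": [9, 7, 5, 3, 1],
}

selected = select_similar_neighbours(target, neighbours, k=2)
features = build_multivariate_input(target, neighbours, selected)
print(selected, features.shape)
```

The resulting array has one column for the field of interest and one per selected neighbour; a sliding window over its rows would give the multivariate sequences that an LSTM or DA-RNN model consumes.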