The single most important bibliometric criterion for judging the impact of biomedical papers and their authors’ work is the number of citations received, commonly referred to as the “citation count”. This metric, however, is unavailable until several years after publication. In the present work, we build computer models that accurately predict the citation counts of biomedical publications over a ten-year horizon, using only predictive information available at publication time. Our experiments show that it is indeed feasible to accurately predict future citation counts with a mixture of content-based and bibliometric features using machine learning methods. The models pave the way for practical prediction of the long-term impact of publications, and their statistical analysis provides greater insight into citation behavior.