Predicting the components and types of kerogen in shale by combining machine learning with NMR spectra

Abstract

This study develops a new method that combines machine learning with nuclear magnetic resonance (NMR) spectra to predict the components and types of kerogen. Kerogen is the primary hydrocarbon source of shale oil/gas, and nearly half of the hydrocarbons in shale are adsorbed in kerogen. The adsorption and hydrocarbon generation capacity of kerogen is directly related to its type, molecular components, and structure. Extensive research has studied kerogen at the molecular level. Unfortunately, the existing methods are complicated, time-consuming, and labor-intensive. Compared with them, our method offers high-throughput prediction, high accuracy, and time savings, and it reduces the repetitive trial and error involved. This study proposes a solution for converting non-uniform two-dimensional (2D) graphs into uniform one-dimensional (1D) matrices, which makes 2D graph data available to machine learning models. An automatic labeling platform was constructed and used to annotate over 22,000 groups of organic matter molecules and their NMR spectra. The results show that the prediction accuracies for the carbon, hydrogen, and oxygen elements reach 96.1%, 94.8%, and 81.7%, respectively. In addition, the overall accuracy of classifying the three kerogen types is approximately 90%. These results reflect the excellent performance of the machine learning method. Therefore, our work provides an automated and intelligent prediction and analysis method, which is a powerful tool for kerogen studies at the molecular level.
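The abstract does not describe how the 2D-to-1D conversion is carried out. As a rough illustration of the general idea only, the sketch below bins a variable-length list of NMR peaks (chemical shift, intensity) onto a fixed grid so that every spectrum becomes a vector of the same length. The function name `spectrum_to_vector`, the 0 to 220 ppm range, and the 1 ppm bin width are assumptions made for this example and are not taken from the study.

```python
import numpy as np


def spectrum_to_vector(shifts, intensities, ppm_min=0.0, ppm_max=220.0, n_bins=220):
    """Bin a variable-length peak list onto a fixed grid (hypothetical helper).

    shifts      : chemical shifts in ppm, any length
    intensities : peak intensities, same length as shifts
    Returns a 1D array of length n_bins, scaled so the largest bin equals 1.
    The ppm range and bin count are illustrative assumptions.
    """
    shifts = np.asarray(shifts, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    if shifts.shape != intensities.shape:
        raise ValueError("shifts and intensities must have the same shape")

    # Sum the intensities that fall into each fixed-width ppm bin.
    # Peaks outside [ppm_min, ppm_max] are ignored by np.histogram.
    vec, _ = np.histogram(
        shifts, bins=n_bins, range=(ppm_min, ppm_max), weights=intensities
    )

    # Normalise so that spectra with different absolute scales are comparable.
    peak = vec.max()
    return vec / peak if peak > 0 else vec


# Two spectra with different numbers of peaks map to vectors of equal length.
a = spectrum_to_vector([10.5, 10.7, 128.2], [1.0, 1.0, 4.0])
b = spectrum_to_vector([30.5], [2.0])
```

Both `a` and `b` have the same length and could be stacked into a single feature matrix for a standard machine learning model, whatever the number of peaks in each source spectrum.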
