
    Curated dataset of asphaltene structures

    Asphaltenes, a distinct class of molecules found in crude oil, are insoluble in nonpolar solvents such as n-heptane but soluble in aromatic solvents such as toluene and benzene. Understanding asphaltenes is crucial in the petroleum industry because of their detrimental effects on oil processing, which cause significant economic losses and production disruptions. While no single structure defines asphaltenes, two major molecular architectures, the archipelago and continental models, have gained wide acceptance for their consistency with experimental investigations and their subsequent use in computational studies. The archipelago model comprises two or more polyaromatic hydrocarbon entities interconnected via aliphatic side chains. In contrast, the island or continental model features a single polyaromatic hydrocarbon moiety with 4 to 10 fused aromatic rings, averaging around 7 rings. To establish a comprehensive collection, we curated over 250 asphaltene structures from previous experimental and computational studies in this field. Our curation process involved an extensive literature survey, conversion of figures from publications into molecular structure files, careful verification of conversion accuracy, and structure editing to ensure agreement with molecular formulas. Our database provides digital structure files and optimized geometries for both predominant structural motifs. Geometries were first optimized with the PM6 semi-empirical method and then refined with density functional theory using the B3LYP functional and the 6-31+G(d,p) basis set. Furthermore, we compiled a range of structural and electronic features for these molecules, providing a foundation for applying machine learning algorithms to asphaltenes. This work provides a ready-to-use structural database of asphaltenes and sets the stage for future research endeavours in this domain.
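
    The two-stage optimization described in this abstract (PM6 first, then B3LYP/6-31+G(d,p)) might be set up roughly as in the sketch below. It assumes Gaussian-style input files, which the abstract does not confirm; the function names `gaussian_input` and `two_stage_inputs` and the H2 placeholder geometry are hypothetical and serve only as illustration.

```python
# Sketch only: the abstract names the methods (PM6, then B3LYP/6-31+G(d,p))
# but not the software. Gaussian-style input text is an assumption here.

STAGES = [
    ("PM6 pre-optimization", "PM6 Opt"),
    ("DFT refinement", "B3LYP/6-31+G(d,p) Opt"),
]


def gaussian_input(title, route, atoms, charge=0, multiplicity=1):
    """Return Gaussian-style input text for one geometry optimization.

    atoms is a list of (element symbol, x, y, z) tuples in angstroms.
    """
    lines = [f"# {route}", "", title, "", f"{charge} {multiplicity}"]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # a blank line terminates the geometry section
    return "\n".join(lines) + "\n"


def two_stage_inputs(name, atoms):
    """Build one input per stage from the same starting geometry.

    In practice the second stage would start from the geometry that the
    first stage produced; this sketch reuses `atoms` for both.
    """
    return [gaussian_input(f"{name}: {label}", route, atoms)
            for label, route in STAGES]


# Placeholder geometry (H2), not an asphaltene structure from the dataset.
demo_atoms = [("H", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 0.74)]
pm6_text, dft_text = two_stage_inputs("demo", demo_atoms)
print(pm6_text)
print(dft_text)
```

    Running the cheap semi-empirical step first is a common way to hand the more expensive DFT step a reasonable starting geometry.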

    Machine learning to identify structural motifs in asphaltenes

    Asphaltenes are organic compounds that aggregate in crude oil and have two dominant molecular architectures: archipelago and continental. Continental architectures possess a single island of fused aromatic rings, in contrast to archipelago architectures, in which aromatic cores are interconnected through aliphatic chains. The structural composition of asphaltenes varies with geographical origin, and this lack of uniformity makes classification challenging. This study is the first known exploration of image-based supervised machine learning, specifically the ResNet-50 neural network, for the binary classification of asphaltenes into continental and archipelago motifs. A set of 255 continental and archipelago models underwent structural augmentation to create a sample of 1,530 asphaltene structures, large enough to support accurate results in both the training and testing portions of the machine learning. These augmentations consisted of repeatedly adding carbons to a specified carbon on each asphaltene structure until a complete pentane chain had been added. Using Mathematica, supervised ResNet-50 image-based classification was applied to both the original and augmented structure datasets to label each structure as either archipelago or continental. Classification was also performed using topological similarity searching, based on the associations between atoms and the distances between them, for further molecule identification. This study demonstrates the surprising effectiveness of image-based classification compared to traditional topological feature-based methods. Our results reveal that deep learning techniques, especially image-based approaches, provide novel and insightful ways to differentiate complex molecular structures like asphaltenes, challenging the traditional reliance on topological features alone. This research opens new avenues in chemical analysis and molecular characterization, highlighting the potential of machine learning in complex molecular systems.
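
    The augmentation step in this abstract can be illustrated with a small sketch. It assumes that each of the 255 structures yields the original plus five variants carrying one to five added carbons, which would account for 255 × 6 = 1,530 structures; the abstract does not state this breakdown explicitly. The molecule representation (an element dictionary plus an adjacency map) and the names `add_chain` and `augment` are hypothetical, and the study itself used Mathematica rather than Python.

```python
# Sketch only: a molecule is represented as {atom_index: element} plus an
# adjacency map {atom_index: set(neighbour indices)}. Hydrogens are omitted.


def add_chain(elements, bonds, anchor, n_carbons):
    """Return copies of the molecule with an n-carbon chain grown from anchor."""
    elements = dict(elements)
    bonds = {atom: set(nbrs) for atom, nbrs in bonds.items()}
    previous = anchor
    for _ in range(n_carbons):
        new = max(elements) + 1        # next unused atom index
        elements[new] = "C"
        bonds[new] = {previous}        # bond the new carbon to the chain end
        bonds[previous].add(new)
        previous = new
    return elements, bonds


def augment(elements, bonds, anchor, max_chain=5):
    """Original structure plus variants with 1..max_chain added carbons."""
    return [add_chain(elements, bonds, anchor, n) for n in range(max_chain + 1)]


# Placeholder input: a six-membered carbon ring, not a structure from the study.
ring_elements = {i: "C" for i in range(6)}
ring_bonds = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}

variants = augment(ring_elements, ring_bonds, anchor=0)
print([len(elems) for elems, _ in variants])
print(255 * len(variants))
```

    Each variant is built on a copy, so the original structure is left unchanged and can be kept in the dataset alongside its augmented forms.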