5 research outputs found

    Database “Pro-family (pronatalist) communities in the social network VKontakte”

    No full text
    The database contains an upload of text comments from the social network VKontakte in .csv format (UTF-8 encoding). The comments were collected from communities that discuss pregnancy, childhood, motherhood, etc. The upload contains comments on posts that users interacted with. The absolute number of likes was used as the selection criterion (comments were collected where the number of likes is at least 5). The text data were pre-processed (stemming and lemmatization). The data are suitable for topic modelling (e.g. LDA, Latent Dirichlet Allocation), for modelling the graph structure of communities (the link_comment variable contains a unique post identifier, link_author contains a unique user identifier), for sentiment analysis of statements, and for building a dictionary of demographic connotations in Russian. Sentiment analysis of statements makes it possible to measure the dynamics of the “demographic temperature” in pro-family (pronatalist) communities.
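The graph use case mentioned above can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: only `link_comment` and `link_author` are named in the description, so the `likes` and `text` column names and the sample rows are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. Only link_comment and link_author are named in the
# dataset description; the `likes` and `text` columns are assumed here.
SAMPLE_CSV = """link_comment,link_author,likes,text
post_1,user_a,7,пример
post_1,user_b,5,пример
post_2,user_a,12,пример
post_2,user_c,3,пример
"""


def load_edges(csv_file, min_likes=5):
    """Return (author, post) pairs for rows with at least min_likes likes."""
    reader = csv.DictReader(csv_file)
    return [
        (row["link_author"], row["link_comment"])
        for row in reader
        if int(row["likes"]) >= min_likes
    ]


def co_commenters(edges):
    """Link authors who commented under the same post (author projection)."""
    authors_by_post = defaultdict(set)
    for author, post in edges:
        authors_by_post[post].add(author)

    neighbours = defaultdict(set)
    for authors in authors_by_post.values():
        for author in authors:
            neighbours[author] |= authors - {author}
    return dict(neighbours)


edges = load_edges(io.StringIO(SAMPLE_CSV))
graph = co_commenters(edges)
```

For a real file, the in-memory sample would be replaced with `open(path, encoding="utf-8", newline="")`. Since the published upload is described as already restricted to 5 or more likes, the `min_likes` filter may be redundant there.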

    Database “Childfree (antinatalist) communities in the social network VKontakte”

    No full text
    The database contains an upload of text comments in Russian from the social network VKontakte in .csv format (UTF-8 encoding). The comments were collected from communities that discuss pregnancy, childhood, motherhood, paternity, etc. The upload contains comments on posts that users interacted with. The absolute number of likes was used as the selection criterion (comments were collected where the number of likes is at least 5). The text data were pre-processed (stemming and lemmatization). The data are suitable for topic modelling (e.g. LDA, Latent Dirichlet Allocation), sentiment analysis of statements, modelling the graph structure of communities (the link_comment variable contains a unique post identifier, link_author contains a unique user identifier), and building a dictionary of demographic connotations in Russian. Sentiment analysis of statements makes it possible to measure the dynamics of the “demographic temperature” in antinatalist communities. The database is a supplement to the publication Kalabikhina IE, Banin EP (2020) Database «Pro-family (pronatalist) communities in the social network VKontakte». Population and Economics 4(3): 98–130. https://doi.org/10.3897/popecon.4.e60915

    Database of digital media publications on maternal (family) capital in Russia in 2006–2019

    No full text
    The database contains data on publications about maternity capital from Russian-language digital media registered in the Russian Federation, published between May 10, 2006 and June 30, 2019. General data on the publications are provided in .csv format (UTF-8 encoding), and the full texts of the publications are provided in .xml format. The publications were retrieved with a specialized query to public.ru, an aggregator of Russian-language digital mass media publications. In total, the database consists of 457,888 publications from 7,665 publishers in 1,251 settlements located in 85 regions of Russia. The database includes the date and type of each publication, the publisher, the place of publication (municipality), texts about maternity capital, and the numbers of unique positive, negative, and neutral words and phrases according to the RuSentiLex2017 dictionary, as well as the full texts of the publications.
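As a rough illustration of what "numbers of unique positive, negative, and neutral words" could mean in practice, the sketch below counts distinct lexicon matches per polarity. The toy lexicon, the function name, and the sample sentence are assumptions. The real RuSentiLex2017 file format and the authors' matching procedure (including lemmatization and multi-word phrases) are not described here and are not reproduced.

```python
import re

# Toy stand-in for a sentiment lexicon; NOT the actual RuSentiLex2017 entries.
TOY_LEXICON = {
    "хороший": "positive",
    "проблема": "negative",
    "закон": "neutral",
}


def unique_polarity_counts(text, lexicon):
    """Count distinct single-word lexicon matches per polarity in one text."""
    # A set keeps each word once, so repeated words are counted a single time.
    tokens = set(re.findall(r"\w+", text.lower()))
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for token in tokens:
        polarity = lexicon.get(token)
        if polarity is not None:
            counts[polarity] += 1
    return counts


counts = unique_polarity_counts(
    "Хороший закон, но проблема есть. Проблема и закон.", TOY_LEXICON
)
```

Because the sketch matches surface forms only, inflected Russian words would be missed unless the text is lemmatized first.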

    Identifying Reproductive Behavior Arguments in Social Media Content Users’ Opinions through Natural Language Processing Techniques

    No full text
    Big data provides researchers with valuable sources of information for studying demographic behavior in the population. One such source is the texts posted by social network users on various demographic issues. This study uses methods for automatically extracting user opinions from the “VKontakte” social network. The extracted texts are then classified using the Conversational RuBERT neural network model to investigate opinions related to reproductive behavior in the population. The classification process addresses two consecutive problems. First, it identifies whether a user’s comment contains argumentation. Second, if an argument is present, it determines the argument's type within the “personal-public” dichotomy. To search for arguments and classify their types, six experiments were conducted, varying the dataset and the number of classes. The method for automatic extraction and classification of user opinions on the “VKontakte” social network has demonstrated the ability to accurately classify users’ comments, identifying the presence of argumentation and categorizing the arguments within the “personal-public” dichotomy. This enables the identification of personal and social attitudes, values, stories, and opinions, thus facilitating the study of reproductive behavior.
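The two consecutive classification problems described above can be sketched as a simple cascade. This is only an illustration of the control flow: the keyword rules below are hypothetical placeholders standing in for fine-tuned classifiers (the study uses Conversational RuBERT, which is not reproduced here), and the function names are assumptions.

```python
from typing import Callable, Optional


def classify_comment(
    text: str,
    has_argument: Callable[[str], bool],
    argument_type: Callable[[str], str],
) -> Optional[str]:
    """Stage 1: detect argumentation. Stage 2: label it 'personal' or 'public'."""
    if not has_argument(text):
        return None  # no argument found, so stage 2 is skipped
    return argument_type(text)


# Placeholder stage-1 model: a keyword rule, not a trained classifier.
def toy_has_argument(text: str) -> bool:
    return "because" in text.lower()


# Placeholder stage-2 model: first-person wording is treated as 'personal'.
def toy_argument_type(text: str) -> str:
    padded = f" {text.lower()} "
    return "personal" if " i " in padded else "public"


no_argument = classify_comment("Nice photo!", toy_has_argument, toy_argument_type)
personal = classify_comment(
    "I want children because I love big families",
    toy_has_argument,
    toy_argument_type,
)
public = classify_comment(
    "Birth rates fall because housing is expensive",
    toy_has_argument,
    toy_argument_type,
)
```

Passing the two stages in as callables keeps the cascade independent of the model behind each stage, so the placeholders could be swapped for real classifiers without changing `classify_comment`.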