16 research outputs found

    RESEARCH OF EARLY STAGES OF MODELING

    This article considers how to estimate the accuracy of the average integral characteristics of a random process in simulation modeling. To treat the initial stage of modeling analytically, a conditionally nonstationary Gaussian process is analyzed as a stationary Gaussian process with a boundary prehistory. A model for the approximating autocorrelation function is recommended. Analytical expressions for the variance and mathematical expectation of the average integral estimate are obtained. A criterion of statistical estimation efficiency, the probability that the estimate falls within the correct parameter interval, is introduced. The dependence of estimation accuracy on the interval over which statistics are discarded during transient behavior is investigated for various types of processes
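
As a rough illustration of the variance of an average integral estimate, the sketch below assumes a stationary process with an exponential autocovariance K(tau) = sigma2 * exp(-alpha * |tau|). This model and the names `sigma2`, `alpha`, and `T` are assumptions made for illustration only: the abstract does not state which approximating autocorrelation function is recommended, and the transient (prehistory) effect is not modeled here.

```python
import math


def time_average_variance(sigma2: float, alpha: float, T: float) -> float:
    """Variance of (1/T) * integral_0^T x(t) dt for a stationary process
    with the ASSUMED autocovariance K(tau) = sigma2 * exp(-alpha * |tau|)."""
    a = alpha * T
    return 2.0 * sigma2 / a * (1.0 - (1.0 - math.exp(-a)) / a)


def time_average_variance_numeric(sigma2: float, alpha: float, T: float,
                                  steps: int = 20000) -> float:
    """Same quantity from the general formula
    (2 / T^2) * integral_0^T (T - tau) * K(tau) dtau, using the midpoint rule."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += (T - tau) * sigma2 * math.exp(-alpha * tau)
    return 2.0 * total * h / T ** 2
```

Under this assumed model the variance of the estimate shrinks as the averaging interval T grows, which is why the length of the observation window matters for accuracy.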

    Kaznewsdataset: Single country overall digital mass media publication corpus

    Mass media is one of the most important elements influencing the information environment of society. The mass media is not only a source of information about what is happening but is often the authority that shapes the information agenda and the boundaries and forms of discussion on socially relevant topics. A multifaceted and, where possible, quantitative assessment of mass media performance is crucial for understanding their objectivity, tone, thematic focus, and quality. The paper presents a corpus of Kazakhstan media, which contains over 4 million publications from 36 primary sources (each of which has at least 500 publications). The corpus also includes more than 2 million texts from Russian media for comparative analysis of the publication activity of the two countries, as well as about 4000 sections of state policy documents. The paper briefly describes the natural language processing and multiple-criteria decision-making methods that form the algorithmic basis of the text and mass media evaluation method, and describes the results of several research cases, such as identification of propaganda, assessment of the tone of publications, calculation of the level of socially relevant negativity, and comparative analysis of publication activity in the field of renewable energy. Experiments confirm the general possibility of evaluating socially significant news, identifying texts with propagandistic content, and evaluating the sentiment of publications using the topic model of the text corpus, since area under the receiver operating characteristic curve (ROC AUC) values of 0.81, 0.73 and 0.93 were achieved on the abovementioned tasks. The described cases do not exhaust the possibilities of thematic, tonal, dynamic, and other analyses of the considered corpus of texts. The corpus will be of interest to researchers working on both multiple-publication and mass media analysis, including comparative analysis and identification of common patterns inherent in the media of different countries

    One method of generating synthetic data to assess the upper limit of machine learning algorithms performance

    According to statistics from the World Nuclear Association, Kazakhstan has the highest uranium production in the world. Most of the uranium in the country is mined via in-situ leaching, and accurate classification of lithologic composition using electric logging data is economically crucial for this type of mining. In general, this classification is done manually, which is both inefficient and error-prone. Information technology tools, such as predictive analytics with Supervised Machine Learning (SML) algorithms and Artificial Neural Network (ANN) models, are nowadays widely used to automate geophysical processes, but little is known about their application to uranium mines. Previous experiments showed an ANN accuracy of about 60% in the task of lithological interpretation of logging data. To determine the upper limit of the accuracy of machine learning algorithms in such a task and to indirectly assess the experts’ influence, a digital borehole model was developed. This made it possible to generate a complete set of data while avoiding subjective expert assessments. Using these data, the performance of various ML algorithms, both simple (kNN) and deep learning models (LSTM), was evaluated

    Assessing the Impact of Expert Labelling of Training Data on the Quality of Automatic Classification of Lithological Groups Using Artificial Neural Networks

    Machine learning (ML) methods are nowadays widely used to automate geophysical studies. Some ML algorithms are used to solve lithological classification problems during the uranium mining process. One of the key aspects of using classical ML methods is choosing data features and estimating their influence on the classification. This paper presents a quantitative assessment of the impact of expert opinions on the classification process. In other words, we prepared the data, identified the experts, and performed a series of experiments with and without supplying the expert identifier to the input of the automatic classifier during training and testing. A feedforward artificial neural network (ANN) was used as the classifier. The results of the experiments show that the “knowledge” of the ANN of which expert interpreted the data improves the quality of the automatic classification in terms of accuracy (by 5 %) and recall (by 20 %). However, because the input parameters of the model may depend on each other, the SHapley Additive exPlanations (SHAP) method was used to further assess the impact of the expert identifier. SHAP made it possible to assess the degree of influence of each parameter. It revealed that the expert ID is at least two times more influential than any of the other input parameters of the neural network. This circumstance imposes significant restrictions on the application of ANNs to the task of lithological classification at uranium deposits

    Classification of Negative Information on Socially Significant Topics in Mass Media

    Mass media not only reflect the activities of state bodies but also shape the informational context, sentiment, depth, and significance level attributed to certain state initiatives and social events. Multilateral and quantitative (to the practicable extent) assessment of media activity is important for understanding their objectivity, role, focus, and, ultimately, the quality of the society’s “fourth estate”. The paper proposes a method for evaluating the media in several modalities (topics, evaluation criteria/properties, classes), combining topic modeling of the text corpora and multiple-criteria decision making. The evaluation is based on an analysis of the corpora as follows: the conditional probability distribution of media by topics, properties, and classes is calculated after the formation of the topic model of the corpora. Several approaches are used to obtain weights that describe how each topic relates to each evaluation criterion/property and to each class described in the paper, including manual high-level labeling, a multi-corpora approach, and an automatic approach. The proposed multi-corpora approach assesses the topical asymmetry of the corpora to obtain the weights describing each topic’s relationship to a certain criterion/property. These weights, combined with the topic model, can be applied to evaluate each document in the corpora according to each of the considered criteria and classes. The proposed method was applied to a corpus of 804,829 news publications from 40 Kazakhstani sources published from 01 January 2018 to 31 December 2019, to classify negative information on socially significant topics. A BigARTM model was derived (200 topics) and the proposed model was applied, including to fill a table of the analytical hierarchical process (AHP) and all of the necessary high-level labeling procedures. Experiments confirm the general possibility of evaluating the media using the topic model of the text corpora, because an area under the receiver operating characteristic curve (ROC AUC) score of 0.81 was achieved in the classification task, which is comparable with results obtained for the same task by applying the BERT (Bidirectional Encoder Representations from Transformers) model
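
The idea of combining per-topic weights with a topic model to score documents, and then measuring classification quality with ROC AUC, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the three-topic toy distributions, the weights, and the labels are all hypothetical, and a real pipeline would take the topic probabilities from a trained topic model such as the BigARTM model mentioned above.

```python
def document_score(topic_probs, topic_weights):
    """Score a document as sum over topics of p(topic | document) * weight(topic)."""
    return sum(p * w for p, w in zip(topic_probs, topic_weights))


def roc_auc(labels, scores):
    """ROC AUC computed as the probability that a randomly chosen positive
    document gets a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: weight = how strongly each topic relates to the
# "negative information" class; rows are p(topic | document).
topic_weights = [0.9, 0.1, 0.5]
docs = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]
labels = [1, 0, 1, 0]  # hypothetical manual labels
scores = [document_score(d, topic_weights) for d in docs]
auc = roc_auc(labels, scores)
```

The pairwise definition of ROC AUC is used here to keep the sketch dependency-free; it is quadratic in the number of documents, so a rank-based implementation would be preferable on a corpus of realistic size.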

    Mass Media as a Mirror of the COVID-19 Pandemic

    The media plays an important role in disseminating facts and knowledge to the public at critical times, and the COVID-19 pandemic is a good example of such a period. This research presents a comparative analysis of the representation of topics connected with the pandemic in the internet media of Kazakhstan and the Russian Federation. The main goal of the research is to propose a method that makes it possible to analyze the correlation between mass media dynamic indicators and the World Health Organization COVID-19 data. To solve the task, three approaches to representing mass media dynamics in numerical form (automatically obtained topics, average sentiment, and dynamic indicators) were proposed and applied according to a manually selected list of search queries. The results of the analysis indicate similarities and differences in the ways in which the epidemiological situation is reflected in publications in Russia and in Kazakhstan. In particular, the publication activity in both countries correlates with absolute indicators, such as the daily number of new infections and the daily number of deaths. However, mass media tend to ignore the positive rate of confirmed cases and the virus reproduction rate. With respect to the strictness of quarantine measures, mass media in Russia show a rather high correlation, while in Kazakhstan the correlation is much lower. Analysis of search queries revealed that in Kazakhstan the problem of fake news and disinformation is more acute during periods of deterioration of the epidemiological situation, when levels of crime and poverty increase. The novelty of this work is the proposal and implementation of a method that enables a comparative analysis of objective COVID-19 statistics and several mass media indicators. In addition, it is the first time that such a comparative analysis, between different countries, has been performed on a corpus in a language other than English

    Coverage Path Planning Optimization of Heterogeneous UAVs Group for Precision Agriculture

    Precision farming is one of the ways of transitioning to intensive methods of agricultural production. The application of unmanned aerial vehicles (UAVs) to problems of agriculture and animal husbandry is among the actively studied issues. UAVs can perform tasks such as monitoring, fertilizing, and herbicide application. However, the effective use of UAVs requires solving flight planning tasks that take into account the heterogeneity of the available attachments and the problem being solved during the overflight. This research investigates the problem of flight planning for a group of heterogeneous UAVs applied to coverage tasks, which may arise both in the course of monitoring and in the implementation of agrotechnical measures. A method of coverage path planning for a group of heterogeneous UAVs based on a genetic algorithm is proposed; this method plans flights by a group of UAVs using a moving ground platform on which UAVs are recharged and refueled (multi heterogenic UAVs coverage path planning with moving ground platform (mhCPPmp)). This method allows calculating a flyby that covers fields of different shapes and permits selecting the optimal subset of UAVs from the available set of devices; it also provides a 10% reduction in the cost of a flyby compared to an algorithm that does not use heterogeneous UAVs or a moving platform

    Analysis of the Correlation between Mass-Media Publication Activity and COVID-19 Epidemiological Situation in Early 2022

    The paper presents the results of a correlation analysis between the information trends in the electronic media of Kazakhstan and indicators of the epidemiological situation of COVID-19 according to the World Health Organization (WHO). The developed method is based on topic modeling and some other methods of processing natural language texts. The method allows for calculating the correlations between media topics, moods, the results of full-text search queries, and objective WHO data. The analysis of the results shows how the attitudes of society towards the problems of COVID-19 changed from 2021 to 2022. Firstly, the results reflect a steady trend of decreasing interest of electronic media in the topic of the pandemic, although to an unequal extent for different thematic groups. Secondly, there has been a tendency to shift the focus of attention to more pragmatic issues, such as remote learning problems, remote work, and the impact of quarantine restrictions on the economy
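
A correlation between a media indicator and an epidemiological indicator can be sketched as below. This is only an illustration under assumptions: the abstract does not say which correlation coefficient the authors use, so Pearson's coefficient is assumed here, and the two daily series are hypothetical toy numbers rather than data from the paper or from WHO.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily series: number of publications on a topic and
# number of newly confirmed cases on the same days.
publications_per_day = [12, 15, 22, 30, 28, 19]
new_cases_per_day = [100, 140, 210, 320, 300, 180]
r = pearson(publications_per_day, new_cases_per_day)
```

The same function could be applied to any pair of aligned series, for example average sentiment against the daily number of deaths, provided both are sampled on the same dates.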

    Review of Some Applications of Unmanned Aerial Vehicles Technology in the Resource-Rich Country

    The use of unmanned aerial vehicles (UAVs) in various spheres of human activity is a promising direction for countries with very different types of economies. This applies to resource-rich economies as well. The peculiarities of such countries are associated with their dependence on resource prices, since their economies have low diversification. Therefore, the employment of new technologies is one of the ways of increasing the sustainability of such economies’ development. In this context, the use of UAVs is a promising direction, since they are relatively cheap and reliable, and their use does not require a high-tech background. The most common use of UAVs is associated with various types of monitoring tasks. In addition, UAVs can be used for organizing communication, search, cargo delivery, field processing, etc. Using additional elements of artificial intelligence (AI) together with UAVs helps to solve these problems in automatic or semi-automatic mode. Such a combination is called intelligent unmanned aerial vehicle technology (IUAVT), and its employment increases the efficiency of UAV-based technology. However, in order to adapt IUAVT to sectors of the economy, it is necessary to overcome a range of limitations. The research is devoted to the analysis of opportunities for and obstacles to the adaptation of IUAVT in the economy. The possible economic effect is estimated for Kazakhstan as one of the resource-rich countries. The review consists of three main parts. The first part describes the IUAVT application areas and the tasks it can solve. The following areas of application are considered: precision agriculture, monitoring of hazardous geophysical processes, environmental pollution monitoring, exploration of minerals, monitoring of wild animals, monitoring of technical and engineering structures, and traffic monitoring. The economic potential is estimated by the areas of application of IUAVT in Kazakhstan. The second part reviews the technical, legal, and software-algorithmic limitations of IUAVT and modern approaches aimed at overcoming these limitations. The third part, the discussion, considers the impact of these limitations and the unsolved tasks of IUAVT employment in the areas of activity under consideration, and assesses the overall economic effect