15 research outputs found

    Building an Associative Classification Data Model Based on the Apriori Method

    The purpose of the work is to explore the current problems of, and prospects for, mining big web data in real time, and to assess whether Web Mining technology can be applied to big web data in a practical example. Materials and methods. The study included a review of bibliographic sources on big data mining. We used Web Mining technology for associative analysis of large web data, and modeled the practical task of transaction analysis in a general-purpose scripting language (PHP). Results. The specifics of Data Mining technology were described, and Web Mining, a modern approach to the analysis of large web data, was analyzed. A brief classification of the tasks solved with Web Mining technology is given. The problems of mining large web data in a general-purpose scripting language (PHP) were addressed: the lack of data mining libraries, the difficulty of normalizing data into the form required for associative analysis, and interaction with the database management system. An example showing an approach to mining large web data was also implemented. Based on an understanding of Web Mining technology and the described difficulties of analyzing web data in PHP, methods were proposed for effectively solving the practical problem of analyzing web data from transactions committed in a dynamic web application. A module for associative analysis of customer transactions was developed in PHP. The module includes an intelligent data processing class. The structural scheme of the module and the system architecture were developed. The module solves the main part of the problem of associative analysis of large web data using Web Mining technology, namely identifying patterns in a large array of web data.
The combination of a general-purpose scripting language and an object-oriented approach makes associative analysis of web data much faster. Conclusion. According to the results of the study, it can be argued that the current state of the technology for analyzing large web data makes it possible to process data objects efficiently, identify patterns, obtain hidden data, and receive complete statistical data in real time. The results can be used both for initial research into technologies for analyzing large web data and as an addition to a content management system for the intelligent analysis of web data. The use of associative analysis and the universal handler class makes the module flexible, while the possibility of manual integration makes it universal. With manual integration, the choice of database management system is not important, because the algorithm's methods work on the selected data. This greatly simplifies further development of the program code.
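
The abstract above does not reproduce the module's code. As a rough illustration of what associative analysis of customer transactions with the Apriori method involves, here is a minimal sketch in Python rather than the PHP used in the paper. The function names `apriori` and `rules`, the thresholds, and the sample baskets are all assumptions made for this example; they are not taken from the authors' implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Hypothetical helper: support is the fraction of transactions
    that contain the itemset.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level-1 candidates: every distinct item seen in any basket.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that occur often enough.
        level = {}
        for cand in candidates:
            support = sum(1 for basket in baskets if cand <= basket) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from frequent itemsets."""
    found = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(itemset, size):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already stored in `frequent`.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


# Illustrative data only: four made-up customer baskets.
sample = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
frequent_sets = apriori(sample, min_support=0.5)
strong_rules = rules(frequent_sets, min_confidence=0.9)
```

With these made-up baskets, the only rule that reaches the 0.9 confidence threshold is butter → bread, since every basket containing butter also contains bread. A production module would also need the data normalization and database access steps that the abstract lists among its difficulties.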

    The problem of analysis of big web data and the use of data mining technology for processing and searching patterns in big web data on a practical example

    The purpose of the work is to study the current problems of, and prospects for, processing big data received from or stored on the Internet (web data), and to assess whether Data Mining technology can be applied to big web data in a practical example. Materials and methods. The study included a review of bibliographic sources on big data analysis problems. Data Mining technology was used to analyze large web data, together with computer modeling of a practical problem in the C# programming language and the creation of a DDL database structure for accumulating web data. Results. The specifics of big data were described, its main characteristics were highlighted, and modern approaches to processing big data were analyzed. A brief description of the horizontally scalable architecture and the BI-solution architecture for big data processing is given. The problems of processing large web data are formulated: limited data access speed, and access provided via network protocols over general-purpose networks. An example showing the approach to processing large web data was also implemented. Based on the concept of big data, the described complexities of web data processing, and the methods of Data Mining, techniques were proposed for effectively solving the practical problem of processing and searching for patterns in a large data array. The following classes were developed in C#: a class for receiving web data over the Internet, a data conversion class, and an intelligent data processing class. A DDL script was created that builds a structure for accumulating web data, and a single UML class diagram was developed. The resulting system of data and classes solves the main part of the problems of processing large web data and performs intelligent processing using Data Mining technology in order to identify certain records in a large array.
The combination of an object-oriented approach, neural networks, and BI analysis for filtering data will speed up data processing and obtaining the result of the study. Conclusion. According to the results of the study, it can be argued that the current state of technology for analyzing large web data makes it possible to process data objects efficiently, identify patterns, obtain hidden data, and get full-fledged statistical data. The results can be used both for an initial study of big data processing technologies and as a basis for developing a real application for analyzing web data. The use of neural networks and the universal handler classes makes the architecture flexible and self-learning, and the class declarations and the base DDL structure will greatly simplify the development of program code.