AN AUTOMATIC DEVELOPMENT AND INTEGRATION APPROACH FOR BIG DATA ANALYSIS MODULES

Abstract

With the arrival of the big data era, data analysis needs are becoming increasingly diverse. The built-in algorithm libraries of big data analysis tools can no longer meet customised analysis requirements, so new algorithms urgently need to be developed or integrated. However, developing and integrating algorithms for existing big data analysis tools carries a high learning cost, which makes it difficult to add new ones. This paper proposes an approach for the automatic development and integration of algorithms for big data analysis tools, in which algorithms are integrated into the tools as modules. The approach first defines the module model, then presents the automatic generation flow for that model, and finally focuses on automatic code generation and code detection for modules, proposing a metadata-based code generation scheme and a static code detection method based on Soot control flow. Experiments show that the approach can accomplish the automatic development and integration of big data analysis modules.
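
The idea of metadata-based code generation can be illustrated with a minimal sketch. It assumes a module is described by a small metadata record (class name, description, parameter list) and that a skeleton class is rendered from a text template. The metadata schema, the template, and the name `generate_module_source` are hypothetical and are not taken from the paper. The final `compile` call is only a crude syntax check and does not stand in for the Soot control-flow-based detection the paper proposes.

```python
from string import Template

# Hypothetical metadata describing one analysis module; the field names are
# illustrative assumptions, not the paper's actual schema.
MODULE_META = {
    "class_name": "KMeansModule",
    "description": "Cluster rows into k groups",
    "params": [
        {"name": "k", "type": "int", "default": 3},
        {"name": "max_iter", "type": "int", "default": 100},
    ],
}

# Skeleton of the generated class; placeholders are filled from the metadata.
CLASS_TEMPLATE = Template('''class $class_name:
    """$description"""

    def __init__($signature):
$assignments

    def run(self, data):
        raise NotImplementedError("algorithm body is supplied by the developer")
''')


def generate_module_source(meta):
    """Render Python source for a module skeleton from its metadata."""
    args = ["self"] + [
        f"{p['name']}: {p['type']} = {p['default']!r}" for p in meta["params"]
    ]
    assignments = "\n".join(
        f"        self.{p['name']} = {p['name']}" for p in meta["params"]
    )
    return CLASS_TEMPLATE.substitute(
        class_name=meta["class_name"],
        description=meta["description"],
        signature=", ".join(args),
        # An empty parameter list still needs a statement in the constructor body.
        assignments=assignments or "        pass",
    )


source = generate_module_source(MODULE_META)
compile(source, "<generated>", "exec")  # cheap syntax check of the generated code
```

In this sketch the developer only writes the metadata record and the body of `run`; the constructor and parameter handling are derived mechanically, which is the kind of boilerplate the approach aims to remove.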
