    Principles of Green Data Mining

    This paper develops a set of principles for green data mining, related to the key stages of business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The principles are grounded in a review of the Cross Industry Standard Process for Data Mining (CRISP-DM) model and relevant literature on data mining methods and Green IT. We describe how data scientists can contribute to designing environmentally friendly data mining processes, for instance, by using green energy, choosing between make-or-buy, exploiting approaches to data reduction based on business understanding or pure statistics, or choosing energy-friendly models.

    A Survey of Parallel Data Mining

    With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackling the problem of scalability in data mining. Recently there has been considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm or paradigm. This paper surveys parallel data mining from a broader perspective. More precisely, we discuss the parallelization of data mining algorithms from four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms, and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms.

    Reuse, Reduce, Support: Design Principles for Green Data Mining

    This paper reports on a design science research (DSR) study that develops design principles for “green” (more environmentally sustainable) data mining processes. Grounded in the Cross Industry Standard Process for Data Mining (CRISP-DM) and in a review of relevant literature on data mining methods, Green IT, and Green IS, the study identifies eight design principles that fall into the three categories of reuse, reduce, and support. The paper develops an evaluation strategy and provides empirical evidence for the principles’ utility. It suggests that the results can inform the development of a more general approach towards Green Data Science and provide a suitable lens to study sustainable computing.

    The use of big data and data mining in the investigation of criminal offences

    The aim of this study was to determine the features and prospects of using Big Data and Data Mining in criminal proceedings. The research used a systematic approach, descriptive analysis, systematic sampling, a formal legal approach, and forecasting. Big Data and Data Mining are applied to various crimes whose common features are their seriousness and the complexity of their investigation. The study identified the common tools of Big Data and Data Mining in crime investigation and crime forecasting, treated as interrelated tasks. Databases are created by processing data sources with Data Mining methods, each distinguished by the specifics of its use. The main risks of implementing Big Data and Data Mining are violations of human rights and freedoms. Improving the use of Big Data and Data Mining requires standardization of procedures with strict adherence to fundamental ethical, organizational, and procedural rules. The use of Big Data and Data Mining is a forensic innovation in the investigation of serious crimes and in the creation of an evidence base for criminal justice. The prospects for widespread use of these methods involve the standardization of procedures based on ethical, organizational, and procedural principles. It is appropriate to outline these procedures in practical framework recommendations, emphasizing the responsibility of officials who violate the specified principles. Further research should address the improvement of innovative technologies and the legal regulation of their application.

    Evolutionary Data Mining Design to Visualize the Examination Timetabling Data at a University: A First Round Development.

    Examination scheduling ("timetabling") at a university is a persistent challenge. Allocating exams to stipulated "time slots" requires advanced quantitative techniques. This study takes an alternative approach, applying the principles of data mining (DM), specifically undirected data mining and data preprocessing, to find patterns in the data and then understand the relationships between them.

    On the design of a secure data warehouse

    A data warehouse is a combination of subject-oriented, authoritative, integrated databases designed to support decision support systems (DSS). Data mining means extracting useful, previously unknown information from information sources. The topic of data warehousing and data mining encompasses architectures, algorithms, and tools for collecting selected data from multiple databases or other information sources into a single repository, known as a data warehouse, which is structured to facilitate querying and analysis in support of decision making. In this thesis, we discuss the principles, architecture, and implementation of data warehouses. We also discuss the construction of data mining and some algorithms applied to data mining, and we examine the security of data warehouses.

    Data Driven Data Mining to Domain Driven Data Mining

    In the preceding decade, data mining has emerged as one of the most active areas in information technology. Traditional data mining depends heavily on the data itself and relies on data-oriented methodologies. There is therefore a general need to bridge the gap between academia and industry by addressing the domain-related issues that surround real-life applications. Domain-Driven Data Mining tries to build general principles, methodologies, and techniques for modelling and reconciling wide-ranging domain-related factors and synthesized ubiquitous intelligence surrounding problem domains with the data mining process, and for discovering knowledge to support business decision-making.

    Toward standardization in privacy-preserving data mining.

    Introduction. Problems in defining privacy. Privacy-preserving data mining. Privacy violation in data mining. Defining privacy preservation in data mining. Characterizing scenarios in PPDM. Principles and policies for PPDM. The OECD privacy guidelines. The implications of the OECD privacy guidelines in PPDM. Adopting PPDM policies from the OECD privacy guidelines. Requirements for PPDM. Requirements for the development of technical solutions. Requirements to guide the deployment of technical solutions. Related work. Conclusions. In the publication: Stanley R. M. Oliveira