421,223 research outputs found

    Mining unstructured software data

    Our thesis is that the analysis of unstructured data supports software understanding and evolution analysis, and complements the data mined from structured sources. To this aim, we implemented the necessary toolset and investigated methods for exploring, exposing, and exploiting unstructured data. To validate our thesis, we focused on development email data. We found two main challenges in using it to support program comprehension and software development: the disconnection between emails and code artifacts, and the noisy, mixed-language nature of email content. We tackle these challenges by proposing novel approaches. First, we devise lightweight techniques for linking email data to code artifacts. We use these techniques to create a tool that supports program comprehension with email data, and to create a new set of email-based metrics that improve existing defect prediction approaches. Subsequently, we devise techniques for giving a structure to the content of email, and we use this structure to conduct novel software analyses that support program comprehension. In this dissertation we show that unstructured data, in the form of development emails, is a valuable addition to structured data and, if correctly mined, can be used successfully to support software engineering activities.
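The abstract above does not say how emails are linked to code artifacts. As a rough illustration only, one lightweight option is to search each email body for class names as whole words. The function name `link_emails_to_classes`, the sample emails, and the class names below are all hypothetical, and the matching rule (case-sensitive, whole word) is an assumption rather than the dissertation's method.

```python
import re


def link_emails_to_classes(emails, class_names):
    """Map each class name to the ids of the emails that mention it.

    Assumption: a mention is a case-sensitive, whole-word occurrence
    of the class name in the email body.
    """
    links = {name: [] for name in class_names}
    for email_id, body in emails.items():
        for name in class_names:
            # \b keeps "Parser" from matching inside "XmlParser".
            if re.search(r"\b" + re.escape(name) + r"\b", body):
                links[name].append(email_id)
    return links


# Hypothetical development emails and class names.
emails = {
    "msg-1": "The NullPointerException comes from XmlParser.parse(), see trace.",
    "msg-2": "I think Parser should be split into two classes.",
    "msg-3": "Lunch on Friday?",
}
class_names = ["XmlParser", "Parser"]

links = link_emails_to_classes(emails, class_names)
print(links)
```

Whole-word matching is cheap but misses misspellings and flags class names that are also ordinary English words, which is one facet of the noise problem the abstract mentions.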

    RESEARCH ISSUES CONCERNING ALGORITHMS USED FOR OPTIMIZING THE DATA MINING PROCESS

    In this paper, we describe some of the most widely used data mining algorithms, which have had considerable utility and influence in the research community. A data mining algorithm can be regarded as a tool that creates a data mining model. After analyzing a set of data, an algorithm searches for specific trends and patterns, then defines the parameters of the mining model based on the results of this analysis. These parameters play a significant role in identifying and extracting actionable patterns and detailed statistics. The most important algorithms within this research cover topics such as clustering, classification, association analysis, statistical learning, and link mining. After a brief description of each algorithm, we analyze its application potential and the research issues concerning the optimization of the data mining process. We then describe the most important data mining algorithms included in Microsoft and Oracle software products, give suggestions and criteria for choosing the most suitable algorithm for a given task, and outline the advantages these software products offer.
    Keywords: data mining optimization, data mining algorithms, software solutions

    DATA MINING AND THE PROCESS OF TAKING DECISIONS IN EBUSINESS

    Data mining software allows users to analyze large databases to solve business decision problems. Data mining is, in some ways, an extension of statistics, with a few artificial intelligence and machine learning twists thrown in. Like statistics, data mining is not a business solution; it is just a technology. For example, consider a catalog retailer who needs to decide who should receive information about a new product. The information operated on by the data mining process is contained in a historical database of previous interactions with customers and the features associated with those customers, such as age, zip code, and their responses. The data mining software would use this historical information to build a model of customer behavior that could be used to predict which customers are likely to respond to the new product. Using this information, a marketing manager can select only the customers who are most likely to respond. The operational business software can then feed the results of the decision to the appropriate touch point systems (call centers, direct mail, web servers, email systems, etc.) so that the right customers receive the right offers.
    Keywords: data mining, business decisions, data analysis, cluster analysis, decision strategy
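The catalog retailer example above can be sketched in a few lines. This is only an illustration under stated assumptions: the "model" is a table of historical response rates per (age band, zip code) segment, a stand-in for whatever model real data mining software would build. The age bands, the 0.5 threshold, the sample data, and all function names are hypothetical.

```python
from collections import defaultdict


def age_band(age):
    # Hypothetical two-band split; a real model would choose its own features.
    return "under_40" if age < 40 else "40_plus"


def fit_response_rates(history):
    """Estimate the response rate of each (age band, zip code) segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, mailings]
    for age, zip_code, responded in history:
        segment = (age_band(age), zip_code)
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}


def select_customers(customers, rates, threshold):
    """Return ids of customers whose segment rate meets the threshold."""
    selected = []
    for customer_id, age, zip_code in customers:
        # Segments never seen in the history are scored 0.0 (an assumption).
        rate = rates.get((age_band(age), zip_code), 0.0)
        if rate >= threshold:
            selected.append(customer_id)
    return selected


# Hypothetical history: (age, zip code, responded to a past mailing).
history = [
    (25, "10001", True), (31, "10001", True), (38, "10001", False),
    (52, "10001", False), (61, "10001", False),
    (45, "94105", True), (58, "94105", False),
]
rates = fit_response_rates(history)

# Hypothetical customers to consider for the new product mailing.
customers = [
    ("c1", 29, "10001"), ("c2", 55, "10001"),
    ("c3", 47, "94105"), ("c4", 33, "60601"),
]
selected = select_customers(customers, rates, threshold=0.5)
print(selected)
```

The selected ids are what the operational software would pass on to the touch point systems; the threshold is the business decision, not something the model supplies.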

    DATA MINING TECHNOLOGIES

    Knowledge discovery and data mining (KDD) has emerged and grown rapidly as an interdisciplinary field that merges databases, statistics, and closely related industries, driven by the desire to extract as much valuable information and knowledge as possible. "Knowledge discovery" and "data mining" are understood differently. Knowledge discovery in databases is the process of identifying patterns in data that are valid, novel, useful and, ultimately, understandable.
    Keywords: data mining, knowledge discovery, data warehouse, data mining tools, data mining applications

    A customizable multi-agent system for distributed data mining

    We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation was carried out on a parallel frequent subgraph mining algorithm, which showed good scalability.

    Research on the Application of Data Mining Technology in Software Engineering

    With the development of computer science and software engineering, software systems are becoming larger and more complex in scale and function. How to effectively manage and utilize data during development, testing, and maintenance, improve software quality, reduce development costs, and increase productivity has become an important research topic in the field of software engineering. As an effective data analysis method, data mining technology has been widely used in the field of software engineering. Data mining technology can help software engineers mine useful information and knowledge from data, improve the quality and performance of software systems, reduce development costs, and accelerate the software development process. This article introduces the research status and development trend of applying data mining technology in software engineering. Firstly, it introduces the application scenarios and objectives of data mining in the field of software engineering, including defect prediction, demand analysis, and software quality evaluation. It discusses the research hotspots and future development trends of data mining technology in software engineering, including deep learning, interpretable data mining, and cross-domain data mining.

    From zero to hero: A process mining tutorial

    Process mining is an emerging area that synergistically combines model-based and data-oriented analysis techniques to obtain useful insights into how business processes are executed within an organization. This tutorial provides an introduction to the key analysis techniques in process mining that allow decision makers to discover process models from data, compare expected and actual behaviors, and enrich models with key information about the actual process executions. In addition, the tutorial will present concrete tools and provide practical skills for applying process mining in a variety of application domains, including software development.

    Integrating E-Commerce and Data Mining: Architecture and Challenges

    We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience at Blue Martini Software, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200