Mining unstructured software data

Author: Bacchelli Alberto
Lanza Michele
Publication date: 14/10/2013

Our thesis is that the analysis of unstructured data supports software understanding and evolution analysis, and complements the data mined from structured sources. To this aim, we implemented the necessary toolset and investigated methods for exploring, exposing, and exploiting unstructured data. To validate our thesis, we focused on development email data. We found two main challenges in using it to support program comprehension and software development: the disconnection between emails and code artifacts, and the noisy, mixed-language nature of email content. We tackle these challenges by proposing novel approaches. First, we devise lightweight techniques for linking email data to code artifacts. We use these techniques to create a tool that supports program comprehension with email data, and to create a new set of email-based metrics that improve existing defect prediction approaches. Subsequently, we devise techniques for giving structure to the content of emails, and we use this structure to conduct novel software analyses that support program comprehension. In this dissertation we show that unstructured data, in the form of development emails, is a valuable addition to structured data and, if correctly mined, can be used successfully to support software engineering activities.
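To make the idea of lightweight email-to-code linking concrete, here is a minimal sketch. It assumes that a link exists whenever a known class name appears as a whole word in the email body; the abstract does not spell out the authors' actual techniques, so the function name `link_email_to_classes` and the matching rule are illustrative assumptions only.

```python
import re


def link_email_to_classes(email_text, class_names):
    """Return the known class names mentioned as whole words in the email.

    Hypothetical helper: whole-word, case-sensitive matching is an
    assumption made for this sketch, not the method from the thesis.
    """
    mentioned = []
    for name in class_names:
        # \b prevents "Parser" from matching inside "XmlParser".
        if re.search(r"\b" + re.escape(name) + r"\b", email_text):
            mentioned.append(name)
    return mentioned


email = "The NPE is thrown in XmlParser when the buffer is empty."
print(link_email_to_classes(email, ["XmlParser", "Parser", "Buffer"]))
```

Case-sensitive matching keeps the common word "buffer" from being linked to a class named `Buffer`, which hints at why noisy natural-language content makes this kind of linking harder than it first looks.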

RERO DOC Digital Library

RESEARCH ISSUES CONCERNING ALGORITHMS USED FOR OPTIMIZING THE DATA MINING PROCESS

Author: Alexandru Pîrjan
Ion Lungu

In this paper, we describe some of the most widely used data mining algorithms, which have had considerable utility and influence in the research community. A data mining algorithm can be regarded as a tool that creates a data mining model. After analyzing a set of data, an algorithm searches for specific trends and patterns, then defines the parameters of the mining model based on the results of this analysis. These parameters play a significant role in identifying and extracting actionable patterns and detailed statistics. The most important algorithms in this research cover topics such as clustering, classification, association analysis, statistical learning, and link mining. After a brief description of each algorithm, we analyze its application potential and the research issues concerning the optimization of the data mining process. We then present the most important data mining algorithms included in Microsoft and Oracle software products, suggestions and criteria for choosing the most suitable algorithm for a given task, and the advantages offered by these software products.

Keywords: data mining optimization, data mining algorithms, software solutions

Research Papers in Economics

Maleku: an evolutionary visual software analytics tool for providing insights into software evolution

Author: García-Peñalvo Francisco
González Antonio
Therón Roberto
Wermelinger Michel
Yu Yijun
Publication date: 01/01/2011

Software maintenance is a complex process that requires the understanding and comprehension of software project details. It involves understanding the evolution of the software project, hundreds of software components, and the relationships among software items in the form of inheritance, interface implementation, coupling and cohesion. Consequently, the aim of evolutionary visual software analytics is to support software project managers and developers during software maintenance. It takes into account the mining of evolutionary data, the subsequent analysis of the results produced by the mining process for producing evolution facts, the use of visualizations supported by interaction techniques, and the active participation of users. Hence, this paper proposes an evolutionary visual software analytics tool for the exploration and comparison of project structural, interface implementation and class hierarchy data, and the correlation of structural data with metrics, as well as socio-technical relationships. Its main contribution is a tool that automatically retrieves evolutionary software facts and represents them using a scalable visualization design.

Open Research Online (The Open University)

DATA MINING AND THE PROCESS OF TAKING DECISIONS IN EBUSINESS

Author: Ana Maria Mihaela Tudorache

Data mining software allows users to analyze large databases to solve business decision problems. Data mining is, in some ways, an extension of statistics, with a few artificial intelligence and machine learning twists thrown in. Like statistics, data mining is not a business solution; it is just a technology. For example, consider a catalog retailer who needs to decide who should receive information about a new product. The information operated on by the data mining process is contained in a historical database of previous interactions with customers and the features associated with those customers, such as age, zip code, and their responses. The data mining software would use this historical information to build a model of customer behavior that could be used to predict which customers would be likely to respond to the new product. By using this information, a marketing manager can select only the customers who are most likely to respond. The operational business software can then feed the results of the decision to the appropriate touch point systems (call centers, direct mail, web servers, email systems, etc.) so that the right customers receive the right offers.

Keywords: data mining, business decisions, data analysis, cluster analysis, decision strategy
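The catalog retailer example can be sketched in a few lines. The sketch below assumes a deliberately simple model: the historical response rate per (age band, zip code) segment. Real data mining products use richer models; the function names, the segment definition, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def fit_response_rates(history):
    """Estimate a response rate per segment.

    history: iterable of (age_band, zip_code, responded) tuples.
    Returns a dict mapping (age_band, zip_code) to a rate in [0, 1].
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, total]
    for age_band, zip_code, responded in history:
        segment = (age_band, zip_code)
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: hits / total for seg, (hits, total) in counts.items()}


def select_customers(customers, rates, threshold=0.5):
    """Keep customer ids whose segment's historical rate meets the threshold.

    Segments never seen in the history are scored 0.0 (an assumption).
    """
    return [
        cid
        for cid, age_band, zip_code in customers
        if rates.get((age_band, zip_code), 0.0) >= threshold
    ]


history = [
    ("30-39", "10001", True),
    ("30-39", "10001", True),
    ("30-39", "10001", False),
    ("60-69", "94105", False),
    ("60-69", "94105", False),
]
rates = fit_response_rates(history)
customers = [
    ("c1", "30-39", "10001"),
    ("c2", "60-69", "94105"),
    ("c3", "20-29", "73301"),
]
print(select_customers(customers, rates))
```

Here only `c1` is selected: its segment responded to two of three past mailings, while the other two customers fall in segments with no recorded responses or no history at all.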

Research Papers in Economics

DATA MINING TECHNOLOGIES

Author: Titrade Cristina-Maria

Knowledge Discovery and Data Mining (KDD) has emerged as a rapidly growing interdisciplinary field that merges databases, statistics, and closely related industries, driven by the desire to extract valuable information and knowledge from as large a volume of data as possible. There is a difference in the understanding of "knowledge discovery" and "data mining." Knowledge discovery in databases is the process of identifying patterns in data that are valid, novel, useful and, ultimately, understandable.

Keywords: data mining, knowledge discovery, data warehouse, data mining tools, data mining applications

Research Papers in Economics

A customizable multi-agent system for distributed data mining

Author: Di Fatta Giuseppe
Fortino Giancarlo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2007

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation was carried out on a parallel frequent subgraph mining algorithm, which showed good scalability.

Central Archive at the University of Reading

CiteSeerX

Crossref

Research on the Application of Data Mining Technology in Software Engineering

Author: Luo Ruize
Publication venue: 'Universe Scientific Publishing Pte. Ltd.'
Publication date: 29/05/2023

With the development of computer science and software engineering, software systems are becoming larger and more complex in scale and function. How to effectively manage and utilize data during development, testing, and maintenance, improve software quality, reduce development costs, and increase productivity has become an important research topic in the field of software engineering. As an effective data analysis method, data mining technology has been widely used in the field of software engineering. Data mining technology can help software engineers mine useful information and knowledge from data, improve the quality and performance of software systems, reduce development costs, and accelerate the software development process. This article introduces the research status and development trend of applying data mining technology in software engineering. Firstly, it introduces the application scenarios and objectives of data mining in the field of software engineering, including defect prediction, demand analysis, and software quality evaluation. It discusses the research hotspots and future development trends of data mining technology in software engineering, including deep learning, interpretable data mining, and cross domain data mining.

Electronics Science Technology and Application (E-Journal)

From zero to hero: A process mining tutorial

Author: Janes Andrea
Maggi Fabrizio Maria
Marrella Andrea
Montali Marco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Process mining is an emerging area that synergistically combines model-based and data-oriented analysis techniques to obtain useful insights into how business processes are executed within an organization. This tutorial aims to provide an introduction to the key analysis techniques in process mining that allow decision makers to discover process models from data, compare expected and actual behaviors, and enrich models with key information about the actual process executions. In addition, the tutorial will present concrete tools and will provide practical skills for applying process mining in a variety of application domains, including software development.
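As a rough illustration of discovering process structure from data, the sketch below counts directly-follows relations in an event log, a common first step in process discovery. The abstract does not say which techniques the tutorial covers, so this choice, the function name `directly_follows`, and the software-development activity names are assumptions made for the example.

```python
from collections import Counter


def directly_follows(event_log):
    """Count how often activity a is immediately followed by b in a trace.

    event_log: list of traces, each an ordered list of activity names.
    """
    dfg = Counter()
    for trace in event_log:
        # Pair each activity with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# Hypothetical log: each trace is the lifecycle of one issue.
log = [
    ["open issue", "commit", "review", "merge"],
    ["open issue", "commit", "commit", "review", "merge"],
]
dfg = directly_follows(log)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

The resulting counts can be read as a weighted graph of observed behavior, which is the kind of artifact one would then compare against the expected process.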

Archivio della ricerca- Università di Roma La Sapienza

Integrating E-Commerce and Data Mining: Architecture and Challenges

Author: Ansari Suhail
Kohavi Ron
Mason Llew
Zheng Zijian
Publication date: 01/01/2000

We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience at Blue Martini Software, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges.

Comment: KDD workshop: WebKDD 200

arXiv.org e-Print Archive

CiteSeerX