133,747 research outputs found
An Algorithm for the Generation of Segmented Parametric Software Estimation Models and Its Empirical Evaluation
Parametric software effort estimation techniques use mathematical cost-estimation relationships derived from historical project databases, usually obtained through standard curve regression techniques. Nonetheless, project databases -- especially in the case of consortium-created compilations like the ISBSG --, collect highly heterogeneous data, coming from projects that diverge in size, process and personnel skills, among other factors. This results in that a single parametric model is seldom able to capture the diversity of the sources, in turn resulting in poor overall quality. Segmented parametric estimation models use local regression to derive one model per each segment of data with similar characteristics, improving the overall predictive quality of parametrics. Further, the process of obtaining segmented models can be expressed in the form of a generic algorithm that can be used to produce candidate models in an automated process of calibration from the project database at hand. This paper describes the rationale for such algorithmic scheme along with the empirical evaluation of a concrete version that uses the EM clustering algorithm combined with the common parametric exponential model of size-effort, and standard quality-of-adjustment criteria. Results point out to the adequacy of the technique as an extension of existing single-relation models
Planning as Optimization: Dynamically Discovering Optimal Configurations for Runtime Situations
The large number of possible configurations of modern software-based systems,
combined with the large number of possible environmental situations of such
systems, prohibits enumerating all adaptation options at design time and
necessitates planning at run time to dynamically identify an appropriate
configuration for a situation. While numerous planning techniques exist, they
typically assume a detailed state-based model of the system and that the
situations that warrant adaptations are known. Both of these assumptions can be
violated in complex, real-world systems. As a result, adaptation planning must
rely on simple models that capture what can be changed (input parameters) and
observed in the system and environment (output and context parameters). We
therefore propose planning as optimization: the use of optimization strategies
to discover optimal system configurations at runtime for each distinct
situation that is also dynamically identified at runtime. We apply our approach
to CrowdNav, an open-source traffic routing system with the characteristics of
a real-world system. We identify situations via clustering and conduct an
empirical study that compares Bayesian optimization and two types of
evolutionary optimization (NSGA-II and novelty search) in CrowdNav
A unified approach to mapping and clustering of bibliometric networks
In the analysis of bibliometric networks, researchers often use mapping and
clustering techniques in a combined fashion. Typically, however, mapping and
clustering techniques that are used together rely on very different ideas and
assumptions. We propose a unified approach to mapping and clustering of
bibliometric networks. We show that the VOS mapping technique and a weighted
and parameterized variant of modularity-based clustering can both be derived
from the same underlying principle. We illustrate our proposed approach by
producing a combined mapping and clustering of the most frequently cited
publications that appeared in the field of information science in the period
1999-2008
A Comparison of Clustering Techniques for Malware Analysis
In this research, we apply clustering techniques to the malware detection problem. Our goal is to classify malware as part of a fully automated detection strategy. We compute clusters using the well-known �-means and EM clustering algorithms, with scores obtained from Hidden Markov Models (HMM). The previous work in this area consists of using HMM and �-means clustering technique to achieve the same. The current effort aims to extend it to use EM clustering technique for detection and also compare this technique with the �-means clustering
- …