NCeSS Project : Data mining for social scientists
We will discuss the work being undertaken on the NCeSS data mining project, a one-year project at the University of Manchester, which began at the start of 2007, to develop data mining tools of value to the social science community. Our primary goal is to produce a suite of data mining codes, supported by a web interface, that allows social scientists to mine their datasets in a straightforward way and hence gain new insights into their data. To define the requirements fully, we are looking at a range of typical datasets to find out what forms they take and which applications and algorithms will be required. In this paper, we will describe a number of these datasets and discuss how easily data mining techniques can be used to extract information that would be either impossible or too time-consuming to obtain by more standard methods.
Assessing architectural evolution: A case study
This is the post-print version of the article. Copyright © 2011 Springer. This paper proposes to use a historical perspective on generic laws, principles, and guidelines, like Lehman's software evolution laws and Martin's design principles, in order to achieve a multi-faceted process and structural assessment of a system's architectural evolution. We present a simple structural model with associated historical metrics and visualizations that could form part of an architect's dashboard. We perform such an assessment for the Eclipse SDK, as a case study of a large, complex, and long-lived system for which sustained effective architectural evolution is paramount. The twofold aim of checking generic principles on a well-known system is, on the one hand, to see whether there are lessons to be learned for best practice in architectural evolution and, on the other hand, to gain more insight into the applicability of such principles. We find that while the Eclipse SDK does follow several of the laws and principles, there are some deviations, and we discuss areas of architectural improvement and limitations of the assessment approach.
Spatial and multidimensional analysis of the Dutch housing market using the Kohonen Map and GIS
In this work the idea is to analyse general spatially identifiable housing market related data on Dutch districts (wijken) with the SOM (Kohonen Map) and a GIS. One of the authors has earlier carried out purely visual SOM analysis of that data, where patterns formed on a larger "map" (the output matrix of the SOM) were used as a basis for classification of the Dutch housing market segments on a nationwide level. This way the SOM was used as a method for exploratory data analysis. Now we attempt a more rigorous method of determining the segmentation using a smaller "map" size, in order to be able to export the SOM output directly to a GIS to analyse it further. Two technical issues interest us: one, the robustness of the results: do the five basic housing market segments found in the earlier analysis prevail (we call these urban, urban periphery, pseudo-rural, traditional, and low-income segments); and two, which classes fit the real situation better and which worse, when using the RMSE as a measure of goodness? We also keep an eye on policy implications and aim at comparing our classifications with the "actual" ones used in official discourse.
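As a rough illustration of the RMSE question raised in this abstract, the sketch below computes one RMSE value per segment by comparing each district's feature vector with a prototype vector for its segment. The abstract does not say exactly how the authors compute RMSE, so the function name `rmse_per_segment`, the data layout, and the sample values are all hypothetical.

```python
import math


def rmse_per_segment(districts, labels, prototypes):
    """Return {segment: RMSE between member districts and the segment prototype}.

    Assumption: each district is a list of numeric features and each segment
    has one prototype vector of the same length (e.g. a SOM codebook vector).
    """
    squared = {}  # segment -> sum of squared errors
    counts = {}   # segment -> number of feature values compared
    for vector, segment in zip(districts, labels):
        proto = prototypes[segment]
        error = sum((v - p) ** 2 for v, p in zip(vector, proto))
        squared[segment] = squared.get(segment, 0.0) + error
        counts[segment] = counts.get(segment, 0) + len(vector)
    return {s: math.sqrt(squared[s] / counts[s]) for s in squared}


# Hypothetical toy data: three districts, two features each.
districts = [[1.0, 2.0], [3.0, 2.0], [10.0, 10.0]]
labels = ["urban", "urban", "traditional"]
prototypes = {"urban": [2.0, 2.0], "traditional": [10.0, 10.0]}

scores = rmse_per_segment(districts, labels, prototypes)
```

Under this reading, a lower value means the segment's prototype describes its member districts more closely, which is one way to rank which classes fit better and which worse.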
Remodularization Analysis Using Semantic Clustering
In this paper, we report our experience in using and adapting Semantic Clustering to evaluate software remodularizations. Semantic Clustering is an approach that relies on information retrieval and clustering techniques to extract sets of similar classes in a system, according to their vocabularies. We adapted Semantic Clustering to support remodularization analysis. We evaluate our adaptation using six real-world remodularizations of four software systems. We report that Semantic Clustering and conceptual metrics can be used to express and explain the intention of the architects when performing common modularization operators, such as module decomposition.
Analysis of Software Binaries for Reengineering-Driven Product Line Architecture: An Industrial Case Study
This paper describes a method for the recovery of software architectures from a set of similar (but unrelated) software products in binary form. One intention is to drive refactoring into software product lines and combine architecture recovery with runtime binary analysis and existing clustering methods. Using our runtime binary analysis, we create graphs that capture the dependencies between different software parts. These are clustered into smaller component graphs that group software parts with high interactions into larger entities. The component graphs serve as a basis for further software product line work. In this paper, we concentrate on the analysis part of the method and the graph clustering. We apply the graph clustering method to a real application in the context of automation / robot configuration software tools.
Comment: In Proceedings FMSPLE 2015, arXiv:1504.0301
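To make the idea of grouping "software parts with high interactions into larger entities" concrete, here is a minimal sketch. It is not the clustering method used in the paper, which the abstract does not specify. It substitutes a deliberately simple technique: keep only dependency edges whose interaction count reaches a threshold, then take connected components with union-find. The function name, the edge format, and the sample part names are hypothetical.

```python
def cluster_by_interaction(edges, threshold):
    """Group nodes linked by edges whose weight is >= threshold.

    edges: iterable of (part_a, part_b, interaction_count) tuples.
    Returns a sorted list of sorted node lists, one per component.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b, weight in edges:
        root_a, root_b = find(a), find(b)  # also registers both nodes
        if weight >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical dependency graph: (caller, callee, observed interactions).
edges = [
    ("ui", "config", 12),
    ("config", "io", 9),
    ("io", "log", 1),
    ("log", "net", 7),
]
components = cluster_by_interaction(edges, threshold=5)
# components == [["config", "io", "ui"], ["log", "net"]]
```

The weak `io`/`log` edge falls below the threshold, so the graph splits into two component groups. A real analysis would likely use a more robust graph clustering algorithm than a fixed cut-off.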