    NCeSS Project : Data mining for social scientists

    We discuss the work being undertaken on the NCeSS data mining project, a one-year project at the University of Manchester which began at the start of 2007, to develop data mining tools of value to the social science community. Our primary goal is to produce a suite of data mining codes, supported by a web interface, that allows social scientists to mine their datasets in a straightforward way and hence gain new insights into their data. To fully define the requirements, we are examining a range of typical datasets to find out what forms they take and which applications and algorithms will be required. In this paper, we describe a number of these datasets and discuss how easily data mining techniques can be used to extract information that would be either impossible or too time-consuming to obtain by more standard methods.

    Assessing architectural evolution: A case study

    This is the post-print version of the article. Copyright © 2011 Springer. This paper proposes to use a historical perspective on generic laws, principles, and guidelines, like Lehman’s software evolution laws and Martin’s design principles, in order to achieve a multi-faceted process and structural assessment of a system’s architectural evolution. We present a simple structural model with associated historical metrics and visualizations that could form part of an architect’s dashboard. We perform such an assessment for the Eclipse SDK, as a case study of a large, complex, and long-lived system for which sustained effective architectural evolution is paramount. The twofold aim of checking generic principles on a well-known system is, on the one hand, to see whether there are lessons to be learned for best practice in architectural evolution, and, on the other hand, to gain more insight into the applicability of such principles. We find that while the Eclipse SDK does follow several of the laws and principles, there are some deviations, and we discuss areas of architectural improvement and limitations of the assessment approach.

    Spatial and multidimensional analysis of the Dutch housing market using the Kohonen Map and GIS

    In this work we analyse general, spatially identifiable housing-market data on Dutch districts (wijken) with the SOM (Kohonen Map) and a GIS. One of the authors has earlier carried out a purely visual SOM analysis of that data, in which patterns formed on a larger ‘map’ (the output matrix of the SOM) were used as a basis for classifying the Dutch housing market segments at a nationwide level. In that way the SOM was used as a method for exploratory data analysis. Now we attempt a more rigorous method of determining the segmentation using a smaller ‘map’ size, so that the SOM output can be exported directly to a GIS for further analysis. Two technical issues interest us: first, the robustness of the results: do the five basic housing market segments found in the earlier analysis prevail (we call these the urban, urban periphery, pseudo-rural, traditional, and low-income segments)? Second, which classes fit the real situation better and which worse, when the RMSE is used as a measure of goodness? We also keep an eye on policy implications and aim to compare our classifications with the ‘actual’ ones used in official discourse.

    Remodularization Analysis Using Semantic Clustering

    In this paper, we report our experience of using and adapting Semantic Clustering to evaluate software remodularizations. Semantic Clustering is an approach that relies on information retrieval and clustering techniques to extract sets of similar classes in a system, according to their vocabularies. We adapted Semantic Clustering to support remodularization analysis. We evaluate our adaptation using six real-world remodularizations of four software systems. We report that Semantic Clustering and conceptual metrics can be used to express and explain the intention of the architects when performing common modularization operators, such as module decomposition.

    Analysis of Software Binaries for Reengineering-Driven Product Line Architecture: An Industrial Case Study

    This paper describes a method for recovering software architectures from a set of similar (but unrelated) software products in binary form. One intention is to drive refactoring into software product lines and to combine architecture recovery with runtime binary analysis and existing clustering methods. Using our runtime binary analysis, we create graphs that capture the dependencies between different software parts. These are clustered into smaller component graphs that group software parts with high interaction into larger entities. The component graphs serve as a basis for further software product line work. In this paper, we concentrate on the analysis part of the method and the graph clustering. We apply the graph clustering method to a real application in the context of automation / robot configuration software tools. Comment: In Proceedings FMSPLE 2015, arXiv:1504.0301