    Identifying Cloned Navigational Patterns in Web Applications

    Web Applications are subject to continuous and rapid evolution. Programmers often duplicate Web pages indiscriminately, without following systematic development and maintenance methods. This practice creates code clones that make Web Applications hard to maintain and reuse. We present an approach to identifying duplicated functionality in Web Applications through cloned navigational pattern analysis. Cloned patterns can be generalized in a reengineering process, thus simplifying the structure and future maintenance of the Web Application. The proposed method first identifies pairs of cloned pages by analyzing their similarity in structure, content, and scripting code. Two pages are considered clones if their similarity exceeds a given threshold. Cloned pages are then grouped into clusters, and the links connecting the pages of two clusters are grouped as well. An interconnection metric defined on the links between two clusters expresses the effort required to reengineer them and is used to select the patterns of interest. To further reduce comprehension effort, we filter out links and nodes of the clustered navigational schema that do not contribute to the identification of cloned navigational patterns. A tool supporting the proposed approach has been developed and validated in a case study.
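    The threshold-based pairing and clustering steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `similarity` function is a toy Jaccard stand-in for the combined structure/content/script measure, and all names and the feature-set page representation are assumptions.

    ```python
    from itertools import combinations

    def similarity(page_a, page_b):
        """Toy stand-in for the paper's combined similarity measure:
        Jaccard overlap of two pages' feature sets."""
        if not page_a and not page_b:
            return 1.0
        return len(page_a & page_b) / len(page_a | page_b)

    def clone_clusters(pages, threshold=0.8):
        """Pages whose pairwise similarity exceeds the threshold are
        clone pairs; union-find merges transitively similar pages
        into clusters."""
        parent = {name: name for name in pages}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for a, b in combinations(pages, 2):
            if similarity(pages[a], pages[b]) >= threshold:
                parent[find(a)] = find(b)

        clusters = {}
        for name in pages:
            clusters.setdefault(find(name), set()).add(name)
        return list(clusters.values())
    ```

    With pages given as a dict mapping page names to feature sets, `clone_clusters` returns one set of page names per cluster; the interconnection metric of the paper would then be computed over the links between these clusters.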

    An Investigation of Clustering Algorithms in the Identification of Similar Web Pages

    In this paper we investigate the effect of using clustering algorithms in the reverse engineering field to identify pages that are similar at either the structural or the content level. To this end, we have used two instances of a general process that differ only in the measure used to compare web pages: at the structural level pages are compared using the Levenshtein edit distance, while at the content level they are compared using Latent Semantic Indexing. The static pages of two web applications and one static web site were used to compare the results achieved by the considered clustering algorithms at both the structural and the content level. On these applications we generally achieved comparable results. However, the investigation also suggested some heuristics to quickly identify the best partition of web pages into clusters, among the possible partitions, at both the structural and the content level.
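    The structural comparison can be sketched with a standard Levenshtein edit distance over a page's tag sequence, normalized to a similarity score. This is a generic illustration under the assumption that pages are represented as sequences of tag names; the paper's exact encoding may differ.

    ```python
    def levenshtein(a, b):
        """Classic dynamic-programming edit distance between two sequences,
        using a rolling row to keep memory at O(len(b))."""
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, 1):
            curr = [i]
            for j, y in enumerate(b, 1):
                curr.append(min(prev[j] + 1,              # deletion
                                curr[j - 1] + 1,          # insertion
                                prev[j - 1] + (x != y)))  # substitution
            prev = curr
        return prev[-1]

    def structural_similarity(tags_a, tags_b):
        """Normalize the edit distance by the longer sequence to obtain
        a similarity in [0, 1]."""
        longest = max(len(tags_a), len(tags_b)) or 1
        return 1.0 - levenshtein(tags_a, tags_b) / longest
    ```

    For the content-level measure, Latent Semantic Indexing would instead compare pages in a reduced term-document space; that step is omitted here.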

    Feature Detection in Ajax-enabled Web Applications

    In this paper we propose a method for reverse engineering the features of Ajax-enabled web applications. The method first collects instances of the DOM trees underlying the application's web pages, using a state-of-the-art crawling framework. It then clusters these instances into groups corresponding to distinct features of the application. The contribution of this paper lies in the novel DOM-tree similarity metric used in the clustering step, which distinguishes between simple and composite structural changes. We have evaluated our method on three real web applications. In all three cases, the proposed distance metric leads to a number of clusters that is closer to the actual number of features and classifies web page instances into these feature-specific clusters more accurately than other traditional distance metrics. We therefore conclude that it is a reliable distance metric for reverse engineering the features of Ajax-enabled web applications.
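    The idea of weighting composite structural changes more heavily than simple ones can be illustrated with a hypothetical distance over DOM tag paths. This sketch is not the paper's metric: the `(tag, children)` tree encoding, the path-based comparison, and the `composite_weight` parameter are all assumptions made for illustration.

    ```python
    from collections import Counter

    def tag_paths(tree):
        """Flatten a nested (tag, children) tuple tree into a multiset
        of root-to-node tag paths."""
        paths = []

        def walk(node, prefix):
            tag, children = node
            path = prefix + (tag,)
            paths.append(path)
            for child in children:
                walk(child, path)

        walk(tree, ())
        return Counter(paths)

    def dom_distance(tree_a, tree_b, composite_weight=2.0):
        """Hypothetical two-level distance: a differing path whose parent
        path also differs is treated as part of a composite structural
        change and weighted more heavily than an isolated (simple) one."""
        paths_a, paths_b = tag_paths(tree_a), tag_paths(tree_b)
        diff = (paths_a - paths_b) + (paths_b - paths_a)
        total = 0.0
        for path, count in diff.items():
            parent_differs = len(path) > 1 and path[:-1] in diff
            total += count * (composite_weight if parent_differs else 1.0)
        return total
    ```

    Swapping one leaf (`div` for `span`) is a simple change and costs 2.0 under this scheme, while removing a `div` together with its nested `p` is partly composite and costs 3.0; a clustering algorithm fed such a distance would keep instances of the same feature closer together despite small DOM perturbations.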