28 research outputs found

    An Approach and an Eclipse Based Environment for Enhancing the Navigation Structure of Web Sites

    Get PDF
    This paper presents an approach based on information retrieval and clustering techniques for automatically enhancing the navigation structure of a Web site for improving navigability. The approach increments the set of navigation links provided in each page of the site with a semantic navigation map, i.e., a set of links enabling navigating from a given page to other pages of the site showing similar or related content. The approach uses Latent Semantic Indexing to compute a dissimilarity measure between the pages of the site and a graph-theoretic clustering algorithm to group pages showing similar or related content according to the calculated dissimilarity measure. AJAX code is finally used to extend each Web page with an associated semantic navigation map. The paper also presents a prototype of a tool developed to support the approach and the results from a case study conducted to assess the validity and feasibility of the proposal

    Identifying Cloned Navigational Patterns in Web Applications

    Get PDF
    Web Applications are subject to continuous and rapid evolution. Often programmers indiscriminately duplicate Web pages without considering systematic development and maintenance methods. This practice creates code clones that make Web Applications hard to maintain and reuse. We present an approach to identify duplicated functionalities in Web Applications through cloned navigational pattern analysis. Cloned patterns can be generalized in a reengineering process, thus to simplify the structure and future maintenance of the Web Applications. The proposed method first identifies pairs of cloned pages by analyzing similarity at structure, content, and scripting code. Two pages are considered clones if their similarity is greater than a given threshold. Cloned pages are then grouped into clusters and the links connecting pages of two clusters are grouped too. An interconnection metric has been defined on the links between two clusters to express the effort required to reengineer them as well as to select the patterns of interest. To further reduce the comprehension effort, we filter out links and nodes of the clustered navigational schema that do not contribute to the identification of cloned navigational patterns. A tool supporting the proposed approach has been developed and validated in a case study

    An Investigation of Clustering Algorithms in the Identification of Similar Web Pages

    Get PDF
    In this paper we investigate the effect of using clustering algorithms in the reverse engineering field to identify pages that are similar either at the structural level or at the content level. To this end, we have used two instances of a general process that only differ for the measure used to compare web pages. In particular, two web pages at the structural level and at the content level are compared by using the Levenshtein edit distances and Latent Semantic Indexing, respectively. The static pages of two web applications and one static web site have been used to compare the results achieved by using the considered clustering algorithms both at the structural and content level. On these applications we generally achieved comparable results. However, the investigation has also suggested some heuristics to quickly identify the best partition of web pages into clusters among the possible partitions both at the structural and at the content level

    Identifying Similar Pages in Web Applications using a Competitive Clustering Algorithm

    Get PDF
    We present an approach based on Winner Takes All (WTA), a competitive clustering algorithm, to support the comprehension of static and dynamic Web applications during Web application reengineering. This approach adopts a process that first computes the distance between Web pages and then identifies and groups similar pages using the considered clustering algorithm. We present an instance of application of the clustering process to identify similar pages at the structural level. The page structure is encoded into a string of HTML tags and then the distance between Web pages at the structural level is computed using the Levenshtein string edit distance algorithm. A prototype to automate the clustering process has been implemented that can be extended to other instances of the process, such as the identification of groups of similar pages at content level. The approach and the tool have been evaluated in two case studies. The results have shown that the WTA clustering algorithm suggests heuristics to easily identify the best partition of Web pages into clusters among the possible partitions

    Rétro ingénierie d'applications web javascript pour aider à la compréhension et à la documentation

    Get PDF
    Ce mémoire s’intéresse à la rétro-ingénierie comme solution pour aider les développeurs à comprendre, modifier et documenter la structure de leurs applications web. Pour retrouver la structure d’une application, il faut souvent recourir à de l’analyse statique du code source pour retrouver les différents éléments et les différentes relations qui composent l’application. Le développement web présente ici des défis particuliers puisqu’il fait intervenir plusieurs langages. Certains de ces langages, comme HTML et CSS sont relativement simples; d’autres le sont moins. En particulier, JavaScript, un langage clé de la technologie Web, présente des aspects dynamiques importants (p. ex.: typage dynamique, évaluation dynamique de chaines de caractères) qui pourraient rendre très inefficace une analyse statique du code source. En effet, la récupération des éléments constituant l’application et de leurs liens pourrait devoir nécessiter une analyse dite dynamique qui se ferait sur des scénarios d’exécution. Cependant, de telles analyses dynamiques ne garantissent pas une couverture complète de l’application et ne peuvent se faire que si le code est exécutable. Nous avons donc conduit une étude empirique sur la viabilité de l’analyse statique pour la rétro-ingénierie de JavaScript. Forts de ces résultats ainsi que des constats sur les techniques et outils existants, nous proposons nos propres pistes de solutions sous forme d’une nouvelle approche de rétro-ingénierie (Web application Viewer). Cet outil est subséquemment utilisé pour performer des expérimentations de visualisation de structure à l’aide de diagrammes de force dirigée et diagrammes de classes. L’outil de rétro-ingénierie créé permet d’extraire les principaux éléments de la structure d’une application web pour les langages JavaScript, Node.js, HTML et CSS. Les résultats sont satisfaisants et permettent au développeur de documenter leurs applications rapidement à l’aide de diagrammes

    Are PHP applications ready for Hack?

    Full text link

    Program Transformations for Information Personalization

    Get PDF
    Personalization constitutes the mechanisms necessary to automatically customize information content, structure, and presentation to the end user to reduce information overload. Unlike traditional approaches to personalization, the central theme of our approach is to model a website as a program and conduct website transformation for personalization by program transformation (e.g., partial evaluation, program slicing). The goal of this paper is study personalization through a program transformation lens and develop a formal model, based on program transformations, for personalized interaction with hierarchical hypermedia. The specific research issues addressed involve identifying and developing program representations and transformations suitable for classes of hierarchical hypermedia and providing supplemental interactions for improving the personalized experience. The primary form of personalization discussed is out-of-turn interaction—a technique that empowers a user navigating a hierarchical website to postpone clicking on any of the hyperlinks presented on the current page and, instead, communicate the label of a hyperlink nested deeper in the hierarchy. When the user supplies out-of-turn input, we personalize the hierarchy to reflect the user\u27s informational need. While viewing a website as a program and site transformation as program transformation is non-traditional, it offers a new way of thinking about personalized interaction, especially with hierarchical hypermedia. Our use of program transformations casts personalization in a formal setting and provides a systematic and implementation-neutral approach to designing systems. Moreover, this approach helped connect our work to human-computer dialog management and, in particular, mixed-initiative interaction. Putting personalized web interaction on a fundamentally different landscape gave birth to this new line of research. Relating concepts in the web domain (e.g., sites, interactions) to notions in the program-theoretic domain (e.g., programs, transformations) constitutes the creativity in this work

    Managing Schema Change in an Heterogeneous Environment

    Get PDF
    Change is inevitable even for persistent information. Effectively managing change of persistent information, which includes the specification, execution and the maintenance of any derived information, is critical and must be addressed by all database systems. Today, for every data model there exists a well-defined set of change primitives that can alter both the structure (the schema) and the data. Several proposals also exist for incrementally propagating a primitive change to any derived information (or view). However, existing support is lacking in two ways. First, change primitives as presented in literature are very limiting in terms of their capabilities allowing users to simply add or remove schema elements. More complex types of changes such the merging or splitting of schema elements are not supported in a principled manner. Second, algorithms for maintaining derived information often do not account for the potential heterogeneity between the source and the target. The goal of this dissertation is to provide solutions that address these two key issues. The first part of this dissertation addresses the challenge of expressing a rich complex set of changes. We propose the SERF (Schema Evolution through an Extensible, Re-usable and Flexible) framework that allows users to perform a wide range of complex user-defined schema transformations. Our approach combines existing schema evolution primitives using OQL (object query language) as the glue logic. Within the context of this work, we look at the different domains in which SERF can be applied, including web site management. To further enrich our framework, we also investigate the optimization and verification of SERF transformations. The second part of this dissertation addresses the problem of maintaining views in the face of source changes when the source and the view are not in the same data model. With today\u27s increasing heterogeneity in information structure, it is critical that maintenance of views addresses the data model boundaries. However, view definitions that go across data models are limited to hard-coded algorithms, thereby making it difficult to develop general maintenance algorithms. We provide a two-step solution for this problem. We have developed a cross algebra, that defines views such that there is no restriction that forces the view and the source data models to be the same. We then define update propagation algorithms that can propagate changes from source to target irrespective of the exact translation and the data models. We validate our ideas by applying them to translation and change propagation between the XML and relational data models

    Towards privacy-aware identity management

    Get PDF
    The overall goal of the PRIME project (Privacy and Identity Management for Europe) is the development of a privacy-enhanced identity management system that allows users to control the release of their personal information. The PRIME architecture includes an Access Control component allowing the enforcement of protection requirements on personal identifiable information (PII). The overall goal of the PRIME project (Privacy and Identity Management for Europe) is the development of a privacy-enhanced identity management system that allows users to control the release of their personal information. The PRIME architecture includes an Access Control component allowing the enforcement of protection requirements on personal identifiable information (PII)
    corecore