
    An Extended Stable Marriage Problem Algorithm for Clone Detection

    Code cloning negatively affects industrial software and threatens intellectual property. This paper presents a novel approach to detecting cloned software using a bijective matching technique. The approach aims to widen the range of similarity measures and thereby improve detection precision. It does so by extending the well-known stable marriage problem (SMP) and showing how matches between code fragments of different files can be expressed. A prototype of the approach is evaluated on a suitable scenario and shows a noticeable improvement in several aspects of clone detection, such as scalability and accuracy. Comment: 20 pages, 10 figures, 6 tables
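
As a point of reference for the matching idea in this abstract, here is a minimal sketch of the classic one-to-one Gale-Shapley procedure for the stable marriage problem. It is not the paper's extended algorithm. The fragment names and preference lists are hypothetical, and the sketch assumes both sides have the same size and rank every candidate, for instance by a similarity score.

```python
def stable_matching(proposer_prefs, acceptor_prefs):
    """Classic one-to-one Gale-Shapley; returns {proposer: acceptor}."""
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next acceptor index to try
    engaged_to = {}                               # acceptor -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(a)
        if current is None:
            engaged_to[a] = p                     # acceptor was unmatched
        elif rank[a][p] < rank[a][current]:
            engaged_to[a] = p                     # acceptor prefers the newcomer
            free.append(current)
        else:
            free.append(p)                        # rejected; p tries its next choice
    return {p: a for a, p in engaged_to.items()}


# Hypothetical code fragments from two files, ranked by assumed similarity.
file_a = {"a1": ["b1", "b2"], "a2": ["b1", "b2"]}
file_b = {"b1": ["a2", "a1"], "b2": ["a1", "a2"]}
matches = stable_matching(file_a, file_b)
```

Here `a1` and `a2` both prefer `b1`, but `b1` prefers `a2`, so `a1` ends up matched with `b2`. The result is stable because no fragment pair would both rather be matched to each other.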

    Shape-based cost analysis of skeletal parallel programs

    Institute for Computing Systems Architecture
    This work presents an automatic cost-analysis system for an implicitly parallel skeletal programming language. Although deducing interesting dynamic characteristics of parallel programs (in particular, run time) is well known to be intractable in the general case, the problem can be alleviated by restricting the programs that can be expressed. By combining two research threads that take this route, the “skeletal” and “shapely” paradigms, we produce a completely automated cost analysis system that is sensitive to both computation and communication. This builds on earlier work in the area by quantifying communication as well as computation costs, with the former being derived for the Bulk Synchronous Parallel (BSP) model. We present details of our shapely skeletal language and its BSP implementation strategy, together with an account of the analysis mechanism by which program behaviour information (such as shape and cost) is statically deduced. This information can be used at compile time to optimise a BSP implementation and to analyse computation and communication costs. The analysis has been implemented in Haskell. We consider different algorithms expressed in our language for some example problems and illustrate each BSP implementation, contrasting the analysis of their efficiency by traditional, intuitive methods with that achieved by our cost calculator. The accuracy of the cost calculator's predictions is tested experimentally against the run time of real parallel programs. Previous shape-based cost analysis required all elements of a vector (our nestable bulk data structure) to have the same shape. We partially relax this strict requirement on data structure regularity by introducing new shape expressions in our analysis framework. We demonstrate that this allows us to achieve the first automated analysis of a complete derivation, the well-known maximum segment sum algorithm of Skillicorn and Cai.
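
To make the computation and communication terms concrete, the following sketch evaluates the standard BSP cost model, in which one superstep costs the maximum local work plus `h * g + l`. It is not the thesis's cost calculator. The function names and all numbers are hypothetical; the abstract describes deducing such quantities statically, whereas here they are supplied by hand.

```python
def superstep_cost(work, h, g, l):
    """Standard BSP cost of one superstep.

    work: local computation cost per processor
    h:    maximum number of words any processor sends or receives
    g:    communication cost per word (machine parameter)
    l:    barrier synchronisation cost (machine parameter)
    """
    return max(work) + h * g + l


def program_cost(supersteps, g, l):
    """Total cost of a program given as a list of (work, h) supersteps."""
    return sum(superstep_cost(work, h, g, l) for work, h in supersteps)


# Hypothetical two-superstep program on three processors.
steps = [([100, 120, 90], 10), ([50, 50, 50], 4)]
total = program_cost(steps, g=2, l=30)   # (120 + 20 + 30) + (50 + 8 + 30)
```

The first superstep is dominated by the slowest processor (120 units), which is why balanced shapes matter for the predicted cost.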

    Bidirectional data transformation by calculation

    MAPi Doctoral Programme in Computer Science
    The advent of bidirectional programming in recent years has led to the development of a vast number of approaches from various computer science disciplines. These are often based on domain-specific languages in which a program can be read both as a forward and as a backward transformation that together satisfy some desirable consistency properties. Despite the high demand for and recognized potential of intrinsically bidirectional languages, they have still not matured to the point of mainstream adoption. This dissertation considers some usually disregarded features of bidirectional transformation languages that are vital for deployment at a larger scale. The first concerns efficiency. Most of these languages provide a rich set of primitive combinators that can be composed to build more sophisticated transformations. Although convenient, such compositional languages are plagued by inefficiency, and their optimization is mandatory for serious applications. The second relates to configurability. As update translation is inherently ambiguous, users should be allowed to control the choice of a suitable strategy. The third regards genericity. Writing a bidirectional transformation typically means describing the concrete steps that convert values in a source schema to values in a target schema, which makes it impractical to express very complex transformations; practical tools should support concise and generic coding patterns. We first define a point-free language of bidirectional transformations (called lenses), characterized by a powerful set of algebraic laws. Then, we tailor it to consider additional parameters that describe updates, and use them to refine the behavior of intricate lenses between arbitrary data structures. On top of this, we propose the Multifocal framework for the evolution of XML schemas. A Multifocal program describes a generic schema-level transformation and has a value-level semantics defined using the point-free lens language. Its optimization employs the novel algebraic lens calculus.
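
The idea of a lens as a forward (`get`) and backward (`put`) transformation can be sketched in a few lines. This is only an illustration, not the dissertation's point-free language. The `Lens` class, `key_lens`, and the sample record are hypothetical, and the two laws checked (GetPut and PutGet) are the usual well-behavedness laws from the lens literature, assumed here to be the kind of consistency property the abstract refers to.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    get: Callable[[Any], Any]        # source -> view
    put: Callable[[Any, Any], Any]   # (source, updated view) -> updated source

    def compose(self, inner: "Lens") -> "Lens":
        """Focus with self first, then with inner on the resulting view."""
        return Lens(
            get=lambda s: inner.get(self.get(s)),
            put=lambda s, v: self.put(s, inner.put(self.get(s), v)),
        )


def key_lens(name):
    """Lens onto one key of a dict; put returns a copy, leaving the source intact."""
    return Lens(get=lambda s: s[name], put=lambda s, v: {**s, name: v})


def well_behaved(lens, source, view):
    get_put = lens.put(source, lens.get(source)) == source   # GetPut law
    put_get = lens.get(lens.put(source, view)) == view       # PutGet law
    return get_put and put_get


person = {"name": "Ada", "address": {"city": "Porto", "zip": "4000"}}
city = key_lens("address").compose(key_lens("city"))
moved = city.put(person, "Braga")
```

Composition is what makes such languages convenient, and it is also where the abstract locates the inefficiency: each composed `put` re-runs the outer `get`.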

    Dublin Smart City Data Integration, Analysis and Visualisation

    Data is an important resource for any organisation: it helps to understand how the organisation works in depth and to identify unseen trends within the data. When this data is processed and analysed efficiently, it helps the authorities take appropriate decisions based on the derived insights and knowledge; through these decisions, service quality can be improved and the customer experience enhanced. Data generation has grown massively over the past two decades. A significant part of this data is generated by dumb and smart sensors. If this raw data is processed efficiently, it could raise quality levels in areas such as data mining, data analytics, business intelligence and data visualisation.

    A Conceptual Model of Exploration Wayfinding: An Integrated Theoretical Framework and Computational Methodology

    This thesis is an attempt to integrate contending cognitive approaches to modeling wayfinding behavior. The primary goal is to create a plausible model for exploration tasks within indoor environments. This conceptual model can be extended for practical applications in design, planning, and the social sciences. Using empirical evidence, a cognitive schema is designed that accounts for perceptual and behavioral preferences in pedestrian navigation. With this schema as a guiding framework, network analysis and space syntax act as computational methods to simulate human exploration wayfinding in unfamiliar indoor environments. The conceptual model is then implemented in two ways. The first is by directly updating an existing agent-based modeling software package. The second is a spatial interaction model that distributes visual attraction and movement permeability across a graph representation of building floor plans.

    Stable Marriage Problem Based Adaptation for Clone Detection and Service Selection

    Current software engineering topics such as clone detection and service selection require more capable detection and selection processes. Clone detection is the process of finding duplicated code throughout a system for purposes such as removing repeated portions as part of legacy system maintenance. Service selection is the process of finding the appropriate web service that meets the consumer’s request. Both problems can be converted into a matching problem, and matching forms an essential part of software engineering activities. In this research, a well-known mathematical problem, the Stable Marriage Problem (SMP), and its variations are investigated to serve the purposes of matching processes in software engineering. We aim to provide a competitive matching algorithm that can help detect cloned software accurately and ensure high scalability, precision and recall. We also aim to apply the matching algorithm to incoming requests and service profiles, treating the web service as an intelligent independent object, so that services can accept or decline requests (equal opportunity), in contrast to the current search-based state of service selection, in which a service cannot act as an independent candidate. To meet these aims, the traditional SMP algorithm has been extended to achieve many-to-many cardinality. This adaptation is achieved by defining the selective strategy, which is the main engine of the new adaptations. Two adaptations, Dual-Proposed and Dual-Multi-Allocation, have been proposed for both the service selection and clone detection processes. The proposed SMP-based approach shows very competitive results compared to existing software clone approaches, especially in identifying type 3 clones (copies with further modifications such as updated, added and deleted statements). It performs the detection process with relatively high precision and recall compared to the CloneDR tool and shows good scalability on a medium-sized program. For service selection, the proposed approach has several advantages, such as service protection and service quality. The services gain equal opportunity against the incoming requests. Therefore, intelligent service interaction is achieved, and both the stability and the satisfaction of the candidates are ensured. This dissertation makes several contributions. First, a new extended SMP algorithm that introduces the selective strategy to accommodate many-to-many matching problems. Second, a new SMP-based clone detection approach that detects cloned software accurately and ensures high precision and recall. Third, a new SMP-based service selection approach that allows equal opportunity between services and requests, which improves service protection and service quality. Case studies are carried out as experiments with the proposed approach; they show that the new adaptations can be applied effectively to clone detection and service selection processes with several desirable features (e.g. accuracy). It can be concluded that the match-based approach is feasible and promising in the software engineering domain. Royal Embassy of Saudi Arabia

    Detection and prediction of urban archetypes at the pedestrian scale: computational toolsets, morphological metrics, and machine learning methods

    Granular, dense, and mixed-use urban morphologies are hallmarks of walkable and vibrant streets. However, urban systems are notoriously complex and planned urban development, which grapples with varied interdependent and oft conflicting criteria, may — despite best intentions — yield aberrant morphologies fundamentally at odds with the needs of pedestrians and the resiliency of neighbourhoods. This work addresses the measurement, detection, and prediction of pedestrian-friendly urban archetypes by developing techniques for high-resolution urban analytics at the pedestrian scale. A spatial-analytic computational toolset, the cityseer-api Python package, is created to assess localised centrality, land-use, and statistical metrics using contextually sensitive workflows applied directly over the street network. cityseer-api subsequently facilitates a review of mixed-use and street network centrality methods to improve their utility concerning granular urban analysis. Unsupervised machine learning methods are applied to recover ‘signatures’ — urban archetypes — using Principal Component Analysis, Variational Autoencoders, and clustering methods from a high-resolution multi-variable and multi-scalar dataset consisting of centralities, land-uses, and population densities for Greater London. Supervised deep-learning methods applied to a similar dataset developed for 931 towns and cities in Great Britain demonstrate how, with the aid of domain knowledge, machine-learning classifiers can learn to discriminate between ‘artificial’ and ‘historical’ urban archetypes. These methods use complex systems thinking as a departure point and illustrate how high-resolution spatial-analytic quantitative methods can be combined with machine learning to extrapolate benchmarks in keeping with more qualitatively framed urban morphological conceptions. 
Such tools may aid urban design professionals in better anticipating the outcomes of varied design scenarios as part of iterative and scalable workflows. These techniques may likewise provide robust and demonstrable feedback as part of planning review and approvals processes

    Automated Amortised Analysis

    Steffen Jost researched a novel static program analysis that automatically infers formally guaranteed upper bounds on the use of compositional quantitative resources. The technique is based on manual amortised complexity analysis, whose non-trivial automation is achieved through a type system annotated with linear constraints. Any solution to the collected constraints yields the coefficients of a formula that expresses an upper bound on the resource consumption of a program in terms of the sizes of its various inputs. The bound thereby quantifies how much each input size contributes to resource consumption. The main result is the formal soundness proof of the proposed analysis for a functional language. The strictly evaluated language features higher-order types, full mutual recursion, nested data types and suspension of evaluation, and can deal with aliased data. The presentation focuses on heap space bounds. Extensions allowing the inference of bounds on stack space usage and worst-case execution time are demonstrated for several realistic program examples. These bounds were inferred by the generic implementation of the technique created for this work. The implementation is highly efficient and solves even large examples within seconds.
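
The manual amortised analysis that this work automates can be illustrated with the textbook potential method on a two-list queue. This is a hand-worked sketch, not the type-based inference described in the abstract. The `CountingQueue` class, its step counter, and the choice of potential (the length of the back list) are assumptions made for the example. The resulting bound, `2 * enqueues + dequeues`, is a linear formula in the number of operations, and its coefficients 2 and 1 are chosen by hand here rather than obtained by solving constraints.

```python
class CountingQueue:
    """FIFO queue built from two lists; `steps` counts elementary list operations."""

    def __init__(self):
        self.front, self.back, self.steps = [], [], 0

    def potential(self):
        # Each element in `back` has prepaid one step for its later move to `front`.
        return len(self.back)

    def enqueue(self, x):
        self.back.append(x)
        self.steps += 1                  # actual cost 1, potential +1: amortised 2

    def dequeue(self):
        if not self.front:
            while self.back:             # move k elements: actual cost k, potential -k
                self.front.append(self.back.pop())
                self.steps += 1
        self.steps += 1                  # the pop itself: amortised 1 overall
        return self.front.pop()


q = CountingQueue()
out = []
for x in range(1, 6):
    q.enqueue(x)
out += [q.dequeue(), q.dequeue()]
q.enqueue(6)
out += [q.dequeue() for _ in range(4)]

enqueues, dequeues = 6, 6
bound = 2 * enqueues + dequeues          # linear upper bound on total steps
```

A single `dequeue` can be expensive, yet the total never exceeds the bound because the expensive moves were paid for by earlier cheap enqueues.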

    Laying Tiles Ornamentally: An approach to structuring container traversals

    Having hardware more capable of parallel execution means that more program scheduling decisions have to be taken to utilize that hardware efficiently. To this end, compilers implement coarse-grained loop transformations in addition to the traditionally used fine-grained instruction reordering. Implementors of embedded domain-specific languages (EDSLs) face a difficult choice: either translate operations on collections to a low-level language naively, hoping that its optimizer will do the job, or implement their own optimizer as part of the EDSL. We turn to the concept of loop tiling from the imperative world and find its equivalent for recursive functions. We show the construction of a tiled functorial map over containers that can be naively translated to a corresponding nested loop. We illustrate the connection between untiled and tiled functorial maps by means of the type-theoretic notion of an algebraic ornament. This approach produces a family of container traversals indexed by tile sizes and serves as the basis of a proof that untiled and tiled functorial maps have the same semantics. We evaluate our approach by designing a language of tree traversals as a DSL embedded into Haskell that compiles into C code. We use this language to implement tiled and untiled tree traversals, which we benchmark under varying choices of tile sizes and shapes of input trees. For some tree shapes, we show that a tiled tree traversal can be up to 50% faster than an untiled one under a good choice of tile size.
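
The claim that tiled and untiled maps share the same semantics can be sketched on a flat list. This is only an illustration of loop tiling as a nested loop, not the paper's ornament-based construction over containers or its Haskell-to-C DSL. The function names are hypothetical, and the sketch says nothing about performance.

```python
def map_untiled(f, xs):
    """Plain map: one loop over all elements."""
    return [f(x) for x in xs]


def map_tiled(f, xs, tile):
    """Same map as a nested loop: outer loop over tiles, inner loop within a tile."""
    if tile < 1:
        raise ValueError("tile size must be positive")
    out = []
    for start in range(0, len(xs), tile):                    # outer: one tile per step
        for i in range(start, min(start + tile, len(xs))):   # inner: elements of the tile
            out.append(f(xs[i]))
    return out


def square(x):
    return x * x


data = list(range(10))
# The family of traversals indexed by tile size all agree with the untiled map.
agree = all(map_tiled(square, data, t) == map_untiled(square, data)
            for t in range(1, 13))
```

The last tile may be shorter than `tile`, which the `min` in the inner loop handles, so the result is independent of whether the tile size divides the length.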