An Extended Stable Marriage Problem Algorithm for Clone Detection
Code cloning negatively affects industrial software and threatens
intellectual property. This paper presents a novel approach to detecting cloned
software by using a bijective matching technique. The proposed approach focuses
on increasing the range of similarity measures and thus enhancing the precision
of the detection. This is achieved by extending a well-known stable-marriage
problem (SMP) and demonstrating how matches between code fragments of different
files can be expressed. A prototype of the proposed approach is provided using
a proper scenario, which shows a noticeable improvement in several features of
clone detection, such as scalability and accuracy.
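The bijective matching at the heart of this approach builds on the classic Gale-Shapley stable-marriage algorithm. The following Python sketch shows that base algorithm only, not the paper's extension; the fragment names and similarity-derived preference lists are hypothetical.

```python
# Classic Gale-Shapley stable matching; the paper extends this idea to
# match code fragments across files. Fragment names here are illustrative.
def stable_match(proposer_prefs, receiver_prefs):
    """Return a stable bijective matching {proposer: receiver}."""
    free = list(proposer_prefs)             # proposers not yet matched
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                            # receiver -> current proposer
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p                  # r accepts its first proposal
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])         # r prefers p; old partner freed
            engaged[r] = p
        else:
            free.append(p)                  # r rejects p; p proposes again
    return {p: r for r, p in engaged.items()}

# Hypothetical example: fragments of file A matched to fragments of file B.
a_prefs = {"a1": ["b1", "b2"], "a2": ["b1", "b2"]}
b_prefs = {"b1": ["a1", "a2"], "b2": ["a1", "a2"]}
print(stable_match(a_prefs, b_prefs))  # {'a1': 'b1', 'a2': 'b2'}
```

No matched pair in the result would rather swap partners, which is the stability property the detection approach exploits.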
Shape-based cost analysis of skeletal parallel programs
Institute for Computing Systems Architecture
This work presents an automatic cost-analysis system for an implicitly parallel skeletal
programming language.
Although deducing interesting dynamic characteristics of parallel programs (and in
particular, run time) is well known to be an intractable problem in the general case, it
can be alleviated by placing restrictions upon the programs which can be expressed.
By combining two research threads, the “skeletal” and “shapely” paradigms which
take this route, we produce a completely automated, computation and communication
sensitive cost analysis system. This builds on earlier work in the area by quantifying
communication as well as computation costs, with the former being derived for the
Bulk Synchronous Parallel (BSP) model.
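The standard BSP cost model referred to above prices each superstep as local computation plus communication plus synchronisation. A minimal sketch (not the thesis's analyser; the machine parameters below are made up):

```python
# Standard BSP cost model: one superstep costs w + h*g + l, where w is the
# maximum local computation on any processor, h the maximum number of words
# any processor sends or receives, g the per-word communication cost, and
# l the barrier synchronisation cost.
def bsp_cost(supersteps, g, l):
    """supersteps: list of (w, h) pairs; returns the total predicted cost."""
    return sum(w + h * g + l for w, h in supersteps)

# Hypothetical machine (g=4, l=20) running a two-superstep program.
print(bsp_cost([(100, 10), (50, 5)], g=4, l=20))  # (100+40+20) + (50+20+20) = 250
```

A static analysis in this style predicts w and h from program shape, then plugs in measured g and l for a target machine.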
We present details of our shapely skeletal language and its BSP implementation strategy
together with an account of the analysis mechanism by which program behaviour
information (such as shape and cost) is statically deduced. This information can be
used at compile-time to optimise a BSP implementation and to analyse computation
and communication costs. The analysis has been implemented in Haskell. We consider
different algorithms expressed in our language for some example problems and
illustrate each BSP implementation, contrasting the analysis of their efficiency by traditional,
intuitive methods with that achieved by our cost calculator. The accuracy of
cost predictions by our cost calculator against the run time of real parallel programs is
tested experimentally.
Previous shape-based cost analysis required all elements of a vector (our nestable bulk
data structure) to have the same shape. We partially relax this strict requirement on data
structure regularity by introducing new shape expressions in our analysis framework.
We demonstrate that this allows us to achieve the first automated analysis of a complete
derivation: the well-known maximum segment sum algorithm of Skillicorn and Cai.
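For reference, the maximum segment sum problem analysed in that derivation asks for the largest sum of any contiguous segment. The classic linear-time sequential solution (Kadane's algorithm) is shown below only to fix the specification; it is not the derived parallel BSP version.

```python
# Kadane's algorithm: maximum sum over all contiguous segments of xs.
# The empty segment (sum 0) is allowed, so the result is never negative.
def mss(xs):
    best = cur = 0
    for x in xs:
        cur = max(0, cur + x)      # best segment ending at this element
        best = max(best, cur)      # best segment seen anywhere so far
    return best

print(mss([31, -41, 59, 26, -53, 58, 97, -93, -23, 84]))  # 187
```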
Bidirectional data transformation by calculation
MAPi Doctoral Programme in Computer Science
The advent of bidirectional programming, in recent years, has led to the development of
a vast number of approaches from various computer science disciplines. These are often
based on domain-specific languages in which a program can be read both as a forward
and a backward transformation that satisfy some desirable consistency properties.
Despite the high demand and recognized potential of intrinsically bidirectional
languages, they have still not matured to the point of mainstream adoption. This
dissertation contemplates some usually disregarded features of bidirectional transformation
languages that are vital for deployment at a larger scale. The first concerns
efficiency. Most of these languages provide a rich set of primitive combinators that
can be composed to build more sophisticated transformations. Although convenient,
such compositional languages are plagued by inefficiency and their optimization is
mandatory for a serious application. The second relates to configurability. As update
translation is inherently ambiguous, users shall be allowed to control the choice of a
suitable strategy. The third regards genericity. Writing a bidirectional transformation
typically implies describing the concrete steps that convert values in a source schema to values in a target schema, making it impractical to express very complex transformations;
practical tools shall support concise and generic coding patterns.
We first define a point-free language of bidirectional transformations (called lenses),
characterized by a powerful set of algebraic laws. Then, we tailor it to consider
additional parameters that describe updates, and use them to refine the behavior of
intricate lenses between arbitrary data structures. On top, we propose the Multifocal
framework for the evolution of XML schemas. A Multifocal program describes a
generic schema-level transformation, and has a value-level semantics defined using the
point-free lens language. Its optimization employs the novel algebraic lens calculus.
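A lens, in its simplest presentation, is a get/put pair whose round-trip laws (GetPut and PutGet) guarantee the consistency properties mentioned above. The Python sketch below is only illustrative of that idea; the thesis works in a point-free combinator language, and the record field focused on here is hypothetical.

```python
# Minimal "lens" as a get/put pair. get extracts a view from a source;
# put writes an updated view back into the source. The two assertions
# below check the standard round-trip laws on one example.
class Lens:
    def __init__(self, get, put):
        self.get = get             # source -> view
        self.put = put             # (source, view) -> updated source

# Hypothetical lens focusing on the "name" field of a record.
name_lens = Lens(
    get=lambda s: s["name"],
    put=lambda s, v: {**s, "name": v},
)

src = {"name": "ada", "age": 36}
assert name_lens.put(src, name_lens.get(src)) == src        # GetPut law
assert name_lens.get(name_lens.put(src, "bob")) == "bob"    # PutGet law
print(name_lens.put(src, "bob"))  # {'name': 'bob', 'age': 36}
```

Composing such lenses is what the algebraic laws of the point-free calculus optimise.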
Dublin Smart City Data Integration, Analysis and Visualisation
Data is an important resource for any organisation: it enables an in-depth understanding of how the organisation works and reveals unseen trends within the data. When this data is efficiently processed and analysed, it helps the authorities take appropriate decisions based on the derived insights and knowledge; through these decisions the service quality can be improved and the customer experience enhanced. A massive growth in data generation has been observed over the past two decades, with a significant part of this data generated by dumb and smart sensors. If this raw data is processed in an efficient manner, it can raise quality levels in areas such as data mining, data analytics, business intelligence and data visualisation
A Conceptual Model of Exploration Wayfinding: An Integrated Theoretical Framework and Computational Methodology
This thesis is an attempt to integrate contending cognitive approaches to modeling wayfinding behavior. The primary goal is to create a plausible model for exploration tasks within indoor environments. This conceptual model can be extended for practical applications in the design, planning, and social sciences. Using empirical evidence, a cognitive schema is designed that accounts for perceptual and behavioral preferences in pedestrian navigation. Using this schema as a guiding framework, network analysis and space syntax act as computational methods to simulate human exploration wayfinding in unfamiliar indoor environments. The conceptual model is then implemented in two ways. The first updates an existing agent-based modeling software directly. The second deploys the model through a spatial interaction model that distributes visual attraction and movement permeability across a graph representation of building floor plans
Stable Marriage Problem Based Adaptation for Clone Detection and Service Selection
Current software engineering topics such as clone detection and service selection need to
improve the capability of their detection and selection processes. Clone detection is the
process of finding duplicated code throughout a system, for purposes such as removing
repeated portions during maintenance of a legacy system. Service selection is the process of
finding the appropriate web service that meets the consumer's request. Both problems can
be converted into a matching problem.
Matching forms an essential part of software engineering activities. In this
research, the well-known Stable Marriage Problem (SMP) algorithm and its
variations are investigated to fulfil the purposes of matching processes in the software
engineering area. We aim to provide a competitive matching algorithm that can help to detect cloned
software accurately and ensure high scalability, precision and recall. We also aim to apply
the matching algorithm to incoming requests and service profiles so as to treat each web service as
a clever independent object, allowing services to accept or decline requests
(equal opportunity), in contrast to the current search-based state of service selection, in which
a service does not interact as an independent candidate.
In order to meet the above aims, the traditional SMP algorithm has been extended to
achieve many-to-many cardinality. This adaptation is achieved by defining the selective
strategy, which is the main engine of the new adaptations. Two adaptations, Dual-Proposed
and Dual-Multi-Allocation, have been proposed for both the service selection and clone detection
processes. The proposed (SMP-based) approach shows very competitive results compared
to existing software clone approaches, especially in identifying type 3 clones (copies with further
modifications, such as updated, added and deleted statements). It performs the
detection process with relatively high precision and recall compared to the CloneDR tool
and shows good scalability on a middle-sized program. For service selection, the proposed
approach has several advantages, such as service protection and service quality. The services
gain equal opportunity against the incoming requests. Therefore, intelligent service
interaction is achieved, and both stability and satisfaction of the candidates are ensured.
This dissertation makes several contributions: firstly, the new extended SMP algorithm,
which introduces the selective strategy to accommodate many-to-many matching problems
and improve overall features; secondly, a new SMP-based clone detection approach that detects
cloned software accurately and ensures high precision and recall; and finally, a new SMP-based
service selection approach that allows equal opportunity between services and requests,
leading to improved service protection and service quality.
Case studies are carried out for experiments with the proposed approach, which show
that the new adaptations can be applied effectively to clone detection and service selection
processes with several desirable features (e.g. accuracy). It can be concluded that the match-based
approach is feasible and promising in the software engineering domain.
Royal Embassy of Saudi Arabia
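Extending SMP beyond one-to-one matching means giving participants capacities. The sketch below shows the capacity-extended (many-to-one, hospitals/residents style) variant of Gale-Shapley as a stepping stone; the thesis's Dual-Proposed and Dual-Multi-Allocation adaptations generalise this further to many-to-many via the selective strategy, and all names below are illustrative.

```python
# Capacity-extended deferred acceptance: requests propose in preference
# order; a service holds up to `capacity` requests and declines its
# least-preferred one when over capacity.
def match_with_capacity(request_prefs, service_prefs, capacity):
    rank = {s: {r: i for i, r in enumerate(prefs)}
            for s, prefs in service_prefs.items()}
    assigned = {s: [] for s in service_prefs}   # service -> held requests
    nxt = {r: 0 for r in request_prefs}         # next preference to try
    free = list(request_prefs)
    while free:
        r = free.pop()
        if nxt[r] >= len(request_prefs[r]):
            continue                            # r has exhausted its list
        s = request_prefs[r][nxt[r]]
        nxt[r] += 1
        assigned[s].append(r)
        if len(assigned[s]) > capacity[s]:
            worst = max(assigned[s], key=lambda x: rank[s][x])
            assigned[s].remove(worst)           # service declines its worst
            free.append(worst)                  # declined request tries again
    return assigned

reqs = {"r1": ["s1"], "r2": ["s1"], "r3": ["s1", "s2"]}
servs = {"s1": ["r1", "r2", "r3"], "s2": ["r3"]}
print(match_with_capacity(reqs, servs, {"s1": 2, "s2": 1}))
```

Because services actively decline proposals, they participate as independent candidates rather than passive search results, which is the "equal opportunity" idea above.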
Detection and prediction of urban archetypes at the pedestrian scale: computational toolsets, morphological metrics, and machine learning methods
Granular, dense, and mixed-use urban morphologies are hallmarks of walkable and vibrant streets. However, urban systems are notoriously complex and planned urban development, which grapples with varied interdependent and oft conflicting criteria, may — despite best intentions — yield aberrant morphologies fundamentally at odds with the needs of pedestrians and the resiliency of neighbourhoods. This work addresses the measurement, detection, and prediction of pedestrian-friendly urban archetypes by developing techniques for high-resolution urban analytics at the pedestrian scale. A spatial-analytic computational toolset, the cityseer-api Python package, is created to assess localised centrality, land-use, and statistical metrics using contextually sensitive workflows applied directly over the street network. cityseer-api subsequently facilitates a review of mixed-use and street network centrality methods to improve their utility concerning granular urban analysis. Unsupervised machine learning methods are applied to recover ‘signatures’ — urban archetypes — using Principal Component Analysis, Variational Autoencoders, and clustering methods from a high-resolution multi-variable and multi-scalar dataset consisting of centralities, land-uses, and population densities for Greater London. Supervised deep-learning methods applied to a similar dataset developed for 931 towns and cities in Great Britain demonstrate how, with the aid of domain knowledge, machine-learning classifiers can learn to discriminate between ‘artificial’ and ‘historical’ urban archetypes. These methods use complex systems thinking as a departure point and illustrate how high-resolution spatial-analytic quantitative methods can be combined with machine learning to extrapolate benchmarks in keeping with more qualitatively framed urban morphological conceptions. 
Such tools may aid urban design professionals in better anticipating the outcomes of varied design scenarios as part of iterative and scalable workflows. These techniques may likewise provide robust and demonstrable feedback as part of planning review and approvals processes
Automated Amortised Analysis
Steffen Jost researched a novel static program analysis that automatically infers formally guaranteed upper bounds on the use of compositional quantitative resources. The technique is based on manual amortised complexity analysis. Inference is achieved through a type system
annotated with linear constraints. Any solution to the collected constraints yields the coefficients of a formula that expresses an upper bound on the resource consumption of a program in terms of the sizes of its various inputs.
The main result is the formal soundness proof of the proposed analysis for a functional language. The strictly evaluated language features higher-order types, full mutual recursion, nested data types, suspension of evaluation, and can deal with aliased data. The presentation focuses on heap space bounds. Extensions allowing the inference of bounds on stack space usage and worst-case execution time
are demonstrated for several realistic program examples. These bounds were inferred by the created generic implementation of the technique. The implementation is highly efficient, and solves even large examples within seconds.
Laying Tiles Ornamentally: An approach to structuring container traversals
Having hardware more capable of parallel execution means that more program scheduling decisions have to be taken to utilize that hardware efficiently. To this end, compilers implement coarse-grained loop transformations in addition to traditionally used fine-grained instruction reordering. Implementors of embedded domain-specific languages face a difficult choice: translate operations on collections to a low-level language naively, hoping that its optimizer will do the job, or implement their own optimizer as part of the EDSL.
We turn to the concept of loop tiling from the imperative world and find its equivalent for recursive functions. We show the construction of a tiled functorial map over containers that can be naively translated to a corresponding nested loop.
We illustrate the connection between untiled and tiled functorial maps by means of a type-theoretic notion of algebraic ornament. This approach produces a family of container traversals indexed by tile sizes and serves as the basis of a proof that untiled and tiled functorial maps have the same semantics.
We evaluate our approach by designing a language of tree traversals as a DSL embedded into Haskell which compiles into C code. We use this language to implement tiled and untiled tree traversals, which we benchmark under varying choices of tile sizes and shapes of input trees. For some tree shapes, we show that a tiled tree traversal can be up to 50% faster than an untiled one under a good choice of the tile size
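The untiled/tiled equivalence can be illustrated on a flat container (the thesis works with trees and proves the equivalence via algebraic ornaments; this Python sketch only demonstrates the semantic claim on lists):

```python
# A tiled functorial map: the outer loop walks fixed-size tiles, the inner
# loop maps over one tile, mirroring imperative loop tiling. Semantically
# it agrees with the ordinary element-by-element map.
def tiled_map(f, xs, tile=4):
    out = []
    for start in range(0, len(xs), tile):
        # one "tile": a small inner loop over a cache-friendly chunk
        for x in xs[start:start + tile]:
            out.append(f(x))
    return out

data = list(range(10))
assert tiled_map(lambda x: x * x, data, tile=3) == [x * x for x in data]
print(tiled_map(lambda x: x + 1, data, tile=4))  # same result as an untiled map
```

The tile size changes only the loop structure (and thus locality), never the result, which is what the ornament-indexed family of traversals captures.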