Effective similarity measures in electronic testing at programming languages

Author: Akinwale Adio
Niewiadomski Adam
Publication venue: Lodz University of Technology. Press
Publication date: 01/01/2012

The purpose of this study is to explore the grammatical properties and features of the generalized n-gram matching technique in electronic testing of programming languages. The n-gram matching technique has been successfully employed in information handling and decision support systems dealing with texts, but its side effect is the size n, which tends to be rather large. Two new methods, odd gram and sumsquare gram, have been proposed to improve generalized n-gram matching, together with modifications of existing methods. While generalized n-grams are easy to generate and manage, they require quadratic time and space and are therefore ill-suited to the proposed and modified methods, which are quadratic in nature. Experiments have been conducted with the two new methods and the modified ones, using real-life programming code assignments as pattern and text matches, and the derived results were compared with existing methods that are among the best in practice. The experimental results are very positive and suggest that the proposed methods can be successfully applied in electronic testing of programming languages.
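For orientation, the basic idea of n-gram matching can be sketched as follows. This is a minimal illustration of plain character n-gram overlap scored with the Dice coefficient, not the generalized n-gram, odd gram, or sumsquare gram methods of the paper, whose definitions are not given in the abstract. The function names `ngrams` and `dice_similarity` are hypothetical.

```python
from collections import Counter


def ngrams(text: str, n: int) -> Counter:
    """Return the multiset of character n-grams in text (empty if len(text) < n)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def dice_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over character n-grams: 2 * shared / (count_a + count_b).

    Illustrative baseline only; this is an assumption, not the paper's measure.
    """
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((grams_a & grams_b).values())
    return 2 * shared / total


if __name__ == "__main__":
    print(dice_similarity("night", "nacht"))
```

With bigrams, "night" and "nacht" share only "ht" out of four bigrams each, so the score is low; identical strings score 1.0.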

Lodz University of Technology Repository

Efficient Similarity Measures for Texts Matching

Author: Akinwale Adio
Niewiadomski Adam
Publication venue: Lodz University of Technology. Press
Publication date: 01/01/2015

Calculating similarity measures for exact text matching is a critical task in pattern matching that needs great attention. Many similarity measures exist in the literature, but no single best method exists for measuring the closeness of two strings. The objective of this paper is to explore the grammatical properties and features of the generalized n-gram matching technique for similarity measures, in order to find exact text in electronic computer applications. Three new similarity measures have been proposed to improve the performance of the generalized n-gram method. The new methods yielded high similarity values and a good performance-to-price ratio with low running time. Experiments with the new methods demonstrated that they are universal and very useful for finding words that can be derived from a word list as a group and for retrieving relevant medical terms from a database. One of the methods achieved the best correlation of values for the evaluation of subjective examinations.

Lodz University of Technology Repository

Simplifying Deep-Learning-Based Model for Code Search

Author: Hassan Ahmed E.
Li Shanping
Liu Chao
Liu Zhiwei
Lo David
Xia Xin
Publication date: 28/05/2020

To accelerate software development, developers frequently search for and reuse existing code snippets from large-scale codebases, e.g., GitHub. Over the years, researchers have proposed many information retrieval (IR) based models for code search, which match keywords in the query with code text. But they fail to bridge the semantic gap between query and code. To address this challenge, Gu et al. proposed a deep-learning-based model named DeepCS. It jointly embeds method code and natural language descriptions into a shared vector space, where methods related to a natural language query are retrieved according to their vector similarities. However, DeepCS' working process is complicated and time-consuming. To overcome this issue, we propose a simplified model, CodeMatcher, that leverages the IR technique but maintains many features of DeepCS. Generally, CodeMatcher combines query keywords in their original order, performs a fuzzy search on the name and body strings of methods, and returns the best-matched methods with the longest sequence of used keywords. We verified its effectiveness on a large-scale codebase with about 41k repositories. Experimental results showed that the simplified model CodeMatcher outperforms DeepCS by 97% in terms of MRR (a widely used accuracy measure for code search), and that it is over 66 times faster than DeepCS. Besides, compared with the state-of-the-art IR-based model CodeHow, CodeMatcher also improves the MRR by 73%. We also observed that fusing the advantages of IR-based and deep-learning-based models is promising because they complement each other by nature, and that improving the quality of method naming helps code search, since the method name plays an important role in connecting query and code.
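The MRR figures quoted above refer to mean reciprocal rank. A minimal sketch of the standard definition follows; the function name and the convention that a query with no relevant result contributes zero are assumptions for illustration, not details taken from the paper's evaluation setup.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_hit_ranks: Sequence[Optional[int]]) -> float:
    """Mean reciprocal rank over a set of queries.

    first_hit_ranks holds, per query, the 1-based rank of the first relevant
    result, or None if no relevant result was returned (assumed to score 0).
    """
    if not first_hit_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in first_hit_ranks if rank is not None)
    return total / len(first_hit_ranks)


if __name__ == "__main__":
    # Four queries: hits at ranks 1, 2, and 4, plus one query with no hit.
    print(mean_reciprocal_rank([1, 2, None, 4]))
```

A higher MRR means the first relevant method tends to appear nearer the top of the result list.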

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

A heuristic-based approach to code-smell detection

Author: Kirk D.
Roper M.
Wood M.
Publication venue: Nova Science Publishers, Inc.
Publication date: 01/01/2007

Encapsulation and data hiding are central tenets of the object-oriented paradigm. Deciding what data and behaviour to form into a class, and where to draw the line between its public and private details, can make the difference between a class that is an understandable, flexible and reusable abstraction and one that is not. This decision is a difficult one and may easily result in poor encapsulation, which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late), and attempting to perform such analysis manually can be tedious and error-prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together: data classes lack functionality that has been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool, developed as a plug-in for the Eclipse IDE, which automatically detects data and god classes. The technique has been evaluated in a controlled study on two large open source systems, which compares the tool's results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approaches.

University of Strathclyde Institutional Repository

Extraction and integration of data from semi-structured documents into business applications

Author: Ph. Bonnet
S. Bressan
Publication venue: Sloan School of Management, Massachusetts Institute of Technology
Publication date: 01/01/1997

Cover title. Includes bibliographical references (p. 8). Ph. Bonnet & S. Bressan

DSpace@MIT

Structure and Properties of Traces for Functional Programs

Author: Bakewell
Bernstein
Braßel
Caballero
Chitil
Chitil
Clements
Ennals
Faxén
Gill
Johnsson
Klop
Launchbury
Maessen
Olaf Chitil
Peyton Jones
Plump
Shapiro
Silva
Sparud
Sparud
Yong Luo
Publication venue: Elsevier BV
Publication date: 18/05/2007

The tracer Hat records, in a detailed trace, the computation of a program written in the lazy functional language Haskell. The trace can then be viewed in various ways to support program comprehension and debugging. The trace was named the augmented redex trail. Its structure was inspired by standard graph rewriting implementations of functional languages. Here we describe a model of the trace that captures its essential properties and allows formal reasoning. The trace is a graph constructed by graph rewriting, but it goes beyond simple term graphs. Although the trace is a graph whose structure is independent of any rewriting strategy, we define the trace inductively, which gives us a powerful method for proving its properties.

Elsevier - Publisher Connector

Crossref

Kent Academic Repository