1,815 research outputs found

    A heuristic-based approach to code-smell detection

    Get PDF
    Encapsulation and data hiding are central tenets of the object oriented paradigm. Deciding what data and behaviour to form into a class and where to draw the line between its public and private details can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late) and attempting to perform such analysis manually can also be tedious and error prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together – data classes are lacking in functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool which automatically detects data and god classes that has been developed as a plug-in for the Eclipse IDE. The technique has been evaluated in a controlled study on two large open source systems which compare the tool results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approache

    Relationship analysis : improving the systems analysis process

    Get PDF
    A significant aspect of systems analysis involves discovering and representing entities and their inter-relationships. Guidelines exist to identify entities but do not provide a rigorous and comprehensive process to explicitly capture the relationship structure of the problem domain. Whereas, other analysis techniques lightly address the relationship discovery process, Relationship Analysis is the only systematic, domain-independent analysis technique focusing exclusively on a domain\u27s relationship structure. The quality of design artifacts, such as class diagrams, and development time necessary to generate these artifacts can be improved by first representing the complete relationship structure of the problem domain. The Relationship Analysis Model is the first theory-based taxonomy to classify relationships. A rigorous evaluation was conducted, including a formal experiment comparing novice and experienced analysts with and without Relationship Analysis. It was shown that the Relationship Analysis Process based on the model does provide a fuller and richer systems analysis, resulting in improved quality of and reduced time in generating class diagrams. It also was shown that Relationship Analysis enables analysts of varying experience levels to achieve a similar level of quality of class diagrams. Relationship Analysis significantly enhances the systems analyst\u27s effectiveness, especially in the area of relationship discovery and documentation resulting in improved analysis and design artifacts

    Natural Language Processing in-and-for Design Research

    Full text link
    We review the scholarly contributions that utilise Natural Language Processing (NLP) methods to support the design process. Using a heuristic approach, we collected 223 articles published in 32 journals and within the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research

    UmobiTalk: Ubiquitous Mobile Speech Based Learning Language Translator for Sesotho Language

    Get PDF
    Published ThesisThe need to conserve the under-resourced languages is becoming more urgent as some of them are becoming extinct; natural language processing can be used to redress this. Currently, most initiatives around language processing technologies are focusing on western languages such as English and French, yet resources for such languages are already available. The Sesotho language is one of the under-resourced Bantu languages; it is mostly spoken in Free State province of South Africa and in Lesotho. Like other parts of South Africa, Free State has experienced high number of migrants and non-Sesotho speakers from neighboring provinces and countries; such people are faced with serious language barrier problems especially in the informal settlements where everyone tends to speak only Sesotho. Non-Sesotho speakers refers to the racial groups such as Xhosas, Zulus, Coloureds, Whites and more, in which Sesotho language is not their native language. As a solution to this, we developed a parallel corpus that has English as source and Sesotho as a target language and packaged it in UmobiTalk - Ubiquitous mobile speech based learning translator. UmobiTalk is a mobile-based tool for learning Sesotho for English speakers. The development of this tool was based on the combination of automatic speech recognition, machine translation and speech synthesis

    The Interpretation of Tables in Texts

    Get PDF

    Natural language search of structured documents

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (leaves 45-47).This thesis focuses on techniques with which natural language can be used to search for specific elements in a structured document, such as an XML file. The goal is to create a system capable of being trained to identify features, of written English sentence describing (in natural language) part of an XML document, that help identify the sections of said document which were discussed. In particular, this thesis will revolve around the problem of searching through XML documents, each of which describes the play-by-play events of a baseball game. These events are collected from Major League Baseball games between 2004 and 2008, containing information detailing the outcome of every pitch thrown. My techniques are trained and tested on written (newspaper) summaries of these games, which often refer to specific game events and statistics. The choice of these training data makes the task much more complex in two ways. First, these summaries come from multiple authors. Each of these authors has a distinct writing style, which uses language in a unique and often complex way. Secondly, large portions of these summaries discuss facts outside of the context of the play-by-play events of the XML documents. Training the system with these portions of the summary can create a problem due to sparse data, which has the potential to reduce the effectiveness of the system. The end result is the creation of a system capable of building classifiers for natural language search of these XML documents.(cont.) This system is able to overcome the two aforementioned problems, as well as several more subtle challenges. In addition, several limitations of alternative, strictly feature-based, classifiers are also illustrated, and applications of this research to related problems (outside of baseball and sports) are discussed.by Stephen W. Oney.M.Eng

    Developing semantic pathway comparison methods for systems biology

    Get PDF
    Systems biology is an emerging multi-disciplinary field in which the behaviour of complex biological systems is studied by considering the interaction of many cellular and molecular constituents rather than using a “traditional” reductionist approach where constituents are studied individually. Systems are often studied over time with the ultimate goal of developing models which can be used to understand and predict complex biological processes, such as human diseases. To support systems biology, a large number of biological pathways are being derived for many different organisms, and these are stored in various databases. This pathway collection presents an opportunity to compare and contrast pathways, and to utilise the knowledge they represent. This thesis presents some of the first algorithms that are designed to explore this opportunity. It is argued that the methods will be useful to biologists in order to assess the biological plausibility of derived pathways, compare different biological pathways for semantic similarities, and to derive putative pathways that are semantically similar to documented biological pathways. The methods will therefore extend the systems biology toolbox that biologists can use to make new biological discoveries.Knowledge Foundation. Grant No. 2003/0215Information Fusion Research Program (University of Skovde, Sweden) Grant No 2003/010
    corecore