50,796 research outputs found

    On the Effect of Semantically Enriched Context Models on Software Modularization

    Full text link
    Many of the existing approaches for program comprehension rely on the linguistic information found in source code, such as identifier names and comments. Semantic clustering is one such technique for modularization of the system that relies on the informal semantics of the program, encoded in the vocabulary used in the source code. Treating the source code as a collection of tokens loses the semantic information embedded within the identifiers. We try to overcome this problem by introducing context models for source code identifiers to obtain a semantic kernel, which can be used for both deriving the topics that run through the system as well as their clustering. In the first model, we abstract an identifier to its type representation and build on this notion of context to construct contextual vector representation of the source code. The second notion of context is defined based on the flow of data between identifiers to represent a module as a dependency graph where the nodes correspond to identifiers and the edges represent the data dependencies between pairs of identifiers. We have applied our approach to 10 medium-sized open source Java projects, and show that by introducing contexts for identifiers, the quality of the modularization of the software systems is improved. Both of the context models give results that are superior to the plain vector representation of documents. In some cases, the authoritativeness of decompositions is improved by 67%. Furthermore, a more detailed evaluation of our approach on JEdit, an open source editor, demonstrates that inferred topics through performing topic analysis on the contextual representations are more meaningful compared to the plain representation of the documents. The proposed approach in introducing a context model for source code identifiers paves the way for building tools that support developers in program comprehension tasks such as application and domain concept location, software modularization and topic analysis

    Shape matching and clustering

    Get PDF
    Generalising knowledge and matching patterns is a basic human trait in re-using past experiences. We often cluster (group) knowledge of similar attributes as a process of learning and or aid to manage the complexity and re-use of experiential knowledge [1, 2]. In conceptual design, an ill-defined shape may be recognised as more than one type. Resulting in shapes possibly being classified differently when different criteria are applied. This paper outlines the work being carried out to develop a new technique for shape clustering. It highlights the current methods for analysing shapes found in computer aided sketching systems, before a method is proposed that addresses shape clustering and pattern matching. Clustering for vague geometric models and multiple viewpoint support are explored

    Pacifier overuse and conceptual relations of abstract and emotional concepts

    Get PDF
    This study explores the impact of the extensive use of an oral device since infancy (pacifier) on the acquisition of concrete, abstract, and emotional concepts. While recent evidence showed a negative relation between pacifier use and children’s emotional competence (Niedenthal et al., 2012), the possible interaction between use of pacifier and processing of emotional and abstract language has not been investigated. According to recent theories, while all concepts are grounded in sensorimotor experience, abstract concepts activate linguistic and social information more than concrete ones. Specifically, the Words As Social Tools (WAT) proposal predicts that the simulation of their meaning leads to an activation of the mouth (Borghi and Binkofski, 2014; Borghi and Zarcone, 2016). Since the pacifier affects facial mimicry forcing mouth muscles into a static position, we hypothesize its possible interference on acquisition/consolidation of abstract emotional and abstract not-emotional concepts, which aremainly conveyed during social and linguistic interactions, than of concrete concepts. Fifty-nine first grade children, with a history of different frequency of pacifier use, provided oral definitions of the meaning of abstract not-emotional, abstract emotional, and concrete words. Main effect of concept type emerged, with higher accuracy in defining concrete and abstract emotional concepts with respect to abstract not-emotional concepts, independently from pacifier use. Accuracy in definitions was not influenced by the use of pacifier, butcorrespondence and hierarchical clustering analyses suggest that the use of pacifier differently modulates the conceptual relations elicited by abstract emotional and abstract not-emotional. While the majority of the children produced a similar pattern of conceptual relations, analyses on the few (6) children who overused the pacifier (for more than 3 years) showed that they tend to distinguish less clearly between concrete and abstract emotional concepts and between concrete and abstract not-emotional concepts than children who did not use it (5) or used it for short (17). As to the conceptual relations they produced, children who overused the pacifier tended to refer less to their experience and to social and emotional situations, usemore exemplifications and functional relations, and less free associations

    Substructure Discovery Using Minimum Description Length and Background Knowledge

    Full text link
    The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain. Description of Online Appendix: This is a compressed tar file containing the SUBDUE discovery system, written in C. The program accepts as input databases represented in graph form, and will output discovered substructures with their corresponding value.Comment: See http://www.jair.org/ for an online appendix and other files accompanying this articl
    • …
    corecore