26,176 research outputs found

    Context and Keyword Extraction in Plain Text Using a Graph Representation

    Full text link
    Document indexation is an essential task achieved by archivists or automatic indexing tools. To retrieve relevant documents to a query, keywords describing this document have to be carefully chosen. Archivists have to find out the right topic of a document before starting to extract the keywords. For an archivist indexing specialized documents, experience plays an important role. But indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. This system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resources

    Best-First Surface Realization

    Get PDF
    Current work in surface realization concentrates on the use of general, abstract algorithms that interpret large, reversible grammars. Only little attention has been paid so far to the many small and simple applications that require coverage of a small sublanguage at different degrees of sophistication. The system TG/2 described in this paper can be smoothly integrated with deep generation processes, it integrates canned text, templates, and context-free rules into a single formalism, it allows for both textual and tabular output, and it can be parameterized according to linguistic preferences. These features are based on suitably restricted production system techniques and on a generic backtracking regime.Comment: 10 pages, LaTeX source, one EPS figur

    Semi-Automated SVG Programming via Direct Manipulation

    Full text link
    Direct manipulation interfaces provide intuitive and interactive features to a broad range of users, but they often exhibit two limitations: the built-in features cannot possibly cover all use cases, and the internal representation of the content is not readily exposed. We believe that if direct manipulation interfaces were to (a) use general-purpose programs as the representation format, and (b) expose those programs to the user, then experts could customize these systems in powerful new ways and non-experts could enjoy some of the benefits of programmable systems. In recent work, we presented a prototype SVG editor called Sketch-n-Sketch that offered a step towards this vision. In that system, the user wrote a program in a general-purpose lambda-calculus to generate a graphic design and could then directly manipulate the output to indirectly change design parameters (i.e. constant literals) in the program in real-time during the manipulation. Unfortunately, the burden of programming the desired relationships rested entirely on the user. In this paper, we design and implement new features for Sketch-n-Sketch that assist in the programming process itself. Like typical direct manipulation systems, our extended Sketch-n-Sketch now provides GUI-based tools for drawing shapes, relating shapes to each other, and grouping shapes together. Unlike typical systems, however, each tool carries out the user's intention by transforming their general-purpose program. This novel, semi-automated programming workflow allows the user to rapidly create high-level, reusable abstractions in the program while at the same time retaining direct manipulation capabilities. In future work, our approach may be extended with more graphic design features or realized for other application domains.Comment: In 29th ACM User Interface Software and Technology Symposium (UIST 2016

    A Domain-Specific Language and Editor for Parallel Particle Methods

    Full text link
    Domain-specific languages (DSLs) are of increasing importance in scientific high-performance computing to reduce development costs, raise the level of abstraction and, thus, ease scientific programming. However, designing and implementing DSLs is not an easy task, as it requires knowledge of the application domain and experience in language engineering and compilers. Consequently, many DSLs follow a weak approach using macros or text generators, which lack many of the features that make a DSL a comfortable for programmers. Some of these features---e.g., syntax highlighting, type inference, error reporting, and code completion---are easily provided by language workbenches, which combine language engineering techniques and tools in a common ecosystem. In this paper, we present the Parallel Particle-Mesh Environment (PPME), a DSL and development environment for numerical simulations based on particle methods and hybrid particle-mesh methods. PPME uses the meta programming system (MPS), a projectional language workbench. PPME is the successor of the Parallel Particle-Mesh Language (PPML), a Fortran-based DSL that used conventional implementation strategies. We analyze and compare both languages and demonstrate how the programmer's experience can be improved using static analyses and projectional editing. Furthermore, we present an explicit domain model for particle abstractions and the first formal type system for particle methods.Comment: Submitted to ACM Transactions on Mathematical Software on Dec. 25, 201

    On the Effect of Semantically Enriched Context Models on Software Modularization

    Full text link
    Many of the existing approaches for program comprehension rely on the linguistic information found in source code, such as identifier names and comments. Semantic clustering is one such technique for modularization of the system that relies on the informal semantics of the program, encoded in the vocabulary used in the source code. Treating the source code as a collection of tokens loses the semantic information embedded within the identifiers. We try to overcome this problem by introducing context models for source code identifiers to obtain a semantic kernel, which can be used for both deriving the topics that run through the system as well as their clustering. In the first model, we abstract an identifier to its type representation and build on this notion of context to construct contextual vector representation of the source code. The second notion of context is defined based on the flow of data between identifiers to represent a module as a dependency graph where the nodes correspond to identifiers and the edges represent the data dependencies between pairs of identifiers. We have applied our approach to 10 medium-sized open source Java projects, and show that by introducing contexts for identifiers, the quality of the modularization of the software systems is improved. Both of the context models give results that are superior to the plain vector representation of documents. In some cases, the authoritativeness of decompositions is improved by 67%. Furthermore, a more detailed evaluation of our approach on JEdit, an open source editor, demonstrates that inferred topics through performing topic analysis on the contextual representations are more meaningful compared to the plain representation of the documents. The proposed approach in introducing a context model for source code identifiers paves the way for building tools that support developers in program comprehension tasks such as application and domain concept location, software modularization and topic analysis
    corecore