Context and Keyword Extraction in Plain Text Using a Graph Representation
Document indexation is an essential task achieved by archivists or automatic
indexing tools. To retrieve documents relevant to a query, the keywords
describing each document have to be carefully chosen. Archivists have to
identify the right topic of a document before starting to extract keywords. For an
archivist indexing specialized documents, experience plays an important role.
But indexing documents on different topics is much harder. This article
proposes an innovative method for an indexing support system. This system takes
as input an ontology and a plain text document and provides as output
contextualized keywords of the document. The method has been evaluated by
exploiting Wikipedia's category links as a termino-ontological resource.
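As a rough illustration of the general idea only (not the paper's method), the Python sketch below builds a word co-occurrence graph from plain text and ranks candidate keywords by weighted degree centrality. The stopword list and window size are assumptions, and the ontology-based contextualization step is omitted.

```python
import re
from collections import defaultdict

# Illustrative stopword list; a real system would use a proper resource.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "is", "are",
             "be", "can", "so", "that"}

def cooccurrence_graph(text, window=4):
    """Build a weighted co-occurrence graph over content words."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    edges = defaultdict(int)
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                edges[frozenset((tokens[i], tokens[j]))] += 1
    return edges

def rank_keywords(text, top_k=5):
    """Score candidate keywords by weighted degree centrality in the graph."""
    edges = cooccurrence_graph(text)
    degree = defaultdict(int)
    for pair, weight in edges.items():
        for node in pair:
            degree[node] += weight
    return sorted(degree, key=degree.get, reverse=True)[:top_k]

if __name__ == "__main__":
    sample = ("Document indexation assigns keywords to a document so that "
              "relevant documents can be retrieved for a query.")
    print(rank_keywords(sample))
```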
Best-First Surface Realization
Current work in surface realization concentrates on the use of general,
abstract algorithms that interpret large, reversible grammars. Little
attention has been paid so far to the many small and simple applications that
require coverage of a small sublanguage at different degrees of sophistication.
The system TG/2 described in this paper can be smoothly integrated with deep
generation processes; it integrates canned text, templates, and context-free
rules into a single formalism; it allows for both textual and tabular output;
and it can be parameterized according to linguistic preferences. These features
are based on suitably restricted production system techniques and on a generic
backtracking regime.
Comment: 10 pages, LaTeX source, one EPS figure
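For intuition only, here is a minimal Python sketch of a realizer that mixes canned text, templates, and context-free rules in one rule set. The grammar, slot syntax, and rule names are invented for illustration and are not TG/2 notation; the preference-driven best-first search and backtracking described in the abstract are not modeled.

```python
import random

# Toy grammar: rule bodies are lists of alternatives; each alternative is a
# sequence of terminals (plain strings), slots ("$name", filled from input),
# or non-terminals ("@Rule"). All names here are illustrative.
GRAMMAR = {
    "@Greeting": [["Hello."], ["Welcome."]],                 # canned text
    "@Report":   [["@Greeting", "The value of", "$item",
                    "is", "$value", "."]],                   # template + rules
}

def realize(symbol, values):
    """Expand a non-terminal by choosing one alternative (random here,
    standing in for a preference-driven choice)."""
    alternative = random.choice(GRAMMAR[symbol])
    words = []
    for token in alternative:
        if token.startswith("@"):
            words.extend(realize(token, values))
        elif token.startswith("$"):
            words.append(str(values[token[1:]]))
        else:
            words.append(token)
    return words

if __name__ == "__main__":
    print(" ".join(realize("@Report", {"item": "temperature", "value": 21})))
```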
Semi-Automated SVG Programming via Direct Manipulation
Direct manipulation interfaces provide intuitive and interactive features to
a broad range of users, but they often exhibit two limitations: the built-in
features cannot possibly cover all use cases, and the internal representation
of the content is not readily exposed. We believe that if direct manipulation
interfaces were to (a) use general-purpose programs as the representation
format, and (b) expose those programs to the user, then experts could customize
these systems in powerful new ways and non-experts could enjoy some of the
benefits of programmable systems.
In recent work, we presented a prototype SVG editor called Sketch-n-Sketch
that offered a step towards this vision. In that system, the user wrote a
program in a general-purpose lambda-calculus to generate a graphic design and
could then directly manipulate the output to indirectly change design
parameters (i.e. constant literals) in the program in real-time during the
manipulation. Unfortunately, the burden of programming the desired
relationships rested entirely on the user.
In this paper, we design and implement new features for Sketch-n-Sketch that
assist in the programming process itself. Like typical direct manipulation
systems, our extended Sketch-n-Sketch now provides GUI-based tools for drawing
shapes, relating shapes to each other, and grouping shapes together. Unlike
typical systems, however, each tool carries out the user's intention by
transforming their general-purpose program. This novel, semi-automated
programming workflow allows the user to rapidly create high-level, reusable
abstractions in the program while at the same time retaining direct
manipulation capabilities. In future work, our approach may be extended with
more graphic design features or realized for other application domains.
Comment: In 29th ACM User Interface Software and Technology Symposium (UIST 2016)
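As a toy illustration of the underlying idea (not Sketch-n-Sketch's lambda-calculus, its trace machinery, or its solver), the Python sketch below generates an SVG shape from named constant literals and pushes a simulated drag on the output back into those constants.

```python
# A toy version of the idea: the "program" draws a shape from named constant
# literals, and a drag on the output is translated back into new constant
# values. Names and structure are hypothetical, chosen for demonstration.

def program(constants):
    """Generate an SVG rectangle from named constant literals."""
    x, y, w = constants["x"], constants["y"], constants["w"]
    return f'<rect x="{x}" y="{y}" width="{w}" height="{w}" />'

def manipulate(constants, dx, dy):
    """Simulate a drag on the output by updating the source constants."""
    updated = dict(constants, x=constants["x"] + dx, y=constants["y"] + dy)
    return updated, program(updated)

if __name__ == "__main__":
    consts = {"x": 10, "y": 20, "w": 50}
    consts, svg = manipulate(consts, dx=5, dy=-3)
    print(consts)   # {'x': 15, 'y': 17, 'w': 50}
    print(svg)
```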
A Domain-Specific Language and Editor for Parallel Particle Methods
Domain-specific languages (DSLs) are of increasing importance in scientific
high-performance computing to reduce development costs, raise the level of
abstraction and, thus, ease scientific programming. However, designing and
implementing DSLs is not an easy task, as it requires knowledge of the
application domain and experience in language engineering and compilers.
Consequently, many DSLs follow a weak approach using macros or text generators,
which lack many of the features that make a DSL comfortable for programmers.
Some of these features---e.g., syntax highlighting, type inference, error
reporting, and code completion---are easily provided by language workbenches,
which combine language engineering techniques and tools in a common ecosystem.
In this paper, we present the Parallel Particle-Mesh Environment (PPME), a DSL
and development environment for numerical simulations based on particle methods
and hybrid particle-mesh methods. PPME uses the Meta Programming System (MPS),
a projectional language workbench. PPME is the successor of the Parallel
Particle-Mesh Language (PPML), a Fortran-based DSL that used conventional
implementation strategies. We analyze and compare both languages and
demonstrate how the programmer's experience can be improved using static
analyses and projectional editing. Furthermore, we present an explicit domain
model for particle abstractions and the first formal type system for particle
methods.
Comment: Submitted to ACM Transactions on Mathematical Software on Dec. 25, 201
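To illustrate the kind of computation such a DSL abstracts over, here is a hypothetical one-dimensional particle interaction step in plain Python. The kernel, cutoff, and data layout are made up for demonstration; PPML and PPME programs are written in their own languages, not Python.

```python
import math

def step(particles, cutoff=1.0, dt=0.01):
    """One explicit update: each particle accumulates contributions from
    neighbours within a cutoff radius, then integrates its property."""
    updated = []
    for i, (xi, ui) in enumerate(particles):
        du = 0.0
        for j, (xj, uj) in enumerate(particles):
            if i != j and abs(xi - xj) < cutoff:
                du += (uj - ui) * math.exp(-abs(xi - xj))  # toy kernel
        updated.append((xi, ui + dt * du))
    return updated

if __name__ == "__main__":
    parts = [(0.0, 1.0), (0.5, 0.0), (1.2, 0.5)]  # (position, property) pairs
    for _ in range(3):
        parts = step(parts)
    print(parts)
```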
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such modularization technique; it relies
on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that inferred topics
through performing topic analysis on the contextual representations are more
meaningful compared to the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis.
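As a simplified illustration of the first context model described above (identifiers abstracted to the types they co-occur with), the Python sketch below aggregates per-module type vectors and compares modules by cosine similarity. The module names, identifiers, and types are fabricated for demonstration and are not the paper's data or tooling.

```python
from collections import Counter

# Hypothetical modules, each a list of (identifier, declared type) pairs.
MODULES = {
    "parser":   [("token", "String"), ("pos", "int"), ("tree", "Node")],
    "lexer":    [("token", "String"), ("pos", "int"), ("buf", "char[]")],
    "renderer": [("canvas", "Graphics"), ("color", "Color"), ("tree", "Node")],
}

def context_vector(identifiers):
    """Aggregate a module's identifiers into a bag-of-types vector."""
    return Counter(type_name for _, type_name in identifiers)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sum(x * x for x in a.values()) ** 0.5
    norm_b = sum(x * x for x in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

if __name__ == "__main__":
    vectors = {m: context_vector(ids) for m, ids in MODULES.items()}
    print(cosine(vectors["parser"], vectors["lexer"]))      # higher similarity
    print(cosine(vectors["parser"], vectors["renderer"]))   # lower similarity
```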