
    Visualizing Z Notation in HTML Documents

    The use of the WWW as a communication medium for software engineers is limited by the lack of tools for writing, sharing, and verifying formal notations. For instance, the Z specification language has a rich set of mathematical characters and requires graphic-rich boxes and schemas for its specifications. It is difficult to integrate Z specifications and text on WWW pages written with the current versions of HTML, and traditional tools are not suited for the task. We present a Java-based tool for rendering Z specifications within HTML documents that can be shown in every WWW browser with Java capabilities. Because the tool is a complete rendering engine, text parts and Z specifications can be freely intermixed, and all the standard features of HTML (such as links) are available both outside and inside Z specifications. Furthermore, the extensibility of our engine allows additional notations to be supported and integrated with current ones.

    Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

    Mathematical formulae represent complex semantic information in a concise form. Especially in Science, Technology, Engineering, and Mathematics, mathematical formulae are crucial to communicate information, e.g., in scientific papers, and to perform computations using computer algebra systems. Enabling computers to access the information encoded in mathematical formulae requires machine-readable formats that can represent both the presentation and content, i.e., the semantics, of formulae. Exchanging such information between systems additionally requires conversion methods for mathematical representation formats. We analyze how the semantic enrichment of formulae improves the format conversion process and show that considering the textual context of formulae reduces the error rate of such conversions. Our main contributions are: (1) providing an openly available benchmark dataset for the mathematical format conversion task consisting of a newly created test collection, an extensive, manually curated gold standard and task-specific evaluation metrics; (2) performing a quantitative evaluation of state-of-the-art tools for mathematical format conversions; (3) presenting a new approach that considers the textual context of formulae to reduce the error rate for mathematical format conversions. Our benchmark dataset facilitates future research on mathematical format conversions as well as research on many problems in mathematical information retrieval. Because we annotated and linked all components of formulae, e.g., identifiers, operators and other entities, to Wikidata entries, the gold standard can, for instance, be used to train methods for formula concept discovery and recognition. Such methods can then be applied to improve mathematical information retrieval systems, e.g., for semantic formula search, recommendation of mathematical content, or detection of mathematical plagiarism. Comment: 10 pages, 4 figures

    Identification of Design Principles

    This report identifies the design principles considered essential for a (possibly new) query and transformation language for the Web that supports inference. Based upon these design principles, an initial strawman is selected. Scenarios for querying the Semantic Web illustrate the design principles and their reflection in the initial strawman, i.e., a first draft of the query language to be designed and implemented by the REWERSE working group I4.

    Web and Semantic Web Query Languages

    A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished, according to the format of the data they can retrieve: XML, RDF and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion.

    Interactive web-based visualization and sharing of phylogenetic trees using phylogeny.IO

    Traditional static publication formats make visualization, exploration, and sharing of massive phylogenetic trees difficult. A phylogenetic study often involves hundreds of taxa, and the resulting tree has to be split across multiple journal pages, or be shrunk onto one, which jeopardizes legibility. Furthermore, additional data layers, such as species-specific information or time calibrations, are often displayed in separate figures, making the entire picture difficult for readers to grasp. Web-based technologies, such as the Data Driven Document (D3) JavaScript library, were created to overcome such challenges by allowing interactive displays of complex data sets. The new phylogeny.IO web server (https://phylogeny.io) overcomes this issue by allowing users to easily import, annotate, and share interactive phylogenetic trees. It allows a range of static (e.g. shapes and colors) and dynamic (e.g. pop-up text and images) annotations. Annotated trees can be saved on the server for subsequent modification, or they may be shared as IFrame HTML objects, easily embeddable in any web page. The principal goal of phylogeny.IO is not to produce publication-ready figures, but rather to provide a simple and intuitive annotation interface that allows easy and rapid sharing of figures in blogs, lecture notes, press releases, etc.

    GiViP: A Visual Profiler for Distributed Graph Processing Systems

    Analyzing large-scale graphs provides valuable insights in different application scenarios. While many graph processing systems working on top of distributed infrastructures have been proposed to deal with big graphs, the tasks of profiling and debugging their massive computations remain time-consuming and error-prone. This paper presents GiViP, a visual profiler for distributed graph processing systems based on a Pregel-like computation model. GiViP captures the large number of messages exchanged throughout a computation and provides an interactive user interface for the visual analysis of the collected data. We show how to take advantage of GiViP to detect anomalies related to the computation and to the infrastructure, such as slow computing units and anomalous message patterns. Comment: Appears in the Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (GD 2017)

    The Document Similarity Network: A Novel Technique for Visualizing Relationships in Text Corpora

    With the abundance of written information available online, it is useful to be able to automatically synthesize and extract meaningful information from text corpora. We present a method for visualizing relationships between documents in a text corpus. By using Latent Dirichlet Allocation to extract topics from the corpus, we create a graph whose nodes represent individual documents and whose edge weights indicate the distance between topic distributions in documents. These edge lengths are then scaled using multidimensional scaling techniques, such that more similar documents are clustered together. Applying this method to several datasets, we demonstrate that these graphs are useful in visually representing high-dimensional document clustering in topic-space.
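
    The graph construction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-document topic distributions are assumed to be already available (LDA itself is not run here), the Hellinger distance is an assumed choice because the abstract does not name the distance it uses, and the multidimensional scaling step is omitted. The names `hellinger`, `similarity_edges`, and the sample data are hypothetical.

```python
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Assumed distance measure; the abstract does not specify which one is used.
    """
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)


def similarity_edges(topic_dists):
    """Return {(doc_i, doc_j): distance} for every unordered pair of documents."""
    return {
        (i, j): hellinger(topic_dists[i], topic_dists[j])
        for i, j in combinations(sorted(topic_dists), 2)
    }


# Hypothetical topic distributions, standing in for LDA output (rows sum to 1).
topic_dists = {
    "doc_a": [0.8, 0.1, 0.1],
    "doc_b": [0.7, 0.2, 0.1],
    "doc_c": [0.1, 0.1, 0.8],
}

# Nodes are documents; each edge weight is a distance between topic mixtures.
edges = similarity_edges(topic_dists)
```

    In this toy data, `doc_a` and `doc_b` share a dominant topic, so their edge is shorter than the edge between `doc_a` and `doc_c`. A layout step such as multidimensional scaling would then place the closer pair nearer together.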

    The Weight Function in the Subtree Kernel is Decisive

    Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is difficult per se. In this paper, we focus on the subtree kernel, a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000s. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a two-class stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as is usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees that is particularly suitable for tuning parameters. We show through eight real data classification problems the great efficiency of our approach, in particular for small datasets, which also underlines the importance of the weight function. Finally, a visualization tool of the significant features is derived. Comment: 36 pages
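
    To make the role of the weight function concrete, here is a small sketch of a subtree-style kernel. It rests on several assumptions that the abstract does not confirm: only complete subtrees (a node together with all of its descendants) are counted, the kernel sums weight times the product of occurrence counts over shared subtrees, and the weight depends only on subtree height. Trees are represented as nested tuples of children, and all function names are hypothetical.

```python
from collections import Counter


def subtree_counts(tree, ordered=True):
    """Count the complete subtrees of `tree`, keyed by (canonical form, height).

    A node is a tuple of its children; a leaf is the empty tuple ().
    """
    counts = Counter()

    def visit(node):
        results = [visit(child) for child in node]
        forms = [form for form, _ in results]
        if not ordered:
            forms.sort()  # child order is irrelevant for unordered trees
        height = 1 + max((h for _, h in results), default=-1)  # leaves get 0
        form = "(" + "".join(forms) + ")"
        counts[(form, height)] += 1
        return form, height

    visit(tree)
    return counts


def subtree_kernel(t1, t2, weight, ordered=True):
    """Sum weight(height) * count_in_t1 * count_in_t2 over shared subtrees."""
    c1 = subtree_counts(t1, ordered)
    c2 = subtree_counts(t2, ordered)
    return sum(weight(h) * n * c2.get((form, h), 0) for (form, h), n in c1.items())


def leafless_weight(height, decay=0.5):
    """Assumed weight: zero for leaves, exponential decay in height otherwise."""
    return 0 if height == 0 else decay ** height


t1 = ((), ((), ()))   # root with a leaf and a two-leaf node
t2 = (((), ()), ())   # the mirror image of t1

k_ordered = subtree_kernel(t1, t2, leafless_weight, ordered=True)
k_unordered = subtree_kernel(t1, t2, leafless_weight, ordered=False)
k_constant = subtree_kernel(t1, t2, lambda h: 1, ordered=True)
```

    With a constant weight, the nine leaf-to-leaf matches dominate the kernel value, whereas the leafless weight only rewards shared internal structure. This is one way to read the abstract's claim that performance improves when the weight of leaves vanishes.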