440 research outputs found

    Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

    Mathematical formulae represent complex semantic information in a concise form. Especially in Science, Technology, Engineering, and Mathematics, mathematical formulae are crucial to communicate information, e.g., in scientific papers, and to perform computations using computer algebra systems. Enabling computers to access the information encoded in mathematical formulae requires machine-readable formats that can represent both the presentation and content, i.e., the semantics, of formulae. Exchanging such information between systems additionally requires conversion methods for mathematical representation formats. We analyze how the semantic enrichment of formulae improves the format conversion process and show that considering the textual context of formulae reduces the error rate of such conversions. Our main contributions are: (1) providing an openly available benchmark dataset for the mathematical format conversion task consisting of a newly created test collection, an extensive, manually curated gold standard, and task-specific evaluation metrics; (2) performing a quantitative evaluation of state-of-the-art tools for mathematical format conversions; (3) presenting a new approach that considers the textual context of formulae to reduce the error rate for mathematical format conversions. Our benchmark dataset facilitates future research on mathematical format conversions as well as research on many problems in mathematical information retrieval. Because we annotated and linked all components of formulae, e.g., identifiers, operators, and other entities, to Wikidata entries, the gold standard can, for instance, be used to train methods for formula concept discovery and recognition. Such methods can then be applied to improve mathematical information retrieval systems, e.g., for semantic formula search, recommendation of mathematical content, or detection of mathematical plagiarism. Comment: 10 pages, 4 figures

    An Investigation into Ontology-Based Enhancement of Search Technologies for E-Government: Literature Review

    E-government services are no longer a new topic; the level of service they offer evolves continuously to match the development of the techniques and technologies used. The success or failure of E-government depends mainly on providing different services to citizens in a suitable and effective manner. This research study aims to provide an empirical and evaluative study of the effects and opportunities of implementing various techniques in the development of E-government. The research focuses on the impact of ontology techniques on the success or failure of the services provided by E-government. The services provided to citizens have expanded from information extraction to voting, taxation, and other services. It has therefore become necessary to describe in detail the most appropriate technologies for achieving a successful E-government that provides effective services.

    A Model for Managing Information Flow on the World Wide Web

    This thesis considers the nature of information management on the World Wide Web. The web has evolved into a global information system that is completely unregulated, permitting anyone to publish whatever information they wish. However, this information is almost entirely unmanaged, which, together with the very large number of users who access it, places enormous strain on the web's architecture. This has led to the exposure of inherent flaws, which reduce its effectiveness as an information system. The thesis presents a thorough analysis of the state of this architecture, and identifies three flaws that could render the web unusable: link rot; a shrinking namespace; and the inevitable increase of noise in the system. A critical examination of existing solutions to these flaws is provided, together with a discussion of why the solutions have not been deployed or adopted. The thesis determines that they have failed to take into account the nature of the information flow between information provider and consumer, or the open philosophy of the web. The overall aim of the research has therefore been to design a new solution to these flaws in the web, based on a greater understanding of the nature of the information that flows upon it. The realization of this objective has included the development of a new model for managing information flow on the web, which is used to develop a solution to the flaws. The solution comprises three new additions to the web's architecture: a temporal referencing scheme; an Oracle Server Network for more effective web browsing; and a Resource Locator Service, which provides automatic transparent resource migration. The thesis describes their design and operation, and presents the concept of the Request Router, which provides a new way of integrating such distributed systems into the web's existing architecture without breaking it. The design of the Resource Locator Service, including the development of new protocols for resource migration, is covered in great detail, and a prototype system that has been developed to prove the effectiveness of the design is presented. The design is further validated by comprehensive performance measurements of the prototype, which show that it will scale to manage a web whose size is orders of magnitude greater than it is today.

    Internet based molecular collaborative and publishing tools

    The scientific electronic publishing model has hitherto been an Internet-based delivery of electronic articles that are essentially replicas of their paper counterparts. They contain little in the way of added semantics that may better expose the science, assist the peer review process, and facilitate follow-on collaborations, even though the enabling technologies have been around for some time and are mature. This thesis will examine the evolution of chemical electronic publishing over the past 15 years. It will illustrate, with the help of two frameworks, how publishers should be exploiting technologies to improve the semantics of chemical journal articles, namely their value-added features and relationships with other chemical resources on the Web. The first framework is an early exemplar of structured and scalable electronic publishing where a Web content management system and a molecular database are integrated. It employs a test bed of articles from several RSC journals and supporting molecular coordinate and connectivity information. The value of converting 3D molecular expressions in chemical file formats, such as the MOL file, into more generic 3D graphics formats, such as Web3D, is assessed. This exemplar highlights the use of metadata management for bidirectional hyperlink maintenance in electronic publishing. The second framework repurposes this metadata management concept into a Semantic Web application called SemanticEye. SemanticEye demonstrates how relationships between chemical electronic articles and other chemical resources are established. It adapts the successful semantic model used for digital music metadata management by popular applications such as iTunes. Globally unique identifiers enable relationships to be established between articles and other resources on the Web, and SemanticEye implements two: the Digital Object Identifier (DOI) for articles and the IUPAC International Chemical Identifier (InChI) for molecules. SemanticEye's potential as a framework for seeding collaborations between researchers who have hitherto never met is explored using FOAF, the friend-of-a-friend Semantic Web standard for social networks.

    Educational tool based on topology and evolution of hyperlinks in the Wikipedia

    We propose a new method to support educational exploration in the hyperlink network of the Wikipedia online encyclopedia. The learner is provided with alternative parallel ranking lists, each one promoting hyperlinks that represent a different pedagogical perspective to the desired learning topic. The learner can browse the conceptual relations between the latest versions of articles or the conceptual relations belonging to consecutive temporal versions of an article, or a mixture of both approaches. Based on her needs and intuition, the learner explores the hyperlink network, and meanwhile the method automatically builds concept maps that reflect her conceptualization process and can be used for varied educational purposes. Initial experiments with a prototype tool based on the method indicate enhancement to ordinary learning results and suggest further research. Peer reviewed

    Dynamic Generation of Intelligent Multimedia Presentations Through Semantic Inferencing

    This paper first proposes a high-level architecture for semi-automatically generating multimedia presentations by combining semantic inferencing with multimedia presentation generation tools. It then describes a system, based on this architecture, which was developed as a service to run over OAI archives but is applicable to any repositories containing mixed-media resources described using Dublin Core. By applying an iterative sequence of searches across the Dublin Core metadata published by the OAI data providers, semantic relationships can be inferred between the mixed-media objects which are retrieved. Using predefined mapping rules, these semantic relationships are then mapped to spatial and temporal relationships between the objects. The spatial and temporal relationships are expressed within SMIL files which can be replayed as multimedia presentations. Our underlying hypothesis is that by using automated computer processing of metadata to organize and combine semantically related objects within multimedia presentations, the system may be able to generate new knowledge by exposing previously unrecognized connections. In addition, the use of multilayered, information-rich multimedia to present the results enables faster and easier information browsing, analysis, interpretation, and deduction by the end user.