
    CMLLite: a design philosophy for CML.

    CMLLite is a collection of definitions and processes that provide strong and flexible validation for documents in Chemical Markup Language (CML). It consists of an updated CML schema (schema3), conventions specifying rules in both human- and machine-understandable forms, and a validator, available both online and offline, to check conformance. This article explores the rationale behind the changes made to the schema, explains how conventions interact and how they are designed, formulated, implemented and tested, and gives an overview of the validation service.
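    The convention layer described above checks rules that a schema alone cannot express. As a minimal sketch (not CMLLite's actual rule set), a hypothetical convention-style check using only the Python standard library might verify that every molecule element contains at least one atom; the namespace URI is CML's published one, but the rule itself is illustrative:

    ```python
    import xml.etree.ElementTree as ET

    CML_NS = "http://www.xml-cml.org/schema"

    def check_molecule_has_atoms(cml: str) -> list:
        """Hypothetical convention-style rule: every <molecule> must
        contain at least one <atom>. Returns a list of error messages
        (empty list means the document passes this rule)."""
        root = ET.fromstring(cml)
        errors = []
        for mol in root.iter(f"{{{CML_NS}}}molecule"):
            if not mol.findall(f".//{{{CML_NS}}}atom"):
                errors.append("molecule with no atoms")
        return errors

    good = ('<molecule xmlns="http://www.xml-cml.org/schema">'
            '<atomArray><atom id="a1" elementType="C"/></atomArray>'
            '</molecule>')
    bad = '<molecule xmlns="http://www.xml-cml.org/schema"/>'
    ```

    A real CMLLite validator combines schema validation with many such machine-readable convention rules; this fragment only illustrates the shape of one rule.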

    MQALD: Evaluating the impact of modifiers in question answering over knowledge graphs.

    Question Answering (QA) over Knowledge Graphs (KG) aims to develop systems capable of answering users' questions using information from one or more Knowledge Graphs, such as DBpedia and Wikidata. A QA system needs to translate the user's question, written in natural language, into a query formulated in a specific data query language compliant with the underlying KG. This translation process is already non-trivial for simple questions that involve a single triple pattern. It becomes even more troublesome for questions that require modifiers in the final query, i.e., aggregate functions, query forms, and so on. Attention to this last aspect is growing but has never been thoroughly addressed in the existing literature. Starting from the latest advances in this field, we take a further step in this direction. This work provides a publicly available dataset designed for evaluating the performance of a QA system in translating articulated questions into a specific data query language. The dataset has also been used to evaluate three state-of-the-art QA systems.
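    To make the modifier distinction concrete, here is a hedged sketch (the question and property choices are illustrative, though `dbo:`/`dbr:` are DBpedia's standard namespaces): a simple question maps to a single triple pattern, while a counting question additionally requires an aggregate modifier in the SELECT clause.

    ```python
    # "Which movies did Steven Spielberg direct?" -> one triple pattern:
    simple = """
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?film WHERE { ?film dbo:director dbr:Steven_Spielberg . }
    """

    # "How many movies did Steven Spielberg direct?" -> same triple
    # pattern, plus a COUNT aggregate modifier the system must generate:
    aggregate = """
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT (COUNT(?film) AS ?n)
    WHERE { ?film dbo:director dbr:Steven_Spielberg . }
    """
    ```

    Generating the extra `COUNT(...)` wrapper, or modifiers such as `ORDER BY` and `LIMIT`, is precisely the step the dataset is designed to evaluate.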

    Computational Materials Repository


    Analysis and study on text representation to improve the accuracy of the Normalized Compression Distance

    The huge amount of information stored in text form makes methods that deal with texts especially interesting. This thesis focuses on dealing with texts using compression distances. More specifically, it takes a small step towards understanding both the nature of texts and the nature of compression distances. Broadly speaking, it does so by exploring the effects that several distortion techniques have on one of the most successful distances in the family of compression distances, the Normalized Compression Distance (NCD). PhD thesis; 202 pages.
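    The NCD mentioned above has a standard definition: NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed size under some real compressor. A minimal sketch using zlib as the compressor (the thesis itself may use other compressors and distortion techniques):

    ```python
    import zlib

    def c(data: bytes) -> int:
        """Compressed size of data under zlib at maximum compression."""
        return len(zlib.compress(data, 9))

    def ncd(x: bytes, y: bytes) -> float:
        """Normalized Compression Distance:
        (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
        Values near 0 suggest similarity; values near 1, dissimilarity."""
        cx, cy, cxy = c(x), c(y), c(x + y)
        return (cxy - min(cx, cy)) / max(cx, cy)

    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = b"the quick brown fox jumps over the lazy dog " * 19 + b"a new tail"
    d_similar = ncd(a, b)          # two near-identical texts
    d_dissimilar = ncd(a, bytes(range(256)) * 4)  # unrelated bytes
    ```

    Because a real compressor only approximates Kolmogorov complexity, NCD values can slightly exceed 1; the useful property is the ordering, with similar texts scoring lower than unrelated ones.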

    TREE-D-SEEK: A Framework for Retrieving Three-Dimensional Scenes

    In this dissertation, a strategy and framework for retrieving 3D scenes is proposed. The strategy is to retrieve 3D scenes based on a unified approach for indexing content from disparate information sources and information levels. The TREE-D-SEEK framework implements the proposed strategy for retrieving 3D scenes and is capable of indexing content from a variety of corpora at distinct information levels. A semantic annotation model for indexing 3D scenes in the TREE-D-SEEK framework is also proposed. The semantic annotation model is based on an ontology for rapid prototyping of 3D virtual worlds. With ongoing improvements in computer hardware and 3D technology, the cost associated with the acquisition, production and deployment of 3D scenes is decreasing. As a consequence, there is a need for efficient 3D retrieval systems for the increasing number of 3D scenes in corpora. An efficient 3D retrieval system provides several benefits, such as enhanced sharing and reuse of 3D scenes and 3D content. Existing 3D retrieval systems are closed systems and provide search solutions based on a predefined set of indexing and matching algorithms. Existing 3D search systems and search solutions cannot be customized for specific requirements, type of information source and information level. In this research, TREE-D-SEEK, an open, extensible framework for retrieving 3D scenes, is proposed. The TREE-D-SEEK framework is capable of retrieving 3D scenes based on indexing from low-level content to high-level semantic metadata. The TREE-D-SEEK framework is discussed from a software architecture perspective. The architecture is based on a common process flow derived from indexing disparate information sources. Several indexing and matching algorithms are implemented. Experiments are conducted to evaluate the usability and performance of the framework. Retrieval performance of the framework is evaluated using benchmarks and manually collected corpora.
    A generic semantic annotation model is proposed for indexing a 3D scene. The primary objective of using the semantic annotation model in the TREE-D-SEEK framework is to improve retrieval relevance and to support richer queries within a 3D scene. The semantic annotation model is driven by an ontology derived from a 3D rapid prototyping framework. The TREE-D-SEEK framework supports query-by-example, keyword-based and semantic-annotation-based query types for retrieving 3D scenes.

    Doctor of Philosophy

    Over 40 years ago, the first computer simulation of a protein was reported: the atomic motions of a 58-amino-acid protein were simulated for a few picoseconds. With today's supercomputers, simulations of large biomolecular systems with hundreds of thousands of atoms can reach biologically significant timescales. Through dynamics information, biomolecular simulations can provide new insights into molecular structure and function to support the development of new drugs or therapies. While the recent advances in high-performance computing hardware and computational methods have enabled scientists to run longer simulations, they have also created new challenges for data management. Investigators need to use local and national resources to run these simulations and store their output, which can reach terabytes of data on disk. Because of the wide variety of computational methods and software packages available to the community, no standard data representation has been established to describe the computational protocol and the output of these simulations, preventing data sharing and collaboration. Data exchange is also limited due to the lack of repositories and tools to summarize, index, and search biomolecular simulation datasets. In this dissertation, a common data model for biomolecular simulations is proposed to guide the design of future databases and APIs. The data model was then extended to a controlled vocabulary that can be used in the context of the semantic web. Two different approaches to data management are also proposed. The iBIOMES repository offers a distributed environment where input and output files are indexed via common data elements. The repository includes a dynamic web interface to summarize, visualize, search, and download published data. A simpler tool, iBIOMES Lite, was developed to generate summaries of datasets hosted at remote sites where user privileges and/or IT resources might be limited.
    These two informatics-based approaches to data management offer new means for the community to keep track of distributed and heterogeneous biomolecular simulation data and to create collaborative networks.

    Past, present and future of historical information science

    This report evaluates the impact of two decades of research within the framework of history and computing, and sets out a research paradigm and research infrastructure for future historical information science. A great deal of historical information research has been done in the past; much of it, however, took place outside the field of history and computing, and not within a community like the Association for History and Computing (AHC). The reason is that the AHC never made a clear statement about which audience it addresses: historians with an interest in computing, or historical information scientists. As a result, neither party has been accommodated, and communication with both 'traditional' history and information science has not been established. The author proposes a research program, based on new developments in information science, that avoids these ambiguities and integrates the approaches into an unambiguous scientific research infrastructure. (author's abstract)

    A New Model for Image-Based Humanities Computing

    Image-based humanities computing, the computer-assisted study of digitally represented "objects or artifacts of cultural heritage," is an increasingly popular yet "established practice" located at the most recent intersections of humanities scholarship and "digital imaging technologies," as Matthew Kirschenbaum has pointed out. Many exciting things have been and are being done in this field, as multifaceted multimedia projects and "advanced visual and visualization tools" continue to be produced and used; but it also seems to lack definition and seems unnecessarily limited in its critical approach to digital images. That is, the textual mediation required to make images usable or knowable, and the kinds of knowledge images offer, often go unexamined, and the value of creative or deformative responses to images is overlooked. This thesis will suggest Blake's production of the Laocoön as a model for a more open and relevant approach to images, will analyze what image-based humanities computing does and how Blake's engraving recapitulates these actions, and will describe how acritical approaches to image description could be integrated and used, and how images could function as graphic mediation for other materials, in this field. Blake's idiosyncratic Laocoön exemplifies the ways that creators or editors respond to and describe images and the ways they use images to illuminate text. In entitling his plate "[Jah] & his two Sons [. . . ]" and filling it with descriptive text, Blake shares the focus of image-based humanities computing on images as things to be broken down, described, and understood.
    But Blake's classification and description, deformative in misreading the image, reveal the true nature of such mediation and the need for a more open system, one which allows observers to record how they interpret an image, perhaps best accomplished in image-based humanities computing through semantic web technologies like folksonomy tagging or collaborative wiki formats. And Blake's act of pulling a pre-existing image out of context and applying it to a new textual work suggests a new function for images and the highly structured image databases of image-based humanities computing: to clarify or complicate textual works through graphic mediation.

    Atti del IX Convegno Annuale dell'Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD). La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica

    Proceedings of the IX edition of the annual AIUCD conference.