Towards OpenMath Content Dictionaries as Linked Data
"The term 'Linked Data' refers to a set of best practices for publishing and
connecting structured data on the web". Linked Data make the Semantic Web work
practically, which means that information can be retrieved without complicated
lookup mechanisms, that a lightweight semantics enables scalable reasoning, and
that the decentralized nature of the Web is respected. OpenMath Content
Dictionaries (CDs) have the same characteristics - in principle, but not yet in
practice. The Linking Open Data movement has made a considerable practical
impact: Governments, broadcasting stations, scientific publishers, and many
more actors are already contributing to the "Web of Data". Queries can be
answered in a distributed way, and services aggregating data from different
sources are replacing hard-coded mashups. However, these services are currently
entirely lacking mathematical functionality. I will discuss real-world
scenarios, where today's RDF-based Linked Data do not quite get their job done,
but where an integration of OpenMath would help - were it not for certain
conceptual and practical restrictions. I will point out conceptual shortcomings
in the OpenMath 2 specification and common bad practices in publishing CDs and
then propose concrete steps to overcome them and to contribute OpenMath CDs to
the Web of Data. Comment: Presented at the OpenMath Workshop 2010, http://cicm2010.cnam.fr/om
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
On the storage side, the primary facet is the storage infrastructure, and
NoSQL seems to be the right technology to fulfill its requirements. However,
every big data application has different data characteristics, and thus the
corresponding data fits a different data model. This paper presents a
feature and use case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
Knowledge Management for Foundations: Planning Study
Outlines objectives, methodologies, and issues for components of a study on knowledge management among foundations and solutions to challenges: existing practice, a market study, copyright issues, technical standards, taxonomies, and a pilot repository.
Km4City Ontology Building vs Data Harvesting and Cleaning for Smart-city Services
Presently, a very large number of public and private data sets are available
from local governments. In most cases, they are not semantically interoperable,
and a huge human effort would be needed to create integrated ontologies and
a knowledge base for a smart city. Smart-city ontologies are not yet standardized, and
a lot of research work is needed to identify models that can easily support
data reconciliation and the management of complexity, and that allow reasoning
over the data. In this paper, a system is proposed for the ingestion and reconciliation of
smart-city-related data such as the road graph, services available on the roads,
traffic sensors, etc. The system allows managing a large volume
of data coming from a variety of sources, considering both static and dynamic
data. These data are mapped to a smart-city ontology, called KM4City (Knowledge
Model for City), and stored into an RDF-Store where they are available for
applications via SPARQL queries to provide new services to the users via
specific applications of public administration and enterprises. The paper
presents the process adopted to produce the ontology and the big data
architecture for feeding the knowledge base from open and private
data, and the mechanisms adopted for data verification, reconciliation and
validation. Some examples of the possible usage of the resulting coherent big data
knowledge base are also offered and are accessible from the RDF-Store
and related services. The article also presents the work performed on
reconciliation algorithms and their comparative assessment and selection.
How FAIR can you get? Image Retrieval as a Use Case to calculate FAIR Metrics
A large number of services for research data management strive to adhere to
the FAIR guiding principles for scientific data management and stewardship. To
evaluate these services and to indicate possible improvements, use-case-centric
metrics are needed as an addendum to existing metric frameworks. The retrieval
of spatially and temporally annotated images can exemplify such a use case. The
prototypical implementation indicates that currently no research data
repository achieves the full score. Suggestions on how to increase the score
include automatic annotation based on the metadata inside the image file and
support for content negotiation to retrieve the images. These and other
insights can lead to an improvement of data integration workflows, resulting in
a better and more FAIR approach to manage research data. Comment: This is a preprint for a paper accepted for the 2018 IEEE conferenc
The CENDARI White Book of Archives
Over the course of its four-year project timeline, the CENDARI project has
collected archival descriptions and metadata in various formats from a broad
range of cultural heritage institutions. These data were drawn together in a
single repository and are being stored there. The repository contains curated
data which has been manually established by the CENDARI team as well as data
acquired from small, “hidden” archives in spreadsheet format or from big
aggregators with advanced data exchange tools in place. While the acquisition
and curation of heterogeneous data in a single repository presents a technical
challenge in itself, the ingestion of data into the CENDARI repository also
opens up the possibility to process and index them through data extraction,
entity recognition, semantic enhancement and other transformations. In this
way the CENDARI project was able to act as a bridge between cultural heritage
institutions and historical researchers, insofar as it drew together holdings
from a broad range of institutions and enabled the browsing of this
heterogeneous content within a single search space. This paper describes a
broad range of ways in which the CENDARI project acquired data from cultural
heritage institutions as well as the necessary technical background. In
exemplifying diverse data creation or acquisition strategies, multiple formats
and technical solutions, assets and drawbacks of a repository, this “White
Book” aims at providing guidance and advice as well as best practices for
archivists and cultural heritage institutions collaborating or planning to
collaborate with infrastructure projects. http://www.cendari.eu/thematic-research-guides/white-book-archives
The CENDARI White Book of Archives. Available from: http://hdl.handle.net/2262/7568
Web 2.0 technologies for learning: the current landscape – opportunities, challenges and tensions: supplementary materials
These supplementary materials accompany the report “Web 2.0 technologies for learning: the current landscape – opportunities, challenges and tensions”, which is the first report from research commissioned by Becta into Web 2.0 technologies for learning at Key Stages 3 and 4. This report describes findings from the commissioned literature review of the then current landscape concerning learner use of Web 2.0 technologies and the implications for teachers, schools, local authorities and policy makers.
The Research Object Suite of Ontologies: Sharing and Exchanging Research Data and Methods on the Open Web
Research in life sciences is increasingly being conducted in a digital and
online environment. In particular, life scientists have been pioneers in
embracing new computational tools to conduct their investigations. To support
the sharing of digital objects produced during such research investigations, we
have witnessed in the last few years the emergence of specialized repositories,
e.g., DataVerse and FigShare. Such repositories provide users with the means to
share and publish datasets that were used or generated in research
investigations. While these repositories have proven their usefulness,
interpreting and reusing evidence for most research results is a challenging
task. Additional contextual descriptions are needed to understand how those
results were generated and/or the circumstances under which they were
concluded. Because of this, scientists are calling for models that go beyond
the publication of datasets to systematically capture the life cycle of
scientific investigations and provide a single entry point to access the
information about the hypothesis investigated, the datasets used, the
experiments carried out, the results of the experiments, the people involved in
the research, etc. In this paper we present the Research Object (RO) suite of
ontologies, which provide a structured container to encapsulate research data
and methods along with essential metadata descriptions. Research Objects are
portable units that enable the sharing, preservation, interpretation and reuse
of research investigation results. The ontologies we present have been designed
in the light of requirements that we gathered from life scientists. They have
been built upon existing popular vocabularies to facilitate interoperability.
Furthermore, we have developed tools to support the creation and sharing of
Research Objects, thereby promoting and facilitating their adoption. Comment: 20 page
DRIVER Technology Watch Report
This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. Its objective is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. The report consists of two main parts: one focuses on interoperability standards for enhanced publications; the other consists of three subchapters, which give a landscape picture of current and surfacing technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.