7,590 research outputs found
RDF, the semantic web, Jordan, Jordan and Jordan
This collection is addressed to archivists and library professionals, and so has a slight focus on implications for them. This chapter is nonetheless intended to be a more-or-less generic introduction to the Semantic Web and RDF, which isn't specific to that domain
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions
An eco-friendly hybrid urban computing network combining community-based wireless LAN access and wireless sensor networking
Computer-enhanced smart environments, distributed environmental monitoring, wireless communication, energy conservation and sustainable technologies, ubiquitous access to Internet-located data and services, user mobility and innovation as a tool for service differentiation are all significant contemporary research subjects and societal developments. This position paper presents the design of a hybrid municipal network infrastructure that, to a lesser or greater degree, incorporates aspects from each of these topics by integrating a community-based Wi-Fi access network with Wireless Sensor Network (WSN) functionality. The former component provides free wireless Internet connectivity by harvesting the Internet subscriptions of city inhabitants. To minimize session interruptions for mobile clients, this subsystem incorporates technology that achieves (near-)seamless handover between Wi-Fi access points. The WSN component on the other hand renders it feasible to sense physical properties and to realize the Internet of Things (IoT) paradigm. This in turn scaffolds the development of value-added end-user applications that are consumable through the community-powered access network. The WSN subsystem invests substantially in ecological considerations by means of a green distributed reasoning framework and sensor middleware that collaboratively aim to minimize the network's global energy consumption. Via the discussion of two illustrative applications that are currently being developed as part of a concrete smart city deployment, we offer a taste of the myriad of innovative digital services in an extensive spectrum of application domains that is unlocked by the proposed platform
Old Techniques for New Join Algorithms: A Case Study in RDF Processing
Recently there has been significant interest around designing specialized RDF
engines, as traditional query processing mechanisms incur orders of magnitude
performance gaps on many RDF workloads. At the same time researchers have
released new worst-case optimal join algorithms which can be asymptotically
better than the join algorithms in traditional engines. In this paper we apply
worst-case optimal join algorithms to a standard RDF workload, the LUBM
benchmark, for the first time. We do so using two worst-case optimal engines:
(1) LogicBlox, a commercial database engine, and (2) EmptyHeaded, our prototype
research engine with enhanced worst-case optimal join algorithms. We show that
without any added optimizations both LogicBlox and EmptyHeaded outperform two
state-of-the-art specialized RDF engines, RDF-3X and TripleBit, by up to 6x on
cyclic join queries, the queries where traditional optimizers are suboptimal. On
the remaining, less complex queries in the LUBM benchmark, we show that three
classic query optimization techniques enable EmptyHeaded to compete with RDF
engines, even when there is no asymptotic advantage to the worst-case optimal
approach. We validate that our design has merit as EmptyHeaded outperforms
MonetDB by three orders of magnitude and LogicBlox by two orders of magnitude,
while remaining within an order of magnitude of RDF-3X and TripleBit
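To make the idea of a worst-case optimal join on a cyclic query concrete, here is a minimal sketch in Python of an attribute-at-a-time join for the triangle query Q(a, b, c) = R(a, b), S(b, c), T(a, c). It is an illustration only and is not the implementation used in LogicBlox or EmptyHeaded; the function names and the toy edge list are assumptions. Rather than joining two relations and filtering against the third, it binds one attribute at a time by intersecting the candidate sets from every relation that mentions that attribute.

```python
from collections import defaultdict


def index(pairs):
    """Build a hash index: first attribute -> set of second attributes."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx


def triangle_join(R, S, T):
    """Enumerate (a, b, c) with (a, b) in R, (b, c) in S and (a, c) in T."""
    r, s, t = index(R), index(S), index(T)
    r_keys, s_keys, t_keys = set(r), set(s), set(t)
    out = []
    # Bind a: it must appear as a first attribute in both R and T.
    for a in r_keys & t_keys:
        # Bind b: it must follow a in R and start some tuple in S.
        for b in r[a] & s_keys:
            # Bind c: it must follow b in S and follow a in T.
            for c in s[b] & t[a]:
                out.append((a, b, c))
    return sorted(out)


# Toy directed edge list (hypothetical data), used for all three relations.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
triangles = triangle_join(edges, edges, edges)
print(triangles)
```

A pairwise plan would first materialize every two-edge path before checking the closing edge, and that intermediate result can be much larger than the final output. Intersecting per attribute avoids building it.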
Distributed RDF query processing and reasoning for big data / linked data
Title from PDF of title page, viewed on August 27, 2014
Thesis advisor: Yugyung Lee
Vita
Includes bibliographical references (pages 61-65)
Thesis (M. S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2014
The Linked Data Movement is aimed at converting unstructured and semi-structured
data in documents into semantically connected documents called the "web of data." It is
based on the Resource Description Framework (RDF), which represents semantic data as
statements; a collection of such statements forms an RDF graph. SPARQL is a query language
designed specifically to query RDF data. Linked Data faces the same challenges that Big Data
does. Together, Big Data and Linked Data form a new paradigm for identifying
massive amounts of data in a connected form, and both
continue to be in high demand. Therefore, we need a scalable and accessible query system
for the reusability and availability of existing web data. However, existing SPARQL query
systems are not sufficiently scalable for Big Data and Linked Data.
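As a small illustration of the RDF and SPARQL concepts described above, the following Python sketch models an RDF graph as a set of (subject, predicate, object) statements and matches a single triple pattern against it, roughly what a SPARQL query such as `SELECT ?who WHERE { ?who ex:advises ?student }` asks for. It is a toy under stated assumptions: the `ex:` names, the data, and the `match` helper are hypothetical, and it is not the API of Jena, ARQ, or the engine built in the thesis.

```python
def match(graph, pattern):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# A tiny RDF graph: each statement is a (subject, predicate, object) triple.
graph = {
    ("ex:alice", "ex:advises", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Student"),
    ("ex:carol", "ex:advises", "ex:dave"),
}

advisors = sorted(b["?who"] for b in match(graph, ("?who", "ex:advises", "?student")))
print(advisors)
```

This scans every statement for each pattern, which is the kind of cost that becomes a problem at scale and that indexes or distributed processing aim to reduce.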
In this thesis, we address an issue of how to improve the scalability and performance
of query processing with Big Data / Linked Data. Our aim is to evaluate and assess presently
available SPARQL query engines and develop an effective model to query RDF data that
should be scalable with reasoning capabilities. We designed an efficient and distributed
SPARQL engine using MapReduce (parallel and distributed processing for large data sets on
a cluster) and the Apache Cassandra database (a scalable and highly available peer-to-peer
distributed database system). We evaluated the existing in-memory ARQ engine
provided by the Jena framework and found that it cannot handle large datasets, because it
operates only in memory. The proposed model was shown to have
powerful reasoning capabilities and to deal efficiently with big datasets
Abstract -- Illustrations -- Tables -- Introduction -- Background and related work -- Graph-store based SPARQL model -- Graph-store based SPARQL model implementation -- Results and evaluation -- Conclusion and future work -- Reference
An ontology-based approach to Automatic Generation of GUI for Data Entry
This thesis reports an ontology-based approach to automatic generation of highly tailored GUI components that can make customized data requests for the end users. Using this GUI generator, a domain expert without any programming skills can browse the data schema through the ontology file of his/her own field, choose attribute fields according to the business's needs, and make a highly customized GUI for end users' data request input. The interface for the domain expert is a tree view structure that shows not only the domain taxonomy categories but also the relationships between classes. By clicking the checkbox associated with each class, the expert indicates his/her choice of the needed information. These choices are stored in a metadata document in XML. From the viewpoint of programmers, the metadata contains no ambiguity; every class in an ontology is unique. The metadata can be used in various ways; I have carried out the process of GUI generation. Since every class and every attribute in the class has been formally specified in the ontology, generating the GUI is automatic. This approach has been applied to a use case scenario in the meteorological and oceanographic (METOC) area. The resulting features of this prototype have been reported in this thesis