Bag-of-Features Image Indexing and Classification in Microsoft SQL Server Relational Database
This paper presents a novel relational database architecture aimed at visual
object classification and retrieval. The framework is based on the
bag-of-features image representation model combined with the Support Vector
Machine classification and is integrated in a Microsoft SQL Server database. Comment: 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF),
Gdynia, Poland, 24-26 June 201
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution for a specific real-world problem, big data systems are no exception.
For the storage aspect of any big data system, the primary facet is the storage
infrastructure, and NoSQL seems to be the right technology to fulfill its
requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents a
feature and use case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
A Revision Control System for Image Editing in Collaborative Multimedia Design
Revision control is a vital component in the collaborative development of
artifacts such as software code and multimedia. While revision control has been
widely deployed for text files, very few attempts to control the versioning of
binary files can be found in the literature. This can be inconvenient for
graphics applications that use a significant amount of binary data, such as
images, videos, meshes, and animations. Existing strategies such as storing
whole files for individual revisions or simple binary deltas consume
significant storage and obscure semantic information, respectively. To overcome these
limitations, in this paper we present a revision control system for digital
images that stores revisions in the form of graphs. Besides being integrated with
Git, our revision control system also facilitates artistic creation processes
in common image editing and digital painting workflows. A preliminary user
study demonstrates the usability of the proposed system. Comment: pp. 512-517 (6 pages)
MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME
Computational experiments using spatial stochastic simulations have led to
important new biological insights, but they require specialized tools, a
complex software stack, as well as large and scalable compute and data analysis
resources due to the large computational cost associated with Monte Carlo
computational workflows. The complexity of setting up and managing a
large-scale distributed computation environment to support productive and
reproducible modeling can be prohibitive for practitioners in systems biology.
This results in a barrier to the adoption of spatial stochastic simulation
tools, effectively limiting the type of biological questions addressed by
quantitative modeling. In this paper, we present PyURDME, a new, user-friendly
spatial modeling and simulation package, and MOLNs, a cloud computing appliance
for distributed simulation of stochastic reaction-diffusion models. MOLNs is
based on IPython and provides an interactive programming platform for
development of sharable and reproducible distributed parallel computational
experiments.
Non-hierarchical Structures: How to Model and Index Overlaps?
Overlap is a common phenomenon seen when structural components of a digital
object are neither disjoint nor nested inside each other. Overlapping
components resist reduction to a structural hierarchy, and tree-based indexing
and query processing techniques cannot be used for them. Our solution to this
data modeling problem is TGSA (Tree-like Graph for Structural Annotations), a
novel extension of the XML data model for non-hierarchical structures. We
introduce an algorithm for constructing TGSA from annotated documents; the
algorithm can efficiently process non-hierarchical structures and is associated
with formal proofs, ensuring that transformation of the document to the data
model is valid. To enable high performance query analysis in large data
repositories, we further introduce an extension of XML pre-post indexing for
non-hierarchical structures, which can process both reachability and
overlapping relationships. Comment: The paper has been accepted at the Balisage 2014 conference.
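For orientation, the classic pre-post indexing scheme for ordinary trees, which the abstract above says it extends, can be sketched as follows. This is a minimal illustration of the standard tree technique only, not the TGSA extension or the authors' algorithm. The names `Node`, `pre_post_index` and `is_ancestor` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical minimal tree node; names are assumed to be unique.
    name: str
    children: list = field(default_factory=list)


def pre_post_index(root):
    """Assign (pre, post) ranks to every node in one depth-first traversal."""
    index = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]  # rank at first visit (preorder)
        counters["pre"] += 1
        for child in node.children:
            visit(child)
        index[node.name] = (pre, counters["post"])  # rank at exit (postorder)
        counters["post"] += 1

    visit(root)
    return index


def is_ancestor(index, a, b):
    """In a tree, a is a proper ancestor of b iff pre(a) < pre(b) and post(a) > post(b)."""
    pre_a, post_a = index[a]
    pre_b, post_b = index[b]
    return pre_a < pre_b and post_a > post_b


# Small example document tree (illustrative data only).
doc = Node("doc", [
    Node("chapter", [Node("para1"), Node("para2")]),
    Node("appendix"),
])
index = pre_post_index(doc)
```

With the two ranks stored per node, an ancestor test reduces to two integer comparisons, which is what makes the scheme attractive for query processing over large repositories. Overlapping components fall outside this tree-only test, which is the gap the abstract's extension targets.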
Pretty Private Group Management
Group management is a fundamental building block of today's Internet
applications. Mailing lists, chat systems, collaborative document editing, and
also online social networks such as Facebook and Twitter use group management
systems. In many cases, group security is required in the sense that access to
data is restricted to group members only. Some applications also require
privacy by keeping group members anonymous and unlinkable. Group management
systems routinely rely on a central authority that manages and controls the
infrastructure and data of the system. Personal user data related to groups
then becomes de facto accessible to the central authority. In this paper, we
propose a completely distributed approach for group management based on
distributed hash tables. As there is no enrollment with a central authority, the
created groups can be leveraged by various applications. Following this
paradigm we describe a protocol for such a system. We consider security and
privacy issues inherently introduced by removing the central authority and
provide a formal validation of security properties of the system using AVISPA.
We demonstrate the feasibility of this protocol by implementing a prototype
running on top of Vuze's DHT.