1,198 research outputs found

A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003

Author: Price S
Publication venue: Department of Computer Science, University of Bristol
Publication date: 01/01/2004

Data De-Duplication in NoSQL Databases

Author: Brad Nicoleta
Publication venue: 'University of Saskatchewan Library'

With the popularity and expansion of Cloud Computing, NoSQL databases (DBs) are becoming the preferred choice for storing data in the Cloud. Because they are highly de-normalized, these DBs tend to store significant amounts of redundant data. Data de-duplication (DD) plays an important role in reducing storage consumption, keeping data affordable to manage amid today’s explosive data growth. Numerous DD methodologies, such as chunking and delta encoding, are available today to optimize the use of storage. These technologies approach DD at the file and/or sub-file level, but this approach has never been optimal for NoSQL DBs. This research proposes Data De-Duplication in NoSQL Databases (DDNSDB), which applies DD at a higher level of abstraction, namely at the DB level. It uses structural information about the data (metadata), exploiting its granularity to identify and remove duplicates. The main goals of this research are: to maximally reduce the number of duplicates in one type of NoSQL DB, namely the key-value store; to maximize process performance so that the backup window is only marginally affected; and to design with horizontal scaling in mind so that it runs competitively on a Cloud Platform. Additionally, this research presents an analysis of the various types of NoSQL DBs (such as key-value, tabular/columnar, and document DBs) to understand the data models required for the design and implementation of DDNSDB. Primary experiments have demonstrated that DDNSDB can further reduce the NoSQL DB storage space compared with current archiving methods (from 17% to nearly 69% as more structural information is available). Also, by following an optimized, adapted MapReduce architecture, DDNSDB proves to have a competitive performance advantage in a horizontal scaling cloud environment compared with a vertical scaling environment (from 28.8 milliseconds to 34.9 milliseconds as the number of parallel Virtual Machines grows).
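
To make the idea of de-duplicating at the level of individual key-value entries (rather than files or chunks) concrete, here is a minimal sketch. It is not the DDNSDB algorithm, which the abstract does not spell out; it assumes a plain in-memory dictionary as the store and uses a generic content hash of each value to detect duplicates. The function names `deduplicate` and `restore` and the sample keys are hypothetical.

```python
import hashlib


def deduplicate(store):
    """Split a key-value store into unique values plus per-key references.

    Illustrative assumption: values are bytes and two values are duplicates
    exactly when their SHA-256 digests match.
    """
    blobs = {}  # digest -> the single stored copy of a value
    refs = {}   # original key -> digest of its value
    for key, value in store.items():
        digest = hashlib.sha256(value).hexdigest()
        blobs.setdefault(digest, value)  # keep only the first copy
        refs[key] = digest
    return blobs, refs


def restore(blobs, refs):
    """Rebuild the original key-value mapping from the de-duplicated form."""
    return {key: blobs[digest] for key, digest in refs.items()}


# Hypothetical sample data: two of the three values are identical.
store = {
    "user:1:city": b"Saskatoon",
    "user:2:city": b"Saskatoon",
    "user:3:city": b"Regina",
}
blobs, refs = deduplicate(store)
```

In this sketch the three entries collapse to two stored values, and `restore(blobs, refs)` returns a mapping equal to the original `store`. A metadata-aware approach such as the one the abstract describes would presumably decide what counts as a comparable unit from the DB's structure rather than hashing whole values.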

eCommons@USASK

University of Saskatchewan Research Archive

The C Object System: Using C as a High-Level Object-Oriented Language

Author: Deniau Laurent
Publication date: 12/03/2010

The C Object System (Cos) is a small C library which implements high-level concepts available in Clos, Objc and other object-oriented programming languages: uniform object model (class, meta-class and property-metaclass), generic functions, multi-methods, delegation, properties, exceptions, contracts and closures. Cos relies on the programmable capabilities of the C programming language to extend its syntax and to implement the aforementioned concepts as first-class objects. Cos aims at satisfying several general principles like simplicity, extensibility, reusability, efficiency and portability which are rarely met in a single programming language. Its design is tuned to provide efficient and portable implementation of message multi-dispatch and message multi-forwarding which are the heart of code extensibility and reusability. With Cos features in hand, software should become as flexible and extensible as with scripting languages and as efficient and portable as expected with C programming. Likewise, Cos concepts should significantly simplify adaptive and aspect-oriented programming as well as distributed and service-oriented computing.
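
For readers unfamiliar with multi-methods, the following minimal sketch shows what dispatching a message on the run-time types of two arguments means. It is written in Python purely for illustration and says nothing about how Cos implements dispatch in C; the names `defmethod`, `send`, `Shape`, `Circle` and `Square` are hypothetical.

```python
# Table of implementations, keyed by message name and both argument types.
_methods = {}


def defmethod(name, type_a, type_b):
    """Register an implementation of `name` for the pair (type_a, type_b)."""
    def register(fn):
        _methods[(name, type_a, type_b)] = fn
        return fn
    return register


def send(name, a, b):
    """Dispatch on the run-time types of BOTH arguments.

    Walks the method resolution order of each argument, so a more specific
    match on the first argument is preferred, then on the second.
    """
    for type_a in type(a).__mro__:
        for type_b in type(b).__mro__:
            fn = _methods.get((name, type_a, type_b))
            if fn is not None:
                return fn(a, b)
    raise TypeError(f"no method {name!r} for "
                    f"({type(a).__name__}, {type(b).__name__})")


class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass


@defmethod("collide", Shape, Shape)
def _collide_generic(a, b):
    return "generic"


@defmethod("collide", Circle, Square)
def _collide_circle_square(a, b):
    return "circle-square"
```

Here `send("collide", Circle(), Square())` selects the specialised method, while `send("collide", Square(), Circle())` falls back to the `(Shape, Shape)` method. Single-dispatch languages choose an implementation from the first argument alone, which is the limitation multi-methods remove.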

arXiv.org e-Print Archive

NeuroVault.org: a web-based repository for collecting and sharing unthresholded statistical maps of the human brain

Author: Camille Maumet
Daniel S. Margulies
Gabriel Rivera
Gael Varoquaux
Jean-Baptiste Poline
Krzysztof Jacek Gorgolewski
Russell A. Poldrack
Satrajit S Ghosh
Tal Yarkoni
Thomas E. Nichols
Vanessa V Sochat
Yannick Schwartz
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

Here we present NeuroVault, a web-based repository that allows researchers to store, share, visualize, and decode statistical maps of the human brain. NeuroVault is easy to use and employs modern web technologies to provide informative visualization of data without the need to install additional software. In addition, it leverages the power of the Neurosynth database to provide cognitive decoding of deposited maps. The data are exposed through a public REST API, enabling other services and tools to take advantage of them. NeuroVault is a new resource for researchers interested in conducting meta- and coactivation analyses.

Directory of Open Access Journals

INRIA a CCSD electronic archive server

Frontiers - Publisher Connector

Warwick Research Archives Portal Repository