Towards MKM in the Large: Modular Representation and Scalable Software Architecture

Author: Kohlhase Michael
Rabe Florian
Zholudev Vyacheslav
Publication date: 01/01/2010

MKM has been defined as the quest for technologies to manage mathematical knowledge. MKM "in the small" is well-studied, so the real problem is to scale up to large, highly interconnected corpora: "MKM in the large". We contend that advances in two areas are needed to reach this goal. We need representation languages that support incremental processing of all primitive MKM operations, and we need software architectures and implementations that implement these operations scalably on large knowledge bases. We present instances of both in this paper: the MMT framework for modular theory-graphs that integrates meta-logical foundations, which forms the base of the next OMDoc version; and TNTBase, a versioned storage system for XML-based document formats. TNTBase becomes an MMT database by instantiating it with special MKM operations for MMT. Comment: To appear in The 9th International Conference on Mathematical Knowledge Management: MKM 201

arXiv.org e-Print Archive

CiteSeerX

Investigation into Indexing XML Data Techniques

Author: Joan Lu
Klaib Alhadi
Publication date: 21/07/2014

The rapid development of XML technology improves the WWW, since XML data has many advantages and has become a common technology for transferring data across the internet. Therefore, the objective of this research is to investigate and study the XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues. Furthermore, this research considers the most common XML indexing techniques and performs a comparison between them. Subsequently, this work makes an argument to find out these limitations. To conclude, the main problem of all the XML indexing techniques is the trade-off between the size and the efficiency of the indexes. So, all the indexes become large in order to perform well, and none of them is suitable for all users’ requirements. However, each one of these techniques has some advantages in somehow

University of Huddersfield Repository

Adaptation of scalable multimedia documents

Author: Benoît Pellan
Cyril Concolato
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

Several scalable media codecs have been standardized in recent years to cope with heterogeneous usage conditions and to aim at always providing audio, video and image content in the best possible quality. Today, interactive multimedia presentations are becoming accessible on handheld terminals and face the same adaptation challenges as the media elements they present: quite diversified screen, memory and processing power capabilities. In this paper, we address the adaptation of multimedia documents by applying the concept of scalability to their presentation. The Scalable MSTI document model introduced in this paper has been designed with two main requirements in mind. First, the adaptation process must be simple to execute because it may be performed on limited terminals in broadcast scenarios. Second, the adaptation process must be simple to describe so that authored adaptation directives can be transported along with the document with a limited bandwidth overhead. The Scalable MSTI model achieves both objectives by specifying Spatial, Temporal and Interactive scalability axes on which incremental authoring can be performed to create progressive presentation layers. Our experiments are conducted on scalable multimedia documents designed for Digital Radio services on DMB channels using MPEG-4 BIFS and also for web services using XHTML, SVG, SMIL and Flash. A scalable image gallery is described throughout this article and illustrates the features offered by our document model in a rich multimedia example

CiteSeerX

Crossref

A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

Author: Hassanzadeh Hamed
Keyvanpour MohammadReza
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 26/04/2011

The Semantic Web is an extension of the current web in which information is given well-defined meaning. The perspective of the Semantic Web is to promote the quality and intelligence of the current web by changing its contents into machine-understandable form. Therefore, semantic level information is one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles to Semantic Annotation, such as multilinguality, scalability, and issues related to diversity and inconsistency in the content of different web pages. Due to the wide range of domains and the dynamic environments in which Semantic Annotation systems must operate, automating the annotation process is one of the significant challenges in this domain. To overcome this problem, different machine learning approaches such as supervised learning, unsupervised learning and more recent ones like semi-supervised learning and active learning have been utilized. In this paper we present an inclusive layered classification of Semantic Annotation challenges and discuss the most important issues in this field. Also, we review and analyze machine learning applications for solving semantic annotation problems. For this goal, the article tries to closely study and categorize related research for better understanding and to reach a framework that can map machine learning techniques onto the Semantic Annotation challenges and requirements

arXiv.org e-Print Archive

Crossref

A horizontally-scalable multiprocessing platform based on Node.js

Author: Maatouki Ahmad
Meyer Jörg
Streit Achim
Szuba Marek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/07/2015

This paper presents a scalable web-based platform called Node Scala, which allows requests to be split and handled on a parallel distributed system according to pre-defined use cases. We applied this platform to a client application that visualizes climate data stored in the NoSQL database MongoDB. The design of Node Scala leads to efficient usage of available computing resources in addition to allowing the system to scale simply by adding new workers. Performance evaluation of Node Scala demonstrated a gain of up to 74% compared to the state-of-the-art techniques. Comment: 8 pages, 7 figures. Accepted for publication as a conference paper for the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA-15

arXiv.org e-Print Archive

Crossref

Efficient Correlated Topic Modeling with Topic Embedding

Author: Berg-Kirkpatrick Taylor
He Junxian
Hu Zhiting
Huang Ying
Xing Eric P.
Publication date: 01/07/2017

Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing previous cubic or quadratic time complexity to linear w.r.t. the topic size. We further speed up variational inference with a fast sampler to exploit sparsity of topic occurrence. Extensive experiments show that our approach is capable of handling model and data scales which are several orders of magnitude larger than existing correlation results, without sacrificing modeling quality by providing competitive or superior performance in document classification and retrieval. Comment: KDD 2017 oral. The first two authors contributed equall

arXiv.org e-Print Archive

Crossref

Impliance: A Next Generation Information Management Appliance

Author: Bhattacharjee Bishwaranjan
Ercegovac Vuk
Glider Joseph
Golding Richard
Lohman Guy
Markl Volke
Pirahesh Hamid
Rao Jun
Rees Robert
Reiss Frederick
Shekita Eugene
Swart Garret
Publication date: 22/12/2006

ably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely the ideas of: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data - unstructured as well as structured - in a uniform way; (c) achieving scale-out by exploiting simple, massive parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises. Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, US

arXiv.org e-Print Archive

CiteSeerX