8,674 research outputs found

    Main Memory Adaptive Indexing for Multi-core Systems

    Full text link
    Adaptive indexing is a concept that treats index creation in databases as a by-product of query processing, as opposed to traditional full index creation, where the indexing effort is performed up front, before any queries are answered. Adaptive indexing has received considerable attention, and several algorithms have been proposed over the past few years, including a recent experimental study comparing a large number of existing methods. Until now, however, most adaptive indexing algorithms have been designed as single-threaded; with multi-core systems well established, designing parallel algorithms for adaptive indexing is a natural next step. So far, only one parallel algorithm for adaptive indexing has appeared in the literature: the parallel version of standard cracking. In this paper we describe three alternative parallel algorithms for adaptive indexing, including a second variant of a parallel standard cracking algorithm. Additionally, we describe a hybrid parallel sorting algorithm and a NUMA-aware method based on sorting. We then thoroughly compare all these algorithms experimentally, along with a variant of a recently published parallel version of radix sort; parallel sorting algorithms serve as a realistic baseline for multi-threaded adaptive indexing techniques. In total we experimentally compare seven parallel algorithms, and we extensively profile all of them. The initial set of experiments considered in this paper indicates that our parallel algorithms significantly improve over previously known ones. Our results suggest that, although adaptive indexing algorithms are a good design choice in single-threaded environments, the rules change considerably in the parallel case: in future highly parallel environments, sorting algorithms could be serious alternatives to adaptive indexing. Comment: 26 pages, 7 figures
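
    As a rough illustration of the standard cracking idea that this line of work builds on, here is a minimal single-threaded sketch in Python: each range query partitions the column in place around the query bounds and records the resulting pivots, so the index emerges as a by-product of querying. The class and function names are our own, not the paper's.

```python
import bisect

def crack_in_two(col, lo_idx, hi_idx, pivot):
    """Partition col[lo_idx:hi_idx] in place so values < pivot
    precede values >= pivot; return the split position."""
    i, j = lo_idx, hi_idx - 1
    while i <= j:
        if col[i] < pivot:
            i += 1
        else:
            col[i], col[j] = col[j], col[i]
            j -= 1
    return i

class CrackerColumn:
    """Toy single-threaded standard cracking: each range query
    refines the physical order of the column as a side effect."""
    def __init__(self, values):
        self.col = list(values)
        self.pivots = []      # sorted pivot values seen so far
        self.positions = []   # split position for each pivot

    def _bounds(self, value):
        # Narrow the work to the one piece that may contain `value`.
        k = bisect.bisect_left(self.pivots, value)
        lo = self.positions[k - 1] if k > 0 else 0
        hi = self.positions[k] if k < len(self.positions) else len(self.col)
        return lo, hi, k

    def _crack(self, value):
        lo, hi, k = self._bounds(value)
        pos = crack_in_two(self.col, lo, hi, value)
        self.pivots.insert(k, value)
        self.positions.insert(k, pos)
        return pos

    def range_query(self, low, high):
        a = self._crack(low)   # first index with value >= low
        b = self._crack(high)  # first index with value >= high
        return self.col[a:b]

col = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
print(sorted(col.range_query(4, 13)))  # -> [4, 7, 9, 12]
```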

    An adaptive hierarchical domain decomposition method for parallel contact dynamics simulations of granular materials

    Full text link
    A fully parallel version of the contact dynamics (CD) method is presented in this paper. For large enough systems, 100% efficiency has been demonstrated for up to 256 processors using a hierarchical domain decomposition with dynamic load balancing. The iterative scheme that calculates the contact forces is kept domain-wise sequential, with data exchange after each iteration step, which ensures its stability. The number of additional iterations that the partially parallel updates at the domain boundaries require for convergence becomes negligible as the number of particles increases, which allows for an effective parallelization. Compared to the sequential implementation, we found no influence of the parallelization on the simulation results. Comment: 19 pages, 15 figures, published in Journal of Computational Physics (2011)
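
    The control flow of such a scheme, domain-wise sequential sweeps with boundary data exchanged once per global iteration, can be emulated serially. The toy below relaxes values on a 1-D chain in place of the actual contact-force iteration, purely to illustrate the update/exchange structure; the relaxation rule and all names are our own assumptions, not the paper's method.

```python
def parallel_style_smooth(values, domains, tol=1e-10, max_iter=10000):
    """values: list of floats; domains: list of (start, end) index ranges.
    Within each domain the sweep is sequential (Gauss-Seidel-like);
    across domain boundaries only the snapshot from the last 'exchange'
    is visible, mimicking per-iteration data exchange."""
    for it in range(max_iter):
        snapshot = values[:]              # boundary data as of last exchange
        delta = 0.0
        for start, end in domains:        # one process per domain in parallel
            for i in range(max(start, 1), min(end, len(values) - 1)):
                left = values[i - 1] if i - 1 >= start else snapshot[i - 1]
                right = values[i + 1] if i + 1 < end else snapshot[i + 1]
                new = 0.5 * (left + right)
                delta = max(delta, abs(new - values[i]))
                values[i] = new
            # in the parallel code, boundary data would be exchanged here
        if delta < tol:
            return it
    return max_iter

vals = [0.0] + [1.0] * 8 + [0.0]          # fixed ends, relaxed interior
iters = parallel_style_smooth(vals, domains=[(0, 5), (5, 10)])
print(iters, [round(v, 3) for v in vals])
```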

    Complexity vs. performance in granular embedding spaces for graph classification

    Get PDF
    The most distinctive trait of structural pattern recognition in the graph domain is the ability to deal with the organization of, and relations between, the constituent entities of a pattern. Even though this can be convenient and/or necessary in many contexts, most state-of-the-art classification techniques cannot be deployed directly in the graph domain without first embedding graph patterns into a metric space. Granular Computing is a powerful information processing paradigm that can be employed to drive the synthesis of automatic embedding spaces from structured domains. In this paper we investigate several classification techniques built on Granular Computing-based embedding procedures and provide a thorough overview in terms of model complexity, embedding-space complexity, and performance on several open-access datasets for graph classification. We observe that certain classification techniques, such as non-linear SVMs, perform poorly in terms of both complexity and learning performance, suggesting that the high dimensionality of the synthesized embedding space can negatively affect the effectiveness of these approaches. On the other hand, linear support vector machines, neuro-fuzzy networks, and nearest-neighbour classifiers achieve comparable accuracy, with the second being the most competitive in terms of structural complexity and the last in terms of embedding-space dimensionality.
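
    To make the embedding step concrete, here is a minimal sketch of a symbolic-histogram-style embedding: each graph becomes a vector of occurrence counts over a shared alphabet of granules, after which any standard classifier applies. Using labelled edges as the granules is our simplifying assumption; the granules in this line of work are generally richer subgraphs.

```python
from collections import Counter

def edge_granules(graph):
    """graph: iterable of (label_u, label_v) labelled edges."""
    return Counter(tuple(sorted(e)) for e in graph)

def embed(graphs):
    """Build a shared granule alphabet, then one count vector per graph."""
    alphabet = sorted({g for G in graphs for g in edge_granules(G)})
    index = {g: i for i, g in enumerate(alphabet)}
    vectors = []
    for G in graphs:
        v = [0] * len(alphabet)
        for g, c in edge_granules(G).items():
            v[index[g]] = c
        vectors.append(v)
    return vectors, alphabet

# Toy usage: two tiny labelled graphs embedded into a common space.
G1 = [("C", "H"), ("C", "H"), ("C", "O")]
G2 = [("C", "O"), ("O", "H")]
X, alphabet = embed([G1, G2])
print(alphabet)  # granule alphabet shared by all graphs
print(X)         # symbolic-histogram vectors, ready for a linear SVM
```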

    A granular approach to web search result presentation

    Get PDF
    In this paper we propose and evaluate interfaces for presenting the results of web searches. Sentences taken from the top retrieved documents are used as fine-grained representations of document content and, when combined in a ranked list, provide a query-specific overview of the set of retrieved documents. Current search engine interfaces assume users examine such results document by document; in contrast, our approach groups, ranks, and presents the contents of the top-ranked document set. We evaluate our hypotheses that such an approach can lead to more effective web searching and to increased user satisfaction. Our evaluation, with real users and different types of information-seeking scenarios, showed with statistical significance that these hypotheses hold.
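
    A hedged sketch of the sentence-based overview idea: sentences from the top-ranked documents are scored against the query and merged into one ranked list. The term-overlap scoring below is an illustrative assumption, not the paper's actual ranking function.

```python
import re

def sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def score(sentence, query_terms):
    """Fraction of a sentence's words that match the query."""
    words = set(re.findall(r"\w+", sentence.lower()))
    return len(words & query_terms) / (len(words) or 1)

def overview(query, top_documents, k=5):
    """Rank sentences from all top documents into one query-specific list."""
    terms = set(query.lower().split())
    scored = [(score(s, terms), s)
              for doc in top_documents
              for s in sentences(doc)]
    return [s for _, s in sorted(scored, reverse=True)[:k]]

docs = ["Granular computing is a paradigm. It builds granules.",
        "Search engines rank documents."]
print(overview("granular computing", docs, k=2))
```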

    Mistakes in medical ontologies: Where do they come from and how can they be detected?

    Get PDF
    We present the details of a methodology for quality assurance in large medical terminologies and describe three algorithms that can help terminology developers and users identify potential mistakes. The methodology is based partly on linguistic criteria and partly on logical and ontological principles governing sound classifications. We conclude by outlining the results of applying the methodology, in the form of a taxonomy of the different types of errors and potential errors detected in SNOMED-CT.
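
    As an illustration of the flavour of such checks (our own guesses, not a reproduction of the paper's algorithms), the sketch below flags term names containing conjunctions, which in SNOMED-CT-style terminologies often signal a single term encoding several concepts, and detects cycles in an is-a hierarchy, which a sound classification must not contain.

```python
def flag_conjunctive_terms(terms):
    """Linguistic check: term names containing 'and'/'or' are suspicious."""
    suspicious = (" and ", " or ", " and/or ")
    return [t for t in terms
            if any(tok in f" {t.lower()} " for tok in suspicious)]

def find_isa_cycles(parents):
    """Structural check. parents: dict term -> set of is-a parents;
    flags any term reachable from itself along the hierarchy."""
    def reachable(t, seen):
        for p in parents.get(t, ()):
            if p in seen or reachable(p, seen | {p}):
                return True
        return False
    return [t for t in parents if reachable(t, {t})]

print(flag_conjunctive_terms(["Disorder of lung and pleura", "Asthma"]))
print(find_isa_cycles({"A": {"B"}, "B": {"A"}}))  # -> ['A', 'B']
```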

    MorphoSys: efficient colocation of QoS-constrained workloads in the cloud

    Full text link
    In hosting environments such as IaaS clouds, desirable application performance is usually guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated for unencumbered use for proper operation. Arbitrary colocation of applications with different SLAs on a single host may result in inefficient utilization of the host's resources. In this paper, we propose that periodic resource allocation and consumption models -- often used to characterize real-time workloads -- be used for a more granular expression of SLAs. Our proposed SLA model has the salient feature that it exposes flexibilities that enable the infrastructure provider to safely transform SLAs from one form to another for the purpose of achieving more efficient colocation. Towards that goal, we present MORPHOSYS: a framework for a service that allows the manipulation of SLAs to enable efficient colocation of arbitrary workloads in a dynamic setting. We present results from extensive trace-driven simulations of colocated Video-on-Demand servers in a cloud setting. These results show that potentially significant reductions in wasted resources (by as much as 60%) are possible using MORPHOSYS. National Science Foundation (0720604, 0735974, 0820138, 0952145, 1012798)
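
    To make the periodic model concrete: an SLA (C, T) reserves C capacity units in every window of T time units, i.e. a utilization of C/T. The colocation test below uses the simple sum-of-utilizations bound purely for illustration; MORPHOSYS's actual safety conditions and SLA transformations are richer than this.

```python
from fractions import Fraction

def utilization(sla):
    """sla: (C, T) -> fraction of the host this workload reserves."""
    c, t = sla
    return Fraction(c, t)

def fits_on_host(slas, capacity=1):
    """True if the workloads' combined utilization fits the host."""
    return sum(map(utilization, slas)) <= capacity

# Toy example: (1, 4) means 1 unit of capacity in every 4-unit window.
workloads = [(1, 4), (2, 5), (1, 3)]
print(fits_on_host(workloads))  # 1/4 + 2/5 + 1/3 = 59/60 <= 1 -> True
```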

    Security and Privacy Issues of Big Data

    Get PDF
    This chapter reviews the most important aspects of how computing infrastructures should be configured and intelligently managed to fulfill the most notable security requirements of Big Data applications. One of these is privacy: a pertinent concern, because users share ever more personal data and content from their devices and computers to social networks and public clouds, which makes security frameworks for social networks a very active research topic. This topic is addressed, with case studies, in one of the two main sections of the chapter. In addition, traditional security mechanisms such as firewalls and demilitarized zones are not suitable for computing systems that support Big Data. SDN is an emergent management solution that could become a convenient mechanism for implementing security in Big Data systems, as we show through a second case study at the end of the chapter. The chapter also discusses relevant current work and identifies open issues. Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201

    On Information Granulation via Data Filtering for Granular Computing-Based Pattern Recognition: A Graph Embedding Case Study

    Get PDF
    Granular Computing is a powerful information processing paradigm, particularly useful for the synthesis of pattern recognition systems in structured domains (e.g., graphs or sequences). According to this paradigm, granules of information play the pivotal role of describing the underlying (possibly complex) process, starting from the available data. From a pattern recognition viewpoint, granules of information can be exploited for the synthesis of semantically sound embedding spaces, in which common supervised or unsupervised problems can be solved via standard machine learning algorithms. In this companion paper, we follow our previous work (Martino et al. in Algorithms 15(5):148, 2022) in comparing different strategies for the automatic synthesis of information granules in the context of graph classification. These strategies differ mainly in the specific topology adopted for the subgraphs considered as candidate information granules and in whether the ground-truth class labels are used or neglected during the granulation process. In contrast to our previous work, we employ a filtering-based approach for the synthesis of information granules instead of a clustering-based one. Computational results on 6 open-access data sets corroborate the robustness of our filtering-based approach with respect to data stratification, compared to a clustering-based granulation stage.
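
    As a concrete, hedged sketch of a filtering-based granulation stage (as opposed to a clustering-based one): candidate granules can be kept or discarded by a simple frequency filter over the training graphs. The thresholds and the notion of "candidate" below are illustrative assumptions, not the paper's actual procedure.

```python
from collections import Counter

def filter_granules(candidate_lists, min_support=0.1, max_support=0.9):
    """candidate_lists: one list of candidate granules per training graph.
    Keeps granules that are neither too rare nor trivially ubiquitous."""
    n = len(candidate_lists)
    df = Counter()
    for cands in candidate_lists:
        df.update(set(cands))  # document frequency: one count per graph
    return {g for g, c in df.items()
            if min_support <= c / n <= max_support}

# Toy usage: 'ring' appears in 3/4 graphs and is filtered out as ubiquitous.
train = [["ring", "chain"], ["ring"], ["ring", "star"], ["chain"]]
print(filter_granules(train, 0.25, 0.6))  # -> {'chain', 'star'}
```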

    Analyzing Methods and Opportunities in Software-Defined (SDN) Networks for Data Traffic Optimizations

    Get PDF
    Computer networks are dynamic and require constant updating and monitoring of operations to cope with the growing volume of data traffic. This raises a number of cost issues, as well as performance-management and tuning challenges: delivering granular quality of service (QoS), balancing the data load, and controlling the occurrence of bottlenecks. As an alternative, a new programmable network paradigm has emerged under the name of Software-Defined Networking (SDN). SDN decouples the data plane from the control of the network: a programmable controller is responsible for managing the rules that route data to the various devices, while the hardware that remains in the network data path simply forwards packets quickly according to these rules. In this context, this article studies the different methods and approaches used in the literature to optimize network data traffic through SDN. In particular, this study differs from other SDN reviews in that it focuses on issues such as QoS, load balancing, and congestion control. Finally, in addition to reviewing the state of the art of SDN in these areas, it presents a survey of future challenges and research opportunities.
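
    The control/data-plane split can be illustrated with a toy match-action table: a "controller" installs prioritized rules, and the "switch" merely looks packets up against them. The field names mimic OpenFlow-style matches, but the classes below are illustrative, not a real SDN API.

```python
class Switch:
    """Data plane: a dumb, fast match-action lookup table."""
    def __init__(self):
        self.flow_table = []  # (priority, match_dict, action) rules

    def install_rule(self, priority, match, action):
        # In a real deployment the controller pushes these over the network.
        self.flow_table.append((priority, match, action))
        self.flow_table.sort(key=lambda r: -r[0])  # highest priority first

    def forward(self, packet):
        for _, match, action in self.flow_table:
            if all(packet.get(k) == v for k, v in match.items()):
                return action
        return "send_to_controller"  # table miss: ask the control plane

# The controller pushes a QoS-style rule: video traffic takes a fast path.
sw = Switch()
sw.install_rule(10, {"dst_port": 554}, "output:fast_path")
sw.install_rule(1, {}, "output:default")
print(sw.forward({"dst_port": 554}))  # -> output:fast_path
print(sw.forward({"dst_port": 80}))   # -> output:default
```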