49,389 research outputs found

DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks

Author: Ghadiri Nasser
Maleki Erfan Farhangi
Maleki Zeinab
Shahreza Maryam Lotfi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

Background and Objective: Heterogeneous complex networks are large graphs consisting of different types of nodes and edges. Knowledge extraction from these networks is complicated. Moreover, the scale of these networks is steadily increasing. Thus, scalable methods are required. Methods: In this paper, two distributed label propagation algorithms for heterogeneous networks, namely DHLP-1 and DHLP-2, are introduced. Biological networks are one type of heterogeneous complex network. As a case study, we have measured the efficiency of our proposed DHLP-1 and DHLP-2 algorithms on a biological network consisting of drugs, diseases, and targets. The subject we have studied in this network is drug repositioning, but our algorithms can be used as general methods for heterogeneous networks other than biological networks. Results: We compared the proposed algorithms with their non-distributed counterparts, MINProp and Heter-LP. The experiments revealed the good performance of the algorithms in terms of running time and accuracy. Comment: Source code available for Apache Giraph on Hadoo

arXiv.org e-Print Archive

Western Sydney ResearchDirect

Recommended from our members

FutureGRID: A Program for long-term research into GRID systems architecture

Author: Crowcroft Jon
Hand SM
Harris TL
Herbert AJ
Parker Michael A
Pratt IA
Publication date: 26/06/2008

Proceedings of the 2003 UK e-Science All Hands Meeting, 31st August - 3rd September, Nottingham UK. This is a project to carry out research into long-term GRID architecture, in the University of Cambridge Computer Laboratory and the Cambridge eScience Center, with support from the Microsoft Research Laboratory, Cambridge. It is part of a larger vision for future systems architectures for public computing platforms, including both scientific GRID and commodity-level computing such as games, peer2peer computing and storage services, based on work in the laboratories in recent years into massively scalable distributed systems for storage, computation, content distribution and collaboration [26]

Apollo (Cambridge)

Structured Review of the Evidence for Effects of Code Duplication on Software Quality

Author: Hordijk Wiebe
Ponisio María Laura
Wieringa Roel
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2009

This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only those details of the review for which there was not enough space in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence)

University of Twente Research Information

Scalable Persistent Storage for Erlang

Author: Chechina Natalia
Ghaffari Amir
Meredith Jon
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

The many-core revolution makes scalability a key property. The RELEASE project aims to improve the scalability of Erlang on emergent commodity architectures with 100,000 cores. Such architectures require scalable and available persistent storage on up to 100 hosts. We enumerate the requirements for scalable and available persistent storage, and evaluate four popular Erlang DBMSs against these requirements. This analysis shows that Mnesia and CouchDB are not suitable for persistent storage at our target scale, but Dynamo-like NoSQL DataBase Management Systems (DBMSs) such as Cassandra and Riak potentially are. We investigate the current scalability limits of the Riak 1.1.1 NoSQL DBMS in practice on a 100-node cluster. We establish for the first time scientifically the scalability limit of Riak as 60 nodes on the Kalkyl cluster, thereby confirming developer folklore. We show that resources like memory, disk, and network do not limit the scalability of Riak. By instrumenting Erlang/OTP and Riak libraries we identify a specific Riak functionality that limits scalability. We outline how later releases of Riak are refactored to eliminate the scalability bottlenecks. We conclude that Dynamo-style NoSQL DBMSs provide scalable and available persistent storage for Erlang in general, and for our RELEASE target architecture in particular

Crossref

Enlighten

Building scalable digital library ingestion pipelines using microservices

Author: Anastasiou Lucas
Cancellieri Matteo
Knoth Petr
Pearce Samuel
Pontika Nancy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/12/2016

CORE, a harvesting service offering access to millions of open access research papers from around the world, has shifted its harvesting process from a monolithic approach to a microservices infrastructure. In this paper, we explain how we rearranged and re-scheduled our old ingestion pipeline, present CORE's move to managing microservices and outline the tools we use in a new and optimised ingestion system. In addition, we discuss the inefficiencies of our old harvesting process, the advantages and challenges of our new ingestion system, and our future plans. We conclude that by adopting a microservices architecture we achieved a scalable and distributed system that will assist with CORE's future performance and evolution

Crossref

Open Research Online (The Open University)

Scipedia