803 research outputs found

    Optimisation of Mobile Communication Networks - OMCO NET

    Get PDF
    The mini conference “Optimisation of Mobile Communication Networks” focuses on advanced methods for search and optimisation applied to wireless communication networks. It is sponsored by Research & Enterprise Fund Southampton Solent University. The conference strives to widen knowledge on advanced search methods capable of optimisation of wireless communications networks. The aim is to provide a forum for exchange of recent knowledge, new ideas and trends in this progressive and challenging area. The conference will popularise new successful approaches on resolving hard tasks such as minimisation of transmit power, cooperative and optimal routing

    Link prediction in large directed graphs

    Get PDF
    The first chapter introduces an approach to machine learning (ML) were data is understood as a network of connected entities. This strategy seeks inter-entity information for knowledge discovery, in contrast with traditional intra-entity approaches based on instances and their features. We discuss the importance of this connectivist ML (which we refer to as graph mining) in the current context where large, topology-based data sets have been made available. Chapter ends by introducing the Link Prediction (LP) problem, together with its current computational and performance limitations. The second chapter discusses early contributions to graph mining, and introduces problems frequently tackled through this paradigm. Later the chapter focuses on the state-of-the-art of LP. It presents three different approaches to the problem of finding links in a relational set, and argues about the importance of the most computationally scalable one: similarity-based algorithms. It categorizes similarity-based algorithms in three types of LP scores. For the most scalable type, local similarity-based algorithms, the chapter identifies and formally describes the most competitive proposals according to the bibliography. Chapter three analyses the LP problem, partly as a classic binary classification problem. A list of graph properties such as directionality, weights and time are discussed in the context of LP. Follows a formal time and space complexity analysis of similarity-based scores of LP. The chapter ends with an study of the class imbalance found in LP problems. In chapter four a novel similarity-based score of LP is introduced. The chapter first elaborates on the importance of hierarchies for representing knowledge through directed graphs. Several modifications to the proposed score are also defined. This chapter presents a modified version of the most competitive undirected scores of LP, to adapt them to directed graphs. The evaluation methodologies of LP are analyzed in the fifth chapter. It starts by discussing the problem of evaluating domains with a huge class imbalance, identifying the most appropriate methodologies for it. A modification of the most appropriate evaluation methodology according to the bibliography is presented, with the goal of focusing on relevant predictions. Follows a discussion on the faithful estimation of the precision of predictors. Chapter six describes the graphs used for score evaluation, as well as how data was transformed into a directed graph. Reasons on why these particular domains were chosen are given, making a special case of webgraphs and their well known relation with hierarchies. The most basic properties of each resultant graph are shown. Tests performed are presented in chapter seven. The three most competitive LP scores currently available are tested among themselves, and against a proposed version of those same scores for directed graphs. Our proposed score and its modifications are tested against the scores obtaining the best results in the previous tests. The case of LP in webgraphs is considered separately, testing six different webgraphs. The chapter ends with a discussion on the limitations of this formal analysis, showing examples of predictions obtained. Chapter eight includes the computational aspects of the work done. It starts with a discussion on the importance of memory management for determining the computational cost of LP algorithms. A proposal on how to reduce this cost through precision reduction is presented. Follows a section focused on the parallelization of code, which includes two different implementations on one graph-specific programming model (Pregel) and on one generic programming model (OpenMP). The chapter ends with a specification of the computational resources used for the tests done. The conclusions of this thesis proposal are presented in nine. Chapter ten contains several future lines of work.El primer capítol introdueix una perspectiva de l'aprenentatge automàtic on les dades s'entén com una xarxa d'entitats connectades. Aquesta estratègia es centra en les relacions entre entitats per aprendre, en contrast amb les solucions tradicionals basades en instancies i els seus atributs. Discutim sobre la importància d'aquesta perspectiva connectivista (a la que ens referim com mineria de grafs) en el context actual on grans conjunts de dades basats en xarxes estan apareixent. El capítol finalitza amb la presentació del problema de Predicció d'Arestes (PA), junt amb una primera anàlisi de les seves limitacions actuals. El segon capítol presenta les primeres contribucions a la mineria de grafs, introduint problemes típicament solucionats mitjançant aquest paradigma. El capítol es centra en l'estat de l'art de PA. Presenta tres solucions diferents per al problema i argumenta la importància del més computacionalment escalable: els algoritmes basats en similitud. Categoritza aquests en tres tipus, i per als més escalables d'aquests, els algoritmes locals, s'identifica i es descriu formalment les propostes més competitives d'acord amb la bibliografia. El tercer capítol analitza el problema de PA, inicialment com a problema de classificació binari. Una llista de propietats de grafs són discutides en el context de la PA, com la direccionalitat o els pesos. Segueix una anàlisi del cost computacional en temps com en espai, dels algorismes basats en similitud. El capítol finalitza amb un estudi del desbalanceig de classes, freqüent en la PA. Al capítol quatre es presenta un nou algorisme basat en similitud per la PA. El capítol elabora sobre la importància de les jerarquies a la representació del coneixement a través de grafs dirigits. Varies modificacions es proposen per al nou algorisme. Aquest capítol també inclou una modificació sobre els actuals algorismes de similitud per a grafs no dirigits, per adaptar-los per a grafs dirigits. Les metodologies d'avaluació de la PA s'analitzen al cinquè capítol. Comença amb una discussió sobre els problemes que suposa avaluar un context amb un gran desbalanceig de classes, identificant les metodologies apropiades per aquests casos. Es proposa una modificació sobre el mètode més apropiat actualment disponible, per tal de centrar-se en les prediccions rellevants. Segueix una discussió sobre l'estimació fidedigna de la precisió dels predictors. El sisè capítol descriu els grafs usats per avaluar els algorismes, així com la metodologia usada per transformar-los en grafs dirigits. Les raons per triar aquest conjunt de grafs són exposades, posant especial interès al cas dels grafs web i a la seva ben coneguda relació amb les jerarquies. Les propietats més bàsiques de cada graf resultant són descrites. Els tests efectuats es mostren al capítol setè. Els tres algorismes actuals de PA més competitius són comparats amb ells mateixos i amb la versió per a grafs dirigits definida anteriorment. L'algorisme proposat anteriorment i les seves modificacions també són avaluats. El problema de la PA en grafs web es considera per separat, avaluant sis grafs web diferents. El capítol acaba amb una discussió sobre les limitacions de les avaluacions formals, mostrant exemples de prediccions obtingudes. El vuitè capítol inclou els aspectes computacionals de la tesi. Comença amb una discussió sobre la importància de la gestió de memòria per a la definició del cost computacional dels algorismes de PA. Inclou una proposta sobre com reduir aquest cost mitjançant una reducció en la precisió. Segueix una secció centrada en la paral·lelització del codi, que inclou dues implementacions diferents, una en un model de programació específic per grafs (Pregel) i una amb un model de programació paral·lela genèric (OpenMP). El capítol finalitza amb una especificació dels recursos computacionals usats per als tests realitzats. Les conclusions de la tesi es presenten al capítol novè, i les línies de treball futur al des

    Trustworthiness in Social Big Data Incorporating Semantic Analysis, Machine Learning and Distributed Data Processing

    Get PDF
    This thesis presents several state-of-the-art approaches constructed for the purpose of (i) studying the trustworthiness of users in Online Social Network platforms, (ii) deriving concealed knowledge from their textual content, and (iii) classifying and predicting the domain knowledge of users and their content. The developed approaches are refined through proof-of-concept experiments, several benchmark comparisons, and appropriate and rigorous evaluation metrics to verify and validate their effectiveness and efficiency, and hence, those of the applied frameworks

    Parallel Processes in HPX: Designing an Infrastructure for Adaptive Resource Management

    Get PDF
    Advancement in cutting edge technologies have enabled better energy efficiency as well as scaling computational power for the latest High Performance Computing(HPC) systems. However, complexity, due to hybrid architectures as well as emerging classes of applications, have shown poor computational scalability using conventional execution models. Thus alternative means of computation, that addresses the bottlenecks in computation, is warranted. More precisely, dynamic adaptive resource management feature, both from systems as well as application\u27s perspective, is essential for better computational scalability and efficiency. This research presents and expands the notion of Parallel Processes as a placeholder for procedure definitions, targeted at one or more synchronous domains, meta data for computation and resource management as well as infrastructure for dynamic policy deployment. In addition to this, the research presents additional guidelines for a framework for resource management in HPX runtime system. Further, this research also lists design principles for scalability of Active Global Address Space (AGAS), a necessary feature for Parallel Processes. Also, to verify the usefulness of Parallel Processes, a preliminary performance evaluation of different task scheduling policies is carried out using two different applications. The applications used are: Unbalanced Tree Search, a reference dynamic graph application, implemented by this research in HPX and MiniGhost, a reference stencil based application using bulk synchronous parallel model. The results show that different scheduling policies provide better performance for different classes of applications; and for the same application class, in certain instances, one policy fared better than the others, while vice versa in other instances, hence supporting the hypothesis of the need of dynamic adaptive resource management infrastructure, for deploying different policies and task granularities, for scalable distributed computing

    Recent Advances in Social Data and Artificial Intelligence 2019

    Get PDF
    The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts pf, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace

    From Text to Knowledge with Graphs: modelling, querying and exploiting textual content

    Full text link
    This paper highlights the challenges, current trends, and open issues related to the representation, querying and analytics of content extracted from texts. The internet contains vast text-based information on various subjects, including commercial documents, medical records, scientific experiments, engineering tests, and events that impact urban and natural environments. Extracting knowledge from this text involves understanding the nuances of natural language and accurately representing the content without losing information. This allows knowledge to be accessed, inferred, or discovered. To achieve this, combining results from various fields, such as linguistics, natural language processing, knowledge representation, data storage, querying, and analytics, is necessary. The vision in this paper is that graphs can be a well-suited text content representation once annotated and the right querying and analytics techniques are applied. This paper discusses this hypothesis from the perspective of linguistics, natural language processing, graph models and databases and artificial intelligence provided by the panellists of the DOING session in the MADICS Symposium 2022

    Scalable and distributed constrained low rank approximations

    Get PDF
    Low rank approximation is the problem of finding two low rank factors W and H such that the rank(WH) << rank(A) and A ≈ WH. These low rank factors W and H can be constrained for meaningful physical interpretation and referred as Constrained Low Rank Approximation (CLRA). Like most of the constrained optimization problem, performing CLRA can be computationally expensive than its unconstrained counterpart. A widely used CLRA is the Non-negative Matrix Factorization (NMF) which enforces non-negativity constraints in each of its low rank factors W and H. In this thesis, I focus on scalable/distributed CLRA algorithms for constraints such as boundedness and non-negativity for large real world matrices that includes text, High Definition (HD) video, social networks and recommender systems. First, I begin with the Bounded Matrix Low Rank Approximation (BMA) which imposes a lower and an upper bound on every element of the lower rank matrix. BMA is more challenging than NMF as it imposes bounds on the product WH rather than on each of the low rank factors W and H. For very large input matrices, we extend our BMA algorithm to Block BMA that can scale to a large number of processors. In applications, such as HD video, where the input matrix to be factored is extremely large, distributed computation is inevitable and the network communication becomes a major performance bottleneck. Towards this end, we propose a novel distributed Communication Avoiding NMF (CANMF) algorithm that communicates only the right low rank factor to its neighboring machine. Finally, a general distributed HPC- NMF framework that uses HPC techniques in communication intensive NMF operations and suitable for broader class of NMF algorithms.Ph.D
    corecore