Ideal Keyword Match in a Big Data Application Using Keyword Aware Service Recommendation Method
The big data movement has also influenced service recommender systems. The growing number of alternative providers makes it a significant research challenge to give clients relevant suggestions for the services they want. Service recommender systems have proven to be helpful tools that let users manage the multitude of services at their disposal and receive pertinent recommendations. Because the quantity of customers, services, and other online information grows exponentially, service recommender systems operate in a "Big Data" context, which poses serious challenges for these systems. In this work, we address these challenges by proposing KASR, a keyword-aware service recommendation method. In KASR, keywords extracted from user reviews serve as indicators of users' preferences, and recommendations are produced by a user-based Collaborative Filtering algorithm that considers both customer reviews and provider rankings. A domain thesaurus and a keyword-candidate list are provided to help better capture customer preferences: the active user indicates their preferences by selecting keywords from the keyword-candidate list. To improve scalability and efficiency, we implement KASR on Hadoop, a distributed computing platform, using the MapReduce programming model.
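The recommendation flow described above can be sketched in a few lines: keyword sets stand in for preferences, set similarity stands in for user similarity, and similar users' ratings are aggregated. All data, function names, and the choice of Jaccard similarity here are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch of keyword-aware, user-based collaborative filtering in the
# spirit of KASR. Data and helper names are illustrative, not the
# paper's implementation.

def jaccard(a: set, b: set) -> float:
    """Similarity between two users' keyword sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(active_keywords, user_keywords, user_ratings, top_k=2):
    """Score services by similarity-weighted ratings of similar users.

    active_keywords: keywords the active user picked from the candidate list
    user_keywords:   {user: set of keywords mined from that user's reviews}
    user_ratings:    {user: {service: rating}}
    """
    scores, weights = {}, {}
    for user, kws in user_keywords.items():
        sim = jaccard(active_keywords, kws)
        if sim == 0:
            continue
        for service, rating in user_ratings[user].items():
            scores[service] = scores.get(service, 0.0) + sim * rating
            weights[service] = weights.get(service, 0.0) + sim
    ranked = sorted(scores, key=lambda s: scores[s] / weights[s], reverse=True)
    return ranked[:top_k]

# Toy example with two reviewers and a hypothetical hotel domain.
user_keywords = {"u1": {"wifi", "clean", "cheap"}, "u2": {"luxury", "spa"}}
user_ratings = {"u1": {"hotelA": 5, "hotelB": 3}, "u2": {"hotelC": 4}}
print(recommend({"wifi", "cheap"}, user_keywords, user_ratings))  # ['hotelA', 'hotelB']
```

In the paper's setting the similarity computation and the aggregation would each map naturally onto MapReduce stages, which is what makes the method amenable to Hadoop.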
Network Structures, Concurrency, and Interpretability: Lessons from the Development of an AI Enabled Graph Database System
This thesis describes the development of the SmartGraph, an AI-enabled graph database. The need for such a system has been independently recognized in the isolated fields of graph databases, graph computing, and computational graph deep learning systems, such as TensorFlow. Though prior works have investigated some relationships between these fields, we believe that the SmartGraph is the first system designed from conception to incorporate the most significant and useful characteristics of each. Examples include the ability to store graph-structured data, run analytics natively on this data, and run gradient descent algorithms. It is the synergistic aspects of combining these fields that provide the most novel results presented in this dissertation. Key among them is how the notion of "graph querying" as used in graph databases can be used to solve a problem that has plagued deep learning systems since their inception; rather than attempting to embed graph-structured datasets into restrictive vector spaces, we instead allow the deep learning functionality of the system to natively perform graph querying in memory during optimization as a way of interpreting (and learning) the graph. This results in a concept of natural and interpretable processing of graph-structured data.
Graph computing systems have traditionally used distributed computing across multiple compute nodes (e.g., separate machines connected via Ethernet or the internet) to deal with large-scale datasets whilst working sequentially on problems over entire datasets. In this dissertation, we outline a distributed graph computing methodology that facilitates all the above capabilities (even in an environment consisting of a single physical machine) while allowing for a workflow more typical of a graph database than a graph computing system: massive concurrent access allowing for arbitrarily asynchronous execution of queries and analytics across the entire system. Further, we demonstrate how this methodology is key to the artificial intelligence capabilities of the system.
Performance Analysis and Improvement for Scalable and Distributed Applications Based on Asynchronous Many-Task Systems
As the complexity of recent and future large-scale data and exascale system architectures grows, so do the challenges of productivity, portability, software scalability, and efficient utilization of system resources facing both industry and the research community. Software solutions and applications are expected to scale in performance on such complex systems. Asynchronous many-task (AMT) systems, which take advantage of multi-core architectures with light-weight threads, asynchronous execution, and smart scheduling, are showing promise in addressing these challenges.
In this research, we implement several scalable and distributed applications based on HPX, an exemplar AMT runtime system. First, a distributed HPX implementation of the parameterized benchmark Task Bench is introduced. The performance bottleneck is analyzed: repeated HPX thread creation costs and a global barrier across all threads limit performance. Methodologies that keep the spawned threads alive and overlap communication with computation are presented. The evaluation results demonstrate the effectiveness of the improved approach, in which HPX is comparable with the prevalent programming models and takes advantage of multi-task scenarios. Second, HPX support for SHAD, an algorithms and data-structures library, is introduced. Methodologies to support local and remote operations in both synchronous and asynchronous manners are developed, and the HPX implementation backing the SHAD library is provided. Performance results demonstrate that the proposed system achieves performance similar to SHAD with Intel TBB (Threading Building Blocks) support for shared-memory parallelism and exploits distributed-memory parallelism better than SHAD with GMT (Global Memory and Threading) support. Third, Phylanx, an asynchronous array processing framework, is introduced. Methodologies that support a distributed alternating least squares algorithm are developed, and the implementation of this algorithm along with a number of distributed primitives is provided. The performance results show that the Phylanx implementation exhibits good scalability. Finally, a scalable second-order method for optimization is introduced, with an implementation of a Krylov-Newton second-order method via the PyTorch framework. Evaluation results illustrate the scalability, convergence, and robustness to hyper-parameters of the proposed method.
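The first optimization described above, keeping worker threads alive across iterations instead of respawning them, can be illustrated outside of HPX. The sketch below contrasts the two patterns in plain Python threading; it mirrors the idea only, and is in no way HPX code.

```python
# Illustrative contrast between per-iteration thread creation (with its
# implicit global barrier at every join) and a persistent worker pool
# that is reused across iterations. Task contents are placeholders.
import threading
from concurrent.futures import ThreadPoolExecutor

def task(i):
    return i * i

def naive(n_iters, n_tasks):
    # Spawn fresh threads each iteration and join them all: pays thread
    # creation cost plus a barrier every iteration. (Return values are
    # discarded; this variant only illustrates the spawning pattern.)
    for _ in range(n_iters):
        threads = [threading.Thread(target=task, args=(i,)) for i in range(n_tasks)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

def pooled(pool, n_iters, n_tasks):
    # Reuse long-lived workers; only lightweight futures are created
    # per iteration, so no threads are torn down between iterations.
    out = []
    for _ in range(n_iters):
        futures = [pool.submit(task, i) for i in range(n_tasks)]
        out = [f.result() for f in futures]
    return out

with ThreadPoolExecutor(max_workers=4) as pool:
    print(pooled(pool, n_iters=100, n_tasks=8))  # last iteration's results
```

The second optimization, overlapping communication with computation, corresponds to submitting the next iteration's work before collecting the previous iteration's results, rather than alternating strictly between the two phases.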
Social Search: retrieving information in Online Social Platforms -- A Survey
Social Search research deals with studying methodologies exploiting social
information to better satisfy user information needs in Online Social Media
while simplifying the search effort and consequently reducing the time spent
and the computational resources utilized. Starting from previous studies, in
this work, we analyze the current state of the art of the Social Search area,
proposing a new taxonomy and highlighting current limitations and open research
directions. We divide the Social Search area into three subcategories, where
the social aspect plays a pivotal role: Social Question&Answering, Social
Content Search, and Social Collaborative Search. For each subcategory, we
present the key concepts and selected representative approaches in the
literature in greater detail. We found that, up to now, a large body of studies
models users' preferences and their relations by simply combining the social
features made available by social platforms. This leaves room for significant
research that exploits more structured information about users' social profiles
and behaviors (as inferred from the data available on social platforms) to
better satisfy their information needs.
Towards Soft Circuit Breaking in Service Meshes via Application-agnostic Caching
Service meshes factor out code dealing with inter-micro-service
communication, such as circuit breaking. Circuit breaking actuation is
currently limited to an "on/off" switch, i.e., a tripped circuit breaker will
return an application-level error indicating service unavailability to the
calling micro-service. This paper proposes a soft circuit breaker actuator,
which returns cached data instead of an error. The overall resilience of a
cloud application is improved if constituent micro-services return stale data,
instead of no data at all. While caching is widely employed for serving web
service traffic, its usage in inter-micro-service communication is lacking.
Micro-service responses are highly dynamic, which requires carefully chosen
adaptive time-to-live (TTL) caching algorithms. We evaluate our approach
through two experiments. First, we quantify the trade-off between traffic
reduction and data staleness using a purpose-built service, thereby
identifying algorithm configurations that keep data staleness at about 3% or
less while reducing network load by up to 30%. Second, we quantify the network
load reduction on Hipster Shop, a micro-service benchmark by Google Cloud. Our
approach results in caching of about 80% of requests. The results show the
feasibility and efficiency of our approach, which encourages implementing
caching as a circuit-breaking actuator in service meshes.
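The soft-breaker idea, serving stale cached data instead of an error once the breaker trips, can be sketched compactly. The failure threshold and fixed TTL below are simplifying assumptions; the paper's contribution involves adaptive TTL algorithms, which this sketch does not reproduce.

```python
# Sketch of a "soft" circuit breaker: when the breaker is tripped, serve
# the last cached response (possibly stale) instead of an error.
# Threshold and TTL policy are simplified assumptions.
import time

class SoftCircuitBreaker:
    def __init__(self, call, ttl=5.0, max_failures=3):
        self.call = call                  # the upstream micro-service call
        self.ttl = ttl                    # cache time-to-live in seconds
        self.max_failures = max_failures
        self.failures = 0
        self.cache = {}                   # key -> (value, timestamp)

    def request(self, key):
        if self.failures >= self.max_failures:       # breaker tripped
            if key in self.cache:
                return self.cache[key][0]            # stale data beats no data
            raise RuntimeError("unavailable and no cached fallback")
        try:
            value = self.call(key)
            self.cache[key] = (value, time.time())
            self.failures = 0
            return value
        except Exception:
            self.failures += 1
            entry = self.cache.get(key)
            if entry and time.time() - entry[1] < self.ttl:
                return entry[0]                      # fresh-enough cached copy
            raise

# Usage with a hypothetical upstream that fails after its first call.
calls = {"n": 0}
def upstream(key):
    calls["n"] += 1
    if calls["n"] > 1:
        raise ConnectionError("upstream down")
    return {"price": 42}

cb = SoftCircuitBreaker(upstream, ttl=60.0)
print(cb.request("item"))   # live response
print(cb.request("item"))   # upstream fails -> cached copy served instead
```

In a real service mesh this logic would sit in the sidecar proxy, transparent to the calling micro-service, which is what makes the cache application-agnostic.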
Wireless Sensor Network Virtualization: A Survey
Wireless Sensor Networks (WSNs) are the key components of the emerging
Internet-of-Things (IoT) paradigm. They are now ubiquitous and used in a
plurality of application domains. WSNs are still domain specific and usually
deployed to support a specific application. However, as WSN nodes become
increasingly powerful, it is ever more pertinent to investigate how multiple
applications could share the same WSN infrastructure.
Virtualization is a technology that can potentially enable this sharing. This
paper is a survey on WSN virtualization. It provides a comprehensive review of
the state-of-the-art and an in-depth discussion of the research issues. We
introduce the basics of WSN virtualization and motivate its pertinence with
carefully selected scenarios. Existing works are presented in detail and
critically evaluated using a set of requirements derived from the scenarios.
The pertinent research projects are also reviewed. Several research issues are
also discussed with hints on how they could be tackled.
Comment: Accepted for publication on 3rd March 2015 in a forthcoming issue of
IEEE Communications Surveys and Tutorials. This version has NOT been
proof-read and may have some inconsistencies. Please refer to the final
version published in IEEE Xplore.
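The core idea the survey motivates, several applications sharing one physical sensor node, can be illustrated with a toy node-level virtualization sketch. The class, application names, and dispatch scheme are purely illustrative and not drawn from any surveyed system.

```python
# Toy illustration of node-level WSN virtualization: one physical
# sensor node hosts tasks belonging to different applications, and a
# single physical sample is shared among all of them. Names are
# illustrative only.

class SensorNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.tasks = {}              # application name -> per-reading callback

    def register(self, app, callback):
        """Each application deploys its own task on the shared node."""
        self.tasks[app] = callback

    def sense(self, reading):
        # One physical temperature sample is dispatched to every hosted
        # application, instead of each app needing its own deployment.
        return {app: cb(reading) for app, cb in self.tasks.items()}

node = SensorNode("n1")
node.register("fire_alarm", lambda t: t > 60)          # threshold detector
node.register("climate_log", lambda t: {"celsius": t}) # raw logger
print(node.sense(25))  # {'fire_alarm': False, 'climate_log': {'celsius': 25}}
```

Real node-level virtualization must additionally isolate the applications and arbitrate radio and energy budgets, which is where the research issues discussed in the survey arise.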
Enhancing Usability and Explainability of Data Systems
The recent growth of data science has expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in these systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to the democratization of data systems. Furthermore, a proper understanding of data and data-driven systems is necessary for users to trust the function of systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, the users deserve a proper explanation of what caused the observed incident. Unfortunately, most existing data systems offer limited usability and support for explanations: these systems are usable only by experts with sound technical skills, and even expert users are hindered by the lack of transparency into the systems' inner workings and functions. The aim of my thesis is to bridge the usability gap between nonexpert users and complex data systems, aid all sorts of users, including expert ones, in data and system understanding, and provide explanations that help reason about unexpected outcomes involving data systems. Specifically, my thesis has the following three goals: (1) enhancing the usability of data systems for nonexperts, (2) enabling data understanding that can assist users in a variety of tasks, such as achieving trust in data-driven machine learning and performing data cleaning, and (3) explaining the causes of unexpected outcomes involving data and data systems.
For enhancing usability, we focus on example-driven user intent discovery. We develop systems based on example-driven interactions in two different settings: querying relational databases and personalized document summarization. Towards data understanding, we develop a new data-profiling primitive that can characterize tuples for which a machine-learned model is likely to produce untrustworthy predictions. We also develop an explanation framework to explain the causes of such untrustworthy predictions. Additionally, this new data-profiling primitive enables interactive data cleaning. Finally, we develop two explanation frameworks tailored to provide explanations for debugging data system components, including the data itself. These frameworks focus on explaining the root cause of a concurrent application's intermittent failure and exposing issues in the data that cause a data-driven system to malfunction.
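The flavor of the data-profiling primitive described above can be conveyed with a deliberately simple sketch: learn per-column value ranges from the training data, then flag tuples that fall outside those ranges as candidates for untrustworthy predictions. This is a simplification for illustration, not the thesis's actual method.

```python
# Minimal sketch of a data-profiling primitive: profile the training
# data, then flag tuples that violate the profile as likely to yield
# untrustworthy model predictions. A real primitive would use much
# richer profiles than simple min/max ranges.

def learn_profile(rows):
    """rows: list of dicts with numeric columns; returns per-column ranges."""
    profile = {}
    for col in rows[0]:
        vals = [r[col] for r in rows]
        profile[col] = (min(vals), max(vals))
    return profile

def flag_untrustworthy(profile, row):
    """Return the columns on which the tuple violates the learned profile."""
    return [col for col, (lo, hi) in profile.items()
            if not lo <= row[col] <= hi]

train = [{"age": 25, "income": 40}, {"age": 60, "income": 90}]
profile = learn_profile(train)
print(flag_untrustworthy(profile, {"age": 35, "income": 70}))  # []
print(flag_untrustworthy(profile, {"age": 95, "income": 70}))  # ['age']
```

The same primitive doubles for interactive cleaning: the flagged columns point a user directly at the values that make a tuple anomalous.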
HIGH PERFORMANCE AGENT-BASED MODELS WITH REAL-TIME IN SITU VISUALIZATION OF INFLAMMATORY AND HEALING RESPONSES IN INJURED VOCAL FOLDS
The introduction of clusters of multi-core and many-core processors has played a major role in recent advances in tackling a wide range of new challenging applications and in enabling new frontiers in Big Data. However, as the computing power increases, the programming complexity required to take optimal advantage of the machine's resources has significantly increased. High-performance computing (HPC) techniques are crucial in realizing the full potential of parallel computing. This research is an interdisciplinary effort focusing on two major directions. The first involves the introduction of HPC techniques to substantially improve the performance of complex biological agent-based model (ABM) simulations, specifically simulations related to the inflammatory and healing responses of vocal folds at the physiological scale in mammals. The second direction involves improvements and extensions of the existing state-of-the-art vocal fold repair models. These improvements and extensions include comprehensive visualization of large data sets generated by the model and a significant increase in user-simulation interactivity.
We developed a highly interactive remote simulation and visualization framework for vocal fold (VF) agent-based modeling (ABM). The 3D VF ABM was verified through comparisons with empirical vocal fold data, and representative trends of biomarker predictions in surgically injured vocal folds were observed. The physiologically representative human VF ABM consisted of more than 15 million mobile biological cells, and the model maintained and generated 1.7 billion signaling and extracellular matrix (ECM) protein data points in each iteration. The VF ABM employed HPC techniques to optimize its performance by concurrently utilizing the power of a multi-core CPU and multiple GPUs. The optimization techniques included the minimization of data transfer between the CPU host and the rendering GPU; these techniques also reduced transfers between peer GPUs in multi-GPU setups. The data transfer minimization techniques were combined with a scheduling scheme aimed at load balancing, maximum overlap of computation and communication, and a high degree of interactivity. This scheduling scheme achieved optimal interactivity by hyper-tasking the available GPUs (GHT). In comparison to the original serial implementation on a popular ABM framework, NetLogo, these schemes have shown substantial performance improvements of 400x and 800x for the 2D and 3D models, respectively. Furthermore, the combination of data footprint and data transfer reduction techniques with GHT achieved high-interactivity visualization with an average framerate of 42.8 fps. This performance enabled users to perform real-time data exploration on large simulated outputs and steer the course of their simulation as needed.
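The overlap of computation and communication mentioned above can be illustrated with a small double-buffering sketch: while a renderer consumes one frame of simulation output, the next step is computed concurrently. Plain Python threading stands in here for the CPU/GPU pipeline; the "agent update" and "render" bodies are placeholders, not the VF ABM's actual kernels.

```python
# Sketch of overlapping simulation and visualization via a bounded
# queue (double buffer): the producer computes the next frame while the
# consumer "renders" earlier ones. Stage bodies are placeholders.
import threading
import queue

def simulate_step(state):
    # Placeholder agent update: every "cell" ages by one tick.
    return [c + 1 for c in state]

def run(steps, state):
    buffers = queue.Queue(maxsize=2)    # bounded queue = double buffer
    rendered = []

    def renderer():
        while True:
            buf = buffers.get()
            if buf is None:             # sentinel: simulation finished
                break
            rendered.append(sum(buf))   # stand-in for drawing a frame

    t = threading.Thread(target=renderer)
    t.start()
    for _ in range(steps):
        state = simulate_step(state)    # compute the next frame...
        buffers.put(state)              # ...while the renderer consumes prior ones
    buffers.put(None)
    t.join()
    return rendered

print(run(3, [0, 0, 0, 0]))  # [4, 8, 12]
```

The bounded queue is the key design choice: it keeps the two stages decoupled yet prevents the simulation from racing arbitrarily far ahead of the display, which is the same balance the GPU scheduling scheme strikes at much larger scale.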
GraphScope Flex: LEGO-like Graph Computing Stack
Graph computing has become increasingly crucial in processing large-scale
graph data, with numerous systems developed for this purpose. Two years ago, we
introduced GraphScope as a system addressing a wide array of graph computing
needs, including graph traversal, analytics, and learning in one system. Since
its inception, GraphScope has achieved significant technological advancements
and gained widespread adoption across various industries. However, one key
lesson from this journey has been understanding the limitations of a
"one-size-fits-all" approach, especially when dealing with the diversity of
programming interfaces, applications, and data storage formats in graph
computing. In response to these challenges, we present GraphScope Flex, the
next iteration of GraphScope. GraphScope Flex is designed to be both
resource-efficient and cost-effective, while also providing flexibility and
user-friendliness through its LEGO-like modularity. This paper explores the
architectural innovations and fundamental design principles of GraphScope Flex,
all of which are direct outcomes of the lessons learned during our ongoing
development process. We validate the adaptability and efficiency of GraphScope
Flex with extensive evaluations on synthetic and real-world datasets. The
results show that GraphScope Flex achieves 2.4X throughput and up to 55.7X
speedup over other systems on the LDBC Social Network and Graphalytics
benchmarks, respectively. Furthermore, GraphScope Flex accomplishes up to a
2,400X performance gain in real-world applications, demonstrating its
effectiveness across a wide range of graph computing scenarios.
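The LEGO-like modularity described above amounts to putting storage, compute, and interface behind small contracts so that a deployment assembles only the pieces it needs. The sketch below illustrates that composition pattern in miniature; all class names are hypothetical and bear no relation to the GraphScope Flex API.

```python
# Toy illustration of LEGO-like modularity: interchangeable storage and
# compute modules coupled only through a minimal contract (`neighbors`),
# so stacks can be assembled per deployment. Names are illustrative.

class CSRStore:
    """One interchangeable storage module (adjacency-list flavor)."""
    def __init__(self, edges):
        self.adj = {}
        for src, dst in edges:
            self.adj.setdefault(src, []).append(dst)

    def neighbors(self, v):
        return self.adj.get(v, [])

class AnalyticsEngine:
    """One interchangeable compute module; depends only on `neighbors`,
    so any store honoring that contract can be plugged in."""
    def __init__(self, store):
        self.store = store

    def out_degree(self, v):
        return len(self.store.neighbors(v))

# Assemble a stack from just the modules this deployment needs.
store = CSRStore([(1, 2), (1, 3), (2, 3)])
engine = AnalyticsEngine(store)
print(engine.out_degree(1))  # 2
```

Swapping in a different store (say, a mutable transactional one) or a different engine (traversal instead of analytics) would leave the rest of the stack untouched, which is the resource- and cost-efficiency argument the paper makes.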