532 research outputs found

    Ideal Keyword Match in a Big Data Application Using Keyword Aware Service Recommendation Method

    Get PDF
    The big data movement has also influenced service recommender systems. With the emergence of many alternative providers, giving clients relevant suggestions for the services they want has become a significant research problem. Service recommender systems have proven to be helpful tools that help users manage the multitude of services at their disposal and provide pertinent recommendations. Because the number of customers, services, and other online information grows exponentially, service recommender systems operate in a "Big Data" environment, which poses serious challenges for these systems. In this work, we address these difficulties with a user-based collaborative filtering algorithm driven by user input: keywords extracted from user reviews reflect users' preferences. To handle the scale of the data, the method is implemented on Hadoop, a distributed computing platform, using the MapReduce programming model. The proposed system considers both customer reviews and ratings. We present KASR, a Keyword-Aware Service Recommendation method, in which keywords serve as indicators of users' preferences and recommendations are produced by a user-based collaborative filtering algorithm. A domain thesaurus and a keyword-candidate list are provided to better capture customer preferences; the active user indicates their preferences by selecting keywords from the keyword-candidate list.
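
    The abstract describes the approach only at a high level; the sketch below illustrates one way a keyword-aware, user-based collaborative filtering step could look. The keyword extraction, the Jaccard similarity measure, the weighting scheme, and all names are illustrative assumptions, not the exact KASR formulation.

```python
# Minimal sketch of keyword-aware, user-based collaborative filtering.
# Keyword extraction, Jaccard similarity, and the weighting scheme are
# illustrative assumptions, not the KASR algorithm itself.

def extract_keywords(review: str, candidate_list: set[str]) -> set[str]:
    """Keep only words that appear in the domain keyword-candidate list."""
    return {w for w in review.lower().split() if w in candidate_list}

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def recommend(active_keywords, user_reviews, user_ratings, candidate_list, top_k=3):
    """Score services by the ratings of users whose review keywords
    overlap the active user's selected keywords."""
    scores, weights = {}, {}
    for user, review in user_reviews.items():
        sim = jaccard(active_keywords, extract_keywords(review, candidate_list))
        if sim == 0.0:
            continue
        for service, rating in user_ratings.get(user, {}).items():
            scores[service] = scores.get(service, 0.0) + sim * rating
            weights[service] = weights.get(service, 0.0) + sim
    ranked = sorted(scores, key=lambda s: scores[s] / weights[s], reverse=True)
    return ranked[:top_k]

# Example usage with made-up data:
candidates = {"wifi", "breakfast", "pool", "quiet"}
reviews = {"u1": "great wifi and breakfast", "u2": "quiet rooms nice pool"}
ratings = {"u1": {"hotelA": 5, "hotelB": 3}, "u2": {"hotelB": 4}}
print(recommend({"wifi", "breakfast"}, reviews, ratings, candidates))
```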

    Performance Analysis and Improvement for Scalable and Distributed Applications Based on Asynchronous Many-Task Systems

    Get PDF
    As the complexity of current and future large-scale data and exascale system architectures grows, so do the challenges of productivity, portability, software scalability, and efficient utilization of system resources presented to both industry and the research community. Software solutions and applications are expected to scale in performance on such complex systems. Asynchronous many-task (AMT) systems, which exploit multi-core architectures with light-weight threads, asynchronous execution, and smart scheduling, show promise in addressing these challenges. In this research, we implement several scalable and distributed applications based on HPX, an exemplar AMT runtime system. First, a distributed HPX implementation of the parameterized benchmark Task Bench is introduced. The performance bottleneck is analyzed: the cost of repeatedly creating HPX threads and a global barrier across all threads limit performance. Methodologies to keep the spawned threads alive and to overlap communication with computation are presented. The evaluation results demonstrate the effectiveness of the improved approach, in which HPX is comparable with prevalent programming models and takes advantage of multi-task scenarios. Second, SHAD, a library of algorithms and data structures, is extended with HPX support. Methodologies to support local and remote operations in both synchronous and asynchronous manners are developed, and the HPX implementation supporting the SHAD library is provided. Performance results demonstrate that the proposed system matches SHAD with Intel TBB (Threading Building Blocks) support for shared-memory parallelism and exploits distributed-memory parallelism better than SHAD with GMT (Global Memory and Threading) support. Third, Phylanx, an asynchronous array processing framework, is introduced. Methodologies that support a distributed alternating least squares algorithm are developed, and the implementation of this algorithm along with a number of distributed primitives is provided. The performance results show that the Phylanx implementation scales well. Finally, a scalable second-order optimization method is introduced: a Krylov-Newton second-order method implemented via the PyTorch framework. Evaluation results illustrate the scalability, convergence, and robustness to hyper-parameters of the proposed method.
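
    Of the pieces above, the alternating least squares (ALS) algorithm is the most self-contained; the sketch below is a minimal, single-node NumPy version of dense ALS matrix factorization, offered only to show the shape of the computation. The Phylanx/HPX implementation distributes these solves across localities; the rank, regularization value, and variable names here are assumptions for illustration.

```python
# Minimal single-node ALS matrix-factorization sketch (illustrative only;
# the Phylanx implementation distributes this work across HPX localities).
import numpy as np

def als(R, rank=8, reg=0.1, iters=10):
    """Factor a dense ratings matrix R (users x items) into U @ V.T."""
    n_users, n_items = R.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n_users, rank))
    V = rng.standard_normal((n_items, rank))
    I = np.eye(rank)
    for _ in range(iters):
        # Fix V, solve a regularized least-squares problem for U.
        U = np.linalg.solve(V.T @ V + reg * I, V.T @ R.T).T
        # Fix U, solve for V.
        V = np.linalg.solve(U.T @ U + reg * I, U.T @ R).T
    return U, V

R = np.random.default_rng(1).random((100, 50))
U, V = als(R)
print("reconstruction error:", np.linalg.norm(R - U @ V.T))
```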

    Social Search: retrieving information in Online Social Platforms -- A Survey

    Full text link
    Social Search research studies methodologies that exploit social information to better satisfy user information needs in Online Social Media, while simplifying the search effort and consequently reducing the time spent and the computational resources used. Starting from previous studies, in this work we analyze the current state of the art of the Social Search area, proposing a new taxonomy and highlighting current limitations and open research directions. We divide the Social Search area into three subcategories in which the social aspect plays a pivotal role: Social Question & Answering, Social Content Search, and Social Collaborative Search. For each subcategory, we present the key concepts and selected representative approaches from the literature in greater detail. We found that, up to now, a large body of studies has modeled users' preferences and relations by simply combining the social features made available by social platforms. This paves the way for significant further research that exploits more structured information about users' social profiles and behaviors (as they can be inferred from data available on social platforms) to better address their information needs.

    Towards Soft Circuit Breaking in Service Meshes via Application-agnostic Caching

    Full text link
    Service meshes factor out code dealing with inter-micro-service communication, such as circuit breaking. Circuit-breaking actuation is currently limited to an "on/off" switch: a tripped circuit breaker returns an application-level error indicating service unavailability to the calling micro-service. This paper proposes a soft circuit breaker actuator, which returns cached data instead of an error. The overall resilience of a cloud application is improved if constituent micro-services return stale data rather than no data at all. While caching is widely employed for serving web service traffic, its use in inter-micro-service communication is lacking. Micro-service responses are highly dynamic, which requires carefully choosing adaptive time-to-live (TTL) caching algorithms. We evaluate our approach through two experiments. First, we quantify the trade-off between traffic reduction and data staleness using a purpose-built service, thereby identifying algorithm configurations that keep data staleness at about 3% or less while reducing network load by up to 30%. Second, we quantify the network load reduction on Hipster Shop, a micro-service benchmark by Google Cloud; our approach results in caching of about 80% of requests. The results show the feasibility and efficiency of our approach, which encourages implementing caching as a circuit-breaking actuator in service meshes.
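
    The paper's actuator lives inside a service mesh; the sketch below is only a conceptual, in-process illustration of the "soft" circuit-breaking idea, returning a cached (possibly stale) response instead of an error once the breaker trips. The failure threshold, fixed TTL, and all names are assumptions, not the paper's configuration or adaptive TTL algorithm.

```python
# Conceptual sketch of a "soft" circuit breaker: serve cached (possibly
# stale) responses instead of errors when the breaker is open.
# Thresholds and the TTL policy are illustrative assumptions.
import time

class SoftCircuitBreaker:
    def __init__(self, call, failure_threshold=3, open_seconds=10.0, ttl=5.0):
        self.call = call                      # upstream request function
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.ttl = ttl                        # cache freshness window
        self.failures = 0
        self.opened_at = None
        self.cache = {}                       # key -> (value, timestamp)

    def _is_open(self):
        if self.opened_at is None:
            return False
        if time.time() - self.opened_at > self.open_seconds:
            self.opened_at, self.failures = None, 0   # half-open: retry upstream
            return False
        return True

    def request(self, key):
        entry = self.cache.get(key)
        if self._is_open():
            if entry:
                return entry[0]               # soft breaking: serve stale data
            raise RuntimeError("circuit open and no cached response")
        if entry and time.time() - entry[1] < self.ttl:
            return entry[0]                   # fresh cache hit, skip the call
        try:
            value = self.call(key)
            self.cache[key] = (value, time.time())
            self.failures = 0
            return value
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()  # trip the breaker
            if entry:
                return entry[0]               # stale data beats no data
            raise

# Hypothetical usage:
# breaker = SoftCircuitBreaker(lambda key: fetch_from_upstream(key))
# breaker.request("/cart/user42")
```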

    Wireless Sensor Network Virtualization: A Survey

    Get PDF
    Wireless Sensor Networks (WSNs) are key components of the emerging Internet-of-Things (IoT) paradigm. They are now ubiquitous and used in a wide range of application domains. WSNs are still domain specific and are usually deployed to support a single application. However, as WSN nodes become more and more powerful, it is increasingly pertinent to investigate how multiple applications could share the same WSN infrastructure. Virtualization is a technology that can potentially enable this sharing. This paper is a survey on WSN virtualization. It provides a comprehensive review of the state of the art and an in-depth discussion of the research issues. We introduce the basics of WSN virtualization and motivate its pertinence with carefully selected scenarios. Existing works are presented in detail and critically evaluated using a set of requirements derived from the scenarios. Pertinent research projects are also reviewed, and several research issues are discussed with hints on how they could be tackled. Comment: Accepted for publication on 3rd March 2015 in a forthcoming issue of IEEE Communications Surveys and Tutorials. This version has NOT been proof-read and may have some inconsistencies. Please refer to the final version published in IEEE Xplore.

    HIGH PERFORMANCE AGENT-BASED MODELS WITH REAL-TIME IN SITU VISUALIZATION OF INFLAMMATORY AND HEALING RESPONSES IN INJURED VOCAL FOLDS

    Get PDF
    The introduction of clusters of multi-core and many-core processors has played a major role in recent advances in tackling a wide range of new, challenging applications and in enabling new frontiers in Big Data. However, as computing power increases, the programming complexity required to take optimal advantage of a machine's resources has increased significantly. High-performance computing (HPC) techniques are crucial in realizing the full potential of parallel computing. This research is an interdisciplinary effort focusing on two major directions. The first involves applying HPC techniques to substantially improve the performance of complex biological agent-based model (ABM) simulations, more specifically simulations related to the inflammatory and healing responses of vocal folds at the physiological scale in mammals. The second direction involves improvements and extensions of the existing state-of-the-art vocal fold repair models, including comprehensive visualization of the large data sets generated by the model and a significant increase in user-simulation interactivity. We developed a highly interactive remote simulation and visualization framework for vocal fold (VF) agent-based modeling (ABM). The 3D VF ABM was verified through comparisons with empirical vocal fold data, and representative trends of biomarker predictions in surgically injured vocal folds were observed. The physiologically representative human VF ABM consisted of more than 15 million mobile biological cells and maintained and generated 1.7 billion signaling and extracellular matrix (ECM) protein data points in each iteration. The VF ABM employed HPC techniques to optimize its performance by concurrently utilizing the power of a multi-core CPU and multiple GPUs. The optimization techniques included minimizing data transfers between the CPU host and the rendering GPU, as well as between peer GPUs in multi-GPU setups. These data-transfer minimization techniques were combined with a scheduling scheme that aims to achieve load balancing, maximum overlap of computation and communication, and a high degree of interactivity. This scheduling scheme achieved optimal interactivity by hyper-tasking the available GPUs (GHT). In comparison to the original serial implementation on a popular ABM framework, NetLogo, these schemes showed substantial performance improvements of 400x and 800x for the 2D and 3D models, respectively. Furthermore, the combination of data footprint and data transfer reduction techniques with GHT achieved highly interactive visualization with an average framerate of 42.8 fps. This performance enables users to perform real-time data exploration of large simulated outputs and to steer the course of their simulation as needed.
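
    The scheduling ideas above (load balancing, overlap of computation and communication, and hyper-tasking the GPUs) can be illustrated with a small, purely conceptual scheduler sketch. The worker function, chunk size, and use of a thread pool below are assumptions for illustration only and do not reflect the model's actual CUDA/multi-GPU implementation.

```python
# Conceptual sketch of GPU hyper-tasking (GHT): keep several chunks of
# work in flight per GPU so transfers and compute can overlap.
# Worker functions and chunk sizes are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

def simulate_chunk(gpu_id, chunk):
    # Placeholder for launching an ABM update kernel on `gpu_id`.
    return [cell + 1 for cell in chunk]

def run_step(cells, n_gpus=2, tasks_per_gpu=4, chunk_size=1024):
    """Split the cell population into chunks and round-robin them over
    GPUs, keeping several tasks in flight per GPU to hide transfer latency."""
    chunks = [cells[i:i + chunk_size] for i in range(0, len(cells), chunk_size)]
    gpu_ids = cycle(range(n_gpus))
    with ThreadPoolExecutor(max_workers=n_gpus * tasks_per_gpu) as pool:
        futures = [pool.submit(simulate_chunk, next(gpu_ids), c) for c in chunks]
        # Rendering/visualization of finished chunks could proceed here,
        # overlapping with the remaining simulation work.
        return [cell for f in futures for cell in f.result()]

print(len(run_step(list(range(5000)))))
```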

    GraphScope Flex: LEGO-like Graph Computing Stack

    Full text link
    Graph computing has become increasingly crucial in processing large-scale graph data, with numerous systems developed for this purpose. Two years ago, we introduced GraphScope as a system addressing a wide array of graph computing needs, including graph traversal, analytics, and learning in one system. Since its inception, GraphScope has achieved significant technological advancements and gained widespread adoption across various industries. However, one key lesson from this journey has been understanding the limitations of a "one-size-fits-all" approach, especially when dealing with the diversity of programming interfaces, applications, and data storage formats in graph computing. In response to these challenges, we present GraphScope Flex, the next iteration of GraphScope. GraphScope Flex is designed to be both resource-efficient and cost-effective, while also providing flexibility and user-friendliness through its LEGO-like modularity. This paper explores the architectural innovations and fundamental design principles of GraphScope Flex, all of which are direct outcomes of the lessons learned during our ongoing development process. We validate the adaptability and efficiency of GraphScope Flex with extensive evaluations on synthetic and real-world datasets. The results show that GraphScope Flex achieves 2.4X throughput and up to 55.7X speedup over other systems on the LDBC Social Network and Graphalytics benchmarks, respectively. Furthermore, GraphScope Flex accomplishes up to a 2,400X performance gain in real-world applications, demonstrating its proficiency across a wide range of graph computing scenarios with increased effectiveness