
    Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server

    In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance. Further, at the micro-architecture level, we observe memory-bound latency to be the major cause of work time inflation. Comment: Accepted to The 5th IEEE International Conference on Big Data and Cloud Computing (BDCloud 2015)

    Towards delay-aware container-based Service Function Chaining in Fog Computing

    Recently, the fifth-generation mobile network (5G) has been attracting significant attention. Empowered by Network Function Virtualization (NFV), 5G networks aim to support diverse services coming from different business verticals (e.g., Smart Cities, Automotive). To fully leverage NFV, services must be connected in a specific order, forming a Service Function Chain (SFC). SFCs allow mobile operators to benefit from the high flexibility and low operational costs introduced by network softwarization. Additionally, Cloud computing is evolving towards a distributed paradigm called Fog Computing, which aims to provide a distributed cloud infrastructure by placing computational resources close to end-users. However, most SFC research focuses only on Multi-access Edge Computing (MEC) use cases, where mobile operators aim to deploy services close to end-users. Bi-directional communication between Edges and Cloud is not considered in MEC, whereas it is highly important in a Fog environment, for example in distributed anomaly detection services. Therefore, in this paper, we propose an SFC controller to optimize the placement of service chains in Fog environments, specifically tailored for Smart City use cases. Our approach has been validated on the Kubernetes platform, an open-source orchestrator for the automatic deployment of micro-services. Our SFC controller has been implemented as an extension to the scheduling features available in Kubernetes, enabling the efficient provisioning of container-based SFCs while optimizing resource allocation and reducing the end-to-end (E2E) latency. Results show that the proposed approach can lower the network latency by up to 18% for the studied use case while conserving bandwidth when compared to the default scheduling mechanism.

    Steps in Metagenomics: Let’s Avoid Garbage in and Garbage Out

    Is metagenomics a revolution or a new fad? Metagenomics is tightly associated with the availability of next-generation sequencing in all its implementations. The key feature of these new technologies, moving beyond the Sanger-based DNA sequencing approach, is the depth of nucleotide sequencing per sample [1]. Knowing much more about a sample changes the traditional paradigms of “What is the most abundant?” or “What is the most significant?” to “What is present and potentially significant that might influence the situation and outcome?” Let’s take the case of identifying proper biomarkers of disease state in the context of chronic disease prevention. Prevention has been deemed a viable option to avert human chronic diseases and to curb healthcare management costs [2]. The actual implementation of any effective preventive measures has proven to be rather difficult. In addition to the typically poor compliance of the general public, the difficulty of validating the effect of habit modification on long-term risk points to the need to define new biomarkers of disease state. Scientists and the public are accepting the fact that humans are super-organisms, harboring both a human genome and a microbial genome, the latter being much bigger in size and diversity, and key for the health of individuals [3,4]. It is time to investigate the intricate relationship between humans and their associated microbiota and how this relationship modulates or affects both partners [5]. These remarks can be expanded to the animal and plant kingdoms, and holistically to the Earth’s biome. By its nature, the evolution and function of all the Earth’s biomes are influenced by a myriad of interactions between and among microbes (planktonic, in biofilms or host associated) and the surrounding physical environment.
The general definition of metagenomics is the cultivation-independent analysis of the genetic information of the collective genomes of the microbes within a given environment based on its sampling. It focuses on the collection of genetic information through sequencing that can target DNA, RNA, or both. The subsequent analyses can be solely focused on sequence conservation, phylogenetic, phylogenomic, function, or genetic diversity representation including yet-to-be annotated genes. The diversity of hypotheses, questions, and goals to be accomplished is endless. The primary design is based on the nature of the material to be analyzed and its primary function.

    E-Journals and the Big Deal: A Review of the Literature

    Faced with shrinking budgets and increased subscription prices, many academic libraries are seeking ways to reduce the cost of e-journal access. A common target for cuts is the “Big Deal,” or large bundled subscription model, a term coined by Kenneth Frazier in a 2001 paper criticizing the effects of the Big Deal on the academic community. The purpose of this literature review is to examine issues related to reducing e-journal costs, including criteria for subscription retention or cancellation, decision-making strategies, impacts of cancellations, and other options for e-journal content provision. Commonly used criteria for decision-making include usage statistics, overlap analysis, and input from subject specialists. The most commonly used strategy for guiding the process and aggregating data is the rubric or decision grid. While the e-journal landscape supports several access models, such as Pay-Per-View, cloud access, and interlibrary loan, the Big Deal continues to dominate. Trends over the past several years, however, point to dwindling support for the Big Deal, due largely to significant annual rate increases and loss of content control.

    Towards co-designed optimizations in parallel frameworks: A MapReduce case study

    The explosion of Big Data was followed by the proliferation of numerous complex parallel software stacks whose aim is to tackle the challenges of the data deluge. A drawback of such a multi-layered hierarchical deployment is the inability to maintain and delegate vital semantic information between layers in the stack. Software abstractions increase the semantic distance between an application and its generated code. However, parallel software frameworks contain inherent semantic information that general-purpose compilers are not designed to exploit. This paper presents a case study demonstrating how the specific semantic information of the MapReduce paradigm can be exploited on multicore architectures. MR4J has been implemented in Java and evaluated against hand-optimized C and C++ equivalents. The initial observed results led to the design of a semantically aware optimizer that runs automatically without requiring modification to application code. The optimizer is able to speed up the execution of MR4J by up to 2.0x. The introduced optimization not only improves the performance of the generated code during the map phase, but also reduces the pressure on the garbage collector. This demonstrates how semantic information can be harnessed without sacrificing sound software engineering practices when using parallel software frameworks. Comment: 8 pages