    Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP

    Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of the scheduling kind and chunk parameter on a per-loop, per-application, and per-system basis from a portfolio of advanced scheduling algorithms (Korndörfer et al., 2022). This selection approach is time-consuming, challenging, and may need to change during execution. We propose Auto4OMP, a novel approach for automated load balancing of OpenMP applications. With Auto4OMP, we introduce three scheduling algorithm selection methods and an expert-defined chunk parameter for the OpenMP schedule clause's kind and chunk, respectively. Auto4OMP extends the schedule(auto) and chunk parameter implementation in LLVM's OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution. Loop characteristics are inferred in Auto4OMP from the loop's execution over the application's time steps. The experiments performed in this work show that Auto4OMP improves application performance by up to 11% compared to LLVM's schedule(auto) implementation and outperforms manual selection. Auto4OMP improves the performance of MPI+OpenMP applications by explicitly minimizing thread-level load imbalance and implicitly reducing process-level load imbalance.
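    A minimal C sketch of the choice Auto4OMP automates (the loop body and the chunk value 64 are hypothetical, and only schedule(auto) is routed to the automated selection):

        #include <math.h>

        #define N 100000

        void compute(double *a)
        {
            /* Automated path: with Auto4OMP, the runtime selects the
               scheduling algorithm and chunk parameter during execution. */
            #pragma omp parallel for schedule(auto)
            for (int i = 0; i < N; i++)
                a[i] = sin((double)i) * (i % 7);  /* irregular iteration cost */

            /* Manual path: dynamic self-scheduling with a hand-tuned chunk. */
            #pragma omp parallel for schedule(dynamic, 64)
            for (int i = 0; i < N; i++)
                a[i] += cos((double)i);
        }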

    The relationship between flexible working arrangements and employee performance among female insurance agents

    Insurance agents are intermediary salespeople for insurance companies and play an important role in providing financial advice (Hannah, 2011). Insurance agents work under flexible working arrangements in which they can set their own work schedules. Some of them meet clients during daytime business hours, while others prepare paperwork and provide consultations for clients in the evening. Most of them work 40 hours a week, and some agents work more than 40 hours (Hannah, 2011). The performance of insurance agents is very important for sustaining the insurance product brand. Performance evaluation among insurance agents usually depends on success or failure in achieving sales targets (Insurance Agent Job Overview, 2019). The process of selling insurance products takes time, as agents must approach as many clients as possible while their working hours remain irregular.

    Development of mobile agent framework in wireless sensor networks for multi-sensor collaborative processing

    Recent advances in processor, memory, and radio technology have enabled the production of tiny, low-power, low-cost sensor nodes capable of sensing, communication, and computation. Although a single node is resource-constrained, with limited power, computation, and communication bandwidth, these nodes deployed in large numbers form a new type of network called the wireless sensor network (WSN). One of the challenges brought by WSNs is finding an efficient computing paradigm that supports the distributed nature of the applications built on these networks while respecting the resource limitations of the sensor nodes. Collaborative processing between multiple sensor nodes is essential to generate fault-tolerant, reliable information from the densely sensed spatial phenomenon. The typical model used in distributed computing is the client/server model; however, this computing model is not appropriate in the context of sensor networks. This thesis develops an energy-efficient, scalable, and real-time computing model for collaborative processing in sensor networks called the mobile agent computing paradigm. In this paradigm, instead of each sensor node sending data or results to a central server, as is typical in the client/server model, the information-processing code is moved to the nodes using mobile agents. These agents carry the execution code and migrate from one node to another, integrating results at each node. This thesis develops the mobile agent framework on top of an energy-efficient routing protocol called directed diffusion. The mobile agent framework described has been mapped to a collaborative target classification application. This application has been tested in three field demos conducted at Twentynine Palms, CA; BAE Austin, TX; and BBN Waltham, MA.
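    A toy C sketch of the paradigm under assumed names (SensorNode, MobileAgent, and fuse_max are illustrative, not the framework's API): the agent carries its processing code and an integrated result from node to node, instead of every node shipping raw data to a server.

        #include <stdio.h>

        typedef struct { int node_id; double reading; } SensorNode;

        typedef struct {
            double result;                       /* integrated result so far */
            double (*process)(double, double);   /* code the agent carries */
        } MobileAgent;

        /* Example fusion step: keep the strongest detection seen on the tour. */
        static double fuse_max(double acc, double sample)
        {
            return sample > acc ? sample : acc;
        }

        int main(void)
        {
            SensorNode route[] = { {1, 0.42}, {2, 0.91}, {3, 0.17} };
            MobileAgent agent = { 0.0, fuse_max };

            /* "Migration": the agent executes at each node, then hops on. */
            for (int hop = 0; hop < 3; hop++)
                agent.result = agent.process(agent.result, route[hop].reading);

            printf("integrated result after tour: %.2f\n", agent.result);
            return 0;
        }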

    SQC: secure quality control for meta-analysis of genome-wide association studies.

    Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish larger consortia in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce the false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks; hence, privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate its efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, a modest cost compared with the roughly 10-month span, including logistics, typically observed for completing the QC procedure. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. Supplementary data are available at Bioinformatics online.
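    A toy C sketch of one building block behind such protocols, additive masking, assuming the two studies share a pairwise random mask; the actual SQC protocol relies on substantially stronger cryptographic machinery.

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            uint64_t study_a = 1532, study_b = 2087; /* hypothetical counts */
            uint64_t mask = 987654321;               /* shared random mask  */

            /* Each study blinds its value before release; the masks cancel
               in the sum, so only the aggregate becomes visible. */
            uint64_t blinded_a = study_a + mask;
            uint64_t blinded_b = study_b - mask;     /* wraps modulo 2^64 */

            uint64_t aggregate = blinded_a + blinded_b; /* study_a + study_b */
            printf("aggregate: %llu\n", (unsigned long long)aggregate);
            return 0;
        }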

    Neural Architecture Search as Multiobjective Optimization Benchmarks: Problem Formulation and Performance Assessment

    The ongoing advancements in network architecture design have led to remarkable achievements in deep learning across various challenging computer vision tasks. Meanwhile, the development of neural architecture search (NAS) has provided promising approaches to automating the design of network architectures for lower prediction error. Recently, the emerging application scenarios of deep learning have raised higher demands for network architectures considering multiple design criteria: number of parameters/floating-point operations and inference latency, among others. From an optimization point of view, NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems; hence, it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms for tackling them. Nonetheless, there is still a clear gap confining the related research along this pathway: on the one hand, there is a lack of a general problem formulation of NAS tasks from an optimization point of view; on the other hand, there are challenges in conducting benchmark assessments of EMO algorithms on NAS tasks. To bridge the gap: (i) we formulate NAS tasks into general multiobjective optimization problems and analyze their complex characteristics from an optimization point of view; (ii) we present an end-to-end pipeline, dubbed EvoXBench, to generate benchmark test problems for EMO algorithms to run efficiently, without the requirement of GPUs or PyTorch/TensorFlow; (iii) we instantiate two test suites comprehensively covering two datasets, seven search spaces, and three hardware devices, involving up to eight objectives. Based on the above, we validate the proposed test suites using six representative EMO algorithms and provide some empirical analyses. The code of EvoXBench is available at https://github.com/EMI-Group/EvoXBench.
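    The general formulation alluded to can be stated as the multiobjective problem (the notation here is generic, not necessarily EvoXBench's):

        \min_{x \in \Omega} \; F(x) = \big( f_1(x),\, f_2(x),\, \dots,\, f_m(x) \big),

    where x encodes an architecture in the search space \Omega and each objective f_i is a criterion such as prediction error, parameter count, FLOPs, or inference latency; a solution is Pareto-optimal if no other architecture improves one objective without worsening another.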

    A Global Clustering Algorithm to Identify Long Intergenic Non-Coding RNA - with Applications in Mouse Macrophages

    Identification of diffuse signals from the chromatin immunoprecipitation and high-throughput massively parallel sequencing (ChIP-Seq) technology poses significant computational challenges, and few methods are currently available. We present a novel global clustering approach to enrich diffuse ChIP-Seq signals of RNA polymerase II and histone 3 lysine 4 trimethylation (H3K4Me3) and apply it to identify putative long intergenic non-coding RNAs (lincRNAs) in macrophage cells. Our global clustering method compares favorably to the local clustering method SICER, which was also designed to identify diffuse ChIP-Seq signals. The validity of the algorithm is confirmed at several levels. First, 8 out of a total of 11 selected putative lincRNA regions in primary macrophages respond to lipopolysaccharide (LPS) treatment as predicted by our computational method. Second, the genes nearest to lincRNAs are enriched with biological functions related to metabolic processes under resting conditions but with developmental and immune-related functions under LPS treatment. Third, the putative lincRNAs have conserved promoters, modestly conserved exons, and predicted secondary structures as expected. Last, they are enriched with motifs of transcription factors such as PU.1 and AP.1, previously shown to be important lineage-determining factors in macrophages, and 83% of them overlap with distal enhancer markers. In summary, the global clustering method (GCLS) based on RNA polymerase II and H3K4Me3 ChIP-Seq data can effectively detect putative lincRNAs that exhibit the expected characteristics, as exemplified by macrophages in this study.
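    An illustrative C sketch, not the paper's GCLS algorithm itself: consecutive windows whose read counts pass an enrichment threshold are merged into one broad cluster, the kind of diffuse-signal grouping described above (the window counts and threshold are made up).

        #include <stdio.h>

        #define WINDOWS 12

        int main(void)
        {
            int counts[WINDOWS] = {1, 8, 9, 7, 2, 1, 6, 7, 1, 0, 9, 8};
            int threshold = 5, start = -1;

            /* One pass: open a cluster at the first enriched window and
               close it at the first non-enriched one (or at the end). */
            for (int i = 0; i <= WINDOWS; i++) {
                int enriched = (i < WINDOWS) && (counts[i] >= threshold);
                if (enriched && start < 0)
                    start = i;
                else if (!enriched && start >= 0) {
                    printf("cluster: windows %d-%d\n", start, i - 1);
                    start = -1;
                }
            }
            return 0;
        }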

    Machine Tool Communication (MTComm) Method and Its Applications in a Cyber-Physical Manufacturing Cloud

    The integration of cyber-physical systems and cloud manufacturing has the potential to revolutionize existing manufacturing systems by enabling better accessibility, agility, and efficiency. To achieve this, it is necessary to establish a communication method for manufacturing services over the Internet to access and manage physical machines from cloud applications. Most existing industrial automation protocols utilize Ethernet-based Local Area Networks (LANs) and are not designed specifically for Internet-enabled data transmission. Recently, MTConnect has been gaining popularity as a standard for monitoring the status of machine tools through RESTful web services and an XML-based messaging structure, but it is designed only for data collection and interpretation and lacks remote operation capability. This dissertation presents the design, development, optimization, and applications of a service-oriented Internet-scale communication method named Machine Tool Communication (MTComm) for exchanging manufacturing services in a Cyber-Physical Manufacturing Cloud (CPMC) to enable manufacturing with heterogeneous, physically connected machine tools from geographically distributed locations over the Internet. MTComm uses an agent-adapter based architecture and a semantic ontology to provide both remote monitoring and operation capabilities through RESTful services and XML messages. MTComm was successfully used to develop and implement multi-purpose applications in a CPMC, including remote and collaborative manufacturing, active testing-based and edge-based fault diagnosis and maintenance of machine tools, and cross-domain interoperability between Internet-of-Things (IoT) devices and supply chain robots. To improve MTComm's overall performance, efficiency, and acceptability in cyber manufacturing, the concept of MTComm's edge-based middleware was introduced, and three optimization strategies for data caching, transmission, and operation execution were developed and adopted at the edge. Finally, a hardware prototype of the middleware was implemented on a System-on-Chip based FPGA device to reduce computational and transmission latency. At every stage of its development, MTComm's performance and feasibility were evaluated with experiments in a CPMC testbed with three different types of manufacturing machine tools. Experimental results demonstrated MTComm's excellent feasibility for scalable cyber-physical manufacturing and superior performance over other existing approaches.
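    A hypothetical C client for the monitoring side, using libcurl; the host and endpoint path are assumptions for illustration, not MTComm's actual API, though MTConnect-style agents expose similar XML-over-HTTP resources.

        #include <stdio.h>
        #include <curl/curl.h>

        int main(void)
        {
            CURL *curl = curl_easy_init();
            if (!curl)
                return 1;

            /* Request the current machine-tool status as XML; libcurl's
               default write callback prints the response body to stdout. */
            curl_easy_setopt(curl, CURLOPT_URL,
                             "http://cpmc-agent.example/current");
            CURLcode rc = curl_easy_perform(curl);

            if (rc != CURLE_OK)
                fprintf(stderr, "request failed: %s\n",
                        curl_easy_strerror(rc));
            curl_easy_cleanup(curl);
            return rc == CURLE_OK ? 0 : 1;
        }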

    Operating System Contribution to Composable Timing Behaviour in High-Integrity Real-Time Systems

    The development of high-integrity real-time systems has a high footprint in terms of human, material, and schedule costs. Factoring functional, reusable logic in the application favors incremental development and contains costs. Yet, achieving incrementality in the timing behavior is a much harder problem. Complex features at all levels of the execution stack, aimed at boosting average-case performance, exhibit timing behavior highly dependent on execution history, which wrecks time composability, and incrementality with it. Our goal here is to restore time composability to the execution stack, working bottom-up across it. We first characterize time composability without making assumptions on the system architecture or the software deployed on it. Later, we focus on the role played by the real-time operating system in our pursuit. Initially we consider single-core processors and, becoming less permissive on the admissible hardware features, we devise solutions that restore a convincing degree of time composability. To show what can be done in practice, we developed TiCOS, an ARINC-compliant kernel, and re-designed ORK+, a kernel for Ada Ravenscar runtimes. In that work, we added support for limited preemption to ORK+, a first in the landscape of real-world kernels. Our implementation allows resource sharing to coexist with limited-preemptive scheduling, which extends the state of the art. We then turn our attention to multicore architectures, first considering partitioned systems, for which we achieve results close to those obtained for single-core processors. Subsequently, we shy away from the over-provisioning of those systems and consider less restrictive uses of homogeneous multiprocessors, where the scheduling algorithm is key to high schedulable utilization. To that end we single out RUN, a promising baseline, and extend it to SPRINT, which supports sporadic task sets and hence better matches real-world industrial needs. To corroborate our results, we present findings from real-world case studies from the avionics industry.
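    A conceptual C sketch of limited preemption, not ORK+ code: the scheduler raises a flag, and the running task honors it only at explicit preemption points, so the regions between points, where resources may be held, run without preemption.

        #include <stdbool.h>
        #include <stdio.h>

        static volatile bool preemption_pending = false; /* set by scheduler */

        static void preemption_point(void)
        {
            if (preemption_pending) {
                preemption_pending = false;
                /* here a kernel would dispatch the higher-priority task */
                printf("yielding at preemption point\n");
            }
        }

        static void task_body(void)
        {
            for (int region = 0; region < 4; region++) {
                /* non-preemptible region: safe to hold shared resources */
                preemption_point(); /* preemption allowed only here */
            }
        }

        int main(void)
        {
            preemption_pending = true; /* simulate a scheduler request */
            task_body();
            return 0;
        }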