    Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP

    Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of the scheduling kind and chunk parameter on a per-loop, per-application, and per-system basis from a portfolio of advanced scheduling algorithms (Korndörfer et al., 2022). This selection approach is time-consuming, challenging, and may need to change during execution. We propose Auto4OMP, a novel approach for automated load balancing of OpenMP applications. With Auto4OMP, we introduce three scheduling algorithm selection methods and an expert-defined chunk parameter for the OpenMP schedule clause's kind and chunk, respectively. Auto4OMP extends the schedule(auto) and chunk parameter implementation in LLVM's OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution. Loop characteristics are inferred in Auto4OMP from the loop's execution over the application's time steps. The experiments performed in this work show that Auto4OMP improves application performance by up to 11% compared to LLVM's schedule(auto) implementation and outperforms manual selection. Auto4OMP improves the performance of MPI+OpenMP applications by explicitly minimizing thread-level load imbalance and implicitly reducing process-level load imbalance.
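    A minimal C sketch of the choice Auto4OMP automates (the loop body and the chunk value 64 are hypothetical, and only schedule(auto) is routed to the automated selection):

        #include <math.h>

        #define N 100000

        void compute(double *a)
        {
            /* Automated path: with Auto4OMP, the runtime selects the
               scheduling algorithm and chunk parameter during execution. */
            #pragma omp parallel for schedule(auto)
            for (int i = 0; i < N; i++)
                a[i] = sin((double)i) * (i % 7);  /* irregular iteration cost */

            /* Manual path: dynamic self-scheduling with a hand-tuned chunk. */
            #pragma omp parallel for schedule(dynamic, 64)
            for (int i = 0; i < N; i++)
                a[i] += cos((double)i);
        }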

    The relationship between flexible working arrangements and employee performance among female insurance agents

    Insurance agents are intermediary salespeople for insurance companies and play an important role in providing financial advice (Hannah, 2011). Insurance agents work under flexible working arrangements in which they can set their own work schedules. Some of them meet clients during daytime business hours, while others prepare paperwork and provide consultations for clients in the evening. Most of them work 40 hours a week, and some agents work more than 40 hours (Hannah, 2011). The performance of insurance agents is very important for sustaining the insurance product brand. Performance evaluation among insurance agents usually depends on success or failure in achieving sales targets (Insurance Agent Job Overview, 2019). The process of selling insurance products takes time, as agents must approach as many clients as possible while their working hours remain irregular.

    Development of mobile agent framework in wireless sensor networks for multi-sensor collaborative processing

    Recent advances in processor, memory, and radio technology have enabled the production of tiny, low-power, low-cost sensor nodes capable of sensing, communication, and computation. Although a single node is resource-constrained, with limited power, computation, and communication bandwidth, these nodes deployed in large numbers form a new type of network called the wireless sensor network (WSN). One of the challenges brought by WSNs is finding an efficient computing paradigm that supports the distributed nature of the applications built on these networks while respecting the resource limitations of the sensor nodes. Collaborative processing between multiple sensor nodes is essential to generate fault-tolerant, reliable information from the densely sensed spatial phenomenon. The typical model used in distributed computing is the client/server model; however, this computing model is not appropriate in the context of sensor networks. This thesis develops an energy-efficient, scalable, and real-time computing model for collaborative processing in sensor networks called the mobile agent computing paradigm. In this paradigm, instead of each sensor node sending data or results to a central server, as is typical in the client/server model, the information-processing code is moved to the nodes using mobile agents. These agents carry the execution code and migrate from one node to another, integrating results at each node. This thesis develops the mobile agent framework on top of an energy-efficient routing protocol called directed diffusion. The mobile agent framework described has been mapped to a collaborative target classification application. This application has been tested in three field demos conducted at Twentynine Palms, CA; BAE Austin, TX; and BBN Waltham, MA.
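    A toy C sketch of the paradigm under assumed names (SensorNode, MobileAgent, and fuse_max are illustrative, not the framework's API): the agent carries its processing code and an integrated result from node to node, instead of every node shipping raw data to a server.

        #include <stdio.h>

        typedef struct { int node_id; double reading; } SensorNode;

        typedef struct {
            double result;                       /* integrated result so far */
            double (*process)(double, double);   /* code the agent carries */
        } MobileAgent;

        /* Example fusion step: keep the strongest detection seen on the tour. */
        static double fuse_max(double acc, double sample)
        {
            return sample > acc ? sample : acc;
        }

        int main(void)
        {
            SensorNode route[] = { {1, 0.42}, {2, 0.91}, {3, 0.17} };
            MobileAgent agent = { 0.0, fuse_max };

            /* "Migration": the agent executes at each node, then hops on. */
            for (int hop = 0; hop < 3; hop++)
                agent.result = agent.process(agent.result, route[hop].reading);

            printf("integrated result after tour: %.2f\n", agent.result);
            return 0;
        }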

    SQC: secure quality control for meta-analysis of genome-wide association studies.

    Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish larger consortia in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce the false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks; hence, privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate its efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, a modest cost compared with the roughly 10-month span, including logistics, typically observed for completing the QC procedure. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. Supplementary data are available at Bioinformatics online.
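    A toy C sketch of one building block behind such protocols, additive masking, assuming the two studies share a pairwise random mask; the actual SQC protocol relies on substantially stronger cryptographic machinery.

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            uint64_t study_a = 1532, study_b = 2087; /* hypothetical counts */
            uint64_t mask = 987654321;               /* shared random mask  */

            /* Each study blinds its value before release; the masks cancel
               in the sum, so only the aggregate becomes visible. */
            uint64_t blinded_a = study_a + mask;
            uint64_t blinded_b = study_b - mask;     /* wraps modulo 2^64 */

            uint64_t aggregate = blinded_a + blinded_b; /* study_a + study_b */
            printf("aggregate: %llu\n", (unsigned long long)aggregate);
            return 0;
        }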

    Neural Architecture Search as Multiobjective Optimization Benchmarks: Problem Formulation and Performance Assessment

    The ongoing advancements in network architecture design have led to remarkable achievements in deep learning across various challenging computer vision tasks. Meanwhile, the development of neural architecture search (NAS) has provided promising approaches to automating the design of network architectures for lower prediction error. Recently, the emerging application scenarios of deep learning have raised higher demands for network architectures considering multiple design criteria: number of parameters/floating-point operations and inference latency, among others. From an optimization point of view, NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems; hence, it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms for tackling them. Nonetheless, there is still a clear gap confining the related research along this pathway: on the one hand, there is a lack of a general problem formulation of NAS tasks from an optimization point of view; on the other hand, there are challenges in conducting benchmark assessments of EMO algorithms on NAS tasks. To bridge the gap: (i) we formulate NAS tasks into general multiobjective optimization problems and analyze their complex characteristics from an optimization point of view; (ii) we present an end-to-end pipeline, dubbed EvoXBench, to generate benchmark test problems for EMO algorithms to run efficiently, without the requirement of GPUs or PyTorch/TensorFlow; (iii) we instantiate two test suites comprehensively covering two datasets, seven search spaces, and three hardware devices, involving up to eight objectives. Based on the above, we validate the proposed test suites using six representative EMO algorithms and provide some empirical analyses. The code of EvoXBench is available at https://github.com/EMI-Group/EvoXBench.
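    The general formulation alluded to can be stated as the multiobjective problem (the notation here is generic, not necessarily EvoXBench's):

        \min_{x \in \Omega} \; F(x) = \big( f_1(x),\, f_2(x),\, \dots,\, f_m(x) \big),

    where x encodes an architecture in the search space \Omega and each objective f_i is a criterion such as prediction error, parameter count, FLOPs, or inference latency; a solution is Pareto-optimal if no other architecture improves one objective without worsening another.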

    A Global Clustering Algorithm to Identify Long Intergenic Non-Coding RNA - with Applications in Mouse Macrophages

    Identification of diffuse signals from the chromatin immunoprecipitation and high-throughput massively parallel sequencing (ChIP-Seq) technology poses significant computational challenges, and few methods are currently available. We present a novel global clustering approach to enrich diffuse ChIP-Seq signals of RNA polymerase II and histone 3 lysine 4 trimethylation (H3K4Me3) and apply it to identify putative long intergenic non-coding RNAs (lincRNAs) in macrophage cells. Our global clustering method compares favorably to the local clustering method SICER, which was also designed to identify diffuse ChIP-Seq signals. The validity of the algorithm is confirmed at several levels. First, 8 out of a total of 11 selected putative lincRNA regions in primary macrophages respond to lipopolysaccharide (LPS) treatment as predicted by our computational method. Second, the genes nearest to lincRNAs are enriched with biological functions related to metabolic processes under resting conditions but with developmental and immune-related functions under LPS treatment. Third, the putative lincRNAs have conserved promoters, modestly conserved exons, and predicted secondary structures as expected. Last, they are enriched with motifs of transcription factors such as PU.1 and AP.1, previously shown to be important lineage-determining factors in macrophages, and 83% of them overlap with distal enhancer markers. In summary, the global clustering method (GCLS) based on RNA polymerase II and H3K4Me3 ChIP-Seq data can effectively detect putative lincRNAs that exhibit the expected characteristics, as exemplified by macrophages in this study.
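    An illustrative C sketch, not the paper's GCLS algorithm itself: consecutive windows whose read counts pass an enrichment threshold are merged into one broad cluster, the kind of diffuse-signal grouping described above (the window counts and threshold are made up).

        #include <stdio.h>

        #define WINDOWS 12

        int main(void)
        {
            int counts[WINDOWS] = {1, 8, 9, 7, 2, 1, 6, 7, 1, 0, 9, 8};
            int threshold = 5, start = -1;

            /* One pass: open a cluster at the first enriched window and
               close it at the first non-enriched one (or at the end). */
            for (int i = 0; i <= WINDOWS; i++) {
                int enriched = (i < WINDOWS) && (counts[i] >= threshold);
                if (enriched && start < 0)
                    start = i;
                else if (!enriched && start >= 0) {
                    printf("cluster: windows %d-%d\n", start, i - 1);
                    start = -1;
                }
            }
            return 0;
        }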

    Machine Tool Communication (MTComm) Method and Its Applications in a Cyber-Physical Manufacturing Cloud

    The integration of cyber-physical systems and cloud manufacturing has the potential to revolutionize existing manufacturing systems by enabling better accessibility, agility, and efficiency. To achieve this, it is necessary to establish a communication method for manufacturing services over the Internet to access and manage physical machines from cloud applications. Most existing industrial automation protocols utilize Ethernet-based Local Area Networks (LANs) and are not designed specifically for Internet-enabled data transmission. Recently, MTConnect has been gaining popularity as a standard for monitoring the status of machine tools through RESTful web services and an XML-based messaging structure, but it is designed only for data collection and interpretation and lacks remote operation capability. This dissertation presents the design, development, optimization, and applications of a service-oriented Internet-scale communication method named Machine Tool Communication (MTComm) for exchanging manufacturing services in a Cyber-Physical Manufacturing Cloud (CPMC) to enable manufacturing with heterogeneous, physically connected machine tools from geographically distributed locations over the Internet. MTComm uses an agent-adapter based architecture and a semantic ontology to provide both remote monitoring and operation capabilities through RESTful services and XML messages. MTComm was successfully used to develop and implement multi-purpose applications in a CPMC, including remote and collaborative manufacturing, active testing-based and edge-based fault diagnosis and maintenance of machine tools, and cross-domain interoperability between Internet-of-Things (IoT) devices and supply chain robots. To improve MTComm's overall performance, efficiency, and acceptability in cyber manufacturing, the concept of MTComm's edge-based middleware was introduced, and three optimization strategies for data caching, transmission, and operation execution were developed and adopted at the edge. Finally, a hardware prototype of the middleware was implemented on a System-on-Chip based FPGA device to reduce computational and transmission latency. At every stage of its development, MTComm's performance and feasibility were evaluated with experiments in a CPMC testbed with three different types of manufacturing machine tools. Experimental results demonstrated MTComm's excellent feasibility for scalable cyber-physical manufacturing and superior performance over other existing approaches.
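    A hypothetical C client for the monitoring side, using libcurl; the host and endpoint path are assumptions for illustration, not MTComm's actual API, though MTConnect-style agents expose similar XML-over-HTTP resources.

        #include <stdio.h>
        #include <curl/curl.h>

        int main(void)
        {
            CURL *curl = curl_easy_init();
            if (!curl)
                return 1;

            /* Request the current machine-tool status as XML; libcurl's
               default write callback prints the response body to stdout. */
            curl_easy_setopt(curl, CURLOPT_URL,
                             "http://cpmc-agent.example/current");
            CURLcode rc = curl_easy_perform(curl);

            if (rc != CURLE_OK)
                fprintf(stderr, "request failed: %s\n",
                        curl_easy_strerror(rc));
            curl_easy_cleanup(curl);
            return rc == CURLE_OK ? 0 : 1;
        }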

    Operating System Contribution to Composable Timing Behaviour in High-Integrity Real-Time Systems

    The development of high-integrity real-time systems has a high footprint in terms of human, material, and schedule costs. Factoring functional, reusable logic in the application favors incremental development and contains costs. Yet, achieving incrementality in the timing behavior is a much harder problem. Complex features at all levels of the execution stack, aimed at boosting average-case performance, exhibit timing behavior highly dependent on execution history, which wrecks time composability, and incrementality with it. Our goal here is to restore time composability to the execution stack, working bottom-up across it. We first characterize time composability without making assumptions on the system architecture or the software deployed on it. Later, we focus on the role played by the real-time operating system in our pursuit. Initially we consider single-core processors and, becoming less permissive on the admissible hardware features, we devise solutions that restore a convincing degree of time composability. To show what can be done in practice, we developed TiCOS, an ARINC-compliant kernel, and re-designed ORK+, a kernel for Ada Ravenscar runtimes. In that work, we added support for limited preemption to ORK+, a first in the landscape of real-world kernels. Our implementation allows resource sharing to coexist with limited-preemptive scheduling, which extends the state of the art. We then turn our attention to multicore architectures, first considering partitioned systems, for which we achieve results close to those obtained for single-core processors. Subsequently, we shy away from the over-provisioning of those systems and consider less restrictive uses of homogeneous multiprocessors, where the scheduling algorithm is key to high schedulable utilization. To that end we single out RUN, a promising baseline, and extend it to SPRINT, which supports sporadic task sets and hence better matches real-world industrial needs. To corroborate our results, we present findings from real-world case studies from the avionics industry.
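    A conceptual C sketch of limited preemption, not ORK+ code: the scheduler raises a flag, and the running task honors it only at explicit preemption points, so the regions between points, where resources may be held, run without preemption.

        #include <stdbool.h>
        #include <stdio.h>

        static volatile bool preemption_pending = false; /* set by scheduler */

        static void preemption_point(void)
        {
            if (preemption_pending) {
                preemption_pending = false;
                /* here a kernel would dispatch the higher-priority task */
                printf("yielding at preemption point\n");
            }
        }

        static void task_body(void)
        {
            for (int region = 0; region < 4; region++) {
                /* non-preemptible region: safe to hold shared resources */
                preemption_point(); /* preemption allowed only here */
            }
        }

        int main(void)
        {
            preemption_pending = true; /* simulate a scheduler request */
            task_body();
            return 0;
        }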