
3,029 research outputs found

Scheduling with processing set restrictions : a survey

Author: Leung JY
Li CL
Publication venue: 'Elsevier BV'
Publication date: 11/12/2014

2008-2009 > Academic research: refereed > Publication in refereed journal; Accepted Manuscript; Publishe

PolyU Institutional Repository

Scheduling parallel machines with inclusive processing set restrictions and job release times

Author: Li CL
Wang X
Publication venue: 'Elsevier BV'
Publication date: 11/12/2014

2009-2010 > Academic research: refereed > Publication in refereed journal; Accepted Manuscript; Publishe

PolyU Institutional Repository

Data-Driven Intelligent Scheduling For Long Running Workloads In Large-Scale Datacenters

Author: Xu Guoyao
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2019

Cloud computing is becoming a fundamental facility of society today. Large-scale public or private cloud datacenters spanning millions of servers, operating as warehouse-scale computers, support most of the business of Fortune 500 companies and serve billions of users around the world. Unfortunately, modern industry-wide average datacenter utilization is as low as 6% to 12%. Low utilization not only negatively impacts the operational and capital components of cost efficiency, but also becomes the scaling bottleneck due to the limits of electricity delivered by nearby utilities. Improving multi-resource efficiency for global datacenters is both critical and challenging. Additionally, with the great commercial success of diverse big data analytics services, enterprise datacenters are evolving to host heterogeneous computation workloads including online web services, batch processing, machine learning, streaming computing, interactive query and graph computation on shared clusters. Most of them are long-running workloads that leverage long-lived containers to execute tasks. We summarize work on datacenter resource scheduling over the last 15 years. Most previous work is designed to maximize cluster efficiency for short-lived tasks in batch processing systems like Hadoop, and is not suitable for modern long-running workloads in systems such as microservices, Spark, Flink, Pregel, Storm or TensorFlow. It is urgent to develop new effective scheduling and resource allocation approaches to improve efficiency in large-scale enterprise datacenters. In the dissertation, we are the first to define and identify the problems, challenges and scenarios of scheduling and resource management for diverse long-running workloads in modern datacenters. They rely on predictive scheduling techniques to perform reservation, auto-scaling, migration or rescheduling. This forces us to pursue more intelligent scheduling techniques backed by adequate predictive knowledge.
We specify what intelligent scheduling is, what abilities it requires, and how to leverage it to transform NP-hard online scheduling problems into resolvable offline scheduling problems. We designed and implemented an intelligent cloud datacenter scheduler, which automatically performs resource-to-performance modeling, predictive optimal reservation estimation, and QoS (interference)-aware predictive scheduling to maximize resource efficiency across multiple dimensions (CPU, memory, network, disk I/O) while strictly guaranteeing service level agreements (SLAs) for long-running workloads. Finally, we introduced a large-scale co-location technique for executing long-running and other workloads on the shared global datacenter infrastructure of Alibaba Group. It effectively improves cluster utilization from 10% to 50% on average. This goes far beyond scheduling and involves the evolution of IDC, network, physical datacenter topology, storage, server hardware, operating systems and containerization. We demonstrate its effectiveness by analyzing the Alibaba public cluster trace from 2017. We are the first to reveal a global view of the scenarios, challenges and status of Alibaba's large-scale global datacenters through data, including big promotion events like Double 11. Data-driven intelligent scheduling methodologies and effective infrastructure co-location techniques are critical and necessary to maximize multi-resource efficiency in modern large-scale datacenters, especially for long-running workloads

Digital Commons@Wayne State University

ProQuest OAI Repository

Complex scheduling models and analyses for property-based real-time embedded systems

Author: Ueter Niklas
Publication date: 01/01/2023

Modern multi-core architectures and parallel applications pose a significant challenge to worst-case-centric real-time system verification and design efforts. The model and parameter uncertainty involved challenges the fidelity of formal real-time analyses, which are mostly based on exact model assumptions. In this dissertation, various approaches that can accommodate parameter and model uncertainty are presented. In an attempt to improve predictability in worst-case-centric analyses, timing-predictable protocols are explored for parallel task scheduling on multiprocessors and network-on-chip arbitration. A novel scheduling algorithm for gang tasks on multiprocessors, called stationary rigid gang scheduling, is proposed. For fixed-priority wormhole-switched networks-on-chip, a more restrictive family of transmission protocols with predictability-enhancing properties, called simultaneous progression switching protocols, is proposed. Moreover, hierarchical scheduling for parallel DAG tasks under parameter uncertainty is studied to achieve temporal and spatial isolation. Fault tolerance, as a supplementary reliability aspect of real-time systems, is examined despite dynamic external causes of faults. Using various job variants, which trade increased execution time demand for increased error protection, a state-based policy selection strategy is proposed that provably assures an acceptable quality of service (QoS). Lastly, the temporal misalignment of sensor data in sensor fusion applications in cyber-physical systems is examined. A modular analysis based on minimal properties is proposed to obtain an upper bound on the maximal sensor data time-stamp difference

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

Author: Chen Kwang-Cheng
Hanzo Lajos
Jiang Chunxiao
Ren Yong
Wang Jingjing
Zhang Haijun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/01/2019

Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks. Comment: 46 pages, 22 fig

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Efficient similarity computations on parallel machines using data shaping

Author: Shukla Parijat
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017

Similarity computation is a fundamental operation in all forms of data. Big Data is typically characterized by attributes such as volume, velocity, variety, veracity, etc. In general, Big Data variety appears in structured, semi-structured or unstructured forms. The volume of Big Data in general, and semi-structured data in particular, is increasing at a phenomenal rate. The Big Data phenomenon poses a new set of challenges to similarity computation problems occurring in semi-structured data. Technology and processor architecture trends suggest very strongly that future processors will have tens of thousands of cores (hardware threads). Another crucial trend is that the ratio of on-chip and off-chip memory to core count is decreasing. State-of-the-art parallel computing platforms such as General Purpose Graphics Processors (GPUs) and MICs are promising for high performance as well as high throughput computing. However, processing the semi-structured component of Big Data efficiently using parallel computing systems (e.g. GPUs) is challenging. The reason is that most of the emerging platforms (e.g. GPUs) are organized as Single Instruction Multiple Thread/Data machines, which are highly structured, where several cores (streaming processors) operate in a lock-step manner, or they require a high degree of task-level parallelism. We argue that effective and efficient solutions to key similarity computation problems need to operate in a synergistic manner with the underlying computing hardware. Moreover, semi-structured input data needs to be shaped or reorganized with the goal of exploiting the enormous computing power of state-of-the-art highly threaded architectures such as GPUs. For example, shaping input data (via encoding) with minimal data dependence can facilitate flexible and concurrent computations on high throughput accelerators/co-processors such as GPU, MIC, etc.
We consider various instances of traditional and futuristic problems occurring at the intersection of semi-structured data and data analytics. Preprocessing is an operation common at the initial stages of data processing pipelines. Typically, preprocessing involves operations such as data extraction, data selection, etc. In the context of semi-structured data, twig filtering is used to identify (and extract) data of interest. Duplicate detection and record linkage operations are useful in preprocessing tasks such as data cleaning and data fusion, and also in data mining, etc., in order to find similar tree objects. Likewise, tree edit is a fundamental metric used in the context of tree problems, and similarity computation between trees is another key problem in the context of Big Data. This dissertation makes a case for platform-centric data shaping as a potent mechanism to tackle the data- and architecture-borne issues in the context of semi-structured data processing on GPU and GPU-like parallel architecture machines. In this dissertation, we propose several data shaping techniques for tree matching problems occurring in semi-structured data. We experiment with real-world datasets. The experimental results reveal that the proposed platform-centric data shaping approach is effective for computing similarities between tree objects using GPGPUs. The proposed techniques result in performance gains of up to three orders of magnitude, subject to problem and platform

Digital Repository @ Iowa State University (ISU)

Evolving Clustering Algorithms And Their Application For Condition Monitoring, Diagnostics, & Prognostics

Author: Tseng Fling Finn
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2017

Applications of Condition-Based Maintenance (CBM) technology require effective yet generic data-driven methods capable of carrying out diagnostics and prognostics tasks without detailed domain knowledge or human intervention. With the widespread deployment of CBM, improved system availability, operational safety, and enhanced logistics and supply chain performance could be achieved at a lower cost. This dissertation focuses on the development of a Mutual Information based Recursive Gustafson-Kessel-Like (MIRGKL) clustering algorithm, which operates recursively to identify the underlying model structure and parameters from stream-type data. Inspired by the Evolving Gustafson-Kessel-like Clustering (eGKL) algorithm, we applied the notion of mutual information to the well-known Mahalanobis distance as the governing similarity measure throughout. This is also a special case of the Kullback-Leibler (KL) divergence in which between-cluster shape information (governed by the determinant and trace of the covariance matrix) is omitted, and it is only applicable in the case of normally distributed data. In the cluster assignment and consolidation process, we proposed the use of the Chi-square statistic with the provision of having different probability thresholds. Due to the symmetry and boundedness properties brought in by the mutual information formulation, we have shown with real-world data that the algorithm’s performance becomes less sensitive to the same range of probability thresholds, which makes system tuning a simpler task in practice. As a result, the improvement demonstrated by the proposed algorithm has implications for generic data-driven methods for diagnostics, prognostics, generic function approximation and knowledge extraction from stream-type data. The work in this dissertation demonstrates MIRGKL’s effectiveness in clustering and knowledge representation and shows promising results in diagnostics and prognostics applications
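The abstract above mentions cluster assignment based on the Mahalanobis distance and a Chi-square probability threshold. As a rough illustration only, the following sketch shows the plain, textbook version of that gate for 2-D data. It is not the MIRGKL algorithm and it omits the mutual information formulation entirely. All function names and the default threshold of 0.95 are assumptions made for this example. The closed-form quantile used here holds only for two degrees of freedom.

```python
import math

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a cluster.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    (Hypothetical helper, not taken from the dissertation.)
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the 2x2 inverse: (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def chi2_threshold_2d(prob):
    """Chi-square quantile for 2 degrees of freedom.

    For k = 2 the CDF is 1 - exp(-x / 2), so the quantile is -2 * ln(1 - prob).
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must be in (0, 1)")
    return -2.0 * math.log(1.0 - prob)

def belongs_to_cluster(x, mean, cov, prob=0.95):
    """Accept x into the cluster if it lies inside the prob-level ellipse."""
    return mahalanobis_sq(x, mean, cov) <= chi2_threshold_2d(prob)

if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    print(belongs_to_cluster((1.0, 1.0), (0.0, 0.0), identity))
    print(belongs_to_cluster((3.0, 0.0), (0.0, 0.0), identity))
```

Under the Gaussian assumption the abstract notes, raising `prob` widens the acceptance ellipse, so fewer points spawn new clusters. This is the tuning sensitivity the abstract says the mutual information formulation reduces.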

Digital Commons@Wayne State University

Production Engineering and Management

Author: Department of Production Engineering and Management
Hochschule Ostwestfalen-Lippe
Padoano Elio
Villmer Franz-Josef
Publication date: 01/01/2016

The annual International Conference on Production Engineering and Management takes place for the sixth time this year, and can therefore be considered a well-established event that is the result of the joint effort of the OWL University of Applied Sciences and the University of Trieste. The conference has been established as an annual meeting under the Double Degree Master Program ‘Production Engineering and Management’ by the two partner universities. The main goal of the conference is to provide an opportunity for students, researchers and professionals from Germany, Italy and abroad to meet and exchange information, and to discuss experiences, specific practices and technical solutions used in the planning, design and management of production and service systems. In addition, the conference is a platform aimed at presenting research projects, introducing young academics to the tradition of symposiums and promoting the exchange of ideas between industry and academia. In particular, contributions from successful graduates of the Double Degree Master Program ‘Production Engineering and Management’ and from other postgraduate researchers from several European countries have been encouraged. This year’s special focus is on Direct Digital Manufacturing in the context of Industry 4.0, a topic of great interest for global industry. The concept is spreading, but actual solutions must be presented in order to highlight the practical benefits to industry and customers. Indeed, as Henning Banthien, Secretary General of the German ‘Plattform Industrie 4.0’ project office, has recently remarked, “Industry 4.0 requires a close alliance amongst the private sector, academia, politics and trade unions” in order to be “translated into practice and be implemented now”. PEM 2016 takes place on September 29 and 30, 2016 at the OWL University of Applied Sciences in Lemgo.
The program is defined by the Organizing and Scientific Committees and clustered into scientific sessions covering topics of main interest and importance to the participants of the conference. The scientific sessions deal with technical and engineering issues, as well as management topics, and include contributions by researchers from academia and industry. The extended abstracts and full papers of the contributions underwent a double-blind review process. The 24 accepted presentations are assigned, according to their subject, to one of the following sessions: ‘Direct Digital Manufacturing in the Context of Industry 4.0’, ‘Industrial Engineering and Lean Management’, ‘Management Techniques and Methodologies’, ‘Wood Processing Technologies and Furniture Production’ and ‘Innovation Techniques and Methodologies’

Publikationen an der Technischen Hochschule Ostwestfalen-Lippe

Space Station Freedom data management system growth and evolution report

Author: Bartlett R.
Davis G.
Gibson J.
Grant T. L.
Hedges R.
Johnson M. J.
Liu Y. K.
Patterson-Hine A.
Sliwa N.
Sowizral H.

The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). This study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues have been raised that were discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified and suggestions for long-term upgrades have been proposed. This activity has allowed the Ames personnel to develop a rapport with the JSC civil service and contractor teams that does permit an independent check and balance technique for the DMS

NASA Technical Reports Server