A Cloud-based Framework for Shop Floor Big Data Management and Elastic Computing Analytics
Advanced digitalization, together with the rise of disruptive Internet technologies, is a key enabler of a fundamental paradigm shift observed in industrial production. This shift is known as the fourth industrial revolution (Industry 4.0), which proposes the integration of a new generation of ICT solutions for the monitoring, adaptation, simulation, and optimisation of factories. With the democratization of sensors and actuators, factories and machine tools can now be sensorized, and the data generated by these devices can be exploited, for instance, to optimise the utilization of machines as well as their operation and maintenance. However, analyzing the vast amount of generated data is resource demanding both in terms of computing power and network bandwidth, thus requiring highly scalable solutions. This paper presents a novel big data approach and analytics framework for the management and analysis of machine-generated data in the cloud. It brings together standard open source technologies and the exploitation of elastic computing, which, as a whole, can be adapted to and deployed on different cloud computing platforms. This reduces infrastructure costs, minimizes deployment difficulty, and provides on-demand access to a virtually infinite pool of computing, storage, and network resources.
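The elastic computing the abstract refers to boils down to scaling resources with demand. As a rough, hypothetical illustration (the function, rates, and bounds below are assumptions, not the paper's design), an autoscaler might size a worker pool to the current ingest backlog:

```python
import math

# Hypothetical autoscaling rule in the spirit of elastic computing analytics:
# size the worker pool to the message backlog, within cloud-imposed bounds.
# All names and numbers here are illustrative assumptions.
def target_workers(backlog_msgs: int, per_worker_rate: int,
                   min_workers: int = 1, max_workers: int = 100) -> int:
    needed = math.ceil(backlog_msgs / per_worker_rate)
    return max(min_workers, min(max_workers, needed))

print(target_workers(5000, 250))  # 20 workers absorb a 5,000-message backlog
```

In practice a cloud platform bounds such decisions by quota and budget, which is what the min/max clamp stands in for.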
Resource optimization of edge servers dealing with priority-based workloads by utilizing service level objective-aware virtual rebalancing
IoT enables profitable communication between sensor/actuator devices and the cloud, but slow networks leave Edge data without Cloud analytics, hindering the adoption of real-time analytics. VRebalance addresses priority-based workload performance for stream processing at the Edge. It uses Bayesian optimization (BO) to prioritize workloads and find optimal resource configurations for efficient resource management. The Apache Storm platform, together with the RIoTBench IoT benchmark tool for real-time stream processing, was used to evaluate VRebalance. The study shows that VRebalance is more effective than traditional methods, meeting SLO targets despite system changes. Compared to a hill-climbing algorithm, VRebalance decreased SLO violation rates by almost 30% for static priority-based workloads and 52.2% for dynamic priority-based workloads. Compared to Apache Storm's default allocation, using VRebalance decreased SLO violations by 66.1%.
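As a loose sketch of the idea (not VRebalance's actual algorithm, which searches configurations with Bayesian optimization over measured performance), a priority-aware rebalancer can assign workers to the highest-priority workloads first until their SLO targets are met. The latency model and all names below are illustrative assumptions:

```python
# Hypothetical priority-aware rebalancing sketch; names and the latency
# model are illustrative, not taken from the VRebalance paper.

def estimated_latency(workers: int, load: float) -> float:
    """Toy latency model: latency falls as more workers absorb the load."""
    return load / max(workers, 1)

def rebalance(workloads, total_workers):
    """Greedily grant workers, highest-priority workload first, until each
    workload's estimated latency meets its SLO or workers run out."""
    plan = {name: 1 for name, _, _, _ in workloads}  # one worker each to start
    remaining = total_workers - len(workloads)
    # Lower priority number = higher priority.
    for name, priority, load, slo in sorted(workloads, key=lambda w: w[1]):
        while remaining > 0 and estimated_latency(plan[name], load) > slo:
            plan[name] += 1
            remaining -= 1
    return plan

workloads = [
    # (name, priority, load, SLO latency target)
    ("fraud-detect", 0, 100.0, 10.0),
    ("clickstream", 1, 60.0, 20.0),
]
print(rebalance(workloads, 16))  # {'fraud-detect': 10, 'clickstream': 3}
```

Real systems replace the toy latency model with measurements, which is exactly the gap BO-based search is meant to close.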
Habitat Use of Blacktip Sharks (Carcharhinus limbatus) at Fishing Piers
Blacktip sharks (Carcharhinus limbatus) can be observed near fishing piers throughout the summer along the northeast coast of South Carolina. These piers attract and support a wide variety of potential prey, and sharks are able to forage on fishers' discards with minimal energetic cost. I tagged 12 blacktip sharks with acoustic transmitters, monitored piers with acoustic receivers, and conducted pier-creel surveys to determine the habitat use of blacktip sharks at fishing piers, the factors that influenced residence time and presence/absence at piers, and any cyclical patterns in visits to piers. Data were analyzed with pier association indices (PAI), mixed models, and fast Fourier transform analyses. While the majority of monitored sharks were infrequently detected at piers, four (33.3%) displayed a high degree of fidelity to piers. Two sharks (16.7%) were detected only at the pier where they were tagged, whereas two other individuals were detected at all monitored piers in 2017. The most likely model for shark residence time at piers included terms for pier location and diel cycle (wi = 0.52), while the most likely model explaining presence/absence of sharks at piers included terms for tidal height and diel cycle (wi = 0.95). Sharks did not display cyclical patterns in detections at piers. To my knowledge, this is the first study to specifically examine the habitat use of blacktip sharks at fishing piers. My data suggest that fidelity to piers is a phenomenon for some of the tagged sharks, but not all.
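The fast Fourier transform analysis mentioned above searches detection time series for periodicity. A minimal sketch on synthetic data (not the study's data or code) shows how a 24-hour diel cycle would surface as the dominant spectral peak:

```python
import numpy as np

# Illustrative FFT periodicity check on a synthetic hourly detection series;
# a real diel cycle would appear as a peak at a 24 h period.
hours = np.arange(0, 24 * 14)                        # two weeks of hourly bins
detections = 5 + 3 * np.sin(2 * np.pi * hours / 24)  # synthetic diel signal

spectrum = np.abs(np.fft.rfft(detections - detections.mean()))
freqs = np.fft.rfftfreq(len(hours), d=1.0)           # cycles per hour
dominant_period = 1.0 / freqs[np.argmax(spectrum)]   # hours per cycle
print(dominant_period)  # 24.0
```

With real detection data the spectrum is noisier, and the study found no such dominant cyclical peak.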
Using sensor web technologies to help predict and monitor floods in urban areas
Since flooding is one of the most common natural disasters worldwide, a number of flood prediction and monitoring approaches have been used. A lot of research has been conducted on the prediction and monitoring of floods using hydrological models. The problem is that current hydrological models do not provide Disaster Management officials or township residents with timely data and information. In South Africa, possible flood warnings are usually communicated by Disaster Management officials using traditional approaches such as loudspeakers, radio, and television (TV). Making calls to warn residents about the possible occurrence of floods by such means is, however, neither sufficient nor effective. As a result of improved communication, sensor, software, and computing capabilities, the use of sensor networks and the sensor web for predicting and monitoring the environment has been considered in recent years. In order for sensor data such as sensor measurements, sensor descriptions, and alerts to be integrated, the Open Geospatial Consortium (OGC) introduced the Sensor Web Enablement (SWE) standards and suggested different specifications with respect to the geospatial sensor web. The first implementation of the sensor web framework is available. In this research, the results of using sensor web technologies for predicting and monitoring floods in urban areas are presented. The aim of this research project is to illustrate how sensor web technology can help in the prediction and monitoring of floods in urban areas, particularly in the Alexandra Township (Greater Johannesburg), which has experienced floods every year. The focus of this research is on the incorporation of sensor data into the sensor web technology. The data used as input to the sensor web and the hydrological model were historical rainfall data from the South African Weather Service (SAWS). Shuttle Radar Topography Mission (SRTM) free data from the Internet were also used in this research.
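As a toy illustration of turning rainfall measurements into warnings (a bare threshold heuristic, not the hydrological model the research actually feeds SAWS data into), a sensor-driven alert might look like:

```python
# Toy flood-alert rule over 24 h rainfall observations; the threshold and
# function are illustrative assumptions, not the study's method.
def flood_risk(rainfall_mm_24h, threshold_mm=50.0):
    """Return 'ALERT' when accumulated rainfall crosses the threshold."""
    return "ALERT" if sum(rainfall_mm_24h) >= threshold_mm else "OK"

print(flood_risk([2.0, 10.5, 41.0]))  # 53.5 mm accumulated -> ALERT
```

The point of the sensor web is that such rules can run continuously against standardized, integrated observation streams rather than manual reports.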
Data-Driven Intelligent Scheduling For Long Running Workloads In Large-Scale Datacenters
Cloud computing is becoming a fundamental facility of society today. Large-scale public and private cloud datacenters comprising millions of servers, each a warehouse-scale computer, support most of the business of Fortune 500 companies and serve billions of users around the world. Unfortunately, the modern industry-wide average datacenter utilization is as low as 6% to 12%. Low utilization not only negatively impacts the operational and capital components of cost efficiency, but also becomes a scaling bottleneck due to the limits on electricity delivered by nearby utilities. It is critical and challenging to improve multi-resource efficiency in global datacenters.
Additionally, with the great commercial success of diverse big data analytics services, enterprise datacenters are evolving to host heterogeneous computation workloads including online web services, batch processing, machine learning, streaming computing, interactive query and graph computation on shared clusters. Most of them are long-running workloads that leverage long-lived containers to execute tasks.
We surveyed datacenter resource scheduling work from the last 15 years. Most previous works are designed to maximize cluster efficiency for short-lived tasks in batch processing systems like Hadoop. They are not suitable for modern long-running workloads on systems like microservices, Spark, Flink, Pregel, Storm, or TensorFlow. It is urgent to develop new, effective scheduling and resource allocation approaches to improve efficiency in large-scale enterprise datacenters.
In this dissertation, we are the first to define and identify the problems, challenges, and scenarios of scheduling and resource management for diverse long-running workloads in modern datacenters. These workloads rely on predictive scheduling techniques to perform reservation, auto-scaling, migration, or rescheduling, which pushes us to pursue and explore more intelligent scheduling techniques backed by adequate predictive knowledge. We specify what intelligent scheduling is, what abilities are necessary for it, and how to leverage it to transform NP-hard online scheduling problems into tractable offline scheduling problems.
We designed and implemented an intelligent cloud datacenter scheduler that automatically performs resource-to-performance modeling, predictive optimal reservation estimation, and QoS (interference)-aware predictive scheduling to maximize resource efficiency across multiple dimensions (CPU, memory, network, disk I/O) while strictly guaranteeing service level agreements (SLAs) for long-running workloads.
Finally, we introduced large-scale co-location techniques for executing long-running and other workloads on the shared global datacenter infrastructure of Alibaba Group. They effectively improve cluster utilization from 10% to an average of 50%. This goes far beyond scheduling, involving technique evolutions in IDC, networking, physical datacenter topology, storage, server hardware, operating systems, and containerization. We demonstrate its effectiveness through analysis of the latest Alibaba public cluster trace, from 2017. We are the first to reveal, through data, a global view of the scenarios, challenges, and status of Alibaba's large-scale global datacenters, including big promotion events like Double 11.
Data-driven intelligent scheduling methodologies and effective infrastructure co-location techniques are critical and necessary to pursue maximized multi-resource efficiency in modern large-scale datacenters, especially for long-running workloads.
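The resource-to-performance modeling and predictive reservation estimation described above can be sketched as fitting a latency model from profiling samples and then choosing the smallest reservation whose predicted latency meets the SLA. The model form, numbers, and names below are illustrative assumptions, not the dissertation's implementation:

```python
import numpy as np

# Hypothetical resource-to-performance sketch: profiling samples mapping
# reserved CPU cores to observed p99 latency (ms). Values are made up.
profile = {1: 400.0, 2: 210.0, 4: 110.0, 8: 60.0}

cores = np.array(sorted(profile))
latency = np.array([profile[c] for c in cores])
# latency ~ a / cores + b is a common first-order service model (an assumption).
a, b = np.polyfit(1.0 / cores, latency, 1)

def min_reservation(sla_ms: float, max_cores: int = 64) -> int:
    """Smallest core count whose predicted latency meets the SLA."""
    for c in range(1, max_cores + 1):
        if a / c + b <= sla_ms:
            return c
    return max_cores

print(min_reservation(100.0))  # 5
```

Picking the minimal SLA-satisfying reservation is what lets a scheduler reclaim the over-provisioned remainder for co-located batch work.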
System Support For Stream Processing In Collaborative Cloud-Edge Environment
Stream processing is a critical technique for processing huge amounts of data in a real-time manner. Cloud computing has been used for stream processing due to its unlimited computation resources. At the same time, we are entering the era of the Internet of Everything (IoE). The emerging edge computing paradigm benefits low-latency applications by leveraging computation resources in the proximity of data sources. Billions of sensors and actuators are being deployed worldwide, and the huge amount of data generated by things is immersed in our daily life. It has become essential for organizations to be able to stream and analyze data, and to provide low-latency analytics on streaming data. However, cloud computing is inefficient for processing all data in a centralized environment in terms of network bandwidth cost and response latency. Although edge computing offloads computation from the cloud to the edge of the Internet, there is no data sharing and processing framework that efficiently utilizes computation resources in both the cloud and the edge. Furthermore, the heterogeneity of edge devices brings more difficulty to the development of collaborative cloud-edge applications.
To explore and attack the challenges of stream processing systems in a collaborative cloud-edge environment, in this dissertation we design and develop a series of systems to support stream processing applications in hybrid cloud-edge analytics. Specifically, we develop a hierarchical and hybrid outlier detection model for multivariate time series streams that automatically selects the best model for different time series. We optimize one of the stream processing systems (i.e., Spark Streaming) to reduce end-to-end latency. To facilitate the development of collaborative cloud-edge applications, we propose and implement a new computing framework, Firework, which allows stakeholders to share and process data by leveraging both the cloud and the edge. A vision-based cloud-edge application is implemented to demonstrate the capabilities of Firework. Combining all these studies, we provide comprehensive system support for stream processing in a collaborative cloud-edge environment.
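The per-series model selection described for the outlier detection work can be sketched as scoring candidate detectors on a labeled validation window and keeping the best one for that series. The detectors, thresholds, and data below are illustrative assumptions, not the dissertation's code:

```python
import statistics

# Two toy candidate detectors; real systems would use richer models.
def zscore_detector(series, threshold=2.5):
    """Flag points far from the series mean in standard deviations."""
    mu = statistics.fmean(series)
    sd = statistics.pstdev(series) or 1.0
    return [abs(x - mu) / sd > threshold for x in series]

def jump_detector(series, threshold=5.0):
    """Flag points that jump sharply from their predecessor."""
    return [False] + [abs(b - a) > threshold for a, b in zip(series, series[1:])]

def select_model(series, labels, detectors):
    """Return the detector whose flags best match the labeled outliers."""
    def accuracy(d):
        flags = d(series)
        return sum(f == l for f, l in zip(flags, labels)) / len(labels)
    return max(detectors, key=accuracy)

series = [1, 1, 2, 1, 50, 1, 2, 1]
labels = [False] * 4 + [True] + [False] * 3
best = select_model(series, labels, [zscore_detector, jump_detector])
print(best.__name__)  # zscore_detector
```

Automating this choice per time series is what makes a hierarchical, hybrid detector practical across thousands of heterogeneous streams.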