6,548 research outputs found

    Engineering Crowdsourced Stream Processing Systems

    A crowdsourced stream processing system (CSP) is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied on a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires the combination of human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems by performing a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that, compared to a pure stream processing system, AIDR can achieve a higher data classification accuracy, while compared to a pure crowdsourcing solution, the system makes better use of human workers by requiring much less manual work effort.

    Effective Use Methods for Continuous Sensor Data Streams in Manufacturing Quality Control

    This work outlines an approach for managing sensor data streams of continuous numerical data in product manufacturing settings, emphasizing statistical process control, low computational and memory overhead, and saving the information necessary to reduce the impact of nonconformance to quality specifications. While there is extensive literature, knowledge, and documentation about standard data sources and databases, the high volume and velocity of sensor data streams often make traditional analysis infeasible. To that end, an overview of data stream fundamentals is essential. An analysis of commonly used stream preprocessing and load shedding methods follows, succeeded by a discussion of aggregation procedures. Stream storage and querying systems are the next topics. Further, existing machine learning techniques for data streams are presented, with a focus on regression. Finally, the work describes a novel methodology for managing sensor data streams in which data stream management systems save and record aggregate data from small time intervals, together with individual measurements from the stream that are nonconforming. The aggregates shall be continually entered into control charts and regressed on. To conserve memory, old data shall be periodically reaggregated at higher levels.
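The storage scheme summarized in the abstract above (interval aggregates plus individual nonconforming readings, with older aggregates merged into coarser ones) can be illustrated with a minimal Python sketch. It is not the author's implementation: the names `Aggregate`, `summarize`, and `reaggregate`, the fixed-width time buckets, and the simple lower/upper specification limits are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Aggregate:
    """Summary of one time interval (hypothetical structure, not from the source)."""
    start: float            # start time of the interval
    count: int = 0          # number of readings in the interval
    total: float = 0.0      # sum of readings
    total_sq: float = 0.0   # sum of squared readings

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def variance(self) -> float:
        # Population variance recovered from the stored sums.
        return self.total_sq / self.count - self.mean ** 2

    def merge(self, other: "Aggregate") -> "Aggregate":
        # Sums are additive, so merging loses no information about mean or variance.
        return Aggregate(
            start=min(self.start, other.start),
            count=self.count + other.count,
            total=self.total + other.total,
            total_sq=self.total_sq + other.total_sq,
        )


def summarize(
    readings: Iterable[Tuple[float, float]],
    interval: float,
    lower: float,
    upper: float,
) -> Tuple[List[Aggregate], List[Tuple[float, float]]]:
    """Bucket (timestamp, value) readings into fixed intervals and keep
    only the individual readings that fall outside [lower, upper]."""
    buckets = {}
    nonconforming = []
    for ts, value in readings:
        key = (ts // interval) * interval
        agg = buckets.setdefault(key, Aggregate(start=key))
        agg.count += 1
        agg.total += value
        agg.total_sq += value * value
        if not (lower <= value <= upper):
            nonconforming.append((ts, value))
    return [buckets[k] for k in sorted(buckets)], nonconforming


def reaggregate(aggregates: List[Aggregate], factor: int) -> List[Aggregate]:
    """Merge every `factor` consecutive aggregates into one coarser aggregate."""
    coarse = []
    for i in range(0, len(aggregates), factor):
        merged = aggregates[i]
        for agg in aggregates[i + 1 : i + factor]:
            merged = merged.merge(agg)
        coarse.append(merged)
    return coarse
```

Storing the count, sum, and sum of squares (rather than only a mean) is one possible design choice: it lets coarser aggregates be computed by plain addition, so control-chart statistics such as the mean and variance remain recoverable after reaggregation.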

    Secure Data Management and Transmission Infrastructure for the Future Smart Grid

    The power grid has played a crucial role since its inception in the Industrial Age. It has evolved from a wide network supplying energy to multiple incorporated areas into the largest cyber-physical system. Its security and reliability are crucial to any country’s economy and stability [1]. With the emergence of new technologies and the growing pressure of global warming, the aging power grid can no longer meet the requirements of modern industry, which has led to the proposal of the ‘smart grid’. In a smart grid, both electricity and control information flow through a massively distributed power network. It is essential for the smart grid to deliver real-time data over a communication network. Using smart meters, advanced metering infrastructure (AMI) can measure energy consumption, monitor loads, collect data and forward information to collectors. The smart grid is an intelligent network consisting of many technologies, not only in power but also in information, telecommunications and control. The best-known structure of the smart grid is the three-layer structure, which divides the smart grid into three layers, each with its own duty. These three layers work together to provide a smart grid that monitors and optimizes the operations of all functional units, from power generation to the end customers [2]. To enhance the security of the future smart grid, deploying a high-security data transmission scheme on critical nodes is an effective and practical approach. A critical node is a communication node in a cyber-physical network that can be developed to meet certain requirements. It also has firewalls and intrusion detection capability, so it is useful for a time-critical network system; in other words, it is suitable for the future smart grid. Deploying such a scheme can be tricky because network topologies differ. A simple and general way is to install it on every node in the network, so that all nodes in the network are critical nodes, but this costs time, energy and money and is clearly not the best option. Thus, we propose a multi-objective evolutionary algorithm to search for critical nodes. A new scheme should be proposed for the smart grid. Also, optimal planning in the power grid for embedding a large system can effectively ensure that every power station and substation operates safely and detects anomalies in time. Such a new method is a reliable way to meet increasing security challenges. The evolutionary framework helps find the optimum without calculating the gradient of the objective function. Meanwhile, decomposition is useful for exploring solutions evenly in the decision space. Furthermore, constraint-handling techniques can place critical nodes at optimal locations so as to enhance system security even under constraints of limited resources and/or hardware. The high-quality experimental results have validated the efficiency and applicability of the proposed approach. There is good reason to believe that the new algorithm is promising for real-world multi-objective optimization problems drawn from the power grid security domain. In this thesis, a cloud-based information infrastructure is proposed to deal with the big data storage and computation problems of the future smart grid, some challenges and limitations are addressed, and a new secure data management and transmission strategy for the increasing security challenges of the future smart grid is given as well.

    The problem of scale in the prediction and management of pathogen spillover

    Disease emergence events, epidemics and pandemics all underscore the need to predict zoonotic pathogen spillover. Because cross-species transmission is inherently hierarchical, involving processes that occur at varying levels of biological organization, such predictive efforts can be complicated by the many scales and vastness of data potentially required for forecasting. A wide range of approaches are currently used to forecast spillover risk (e.g. macroecology, pathogen discovery, surveillance of human populations, among others), each of which is bound within particular phylogenetic, spatial and temporal scales of prediction. Here, we contextualize these diverse approaches within their forecasting goals and resulting scales of prediction to illustrate critical areas of conceptual and pragmatic overlap. Specifically, we focus on an ecological perspective to envision a research pipeline that connects these different scales of data and predictions from the aims of discovery to intervention. Pathogen discovery and predictions focused at the phylogenetic scale can first provide coarse and pattern-based guidance for which reservoirs, vectors and pathogens are likely to be involved in spillover, thereby narrowing surveillance targets and where such efforts should be conducted. Next, these predictions can be followed with ecologically driven spatio-temporal studies of reservoirs and vectors to quantify spatio-temporal fluctuations in infection and to mechanistically understand how pathogens circulate and are transmitted to humans. This approach can also help identify general regions and periods for which spillover is most likely. We illustrate this point by highlighting several case studies where long-term, ecologically focused studies (e.g. Lyme disease in the northeast USA, Hendra virus in eastern Australia, Plasmodium knowlesi in Southeast Asia) have facilitated predicting spillover in space and time and facilitated the design of possible intervention strategies. 
Such studies can in turn help narrow human surveillance efforts and help refine and improve future large-scale, phylogenetic predictions. We conclude by discussing how greater integration and exchange between data and predictions generated across these varying scales could ultimately help generate more actionable forecasts and interventions.

    Impact Assessment, Detection, and Mitigation of False Data Attacks in Electrical Power Systems

    The global energy market has seen a massive increase in investment and capital flow in the last few decades. This has completely transformed the way power grids operate: legacy systems are now being replaced by advanced smart grid infrastructures that offer better connectivity and increased reliability. One popular example is the extensive deployment of phasor measurement units (PMUs), which constantly provide time-synchronized phasor measurements at a high resolution compared to conventional meters. This enables system operators to monitor in real time the vast electrical network spanning thousands of miles. However, a targeted cyber attack on PMUs can prompt operators to take wrong actions that can eventually jeopardize power system reliability. Such threats originating from cyberspace continue to increase as power grids become more dependent on PMU communication networks. Additionally, these threats are becoming increasingly efficient at remaining undetected for longer periods while gaining deep access into the power networks. An attack on the energy sector immediately impacts national defense, emergency services, and all aspects of human life. Cyber attacks against the electric grid may soon become a tactic of high-intensity warfare between nations and lead to social disorder. Within this context, this dissertation investigates the cyber security of PMUs, which affects critical decision-making for reliable operation of the power grid. In particular, this dissertation focuses on false data attacks, a key vulnerability in the PMU architecture, that inject, alter, block, or delete data in devices or in communication network channels. This dissertation addresses three important cyber security aspects: (1) impact assessment, (2) detection, and (3) mitigation of false data attacks. A comprehensive background of false data attack models targeting various steady-state control blocks is first presented. By investigating inter-dependencies between the cyber and the physical layers, this dissertation then identifies possible points of ingress and categorizes risk at different levels of threats. In particular, the likelihood of cyber attacks against the steady-state power system control block causing worst-case impacts such as cascading failures is investigated. The case study results indicate that false data attacks do not often lead to widespread blackouts, but do result in subsequent line overloads and load shedding. The impacts are magnified when attacks are coordinated with physical failures of generators, transformers, or heavily loaded lines. Further, this dissertation develops a data-driven false data attack detection method that is independent of existing in-built security mechanisms in the state estimator. It is observed that a convolutional neural network classifier can detect and isolate false measurements quickly compared to other deep learning and traditional classifiers. Finally, this dissertation develops a recovery plan that minimizes the consequence of threats when sophisticated attacks remain undetected and have already caused multiple failures. Two new controlled islanding methods are developed that minimize the impact of attacks when information on the threats is lacking or only partial. The results indicate that system operators can successfully contain the negative impacts of cyber attacks while creating stable and observable islands. Overall, this dissertation presents a comprehensive plan for fast and effective detection and mitigation of false data attacks, improving cyber security preparedness, and enabling continuity of operations.

    Attributes of Big Data Analytics for Data-Driven Decision Making in Cyber-Physical Power Systems

    Big data analytics is a relatively new term in power system terminology. The concept concerns the way a massive volume of data is acquired, processed, and analyzed to extract insight from the available data. In particular, big data analytics refers to applications of artificial intelligence, machine learning techniques, data mining techniques, and time-series forecasting methods. Decision-makers in power systems have long been hampered by the weakness of classical methods in dealing with large-scale practical cases, owing to the existence of thousands or millions of variables, long run times, a high computational burden, divergence of results, unjustifiable errors, and poor model accuracy. Big data analytics is an active topic that addresses how to extract insights from these large data sets. This article enumerates the applications of big data analytics in future power systems through several layers, from grid scale to local scale. Big data analytics has many applications in the areas of smart grid implementation, electricity markets, execution of collaborative operation schemes, enhancement of microgrid operation autonomy, management of electric vehicle operations in smart grids, active distribution network control, district hub system management, multi-agent energy systems, electricity theft detection, stability and security assessment by PMUs, and better exploitation of renewable energy sources. The employment of big data analytics entails some prerequisites, such as the proliferation of IoT-enabled devices, easily accessible cloud space, blockchain, etc. This paper presents an extensive review of the applications of big data analytics along with the prevailing challenges and solutions.