
    Big Data in Critical Infrastructures Security Monitoring: Challenges and Opportunities

    Critical Infrastructures (CIs), such as smart power grids, transport systems, and financial infrastructures, are increasingly vulnerable to cyber threats due to the adoption of commodity computing facilities. Despite the use of several monitoring tools, recent attacks have shown that current defensive mechanisms for CIs are not effective enough against the most advanced threats. In this paper we explore the idea of a framework that leverages multiple data sources to improve the protection capabilities of CIs. Challenges and opportunities are discussed along three main research directions: i) use of distinct and heterogeneous data sources, ii) monitoring with adaptive granularity, and iii) attack modeling and runtime combination of multiple data analysis techniques. Comment: EDCC-2014, BIG4CIP-201
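
    A minimal sketch of the "monitoring with adaptive granularity" idea mentioned above (not the paper's framework): a polling interval is tightened when an anomaly score rises and relaxed when the system looks healthy. The thresholds, interval bounds, and scoring values are hypothetical.

```python
# Sketch: monitoring with adaptive granularity -- raise the sampling rate when a
# simple anomaly score grows, back off when the system looks healthy.
# Thresholds and the example scores are hypothetical.

def next_interval_s(current_interval_s, anomaly_score,
                    min_interval_s=1.0, max_interval_s=60.0,
                    high=0.8, low=0.2):
    """Return the next polling interval given an anomaly score in [0, 1]."""
    if anomaly_score >= high:
        return max(min_interval_s, current_interval_s / 2)   # monitor more finely
    if anomaly_score <= low:
        return min(max_interval_s, current_interval_s * 2)   # relax monitoring
    return current_interval_s

interval = 30.0
for score in [0.1, 0.5, 0.9, 0.95, 0.3, 0.05]:
    interval = next_interval_s(interval, score)
    print(f"score={score:.2f} -> poll every {interval:.0f}s")
```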

    Stay Flexible: A Prescriptive Process Monitoring Approach for Energy Flexibility-Oriented Process Schedules

    The transition of energy supply from fossil fuels to renewable energy sources poses major challenges for balancing increasingly weather-dependent energy supply and demand. Demand-side energy flexibility, offered particularly by companies, is seen as a promising and necessary approach to address these challenges. Process mining provides significant potential to prevent a deterioration of product quality or process flows due to flexibilization and allows for exploiting the monetary benefits associated with flexible process operation. Hence, we follow the design science research paradigm to develop PM4Flex, a prescriptive process monitoring approach that generates recommendations for pending process flows optimized under fluctuating power prices by implementing established energy flexibility measures. In doing so, we consider company- and process-specific constraints as well as historic event logs. We demonstrate and evaluate PM4Flex by implementing it as a software prototype and applying it to exemplary data from a heating and air conditioning company, observing considerable cost savings of 1.42 ct per kWh, or 7.89%.
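
    The load-shifting measure underlying such recommendations can be illustrated with a minimal sketch (not the PM4Flex implementation): pick the cheapest admissible start time for an energy-intensive activity given an hourly power-price forecast. The forecast values, activity duration, deadline, and load below are hypothetical.

```python
# Minimal load-shifting sketch: choose the cheapest admissible start hour for an
# energy-intensive activity, given an hourly power-price forecast (hypothetical data).

def cheapest_start(prices_ct_per_kwh, duration_h, deadline_h, load_kw):
    """Return (start_hour, cost_in_ct) minimizing energy cost before the deadline."""
    best = None
    for start in range(0, deadline_h - duration_h + 1):
        window = prices_ct_per_kwh[start:start + duration_h]
        cost = sum(p * load_kw for p in window)  # ct/kWh * kW * 1 h per slot
        if best is None or cost < best[1]:
            best = (start, cost)
    return best

if __name__ == "__main__":
    forecast = [32.1, 30.5, 28.9, 25.4, 24.8, 26.0, 29.7, 33.2]  # ct/kWh, hypothetical
    start, cost = cheapest_start(forecast, duration_h=3, deadline_h=8, load_kw=10.0)
    print(f"Start at hour {start}, energy cost {cost:.2f} ct")
```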

    A Machine Learning Enhanced Scheme for Intelligent Network Management

    Versatile networking services strongly influence daily life, while the amount and diversity of these services make network systems highly complex. Network scale and complexity grow with the increasing number of infrastructure apparatuses, networking functions, and network slices, and with the evolution of the underlying architecture. The conventional approach of maintaining such a large and complex platform through manual administration makes effective and insightful management difficult. A feasible and promising alternative is to extract insightful information from the large volumes of network data being produced. The goal of this thesis is to use learning-based algorithms inspired by the machine learning community to discover valuable knowledge from substantial network data, directly supporting intelligent management and maintenance. The thesis focuses on two management and maintenance schemes: network anomaly detection with root cause localization, and critical traffic resource control and optimization. Firstly, abundant network data carry informative messages, but their heterogeneity and complexity make diagnosis challenging. For unstructured logs, abstract and formatted log templates are extracted to regularize log records. An in-depth analysis framework based on heterogeneous data is proposed to detect the occurrence of faults and anomalies. It employs representation learning methods to map unstructured data into numerical features, and fuses the extracted features for network anomaly and fault detection. The representation learning uses word2vec-based embedding technologies for semantic expression. Next, fault and anomaly detection only reveals the occurrence of events without identifying their root causes, so fault localization is introduced to narrow down the source of systematic anomalies. The extracted features are formed into an anomaly degree and coupled with an importance ranking method to highlight the locations of anomalies in network systems. Two ranking modes, instantiated by PageRank and by operation errors, jointly highlight the locations of latent issues. Besides fault and anomaly detection, network traffic engineering manages communication and computation resources to optimize the efficiency of data traffic transfer. Especially when network traffic is constrained by communication conditions, a proactive path planning scheme supports efficient traffic control actions. A learning-based traffic planning algorithm based on a sequence-to-sequence model is therefore proposed to discover hidden reasonable paths from abundant traffic history data over a Software Defined Network architecture. Finally, traffic engineering based solely on empirical data is likely to result in stale and sub-optimal solutions, or even worse situations. A resilient mechanism is required to adapt network flows to a dynamic environment based on context. Thus, a reinforcement learning-based scheme is put forward for dynamic data forwarding that takes network resource status into account and shows a promising performance improvement. In summary, the proposed anomaly processing framework strengthens analysis and diagnosis for network system administrators through combined fault detection and root cause localization, while the learning-based traffic engineering supports network flow management from experience data and points to a promising direction of flexible traffic adjustment in ever-changing environments.
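
    As a rough illustration of the embedding-plus-detection idea (a simplified stand-in, not the thesis pipeline), the sketch below maps tokenized log lines to word2vec vectors, averages them per line, and flags outliers with an isolation forest. The toy log lines and model parameters are assumptions.

```python
# Sketch: embed log templates with word2vec and flag anomalous lines.
# Requires gensim and scikit-learn; the toy log lines below are hypothetical.
import numpy as np
from gensim.models import Word2Vec
from sklearn.ensemble import IsolationForest

logs = [
    "interface eth0 up",
    "interface eth1 up",
    "bgp session established with peer",
    "interface eth0 down unexpected",   # the kind of line we hope to flag
]
tokens = [line.split() for line in logs]

# Train a small word2vec model on the tokenized log lines.
w2v = Word2Vec(sentences=tokens, vector_size=16, window=2, min_count=1, epochs=50)

# Represent each log line as the mean of its token vectors.
features = np.array([np.mean([w2v.wv[t] for t in line], axis=0) for line in tokens])

# Unsupervised anomaly detection over the embedded lines.
detector = IsolationForest(contamination=0.25, random_state=0).fit(features)
for line, label in zip(logs, detector.predict(features)):
    print("ANOMALY" if label == -1 else "normal ", "|", line)
```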

    Incremental Discovery of Process Maps

    Process mining is a body of methods to analyze event logs produced during the execution of business processes in order to extract insights for their improvement. A family of process mining methods, known as automated process discovery, allows analysts to extract business process models from event logs. Traditional automated process discovery methods are intended to be used in an offline setting, meaning that the process model is extracted from a snapshot of an event log stored in its entirety. In some scenarios, however, events keep coming with such a high arrival rate that it is impractical to store the entire event log and to continuously re-discover a process model from scratch. Such scenarios require online automated process discovery approaches. Given an event stream produced by the execution of a business process, the goal of an online automated process discovery method is to maintain a continuously updated model of the process with a bounded amount of memory while achieving accuracy similar to that of offline methods. Existing automated discovery approaches require relatively large amounts of memory to achieve levels of accuracy comparable to offline methods. This thesis proposes an online process discovery framework that addresses this limitation by mapping the problem of online process discovery to that of cache memory management, and applying well-known cache replacement policies to the problem of online process discovery. The proposed framework has been implemented in .NET, integrated with the Minit process mining tool, and comparatively evaluated against an existing baseline using real-life datasets.
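
    A minimal sketch of the cache-replacement idea, assuming an LRU policy (the framework itself is implemented in .NET and integrated with Minit): directly-follows counts are kept in a bounded map and the least recently observed relations are evicted once a memory budget is exceeded. The event stream and budget below are hypothetical.

```python
# Sketch: online directly-follows discovery with an LRU eviction policy.
from collections import OrderedDict

class LRUDirectlyFollows:
    def __init__(self, budget):
        self.budget = budget            # max number of (a, b) relations kept
        self.counts = OrderedDict()     # (prev_activity, activity) -> frequency
        self.last_activity = {}         # case id -> last observed activity

    def observe(self, case_id, activity):
        prev = self.last_activity.get(case_id)
        self.last_activity[case_id] = activity
        if prev is None:
            return
        key = (prev, activity)
        self.counts[key] = self.counts.get(key, 0) + 1
        self.counts.move_to_end(key)            # mark as most recently used
        if len(self.counts) > self.budget:
            self.counts.popitem(last=False)     # evict least recently used relation

stream = [("c1", "A"), ("c2", "A"), ("c1", "B"), ("c2", "C"), ("c1", "C")]
dfg = LRUDirectlyFollows(budget=2)
for case_id, activity in stream:
    dfg.observe(case_id, activity)
print(dict(dfg.counts))   # bounded approximation of the directly-follows graph
```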

    Intelligent Data Mining Techniques for Automatic Service Management

    Today, as more and more industries enter the artificial intelligence era, business enterprises constantly explore innovative ways to expand their outreach and meet the high expectations of customers, with the purpose of gaining a competitive advantage in the marketplace. The success of a business relies heavily on its IT services. Value-creating activities of a business cannot be accomplished without solid and continuous delivery of IT services, especially in an increasingly intricate and specialized world. Driven by both the growing complexity of IT environments and rapidly changing business needs, service providers are urgently seeking intelligent data mining and machine learning techniques to build a cognitive "brain" for IT service management, capable of automatically understanding, reasoning, and learning from operational data collected from human engineers and virtual engineers during IT service maintenance. The ultimate goal of IT service management optimization is to maximize the automation of IT routine procedures such as problem detection, determination, and resolution. However, fully automating the entire routine procedure without any human intervention remains challenging. In real IT systems, both step-wise resolution descriptions and scripted resolutions are often logged with their corresponding problematic incidents, and these records typically contain abundant, valuable human domain knowledge. Hence, modeling, gathering, and utilizing the domain knowledge in IT system maintenance logs plays a crucial role in IT service management optimization. To optimize IT service management from the perspective of intelligent data mining techniques, three research directions are identified and considered to be greatly helpful for automatic service management: (1) efficiently extract and organize the domain knowledge from IT system maintenance logs; (2) online collect and update the existing domain knowledge by interactively recommending possible resolutions; (3) automatically discover the latent relations among scripted resolutions and intelligently suggest proper scripted resolutions for IT problems. My dissertation addresses these challenges by designing and implementing a set of intelligent data-driven solutions, including (1) constructing the domain knowledge base for problem resolution inference; (2) online recommending resolutions in light of the explicit hierarchical resolution categories provided by domain experts; and (3) interactively recommending resolutions with the latent resolution relations learned through a collaborative filtering model.
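
    A minimal sketch of the collaborative-filtering idea (not the dissertation's model): a hypothetical ticket-resolution interaction matrix is factorized with a truncated SVD, and resolutions are ranked for a ticket by its reconstructed scores. The matrix values and dimensions are assumptions.

```python
# Sketch: latent-factor resolution recommendation via truncated SVD
# on a hypothetical ticket-resolution usage matrix.
import numpy as np

# Rows: historical tickets, columns: scripted resolutions (1 = resolution was applied).
interactions = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# Rank-2 truncated SVD gives latent ticket and resolution factors.
U, s, Vt = np.linalg.svd(interactions, full_matrices=False)
k = 2
ticket_factors = U[:, :k] * s[:k]       # tickets in latent space
resolution_factors = Vt[:k, :]          # resolutions in latent space

# Reconstructed scores: higher values suggest relevant resolutions, including unused ones.
scores = ticket_factors @ resolution_factors
ticket_id = 1
ranking = np.argsort(-scores[ticket_id])
print("Recommended resolutions for ticket", ticket_id, ":", ranking.tolist())
```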

    Data Mining Techniques to Understand Textual Data

    More than ever, online information delivery and storage heavily rely on text. Billions of texts are produced every day in the form of documents, news, logs, search queries, ad keywords, tags, tweets, messenger conversations, social network posts, and more. Text understanding is a fundamental and essential task involving broad research topics, and it contributes to many applications in areas such as text summarization, search engines, recommendation systems, online advertising, and conversational bots. However, text understanding is never a trivial task for computers, especially for noisy and ambiguous text such as logs and search queries. This dissertation mainly focuses on textual understanding tasks derived from two domains, disaster management and IT service management, both of which mainly use textual data as an information carrier. Improving situation awareness in disaster management and alleviating the human effort involved in IT service management dictate more intelligent and efficient solutions for understanding the textual data acting as the main information carrier in the two domains. From the perspective of data mining, four directions are identified: (1) intelligently generate a storyline summarizing the evolution of a hurricane from a relevant online corpus; (2) automatically recommend resolutions according to the textual symptom description in a ticket; (3) gradually adapt the resolution recommendation system for time-correlated features derived from text; (4) efficiently learn distributed representations for short and low-quality ticket symptom descriptions and resolutions. Given these different types of textual data, the data mining techniques proposed along these four research directions successfully address our tasks of understanding and extracting valuable knowledge from the text. My dissertation addresses the research topics outlined above. Concretely, I focus on designing and developing data mining methodologies to better understand textual information, including (1) a storyline generation method for efficient summarization of natural hurricanes based on a crawled online corpus; (2) a recommendation framework for automated ticket resolution in IT service management; (3) an adaptive recommendation system on time-varying, temporally correlated features derived from text; (4) a deep neural ranking model that not only successfully recommends resolutions but also efficiently outputs distributed representations for ticket descriptions and resolutions.
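
    The ranking idea can be sketched as a small shared encoder that scores (ticket, resolution) pairs and is trained with a pairwise ranking loss; this is a simplified stand-in, not the dissertation's architecture, and the vocabulary size, dimensions, and toy token ids are assumptions.

```python
# Sketch: a tiny shared-encoder ranking model scoring ticket/resolution pairs.
# Assumes PyTorch; vocabulary and data are hypothetical toy values.
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Mean-pooled embedding encoder producing a distributed representation."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids):
        return self.embed(token_ids).mean(dim=1)   # (batch, dim)

class PairRanker(nn.Module):
    """Scores a (ticket, resolution) pair by cosine similarity of their encodings."""
    def __init__(self, vocab_size=1000, dim=32):
        super().__init__()
        self.encoder = TextEncoder(vocab_size, dim)

    def forward(self, ticket_ids, resolution_ids):
        t = self.encoder(ticket_ids)
        r = self.encoder(resolution_ids)
        return nn.functional.cosine_similarity(t, r, dim=1)

model = PairRanker()
ticket = torch.tensor([[4, 17, 23]])        # token ids for a toy symptom description
good_res = torch.tensor([[4, 99, 5]])       # token ids for a relevant resolution
bad_res = torch.tensor([[7, 300, 12]])      # token ids for an irrelevant resolution

# Pairwise margin ranking loss: the relevant resolution should score higher.
loss = nn.functional.margin_ranking_loss(
    model(ticket, good_res), model(ticket, bad_res),
    target=torch.ones(1), margin=0.5)
print(float(loss))
```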

    Large Scale Data Mining for IT Service Management

    More than ever, businesses heavily rely on IT service delivery to meet their current and frequently changing business requirements. Optimizing the quality of service delivery improves customer satisfaction and continues to be a critical driver for business growth. The routine maintenance procedure plays a key role in IT service management, typically involving problem detection, determination, and resolution for the service infrastructure. Many IT service providers adopt partial automation for incident diagnosis and resolution, in which the work of system administrators and automated operations are intertwined. Often the system administrators' role is limited to triaging tickets to the processing teams for problem resolution. The processing teams are responsible for performing complex root cause analysis based on system statistics, event, and ticket data. The large scale of system statistics, event, and ticket data aggravates the burden of problem diagnosis on both the system administrators and the processing teams during routine maintenance. Alleviating the human effort involved in IT service management dictates intelligent and efficient solutions to maximize the automation of routine maintenance procedures. Three research directions are identified and considered helpful for IT service management optimization: (1) automatically determine problem categories according to the symptom description in a ticket; (2) intelligently discover interesting temporal patterns from system events; (3) instantly identify temporal dependencies among system performance statistics data. Provided with ticket, event, and system performance statistics data, the three directions can be effectively addressed with a data-driven solution, and the quality of IT service delivery can be improved in an efficient and effective way. The dissertation addresses the research topics outlined above. Concretely, we design and develop data-driven solutions to help system administrators better manage the system and alleviate the human effort involved in IT service management, including (1) a knowledge-guided hierarchical multi-label classification method for IT problem category determination based on both the symptom description in a ticket and the domain knowledge from the system administrators; (2) an efficient expectation maximization approach for temporal event pattern discovery based on a parametric model; (3) an online inference method for time-varying temporal dependency discovery from large-scale time series data.
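
    A minimal sketch in the spirit of the parametric, EM-based temporal pattern discovery (not the dissertation's exact model): inter-event time lags between two event types are fitted with a two-component Gaussian mixture via expectation maximization. The lag data and component count are assumptions.

```python
# Sketch: EM for a two-component 1-D Gaussian mixture over inter-event time lags
# (hypothetical lag data, in seconds, between two correlated event types).
import numpy as np

def em_gaussian_mixture(x, n_iter=50):
    # Initialize two components from data quantiles.
    mu = np.percentile(x, [25, 75]).astype(float)
    var = np.array([x.var(), x.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each lag.
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return pi, mu, var

lags = np.concatenate([np.random.normal(5, 1, 200), np.random.normal(60, 5, 200)])
weights, means, variances = em_gaussian_mixture(lags)
print("weights:", weights.round(2), "means:", means.round(1))
```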

    PADAMOT : project overview report

    Background and relevance to radioactive waste management: international consensus confirms that placing radioactive wastes and spent nuclear fuel deep underground in a geological repository is the generally preferred option for their long-term management and disposal. This strategy provides a number of advantages compared to leaving them on or near the Earth’s surface. These advantages come about because, for a well chosen site, the geosphere can provide:
    • a physical barrier that can negate or buffer against the effects of surface-dominated natural disruptive processes such as deep weathering, glaciation, river and marine erosion or flooding, asteroid/comet impact, and earthquake shaking;
    • long and slow groundwater return pathways from the facility to the biosphere along which retardation, dilution and dispersion processes may operate to reduce radionuclide concentrations in the groundwater;
    • a stable and benign geochemical environment to maximise the longevity of the engineered barriers such as the waste containers and backfill in the facility;
    • a natural radiation shield around the wastes;
    • a mechanically stable environment in which the facility can be constructed and will afterwards be protected;
    • an environment which reduces the likelihood of the repository being disturbed by inadvertent human intrusion such as land use changes, construction projects, drilling, quarrying and mining;
    • protection against the effects of deliberate human activities such as vandalism, terrorism and war.
    However, safety considerations for storing and disposing of long-lived radioactive wastes must take into account various scenarios that might affect the ability of the geosphere to provide the functionality listed above. Therefore, in order to provide confidence in the ability of a repository to perform within the deep geological setting at a particular site, a demonstration of geosphere “stability” needs to be made. Stability is defined here as the capacity of a geological and hydrogeological system to minimise the impact of external influences on the repository environment, or at least to account for them in a manner that would allow their impacts to be evaluated and accounted for in any safety assessments. A repository should be sited where the deep geosphere is a stable host in which the engineered containment can continue to perform according to design, and in which the surrounding hydrogeological, geomechanical and geochemical environment will continue to operate as a natural barrier to radionuclide movement towards the biosphere. However, over the long periods of time during which long-lived radioactive wastes will pose a hazard, environmental change at the surface has the potential to disrupt the stability of the geosphere, and therefore the causes of environmental change and their potential consequences need to be evaluated. As noted above, environmental change can include processes such as deep weathering, glaciation, and river and marine erosion. It can also lead to changes in groundwater boundary conditions through alternating recharge/discharge relationships. One of the key drivers of environmental change is climate variability. The question then arises: how can geosphere stability be assessed with respect to changes in climate? Key issues raised in connection with this are:
    • What evidence is there that 'going underground' eliminates the extreme conditions that storage on the surface would be subjected to in the long term?
    • How can the additional stability and safety of the deep geosphere be demonstrated with evidence from the natural system?
    As a corollary, the capacity of repository sites deep underground in stable rock masses to mitigate potential impacts of future climate change on groundwater conditions needs to be tested and demonstrated. Generic scenarios for groundwater evolution relating to climate change are currently weakly constrained by data and process understanding. Hence, the possibility of site-specific changes in groundwater conditions in the future can only be assessed and demonstrated by studying groundwater evolution in the past. Stability of groundwater conditions in the past is an indication of future stability, though both the climatic and geological contexts must be taken into account in making such an assertion.