
    kLog: A Language for Logical and Relational Learning with Kernels

    We introduce kLog, a novel approach to statistical relational learning. Unlike standard approaches, kLog does not represent a probability distribution directly. It is rather a language for performing kernel-based learning on expressive logical and relational representations. kLog allows users to specify learning problems declaratively. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, logic programming, and deductive databases. Access by the kernel to the rich representation is mediated by a technique we call graphicalization: the relational representation is first transformed into a graph, in particular a grounded entity/relationship diagram. Subsequently, a choice of graph kernel defines the feature space. kLog supports mixed numerical and symbolic data, as well as background knowledge in the form of Prolog or Datalog programs, as in inductive logic programming systems. The kLog framework can be applied to tackle the same range of tasks that has made statistical relational learning so popular, including classification, regression, multitask learning, and collective classification. We also report on empirical comparisons, showing that kLog can be either more accurate, or much faster at the same level of accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at http://klog.dinfo.unifi.it along with tutorials.
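
The graphicalization idea above can be pictured in a few lines of Python. The following is an illustrative toy, not kLog's actual implementation (which is a Prolog-embedded language): ground facts become a labeled graph, and a deliberately simple node-label histogram kernel stands in for the richer graph kernels kLog supports.

```python
# Toy "graphicalization": ground entity/relation facts into a labeled graph,
# then compare two graphs with a node-label histogram kernel.
from collections import Counter

def graphicalize(entities, relations):
    """Build a labeled graph from (id, label) entities and (rel, a, b) facts."""
    nodes = {eid: label for eid, label in entities}
    edges = set()
    for rel, a, b in relations:
        # Each ground binary relation becomes a labeled edge.
        edges.add((min(a, b), max(a, b), rel))
    return nodes, edges

def label_histogram_kernel(g1, g2):
    """Dot product of node-label counts: the simplest of graph kernels."""
    h1 = Counter(g1[0].values())
    h2 = Counter(g2[0].values())
    return sum(h1[lbl] * h2[lbl] for lbl in h1)

# Two toy interpretations (e.g., tiny molecules).
mol1 = graphicalize([(1, "C"), (2, "C"), (3, "O")],
                    [("bond", 1, 2), ("bond", 2, 3)])
mol2 = graphicalize([(1, "C"), (2, "O")], [("bond", 1, 2)])
print(label_histogram_kernel(mol1, mol2))  # 2*1 + 1*1 = 3
```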

    Data Mining in Internet of Things Systems: A Literature Review

    The Internet of Things (IoT) and cloud technologies have been the main focus of recent research, allowing for the accumulation of a vast amount of data generated from this diverse environment. These data no doubt contain priceless knowledge, if it can be correctly discovered and correlated in an efficient manner. Data mining algorithms can be applied to the IoT to extract hidden information from the massive amounts of data generated by IoT devices, which are thought to have high business value. In this paper, the most important data mining approaches are covered, spanning classification, clustering, association analysis, time series analysis, and outlier analysis. Additionally, a survey of recent work in this direction is included. Other significant challenges in the field are collecting, storing, and managing the large number of devices along with their associated features. Finally, an in-depth look at data mining for IoT platforms is given, concentrating on real applications found in the literature.
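
As a concrete illustration of one of the surveyed tasks, outlier analysis, the sketch below flags anomalous IoT sensor readings with a simple z-score rule. The data, threshold, and function names are invented for illustration and are not from the paper.

```python
# Flag readings that lie far from the mean of a batch of sensor values.
from statistics import mean, stdev

def zscore_outliers(readings, threshold=2.0):
    """Return readings more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(readings), stdev(readings)
    return [x for x in readings if abs(x - mu) / sigma > threshold]

# Temperature stream with one faulty sensor reading.
temps = [21.1, 21.3, 20.9, 21.0, 21.2, 35.0, 21.1, 20.8]
print(zscore_outliers(temps))  # [35.0]
```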

    Probabilistic multiple kernel learning

    The integration of multiple and possibly heterogeneous information sources for an overall decision-making process has been an open and unresolved research direction in computing science since its very beginning. This thesis addresses parts of that direction by proposing probabilistic data integration algorithms for multiclass decisions, where an observation of interest is assigned to one of many categories based on a plurality of information channels.
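
The core idea of fusing several information channels through kernels can be sketched as a weighted sum of per-channel Gram matrices. This is a hypothetical illustration with fixed weights; the thesis itself learns such combinations probabilistically.

```python
# Combine per-channel Gram matrices into one composite kernel matrix.
def combine_kernels(grams, weights):
    """Weighted sum of same-sized Gram matrices given as nested lists."""
    n = len(grams[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, grams))
             for j in range(n)] for i in range(n)]

K_image = [[1.0, 0.2], [0.2, 1.0]]   # similarity from one channel
K_text  = [[1.0, 0.8], [0.8, 1.0]]   # similarity from another channel
K = combine_kernels([K_image, K_text], [0.5, 0.5])
print(K)
```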

    Dynamic Compressive Sensing of Time-Varying Signals via Approximate Message Passing

    In this work, the dynamic compressive sensing (CS) problem of recovering sparse, correlated, time-varying signals from sub-Nyquist, non-adaptive, linear measurements is explored from a Bayesian perspective. While a handful of Bayesian dynamic CS algorithms have previously been proposed in the literature, the ability to perform inference on high-dimensional problems in a computationally efficient manner remains elusive. In response, we propose a probabilistic dynamic CS signal model that captures both amplitude and support correlation structure, and describe an approximate message passing algorithm that performs soft signal estimation and support detection with a computational complexity that is linear in all problem dimensions. The algorithm, DCS-AMP, can perform either causal filtering or non-causal smoothing, and is capable of learning model parameters adaptively from the data through an expectation-maximization procedure. We provide numerical evidence that DCS-AMP performs within 3 dB of oracle bounds on synthetic data under a variety of operating conditions. We further describe the results of applying DCS-AMP to two real dynamic CS datasets, as well as a frequency estimation task, to bolster our claim that DCS-AMP is capable of offering state-of-the-art performance and speed on real-world high-dimensional problems.
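
DCS-AMP itself is beyond a short snippet, but its static building block, recovering a sparse vector from underdetermined linear measurements, can be illustrated with iterative soft-thresholding (ISTA), a simpler relative of AMP. All values below are toy data, not from the paper.

```python
# ISTA: gradient step on ||Ax - y||^2 followed by soft-thresholding (L1 prox).
def soft(v, t):
    """Soft-thresholding operator: shrink v toward zero by t."""
    return max(abs(v) - t, 0.0) * (1 if v > 0 else -1)

def ista(A, y, lam=0.1, step=0.2, iters=500):
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        resid = [sum(A[i][j] * x[j] for j in range(n)) - y[i]
                 for i in range(len(y))]
        grad = [sum(A[i][j] * resid[i] for i in range(len(y)))
                for j in range(n)]
        x = [soft(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x

A = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]]   # 2 measurements, 3 unknowns
y = [2.0, 0.0]                            # generated by the sparse x = [2, 0, 0]
x_hat = ista(A, y)
print([round(v, 2) for v in x_hat])       # close to [1.9, 0, 0] (L1 shrinkage)
```

With the L1 penalty, the recovered first coordinate is shrunk from 2 to about 2 − λ, the usual bias of soft-thresholding.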

    Multi-level Safety Performance Functions For High Speed Facilities

    High speed facilities are considered the backbone of any successful transportation system; Interstates, freeways, and expressways carry the majority of daily trips on the transportation network. Although these types of roads are considered relatively the safest, they still experience many crashes, many of them severe, which not only affect human lives but can also have tremendous economic and social impacts. These facts signify the necessity of enhancing the safety of these high speed facilities to ensure better and more efficient operation. Safety problems can be assessed through several approaches that help mitigate crash risk on both a long- and short-term basis. Therefore, the main focus of the research in this dissertation is to provide a risk assessment framework to promote safety and enhance mobility on freeways and expressways. Multi-level Safety Performance Functions (SPFs) were developed at the aggregate level using historical crash data and the corresponding exposure and risk factors to identify and rank sites with promise (hot-spots). Additionally, SPFs were developed at the disaggregate level utilizing real-time weather data collected from meteorological stations located along the freeway sections as well as traffic flow parameters collected from different detection systems such as Automatic Vehicle Identification (AVI) and Remote Traffic Microwave Sensors (RTMS). These disaggregate SPFs can identify real-time risks due to turbulent traffic conditions and their interactions with other risk factors. In this study, two main datasets were obtained from two different regions, comprising historical crash data, roadway geometric characteristics, aggregate weather and traffic parameters, as well as real-time weather and traffic data.
    At the aggregate level, Bayesian hierarchical models with spatial and random effects were compared to Poisson models to examine the safety effects of roadway geometrics on crash occurrence along freeway sections that feature mountainous terrain and adverse weather. At the disaggregate level, a framework for a proactive safety management system was provided, using traffic data collected from AVI and RTMS, real-time weather, and geometric characteristics. Different statistical techniques were implemented, ranging from classical frequentist classification approaches, which explain the relationship between an event (crash) occurring at a given time and a set of real-time risk factors, to more advanced models. A Bayesian updating approach, which combines prior knowledge with new data to achieve more reliable parameter estimation, was implemented. A relatively recent and promising machine learning technique, Stochastic Gradient Boosting, was also used to calibrate several models on datasets collected from mixed detection systems and real-time meteorological stations. The results suggest that both levels of analysis are important: the aggregate level provides a good understanding of different safety problems and supports the development of policies and countermeasures to reduce the total number of crashes, while at the disaggregate level, real-time safety functions support a more proactive traffic management system that not only enhances the performance of the high speed facilities and the whole traffic network but also provides safer mobility for people and goods. In general, the proposed multi-level analyses are useful in providing roadway authorities with detailed information on where countermeasures must be implemented and when resources should be devoted. The study also shows that traffic data collected from different detection systems can be a useful asset that should be utilized appropriately, not only to alleviate traffic congestion but also to mitigate increased safety risks. The overall proposed framework can maximize the benefit of the existing archived data for freeway authorities as well as for road users.
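
An aggregate-level SPF of the kind described is commonly specified as a Poisson log-linear model. The sketch below uses invented coefficients and segments, not the dissertation's fitted values, to rank hypothetical freeway sections by predicted crash frequency.

```python
# E[crashes] = exp(b0 + b1*ln(AADT) + b2*ln(length)) for one segment.
import math

def spf(aadt, seg_len_mi, b0=-7.5, b1=0.9, b2=1.0):
    """Expected annual crash frequency under a Poisson log-linear SPF."""
    return math.exp(b0 + b1 * math.log(aadt) + b2 * math.log(seg_len_mi))

# Hypothetical segments: (annual average daily traffic, length in miles).
segments = {"A": (45000, 1.2), "B": (80000, 0.8), "C": (30000, 2.0)}
ranked = sorted(segments, key=lambda s: spf(*segments[s]), reverse=True)
print(ranked)  # segments from highest to lowest predicted crash frequency
```

Ranking segments by the fitted SPF output is the basic mechanism behind hot-spot identification.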

    Bayesian Networks for Decision-Making and Causal Analysis under Uncertainty in Aviation

    Most decisions in aviation regarding systems and operations are currently taken under uncertainty, relying on limited measurable information, and with little assistance from formal methods and tools to help decision makers cope with all those uncertainties. This chapter illustrates how Bayesian analysis can constitute a systematic approach for dealing with uncertainties in aviation and air transport. The chapter addresses the three main ways in which Bayesian networks are currently employed for scientific or regulatory decision-making purposes in the aviation industry, depending on the extent to which decision makers rely totally or partially on formal methods. These three alternatives are illustrated with three aviation case studies that reflect research work carried out by the authors.
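
The elementary operation underlying such networks is Bayesian updating of a belief from evidence. A minimal two-node sketch (Fault -> Alarm, with invented probabilities) computes P(fault | alarm):

```python
# Bayes' rule for a two-node network: a rare fault and an imperfect alarm.
def posterior_fault(p_fault, p_alarm_given_fault, p_alarm_given_ok):
    """P(fault | alarm) via Bayes' rule."""
    num = p_alarm_given_fault * p_fault
    den = num + p_alarm_given_ok * (1.0 - p_fault)
    return num / den

# Base rate 1%, alarm fires 95% of the time on a fault, 5% false-alarm rate.
print(round(posterior_fault(0.01, 0.95, 0.05), 3))  # 0.161
```

Even a sensitive alarm leaves the fault probability modest when the fault is rare, the kind of counterintuitive result that motivates formal Bayesian tools for decision makers.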

    Uncovering gravitational-wave backgrounds from noises of unknown shape with LISA

    Detecting stochastic background radiation of cosmological origin is an exciting possibility for current and future gravitational-wave (GW) detectors. However, distinguishing it from other stochastic processes, such as instrumental noise and astrophysical backgrounds, is challenging. It is even more delicate for the space-based GW observatory LISA, since it cannot correlate its observations with other detectors, unlike today's terrestrial network. Nonetheless, with multiple measurements across the constellation and high accuracy in the noise level, detection is still possible. In the context of GW background detection, previous studies have assumed that instrumental noise has a known, possibly parameterized, spectral shape. To make our analysis robust against imperfect knowledge of the instrumental noise, we challenge this crucial assumption and assume that the single-link interferometric noises have an arbitrary and unknown spectrum. We investigate possible ways of separating instrumental and GW contributions by using realistic LISA data simulations with time-varying arms and second-generation time-delay interferometry. By fitting a generic spline model to the interferometer noise and a power-law template to the signal, we can detect GW stochastic backgrounds up to energy density levels comparable to those achieved with fixed-shape models. We also demonstrate that we can probe a region of the GW background parameter space that today's detectors cannot access.
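
A power-law template of the form Omega_GW(f) = A (f/f_ref)^alpha can be fitted by linear regression in log-log space, where the slope recovers alpha. The sketch below uses synthetic noiseless samples with alpha = 2/3 (the slope expected for compact-binary backgrounds); it is an illustration of the template, not the paper's spline-based pipeline.

```python
# Fit log(Omega) = alpha * log(f/f_ref) + log(A) by ordinary least squares.
import math

f_ref = 1e-3                                    # Hz; assumed reference frequency
freqs = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
omega = [1e-9 * (f / f_ref) ** (2 / 3) for f in freqs]  # synthetic samples

x = [math.log(f / f_ref) for f in freqs]
y = [math.log(o) for o in omega]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
amp = math.exp(my - slope * mx)
print(round(slope, 3), amp)  # slope recovers alpha = 2/3, amp recovers A = 1e-9
```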

    Uncertainty in Engineering

    This open access book provides an introduction to uncertainty quantification in engineering. Starting with preliminaries on Bayesian statistics and Monte Carlo methods, followed by material on imprecise probabilities, it then focuses on reliability theory and simulation methods for complex systems. The final two chapters discuss various aspects of aerospace engineering, considering stochastic model updating from an imprecise Bayesian perspective, and uncertainty quantification for aerospace flight modelling. Written by experts in the subject, and based on lectures given at the Second Training School of the European Research and Training Network UTOPIAE (Uncertainty Treatment and Optimization in Aerospace Engineering), which took place at Durham University (United Kingdom) from 2 to 6 July 2018, the book offers an essential resource for students as well as scientists and practitioners.
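
As a taste of the Monte Carlo material, the sketch below estimates a structural failure probability P(load > capacity) by sampling. The Gaussian distributions and parameters are illustrative assumptions, not from the book.

```python
# Monte Carlo reliability: sample load and capacity, count failure events.
import random

def failure_probability(n=100_000, seed=0):
    rng = random.Random(seed)       # fixed seed for reproducibility
    failures = 0
    for _ in range(n):
        load = rng.gauss(50.0, 10.0)       # demand on the structure
        capacity = rng.gauss(80.0, 10.0)   # resistance of the structure
        failures += load > capacity
    return failures / n

print(failure_probability())  # about 0.017 for these assumed distributions
```

For these parameters the exact answer is P(Z > 30/sqrt(200)) ≈ 0.017, so the sampled estimate can be checked against the closed form.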

    A Machine Learning Approach For Enhancing Security And Quality Of Service Of Optical Burst Switching Networks

    The Optical Burst Switching (OBS) network has become one of the most promising switching technologies for building the next generation of Internet backbone infrastructure. However, OBS networks still face a number of security and Quality of Service (QoS) challenges, particularly from Burst Header Packet (BHP) flooding attacks. In OBS, a core switch handles requests, reserving one of the unoccupied channels for incoming data bursts (DBs) through a BHP. An attacker can exploit this fact and send malicious BHPs without the corresponding DBs. If unresolved, threats such as BHP flooding attacks can result in low bandwidth utilization, limited network performance, high burst loss rates, and eventually denial of service (DoS). In this dissertation, we focus our investigations on network security and QoS in the presence of BHP flooding attacks. First, we proposed and developed a new security model that can be embedded into the OBS core switch architecture to prevent BHP flooding attacks. This countermeasure security model allows the OBS core switch to classify ingress nodes based on their behavior and the amount of reserved resources not being utilized. A malicious node causing a BHP flooding attack is blocked by the developed model until the risk disappears or the malicious node redeems itself. Using our security model, we can effectively and preemptively prevent a BHP flooding attack regardless of the strength of the attacker. In the second part of this dissertation, we investigated the potential use of machine learning (ML) in countering the risk of BHP flooding attacks. In particular, we proposed and developed a new series of rules, using the decision tree method, to prevent the risk of a BHP flooding attack. The proposed classification rule models were evaluated using different metrics to measure the overall performance of this approach.
    The experiments showed that rules derived from the decision trees did indeed counter BHP flooding attacks, and enabled the automatic classification of edge nodes at an early stage. In the third part of this dissertation, we performed a comparative study evaluating a number of ML techniques for classifying edge nodes, to determine the most suitable ML method for preventing this type of attack. The experimental results on a preprocessed dataset related to BHP flooding attacks showed that rule-based classifiers, in particular decision trees (C4.5), Bagging, and RIDOR, consistently derive classifiers that are more predictive than alternative ML algorithms, including AdaBoost, Logistic Regression, Naïve Bayes, SVM-SMO, and ANN-MultilayerPerceptron. Moreover, the harmonic mean, recall, and precision results of the rule-based and tree classifiers were more competitive than those of the remaining ML algorithms. Lastly, the runtime results in milliseconds showed that decision tree classifiers are not only more predictive but also more efficient than the other algorithms. Thus, our findings show that the decision tree classifier is the most appropriate technique for classifying ingress nodes to combat the BHP flooding attack problem.
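
A decision-tree-derived rule set of the kind described can be pictured as threshold tests on node behavior. The thresholds, labels, and feature below are invented for illustration and are not the dissertation's learned rules.

```python
# Classify an ingress node by the fraction of reserved bandwidth that carries
# no real data bursts: many BHPs with little actual data suggests flooding.
def classify_node(reserved, used):
    """Label a node from its unused-reservation ratio (toy rule set)."""
    unused_ratio = (reserved - used) / reserved
    if unused_ratio > 0.8:
        return "malicious"      # reservations almost never followed by bursts
    if unused_ratio > 0.4:
        return "suspicious"     # monitored; may be blocked if behavior persists
    return "trusted"

print(classify_node(reserved=100, used=5))    # malicious
print(classify_node(reserved=100, used=90))   # trusted
```

In the model described above, a node labeled malicious would have its BHPs dropped by the core switch until it redeems itself.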