326 research outputs found

    Structured learning of sum-of-submodular higher order energy functions

    Full text link
    Submodular functions can be exactly minimized in polynomial time, and the special case that graph cuts solve with max flow \cite{KZ:PAMI04} has had significant impact in computer vision \cite{BVZ:PAMI01,Kwatra:SIGGRAPH03,Rother:GrabCut04}. In this paper we address the important class of sum-of-submodular (SoS) functions \cite{Arora:ECCV12,Kolmogorov:DAM12}, which can be efficiently minimized via a variant of max flow called submodular flow \cite{Edmonds:ADM77}. SoS functions can naturally express higher order priors involving, e.g., local image patches; however, it is difficult to fully exploit their expressive power because they have so many parameters. Rather than trying to formulate existing higher order priors as an SoS function, we take a discriminative learning approach, effectively searching the space of SoS functions for a higher order prior that performs well on our training set. We adopt a structural SVM approach \cite{Joachims/etal/09a,Tsochantaridis/etal/04} and formulate the training problem in terms of quadratic programming; as a result we can efficiently search the space of SoS priors via an extended cutting-plane algorithm. We also show how the state-of-the-art max flow method for vision problems \cite{Goldberg:ESA11} can be modified to efficiently solve the submodular flow problem. Experimental comparisons are made against the OpenCV implementation of the GrabCut interactive segmentation technique \cite{Rother:GrabCut04}, which uses hand-tuned parameters instead of machine learning. On a standard dataset \cite{Gulshan:CVPR10} our method learns higher order priors with hundreds of parameter values, and produces significantly better segmentations. While our focus is on binary labeling problems, we show that our techniques can be naturally generalized to handle more than two labels

    Structured learning of sum-of-submodular higher order energy functions

    Get PDF
    Submodular functions can be exactly minimized in polynomial time, and the special case that graph cuts solve with max flow [18] has had significant impact in computer vision [5, 20, 27]. In this paper we address the important class of sum-of-submodular (SoS) functions [2, 17], which can be efficiently minimized via a variant of max flow called submodular flow [6]. SoS functions can naturally express higher order priors involving, e.g., local image patches; however, it is difficult to fully exploit their expressive power because they have so many parameters. Rather than trying to formulate existing higher order priors as an SoS function, we take a discriminative learning approach, effectively searching the space of SoS functions for a higher order prior that performs well on our training set. We adopt a structural SVM approach [14, 33] and formulate the training problem in terms of quadratic programming; as a result we can efficiently search the space of SoS priors via an extended cutting-plane algorithm. We also show how the state-of-the-art max flow method for vision problems [10] can be modified to efficiently solve the submodular flow problem. Experimental comparisons are made against the OpenCV implementation of the GrabCut interactive segmentation technique [27], which uses hand-tuned parameters instead of machine learning. On a standard dataset [11] our method learns higher order priors with hundreds of parameter values, and produces significantly better segmentations. While our focus is on binary labeling problems, we show that our techniques can be naturally generalized to handle more than two labels. 1

    The P-ART framework for placement of virtual network services in a multi-cloud environment

    Get PDF
    Carriers’ network services are distributed, dynamic, and investment intensive. Deploying them as virtual network services (VNS) brings the promise of low-cost agile deployments, which reduce time to market new services. If these virtual services are hosted dynamically over multiple clouds, greater flexibility in optimizing performance and cost can be achieved. On the flip side, when orchestrated over multiple clouds, the stringent performance norms for carrier services become difficult to meet, necessitating novel and innovative placement strategies. In selecting the appropriate combination of clouds for placement, it is important to look ahead and visualize the environment that will exist at the time a virtual network service is actually activated. This serves multiple purposes — clouds can be selected to optimize the cost, the chosen performance parameters can be kept within the defined limits, and the speed of placement can be increased. In this paper, we propose the P-ART (Predictive-Adaptive Real Time) framework that relies on predictive-deductive features to achieve these objectives. With so much riding on predictions, we include in our framework a novel concept-drift compensation technique to make the predictions closer to reality by taking care of long-term traffic variations. At the same time, near real-time update of the prediction models takes care of sudden short-term variations. These predictions are then used by a new randomized placement heuristic that carries out a fast cloud selection using a least-cost latency-constrained policy. An empirical analysis carried out using datasets from a queuing-theoretic model and also through implementation on CloudLab, proves the effectiveness of the P-ART framework. The placement system works fast, placing thousands of functions in a sub-minute time frame with a high acceptance ratio, making it suitable for dynamic placement. We expect the framework to be an important step in making the deployment of carrier-grade VNS on multi-cloud systems, using network function virtualization (NFV), a reality

    The P-ART framework for placement of virtual network services in a multi-cloud environment

    Get PDF
    Carriers network services are distributed, dynamic, and investment intensive. Deploying them as virtual network services (VNS) brings the promise of low-cost agile deployments, which reduce time to market new services. If these virtual services are hosted dynamically over multiple clouds, greater flexibility in optimizing performance and cost can be achieved. On the flip side, when orchestrated over multiple clouds, the stringent performance norms for carrier services become difficult to meet, necessitating novel and innovative placement strategies. In selecting the appropriate combination of clouds for placement, it is important to look ahead and visualize the environment that will exist at the time a virtual network service is actually activated. This serves multiple purposes clouds can be selected to optimize the cost, the chosen performance parameters can be kept within the defined limits, and the speed of placement can be increased. In this paper, we propose the P-ART (Predictive-Adaptive Real Time) framework that relies on predictive-deductive features to achieve these objectives. With so much riding on predictions, we include in our framework a novel concept-drift compensation technique to make the predictions closer to reality by taking care of long-term traffic variations. At the same time, near real-time update of the prediction models takes care of sudden short-term variations. These predictions are then used by a new randomized placement heuristic that carries out a fast cloud selection using a least-cost latency-constrained policy. An empirical analysis carried out using datasets from a queuing-theoretic model and also through implementation on CloudLab, proves the effectiveness of the P-ART framework. The placement system works fast, placing thousands of functions in a sub-minute time frame with a high acceptance ratio, making it suitable for dynamic placement. We expect the framework to be an important step in making the deployment of carrier-grade VNS on multi-cloud systems, using network function virtualization (NFV), a reality.This publication was made possible by NPRP grant # 8-634-1-131 from the Qatar National Research Fund (a member of Qatar Foundation), National Science Foundation, USA � CNS-1718929 and National Science Foundation, USA � CNS-1547380 .Scopu

    Holistic Network Defense: Fusing Host and Network Features for Attack Classification

    Get PDF
    This work presents a hybrid network-host monitoring strategy, which fuses data from both the network and the host to recognize malware infections. This work focuses on three categories: Normal, Scanning, and Infected. The network-host sensor fusion is accomplished by extracting 248 features from network traffic using the Fullstats Network Feature generator and from the host using text mining, looking at the frequency of the 500 most common strings and analyzing them as word vectors. Improvements to detection performance are made by synergistically fusing network features obtained from IP packet flows and host features, obtained from text mining port, processor, logon information among others. In addition, the work compares three different machine learning algorithms and updates the script required to obtain network features. Hybrid method results outperformed host only classification by 31.7% and network only classification by 25%. The new approach also reduces the number of alerts while remaining accurate compared with the commercial IDS SNORT. These results make it such that even the most typical users could understand alert classification messages

    MFIRE-2: A Multi Agent System for Flow-based Intrusion Detection Using Stochastic Search

    Get PDF
    Detecting attacks targeted against military and commercial computer networks is a crucial element in the domain of cyberwarfare. The traditional method of signature-based intrusion detection is a primary mechanism to alert administrators to malicious activity. However, signature-based methods are not capable of detecting new or novel attacks. This research continues the development of a novel simulated, multiagent, flow-based intrusion detection system called MFIRE. Agents in the network are trained to recognize common attacks, and they share data with other agents to improve the overall effectiveness of the system. A Support Vector Machine (SVM) is the primary classifier with which agents determine an attack is occurring. Agents are prompted to move to different locations within the network to find better vantage points, and two methods for achieving this are developed. One uses a centralized reputation-based model, and the other uses a decentralized model optimized with stochastic search. The latter is tested for basic functionality. The reputation model is extensively tested in two configurations and results show that it is significantly superior to a system with non-moving agents. The resulting system, MFIRE-2, demonstrates exciting new network defense capabilities, and should be considered for implementation in future cyberwarfare applications

    Predicting Flavonoid UGT Regioselectivity with Graphical Residue Models and Machine Learning.

    Get PDF
    Machine learning is applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. A variety of models employing the indices are then investigated by examining their performance when analyzed using nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classifications relying on standard alignment similarity scores are reported

    Simple low cost causal discovery using mutual information and domain knowledge

    Get PDF
    PhDThis thesis examines causal discovery within datasets, in particular observational datasets where normal experimental manipulation is not possible. A number of machine learning techniques are examined in relation to their use of knowledge and the insights they can provide regarding the situation under study. Their use of prior knowledge and the causal knowledge produced by the learners are examined. Current causal learning algorithms are discussed in terms of their strengths and limitations. The main contribution of the thesis is a new causal learner LUMIN that operates with a polynomial time complexity in both the number of variables and records examined. It makes no prior assumptions about the form of the relationships and is capable of making extensive use of available domain information. This learner is compared to a number of current learning algorithms and it is shown to be competitive with them

    A Comprehensive Survey on Rare Event Prediction

    Full text link
    Rare event prediction involves identifying and forecasting events with a low probability using machine learning and data analysis. Due to the imbalanced data distributions, where the frequency of common events vastly outweighs that of rare events, it requires using specialized methods within each step of the machine learning pipeline, i.e., from data processing to algorithms to evaluation protocols. Predicting the occurrences of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistical and machine learning. This paper comprehensively reviews the current approaches for rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers.Comment: 44 page
    corecore