59,830 research outputs found

    Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning

    Get PDF
    This paper focuses on the specific problem of Big Data classification of network intrusion traffic. It discusses the system challenges presented by the Big Data problems associated with network intrusion prediction. The prediction of a possible intrusion attack in a network requires continuous collection of traffic data and learning of their characteristics on the fly. The continuous collection of traffic data by the network leads to Big Data problems that are caused by the volume, variety and velocity properties of Big Data. The learning of the network characteristics requires machine learning techniques that capture global knowledge of the traffic patterns. The Big Data properties will lead to significant system challenges to implement machine learning frameworks. This paper discusses the problems and challenges in handling Big Data classification using geometric representation-learning techniques and the modern Big Data networking technologies. In particular this paper discusses the issues related to combining supervised learning techniques, representation-learning techniques, machine lifelong learning techniques and Big Data technologies (e.g. Hadoop, Hive and Cloud) for solving network traffic classification problems

    Selección de discriminadores de tráfico de red para clasificación en tiempo real

    Get PDF
    There are several techniques to select a set of traffic features for traffic classification. However, most studies ignore the domain knowledge where traffic analysis or classification is performed and do not consider the always moving information carried in the networks. This paper describes a selection process of online network-traffic discriminators. We obtained 24 traffic features that can be processed on the fly and propose them as a base attribute set for future domain-aware online analysis, processing, or classification. For the selection of a set of traffic discriminators, and to avoid the inconveniences mentioned, we carried out three steps. The first step is a context knowledge-based manual selection of traffic features that meet the condition of being obtained on the fly from the flow. The second step is focused on the quality analysis of previously selected attributes to ensure the relevance of each one when performing a traffic classification. In the third step, the implementation of several incremental learning algorithms verified the usefulness of such attributes in online traffic classification processes. Existen varias técnicas para seleccionar un conjunto de variables para clasificación del tráfico de red. Sin embargo, muchos estudios ignoran el ámbito del conocimiento en donde el análisis y clasificación del tráfico tiene lugar y no consideran la información, siempre en movimiento, que se transporta en dichas redes. Este artículo describe el proceso de selección de discriminadores tráfico de redes en línea. Se obtuvieron 24 características que pueden procesarse en tiempo real y se proponen como los conjuntos de atributos base para futuros análisis, procesamiento y calificación conscientes del dominio (domain-aware). Para la selección de un conjunto de discriminadores de tráfico y con el fin de evitar los inconvenientes mencionados anteriormente, se llevaron a cabo tres etapas. La primera consiste en la selección manual basada en el conocimiento contextual de las características de tráfico de red que tengan las condiciones de obtener en tiempo real a partir del flujo. La segunda etapa se enfoca en la calidad del análisis de los atributos previamente seleccionados para asegurar la relevancia de cada uno a la hora de efectuar la clasificación del tráfico. En la tercera etapa, la implementación de varios algoritmos de aprendizaje incremental verifican la idoneidad de tales atributos en procesos de clasificación de tráfico en línea

    Highly Abrasion-resistant and Long-lasting Concrete

    Get PDF
    Studded tire usage in Alaska contributes to rutting damage on pavements resulting in high maintenance costs and safety issues. In this study binary, ternary, and quaternary highly-abrasion resistant concrete mix designs, using supplementary cementitious materials (SCMs), were developed. The fresh, mechanical and durability properties of these mix designs were then tested to determine an optimum highly-abrasion resistant concrete mix that could be placed in cold climates to reduce rutting damage. SCMs used included silica fume, ground granulated blast furnace slag, and type F fly ash. Tests conducted measured workability, air content, drying shrinkage, compressive strength, flexural strength, and chloride ion permeability. Resistance to freeze-thaw cycles, scaling due to deicers, and abrasion resistance were also measured. A survey and literature review on concrete pavement practices in Alaska and other cold climates was also conducted. A preliminary construction cost analysis comparing the concrete mix designs developed was also completed

    Dual Averaging Method for Online Graph-structured Sparsity

    Full text link
    Online learning algorithms update models via one sample per iteration, thus efficient to process large-scale datasets and useful to detect malicious events for social benefits, such as disease outbreak and traffic congestion on the fly. However, existing algorithms for graph-structured models focused on the offline setting and the least square loss, incapable for online setting, while methods designed for online setting cannot be directly applied to the problem of complex (usually non-convex) graph-structured sparsity model. To address these limitations, in this paper we propose a new algorithm for graph-structured sparsity constraint problems under online setting, which we call \textsc{GraphDA}. The key part in \textsc{GraphDA} is to project both averaging gradient (in dual space) and primal variables (in primal space) onto lower dimensional subspaces, thus capturing the graph-structured sparsity effectively. Furthermore, the objective functions assumed here are generally convex so as to handle different losses for online learning settings. To the best of our knowledge, \textsc{GraphDA} is the first online learning algorithm for graph-structure constrained optimization problems. To validate our method, we conduct extensive experiments on both benchmark graph and real-world graph datasets. Our experiment results show that, compared to other baseline methods, \textsc{GraphDA} not only improves classification performance, but also successfully captures graph-structured features more effectively, hence stronger interpretability.Comment: 11 pages, 14 figure

    Deep supervised learning using local errors

    Get PDF
    Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers. Learning using delayed and non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural networks as it requires the neurons to maintain a memory of the input long enough until the higher-layer errors arrive. In this paper, we propose an alternative learning mechanism where errors are generated locally in each layer using fixed, random auxiliary classifiers. Lower layers could thus be trained independently of higher layers and training could either proceed layer by layer, or simultaneously in all layers using local error information. We address biological plausibility concerns such as weight symmetry requirements and show that the proposed learning mechanism based on fixed, broad, and random tuning of each neuron to the classification categories outperforms the biologically-motivated feedback alignment learning technique on the MNIST, CIFAR10, and SVHN datasets, approaching the performance of standard backpropagation. Our approach highlights a potential biological mechanism for the supervised, or task-dependent, learning of feature hierarchies. In addition, we show that it is well suited for learning deep networks in custom hardware where it can drastically reduce memory traffic and data communication overheads

    Harvesting Data from Advanced Technologies

    Get PDF
    Data streams are emerging everywhere such as Web logs, Web page click streams, sensor data streams, and credit card transaction flows. Different from traditional data sets, data streams are sequentially generated and arrive one by one rather than being available for random access before learning begins, and they are potentially huge or even infinite that it is impractical to store the whole data. To study learning from data streams, we target online learning, which generates a best–so far model on the fly by sequentially feeding in the newly arrived data, updates the model as needed, and then applies the learned model for accurate real-time prediction or classification in real-world applications. Several challenges arise from this scenario: first, data is not available for random access or even multiple access; second, data imbalance is a common situation; third, the performance of the model should be reasonable even when the amount of data is limited; fourth, the model should be updated easily but not frequently; and finally, the model should always be ready for prediction and classification. To meet these challenges, we investigate streaming feature selection by taking advantage of mutual information and group structures among candidate features. Streaming feature selection reduces the number of features by removing noisy, irrelevant, or redundant features and selecting relevant features on the fly, and brings about palpable effects for applications: speeding up the learning process, improving learning accuracy, enhancing generalization capability, and improving model interpretation. Compared with traditional feature selection, which can only handle pre-given data sets without considering the potential group structures among candidate features, streaming feature selection is able to handle streaming data and select meaningful and valuable feature sets with or without group structures on the fly. In this research, we propose 1) a novel streaming feature selection algorithm (GFSSF, Group Feature Selection with Streaming Features) by exploring mutual information and group structures among candidate features for both group and individual levels of feature selection from streaming data, 2) a lazy online prediction model with data fusion, feature selection and weighting technologies for real-time traffic prediction from heterogeneous sensor data streams, 3) a lazy online learning model (LB, Live Bayes) with dynamic resampling technology to learn from imbalanced embedded mobile sensor data streams for real-time activity recognition and user recognition, and 4) a lazy update online learning model (CMLR, Cost-sensitive Multinomial Logistic Regression) with streaming feature selection for accurate real-time classification from imbalanced and small sensor data streams. Finally, by integrating traffic flow theory, advanced sensors, data gathering, data fusion, feature selection and weighting, online learning and visualization technologies to estimate and visualize the current and future traffic, a real-time transportation prediction system named VTraffic is built for the Vermont Agency of Transportation

    Soil Stabilization Manual 2014 Update

    Get PDF
    Soil Stabilization is used for a variety of activities including temporary wearing curses, working platforms, improving poor subgrade materials, upgrading marginal materials, dust control, and recycling old roads containing marginal materials. There are a number methods of stabilizing soils including modifying the gradation, the use of asphalt or cement stabilizers, geofiber stabilization and chemical stabilization. Selection of the method depends on the soil type, environment and application. This manual provide tools and guidance in the selection of the proper stabilization method and information on how to apply the method. A major portion of this manual is devoted to the use of stabilizing agents. The methods described here are considered best practices for Alaska.State of Alaska, Alaska Dept. of Transportation and Public Facilitie

    Structural Design Guidelines for Pervious Concrete Pavements

    Get PDF
    Pervious pavements have gained popularity in recent years as the transportation industry focuses on sustainability and environmental impact. This research investigated the structural design of pervious concrete pavements. There is no standard design method; therefore, the goal was to lessen ambiguity surrounding the use of pervious concrete for pavement structures. By characterization of the rigid pavement design equation from the 1993 AASHTO Structural Design Guide for Design of Pavement Structures through laboratory exploration and review of existing literature, a guide was created to assist engineers in the design of pervious concrete pavements
    • …
    corecore