    A hybrid intrusion detection system

    Anomaly intrusion detection normally suffers from high false alarm rates, and a high volume of false alarms prevents system administrators from identifying the real attacks. Machine learning methods provide an effective way to decrease the false alarm rate and improve the detection rate of anomaly intrusion detection. In this research, we propose a novel approach that uses kernel methods and Support Vector Machines (SVMs) to improve the accuracy of anomaly intrusion detectors. Two kernels, the STIDE kernel and the Markov Chain kernel, are developed specifically for intrusion detection applications. Experiments show that the two-class SVM anomaly detectors based on the STIDE and Markov Chain kernels achieve better accuracy than the original STIDE and Markov Chain anomaly detectors.

    Generally, anomaly intrusion detection approaches build normal profiles from labeled training data. However, labeled training data for intrusion detection is expensive and not easy to obtain. We propose an anomaly detection approach, based on a one-class SVM with the STIDE and Markov Chain kernels, that needs no labeled training data. To further increase the detection rate and lower the false alarm rate, we also propose integrating specification-based intrusion detection with anomaly intrusion detection.

    This research also establishes a platform that automatically generates both misuse and anomaly intrusion detection software agents. In our method, a SIFT representing an intrusion is automatically converted into a Colored Petri Net (CPN) representing an intrusion detection template; the CPN is then compiled into code for a misuse intrusion detection software agent, which is dynamically loaded and launched. Likewise, a model representing a normal profile is automatically generated from training data, and an anomaly intrusion detection agent carrying this model is generated and launched. By engaging both misuse and anomaly intrusion detection agents, our system can detect known attacks as well as novel, unknown attacks.
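
    As a concrete illustration of the label-free, one-class setting described above, the sketch below trains a one-class SVM over STIDE-style sliding windows of system-call traces. It is a minimal stand-in, not the paper's method: a window-count feature map with an RBF kernel merely approximates the role of the STIDE kernel, the traces are hypothetical, and scikit-learn is assumed.

    ```python
    # Minimal sketch of a label-free anomaly detector: a one-class SVM
    # over STIDE-style sliding windows of system-call traces. The
    # window-count feature map + RBF kernel stands in for the STIDE kernel.
    from collections import Counter

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.svm import OneClassSVM

    def windows(trace, k=6):
        """Slide a length-k window over a system-call trace (STIDE-style)."""
        return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

    def featurize(traces, k=6):
        """Map each trace to counts of its length-k windows."""
        return [Counter(windows(t, k)) for t in traces]

    # Unlabeled traces of normal behaviour (hypothetical data).
    normal_traces = [["open", "read", "mmap", "read", "close", "exit"]] * 50

    vec = DictVectorizer()
    X_train = vec.fit_transform(featurize(normal_traces))

    # One-class SVM learns the support of normal windows; no attack labels.
    detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)

    test = [["open", "read", "execve", "socket", "close", "exit"]]
    print(detector.predict(vec.transform(featurize(test))))  # -1 = anomaly
    ```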

    35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

    Automaton-guided compression and rational kernels

    Due to the expansion of data, compression algorithms have become crucial. We address the problem of finding an optimal compression algorithm with respect to a given Markov source, and to this end we extend the classical Huffman algorithm. First, Huffman coding is applied locally at each state of the Markov source, and we quantify the efficiency this achieves. To come closer to optimal efficiency, we then give a second algorithm, still applied locally at each state, which encodes the factors leaving each state so that each factor's probability is a power of 1/2 (the Huffman algorithm being optimal if and only if every symbol to be encoded has a probability that is a power of 1/2). As a perspective, we give a further algorithm, restricted to compressing the star, for encoding a weighted (multiplicity) expression, with the longer-term goal of encoding a complete expression.

    Kernels are popular methods for measuring the similarity between words for classification and learning. We generalize the definition of rational kernels in order to apply kernels to the comparison of languages. We study this generalization for the factor and subsequence kernels and prove that these kernels are well defined for parameters chosen in an appropriate interval. We give different methods for building weighted transducers that compute these kernels.
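
    A minimal sketch of the state-local idea may help: one Huffman code is built per state of the Markov source, over that state's outgoing transition probabilities. The example chain is invented for illustration; note that the code for state q0 is optimal because its probabilities are all powers of 1/2, matching the optimality condition cited above.

    ```python
    # Sketch of state-local Huffman coding for a Markov source: one code
    # per state, built over that state's outgoing symbol probabilities.
    import heapq

    def huffman_code(probs):
        """Standard Huffman construction; probs maps symbol -> probability."""
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            p0, _, c0 = heapq.heappop(heap)
            p1, _, c1 = heapq.heappop(heap)
            merged = {s: "0" + w for s, w in c0.items()}
            merged.update({s: "1" + w for s, w in c1.items()})
            heapq.heappush(heap, (p0 + p1, tiebreak, merged))
            tiebreak += 1
        return heap[0][2]

    # Hypothetical Markov source: state -> {symbol: transition probability}.
    chain = {
        "q0": {"a": 0.5, "b": 0.25, "c": 0.25},  # powers of 1/2: optimal case
        "q1": {"a": 0.7, "b": 0.3},              # not powers of 1/2: suboptimal
    }

    for state, probs in chain.items():
        print(state, huffman_code(probs))  # q0 {'a': '0', 'b': '10', 'c': '11'}
    ```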

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

    Proceedings of the 26th International Symposium on Theoretical Aspects of Computer Science (STACS'09)

    The Symposium on Theoretical Aspects of Computer Science (STACS) is held alternately in France and in Germany. The conference of February 26-28, 2009, held in Freiburg, is the 26th in this series. Previous meetings took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan (1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck (1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002), Berlin (2003), Montpellier (2004), Stuttgart (2005), Marseille (2006), Aachen (2007), and Bordeaux (2008).

    Efficient similarity computations on parallel machines using data shaping

    Similarity computation is a fundamental operation over all forms of data. Big Data is typically characterized by attributes such as volume, velocity, variety, and veracity. Its variety appears in structured, semi-structured, and unstructured forms, and the volume of Big Data in general, and of semi-structured data in particular, is increasing at a phenomenal rate. The Big Data phenomenon poses a new set of challenges for similarity computation problems over semi-structured data.

    Technology and processor-architecture trends strongly suggest that future processors will have tens of thousands of cores (hardware threads), while the ratio of on-chip and off-chip memory to core count is decreasing. State-of-the-art parallel computing platforms such as General Purpose Graphics Processing Units (GPGPUs) and MICs are promising for high-performance as well as high-throughput computing. However, processing the semi-structured component of Big Data efficiently on such systems is challenging, because most of these platforms are organized as highly structured Single Instruction Multiple Thread/Data machines, in which many cores (streaming processors) operate in lock-step, or they require a high degree of task-level parallelism. We argue that effective and efficient solutions to key similarity computation problems need to operate in synergy with the underlying hardware, and that semi-structured input data needs to be shaped, i.e., reorganized, to exploit the enormous computing power of state-of-the-art highly threaded architectures such as GPUs. For example, shaping input data (via encoding) to minimize data dependence can enable flexible and concurrent computation on high-throughput accelerators and co-processors such as GPUs and MICs.

    We consider various traditional and emerging problems at the intersection of semi-structured data and data analytics. Preprocessing is common at the initial stages of data processing pipelines and typically involves operations such as data extraction and data selection. In the context of semi-structured data, twig filtering is used to identify and extract data of interest. Duplicate detection and record linkage are useful in preprocessing tasks such as data cleaning and data fusion, and also in data mining, for finding similar tree objects. Likewise, tree edit distance is a fundamental metric for tree problems, and similarity computation between trees is another key problem in the context of Big Data.

    This dissertation makes a case for platform-centric data shaping as a potent mechanism to tackle the data- and architecture-borne issues of semi-structured data processing on GPUs and GPU-like parallel architectures. We propose several data shaping techniques for tree matching problems over semi-structured data and experiment with real-world datasets. The experimental results show that the proposed platform-centric data shaping approach is effective for computing similarities between tree objects on GPGPUs, with performance gains of up to three orders of magnitude, depending on the problem and platform.
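
    The sketch below illustrates the data shaping idea under stated assumptions: each tree is flattened into contiguous pre-order label and parent-index arrays, the kind of layout a SIMT device can scan without pointer chasing. NumPy stands in for an actual GPU kernel, and the label-overlap measure is a toy similarity, not the dissertation's tree edit metric.

    ```python
    # Sketch of data shaping: flatten pointer-based trees into contiguous
    # pre-order arrays (labels + parent indices) so a SIMT device could
    # scan them without pointer chasing.
    import numpy as np

    def flatten(tree, labels=None, parents=None, parent=-1):
        """tree = (label, [children]); returns parallel label/parent arrays."""
        if labels is None:
            labels, parents = [], []
        label, children = tree
        idx = len(labels)
        labels.append(label)
        parents.append(parent)
        for child in children:
            flatten(child, labels, parents, idx)
        return np.array(labels), np.array(parents)

    t1 = ("a", [("b", []), ("c", [("d", [])])])   # hypothetical trees
    t2 = ("a", [("c", []), ("d", [])])

    l1, _ = flatten(t1)
    l2, _ = flatten(t2)

    # Data-parallel-friendly overlap: histogram labels, then intersect.
    u = np.union1d(l1, l2)
    h1 = np.array([(l1 == x).sum() for x in u])
    h2 = np.array([(l2 == x).sum() for x in u])
    print(np.minimum(h1, h2).sum() / max(len(l1), len(l2)))  # 0.75
    ```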

    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Although several researchers have used statistical methods to show that aspiration followed by injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that can produce different recovery rates for patients. This study therefore combines a statistical method with decision tree techniques to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since the collected data set is small, containing only 212 records, we use all of the data as training data. Instead of generating rules directly from the resulting tree, we first use the value of each node as a cut point to generate all possible rules from the tree; we then verify these rules with a t-test to discover useful description rules. Experimental results show that this approach can find new and interesting knowledge about recurrent ovarian endometriomas under different conditions.
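
    A minimal sketch of the two-stage procedure, with synthetic data standing in for the 212-record clinical set: fit a decision tree on all records, read candidate cut points off its internal nodes, and keep only the splits whose outcome difference passes a t-test. The features, thresholds, and 0.05 significance level are illustrative assumptions; scikit-learn and SciPy are assumed to be available.

    ```python
    # Sketch of the two-stage rule mining: a decision tree proposes cut
    # points, a t-test verifies them. Synthetic data stands in for the
    # 212-record clinical set.
    import numpy as np
    from scipy import stats
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(212, 3))                    # 3 hypothetical features
    y = (X[:, 0] + 0.5 * rng.normal(size=212) > 0).astype(int)  # recurrence

    # Small data set, so the whole set is used for training, as in the paper.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    # Each internal node contributes a candidate rule "feature <= threshold".
    t = tree.tree_
    for node in range(t.node_count):
        if t.children_left[node] == -1:              # skip leaves
            continue
        f, thr = t.feature[node], t.threshold[node]
        left, right = y[X[:, f] <= thr], y[X[:, f] > thr]
        tstat, p = stats.ttest_ind(left, right, equal_var=False)
        if p < 0.05:                                 # keep verified rules only
            print(f"x[{f}] <= {thr:.2f}: p = {p:.4g}")
    ```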

    Probabilistic Logic, Probabilistic Regular Expressions, and Constraint Temporal Logic

    The classic theorems of Büchi and Kleene state the expressive equivalence of finite automata to monadic second-order (MSO) logic and regular expressions, respectively. These fundamental results enjoy applications in nearly every field of theoretical computer science. Around the same time as Büchi and Kleene, Rabin investigated probabilistic finite automata, an equally well-established model with applications ranging from natural language processing to probabilistic model checking. Here, we extend Büchi's theorem and Kleene's theorem to the probabilistic setting.

    We obtain a probabilistic MSO logic by adding an expected second-order quantifier; within its scope, membership is determined by a Bernoulli process. This approach turns out to be universal and is applicable to finite and infinite words as well as to finite trees. To prove the expressive equivalence of this probabilistic MSO logic to probabilistic automata, we show a Nivat theorem, which decomposes a recognisable function into a regular language, homomorphisms, and a probability measure.

    For regular expressions, we build upon existing work to obtain probabilistic regular expressions on finite and infinite words, and we show their expressive equivalence to probabilistic Muller automata. To handle Muller acceptance conditions, we give a new construction from probabilistic regular expressions to Muller automata. For finite trees, we define probabilistic regular tree expressions using a new iteration operator, called infinity-iteration, and again show that these expressions are expressively equivalent to probabilistic tree automata.

    On a second track of our research, we investigate Constraint LTL over multidimensional data words with data values from the infinite tree. Such LTL formulas are evaluated over infinite words in which every position carries several data values from the infinite tree, and Constraint LTL can compare values at different positions. We show that the model checking problem for this logic is PSPACE-complete, via an investigation of the emptiness problem for Constraint Büchi automata.
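
    For intuition about the probabilistic automata these equivalences target, here is a minimal sketch of a Rabin-style probabilistic finite automaton on finite words: one stochastic matrix per letter, with the acceptance probability of a word obtained by multiplying the initial distribution through the letter matrices and against the final vector. The two-state example is invented for illustration.

    ```python
    # Sketch of a Rabin-style probabilistic finite automaton: one
    # stochastic matrix per letter; the acceptance probability of a word
    # w1..wn is initial . M(w1) ... M(wn) . final.
    import numpy as np

    M = {
        "a": np.array([[0.9, 0.1],
                       [0.0, 1.0]]),
        "b": np.array([[0.5, 0.5],
                       [0.2, 0.8]]),
    }
    initial = np.array([1.0, 0.0])   # start in state 0
    final = np.array([0.0, 1.0])     # state 1 is accepting

    def accept_prob(word):
        """Probability that the PFA accepts the given word."""
        dist = initial
        for letter in word:
            dist = dist @ M[letter]
        return float(dist @ final)

    print(accept_prob("ab"))   # 0.53
    ```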