Automatic Detection of Malware-Generated Domains with Recurrent Neural Models

Authors: Lison Pierre; Mavroeidis Vasileios
Publication date: 20/09/2017

Modern malware families often rely on domain-generation algorithms (DGAs) to determine rendezvous points with their command-and-control servers. Traditional defence strategies (such as blacklisting domains or IP addresses) are inadequate against such techniques because these algorithms produce a large and continuously changing list of domains. This paper demonstrates that a machine learning approach based on recurrent neural networks can detect DGA-generated domain names with high precision. The neural models are estimated on a large training set of domains generated by various malware families. Experimental results show that this data-driven approach can detect malware-generated domain names with an F_1 score of 0.971. Put differently, the model automatically detects 93% of malware-generated domain names at a false positive rate of 1:100. Comment: Submitted to NISK 201

arXiv.org e-Print Archive

BIBSYS: Open Journals Systems

Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models

Authors: Bibauw Serge; Lison Pierre
Publication date: 01/01/2017

Neural conversational models require substantial amounts of dialogue data for their parameter estimation and are therefore usually learned on large corpora such as chat forums or movie subtitles. These corpora are, however, often challenging to work with, notably because they frequently lack turn segmentation and contain many references external to the dialogue itself. This paper shows that these challenges can be mitigated by adding a weighting model to the architecture. The weighting model, which is itself estimated from dialogue data, associates each training example with a numerical weight that reflects its intrinsic quality for dialogue modelling. At training time, these sample weights are included in the empirical loss to be minimised. Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that including such a weighting model improves model performance on unsupervised metrics. Comment: Accepted to SIGDIAL 201
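
The idea of including sample weights in the empirical loss can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_cross_entropy`, the use of a negative log-likelihood, and the normalisation by the sum of the weights are all assumptions made for illustration.

```python
import math


def weighted_cross_entropy(probs_of_correct, weights):
    """Weighted average negative log-likelihood (hypothetical helper).

    probs_of_correct: probability the model assigns to the correct
        response for each training example (each in (0, 1]).
    weights: one non-negative quality weight per example, e.g. as
        produced by a separate weighting model.
    """
    if len(probs_of_correct) != len(weights):
        raise ValueError("need exactly one weight per example")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each example's loss is scaled by its weight, so low-quality
    # examples contribute less to the quantity being minimised.
    weighted_sum = sum(-w * math.log(p) for p, w in zip(probs_of_correct, weights))
    # Assumption: normalise by the total weight rather than the count.
    return weighted_sum / total_weight
```

With a weight of zero, an example drops out of the loss entirely; with equal weights, the function reduces to the ordinary mean negative log-likelihood.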

arXiv.org e-Print Archive

Crossref

Redefining Context Windows for Word Embedding Models: An Experimental Study

Authors: Kutuzov Andrey; Lison Pierre
Publication date: 01/01/2017

Distributional semantic models learn vector representations of words from the contexts in which they occur. Although the choice of context (which often takes the form of a sliding window) has a direct influence on the resulting embeddings, the exact role of this model component is still not fully understood. This paper presents a systematic analysis of context windows based on a set of four distinct hyper-parameters. We train continuous Skip-Gram models on two English-language corpora for various combinations of these hyper-parameters, and evaluate them on both lexical similarity and analogy tasks. Notable experimental results are the positive impact of cross-sentential contexts and the surprisingly good performance of right-context windows.
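
To make the notion of a configurable context window concrete, here is a minimal sketch that extracts (target, context) pairs from tokenised sentences. The function `context_pairs` and its parameters `size`, `position`, and `cross_sentence` are hypothetical names chosen for illustration; the abstract does not list the paper's four hyper-parameters, so this should not be read as their definition.

```python
def context_pairs(sentences, size=2, position="symmetric", cross_sentence=False):
    """Return (target, context) word pairs from a sliding window.

    sentences: list of token lists.
    size: number of context tokens taken on each included side.
    position: "left", "right", or "symmetric" window placement.
    cross_sentence: if True, windows may span sentence boundaries.
    """
    if position not in ("left", "right", "symmetric"):
        raise ValueError("position must be 'left', 'right' or 'symmetric'")
    if cross_sentence:
        # Treat the whole corpus as a single token stream.
        streams = [[tok for sent in sentences for tok in sent]]
    else:
        streams = sentences
    pairs = []
    for tokens in streams:
        for i, target in enumerate(tokens):
            # Extend the window only on the requested side(s).
            lo = i - size if position in ("left", "symmetric") else i
            hi = i + size if position in ("right", "symmetric") else i
            for j in range(max(lo, 0), min(hi, len(tokens) - 1) + 1):
                if j != i:
                    pairs.append((target, tokens[j]))
    return pairs
```

Pairs produced this way could then be fed to any Skip-Gram trainer; switching `position` or `cross_sentence` changes the training signal without touching the model itself.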

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

La communauté de l’Arche (The Community of l’Arche)

Author: Thibodeau Lison
Publication venue: 'Consortium Erudit'
Publication date: 01/01/2006

Érudit

Probabilistic Dialogue Models with Prior Domain Knowledge

Author: Lison Pierre
Publication date: 01/01/2012

Probabilistic models such as Bayesian Networks are now in widespread use in spoken dialogue systems, but their scalability to complex interaction domains remains a challenge. One central limitation is that the state space of such models grows exponentially with the problem size, which makes parameter estimation increasingly difficult, especially for domains where only limited training data is available. In this paper, we show how to capture the underlying structure of a dialogue domain in terms of probabilistic rules operating on the dialogue state. The probabilistic rules are associated with a small, compact set of parameters that can be directly estimated from data. We argue that the introduction of this abstraction mechanism yields probabilistic models that are easier to learn and generalise better than their unstructured counterparts. We empirically demonstrate the benefits of such an approach by learning a dialogue policy for a human-robot interaction domain based on a Wizard-of-Oz data set. Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), pages 179–188, Seoul, South Korea, 5-6 July 2012.

NORA - Norwegian Open Research Archives

Échecs et compromis de la justice pénale internationale (Failures and Compromises of International Criminal Justice) (Note)

Author: Néel Lison
Publication venue: 'Consortium Erudit'
Publication date: 01/01/1998

The creation of a permanent international criminal court to judge individuals guilty of war crimes, crimes against humanity, or genocide has long been debated. The multiplication of wars and intra-state conflicts has put this problem back on the agenda. Since the Second World War, national jurisdictions have let most of those responsible for serious violations of humanitarian law escape, whether through insufficient political will or a lack of means. The Yugoslav and Rwandan conflicts have called into question the effectiveness of the international community in upholding international humanitarian law and in fighting impunity for these international crimes.

Érudit

Model-based Bayesian Reinforcement Learning for Dialogue Management

Author: Lison Pierre
Publication date: 01/01/2013

Reinforcement learning methods are increasingly used to optimise dialogue policies from experience. Most current techniques are model-free: they directly estimate the utility of various actions, without an explicit model of the interaction dynamics. In this paper, we investigate an alternative strategy grounded in model-based Bayesian reinforcement learning. Bayesian inference is used to maintain a posterior distribution over the model parameters, reflecting the model uncertainty. This parameter distribution is gradually refined as more data is collected and is simultaneously used to plan the agent's actions. Within this learning framework, we carried out experiments with two alternative formalisations of the transition model, one encoded with standard multinomial distributions, and one structured with probabilistic rules. We demonstrate the potential of our approach with empirical results on a user simulator constructed from Wizard-of-Oz data in a human-robot interaction scenario. The results illustrate in particular the benefits of capturing prior domain knowledge with high-level rules.
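
Maintaining a posterior over the parameters of a multinomial transition model can be sketched as follows. This is a generic illustration, not the paper's formalisation: the class `DirichletTransitionModel`, the symmetric Dirichlet prior, and the `prior_count` parameter are assumptions, and neither planning nor the rule-structured variant is shown.

```python
from collections import defaultdict


class DirichletTransitionModel:
    """Posterior over P(next_state | state, action) as Dirichlet counts."""

    def __init__(self, states, prior_count=1.0):
        self.states = list(states)
        # Assumption: symmetric Dirichlet prior with this pseudo-count.
        self.prior_count = prior_count
        # Maps (state, action, next_state) to the number of observations.
        self.counts = defaultdict(float)

    def observe(self, state, action, next_state):
        # Conjugacy: the Bayesian update is a simple count increment.
        self.counts[(state, action, next_state)] += 1.0

    def posterior_mean(self, state, action):
        # Posterior Dirichlet parameters are prior plus observed counts;
        # the mean is obtained by normalising them.
        alphas = {
            s: self.prior_count + self.counts.get((state, action, s), 0.0)
            for s in self.states
        }
        total = sum(alphas.values())
        return {s: a / total for s, a in alphas.items()}
```

Before any data is seen, the posterior mean is uniform over next states; each observed transition shifts probability mass toward the outcomes actually encountered, which is the gradual refinement the abstract describes.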

arXiv.org e-Print Archive

CiteSeerX

NORA - Norwegian Open Research Archives

Minimal Computing

Author: Lison Andrew
Publication venue: Humboldt-Universität zu Berlin
Publication date: 07/12/2022

Not Reviewed

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin