18 research outputs found

    Graph-based keyword spotting in historical handwritten documents

    The number of handwritten documents that are digitally available is rapidly increasing. However, we observe a certain lack of accessibility to these documents, especially with respect to searching and browsing. This paper aims at closing this gap by means of a novel method for keyword spotting in ancient handwritten documents. The proposed system relies on a keypoint-based graph representation for individual words. Keypoints are characteristic points in a word image that are represented by nodes, while edges are employed to represent strokes between two keypoints. The basic task of keyword spotting is then conducted by a recent approximation algorithm for graph edit distance. The novel framework for graph-based keyword spotting is tested on the George Washington dataset, on which a state-of-the-art reference system is clearly outperformed.
    Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). S+SSPR 2016: Structural, Syntactic, and Statistical Pattern Recognition, pp. 564-573. http://link.springer.com/bookseries/558 2017-11-05

    Shades of gray

    No full text

    Generating estimates of classification confidence for a case-based spam filter

    Producing estimates of classification confidence is surprisingly difficult. One might expect that classifiers that can produce numeric classification scores (e.g. k-Nearest Neighbour, Naïve Bayes or Support Vector Machines) could readily produce confidence estimates based on thresholds. In fact, this proves not to be the case, probably because these are not probabilistic classifiers in the strict sense. The numeric scores coming from k-Nearest Neighbour, Naïve Bayes and Support Vector Machine classifiers are not well correlated with classification confidence. In this paper we describe a case-based spam filtering application that would benefit significantly from an ability to attach confidence predictions to positive classifications (i.e. messages classified as spam). We show that ‘obvious’ confidence metrics for a case-based classifier are not effective. We propose an ensemble-like solution that aggregates a collection of confidence metrics and show that this offers an effective solution in this spam filtering domain.

    Secure Kernel Machines against Evasion Attacks

    No full text
    Machine learning is widely used in security-sensitive settings like spam and malware detection, although it has been shown that malicious data can be carefully modified at test time to evade detection. To overcome this limitation, adversary-aware learning algorithms have been developed, exploiting robust optimization and game-theoretical models to incorporate knowledge of potential adversarial data manipulations into the learning algorithm. Although these techniques have been shown to be effective in some adversarial learning tasks, their adoption in practice is hindered by different factors, including the difficulty of meeting specific theoretical requirements, the complexity of implementation, and scalability issues in terms of the computational time and space required during training. In this work, we aim to develop secure kernel machines against evasion attacks that are not computationally more demanding than their non-secure counterparts. In particular, leveraging recent work on robustness and regularization, we show that the security of a linear classifier can be drastically improved by selecting a proper regularizer, depending on the kind of evasion attack, as well as by unbalancing the cost of classification errors. We then discuss the security of nonlinear kernel machines, and show that a proper choice of the kernel function is crucial. We also show that unbalancing the cost of classification errors and varying some kernel parameters can further improve classifier security, yielding decision functions that better enclose the legitimate data. Our results on spam and PDF malware detection corroborate our analysis.

    LLRF Commissioning at the European XFEL

    No full text
    The European X-ray Free-Electron Laser (XFEL) at Deutsches Elektronen-Synchrotron (DESY), Hamburg, Germany, is a user facility under commissioning that will provide ultrashort X-ray flashes with high brilliance in the near future. All LLRF stations of the injector, covering the normal conducting RF gun, A1 (eight 1.3 GHz superconducting cavities (SCs)) and AH1 (eight 3.9 GHz SCs), were successfully commissioned by the end of 2015. The injector has been operated with beam transmission to the injector dump since then. After the conclusion of the construction work in the XFEL accelerator tunnel (XTL), the commissioning of 22 LLRF stations (A2 to A23) started at the beginning of 2017. Every station consists of a semi-distributed LLRF system controlling 32 SCs at 1.3 GHz. Stable operation with beam transport to the main dump (TLD) was achieved. The commissioning procedure applied, the experience gained and the performance reached are described.