259 research outputs found

    Classifier combination schemes in speech impediment therapy systems

    In therapy for the hearing impaired, one of the key problems is the lack of proper auditory feedback, which impedes the development of intelligible speech. The effectiveness of the therapy relies heavily on accurate phoneme recognition [1, 4, 17]. Because of environmental difficulties, simple recognition algorithms may classify poorly, so techniques such as normalization and classifier combination are applied to increase recognition accuracy. This paper examines Vocal Tract Length Normalization techniques [5, 13], focusing mainly on real-time parameter estimation [12], and the majority of classifier combination schemes, including the traditional (Prod, Sum, Min, Max) [7], basic linear (simple, weighted, and AHP-based [6] averaging), and some special linear (Bagging, Boosting) combinations. Based on the results, we conclude that hybrid combinations can improve the effectiveness of the real-time normalization methods.
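
    The traditional fixed rules named in this abstract (Prod, Sum, Min, Max) can be illustrated with a minimal sketch. It assumes each classifier outputs a posterior probability per class; the function names `combine` and `decide` and the example numbers are hypothetical and are not taken from the paper.

```python
from math import prod

def combine(posteriors, rule="sum"):
    """Fuse per-classifier posteriors with a fixed combination rule.

    posteriors: one row per classifier, one column per class.
    Returns the fused class scores, renormalized to sum to 1.
    """
    rules = {"prod": prod, "sum": sum, "min": min, "max": max}
    if rule not in rules:
        raise ValueError(f"unknown rule: {rule}")
    # zip(*posteriors) groups the scores all classifiers gave to one class.
    fused = [rules[rule](column) for column in zip(*posteriors)]
    total = sum(fused)
    return [score / total for score in fused] if total > 0 else fused

def decide(posteriors, rule="sum"):
    """Return the index of the class with the highest fused score."""
    fused = combine(posteriors, rule)
    return max(range(len(fused)), key=fused.__getitem__)

# Hypothetical posteriors from three classifiers over three classes.
P = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.6, 0.3, 0.1],
]
for rule in ("prod", "sum", "min", "max"):
    print(rule, decide(P, rule))
```

    In this toy input the Min rule picks a different class than the other three rules, which shows why the choice of rule matters when one classifier strongly disagrees with the rest.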

    DeltaPhish: Detecting Phishing Webpages in Compromised Websites

    The large-scale deployment of modern phishing attacks relies on the automatic exploitation of vulnerable websites in the wild, to maximize profit while hindering attack traceability, detection and blacklisting. To the best of our knowledge, this is the first work that specifically leverages this adversarial behavior for detection purposes. We show that phishing webpages can be accurately detected by highlighting HTML code and visual differences with respect to other (legitimate) pages hosted within a compromised website. Our system, named DeltaPhish, can be installed as part of a web application firewall, to detect the presence of anomalous content on a website after compromise, and eventually prevent access to it. DeltaPhish is also robust against adversarial attempts in which the HTML code of the phishing page is carefully manipulated to evade detection. We empirically evaluate it on more than 5,500 webpages collected in the wild from compromised websites, showing that it is capable of detecting more than 99% of phishing webpages, while misclassifying less than 1% of legitimate pages. We further show that the detection rate remains higher than 70% even under very sophisticated attacks carefully designed to evade our system. Comment: Preprint version of the work accepted at ESORICS 201

    Local learning for multi-layer, multi-component predictive system

    This study introduces a new multi-layer, multi-component ensemble. The components of this ensemble are trained locally on subsets of features for disjoint sets of data. The data instances are assigned to local regions using the similarity of their features' pairwise squared correlations. Many ensemble methods encourage diversity among their base predictors by training them on different subsets of data or different subsets of features. In the proposed architecture, the local regions contain disjoint sets of data, and for this data only the most similar features are selected. The pairwise squared correlations of the features are used to weight the predictions of the ensemble's models. The proposed architecture has been tested on a number of data sets, and its performance was compared to five benchmark algorithms. The results showed that the testing accuracy of the developed architecture is comparable to the rotation forest and is better than the other benchmark algorithms.
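
    The similarity measure mentioned in this abstract, the pairwise squared correlation of features, can be sketched as follows. This only illustrates the measure itself, assuming it means the squared Pearson correlation between feature columns; how the paper uses it to form local regions and weight predictions is not reproduced here, and the function names are hypothetical.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length feature columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant column has no defined correlation; treat it as 0 here.
        return 0.0
    return (sxy * sxy) / (sxx * syy)

def pairwise_squared_correlation(columns):
    """Return the k x k matrix of squared correlations for k feature columns."""
    k = len(columns)
    return [
        [squared_correlation(columns[i], columns[j]) for j in range(k)]
        for i in range(k)
    ]

# Hypothetical feature columns: f1 is a multiple of f0, f2 is uncorrelated with f0.
f0 = [1.0, 2.0, 3.0, 4.0]
f1 = [2.0, 4.0, 6.0, 8.0]
f2 = [1.0, -1.0, -1.0, 1.0]
R2 = pairwise_squared_correlation([f0, f1, f2])
```

    Squaring the correlation makes strongly negative and strongly positive relationships count as equally similar, which is one reason to prefer it over the signed coefficient when grouping features.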

    Multi-Class Classification Averaging Fusion for Detecting Steganography

    Multiple classifier fusion has the capability of increasing classification accuracy over individual classifier systems. This paper focuses on the development of a multi-class classification fusion based on weighted averaging of posterior class probabilities. This fusion system is applied to the steganography fingerprint domain, in which the classifier identifies the statistical patterns in an image which distinguish one steganography algorithm from another. Specifically, we focus on algorithms in which JPEG images provide the cover in order to communicate covertly. The embedding methods targeted are F5, JSteg, Model Based, OutGuess, and StegHide. The developed multi-class steganalysis system consists of three levels: (1) feature preprocessing, in which a projection function maps the input vectors into a separable space; (2) a classifier system using an ensemble of classifiers; and (3) a fusion stage in which two weighted fusion techniques are compared: a well-known variance-weighted fusion and a Gaussian-weighted fusion. Results show that through the novel addition of the classifier fusion step to the multi-class steganalysis system, the classification accuracy is improved by up to 12%.
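
    Weighted averaging of posterior class probabilities can be sketched as below. The abstract does not define its variance-weighted scheme, so this sketch assumes inverse-variance weighting (each classifier's weight is proportional to one over its estimated error variance); the function names and all numbers are hypothetical.

```python
def inverse_variance_weights(error_variances):
    """Assumed weighting: weight each classifier by 1 / variance, normalized to sum to 1."""
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [w / total for w in inverse]

def weighted_average_fusion(posteriors, weights):
    """Fuse posteriors (one row per classifier) as a weighted average per class."""
    n_classes = len(posteriors[0])
    return [
        sum(w * row[c] for w, row in zip(weights, posteriors))
        for c in range(n_classes)
    ]

# Two hypothetical classifiers over two classes; the first is assumed more reliable.
posteriors = [
    [0.7, 0.3],
    [0.2, 0.8],
]
weights = inverse_variance_weights([0.01, 0.04])
fused = weighted_average_fusion(posteriors, weights)
plain = weighted_average_fusion(posteriors, [0.5, 0.5])
```

    With these numbers the unweighted average favors the second class while the weighted average favors the first, because the lower-variance classifier dominates the fused score.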

    Comparison of Classifier Fusion Methods for Predicting Response to Anti HIV-1 Therapy

    BACKGROUND: Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or, in the worst case, completely inhibit the effect of antiretroviral compounds while maintaining the ability for effective replication. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations. However, commonly used HIV-1 genotype interpretation systems provide only classifications for single drugs. The EuResist initiative has collected data from about 18,500 patients to train three classifiers for predicting response to combination antiretroviral therapy, given the viral genotype and further information. In this work we compare different classifier fusion methods for combining the individual classifiers. PRINCIPAL FINDINGS: The individual classifiers yielded similar performance, and all the combination approaches considered performed equally well. The gain in performance due to combining methods did not reach statistical significance compared to the single best individual classifier on the complete training set. However, on smaller training set sizes (200 to 1,600 instances compared to 2,700) the combination significantly outperformed the individual classifiers (p<0.01; paired one-sided Wilcoxon test). Together with a consistent reduction of the standard deviation compared to the individual prediction engines, this shows a more robust behavior of the combined system. Moreover, using the combined system we were able to identify a class of therapy courses that led to a consistent underestimation (about 0.05 AUC) of the system performance. Discovery of these therapy courses is a further indication of the robustness of the combined system. CONCLUSION: The combined EuResist prediction engine is freely available at http://engine.euresist.org