26 research outputs found
Improving the Validity of Decision Trees as Explanations
In classification and forecasting with tabular data, one often utilizes tree-based models. These can be competitive with deep neural networks on tabular data [cf. Grinsztajn et al., NeurIPS 2022, arXiv:2207.08815] and, under some conditions, explainable. The explainability depends on the depth of the tree and the accuracy in each leaf of the tree. Here, we train a low-depth tree with the objective of minimising the maximum misclassification error across the leaf nodes, and then "suspend" further tree-based models (e.g., trees of unlimited depth) from each leaf of the low-depth tree. The low-depth tree is easily explainable, while the overall statistical performance of the combined low-depth and suspended tree-based models improves upon decision trees of unlimited depth trained using classical methods (e.g., CART) and is comparable to state-of-the-art methods (e.g., well-tuned XGBoost).
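The two-stage construction described above can be sketched as follows. This is a minimal illustration assuming scikit-learn: the class name `SuspendedTrees` is hypothetical, and a standard CART tree with a depth cap stands in for the paper's min-max training objective for the low-depth tree.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

class SuspendedTrees:
    """Sketch: a shallow explainer tree with deeper trees hung from its leaves.

    Illustrative only -- the paper trains the low-depth tree to minimise the
    maximum per-leaf misclassification error; here plain CART with a depth
    cap is used as a stand-in.
    """

    def __init__(self, explainer_depth=2):
        self.explainer = DecisionTreeClassifier(max_depth=explainer_depth,
                                                random_state=0)
        self.leaf_models = {}

    def fit(self, X, y):
        self.explainer.fit(X, y)
        leaves = self.explainer.apply(X)  # leaf id for each training sample
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            if len(np.unique(y[mask])) > 1:
                # "Suspend" an unlimited-depth tree from this impure leaf.
                m = DecisionTreeClassifier(random_state=0).fit(X[mask], y[mask])
                self.leaf_models[leaf] = m
        return self

    def predict(self, X):
        leaves = self.explainer.apply(X)
        pred = self.explainer.predict(X)
        for leaf, m in self.leaf_models.items():
            mask = leaves == leaf
            if mask.any():
                pred[mask] = m.predict(X[mask])  # refine within the leaf
        return pred

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = SuspendedTrees(explainer_depth=2).fit(X, y)
acc = (model.predict(X) == y).mean()
```

A user who only needs an explanation reads the depth-2 explainer; predictions are routed through the suspended trees for accuracy.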
Is ensemble classifier needed for steganalysis in high-dimensional feature spaces?
The ensemble classifier, based on Fisher Linear Discriminant (FLD) base learners, was introduced specifically for steganalysis of digital media, which currently uses high-dimensional feature spaces. It is presently probably the most widely used method for designing supervised classifiers for steganalysis of digital images because of its good detection accuracy and small computational cost. The community has assumed that the classifier implements a non-linear boundary by pooling the binary decisions of the individual classifiers within the ensemble. This paper challenges that assumption by showing that a linear classifier obtained by various regularizations of the FLD can perform equally well as the ensemble. Moreover, it demonstrates that, using state-of-the-art solvers, linear classifiers can be trained more efficiently and offer certain potential advantages over the original ensemble, leading to much lower computational complexity than the ensemble classifier. All claims are supported experimentally on a wide spectrum of stego schemes operating in both the spatial and JPEG domains, with a multitude of rich steganalysis feature sets.
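A regularized FLD of the kind the abstract refers to can be sketched in a few lines. This is an illustrative sketch on synthetic data, not the paper's code: the ridge-style regularizer `lam` and the toy cover/stego feature distributions are assumptions.

```python
import numpy as np

def regularized_fld(X0, X1, lam=1e-2):
    """Two-class Fisher Linear Discriminant with a ridge regularizer.

    The regularizer stabilizes the inverse of the within-class scatter,
    which is essential when the feature dimension is high.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    Sw += lam * np.eye(Sw.shape[0])      # ridge-style regularization
    w = np.linalg.solve(Sw, mu1 - mu0)   # discriminant direction
    b = -w @ (mu0 + mu1) / 2             # midpoint threshold
    return w, b

# Synthetic stand-in for steganalysis features: "stego" features are a
# slightly shifted version of "cover" features in a 200-dim space.
rng = np.random.default_rng(0)
d = 200
X0 = rng.normal(0.0, 1.0, (300, d))   # cover
X1 = rng.normal(0.3, 1.0, (300, d))   # stego
w, b = regularized_fld(X0, X1)
scores = np.concatenate([X0, X1]) @ w + b
labels = np.r_[np.zeros(300), np.ones(300)]
acc = ((scores > 0) == labels).mean()
```

Training such a single linear classifier requires one regularized solve, which is the source of the efficiency advantage over training many ensemble base learners.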
Using High-Dimensional Image Models to Perform Highly Undetectable Steganography
This paper presents a complete methodology for designing practical and highly undetectable stegosystems for real digital media. The main design principle is to minimize a suitably defined distortion by means of an efficient coding algorithm. The distortion is defined as a weighted difference of extended state-of-the-art feature vectors already used in steganalysis. This allows us to "preserve" the model used by the steganalyst and thus remain undetectable even for large payloads. The framework can be efficiently implemented even when the dimensionality of the feature set used by the embedder exceeds 10^{7}. The high-dimensional model is necessary to avoid known security weaknesses. Although high-dimensional models might be a problem in steganalysis, we explain why they are acceptable in steganography. As an example, we introduce HUGO, a new embedding algorithm for spatial-domain digital images, and we contrast its performance with LSB matching. On the BOWS2 image database, and in contrast with LSB matching, HUGO allows the embedder to hide a 7× longer message at the same level of security.
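The distortion-minimization principle can be illustrated with a toy example. This is a hypothetical sketch, not HUGO itself: the `feature` model (a histogram of horizontal pixel differences) and unit weights stand in for HUGO's much richer feature set, and the greedy selection stands in for the efficient coding algorithm.

```python
import numpy as np

def feature(img):
    # Toy steganalysis model: histogram of horizontal pixel differences.
    d = np.diff(img.astype(int), axis=1).ravel()
    return np.bincount(d + 255, minlength=511)

def embedding_cost(img, i, j):
    # Distortion of a candidate +1 change at (i, j): weighted difference
    # of the feature vectors before and after (all weights = 1 here).
    mod = img.copy()
    mod[i, j] = mod[i, j] + 1
    return np.abs(feature(mod) - feature(img)).sum()

rng = np.random.default_rng(0)
cover = rng.integers(0, 255, (8, 8), dtype=np.int64)
costs = np.array([[embedding_cost(cover, i, j) for j in range(8)]
                  for i in range(8)])

# Embed a 10-element payload by modifying the 10 cheapest positions,
# i.e., the changes that disturb the feature model the least.
order = np.argsort(costs.ravel())[:10]
stego = cover.copy()
stego.ravel()[order] += 1
```

In the real system the per-change costs feed a near-optimal coding scheme (rather than a greedy pick), which is what lets the payload scale while the feature-space distortion, and hence detectability, stays small.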