34,734 research outputs found

Exploring signature multiplicity in microarray data using ensembles of randomized trees

Author: Geurts Pierre
Saeys Yvan
Publication venue: Technical University München
Publication date: 01/01/2011

A challenging and novel direction for feature selection research in computational biology is the analysis of signature multiplicity. In this work, we propose to investigate the effect of signature multiplicity on feature importance scores derived from tree-based ensemble methods. We show that looking at individual tree rankings in an ensemble could highlight the existence of multiple signatures, and we propose a simple post-processing method based on clustering that can return smaller signatures with better predictive performance than signatures derived from the global tree ranking, at almost no additional cost.

Ghent University Academic Bibliography

Steganographer Identification

Author: Breunig
Chen
Cortes
Erdogmus
Filler
Fridrich
Gretton
Guo
Hetzl
Holub
Ker
Kodovsky
Li
Liu
Muandet
Pearson
Pevny
Pevný
Rokach
Sahu
Sallee
Scholkopf
Shi
Song
Westfeld
Wu
Publication date: 16/04/2019

Conventional steganalysis detects the presence of steganography within single objects. In the real world, we may face a more complex scenario in which one or more of multiple users, called actors, are guilty of using steganography; this is typically defined as the Steganographer Identification Problem (SIP). One might use conventional steganalysis algorithms to separate stego objects from cover objects and then identify the guilty actors. However, the guilty actors may be lost among a number of false alarms. To deal with the SIP, most state-of-the-art methods use unsupervised learning. In these solutions, each actor holds multiple digital objects, from which a set of feature vectors can be extracted. Well-defined distances between these feature sets then measure the similarity between the corresponding actors. By applying clustering or outlier detection, the most suspicious actor(s) are judged to be the steganographer(s). Though the SIP needs further study, existing methods identify the steganographer(s) well when non-adaptive steganographic embedding is used. In this chapter, we present foundational concepts and review advanced methodologies in SIP. This chapter is self-contained and intended as a tutorial introducing the SIP in the context of media steganography. Comment: A tutorial with 30 pages
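
The pipeline described in this abstract (feature sets per actor, a distance between sets, then outlier detection) can be sketched roughly as follows. This is only an illustrative toy, not the chapter's method: the function names, the toy data, the use of the Euclidean distance between mean feature vectors as the set distance, and the "average distance to all other actors" outlier score are all assumptions made for the example.

```python
import math
from statistics import mean


def mean_vector(features):
    """Average the feature vectors extracted from one actor's objects."""
    return [mean(column) for column in zip(*features)]


def actor_distance(features_a, features_b):
    """Assumed set distance: Euclidean distance between mean feature vectors."""
    return math.dist(mean_vector(features_a), mean_vector(features_b))


def rank_actors(actors):
    """Return actor names sorted from most to least suspicious.

    Assumed outlier score: an actor's average distance to every other actor.
    """
    scores = {}
    for name, feats in actors.items():
        others = [
            actor_distance(feats, other_feats)
            for other_name, other_feats in actors.items()
            if other_name != name
        ]
        scores[name] = mean(others)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: each actor holds three objects with 2-D features.
actors = {
    "alice": [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]],
    "bob": [[0.1, 0.1], [0.0, 0.2], [0.1, 0.0]],
    "carol": [[0.0, 0.0], [0.2, 0.1], [0.1, 0.1]],
    "dave": [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]],  # features shifted away from the rest
}

ranking = rank_actors(actors)
print(ranking[0])  # dave
```

A real system would presumably use richer steganalysis features and a more robust set distance and outlier detector; the sketch only shows how the three stages fit together.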

arXiv.org e-Print Archive

Crossref

Pairwise meta-rules for better meta-learning-based algorithm ranking

Author: Pfahringer Bernhard
Sun Quan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2013

In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed techniques can significantly improve the overall performance of meta-learning for algorithm ranking. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.

Research Commons@Waikato

Simulated evaluation of faceted browsing based on feature selection

Author: Bernejo Lopez P.
Hopfgartner F.
Jose J.M.
Urruty T.
Villa R.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

In this paper, we explore the limitations of facet-based browsing, which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption of this approach is that search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect on the use of both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve search effectiveness.

Enlighten

Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper

Author: A. Mousavi
Ambrosetti
Askin
Banks
Beylkin
Blum
Borgonovo
Braddock
Brodersen
Buchenneder
Bunke
Buonomo
Charaniya
Chen
Chi
Cloke
Cukier
De Pauw
Duffy
Durkee
Faghihi
Gaweda
Guyon
Hand
He
Hung
Jain
James
Joliffe
Kang
Kim
Kohavi
Krugera
Krzykacz-Hausmann
Kwak
Lallemand
Lavrač
Lemaire
Li
Liu
McRae
Mirkin
Mladenić
Norvig
Park
Quevedo
Ragg
Robert
S. Poslad
S. Tavakoli
Saltelli
Shonkwiler
Sobol
Takagi
Talavera
Tavakoli
Unler
Uysal
Xing
Xu
Yang
Øksendal
Publication venue: Elsevier BV
Publication date: 01/10/2013

This is the post-print version of the final paper published in Advanced Engineering Informatics. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.

The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker to an environmental science industrial use case, i.e., sub-surface drilling. Producing time-critical, accurate knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially if the knowledge required is critical to the system operation where the safety of operators or the integrity of costly equipment is at stake. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system is the core challenge of this research. The main objective is then to identify which set of input data signals has a significant impact on the set of system state information (i.e., output). Through a cause-effect analysis technique, the proposed technique supports the filtering of unsolicited data that can otherwise clog up the communication and computational capabilities of a standard supervisory control and data acquisition system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods in a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have limited suitability for use in time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker Sensitivity Analysis method for use in high-volume and time-critical input variable selection problems is demonstrated.

Crossref

Brunel University Research Archive